A recent Euler Problem, number 201, asks for the unique sums in the 50 element subsets of the set S = {1^2, 2^2, ... , 100^2}. It mentions that S has 100891344545564193334812497256 50-element subsets, so clearly calculating all combinations is not a viable approach. An approach that does work is to calculate all reachable sums by builing up a list of reachable sums, and counting in how may ways a particular sum can be reached. The base case is that with 0 numbers, the sum of 0 can be reached in only 1 way. Add 1 to this and we find that with 1 number, the sum of 1 can be reached in 1 way. Add 4 to this and we find that 1 and 4 can be reached with a sum of 1 elements, while 5 can be reached with a sum of 2 elements, namely 1+4. Including 9 gives us [1,4,9] can be reached with a sum of 1 element, [5,10,13] can be reached with a sum of 2 elements, and [14] can be reached with a sum of 3 elements. We proceed including elements of the set S, calculating the number of elements included in the sum, the sum itself, and the number of ways that the sum can be reached. After we have included all if the elements of S, we then scan the list for those sums which can be reached in only 1 way, and which require 50 elements to make.

I tried this approach, first using a Map, where the key is a list [number of elements, sum reached] and the value is the number of ways that the sum can be reached. This works great when the set S has a small number of elements, but bogs down quickly as the number of elements exceeds about 50. It never ran to completion on my machine when I set maxN equal to 100.

In the second approach I used arrays. The only tricky bit is that you have to run backwards through the /number of elements/ loop in order to avoid spurious elements being added. If you count forwards, then when 4 is added at ((1,4),1), then ((2,8),1) will be spuriously added by counting the just added 4 again at then next iteration.

So, now for my question. Running the array version is hundreds, if not thousands of times faster than the map version, even though the array version runs through the entire range of possible sums on the order of maxN^2 (10000) times, while the map version only process the actual sums that have been reached, which are always less than the range (maxSum). Is it just that creating and destoying maps is much slower than indexing through arrays? Any insights will be appreciated, and I will update this page and turn it into a tutorial as enlightenment is reached.

> {-# OPTIONS -O2 -optc-O #-} > > import Data.Map > import Data.List (foldl') > import Data.Array.ST > import Data.Array.IArray > import Control.Monad.ST.Strict > import Control.Monad > import System.CPUTime > > maxN = 30 > numN = maxN `div` 2 > sq = [ k^2 | k<-[1..maxN]] > maxSum = sum $ take numN $ reverse sq > > type SumMap = Map [Int] Int > > r0 :: SumMap > r0 = singleton [0,0] 1 > > includeN :: SumMap -> Int -> SumMap > includeN m n = > let > nextKey l s = [l+1,s+n] > fold1 (l:s:[]) a b = > if l < numN then > insert (nextKey l s) > ((findWithDefault 0 (nextKey l s) m) + a) b > else b > a1 = foldWithKey fold1 empty m > in union a1 m > > runSums :: SumMap > runSums = foldl' (\a b -> includeN a b) r0 sq > > singletons :: SumMap -> Integer > singletons m = > let fold2 k a b = if a == 1 && (k!!0) == numN > then b + (fromIntegral $ k!!1) else b > in foldWithKey fold2 0 m > > main1 = print $ singletons runSums > > ------------------------------------------------------------------- > > runArray = runSTUArray ( > do a <- newArray ((0,0),(numN,maxSum)) 0 :: ST s (STUArray s (Int,Int) Int) > writeArray a (0,0) 1 > reachableSums a > return a) > > reachableSums ar = do > forM_ [1 .. maxN] (\i -> do > let ii = i^2 > j1 = if i > numN then numN else i > forM_ [j1, (j1-1) .. 1] (\j -> do > forM_ [ii .. maxSum] (\k -> do > x <- readArray ar (j-1,k-ii) > when (x > 0) $ do > y <- readArray ar (j,k) > writeArray ar (j,k) (x+y)))) > > sumSingletons ar = > foldl' (\a b -> > if ar Data.Array.IArray.! (numN,b) == 1 > then a+(fromIntegral b) else a) 0 [l..u] > where ((_,l),(_,u)) = bounds ar > > > main2 = print $ sumSingletons runArray > main = do > start <- getCPUTime > main1 > withMap <- getCPUTime > main2 > withArray <- getCPUTime > let (mtime,atime) = (withMap-start ,withArray-withMap) > print (mtime,atime) > putStrLn $ "Array is " ++ show (mtime `div` atime) ++ " times faster than Map"Running this with ghc on my machine results in:

888770 888770 (7164447000000,56004000000) Array is 127 times faster than MapThis file is also available as an lhs file if you want to play with it.