I have some numbers on this. I have a list of first names for males from the census data. I have a function 'male :: String -> Maybe Double' which returns (maybe) the probability of a person being given that name. I have two versions, one based on string matching, the other based on building the data into a trie and then converting the trie into haskell source and the using that to match). For example, we have:
> module Male where > > male "james" = Just 3.318 > male "john" = Just 3.271 > male "robert" = Just 3.143 > male "michael" = Just 2.629 > male "william" = Just 2.451 > male "david" = Just 2.363 > male "richard" = Just 1.703 > ... > male _ = Nothing and in the other version, we have > male = male_start > where > male_start ('a':xs) = male_66 xs > male_start ('b':xs) = male_67 xs > male_start ('c':xs) = male_68 xs > male_start ('d':xs) = male_69 xs > ... > male_start _ = Nothing > male_66 ('a':xs) = male_66_66 xs > male_66 ('b':xs) = male_66_67 xs > male_66 ('d':xs) = male_66_69 xs > male_66 ('g':xs) = male_66_72 xs > ... Finally, I implemented a version which reads data into a finitemap. the original database contains 1219 names. i test this by taking all male names, all female names, randomizing them, and repeating this data 20 times. this leads to 109880 runs. i ran each implementation 5 times; the results are: using trie: 0.890u 0.020s 0:00.90 101.1% 0+0k 0+0io 327pf+0w 0.910u 0.000s 0:00.90 101.1% 0+0k 0+0io 327pf+0w 0.870u 0.030s 0:00.90 100.0% 0+0k 0+0io 327pf+0w 0.910u 0.020s 0:00.93 100.0% 0+0k 0+0io 327pf+0w 0.920u 0.020s 0:00.95 98.9% 0+0k 0+0io 327pf+0w using string-matching: 10.280u 0.060s 0:10.51 98.3% 0+0k 0+0io 280pf+0w 10.340u 0.030s 0:10.86 95.4% 0+0k 0+0io 279pf+0w 10.310u 0.040s 0:10.72 96.5% 0+0k 0+0io 281pf+0w 10.330u 0.040s 0:10.55 98.2% 0+0k 0+0io 280pf+0w 10.420u 0.020s 0:10.63 98.2% 0+0k 0+0io 280pf+0w for finitemap: 1.110u 0.020s 0:01.14 99.1% 0+0k 0+0io 195pf+0w 1.110u 0.010s 0:01.14 98.2% 0+0k 0+0io 195pf+0w 1.100u 0.030s 0:01.14 99.1% 0+0k 0+0io 195pf+0w 1.120u 0.010s 0:01.15 98.2% 0+0k 0+0io 195pf+0w 1.190u 0.010s 0:01.24 96.7% 0+0k 0+0io 195pf+0w so string-matching is terribly slow; using the finitemap is actually surprisingly fast, though still about 30% slower than the trie version. perhaps a better-optimized trie version would do better, but it's hard to say. On Mon, 23 Feb 2004, Simon Peyton-Jones wrote: > The trouble is that you probably *don't* want to expand this > case x of { "foogle" -> e1; _ -> e2 } > to this > > case x of > c1:x1 -> case c1 of > 'f' -> case x1 of > c2:x2 -> case c2 of > 'o' -> of .... > > So GHC generates a series of equality tests instead. A decent > alternative might be: > > generate case expressions when there is more > than one string in the list, otherwise use an equality test > > That would not be hard to do. If it becomes important to you, I'd have > a go. But before doing so, could you do the work by hand and see if it > makes a useful performance difference? > > Simon > > | -----Original Message----- > | From: [EMAIL PROTECTED] > [mailto:glasgow-haskell-users- > | [EMAIL PROTECTED] On Behalf Of Sven Panne > | Sent: 22 February 2004 15:32 > | To: John Meacham > | Cc: [EMAIL PROTECTED] > | Subject: Re: optimization question > | > | John Meacham wrote: > | > I was wondering if: > | > > | > case x of > | > "foo" -> Foo > | > "bar" -> Bar > | > "fuzz" -> Fuzz > | > "fuzo" -> Fuzo > | > x -> other .. thing > | > > | > would optimize to > | > > | > let z = other .. thing in > | > case x of > | > ('f':x) -> case x of > | > ('u':'z': x) -> > | > "z" -> Fuzz > | > "o" -> Fuzo > | > _ -> z > | > "oo" -> Foo > | > _ -> z > | > "bar" -> Bar > | > _ -> z > | > | String literals are handled in a special way in GHC, so your example > is > | essentially converted into an if-cascade, which is not what you want. > | OTOH, if you write the strings in their expanded form like > ['f','o','o'], > | you get your optimized version automatically. Perhaps Simon^2 can > comment > | on the rationale behind this, I can't remember the reason... > | > | Cheers, > | S. > | > | _______________________________________________ > | Glasgow-haskell-users mailing list > | [EMAIL PROTECTED] > | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users > _______________________________________________ > Glasgow-haskell-users mailing list > [EMAIL PROTECTED] > http://www.haskell.org/mailman/listinfo/glasgow-haskell-users > -- Hal Daume III | [EMAIL PROTECTED] "Arrest this man, he talks in maths." | www.isi.edu/~hdaume _______________________________________________ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users