> Cheating a little bit, now that the C solution has been made public,
> you just win with the C solution on much smaller data sets because
> all you need to do now is compile it :-) More seriously, it can be
> quickly adapted to do other relatively simple file-fixing tasks, in
> particular the ones that require several primitives in Perl and/or
> Python but are at their core simple enough to handle with some table
> lookups and moving bytes around accordingly. Each new primitive you
> have to call in a scripting language comes at a cost, while in C you
> are coding your own primitive that just gets slightly more complex, so
> that type of setup favors C heavily.

You're severely overestimating the cost of "scripting language"
primitives. Very few real programs spend much time in bits of code
where the tight loop instruction counts matter. Most scripting
languages have implemented those bits (like regular expression
matching, list manipulation, hash tables) in optimized C already.

C is a really stupid language to use for tasks like this.  That's why
the people who came up with C and worked with it a lot also came up
with the Unix shells and nifty little languages like awk. If you want
to spend a lot of time writing and debugging C programs to increase
your C skills, that's fine, but I think most people here don't
actually need to program in C much, and would find their time better
spent honing more useful skills.  Those of us who make a living
writing C code get enough practice at work!

> Levi - I still would like to see some code in Haskell, even if it is
> dog slow and I have to do some work to get it to run on my box. I
> programmed in Scheme for a class back in 1994 (CS 330), I do remember
> writing a program in C to help me track down my mismatched braces, and
> writing a poem "Count your braces, count them one by one", but that is
> my extent of exposure to functional programming.

Here's a quick-to-write, slow-to-execute version in Haskell.  It's
primarily slow because it uses the Haskell String datatype, which is
implemented as a linked list of characters.  I don't really care that
it's slow, because it still executes in a fraction of a second.  It
was really easy to write even though I had to look up the file IO
stuff, because I rarely do file-based programming in Haskell.

import System.IO

main = do
  handle <- openFile "/usr/share/dict/words" ReadMode
  hSetEncoding handle utf8        -- decode the file as UTF-8
  hSetEncoding stdout utf8        -- and emit UTF-8 on stdout
  contents <- hGetContents handle -- lazily read the whole file
  putStr $ unlines $ map reverse $ lines contents

This also happens to deal correctly with multibyte characters, which
my /usr/share/dict/words has a few of.  I didn't bother reading the
file name from the command line, but it's not hard to do.
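In case anyone wants to see it, here's one sketch of the command-line
version.  It's the same program with the hard-coded path replaced by
the first argument from System.Environment's getArgs; the pattern
match on [path] is deliberately sloppy and will just crash if you
forget the argument.

```haskell
import System.Environment (getArgs)
import System.IO

main :: IO ()
main = do
  [path] <- getArgs               -- crashes with a pattern-match error if no argument is given
  handle <- openFile path ReadMode
  hSetEncoding handle utf8        -- decode the file as UTF-8
  hSetEncoding stdout utf8        -- and emit UTF-8 on stdout
  contents <- hGetContents handle -- lazily read the whole file
  putStr $ unlines $ map reverse $ lines contents
```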

If you want to run it, install GHC and do ghc --make <filename.hs> to
compile it, or you can do runhaskell <filename.hs> to just run it with
the interpreter.

I may port it to one of the more efficient Haskell data types for
manipulating text just for fun, but I'm also in the middle of trying
to set up the toolchain for my BeagleBone Black, on which I'll be
hacking some kernel drivers and doing some low-level assembly code for
the Programmable Real-Time Units on it, all of which is more
interesting than writing programs to reverse strings. :)
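For reference, a port to Data.Text (from the text package that ships
with GHC) might look something like the following sketch; the
reverseLines name is just something I made up here.  Text stores
strings packed in memory instead of as a linked list of Chars, which
is where the speedup comes from, and Data.Text.IO handles the UTF-8
decoding for you.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Reverse each line; Text keeps the data packed rather than
-- as a cons-cell-per-character list like String.
reverseLines :: T.Text -> T.Text
reverseLines = T.unlines . map T.reverse . T.lines

main :: IO ()
main = do
  contents <- TIO.readFile "/usr/share/dict/words"
  TIO.putStr (reverseLines contents)
```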

      --Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
