On Jul 20, 2011, at 6:32 AM, Didier Verna wrote:
> Right now, I would like to know if any of you have DSL "pearls", nice
> examples of DSLs that you have written in Lisp by using some of its
> features in a clever or elegant way. I would also gladly accept any
> point of view or comment on what's important to mention, in terms of
> design principle or anything else, things that I may have missed in the
> list above.


One perspective that I haven't seen addressed yet: know your audience
(and a few related lessons that follow from it).

Last year, I wrote a tool for processing large quantities of dirty .csv files: 
more data than could fit into RAM, multiple files comprising a single dataset, 
column sequence changing across files, junk in cell values, differing numbers 
of columns across files, etc.  And an extremely short turn-around time.

The idea was to process this data without having to clean it up first, so that 
one might gain some insights (sums, uniques, bucketing, etc.) helpful in 
determining whether the data was of any value, or at least clues on how to 
proceed further.  A proper database ETL pass would have been too time-consuming.


My intended audience was a statistical analyst who favored mousing around 
Excel spreadsheets while letting prior SAS training go unused out of a 
reluctance to write scripts.  (That should have been the first clue to be 
concerned, but oh, what we do for friends!)

Ultimately, the tool was useful to me, and that was sufficient to meet 
deadlines.


But the DSL code became overly complex due to this extraneous design criterion 
of accommodating someone who doesn't want to write scripts in the first place.

Instead of a few simple macros, such as WITH-MESSY-CSV, appropriate to a Lisp 
programmer, I effectively created a register machine with "simple" commands: 
include/exclude filtering, sum, count, unique, look-up, group-by, etc.
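
To make that concrete, here is a minimal sketch of the kind of
WITH-MESSY-CSV I had in mind.  Everything in it is hypothetical: the
names, the keyword arguments, and the naive SPLIT-CSV-LINE helper (a
real CSV parser would handle quoting).  It shows the shape, not the
actual tool:

  (defun split-csv-line (line)
    ;; Naive cell splitter; ignores CSV quoting rules entirely.
    (loop for start = 0 then (1+ comma)
          for comma = (position #\, line :start start)
          collect (string-trim " " (subseq line start comma))
          while comma))

  (defmacro with-messy-csv ((row &key file columns) &body body)
    ;; Iterate FILE line by line, binding ROW to the cells named by
    ;; COLUMNS.  Columns are located via the header row, so their
    ;; position may change across files.  Rows that fail to split
    ;; are skipped.
    (let ((in (gensym)) (hdr (gensym)) (idx (gensym))
          (line (gensym)) (cells (gensym)))
      `(with-open-file (,in ,file)
         (let* ((,hdr (split-csv-line (read-line ,in)))
                (,idx (mapcar (lambda (name)
                                (position name ,hdr :test #'string-equal))
                              ,columns)))
           (loop for ,line = (read-line ,in nil)
                 while ,line
                 do (let ((,cells (ignore-errors (split-csv-line ,line))))
                      (when ,cells
                        (let ((,row (mapcar (lambda (i) (and i (nth i ,cells)))
                                            ,idx)))
                          ,@body))))))))

  ;; e.g., sum an "amount" column (integer amounts, for simplicity)
  ;; across a pile of files, skipping junk:
  (let ((total 0))
    (dolist (f (directory "data/*.csv") total)
      (with-messy-csv (row :file f :columns '("amount"))
        (let ((n (and (first row)
                      (parse-integer (first row) :junk-allowed t))))
          (when n (incf total n))))))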

Otherwise, the approach was sound enough: load the relatively small .csv files 
as look-up tables and iterate over the entire dataset in a single pass, applying 
lexically scoped blocks of filtering & calculations.  Convert only the named 
columns of interest regardless of position changes across files, parse numbers 
on demand only for those operations that require them, skip unparseable rows as 
a last resort, etc.  This admits some error in the results, but some results 
with reduced confidence are better than none in this case.
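
In code, reusing the hypothetical WITH-MESSY-CSV sketch above (file
names and column names here are made up as well), the pattern was
roughly:

  ;; Load a small .csv as a lookup table, e.g. store-id -> region.
  (defun load-lookup-table (file key-col val-col)
    (let ((table (make-hash-table :test #'equalp)))
      (with-messy-csv (row :file file :columns (list key-col val-col))
        (when (first row)
          (setf (gethash (first row) table) (second row))))
      table))

  ;; One pass over the big dataset: group by region, summing amounts.
  ;; Numbers are parsed only here, on demand; rows that won't parse
  ;; simply drop out of the totals.
  (let ((regions (load-lookup-table "stores.csv" "store-id" "region"))
        (sums (make-hash-table :test #'equalp)))
    (with-messy-csv (row :file "sales.csv" :columns '("store-id" "amount"))
      (let ((region (gethash (first row) regions))
            (n (and (second row)
                    (parse-integer (second row) :junk-allowed t))))
        (when (and region n)
          (incf (gethash region sums 0) n))))
    sums)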


Lessons learned (a few more while I'm here):

  1. Know your audience, and build for the correct users.

  2. Build the right tool.  (I'm a systems programmer; a good stats person 
would likely have come up with a better work-flow, probably using R so that 
rich reports could also be generated quickly.)

  3. Good language design can be challenging.  I would have been better off 
(perhaps) stealing SQL's conventions or XQuery's FLWOR than inventing my own 
"simple" set of commands; a toy sketch of that shape follows this list.  
(Syntax is another matter... as you know.)
  
  4. Being adept at backquotes, comma substitution and unrolling lists is not 
necessarily enough to create a good, clean DSL implementation.  But keep 
trying.  Do your best to make one for "keeps".  Then throw it away, anyway.  
It's important not to hold anything back in the first version.  Ah, experience! 
 (I'll likely go at this one again just for the fun of it.)  
e.g., an unrelated project from years ago: http://play.org/learning-lisp/html.lisp

  5. Collaborate: get input from others.  My co-workers who also use Common 
Lisp were many time zones and an ocean away, busy with looming deadlines of 
their own.  However, their 10 years of CL experience to my 5 (and their far 
deeper stats familiarity) would certainly have helped here.
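
For what it's worth, here is the toy promised in point 3: what stealing
FLWOR's shape might look like in s-expressions, just a FOR/WHERE/RETURN
skeleton over plain lists.  The macro and its argument names are
invented for this message, nothing like a full implementation:

  (defmacro for-where-return ((var seq) filter result)
    ;; Toy FLWOR skeleton: FOR var IN seq, WHERE filter, RETURN result.
    `(loop for ,var in ,seq
           when ,filter
           collect ,result))

  ;; (for-where-return (x '(1 2 3 4 5 6)) (evenp x) (* x x))
  ;;   => (4 16 36)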

-Daniel

--
first name at last name dot com

 

