The problem of human vs computer readability resurfaced for me recently when planning a J paper for MagPi, the how-to journal for the Raspberry Pi. http://www.themagpi.com It took no imagination to predict the response of the average (computer-literate) reader on seeing J code for the first time. I hoped to forestall it with a ref to: "The alleged unreadability of J - and what to do about it" http://www.jsoftware.com/jwiki/Essays/unreadability which essentially covers the ground of this topic for the "educated layman".
Problem not solved, however. I belatedly realise that the article's reading age needs to be (techie) 14-18 for MagPi, whereas the aforementioned essay has a reading age of 50+ (and maybe 70+ :-\ ) On Tue, Jan 14, 2014 at 12:02 PM, Joe Bogner <joebog...@gmail.com> wrote: > I was curious to do some basic benchmarking. Posting here if others > are interested. Not sure if this should be moved to another thread. If > someone replies please do what is appropriate (same thread or new one) > > https://gist.github.com/joebo/61e0841fbf511e1aab4d > > The state machine implementation runs 1000 iterations in around 4 seconds > > Using powershell to benchmark on windows: > > PS C:\Users\Joe\downloads\j801\bin> Measure-Command { .\jconsole.exe > c:\joe\parse.ijs -js "goTime''" } > TotalMilliseconds : 4224.26 > > Dan's implementation is slightly faster > > PS C:\Users\Joe\downloads\j801\bin> Measure-Command { .\jconsole.exe > c:\joe\parse.ijs -js "go2Time''" } > TotalMilliseconds : 4070.153 > > The rosettacode PicoLisp implementation was quite a bit slower > > PS C:\Users\Joe\Downloads\picolisp3.1.4_cygwin\bin> Measure-Command { > .\picolisp.exe parse.l } > TotalMilliseconds : 5069.1331 > > Rewriting it to eliminate the piping sped it up > > PS C:\Users\Joe\Downloads\picolisp3.1.4_cygwin\bin> Measure-Command { > .\picolisp.exe parse2.l } > TotalMilliseconds : 3522.3157 > > Here is the final picolisp implementation: > > (de rdConf (File) > (in File > (until (eof) > (setq Line (line)) > (when (and Line (<> (car Line) "#") (<> (car Line) ";")) > (let (Length (length Line) > Index (index " " Line) > Key (pack (if Index (pack (head (- Index 1) Line)) Line)) > Value (if Index (pack (tail (- Length Index) Line)) T) ) > (set (intern Key) Value) ) ) ) ) ) > > (do 10000 (rdConf "c:/temp/test.conf")) > > (printsp (list FULLNAME FAVOURITEFRUIT NEEDSPEELING SEEDSREMOVED > OTHERFAMILY)) > (bye) > > ("Foo Barber" "banana" T NIL "Rhu Barber, Harry Barber") > > Speed probably isn't important in a typical implementation of this > task. I'm not posting the PicoLisp code or timings as a competition > but just as another reference point. > > > On Tue, Jan 14, 2014 at 12:25 AM, Raul Miller <rauldmil...@gmail.com> > wrote: > > Ok, I think I understand. > > > > The basic issue, here, seems to be that PicoLisp is stream oriented > > and this is a stream oriented task. No one in the J community has > > cared enough to build a stream oriented library for J. J has enough to > > do stream oriented design for academic purposes, but ... Consider > > xml/sax as an example of how one might approach streams in J - call > > out to some standardized implementation and instead focus on what is > > unique to the application. > > > > Meanwhile, for a J programmer, words like [: & @ [ and ] occupy a role > > not too different from parenthesis for a Lisp programmer. Parenthesis > > might seem simple, but in fact there are a fair number of contextual > > rules that one must learn before really understanding their > > significance. Do the parenthesis delimit a lambda definition? an > > argument list? Do they denote a function call? Some other special > > form? That, I think, is the issue you were focusing on when counting > > tokens - how many special rules does a person have to understand to > > parse the code. J has 9 parsing rules, each roughly at the same > > complexity level as a lisp-like lambda. Explicit contexts add a few > > more, though that's mostly syntactic sugar. > > > > Meanwhile, J is and is not fast. It can be fast, but only if you > > design your code using big, homogeneous data structures to represent > > large data sets. > > > > I'm not sure if I am making sense, so I suppose I should write some > > code instead. > > > > Here's an implementation of a config file reader which should perform > > reasonably well: > > > > ChrCls=: '#;';(' ',TAB);LF;a.-.'#; ',TAB,LF > > NB. comment, space, line, other > > > > tokens=: (0;(0 10#:10*".;._2]0 :0);<ChrCls)&;: > > 1.0 0.0 0.0 2.1 NB. 0: skip whitespace (start here) > > 1.0 1.0 0.0 1.0 NB. 1: comment > > 3.3 4.3 5.2 2.0 NB. 2: word > > 3.0 3.0 5.1 3.0 NB. 3: comment after word > > 3.0 4.0 5.1 2.1 NB. 4: space after word > > 1.3 0.3 0.3 2.1 NB. 5: line end after word > > ) > > > > readConf=: ({.,<@(;:inv)@}.);._2@tokens@fread > > > > This uses a state machine to deal with all the low level character > > munging and then forms the result into a two column table where the > > first column is the name and the remainder is whatever followed that > > (with redundant whitespace removed). > > > > Is it readable? Not if you are not familiar with the language. In > > fact, this took me about half an hour to write. And I would not bother > > doing something like this, normally, unless performance really > > mattered (which implies large file size). But, if performance does > > matter, this approach should behave reasonably well and (at least for > > J) should have much better performance than an implementation which > > loops or otherwise uses separate primitives for separate states. > > > > That said, I should also note that the idea of using decimal fractions > > for state machine instructions was Ken Iverson's. I'll not go into why > > sequential machine was not implemented that way, because I feel guilty > > about it. > > > > Thanks, > > > > -- > > Raul > > > > > > On Mon, Jan 13, 2014 at 11:18 PM, Joe Bogner <joebog...@gmail.com> > wrote: > >> Yes, it is complete. I didn't write it or test it as it was already > posted > >> to rosettacode. I will explain how it works assuming there is interest > >> > >> It uses some uncommon tricks. It leverages the read function which is > the > >> same function used in the repl to read input characters. So the goal > is to > >> take the file and skip any comments and then pass it on to set the > variable > >> with the key and value. > >> > >> (pipe (in File (while (echo "#" ";") (till "^J"))) > >> > >> Reads the file and echos until it encounters a comment character and > then > >> reads til EOL > >> > >> while (read) > >>> (skip) > >>> (set @ (or (line T) T)) ) ) ) > >> > >> Then read those echoed characters. read gets the first symbol, the key. > >> Skip moves the input stream ahead by the space or nothing. Set assigns > the > >> variable @ which is the result from the last operation (read - which is > the > >> key) with the value from the rest of the line or T if it is blank (for > >> boolean example in the config) > >> > >> My brain has been trained to think of parens as whitespace. It didn't > start > >> that way. I can see why you may consider them tokens. I was also > counting > >> unique function/operation tokens, not characters. The idea being if I > only > >> have 4 english words with 3 characters each on a line, that is easier > for > >> my brain to parse than 5 operations using 2 ascii symbols that I don't > >> recognize the meaning. > >> > >> However as my J vocabulary improves it becomes less of an issue. I can > >> parse i. or e. As fast as a function called idx or el? > >> > >> Line length is still important I think. Also a functional style with > >> splitting up the train may help reusability, comprehension, and may > help > >> identify small areas to refactor. Those small topics like "filter out > >> lines starting with a comment character" can get lost to me in a long > line > >> of compound operations. Again, some balance and personal perference and > >> familiarity > >> > >> I became interested in picolisp for its speed, conciseness and > >> expressiveness. Many of the same attributes as J. It is almost always > in > >> the shortest solutions on rosettacode too. Happy to help resolve your > build > >> issue off the list if you are interested. > >> On Jan 13, 2014 10:28 PM, "Raul Miller" <rauldmil...@gmail.com> wrote: > >> > >>> On Mon, Jan 13, 2014 at 8:32 PM, Joe Bogner <joebog...@gmail.com> > wrote: > >>> > PicoLisp > >>> > > >>> > (de rdConf (File) > >>> > (pipe (in File (while (echo "#" ";") (till "^J"))) > >>> > (while (read) > >>> > (skip) > >>> > (set @ (or (line T) T)) ) ) ) > >>> > >>> Is that complete? > >>> > >>> I learned lisp back in highschool, and I've used drracket and emacs > >>> and other such lisp environments, but never learned picolisp. I tried > >>> to install picolisp but it would not build for me and I do not feel > >>> like debugging the source for picolisp just for this message. > >>> > >>> My impression, though, is that a J implementation like what you have > >>> written would look something like this: > >>> > >>> conf=: a:-.~(#~ 1 -.&e. '#;'&e.S:0)<;._2 fread file > >>> > >>> In other words, read the file as lines, removing blank lines and > comment > >>> lines. > >>> > >>> If all you are doing is saving the unparsed lines then we should > >>> expect simpler code. But maybe I have missed a subtlety of picolisp? > >>> > >>> I get that @ is a wild card, but I do not understand the mechanism > >>> well enough to say whether your implementation is correct, nor do I > >>> know whether (while (read) .. is stashing the read result somewhere or > >>> what. Nor do I know if your skip is assuming a rigid file structure or > >>> is allowing free-form comments in the config file. > >>> > >>> > If I compare that to the your J implementation > >>> > > >>> >> deb L:0@:(({.~ ; [: < [: ;^:(1=#) ',' cut (}.~>:)) i.&1@:e.&' > >>> =')&>@(#~ > >>> >> a:&~: > ';#'e.~{.&>)@:(dlb&.>)@:(LF&cut) > >>> > > >>> > This J implementation feels more like code golf or a compressed > >>> > string. How many tokens/operations are included in it? I won't count, > >>> > but I am fairly sure it's more than the (pipe, in, while, echo, till, > >>> > read, skip, set, or, line) 9 in the PicoLisp example. > >>> > >>> I count 64 tokens in the J implementation and 54 tokens in your > >>> PicoLisp example. I'm not sure why you have implied that parentheses > >>> are not tokens but I do not think they qualify as whitespace? > >>> > >>> We could get further into parsing and punctuation issues, but I'm not > >>> sure whether that would be relevant. > >>> > >>> > When reading a long J string or an entry in the obfuscated C code > >>> > contest, I try to recognize patterns or operations. Having used J for > >>> > about 6 months, I can recognize probably about half the operations in > >>> > that string without having to look them up. That's progress. It still > >>> > feels like a "run on sentence" which is harder to read than short > >>> > sentences. > >>> > >>> I also usually prefer shorter sentences. Not always, but I'd probably > >>> try to split Dan's code into two or three lines. Posting a fair bit of > >>> code to email has been an influence there. > >>> > >>> I imagine I would also favor a shorter implmentation than what Dan has > >>> done here. For example, in his code I see the phrase e.&' =' but I see > >>> no equals signs in the config file nor in the specification on the > >>> companion task (whose J entry should perhaps be simplified?). > >>> > >>> > I think there's a fine balance between tacit expressions and clarity. > >>> > It may be my level of inexperience with the language. However, I > >>> > wonder if I've put as much time on it as any intro-level APL > >>> > programmer. Are there any conventions in the language for # of > tokens, > >>> > trains, etc for a readable sentence? > >>> > >>> Personal taste? > >>> > >>> Thanks, > >>> > >>> -- > >>> Raul > >>> ---------------------------------------------------------------------- > >>> For information about J forums see http://www.jsoftware.com/forums.htm > >>> > >> ---------------------------------------------------------------------- > >> For information about J forums see http://www.jsoftware.com/forums.htm > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm