A while back, I saw the Data::Maker package on CPAN and noticed that it makes use of the Text::Lorem package. Being unemployed, I tend to try to teach myself things other than Perl.
After feeling the pain of trying to write Text::Lorem in Prolog, Clojure, and Talend, I somehow remembered Factor. I spent hours and days of teeth-pulling in those other technologies. In Factor this was a single-day task, and most of that day was spent following tutorials and installing Factor.

The full source code for both versions can be viewed at:

http://gitorious.org/project_factor/lorem/blobs/master/lorem.factor
http://gitorious.org/text-lorem/text-lorem/blobs/master/lib/Text/Lorem.pm

What follows is a comparison of the Perl and Factor programs, both of which generate random text that looks like Latin.

> package Text::Lorem;
>
> use strict;
> use warnings;
> use vars qw($VERSION);
>
> $VERSION = "0.3";

Factor doesn't have a versioning system yet, or a CPAN. Basically the 'extra' directory in the distribution grows as people contribute.

>
> my $lorem_singleton;
>
> sub new {
>     my $class = shift;
>     $lorem_singleton ||= bless {},$class;
>     return $lorem_singleton;
> }

The Perl class ensures that only one instance of Text::Lorem is ever created. My Factor vocabulary (their term for module) is module-oriented, not object-oriented, so there's no need for singleton object creation. Yes, Factor does have a CLOS-inspired object system, but that's a topic for another time.

>
> sub generate_wordlist {
>     my $self = shift;
>     [ map { s/\W//; lc($_) }split(/\s+/, <DATA>) ];
> }

    : clean ( -- string ) text >lower R/ [^\sA-Za-z]/ "" re-replace ;

Here the sample lorem ipsum text is cleaned and lowercased. (The Perl version also splits it into words at this point; the Factor version does the split in the next word.)

>
> sub wordlist {
>     my $self = shift;
>     $self->{ wordlist } ||= $self->generate_wordlist();
> }

    MEMO: wordlist ( -- array ) clean R/ \s+/ re-split [ >string ] map ;

This word takes the clean ipsum text, generates the wordlist, leaves the array on the stack, and memoizes it so that subsequent calls for the wordlist execute in O(1) time instead of re-cleaning and re-splitting.
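
For Perl readers, a rough counterpart to MEMO: is the core Memoize module. The sketch below is only an illustration and is not Text::Lorem's code: the stand-in text, the $calls counter, and the free-standing wordlist() sub are all assumptions made for the example.

```perl
use strict;
use warnings;
use Memoize;

# Stand-in text (assumption); the real module reads its text from the DATA section.
my $text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

my $calls = 0;    # counts how often the body of wordlist() really runs

sub wordlist {
    $calls++;
    # Lowercase, drop everything except letters and whitespace, then split.
    (my $clean = lc $text) =~ s/[^\sa-z]//g;
    return [ split /\s+/, $clean ];
}

memoize('wordlist');    # later calls return the cached array reference

my $first  = wordlist();
my $second = wordlist();

print scalar(@$first), " words, body ran $calls time(s)\n";
```

Even though wordlist() is called twice, the counter should show that the cleaning and splitting ran only once, which is the same effect MEMO: gives the Factor word.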

>
> sub wordcount {
>     my $self = shift;
>     return scalar(@{$self->{ wordlist }});
> }

Wordcount is a function designed to help with random indexing into the wordlist. The random vocabulary in Factor is similar to Python in that it has sampling functions, so you don't have to do this sort of thing: you just pass the sampling function the array and the number of elements you want and you are done.

Notice that wordcount() recounts on every call. Perl arrays know their own length, so scalar(@array) is cheap, but there is still no need to recompute anything for a statically computed list. Perhaps the author should have hauled in Memoize so that both his wordlist() and wordcount() functions could benefit from it instead of using his homegrown memoization. Either that, or wordcount should have been set when constructing the wordlist:

$self->wordcount(...)

so that it would only be accessed (not computed) later.

>
> sub get_word {
>     my $self = shift;
>     return $self->wordlist->[ int( rand( $self->wordcount ) ) ];
> }

    : getword ( -- string ) wordlist random ;

random is used when you want one element of a list. For getting several elements, you use the sample word, as we shall see.

Notice how Perl is constantly getting arguments and accessing slots. Because Factor is stack-based, each word simply consumes what it expects from the stack, so a lot of the time there is less argument fiddling. But you will find yourself doing stack fiddling in Factor instead, such as when I call swap below to switch the order of stack elements before calling sample.

>
> sub words {
>     my $self = shift;
>     my $num = shift;
>     my @words;
>     push @words, $self->get_word() for (1..$num);
>     return join(' ', @words);
> }
>

    : getwords ( n -- array ) wordlist swap sample ;

The Perl routine is named "words", yet it returns a single string of words. I don't think this is properly factored; I think this routine should have returned an array of words. But I understand the author's intention.
He wanted to make a simple public API, and most people are going to want a string of words, not an array.

Anyway, it is supposed to get n random words. In Factor's random vocabulary, the sample word expects an array and a number IN THAT ORDER. Because the number is already on the stack, we place the wordlist on the stack and then call swap so they are in the right order for sample, and we are done.

You almost wish there were some sort of stack-fiddling shorthand so that getwords could be written as:

    : getwords ( n -- array ) wordlist sample(1,0) ;

where 1,0 indicates that positions 0 and 1 on the stack need to be switched. The advantage of this approach is that you can specify complicated reorderings with numbers instead of remembering the various shuffle words.

Another idea is to overload the stack signature:

    : getwords ( n -swap- array ) wordlist sample ;

indicating that a swap should happen after wordlist is added to the stack.

> sub get_sentence {
>     my $self = shift;
>     my $words = $self->words( 4 + int( rand( 6 ) ) );
>     ucfirst( $words );
> }
>

    : sentence ( -- string ) sentencestring ucfirst "." append ;

This Factor code reads basically like English: you take a sentencestring, ucfirst it, then append a period! Factor encourages you to write small, understandable words like that. Here are the auxiliary words that sentence uses:

    : sentencestring ( -- string ) sentencewords " " join ;

    : sentencewords ( -- array ) 4 10 [a,b] random getwords ;

    : ucfirst ( string -- string ) 1 cut [ >upper ] dip append ;

So while Perl had 1 subroutine, it was natural in Factor to decompose it into 4 words.

Just another brief note: "[a,b]" above is nothing special. It's just a word. As in Lisp, many characters not commonly encountered in identifiers are acceptable as word characters in Factor.

> sub sentences {
>     my $self = shift;
>     my $num = shift;
>     my @sentences;
>     push @sentences, $self->get_sentence for (1..$num);
>     join( '. ', @sentences ) . '.';
> }

    : sentences ( n -- array ) [ sentence ] replicate ;

A sentence is a string with a ucfirst()ed first character and a period. So this Perl code, which adds the periods only while joining, is again not really factored correctly. A routine to create sentences should return an array of sentences.

Perl people seeing "[ sentence ]" will read it as some sort of array construct, because Perl anonymous arrays are formed using square brackets. However, that is not what is happening. What you see there is a "quoted word" (Factor calls it a quotation). Normally, when Factor encounters a word, it evaluates it immediately; quoting it defers the call. So here we are creating something like a Perl closure that we can call many times, collecting the output into an array (that's what replicate does for us).

But again, I also think that the plural name of the subroutine implies that an array of sentences should be returned, because an array of sentences is much more tractable for later manipulation than a joined string. As the saying goes, once you make chocolate milk, it's hard to get the chocolate or milk out later!

>
> sub get_paragraph {
>     my $self = shift;
>     my $sentences = $self->sentences(3 + int( rand( 4 ) ) );
>
> }

    : paragraph ( -- string ) 3 7 [a,b] random sentences " " join ;

Wow, is this English or what: "give me between 3 and 7 random sentences joined by a space".

>
> sub paragraphs {
>     my $self = shift;
>     my $num = shift;
>     my @paragraphs;
>     push @paragraphs, $self->get_paragraph for (1..$num);
>     join( "\n\n", @paragraphs );
> }

    : paragraphs ( n -- array ) [ paragraph ] replicate ;

Well, I hope you enjoyed this Perl/Factor comparison article. You can learn more about Factor at http://www.FactorCode.org