On Thursday, January 21, 2016 at 6:54:54 PM UTC-5, Neil Van Dyke wrote:
> If I understand correctly, you're ultimately looking for a general way 
> that you can write this kind of record processing code simply in the 
> future.  And that, right now, you're investing some one-time 
> experimental effort, to assess feasibility and to find an 
> approach/guidelines that you can apply more simply in the future.

I think I'm past the stage of assessing the feasibility of Racket for my work 
in general - I simply enjoy expressing code in Racket more than in other 
languages. This particular project was really about exploring the boundaries of 
Racket with respect to performance (and parallelization) - in other words, 
figuring out which tasks are appropriate for Racket and which may be better 
suited to something like C.

I primarily develop web applications, but most projects involve some sort of 
data import, whether from files, web services, or the like. For most of these, 
though, performance isn't a huge factor, so a simple, straightforward, 
sequential solution in Racket would be fine.

This case study involves one atypical program that imports 45 million criminal 
records four times a year. Even for that, it's only mildly annoying that it 
takes a while, and the Postgres index creation takes longer than generating the 
bulk import file anyway. So it's more of an excuse to explore with a real-world 
program than a bona fide need for absolute performance.

> Regarding feasibility, unless I misunderstand this pilot application, I 
> think that it could be done in Racket in a way that scales almost 
> perfectly with each additional core added, limited only by the ultimate 
> filesystem I/O.  That might involve Places, perhaps with larger work 
> units or more efficient communication and coordination; 

Yes, I think "more efficient communication and coordination" is the key, and I 
expect that's going to be a learning experience for me. I would probably get 
better parallel results by coding up an Othello or Chess game and parallelizing 
the game tree search.

> [...]
> Going back to "simply", rather than 
> simply-after-upfront-hard-work-done-by-application-programmer, maybe 
> there's opportunity for 
> simply-after-further-hard-work-done-by-core-Racket-programmers... For 
> example, perhaps some of the core Racket string routines could be 
> optimized further (I imagine they've already got a working-over, when 
> Unicode was added), so that even simple programs run faster. And maybe 
> there are Places facilities that could be optimized further, or some new 
> facility added.  And maybe there's a research project for better 
> parallelizing support.

I may be interested in further researching "places" performance later in a more 
organized and helpful manner vs. my current seat-of-the-pants, 
throw-up-some-mud-and-see-what-sticks approach :)

> BTW, there might still be a few relatively simple efficiency tweaks in 
> your current approach (e.g., while skimming, I think I saw a snippet of 
> code doing something like `(write-bytes (bytes-append ...))`, perhaps to 
> try to keep a chunk of bytes contiguous, for interleaving with other 
> threads' writes).

I used bytes-append to join the fields with tab separators and add a trailing 
newline when formatting each output record. I suppose I could use bytes-join 
instead and append the newline manually.

> > If the work was parsing HTML or JSON, then the places version would 
> > probably be worth it on a 4 core machine.
> 
> For HTML and JSON parsing, unlike your records application, I think the 
> parser itself has to be one thread, but you could probably put some 
> expensive application-specific behavior that happens during the parse in 
> other threads.  Neither my HTML nor JSON parsers was designed to be used 
> that way, but my streaming JSON parser might be amenable to it. The HTML 
> parser is intended to build a potentially-big AST in one shot, so no 
> other threads while it's working, though it should be reasonably fast 
> about it (it was written on a 166MHz Pentium laptop with 48MB RAM, 
> usually on battery power).

I simply meant that the unit of work for parsing HTML or JSON (i.e., one web 
page or one JSON document) is probably large enough to warrant the overhead of 
copying the bytes to a place, whereas my program copies individual lines.

A streaming parser would likely wind up with the same problem as my program - 
one line of HTML or JSON is too fine-grained to be worth copying to a place for 
parsing.
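
To make the granularity point concrete, here's a minimal sketch (not the actual 
program from this thread) of a place that receives one whole document per 
message, so the cross-place copy is paid per document rather than per line. The 
"parsing" is just a bytes-length stand-in, and 'done is an arbitrary sentinel:

#lang racket
(require racket/place)

(provide main)

(define (main)
  (define p
    (place ch
      ;; Each message is an entire document as a byte string.
      (let loop ()
        (define doc (place-channel-get ch))
        (unless (eq? doc 'done)
          ;; Stand-in for the real JSON/HTML parsing work.
          (place-channel-put ch (bytes-length doc))
          (loop)))))
  (place-channel-put p #"{\"example\": \"one whole JSON document\"}")
  (displayln (place-channel-get p))
  (place-channel-put p 'done)
  (place-wait p))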

Brian
