> In hindsight, I guess that's not too surprising since the majority of work
> is copying bytes from one place to another anyway (soundex isn't that
> slow), so if the parallel version has to copy the line twice when the work
> is ~ copying the line once, it's going to be expensive getting the data
> into and out of places.

I didn't read the implementation of places, so some details here may be
wrong ...

Sending some bytes from one place to another probably involves some kind of
lock or semaphore to avoid race conditions. Also, the data has to be copied
from one core of the processor to another (and perhaps out to main memory).
This is very cache-unfriendly, so it destroys the automatic speedup you get
from the processor's internal caches. So I expect a lot of overhead for
each call.
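
A rough way to picture this is a toy cost model. It is only a sketch under
my own assumptions, not a measurement of how places behave: each message
needs some real work that is split evenly across the workers, plus a fixed
per-message transfer overhead (locking, copying, cache misses) that is paid
serially. The function names and the 5 microsecond overhead are
hypothetical and chosen only for illustration.

```python
# Toy cost model (an assumption for illustration, not a measurement of
# Racket places). Work is split across workers; the per-message transfer
# overhead is assumed to be paid serially.

def parallel_speedup(work_us: float, overhead_us: float, workers: int) -> float:
    """Return sequential time / parallel time per message under the toy model."""
    sequential = work_us
    parallel = work_us / workers + overhead_us
    return sequential / parallel


def break_even_work(overhead_us: float, workers: int) -> float:
    """Work per message at which the parallel version only ties the sequential one."""
    # work = work / workers + overhead  =>  work = overhead * workers / (workers - 1)
    return overhead_us * workers / (workers - 1)


if __name__ == "__main__":
    overhead = 5.0  # hypothetical per-message overhead, in microseconds
    for work in (5.0, 10.0, 50.0, 500.0):
        print(work, round(parallel_speedup(work, overhead, workers=4), 2))
    print("break-even work:", round(break_even_work(overhead, workers=4), 2))
```

Under this model the speedup only approaches the number of workers when the
work per message is much larger than the overhead, and below the break-even
point the parallel version loses outright. Real overhead is unlikely to be
this tidy, so treat the shape of the curve as the point, not the numbers.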

Gustavo




On Thursday, January 21, 2016, Brian Adkins <lojicdot...@gmail.com> wrote:

> On Thursday, January 21, 2016 at 2:53:48 PM UTC-5, Brian Adkins wrote:
> >
> > Ok, so the huge place channel for each worker isn't the main issue. I
> changed the code so the output process sends a message to the main process
> every N messages, and the main process waits on a message from the output
> process every N messages. This has the effect of setting a limit on the
> place channels.
> >
> > Through some trial and error, I discovered that waiting every 8,000
> messages provides the best performance. So, the size of the place channels
> for each worker is < 8000 / num-workers (currently 4 workers), and the size
> of the place channel for the output process is < 8,000 messages.
> >
> > I don't know the exact count since I don't know the ratio of consumer to
> producer.
> >
> > My input lines are all exactly 300 bytes, so the output channel has <
> 2.3 MB and each worker has < 600 KB
> >
> > Current elapsed time for sequential is 2.467 s and places is 3.732 s,
> more than 50% slower.
> >
> > Limiting the place channel size reduced GC time also:
> >
> > cpu time: 13249 real time: 3012 gc time: 280
> >
> > Ratio of CPU time to real time is 4.4 which is good, but minimizing the
> elapsed time is the goal.
> >
> > So, in summary, copying byte strings to other places is too expensive in
> my scenario.
>
> I did some more experimenting by adding a sleep to the processing function
> to see how expensive the processing function had to be for the sequential
> and places versions to be equal.
>
> Adding a 2.1 microsecond sleep gives the following:
>
> Sequential:
> cpu time: 3574 real time: 3887 gc time: 77 (operating system elapsed =
> 4.321s)
>
> Places:
> cpu time: 13791 real time: 2958 gc time: 231 (operating system elapsed =
> 3.659s)
>
> Less sleep, e.g. 2.0 microseconds, puts the sequential version ahead.
>
> I then increased the sleep as follows:
>
> 17.5 microseconds => places ~ 1/2 as long as sequential
> 32.0 microseconds => places ~ 1/3 as long as sequential
> 50.0 microseconds => places ~ 1/4 as long as sequential
> 100 microseconds => places ~ 1/3.8 as long as sequential (only 4 workers)
>
> The work itself is ~ 10 microseconds, so it would appear that a (32+10)=42
> microsecond unit of work is required to get a 3x speed up on 4 cores when
> needing to copy ~ 300 bytes twice into place channels.
>
> In hindsight, I guess that's not too surprising since the majority of work
> is copying bytes from one place to another anyway (soundex isn't that
> slow), so if the parallel version has to copy the line twice when the work
> is ~ copying the line once, it's going to be expensive getting the data
> into and out of places.
>
> If the work was parsing HTML or JSON, then the places version would
> probably be worth it on a 4 core machine.
>
> Brian
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
