Bulat Ziganshin wrote:

Weird thing #1: The first time you sort the data, it takes a few
seconds. The other 7 times, it takes a split second - roughly 100x faster. Wuh?

this looks like disk caching effects. if data are read from disk on
first run and from disk cache on the next runs, this only means that
your algorithm works faster than reading its data from disk

Negative. No data is ever *read* from disk, only *written* to disk. (And each test writes to a different file.)

The data to be sorted is generated using a trivial LCG PRNG.
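
For readers unfamiliar with the term, a linear congruential generator of the kind mentioned above can be sketched in a few lines. This is only an illustration: the names lcgStep and lcgList are hypothetical, and the multiplier and increment are the well-known Numerical Recipes constants, assumed here because the post does not say which constants were actually used.

```haskell
import Data.Word (Word32)

-- One step of a linear congruential generator: x' = a*x + c (mod 2^32).
-- The modulus is implicit in Word32 arithmetic, which wraps on overflow.
-- ASSUMPTION: these constants are illustrative, not taken from the post.
lcgStep :: Word32 -> Word32
lcgStep x = 1664525 * x + 1013904223

-- The first n pseudo-random values produced from a seed
-- (the seed itself is not included in the output).
lcgList :: Int -> Word32 -> [Word32]
lcgList n seed = take n (tail (iterate lcgStep seed))
```

Note that a list built this way is lazy, so nothing is computed until something (such as a sort) demands the elements.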

there are plenty of reasons: first, -threaded makes i/o overlap
with calculations.

Not with -N1.
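
One way to confirm what the runtime is actually doing under a given -N setting is to ask it directly. The sketch below is an assumption about how one might check, not something from the original program; runtimeSummary and reportRuntime are hypothetical names, while getNumCapabilities and rtsSupportsBoundThreads come from Control.Concurrent in base.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)

-- Pure formatting helper (hypothetical name), kept separate so it is
-- easy to test without touching the runtime.
runtimeSummary :: Bool -> Int -> String
runtimeSummary threaded caps =
  (if threaded then "threaded RTS" else "non-threaded RTS")
    ++ ", capabilities = " ++ show caps

-- Print whether the program was linked with -threaded and how many
-- capabilities (the value set by +RTS -N) the runtime is using.
reportRuntime :: IO ()
reportRuntime = do
  caps <- getNumCapabilities
  putStrLn (runtimeSummary rtsSupportsBoundThreads caps)
```

Calling reportRuntime at program start would show "capabilities = 1" for a run with +RTS -N1.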

second, the parallel version may exhibit better cpu
cache behavior - such as processing all data in cache before sending
it back to memory

Again, with -N1, it is *still* only using 1 CPU core.

Weird thing #4: Adding "-N2" makes *everything* slow down a few percent.
In particular, Task Manager shows only one CPU core in use.

it's easy - your algorithm isn't really parallel.

Fails to explain why the parallel version is faster than the sequential one (even with no parallelism), or why the sequential algorithm should slow down with more threads. (Surely the extra threads just sit idle?)

there are many subtle effects making optimization much more interesting
than using simple schemas ;)  it's why i like it so much :))

Well, based on the results I've seen so far, it seems that parallelism is a complete waste of time because it doesn't gain you anything. And that doesn't make a lot of sense...

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe