Robert, all,

* Robert Haas wrote:
> There is a considerable amount of variation in the amount of time this
> takes to run based on how much of the relation is cached.  Clearly,
> there's no way for the system to cache it all, but it can cache a
> significant portion, and that affects the results to no small degree.
> dd on hydra prints information on the data transfer rate; on uncached
> 1GB segments, it runs at right around 400 MB/s, but that can soar to
> upwards of 3GB/s when the relation is fully cached.  I tried flushing
> the OS cache via echo 1 > /proc/sys/vm/drop_caches, and found that
> immediately after doing that, the above command took 5m21s to run -
> i.e. ~321000 ms.  Most of your test times are faster than that, which
> means they reflect some degree of caching.  When I immediately reran
> the command a second time, it finished in 4m18s the second time, or
> ~258000 ms.  The rate was the same as the first test - about 400 MB/s
> - for most of the files, but 27 of the last 28 files went much faster,
> between 1.3 GB/s and 3.7 GB/s.


> With 0 workers, first run took 883465.352 ms, and second run took
> 295050.106 ms.
> With 8 workers, first run took 340302.250 ms, and second run took
> 307767.758 ms.
> This is a confusing result, because you expect parallelism to help
> more when the relation is partly cached, and make little or no
> difference when it isn't cached.  But that's not what happened.

These numbers seem to indicate that the oddball is the single-threaded
uncached run.  If I followed correctly, the uncached 'dd' took 321s,
which is relatively close to the uncached 8-worker run (340s) and the
two cached runs (295s and 308s).  What in the world is the uncached
single-threaded case doing that it takes an extra 543s compared to the
uncached 8-worker run, or over twice as long?  It's clearly not disk
i/o that is causing the slowdown, based on your dd tests.
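
For anyone who wants to check the arithmetic, here is a small sketch
that only restates the timings quoted above (the names in it are made
up for illustration) and compares each run against the 8-worker first
run and against the uncached dd:

```python
# Timings copied from the figures quoted above, in milliseconds.
RUNS_MS = {
    "0 workers, first run": 883465.352,
    "0 workers, second run": 295050.106,
    "8 workers, first run": 340302.250,
    "8 workers, second run": 307767.758,
}
DD_UNCACHED_S = 321.0  # 5m21s for the uncached dd

RUNS_S = {label: ms / 1000.0 for label, ms in RUNS_MS.items()}


def compare(runs_s, baseline_s):
    """Return {label: (extra_seconds, ratio)} relative to baseline_s."""
    return {
        label: (t - baseline_s, t / baseline_s)
        for label, t in runs_s.items()
    }


for name, baseline in (
    ("8 workers, first run", RUNS_S["8 workers, first run"]),
    ("uncached dd", DD_UNCACHED_S),
):
    print(f"relative to {name}:")
    for label, (extra, ratio) in compare(RUNS_S, baseline).items():
        print(f"  {label:22s} {extra:+8.1f}s  {ratio:.2f}x")
```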

One possibility might be round-trip latency.  The multi-threaded case is
able to keep both the CPUs and the i/o system busy, and the cached runs
don't see as much latency since the data is already in memory, but the
single-threaded uncached case alternates i/o -> cpu -> i/o -> cpu and
ends up with a lot of wait time as it switches between being on CPU and
waiting on the i/o.
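
To illustrate what I mean, here is a toy model, not a measurement.
It assumes every block costs a fixed i/o time and a fixed CPU time,
and compares strictly alternating the two against overlapping them
perfectly.  The function names and the per-block costs are invented
for the example, and it ignores OS readahead and anything else the
real system does:

```python
def serial_time(n_blocks, io_cost, cpu_cost):
    """Total time when each block is read, then processed, with no
    overlap: i/o -> cpu -> i/o -> cpu ..."""
    return n_blocks * (io_cost + cpu_cost)


def overlapped_time(n_blocks, io_cost, cpu_cost):
    """Total time for an idealized two-stage pipeline where the read of
    the next block overlaps the processing of the current one."""
    if n_blocks == 0:
        return 0.0
    # The first block fills the pipeline; after that the slower stage
    # sets the pace.
    return io_cost + cpu_cost + (n_blocks - 1) * max(io_cost, cpu_cost)


if __name__ == "__main__":
    # Hypothetical per-block costs in arbitrary time units.
    n, io_cost, cpu_cost = 1000, 1.0, 1.0
    s = serial_time(n, io_cost, cpu_cost)
    o = overlapped_time(n, io_cost, cpu_cost)
    print(f"serial: {s:.0f}  overlapped: {o:.0f}  ratio: {s / o:.2f}")
```

When the two costs are similar, the strictly alternating case comes
out close to twice as slow as the overlapped one in this model, which
is at least the right general shape for what you measured.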

Just some thoughts.


