Good call. I forgot to factor in the BatchScanner threads :). I guess
using one thread in each BatchScanner would make for a more accurate
comparison.
Although, I only had one TServer, so I don't *think* there would be any
difference. I don't believe we have concurrent requests from one
BatchScanner to one TServer.
Dylan Hutchison wrote:
Nice setup Josh. Thank you for putting together the tests. A few
questions:
The serial scanner implementation uses 6 threads: one for each thread in
the thread pool.
The batch scanner implementation uses 60 threads: 10 for each thread in
the thread pool, since the BatchScanner was configured with 10 threads
and there are 10 (9?) tablets.
Isn't 60 threads of communication naturally inefficient? I wonder if we
would see the same performance if we set each BatchScanner to use 1 or 2
threads.
Maybe this would motivate a /MultiTableBatchScanner/, which maintains a
fixed number of threads across any number of concurrent scans, possibly
to the same table.
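To make the idea concrete, here is a minimal sketch of the underlying pattern only: several concurrent scans hand their lookup work to one shared, fixed-size pool instead of each scan owning 10 threads. Nothing here is Accumulo API. SharedScanPool, submit and peakConcurrency are hypothetical names, and the Runnable stands in for whatever per-tablet lookup a real implementation would do.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: one bounded pool shared by any number of scans.
class SharedScanPool {
  private final ExecutorService pool;
  private final AtomicInteger active = new AtomicInteger();
  private final AtomicInteger peak = new AtomicInteger();

  SharedScanPool(int threads) {
    pool = Executors.newFixedThreadPool(threads);
  }

  /** Queue one unit of lookup work (e.g. the ranges for one tablet). */
  Future<?> submit(Runnable lookup) {
    return pool.submit(() -> {
      // Track how many units run at once so the bound can be checked.
      int now = active.incrementAndGet();
      peak.accumulateAndGet(now, Math::max);
      try {
        lookup.run();
      } finally {
        active.decrementAndGet();
      }
    });
  }

  /** Highest number of units observed running at the same time. */
  int peakConcurrency() {
    return peak.get();
  }

  void close() {
    pool.shutdown();
  }
}
```

With 6 scans each submitting 10 units to a SharedScanPool(4), at most 4 units run at any moment, rather than the 60 threads counted above.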
On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser wrote:
Sven, et al:
So, it would appear that I have been able to reproduce this one
(better late than never, I guess...). tl;dr Serially using Scanners
to do point lookups instead of a BatchScanner is ~20x faster. This
sounds like a pretty serious performance issue to me.
Here's a general outline for what I did.
* Accumulo 1.8.0
* Created a table with 1M rows, each row with 10 columns using YCSB
(workloada)
* Split the table into 9 tablets
* Computed the set of all rows in the table
For a number of iterations:
* Shuffle this set of rows
* Choose the first N rows
* Construct an equivalent set of Ranges from the set of Rows,
choosing a random column (0-9)
* Partition the N rows into X collections
* Submit X tasks to query one partition of the N rows (to a thread
pool with X fixed threads)
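The per-iteration steps above can be sketched roughly as follows. This is a stdlib-only outline, not the code from the test repository. PointLookupHarness and its method names are made up for illustration, the "field" column prefix is an assumption about the YCSB schema, and the lookup function is a placeholder for the actual Scanner or BatchScanner query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical outline of one benchmark iteration.
class PointLookupHarness {

  /** Shuffle a copy of all rows and keep the first n. */
  static List<String> chooseRows(List<String> allRows, int n, Random random) {
    List<String> shuffled = new ArrayList<>(allRows);
    Collections.shuffle(shuffled, random);
    return new ArrayList<>(shuffled.subList(0, Math.min(n, shuffled.size())));
  }

  /** Pick a random column 0-9 (the "field" prefix is an assumption). */
  static String randomColumn(Random random) {
    return "field" + random.nextInt(10);
  }

  /** Deal items round-robin into x partitions. */
  static <T> List<List<T>> partition(List<T> items, int x) {
    List<List<T>> partitions = new ArrayList<>();
    for (int i = 0; i < x; i++) {
      partitions.add(new ArrayList<>());
    }
    for (int i = 0; i < items.size(); i++) {
      partitions.get(i % x).add(items.get(i));
    }
    return partitions;
  }

  /** Run one task per partition on a fixed pool of X threads; returns elapsed ms. */
  static long runPartitions(List<List<String>> partitions,
                            Function<List<String>, Integer> lookup) {
    ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
    try {
      List<Callable<Integer>> tasks = new ArrayList<>();
      for (List<String> p : partitions) {
        tasks.add(() -> lookup.apply(p));
      }
      long start = System.nanoTime();
      for (Future<Integer> f : pool.invokeAll(tasks)) {
        f.get(); // wait for every partition to finish
      }
      return (System.nanoTime() - start) / 1_000_000;
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Swapping the lookup function passed to runPartitions is the only difference between the two variants being compared; the shuffling, partitioning and thread pool stay the same.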
I have two implementations of these tasks: one where all ranges in
a partition are executed via one BatchScanner, and a second where each
range is executed serially using a Scanner. The numbers speak for
themselves.
** BatchScanners **
2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled
all rows
2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All
ranges calculated: 3000 ranges found
2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 40178 ms
2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 42296 ms
2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 46094 ms
2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 47704 ms
2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 49221 ms
** Scanners **
2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled
all rows
2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All
ranges calculated: 3000 ranges found
2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 2833 ms
2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 2536 ms
2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 2150 ms
2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 2061 ms
2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO :
Executing 6 range partitions using a pool of 6 threads
2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries
executed in 2140 ms
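For reference, the ratio can be checked directly from the ten "Queries executed in" timings logged above. The snippet below just averages each set of five and divides; SpeedupEstimate is an illustrative name, not part of the test code.

```java
import java.util.Arrays;

// Averages the logged timings and reports the BatchScanner/Scanner ratio.
class SpeedupEstimate {

  static double mean(long[] millis) {
    return Arrays.stream(millis).average().orElse(Double.NaN);
  }

  static double speedup(long[] slow, long[] fast) {
    return mean(slow) / mean(fast);
  }

  public static void main(String[] args) {
    // Values copied from the log output above.
    long[] batchScannerMs = {40178, 42296, 46094, 47704, 49221};
    long[] scannerMs = {2833, 2536, 2150, 2061, 2140};

    System.out.printf("BatchScanner mean: %.1f ms%n", mean(batchScannerMs));
    System.out.printf("Scanner mean:      %.1f ms%n", mean(scannerMs));
    System.out.printf("Ratio:             %.1fx%n", speedup(batchScannerMs, scannerMs));
  }
}
```

The means work out to about 45.1 s for the BatchScanner runs and about 2.3 s for the Scanner runs, a ratio of roughly 19x.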
Query code is available at
https://github.com/joshelser/accumulo-range-binning
Sven Hodapp wrote:
Hi Keith,
I've tried it with 1, 2 or 10 threads. Unfortunately there were
no significant differences.
Maybe it's a problem with the table structure? For example it
may happen that one row id (e.g. a sentence) has several
thousand column families. Can this affect the seek performance?
So for my initial example it has about 3000 row ids to seek,
which will return about 500k entries. If I filter for specific
column families (e.g. a document without annotations) it will
return about 5k entries, but the seek time will only be halved.
Are there too many column families for it to seek quickly?
Thanks!
Regards,
Sven