I tested a large scan on a 1.6.2 cluster with 11 tablet servers - using Scanners was much slower than using a BatchScanner with 11 threads, by about a 5:1 ratio. There were 450 ranges.
-----Original Message-----
From: Josh Elser [mailto:[email protected]]
Sent: Monday, September 12, 2016 1:42 PM
To: [email protected]
Subject: Re: Accumulo Seek performance

I had increased the readahead thread pool to 32 (from 16). I had also increased
the minimum thread pool size from 20 to 40. I had 10 tablets with the data block
cache turned on (probably only 256M tho). Each tablet had a single file
(manually compacted). Did not observe cache rates.

I've been working through this with Keith on IRC this morning too. Found that a
single BatchScanner (one partition) is faster than the Scanner. Two partitions
and things started to slow down.

Two interesting points still to pursue, IMO:

1. I saw that the tserver-side logging for MultiScanSess was nearly identical
   to the BatchScanner timings.
2. The minimum server threads did not seem to be taking effect. Despite having
   the value set to 64, I only saw a few ClientPool threads in a jstack after
   running the test.

Adam Fuchs wrote:
> Sorry, Monday morning poor reading skills, I guess. :)
>
> So, 3000 ranges in 40 seconds with the BatchScanner. In my past experience
> HDFS seeks tend to take something like 10-100ms, and I would expect that time
> to dominate here. With 60 client threads your bottleneck should be the
> readahead pool, which I believe defaults to 16 threads. If you get perfect
> index caching then you should be seeing something like 3000/16*50ms =
> 9,375ms. That's in the right ballpark, but it assumes no data cache hits. Do
> you have any idea of how many files you had per tablet after the ingest? Do
> you know what your cache hit rate was?
>
> Adam
>
> On Mon, Sep 12, 2016 at 9:14 AM, Josh Elser <[email protected]> wrote:
>
> > 5 iterations, figured that would be apparent from the log messages :)
> >
> > The code is already posted in my original message.
> >
> > Adam Fuchs wrote:
> >
> > > Josh,
> > >
> > > Two questions:
> > >
> > > 1. How many iterations did you do?
> > > I would like to see an absolute number of lookups per second to compare
> > > against other observations.
> > >
> > > 2. Can you post your code somewhere so I can run it?
> > >
> > > Thanks,
> > > Adam
> > >
> > > On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser <[email protected]> wrote:
> > >
> > > > Sven, et al:
> > > >
> > > > So, it would appear that I have been able to reproduce this one (better
> > > > late than never, I guess...). tl;dr Serially using Scanners to do point
> > > > lookups instead of a BatchScanner is ~20x faster. This sounds like a
> > > > pretty serious performance issue to me.
> > > >
> > > > Here's a general outline for what I did.
> > > >
> > > > * Accumulo 1.8.0
> > > > * Created a table with 1M rows, each row with 10 columns using YCSB
> > > >   (workloada)
> > > > * Split the table into 9 tablets
> > > > * Computed the set of all rows in the table
> > > >
> > > > For a number of iterations:
> > > > * Shuffle this set of rows
> > > > * Choose the first N rows
> > > > * Construct an equivalent set of Ranges from the set of Rows, choosing
> > > >   a random column (0-9)
> > > > * Partition the N rows into X collections
> > > > * Submit X tasks to query one partition of the N rows (to a thread pool
> > > >   with X fixed threads)
> > > >
> > > > I have two implementations of these tasks: one where all ranges in a
> > > > partition are executed via one BatchScanner, and a second where each
> > > > range is executed serially using a Scanner. The numbers speak for
> > > > themselves.
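[Editor's note: the outline above (shuffle, take the first N rows, partition into X collections, submit X tasks to a fixed pool) can be sketched in plain Java. The Accumulo lookups themselves are stubbed out here, so this only illustrates the harness shape; class and variable names are mine, not from the linked repo.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java sketch of the benchmark harness described above. The actual
// Accumulo lookups (one BatchScanner per partition vs. one Scanner per range)
// are stubbed out with a comment, since this only shows the harness.
public class RangeBinningHarness {

    // Round-robin the shuffled rows into X partitions, one per worker thread.
    static <T> List<List<T>> partition(List<T> rows, int parts) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            out.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            out.get(i % parts).add(rows.get(i));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        final int n = 3000, x = 6;

        // Stand-in for "the set of all rows in the table".
        List<Integer> allRows = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            allRows.add(i);
        }

        ExecutorService pool = Executors.newFixedThreadPool(x);
        for (int iter = 0; iter < 5; iter++) {
            Collections.shuffle(allRows);
            List<Integer> chosen = allRows.subList(0, n);

            long start = System.currentTimeMillis();
            List<Future<?>> futures = new ArrayList<>();
            for (List<Integer> part : partition(chosen, x)) {
                futures.add(pool.submit(() -> {
                    // Real test: build a Range per row in `part`, then either
                    // call batchScanner.setRanges(ranges) once for the whole
                    // partition, or scanner.setRange(r) serially per range.
                }));
            }
            for (Future<?> f : futures) {
                f.get();
            }
            System.out.println("Queries executed in "
                + (System.currentTimeMillis() - start) + " ms");
        }
        pool.shutdown();
    }
}
```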
> > > > ** BatchScanners **
> > > > 2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
> > > > 2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
> > > > 2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries executed in 40178 ms
> > > > 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries executed in 42296 ms
> > > > 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries executed in 46094 ms
> > > > 2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries executed in 47704 ms
> > > > 2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries executed in 49221 ms
> > > >
> > > > ** Scanners **
> > > > 2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
> > > > 2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
> > > > 2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2833 ms
> > > > 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2536 ms
> > > > 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2150 ms
> > > > 2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2061 ms
> > > > 2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> > > > 2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2140 ms
> > > >
> > > > Query code is available at
> > > > https://github.com/joshelser/accumulo-range-binning
> > > >
> > > > Sven Hodapp wrote:
> > > >
> > > > > Hi Keith,
> > > > >
> > > > > I've tried it with 1, 2, or 10 threads. Unfortunately there were no
> > > > > notable differences. Maybe it's a problem with the table structure?
> > > > > For example, it may happen that one row id (e.g. a sentence) has
> > > > > several thousand column families. Can this affect the seek
> > > > > performance?
> > > > >
> > > > > So for my initial example it has about 3000 row ids to seek, which
> > > > > will return about 500k entries. If I filter for specific column
> > > > > families (e.g. a document without annotations) it will return about
> > > > > 5k entries, but the seek time will only be halved. Are there too many
> > > > > column families to seek quickly?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Regards,
> > > > > Sven
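[Editor's note: a toy cost model helps explain why filtering column families can cut the returned entries 100x while only halving Sven's wall-clock time: each range pays a fixed per-seek cost regardless of how many entries survive the filter. The constants below are invented for illustration and are not measured Accumulo internals.]

```java
// Toy cost model (illustrative only, not Accumulo internals):
//   total time = ranges * per-seek cost + entries * per-entry cost
// Filtering column families shrinks `entries` but not `ranges`, so the
// per-seek term puts a floor under the total.
public class SeekCostModel {

    static double totalMs(int ranges, long entries, double seekMs, double perEntryMs) {
        return ranges * seekMs + entries * perEntryMs;
    }

    public static void main(String[] args) {
        double seekMs = 1.0;       // hypothetical fixed cost per seek
        double perEntryMs = 0.006; // hypothetical cost per returned entry

        // Sven's numbers: 3000 seeks; ~500k entries unfiltered vs ~5k filtered.
        double unfiltered = totalMs(3000, 500_000, seekMs, perEntryMs); // ~6000 ms
        double filtered = totalMs(3000, 5_000, seekMs, perEntryMs);     // ~3030 ms

        System.out.printf("unfiltered=%.0f ms, filtered=%.0f ms%n", unfiltered, filtered);
        // 100x fewer entries returned, but only ~2x faster: the 3000 seeks dominate.
    }
}
```

Under this model, returning fewer entries can never help more than eliminating seeks does, which is consistent with the seek time "only" halving.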
