On Mon, Sep 12, 2016 at 5:50 PM, Adam J. Shook <[email protected]> wrote:
> As an aside, this is actually pretty relevant to the work I've been doing
> for Presto/Accumulo integration. It isn't uncommon to have around a million
> exact Ranges (that is, Ranges with a single row ID) spread across the five
> Presto worker nodes we use for scanning Accumulo. Right now, these ranges
> get packed into PrestoSplits, 10k ranges per split (an arbitrary number I
> chose), and each split is run in parallel (depending on the overall number
> of splits, they may be queued for execution).
>
> I'm curious to see the query impact of changing it to use a fixed thread
> pool of Scanners over the current BatchScanner implementation. Maybe I'll
> play around with it sometime soon.
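
A fixed pool of Scanners over exact (single-row) Ranges, as described above,
could be sketched roughly like this. It is illustrative only -- the table name
"mytable", the pool size, and the class/method names are placeholders, not code
from the Presto connector:

// Sketch: read exact (single-row) Ranges with a fixed pool of Scanners
// instead of a single BatchScanner. "mytable" and the class name are
// placeholders for illustration.
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class ScannerPoolSketch {

  // Reads every Range with its own Scanner, bounded by a fixed thread pool,
  // and returns the total number of entries seen.
  public static long lookup(final Connector conn, List<Range> ranges, int numThreads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    List<Future<Long>> results = new ArrayList<Future<Long>>();
    for (final Range range : ranges) {
      results.add(pool.submit(new Callable<Long>() {
        @Override
        public Long call() throws Exception {
          long count = 0;
          Scanner scanner = conn.createScanner("mytable", Authorizations.EMPTY);
          scanner.setRange(range);
          for (Entry<Key,Value> entry : scanner) {
            count++;
          }
          return count;
        }
      }));
    }
    long total = 0;
    for (Future<Long> result : results) {
      total += result.get();
    }
    pool.shutdown();
    return total;
  }
}

With a million exact Ranges you would likely want each task to handle a batch
of Ranges rather than a single one, but the structure stays the same: the pool
size bounds concurrency, and each Range gets its own serial seek.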
I added a readme to Josh's GH repo w/ the info I learned from Josh on IRC. So
this should make it quicker for others to experiment.

> --Adam
>
> On Mon, Sep 12, 2016 at 2:47 PM, Dan Blum <[email protected]> wrote:
>>
>> I think the 450 ranges returned a total of about 7.5M entries, but the
>> ranges were in fact quite small relative to the size of the table.
>>
>> -----Original Message-----
>> From: Josh Elser [mailto:[email protected]]
>> Sent: Monday, September 12, 2016 2:43 PM
>> To: [email protected]
>> Subject: Re: Accumulo Seek performance
>>
>> What does a "large scan" mean here, Dan?
>>
>> Sven's original problem statement was running many small/pointed Ranges
>> (e.g. point lookups). My observation was that BatchScanners were slower
>> than running each in a Scanner when using multiple BS's concurrently.
>>
>> Dan Blum wrote:
>> > I tested a large scan on a 1.6.2 cluster with 11 tablet servers - using
>> > Scanners was much slower than using a BatchScanner with 11 threads, by
>> > about a 5:1 ratio. There were 450 ranges.
>> >
>> > -----Original Message-----
>> > From: Josh Elser [mailto:[email protected]]
>> > Sent: Monday, September 12, 2016 1:42 PM
>> > To: [email protected]
>> > Subject: Re: Accumulo Seek performance
>> >
>> > I had increased the readahead thread pool to 32 (from 16). I had also
>> > increased the minimum thread pool size from 20 to 40. I had 10 tablets
>> > with the data block cache turned on (probably only 256M tho).
>> >
>> > Each tablet had a single file (manually compacted). Did not observe
>> > cache rates.
>> >
>> > I've been working through this with Keith on IRC this morning too. Found
>> > that a single BatchScanner (one partition) is faster than the Scanner.
>> > Two partitions and things started to slow down.
>> >
>> > Two interesting points to still pursue, IMO:
>> >
>> > 1. I saw that the tserver-side logging for MultiScanSess was near
>> > identical to the BatchScanner timings.
>> > 2. The minimum server threads did not seem to be taking effect. Despite
>> > having the value set to 64, I only saw a few ClientPool threads in a
>> > jstack after running the test.
>> >
>> > Adam Fuchs wrote:
>> >> Sorry, Monday morning poor reading skills, I guess. :)
>> >>
>> >> So, 3000 ranges in 40 seconds with the BatchScanner. In my past
>> >> experience HDFS seeks tend to take something like 10-100ms, and I would
>> >> expect that time to dominate here. With 60 client threads your
>> >> bottleneck should be the readahead pool, which I believe defaults to 16
>> >> threads. If you get perfect index caching then you should be seeing
>> >> something like 3000/16*50ms = 9,375ms. That's in the right ballpark, but
>> >> it assumes no data cache hits. Do you have any idea of how many files
>> >> you had per tablet after the ingest? Do you know what your cache hit
>> >> rate was?
>> >>
>> >> Adam
>> >>
>> >> On Mon, Sep 12, 2016 at 9:14 AM, Josh Elser <[email protected]> wrote:
>> >>
>> >>     5 iterations, figured that would be apparent from the log messages :)
>> >>
>> >>     The code is already posted in my original message.
>> >>
>> >>     Adam Fuchs wrote:
>> >>
>> >>         Josh,
>> >>
>> >>         Two questions:
>> >>
>> >>         1. How many iterations did you do? I would like to see an
>> >>         absolute number of lookups per second to compare against other
>> >>         observations.
>> >>
>> >>         2. Can you post your code somewhere so I can run it?
>> >>
>> >>         Thanks,
>> >>         Adam
>> >>
>> >>         On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser <[email protected]> wrote:
>> >>
>> >>             Sven, et al:
>> >>
>> >>             So, it would appear that I have been able to reproduce this one
>> >>             (better late than never, I guess...). tl;dr Serially using Scanners
>> >>             to do point lookups instead of a BatchScanner is ~20x faster. This
>> >>             sounds like a pretty serious performance issue to me.
>> >>
>> >>             Here's a general outline for what I did.
>> >>
>> >>             * Accumulo 1.8.0
>> >>             * Created a table with 1M rows, each row with 10 columns using YCSB
>> >>               (workloada)
>> >>             * Split the table into 9 tablets
>> >>             * Computed the set of all rows in the table
>> >>
>> >>             For a number of iterations:
>> >>             * Shuffle this set of rows
>> >>             * Choose the first N rows
>> >>             * Construct an equivalent set of Ranges from the set of Rows,
>> >>               choosing a random column (0-9)
>> >>             * Partition the N rows into X collections
>> >>             * Submit X tasks to query one partition of the N rows (to a
>> >>               thread pool with X fixed threads)
>> >>
>> >>             I have two implementations of these tasks. One, where all ranges in
>> >>             a partition are executed via one BatchScanner. A second where each
>> >>             range is executed in serial using a Scanner. The numbers speak for
>> >>             themselves.
>> >>
>> >>             ** BatchScanners **
>> >>             2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
>> >>             2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
>> >>             2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries executed in 40178 ms
>> >>             2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries executed in 42296 ms
>> >>             2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries executed in 46094 ms
>> >>             2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries executed in 47704 ms
>> >>             2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries executed in 49221 ms
>> >>
>> >>             ** Scanners **
>> >>             2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
>> >>             2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
>> >>             2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2833 ms
>> >>             2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2536 ms
>> >>             2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2150 ms
>> >>             2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2061 ms
>> >>             2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2140 ms
>> >>
>> >>             Query code is available at
>> >>             https://github.com/joshelser/accumulo-range-binning
>> >>
>> >>             Sven Hodapp wrote:
>> >>
>> >>                 Hi Keith,
>> >>
>> >>                 I've tried it with 1, 2 or 10 threads. Unfortunately there were
>> >>                 no amazing differences.
>> >>                 Maybe it's a problem with the table structure? For example, it
>> >>                 may happen that one row ID (e.g. a sentence) has several
>> >>                 thousand column families. Can this affect the seek performance?
>> >>
>> >>                 So for my initial example it has about 3000 row IDs to seek,
>> >>                 which will return about 500k entries. If I filter for specific
>> >>                 column families (e.g. a document without annotations) it will
>> >>                 return about 5k entries, but the seek time will only be halved.
>> >>                 Are there too many column families to seek it fast?
>> >>
>> >>                 Thanks!
>> >>
>> >>                 Regards,
>> >>                 Sven
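
For anyone who doesn't want to dig through the linked repository, the two task
implementations Josh compares boil down to roughly the following. This is an
illustrative sketch only, not the code from accumulo-range-binning; the table
name "usertable", the per-BatchScanner thread count, and the class/method names
are assumptions:

// Rough sketch of the two task variants compared above: one BatchScanner
// per partition of Ranges vs. reading the same partition serially with a
// Scanner per Range. "usertable" and the thread count are assumptions.
import java.util.List;
import java.util.Map.Entry;
import java.util.concurrent.Callable;

import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class RangeTaskSketch {

  // Variant 1: all Ranges in the partition handed to a single BatchScanner.
  static Callable<Long> batchScannerTask(final Connector conn, final List<Range> partition) {
    return new Callable<Long>() {
      @Override
      public Long call() throws Exception {
        long count = 0;
        BatchScanner bs = conn.createBatchScanner("usertable", Authorizations.EMPTY, 10);
        try {
          bs.setRanges(partition);
          for (Entry<Key,Value> entry : bs) {
            count++;
          }
        } finally {
          bs.close();
        }
        return count;
      }
    };
  }

  // Variant 2: the same partition, but each Range is read serially with a Scanner.
  static Callable<Long> scannerTask(final Connector conn, final List<Range> partition) {
    return new Callable<Long>() {
      @Override
      public Long call() throws Exception {
        long count = 0;
        for (Range range : partition) {
          Scanner scanner = conn.createScanner("usertable", Authorizations.EMPTY);
          scanner.setRange(range);
          for (Entry<Key,Value> entry : scanner) {
            count++;
          }
        }
        return count;
      }
    };
  }
}

In both cases one task per partition is submitted to the same fixed thread
pool, so the only difference under test is whether a partition's Ranges go
through a single BatchScanner or through one Scanner per Range, run serially.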
