On Mon, Sep 12, 2016 at 5:50 PM, Adam J. Shook <[email protected]> wrote:
> As an aside, this is actually pretty relevant to the work I've been doing
> for Presto/Accumulo integration. It isn't uncommon to have around a million
> exact Ranges (that is, Ranges with a single row ID) spread across the five
> Presto worker nodes we use for scanning Accumulo. Right now, these ranges
> get packed into PrestoSplits, 10k ranges per split (an arbitrary number I
> chose), and each split is run in parallel (depending on the overall number
> of splits, they may be queued for execution).
>
> I'm curious to see the query impact of changing it to use a fixed thread
> pool of Scanners over the current BatchScanner implementation. Maybe I'll
> play around with it sometime soon.
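
A fixed pool of Scanners over exact (single-row) Ranges, as described above,
could be sketched roughly like this. It is illustrative only -- the table name
"mytable", the pool size, and the class/method names are placeholders, not code
from the Presto connector:

// Sketch: read exact (single-row) Ranges with a fixed pool of Scanners
// instead of a single BatchScanner. "mytable" and the class name are
// placeholders for illustration.
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class ScannerPoolSketch {

  // Reads every Range with its own Scanner, bounded by a fixed thread pool,
  // and returns the total number of entries seen.
  public static long lookup(final Connector conn, List<Range> ranges, int numThreads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    List<Future<Long>> results = new ArrayList<Future<Long>>();
    for (final Range range : ranges) {
      results.add(pool.submit(new Callable<Long>() {
        @Override
        public Long call() throws Exception {
          long count = 0;
          Scanner scanner = conn.createScanner("mytable", Authorizations.EMPTY);
          scanner.setRange(range);
          for (Entry<Key,Value> entry : scanner) {
            count++;
          }
          return count;
        }
      }));
    }
    long total = 0;
    for (Future<Long> result : results) {
      total += result.get();
    }
    pool.shutdown();
    return total;
  }
}

With a million exact Ranges you would likely want each task to handle a batch
of Ranges rather than a single one, but the structure stays the same: the pool
size bounds concurrency, and each Range gets its own serial seek.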
I added a readme to Josh's GH repo w/ the info I learned from Josh on IRC. So
this should make it quicker for others to experiment.

> --Adam
>
> On Mon, Sep 12, 2016 at 2:47 PM, Dan Blum <[email protected]> wrote:
>>
>> I think the 450 ranges returned a total of about 7.5M entries, but the
>> ranges were in fact quite small relative to the size of the table.
>>
>> -----Original Message-----
>> From: Josh Elser [mailto:[email protected]]
>> Sent: Monday, September 12, 2016 2:43 PM
>> To: [email protected]
>> Subject: Re: Accumulo Seek performance
>>
>> What does a "large scan" mean here, Dan?
>>
>> Sven's original problem statement was running many small/pointed Ranges
>> (e.g. point lookups). My observation was that BatchScanners were slower
>> than running each in a Scanner when using multiple BS's concurrently.
>>
>> Dan Blum wrote:
>> > I tested a large scan on a 1.6.2 cluster with 11 tablet servers - using
>> > Scanners was much slower than using a BatchScanner with 11 threads, by
>> > about a 5:1 ratio. There were 450 ranges.
>> >
>> > -----Original Message-----
>> > From: Josh Elser [mailto:[email protected]]
>> > Sent: Monday, September 12, 2016 1:42 PM
>> > To: [email protected]
>> > Subject: Re: Accumulo Seek performance
>> >
>> > I had increased the readahead thread pool to 32 (from 16). I had also
>> > increased the minimum thread pool size from 20 to 40. I had 10 tablets
>> > with the data block cache turned on (probably only 256M tho).
>> >
>> > Each tablet had a single file (manually compacted). Did not observe
>> > cache rates.
>> >
>> > I've been working through this with Keith on IRC this morning too. Found
>> > that a single BatchScanner (one partition) is faster than the Scanner.
>> > Two partitions and things started to slow down.
>> >
>> > Two interesting points to still pursue, IMO:
>> >
>> > 1. I saw that the tserver-side logging for MultiScanSess was near
>> > identical to the BatchScanner timings.
>> > 2. The minimum server threads did not seem to be taking effect. Despite
>> > having the value set to 64, I only saw a few ClientPool threads in a
>> > jstack after running the test.
>> >
>> > Adam Fuchs wrote:
>> >> Sorry, Monday morning poor reading skills, I guess. :)
>> >>
>> >> So, 3000 ranges in 40 seconds with the BatchScanner. In my past
>> >> experience HDFS seeks tend to take something like 10-100ms, and I would
>> >> expect that time to dominate here. With 60 client threads your
>> >> bottleneck should be the readahead pool, which I believe defaults to 16
>> >> threads. If you get perfect index caching then you should be seeing
>> >> something like 3000/16*50ms = 9,375ms. That's in the right ballpark, but
>> >> it assumes no data cache hits. Do you have any idea of how many files
>> >> you had per tablet after the ingest? Do you know what your cache hit
>> >> rate was?
>> >>
>> >> Adam
>> >>
>> >> On Mon, Sep 12, 2016 at 9:14 AM, Josh Elser <[email protected]> wrote:
>> >>
>> >>     5 iterations, figured that would be apparent from the log messages :)
>> >>
>> >>     The code is already posted in my original message.
>> >>
>> >>     Adam Fuchs wrote:
>> >>
>> >>         Josh,
>> >>
>> >>         Two questions:
>> >>
>> >>         1. How many iterations did you do? I would like to see an
>> >>         absolute number of lookups per second to compare against other
>> >>         observations.
>> >>
>> >>         2. Can you post your code somewhere so I can run it?
>> >>
>> >>         Thanks,
>> >>         Adam
>> >>
>> >>         On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser <[email protected]> wrote:
>> >>
>> >>             Sven, et al:
>> >>
>> >>             So, it would appear that I have been able to reproduce this one
>> >>             (better late than never, I guess...). tl;dr Serially using Scanners
>> >>             to do point lookups instead of a BatchScanner is ~20x faster. This
>> >>             sounds like a pretty serious performance issue to me.
>> >>
>> >>             Here's a general outline for what I did.
>> >>
>> >>             * Accumulo 1.8.0
>> >>             * Created a table with 1M rows, each row with 10 columns using YCSB
>> >>               (workloada)
>> >>             * Split the table into 9 tablets
>> >>             * Computed the set of all rows in the table
>> >>
>> >>             For a number of iterations:
>> >>             * Shuffle this set of rows
>> >>             * Choose the first N rows
>> >>             * Construct an equivalent set of Ranges from the set of Rows,
>> >>               choosing a random column (0-9)
>> >>             * Partition the N rows into X collections
>> >>             * Submit X tasks to query one partition of the N rows (to a
>> >>               thread pool with X fixed threads)
>> >>
>> >>             I have two implementations of these tasks. One, where all ranges in
>> >>             a partition are executed via one BatchScanner. A second where each
>> >>             range is executed in serial using a Scanner. The numbers speak for
>> >>             themselves.
>> >>
>> >>             ** BatchScanners **
>> >>             2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
>> >>             2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
>> >>             2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries executed in 40178 ms
>> >>             2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries executed in 42296 ms
>> >>             2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries executed in 46094 ms
>> >>             2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries executed in 47704 ms
>> >>             2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries executed in 49221 ms
>> >>
>> >>             ** Scanners **
>> >>             2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
>> >>             2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
>> >>             2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2833 ms
>> >>             2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2536 ms
>> >>             2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2150 ms
>> >>             2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2061 ms
>> >>             2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
>> >>             2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2140 ms
>> >>
>> >>             Query code is available at
>> >>             https://github.com/joshelser/accumulo-range-binning
>> >>
>> >>             Sven Hodapp wrote:
>> >>
>> >>                 Hi Keith,
>> >>
>> >>                 I've tried it with 1, 2 or 10 threads. Unfortunately there were
>> >>                 no amazing differences.
>> >>                 Maybe it's a problem with the table structure? For example, it
>> >>                 may happen that one row ID (e.g. a sentence) has several
>> >>                 thousand column families. Can this affect the seek performance?
>> >>
>> >>                 So for my initial example it has about 3000 row IDs to seek,
>> >>                 which will return about 500k entries. If I filter for specific
>> >>                 column families (e.g. a document without annotations) it will
>> >>                 return about 5k entries, but the seek time will only be halved.
>> >>                 Are there too many column families to seek it fast?
>> >>
>> >>                 Thanks!
>> >>
>> >>                 Regards,
>> >>                 Sven
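
For anyone who doesn't want to dig through the linked repository, the two task
implementations Josh compares boil down to roughly the following. This is an
illustrative sketch only, not the code from accumulo-range-binning; the table
name "usertable", the per-BatchScanner thread count, and the class/method names
are assumptions:

// Rough sketch of the two task variants compared above: one BatchScanner
// per partition of Ranges vs. reading the same partition serially with a
// Scanner per Range. "usertable" and the thread count are assumptions.
import java.util.List;
import java.util.Map.Entry;
import java.util.concurrent.Callable;

import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class RangeTaskSketch {

  // Variant 1: all Ranges in the partition handed to a single BatchScanner.
  static Callable<Long> batchScannerTask(final Connector conn, final List<Range> partition) {
    return new Callable<Long>() {
      @Override
      public Long call() throws Exception {
        long count = 0;
        BatchScanner bs = conn.createBatchScanner("usertable", Authorizations.EMPTY, 10);
        try {
          bs.setRanges(partition);
          for (Entry<Key,Value> entry : bs) {
            count++;
          }
        } finally {
          bs.close();
        }
        return count;
      }
    };
  }

  // Variant 2: the same partition, but each Range is read serially with a Scanner.
  static Callable<Long> scannerTask(final Connector conn, final List<Range> partition) {
    return new Callable<Long>() {
      @Override
      public Long call() throws Exception {
        long count = 0;
        for (Range range : partition) {
          Scanner scanner = conn.createScanner("usertable", Authorizations.EMPTY);
          scanner.setRange(range);
          for (Entry<Key,Value> entry : scanner) {
            count++;
          }
        }
        return count;
      }
    };
  }
}

In both cases one task per partition is submitted to the same fixed thread
pool, so the only difference under test is whether a partition's Ranges go
through a single BatchScanner or through one Scanner per Range, run serially.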
