We started with no salt buckets, but the performance was terrible in our testing.
Our schema model has a leading date column, and we don't seem to be getting hotspotting after salting. Date-range scans are very common, as are slice-and-dice queries on many dimension columns. We tested SALT_BUCKETS values from 0 to 120 for bulk loads, upserts, and selects at different concurrency levels on a test cluster before moving to production (with some tweaking post-production). However, the average number of regions per RS was lower during testing. The larger bucket counts definitely gave better overall performance in our predominantly read-heavy environment.

I'd appreciate any insights on how to identify the bottlenecks. For reference, a rough sketch of the kind of salted, date-leading table we're describing is at the bottom of this message (hypothetical names, not our real schema).

On Thu, Jun 22, 2017 at 6:26 PM, James Taylor <jamestay...@apache.org> wrote:

> My recommendation: don't use salt buckets unless you have a monotonically
> increasing row key, for example one that leads with the current date/time.
> Otherwise you'll be putting more load on the cluster (up to # of salt
> buckets times more load in the worst case) for bread-and-butter
> small-range-scan Phoenix queries.
>
> Thanks,
> James
>
> On Fri, Jun 23, 2017 at 10:06 AM Michael Young <yomaiq...@gmail.com> wrote:
>
>> The ulimit open files limit was only 1024 for the user executing the
>> query. After increasing it, the queries behave better.
>>
>> How can we tell whether we need to reduce or increase the number of salt
>> buckets?
>>
>> Our team set this value based on read/write performance, given our data
>> volume and the queries we expect users to run.
>>
>> However, it now seems that performance has degraded. We can recreate the
>> schemas with fewer/more buckets and reload the data, but I haven't seen a
>> hard-and-fast rule for setting the number of buckets.
>>
>> We have 12 data nodes, 4 SSDs per node, 128 GB RAM per node, and 24 cores
>> with hyperthreading (running HDP 2.5, with HBase as the primary service),
>> and 800+ regions per RS (which seems high).
>>
>> Any guidance on this would be greatly appreciated.
>>
>>
>> On Tue, Jun 20, 2017 at 11:54 AM, Josh Elser <josh.el...@gmail.com> wrote:
>>
>>> I think this is more of an issue with your 78 salt buckets than with the
>>> width of your table. Each chunk, running in parallel, is spilling
>>> incremental counts to disk.
>>>
>>> I'd check your ulimit settings on the node you run this query from and
>>> try to increase the number of open files allowed before going into this
>>> one in more depth :)
>>>
>>>
>>> On 6/16/17 2:31 PM, Michael Young wrote:
>>>
>>>> We are running a 13-node HBase cluster. One table uses 78 SALT_BUCKETS,
>>>> which seems to work reasonably well for both reads and writes. This
>>>> table has 130 columns, with a PK of 30 columns (a fairly wide table).
>>>>
>>>> However, after adding several new tables, we are seeing "too many open
>>>> files" errors when running a full table scan.
>>>>
>>>> Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open files
>>>>     at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
>>>>     at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
>>>>     at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>     at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.io.IOException: Too many open files
>>>>     at java.io.UnixFileSystem.createFileExclusively(Native Method)
>>>>     at java.io.File.createTempFile(File.java:2024)
>>>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
>>>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
>>>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
>>>>     at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>     at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
>>>>     at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
>>>>     at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
>>>>     ... 10 more
>>>>
>>>>
>>>> When running an explain plan:
>>>>
>>>> explain select count(1) from MYBIGTABLE
>>>>
>>>> +-------------------------------------------------------------------------------------------------------------------+
>>>> | PLAN                                                                                                              |
>>>> +-------------------------------------------------------------------------------------------------------------------+
>>>> | CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP   |
>>>> |     ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                 |
>>>> |     SERVER FILTER BY FIRST KEY ONLY                                                                               |
>>>> |     SERVER AGGREGATE INTO SINGLE ROW                                                                              |
>>>> +-------------------------------------------------------------------------------------------------------------------+
>>>>
>>>> It has a lot of chunks. Normally this query would return at least some
>>>> result after running for a few minutes. With appropriate filters in the
>>>> WHERE clause, the queries run fine.
>>>>
>>>> Any suggestions on how to avoid this error and get better performance
>>>> from the table scans?
>>>> We realize that we don't need to run full table scans regularly; we're
>>>> just trying to better understand best practices for Phoenix on HBase.
>>>>
>>>> Thank you,
>>>> Michael
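
Sketch referenced above: a minimal, hypothetical example of the pattern we're describing, not our actual DDL. The table name, column names, and the bucket count of 16 are all placeholders (our real table is far wider). The point it illustrates is that with SALT_BUCKETS set, a query filtering on the leading date column fans out into one parallel range scan per bucket, which is the extra per-query load James describes for small-range scans.

    -- Hypothetical salted table with a leading date column (illustrative only)
    CREATE TABLE IF NOT EXISTS MY_FACT_TABLE (
        EVENT_DATE  DATE NOT NULL,
        DIM_A       VARCHAR NOT NULL,
        DIM_B       VARCHAR NOT NULL,
        METRIC_1    BIGINT,
        CONSTRAINT PK PRIMARY KEY (EVENT_DATE, DIM_A, DIM_B)
    ) SALT_BUCKETS=16;

    -- A typical date-range slice-and-dice query; the plan should show a
    -- parallel scan per salt bucket (compare the 78-WAY plan quoted above)
    EXPLAIN SELECT DIM_A, SUM(METRIC_1)
    FROM MY_FACT_TABLE
    WHERE EVENT_DATE >= TO_DATE('2017-06-01', 'yyyy-MM-dd')
      AND EVENT_DATE <  TO_DATE('2017-06-08', 'yyyy-MM-dd')
    GROUP BY DIM_A;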