Sergey, thanks for this tip! Since our client data volume varies a lot from site to site, would splitting only on the first letters of the client_id lead to some regions being much larger than others? Or does Phoenix distribute the data fairly evenly across the different region servers?

Would this continue to work as new client_ids are added to the cluster, or does splitting need to be done again manually at some point? As a follow-up question, is there a way to get HBase to balance the regions based on their size rather than the region count?

Cheers,
Michael

On Fri, Jun 23, 2017 at 12:38 PM, Sergey Soldatov <[email protected]> wrote:

> You may check the "Are there any tips for optimizing Phoenix?" section of the Apache Phoenix FAQ at https://phoenix.apache.org/faq.html. It explains how to pre-split a table. In your case you may split on the first letters of client_id.
>
> When we talk about monotonic data, we usually mean the primary key only. For example, if we have an integer ID as the primary key and write rows with an auto-increment ID, all the data will go to a single region, creating a hot spot there. In this case (and really only in this case) salting may be useful, since it adds an additional pseudo-random byte in front of the primary key, giving us a chance to distribute the write load across the cluster. In all other cases salting creates more work for the cluster, since we can no longer do a single point lookup/range scan by primary key and instead need to do one lookup per salt byte + PK.
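>
> For example (the table name and split points below are invented; in practice you would pick split points from the actual distribution of your client_ids), a pre-split table looks roughly like:
>
>     CREATE TABLE client_metrics (
>         client_id VARCHAR NOT NULL,
>         dt DATE NOT NULL,
>         requests BIGINT,
>         CONSTRAINT pk PRIMARY KEY (client_id, dt)
>     ) SPLIT ON ('d', 'h', 'l', 'p', 't');
>
> while salting is just a table option:
>
>     CREATE TABLE events (
>         id BIGINT NOT NULL PRIMARY KEY,
>         payload VARCHAR
>     ) SALT_BUCKETS = 16;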
>
> Thanks,
> Sergey
>
> On Fri, Jun 23, 2017 at 12:00 PM, Michael Young <[email protected]> wrote:
>
>> > Don't you have any other column which is obligatory in queries during reading but not monotonic with ingestion?
>>
>> We have several columns used in typical query WHERE clauses (like userID='abc', or specific user attributes and data types). However, a number of columns have many rows with the same value.
>>
>> We have tried running UPDATE STATISTICS on the tables; that would be worth investigating again.
>>
>> Can you give me a hint how to pre-split the data?
>>
>> Let's say we have the following PK columns (all varchar except dt, a date):
>> client_id, dt, rule_id, user_id, attribute_1, attribute_2, rule_name, browser_type, device_type, os_type, page, group_name, period
>>
>> and non-PK columns in the same table:
>> requests, connections, queues, queue_time
>>
>> What would be the suggested way to pre-split? I'm not familiar with this technique beyond very simple use cases.
>>
>> Thanks!
>>
>> On Thu, Jun 22, 2017 at 11:31 PM, Ankit Singhal <[email protected]> wrote:
>>
>>> bq. A leading date column is in our schema model
>>> Don't you have any other column which is obligatory in queries during reading but not monotonic with ingestion? Pre-splitting on such a column can help you avoid hotspotting.
>>> For a parallelism/performance comparison, have you tried running a query on a non-salted table after updating the stats and comparing the performance with a salted table?
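>>>
>>> For reference, refreshing the stats is a one-liner, e.g.:
>>>
>>>     UPDATE STATISTICS MYBIGTABLE ALL;
>>>
>>> run against whichever table you use for the comparison, so the optimizer has fresh guideposts when it parallelizes the scan.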
>>>
>>> On Fri, Jun 23, 2017 at 9:49 AM, Michael Young <[email protected]> wrote:
>>>
>>>> We started with no salt buckets, but the performance was terrible in our testing.
>>>>
>>>> A leading date column is in our schema model. We don't seem to be getting hotspotting after salting. Date range scans are very common, as are slice-and-dice queries on many dimension columns.
>>>>
>>>> We tested a range of SALT_BUCKETS values from 0 to 120 for bulk loading, upserts, and selects at different concurrent load levels on a test cluster before moving to production (with some tweaking post-production). However, we had fewer regions per RS on average during the testing. The larger SALT_BUCKETS values definitely gave better overall performance in our predominantly read-heavy environment.
>>>>
>>>> I appreciate any insights to identify bottlenecks.
>>>>
>>>> On Thu, Jun 22, 2017 at 6:26 PM, James Taylor <[email protected]> wrote:
>>>>
>>>>> My recommendation: don't use salt buckets unless you have a monotonically increasing row key, for example one that leads with the current date/time. Otherwise you'll be putting more load (up to #-of-salt-buckets times more load in the worst case) on bread-and-butter small-range-scan Phoenix queries.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> On Fri, Jun 23, 2017 at 10:06 AM Michael Young <[email protected]> wrote:
>>>>>
>>>>>> The ulimit on open files was only 1024 for the user executing the query. After increasing it, the queries behave better.
>>>>>>
>>>>>> How can we tell if we need to reduce/increase the number of salt buckets?
>>>>>>
>>>>>> Our team set this based on read/write performance testing with our data volume and the queries we expected users to run.
>>>>>>
>>>>>> However, now it seems the performance has degraded. We can recreate the schemas with fewer/more buckets and reload the data, but I haven't seen a hard-and-fast rule for setting the number of buckets.
>>>>>>
>>>>>> We have 12 data nodes, 4 SSDs per node, 128 GB RAM per node, and 24 cores w/ hyperthreading (HDP 2.5; HBase is the primary service), with 800+ regions per RS (which seems high).
>>>>>>
>>>>>> Any orientation on this would be greatly appreciated.
>>>>>>
>>>>>> On Tue, Jun 20, 2017 at 11:54 AM, Josh Elser <[email protected]> wrote:
>>>>>>
>>>>>>> I think this is more of an issue with your 78 salt buckets than with the width of your table. Each chunk, running in parallel, is spilling incremental counts to disk.
>>>>>>>
>>>>>>> I'd check the ulimit settings on the node you run this query from and try increasing the number of open files allowed before going into this one in more depth :)
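>>>>>>>
>>>>>>> (To check it, run "ulimit -n" as the user that launches the query; 1024 is a common default and is easy to exhaust with thousands of parallel spool files, and it can be raised in /etc/security/limits.conf. If memory serves, the spilling itself is governed by the client-side phoenix.query.spoolThresholdBytes property, so raising that keeps more of each chunk in memory at the cost of heap.)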
>>>>>>>
>>>>>>> On 6/16/17 2:31 PM, Michael Young wrote:
>>>>>>>
>>>>>>>> We are running a 13-node HBase cluster. One table uses 78 salt buckets, which seems to work reasonably well for both reads and writes. This table has 130 columns, with a PK over 30 of them (a fairly wide table).
>>>>>>>>
>>>>>>>> However, after adding several new tables we are seeing "too many open files" errors when running a full table scan:
>>>>>>>>
>>>>>>>> Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open files
>>>>>>>>     at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>>>>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
>>>>>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
>>>>>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
>>>>>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
>>>>>>>>     at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
>>>>>>>>     at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>     at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>> Caused by: java.io.IOException: Too many open files
>>>>>>>>     at java.io.UnixFileSystem.createFileExclusively(Native Method)
>>>>>>>>     at java.io.File.createTempFile(File.java:2024)
>>>>>>>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
>>>>>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
>>>>>>>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
>>>>>>>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
>>>>>>>>     at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>>>>     at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
>>>>>>>>     at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
>>>>>>>>     at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
>>>>>>>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
>>>>>>>>     ... 10 more
>>>>>>>>
>>>>>>>> When running an explain plan:
>>>>>>>>
>>>>>>>> explain select count(1) from MYBIGTABLE
>>>>>>>>
>>>>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>>>> | PLAN                                                                                                             |
>>>>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>>>> | CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP   |
>>>>>>>> |     ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                |
>>>>>>>> |     SERVER FILTER BY FIRST KEY ONLY                                                                              |
>>>>>>>> |     SERVER AGGREGATE INTO SINGLE ROW                                                                             |
>>>>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>>>>
>>>>>>>> It has a lot of chunks. Normally this query would return at least some result after running for a few minutes. With appropriate filters in the WHERE clause, the queries run fine.
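>>>>>>>>
>>>>>>>> (For concreteness, the kind of filtered query that runs fine looks like the following, with made-up date values and assuming the leading date column is dt:
>>>>>>>>
>>>>>>>>     explain select count(1) from MYBIGTABLE where dt >= TO_DATE('2017-06-01') and dt < TO_DATE('2017-06-08');
>>>>>>>>
>>>>>>>> which the planner turns into a bounded scan per salt bucket rather than an unbounded FULL SCAN.)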
>>>>>>>>
>>>>>>>> Any suggestions on how to avoid this error and get better performance from the table scans? We realize we don't need to run full table scans regularly; we're just trying to better understand best practices for Phoenix/HBase.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Michael