split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Edwin Litterst
Hi, I am using PhoenixInputFormat as the input source for MapReduce jobs. The split count (which determines how many mappers are used for the job) is always equal to the number of regions of the table from which I select the input. Is there a way to increase the number of splits? My job is runnin…

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Josh Elser
You can extend/customize the PhoenixInputFormat with your own code to increase the number of InputSplits and Mappers. On 1/30/19 6:43 AM, Edwin Litterst wrote: …
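A minimal sketch of the approach Josh describes, assuming Phoenix's MapReduce API (`org.apache.phoenix.mapreduce.PhoenixInputFormat`). The `subdivide` helper is hypothetical: real code would inspect each `PhoenixInputSplit`'s scan key range and break it into sub-ranges, which is left as a placeholder here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.phoenix.mapreduce.PhoenixInputFormat;

// Sketch only: starts from the region-aligned splits Phoenix produces
// and post-processes them into a larger number of splits.
public class SubdividingPhoenixInputFormat<T extends DBWritable>
        extends PhoenixInputFormat<T> {

    @Override
    public List<InputSplit> getSplits(JobContext context)
            throws IOException, InterruptedException {
        List<InputSplit> original = super.getSplits(context);
        List<InputSplit> refined = new ArrayList<>();
        for (InputSplit split : original) {
            // Hypothetical: break each region-sized split into smaller ones.
            refined.addAll(subdivide(split));
        }
        return refined;
    }

    private List<InputSplit> subdivide(InputSplit split) {
        // Placeholder: real logic would split the underlying scan's
        // start/stop row range into N sub-ranges, one split each.
        return Collections.singletonList(split);
    }
}
```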

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Thomas D'Silva
If stats are enabled, PhoenixInputFormat will generate a split per guidepost. On Wed, Jan 30, 2019 at 7:31 AM Josh Elser wrote: …
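For reference, statistics (and therefore the guideposts Thomas mentions) are collected with Phoenix's `UPDATE STATISTICS` command; `MY_TABLE` is a placeholder name:

```sql
-- Collect (or refresh) statistics so guideposts exist for the table;
-- with stats present, PhoenixInputFormat emits one split per guidepost.
UPDATE STATISTICS MY_TABLE;
```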

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread venkata subbarayudu
You may recreate the table with the SALT_BUCKETS table option to get a reasonable number of regions, and you may try adding a secondary index to make the query run faster in case your MapReduce job applies specific filters. On Thu 31 Jan, 2019, 12:09 AM Thomas D'Silva wrote: …
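A sketch of this suggestion in Phoenix SQL (but note the caution in the reply below about salting); the table name, columns, bucket count, and index name are all placeholders:

```sql
-- Recreate the table pre-split into 16 salted regions (16 is an arbitrary example).
CREATE TABLE MY_TABLE (
    ID BIGINT NOT NULL PRIMARY KEY,
    VAL VARCHAR
) SALT_BUCKETS = 16;

-- Optional secondary index to speed up scans that filter on VAL.
CREATE INDEX MY_TABLE_VAL_IDX ON MY_TABLE (VAL);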

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Ankit Singhal
As Thomas said, the number of splits will be equal to the number of guideposts available for the table, or to the ones required to cover the filter. If you are seeing one split per region, then either stats are disabled or the guidepost width is set higher than the size of the region, so try reducing the guidepost…
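A hedged sketch of reducing the guidepost width: `GUIDE_POSTS_WIDTH` is a per-table property in recent Phoenix versions, the cluster-wide default is `phoenix.stats.guidepost.width` in hbase-site.xml, and the byte value and table name below are only examples:

```sql
-- Lower the per-table guidepost width to ~100 MB so more guideposts
-- (and therefore more splits) are generated per region.
ALTER TABLE MY_TABLE SET GUIDE_POSTS_WIDTH = 104857600;

-- Refresh stats so the new guideposts are computed.
UPDATE STATISTICS MY_TABLE;
```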

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Josh Elser
Please do not take this advice lightly. Adding (or increasing) salt buckets can have a serious impact on the execution of your queries. On 1/30/19 5:33 PM, venkata subbarayudu wrote: …

Re: client does not have phoenix.schema.isNamespaceMappingEnabled

2019-01-30 Thread Ajit Bhingarkar
user-unsubscr...@phoenix.apache.org On Fri, Nov 30, 2018 at 12:04 AM M. Aaron Bossert wrote: > So, sorry for the super late reply...there is weird lag between the time a > message is sent or received to this mailing list and when I actually see > it...But, I have got it working now as follows: …

Phoenix JDBC Connection Warmup

2019-01-30 Thread William Shen
Hi there, I have a component that makes Phoenix queries via the Phoenix JDBC Connection. I noticed that, consistently, the Phoenix client takes longer to execute a PreparedStatement and longer to read through the ResultSet for a period of time (~15m) after a restart of the component. It se…

Re: Phoenix JDBC Connection Warmup

2019-01-30 Thread Jaanai Zhang
This is expected when tables are first queried after a connection is established. The client loads some metadata into a local cache, which takes some time and mainly involves two steps: 1. accessing the SYSTEM.CATALOG table to get the table's schema information, and 2. accessing the meta table of HBase…
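One common way to hide this cache-population cost is to issue a cheap query per table at startup, before real traffic arrives. A sketch, assuming the Phoenix JDBC driver is on the classpath; the ZooKeeper quorum in the URL and the table names are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch: warm the client-side schema cache (SYSTEM.CATALOG lookups and
// HBase meta access) by touching each table once at component startup.
public class PhoenixWarmup {
    public static void main(String[] args) throws Exception {
        try (Connection conn =
                DriverManager.getConnection("jdbc:phoenix:zk-host:2181")) {
            for (String table : new String[] {"MY_TABLE"}) {
                try (Statement st = conn.createStatement();
                     ResultSet rs = st.executeQuery(
                             "SELECT * FROM " + table + " LIMIT 1")) {
                    rs.next(); // force metadata to be fetched and cached
                }
            }
        }
    }
}
```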