Here it is: https://issues.apache.org/jira/browse/PHOENIX-2762
We are having performance problems when writing to our main table from our
MapReduce job. I think this problem was degrading our performance. I am going
to test that hypothesis.

On Sat, Mar 12, 2016 at 1:08 PM, James Taylor <[email protected]> wrote:

> Yes, good idea. Please file a JIRA.
>
> On Sat, Mar 12, 2016 at 1:07 PM, anil gupta <[email protected]> wrote:
>
>> To provide more insight: this table has around 1100 columns, and I created
>> this index on one column. (1/1100) * 8GB comes to around 8MB. So I think we
>> need to set a lower bound on the region size of secondary index tables in
>> Phoenix. Please let me know if you need me to file a JIRA.
>>
>> On Sat, Mar 12, 2016 at 12:45 PM, anil gupta <[email protected]> wrote:
>>
>>> OK, oversight on my side. MAX_FILESIZE => '11994435' for the secondary
>>> index table.
>>> The main table still doesn't show a MAX_FILESIZE attribute.
>>>
>>> On Sat, Mar 12, 2016 at 12:41 PM, James Taylor <[email protected]> wrote:
>>>
>>>> It should show up for the index table. I did a test on my local HBase,
>>>> and this is what I see:
>>>>
>>>> hbase(main):004:0> describe 'FOO_IDX'
>>>> Table FOO_IDX is ENABLED
>>>>
>>>> FOO_IDX, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '6710886400', ...
>>>>
>>>> On Sat, Mar 12, 2016 at 12:36 PM, anil gupta <[email protected]> wrote:
>>>>
>>>>> The 8GB region size is set at the cluster level, so we haven't set
>>>>> MAX_FILESIZE on the main table explicitly. I ran the describe statement
>>>>> for both tables, but it's not showing MAX_FILESIZE since we didn't apply
>>>>> any custom settings to these tables. Hope this makes sense.
>>>>>
>>>>> On Sat, Mar 12, 2016 at 12:26 PM, James Taylor <[email protected]> wrote:
>>>>>
>>>>>> Ok - before you reset the MAX_FILESIZE, it'd be helpful if you could
>>>>>> open an HBase shell and let us know what the current values are for
>>>>>> your data table and index table:
>>>>>>
>>>>>> describe YOUR_DATA_TABLE;
>>>>>> describe YOUR_INDEX_TABLE;
>>>>>>
>>>>>> If your data table is 8GB, I'd guess your index should be 4GB at the
>>>>>> smallest. I think 1GB would be too low.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> On Sat, Mar 12, 2016 at 12:23 PM, anil gupta <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks for the reply, James. We have 2 global secondary indexes on
>>>>>>> this table and both of them exhibit the same behavior. Going to give
>>>>>>> your suggestion a try. I also think that the region size for a
>>>>>>> secondary index should not be 8GB. Will try to set the regionsize=1GB
>>>>>>> for the secondary index and see how it goes.
>>>>>>>
>>>>>>> On Sat, Mar 12, 2016 at 12:00 PM, James Taylor <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Anil,
>>>>>>>> Phoenix estimates the ratio between the data table and index table
>>>>>>>> as shown below to attempt to get the same number of splits in your
>>>>>>>> index table as your data table.
>>>>>>>>
>>>>>>>> /*
>>>>>>>>  * Approximate ratio between index table size and data table size:
>>>>>>>>  * More or less equal to the ratio between the number of key value
>>>>>>>>  * columns in each. We add one to the key value column count to
>>>>>>>>  * take into account our empty key value. We add 1/4 for any key
>>>>>>>>  * value data table column that was moved into the index table row key.
>>>>>>>>  */
>>>>>>>>
>>>>>>>> Phoenix then multiplies the MAX_FILESIZE of the data table by this
>>>>>>>> ratio to come up with a reasonable default value for the index table.
>>>>>>>> Can you check in the HBase shell what the MAX_FILESIZE is for the data
>>>>>>>> table versus the index table? Maybe there's a bug in Phoenix in how it
>>>>>>>> calculates this ratio.
>>>>>>>>
>>>>>>>> You can override the MAX_FILESIZE for your index through an ALTER
>>>>>>>> TABLE statement:
>>>>>>>>
>>>>>>>> ALTER TABLE my_table_schema.my_index_name SET MAX_FILESIZE=8589934592
>>>>>>>>
>>>>>>>> You can ignore the warnings you get in sqlline, and you can verify
>>>>>>>> that the setting took effect through the HBase shell by running the
>>>>>>>> following command:
>>>>>>>>
>>>>>>>> describe 'MY_TABLE_SCHEMA.MY_INDEX_NAME'
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>>
>>>>>>>> James
>>>>>>>>
>>>>>>>> On Sat, Mar 12, 2016 at 10:18 AM, anil gupta <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We are using HDP 2.3.4 and Phoenix 4.4.
>>>>>>>>> Our global index table is splitting excessively. Our cluster region
>>>>>>>>> size setting is 8 gigabytes, but the global index table has 18
>>>>>>>>> regions and the largest region is 10.9 MB.
>>>>>>>>> This is definitely not good behavior. I looked into tuning
>>>>>>>>> (https://phoenix.apache.org/tuning.html) and I could not find
>>>>>>>>> anything relevant. Is this region splitting done intentionally by
>>>>>>>>> Phoenix for secondary index tables?
>>>>>>>>>
>>>>>>>>> Here is the output of the du command:
>>>>>>>>> [ag@hdpclient1 ~]$ hadoop fs -du -h /apps/hbase/data/data/default/SEC_INDEX
>>>>>>>>> 761     /apps/hbase/data/data/default/SEC_INDEX/.tabledesc
>>>>>>>>> 0       /apps/hbase/data/data/default/SEC_INDEX/.tmp
>>>>>>>>> 9.3 M   /apps/hbase/data/data/default/SEC_INDEX/079db2c953c30a8270ecbd52582e81ff
>>>>>>>>> 2.9 M   /apps/hbase/data/data/default/SEC_INDEX/0952c070234c05888bfc2a01645e9e88
>>>>>>>>> 10.9 M  /apps/hbase/data/data/default/SEC_INDEX/0d69bbb8991b868f0437b624410e9bed
>>>>>>>>> 8.2 M   /apps/hbase/data/data/default/SEC_INDEX/206562491fd1de9db48cf422dd8c2059
>>>>>>>>> 7.9 M   /apps/hbase/data/data/default/SEC_INDEX/25318837ab8e1db6922f5081c840d2e7
>>>>>>>>> 9.5 M   /apps/hbase/data/data/default/SEC_INDEX/5369e0d6526b3d2cdab9937cb320ccb3
>>>>>>>>> 9.6 M   /apps/hbase/data/data/default/SEC_INDEX/62704ee3c9418f0cd48210a747e1f8ac
>>>>>>>>> 7.8 M   /apps/hbase/data/data/default/SEC_INDEX/631376fc5515d7785b2bcfc8a1f64223
>>>>>>>>> 2.8 M   /apps/hbase/data/data/default/SEC_INDEX/6648d5396ba7a3c3bf884e5e1300eb0e
>>>>>>>>> 9.4 M   /apps/hbase/data/data/default/SEC_INDEX/6e6e133580aea9a19a6b3ea643735072
>>>>>>>>> 8.1 M   /apps/hbase/data/data/default/SEC_INDEX/8535a5c8a0989dcdfad2b1e9e9f3e18c
>>>>>>>>> 7.8 M   /apps/hbase/data/data/default/SEC_INDEX/8ffa32e0c6357c2a0b413f3896208439
>>>>>>>>> 9.3 M   /apps/hbase/data/data/default/SEC_INDEX/c27e2809cd352e3b06c0f11d3e7278c6
>>>>>>>>> 8.0 M   /apps/hbase/data/data/default/SEC_INDEX/c4f5a98ce6452a6b5d052964cc70595a
>>>>>>>>> 8.1 M   /apps/hbase/data/data/default/SEC_INDEX/c578d3190363c32032b4d92c8d307215
>>>>>>>>> 7.9 M   /apps/hbase/data/data/default/SEC_INDEX/d750860bac8aa372eb28aaf055ea63e7
>>>>>>>>> 9.6 M   /apps/hbase/data/data/default/SEC_INDEX/e9756aa4c7c8b9bfcd0857b43ad5bfbe
>>>>>>>>> 8.0 M   /apps/hbase/data/data/default/SEC_INDEX/ebaae6c152e82c9b74c473babaf644dd
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks & Regards,
>>>>>>>>> Anil Gupta

--
Thanks & Regards,
Anil Gupta
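
For anyone following along, the estimate described in the quoted comment can be sketched roughly as below. This is only one reading of that comment, not Phoenix's actual code: the function name, parameter names, and the exact form of the ratio are assumptions made for illustration. The inputs in the example (about 1100 key value columns, an index on one of them, an 8GB base region size) are the figures mentioned in the thread.

```python
# Rough sketch of the index MAX_FILESIZE estimate described in the quoted
# comment. Hypothetical helper, NOT Phoenix's implementation: the names and
# the exact ratio formula are assumptions based on the comment text only.

GIB = 1024 ** 3


def estimate_index_max_filesize(data_kv_columns, index_kv_columns,
                                moved_to_row_key, data_max_filesize):
    """Scale the data table's MAX_FILESIZE by an approximate size ratio.

    data_kv_columns   -- key value (non-PK) columns in the data table
    index_kv_columns  -- key value columns carried in the index table
    moved_to_row_key  -- data table key value columns that became part of
                         the index row key (each counted as 1/4)
    data_max_filesize -- MAX_FILESIZE of the data table, in bytes
    """
    # "+ 1" on both sides accounts for the empty key value mentioned
    # in the comment.
    ratio = (index_kv_columns + 1 + moved_to_row_key / 4) / (data_kv_columns + 1)
    return int(data_max_filesize * ratio)


if __name__ == "__main__":
    # Figures from the thread: ~1100 columns, index on one column, 8GB regions.
    estimate = estimate_index_max_filesize(1100, 0, 1, 8 * GIB)
    print(f"estimated index MAX_FILESIZE: {estimate} bytes "
          f"({estimate / 1024 ** 2:.1f} MiB)")
```

Under these assumptions the estimate comes out at a few megabytes, the same order of magnitude as the region sizes in the du output above. That fits the point made in the thread: with a very wide data table, a purely proportional estimate gives a tiny index region size, so a lower bound seems worth having.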
