Hello Alex. Please refer to this JIRA: https://issues.apache.org/jira/browse/PHOENIX-1734. Since v4.8, a local index is just a shadow column family (CF) within the data table.
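
For comparing the two index row keys quoted below, here is a minimal Python sketch. It is not Phoenix code. It assumes the 30-byte data PK layout described in the original message (1 salt byte, 8-byte timestamp, 20-byte hash, 1 spare byte), and it assumes the timestamp is a big-endian signed long, which the thread does not state. The names `parse_data_pk` and `leading_zero_bytes` are made up for illustration.

```python
import struct
from typing import NamedTuple

PK_LEN = 30  # BINARY(30) row key, per the original message

# Row keys copied from the HBase shell output quoted in this thread.
SYNC_INDEX_KEY = (
    b"\x00\x171545413\x00"
    b"\x00\x00\x00\x01b\xB2s\xDB@\x1B\x94\xFA\xD4\x14c\x0B"
    b"d$\x82\xAD\xE6\xB3\xDF\x06\xC9\x07@\xB9\xAE\x00"
)
ASYNC_INDEX_KEY = (
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x151545413\x00"
    b"\x00\x00\x00\x01b\xB2s\xDB@\x1B\x94\xFA\xD4\x14c\x0B"
    b"d$\x82\xAD\xE6\xB3\xDF\x06\xC9\x07@\xB9\xAE\x00"
)


class DataRowKey(NamedTuple):
    salt: int
    timestamp: int       # unit is not stated in the thread
    record_hash: bytes   # 20-byte hash of the other record fields
    spare: int           # the "+1 extra byte" from the original message


def parse_data_pk(pk: bytes) -> DataRowKey:
    """Split a 30-byte data-table PK into its parts (big-endian is assumed)."""
    if len(pk) != PK_LEN:
        raise ValueError(f"expected {PK_LEN} bytes, got {len(pk)}")
    salt, timestamp, record_hash, spare = struct.unpack(">Bq20sB", pk)
    return DataRowKey(salt, timestamp, record_hash, spare)


def leading_zero_bytes(key: bytes) -> int:
    """Count the \\x00 bytes at the start of a row key."""
    return len(key) - len(key.lstrip(b"\x00"))


if __name__ == "__main__":
    for name, key in (("sync", SYNC_INDEX_KEY), ("async", ASYNC_INDEX_KEY)):
        pk = parse_data_pk(key[-PK_LEN:])
        print(name, len(key), leading_zero_bytes(key), pk.timestamp, pk.record_hash.hex())
```

Taking the last 30 bytes of each key shows whether both index rows embed the same data PK, so any difference is confined to the prefix in front of the indexed value.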

On Fri, Jul 26, 2019 at 5:08 AM Alexander Lytchier <[email protected]> wrote:

> Thanks for the reply.
>
> We will attempt to update to Phoenix 4.14.X and re-try adding secondary
> indexes.
>
> Can you help to clarify “local indexes are stored in the same table as the
> data”. When a local index is created in Phoenix I observe that a new table
> is created in HBase *_LOCAL_IDX_TABLE_NAME*. It was my assumption that
> this is where the columns for the index table are stored, along with the PK
> values? Moreover, using *EXPLAIN* in Phoenix I can see that it will
> attempt to SCAN OVER *_LOCAL_IDX_TABLE_NAME* when my query is using the
> index.
>
> On 2019/07/25 14:00:25, Josh Elser <[email protected]> wrote:
> > Local indexes are stored in the same table as the data. They are "local"
> > to the data.
> >
> > I would not be surprised if you are running into issues because you are
> > using such an old version of Phoenix.
> >
> > On 7/24/19 10:35 PM, Alexander Lytchier wrote:
> > > Hi,
> > >
> > > We are currently using Cloudera as a package manager for our Hadoop
> > > Cluster with Phoenix 4.7.0 (CLABS_PHOENIX) and HBase 1.2.0-cdh5.7.6.
> > > Phoenix 4.7.0 appears to be the latest version supported
> > > (http://archive.cloudera.com/cloudera-labs/phoenix/parcels/latest/) even
> > > though it’s old.
> > >
> > > The table in question has a binary row-key: pk BINARY(30): 1 Byte for
> > > salting, 8 Bytes - timestamp (Long), 20 Bytes - hash result of other
> > > record fields. + 1 extra byte for unknown issue about updating schema in
> > > future (not sure if relevant). We are currently facing performance
> > > issues and are attempting to mitigate it by adding secondary indexes.
> > >
> > > When generating a local index synchronously with the following command:
> > >
> > >     CREATE LOCAL INDEX INDEX_TABLE ON "MyTable" ("cf"."type");
> > >
> > > I can see that the resulting index table in Phoenix is populated, in
> > > HBase I can see the row-key of the index table and queries work as
> > > expected:
> > >
> > >     \x00\x171545413\x00          column=cf:cf:type, timestamp=1563954319353, value=1545413
> > >     \x00\x00\x00\x01b\xB2s\xDB
> > >     @\x1B\x94\xFA\xD4\x14c\x0B
> > >     d$\x82\xAD\xE6\xB3\xDF\x06
> > >     \xC9\x07@\xB9\xAE\x00
> > >
> > > However, for the case where the index is created asynchronously, and
> > > then populated using the IndexTool, with the following commands:
> > >
> > >     CREATE LOCAL INDEX INDEX_TABLE ON "MyTable" ("cf"."type") ASYNC;
> > >
> > >     sudo -u hdfs HADOOP_CLASSPATH=`hbase classpath` hadoop jar \
> > >       /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hbase/bin/../lib/hbase-client-1.2.0-cdh5.7.1.jar \
> > >       org.apache.phoenix.mapreduce.index.IndexTool --data-table "MyTable" \
> > >       --index-table INDEX_TABLE --output-path hdfs://nameservice1/
> > >
> > > I get the following row-key in HBase:
> > >
> > >     \x00\x00\x00\x00\x00\x00\x   column=cf:cf:type, timestamp=1563954000238, value=1545413
> > >     00\x00\x00\x00\x00\x00\x00
> > >     \x00\x00\x00\x00\x00\x00\x
> > >     00\x00\x00\x00\x00\x00\x00
> > >     \x00\x00\x00\x00\x00\x00\x
> > >     151545413\x00\x00\x
> > >     00\x00\x01b\xB2s\xDB@\x1B\
> > >     x94\xFA\xD4\x14c\x0Bd$\x82
> > >     \xAD\xE6\xB3\xDF\x06\xC9\x
> > >     07@\xB9\xAE\x00
> > >
> > > It has 32 additional 0-bytes (\x00). Why is there a difference – is
> > > one expected? What’s more, the index table in Phoenix is empty (I guess
> > > it’s not able to read the underlying HBase index table with that key?),
> > > so any queries that use the local index in Phoenix return no value.
> > >
> > > Do you have any suggestions? We must use the /async/ method to populate
> > > the index table on production because of the massive amounts of data,
> > > but if Phoenix is not able to read the index table it cannot be used for
> > > queries.
> > >
> > > Is it possible this issue has been fixed in a newer version?
> > >
> > > Thanks

--
Aleksandr Saraseka
DBA
380997600401 • [email protected] • eztexting.com
