We updated to the 3.0.0-SNAPSHOT in an effort to also test the Flume
component, and now we are unable to query any of our existing tables
through sqlline or a Java JDBC connection. The Flume component itself,
however, works fine writing to a new table. Here is the error we get when
running select count(1) from keywords:
Error: org.apache.hadoop.hbase.DoNotRetryIOException: keywords: at index 4
    at com.salesforce.phoenix.util.ServerUtil.throwIOException(ServerUtil.java:83)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1034)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.regionserver.HRegion.exec(HRegion.java:5482)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.execCoprocessor(HRegionServer.java:3720)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:308)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
Caused by: java.lang.NullPointerException: at index 4
    at com.google.common.collect.ImmutableList.checkElementNotNull(ImmutableList.java:305)
    at com.google.common.collect.ImmutableList.construct(ImmutableList.java:296)
    at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:272)
    at com.salesforce.phoenix.schema.PTableImpl.init(PTableImpl.java:290)
    at com.salesforce.phoenix.schema.PTableImpl.<init>(PTableImpl.java:219)
    at com.salesforce.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:212)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:436)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:254)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:1082)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:279)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:430)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:254)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:1082)
    at com.salesforce.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1028)
    ... 10 more (state=08000,code=101)

On Thu, Jan 30, 2014 at 4:01 PM, Justin Workman <[email protected]> wrote:

> I will test with the latest master build. When this table goes live we
> will shorten the cf name; that was a mistake. Thanks for all the info. I
> do think going forward we will be creating these tables via Phoenix. We
> are still testing the Flume sink and Pig handlers before completely
> committing.
>
> I'll update the list once I've had a chance to test with the latest
> build, and file a JIRA if the problem persists.
>
> Thanks!
> Justin
>
> Sent from my iPhone
>
> On Jan 30, 2014, at 1:25 PM, James Taylor <[email protected]> wrote:
>
> Thanks for all the detail, Justin. Based on this, it looks like a bug
> related to using case-sensitive column names. Maryann checked in a fix
> related to this, so it might be fixed in the latest on master.
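>
> For example, this is roughly the behavior I mean (a made-up table, just
> to illustrate; unquoted identifiers get upper-cased by Phoenix, while
> quoted ones stay case sensitive):
>
> CREATE TABLE EXAMPLE (ID VARCHAR PRIMARY KEY, "city" VARCHAR);
> SELECT "city" FROM EXAMPLE;  -- works: quoted name matches exactly
> SELECT city FROM EXAMPLE;    -- fails: unquoted city resolves to CITY,
>                              -- which doesn't exist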
>
> If it's not fixed, would you mind filing a JIRA?
>
> FWIW, you may want to consider a shorter column family name, like "k" or
> "kw", as that'll make your table smaller. Also, did you know you can
> provide your HBase table and column family config parameters in your
> CREATE TABLE statement, and it'll create the HBase table and the column
> families for you, like below?
>
> CREATE TABLE SEO.KEYWORDIDEAS (
>     "pk" VARCHAR PRIMARY KEY,
>     "keyword"."jobId" VARCHAR,
>     "keyword"."jobName" VARCHAR,
>     "keyword"."jobType" VARCHAR,
>     "keyword"."keywordText" VARCHAR,
>     "keyword"."parentKeywordText" VARCHAR,
>     "keyword"."refinementName" VARCHAR,
>     "keyword"."refinementValue" VARCHAR,
>     "keyword"."relatedKeywordRank" VARCHAR
> ) IMMUTABLE_ROWS=true, COMPRESSION='SNAPPY';
>
> On Thu, Jan 30, 2014 at 8:50 AM, Justin Workman <[email protected]> wrote:
>
>> I don't think that is the issue we are hitting. Details are below. The
>> HBase table does have more columns than we define in the Phoenix table;
>> we were hoping to be able to use the dynamic column feature if and when
>> we need to access data in the other columns of the underlying table (a
>> sketch of the kind of query we mean follows below). As you can see from
>> the output of the explain statement below, a simple query does not use
>> the index.
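>>
>> For example, something along these lines is the kind of dynamic-column
>> query we had in mind ("someOtherColumn" is a made-up name here, just
>> for illustration):
>>
>> -- "someOtherColumn" is hypothetical; it exists only in the HBase
>> -- table, not in the Phoenix schema, so it is declared at query time
>> SELECT "keywordText", "someOtherColumn"
>> FROM SEO.KEYWORDIDEAS ("keyword"."someOtherColumn" VARCHAR)
>> WHERE "parentKeywordText" = 'table';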
>>
>> However, if I create an identical table using Phoenix, upsert into that
>> new table from the table below, and create the same index on the new
>> table, then the same select query does use the index on that table.
>>
>> So I am still very confused as to why the index is not invoked when the
>> table is created on top of an existing HBase table.
>>
>> HBase create table:
>>
>> > create 'SEO.KEYWORDIDEAS', { NAME=>'keyword', COMPRESSION=>'SNAPPY' }
>>
>> Phoenix create table:
>>
>> CREATE TABLE SEO.KEYWORDIDEAS (
>>     "pk" VARCHAR PRIMARY KEY,
>>     "keyword"."jobId" VARCHAR,
>>     "keyword"."jobName" VARCHAR,
>>     "keyword"."jobType" VARCHAR,
>>     "keyword"."keywordText" VARCHAR,
>>     "keyword"."parentKeywordText" VARCHAR,
>>     "keyword"."refinementName" VARCHAR,
>>     "keyword"."refinementValue" VARCHAR,
>>     "keyword"."relatedKeywordRank" VARCHAR
>> ) IMMUTABLE_ROWS=true;
>>
>> Create index:
>>
>> CREATE INDEX KWDIDX ON SEO.KEYWORDIDEAS ("parentKeywordText");
>>
>> Show and count indexes:
>>
>> +-----------+-------------+--------------+------------+-----------------+------------+------+------------------+---------------------------+-------------+-------------+-------+------------------+-----------+-----------+
>> | TABLE_CAT | TABLE_SCHEM | TABLE_NAME   | NON_UNIQUE | INDEX_QUALIFIER | INDEX_NAME | TYPE | ORDINAL_POSITION | COLUMN_NAME               | ASC_OR_DESC | CARDINALITY | PAGES | FILTER_CONDITION | DATA_TYPE | TYPE_NAME |
>> +-----------+-------------+--------------+------------+-----------------+------------+------+------------------+---------------------------+-------------+-------------+-------+------------------+-----------+-----------+
>> | null      | SEO         | KEYWORDIDEAS | true       | null            | KWDIDX     | 3    | 1                | keyword:parentKeywordText | A           | null        | null  | null             |           |           |
>> | null      | SEO         | KEYWORDIDEAS | true       | null            | KWDIDX     | 3    | 2                | :pk                       | A           | null        | null  | null             | 12        | V         |
>> | null      | SEO         | KEYWORDIDEAS | true       | null            | RA_TEST_ID | 3    | 1                | keyword:jobId             | A           | null        | null  | null             | 12        |           |
>> | null      | SEO         | KEYWORDIDEAS | true       | null            | RA_TEST_ID | 3    | 2                | :pk                       | A           | null        | null  | null             | 12        | V         |
>> +-----------+-------------+--------------+------------+-----------------+------------+------+------------------+---------------------------+-------------+-------------+-------+------------------+-----------+-----------+
>>
>> > select count(1) from seo.keywordideas;
>> +----------+
>> | COUNT(1) |
>> +----------+
>> | 423229   |
>> +----------+
>>
>> > select count(1) from seo.kwdidx;
>> +----------+
>> | COUNT(1) |
>> +----------+
>> | 423229   |
>> +----------+
>>
>> > explain select count(1) from seo.keywordideas where "parentKeywordText" = 'table';
>> +------------+
>> | PLAN       |
>> +------------+
>> | CLIENT PARALLEL 18-WAY FULL SCAN OVER SEO.KEYWORDIDEAS |
>> | SERVER FILTER BY keyword.parentKeywordText = 'table' |
>> | SERVER AGGREGATE INTO SINGLE ROW |
>> +------------+
>>
>> Now here is where I can get the index to be invoked:
>>
>> > CREATE TABLE SEO.NEW_KEYWORDIDEAS (
>>     PK VARCHAR PRIMARY KEY,
>>     JOB_ID VARCHAR,
>>     JOB_NAME VARCHAR,
>>     JOB_TYPE VARCHAR,
>>     KEYWORD_TEXT VARCHAR,
>>     PARENT_KEYWORD_TEXT VARCHAR,
>>     REFINEMENT_NAME VARCHAR,
>>     REFINEMENT_VALUE VARCHAR,
>>     RELATED_KEYWORD_RANK VARCHAR
>> ) IMMUTABLE_ROWS=true;
>>
>> > UPSERT INTO SEO.NEW_KEYWORDIDEAS SELECT * FROM SEO.KEYWORDIDEAS;
>>
>> > CREATE INDEX NEW_KWD_IDX ON SEO.NEW_KEYWORDIDEAS (PARENT_KEYWORD_TEXT);
>>
>> > explain select count(1) from seo.new_keywordideas where parent_keyword_text = 'table';
>> +------------+
>> | PLAN       |
>> +------------+
>> | CLIENT PARALLEL 1-WAY RANGE SCAN OVER SEO.NEW_KWD_IDX ['table'] |
>> | SERVER AGGREGATE INTO SINGLE ROW |
>> +------------+
>>
>> On Wed, Jan 29, 2014 at 5:21 PM, James Taylor <[email protected]> wrote:
>>
>>> Hi Justin,
>>> Please take a look at this FAQ:
>>> http://phoenix.incubator.apache.org/faq.html#/Why_isnt_my_secondary_index_being_used
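>>>
>>> To sketch the FAQ's point against your table (untested, and KWDIDX2 is
>>> a made-up index name):
>>>
>>> -- can use KWDIDX: the query references only indexed columns
>>> SELECT count(1) FROM SEO.KEYWORDIDEAS WHERE "parentKeywordText" = 'table';
>>>
>>> -- cannot use KWDIDX: "jobName" isn't part of the index, so this
>>> -- falls back to a full scan of the data table
>>> SELECT "jobName" FROM SEO.KEYWORDIDEAS WHERE "parentKeywordText" = 'table';
>>>
>>> -- a covered index carries the extra column, so the query above can use it
>>> CREATE INDEX KWDIDX2 ON SEO.KEYWORDIDEAS ("parentKeywordText")
>>>     INCLUDE ("jobName");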
>>>
>>> If that's not the case for you, can you include your CREATE TABLE,
>>> CREATE INDEX, SELECT statement, and EXPLAIN plan?
>>>
>>> Thanks,
>>> James
>>>
>>> On Wed, Jan 29, 2014 at 4:13 PM, Justin Workman <[email protected]> wrote:
>>>
>>>> I am seeing some odd behavior with indexes and would like some
>>>> clarification on how they are used.
>>>>
>>>> When I create a table in Phoenix on top of an existing HBase table and
>>>> then create an index on that table, I can see the index get built and
>>>> populated properly; however, when I run an explain on my queries, none
>>>> of them show that they are using the index.
>>>>
>>>> However, if I create a separate table in Phoenix, do an upsert from my
>>>> HBase-backed table into the new table, and create the same index as on
>>>> the previous table, then the explain plan shows that my queries would
>>>> use the index.
>>>>
>>>> Are we not able to create or use an index on a table created over an
>>>> existing HBase table, or am I doing something wrong?
>>>>
>>>> Thanks in advance for any help.
>>>> Justin