chenzhiming created PHOENIX-3555:
------------------------------------

             Summary: Building async local index by IndexTool generate wrong data
                 Key: PHOENIX-3555
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3555
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.8.0
         Environment: phoenix4.8.0
            Reporter: chenzhiming

1. Create a salted table whose primary key is a VARCHAR:

    CREATE TABLE C_PICRECORD (
        ID VARCHAR NOT NULL PRIMARY KEY,
        "info".CAR_NUM VARCHAR(18) NULL,
        "info".CAP_DATE VARCHAR NULL,
        "info".ORG_ID BIGINT NULL,
        "info".ORG_NAME VARCHAR(255) NULL
    ) SALT_BUCKETS=3;

2. Upsert a row into the table:

    UPSERT INTO C_PICRECORD(ID,CAR_NUM,CAP_DATE,ORG_ID,ORG_NAME)
    VALUES('1','car1','2016-01-01 00:00:00',11,'orgname1');

3. Create an async local index:

    CREATE LOCAL INDEX C_PICRECORD_IDX_1 on C_PICRECORD("info".CAR_NUM,"info".CAP_DATE) ASYNC;

4. Use IndexTool to build the index:

    hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table C_PICRECORD --index-table C_PICRECORD_IDX_1 --output-path /tmp/C_PICRECORD_IDX_1

5. Open "hbase shell" and scan the salted table:

------------------------------------------------------------------------
hbase(main):102:0> scan 'C_PICRECORD'
ROW                                                         COLUMN+CELL
 \x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00  column=L#0:_0, timestamp=1483108992853, value=x
 \x021                                                      column=info:CAP_DATE, timestamp=1483021375797, value=2016-01-01 00:00:00
 \x021                                                      column=info:CAR_NUM, timestamp=1483021375797, value=car1
 \x021                                                      column=info:ORG_ID, timestamp=1483021375797, value=\x80\x00\x00\x00\x00\x00\x00\x0B
 \x021                                                      column=info:ORG_NAME, timestamp=1483021375797, value=orgname1
 \x021                                                      column=info:_0, timestamp=1483021375797, value=x
--------------------------------------------------------------------------

Note the index row key, which is wrong:

    \x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00

The correct index row key should be:

    \x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001

This is why a query through the index returns null (or empty) for every column that is not in the index:

0: jdbc:phoenix:master> SELECT ORG_ID,CAP_DATE,CAR_NUM,ORG_NAME FROM C_PICRECORD WHERE CAR_NUM='car1' AND CAP_DATE>='2016-01-01' AND CAP_DATE<='2016-05-02' LIMIT 10;
+---------+----------------------+----------+-----------+
| ORG_ID  |       CAP_DATE       | CAR_NUM  | ORG_NAME  |
+---------+----------------------+----------+-----------+
| null    | 2016-01-01 00:00:00  | car1     |           |
+---------+----------------------+----------+-----------+

PS: I get the correct index data if I change the primary key's data type to BIGINT, or if I upsert a longer string such as 'abc' as the primary key.
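
To make the difference between the two index row keys concrete, here is a minimal sketch in Python. It is not part of the original report. It only compares the two byte strings quoted above and makes no claim about how Phoenix lays out the key internally. The helper name trailing_bytes is hypothetical.

```python
# Row keys copied from the report; the hbase shell escapes are written
# here as Python bytes literals.
actual = b"\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00"
expected = b"\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001"


def trailing_bytes(actual_key: bytes, expected_key: bytes) -> bytes:
    """Return whatever follows expected_key inside actual_key.

    Raises ValueError if actual_key does not start with expected_key,
    since then the keys differ in more than a trailing suffix.
    """
    if not actual_key.startswith(expected_key):
        raise ValueError("actual key does not start with the expected key")
    return actual_key[len(expected_key):]


extra = trailing_bytes(actual, expected)
print(len(extra), extra)  # prints: 4 b'\x00\x00\x00\x00'
```

The key written by IndexTool is therefore the expected key plus four extra \x00 bytes. That is consistent with the observed symptom: the index row no longer points back at the data row \x021.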