[jira] [Updated] (PHOENIX-4847) Index tables will produce dirty data when use BukloadTool
     [ https://issues.apache.org/jira/browse/PHOENIX-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaanai Zhang updated PHOENIX-4847:
----------------------------------
    Description:
Index tables become inconsistent with the data table when CsvBulkLoadTool loads rows that update an existing primary key of the data table.

CSV data:
{code:sql}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Imported table:
{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
    SALT_BUCKETS = 8,
    UPDATE_CACHE_FREQUENCY = 12;
CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
{code}

The following are the query results after executing `CsvBulkLoadTool`:
{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}

  was:
Index tables become inconsistent with the data table when CsvBulkLoadTool loads rows that update an existing primary key of the data table.

CSV data:
{code:sql}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Imported table:
{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
    SALT_BUCKETS = 8,
    UPDATE_CACHE_FREQUENCY = 12;
CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
upsert into TABLE1 values('k1','v1_1',1);
upsert into TABLE1 values('k2','v2_2',2);
upsert into TABLE1 values('k1','v1_1',3);
{code}

The following are the query results after executing `CsvBulkLoadTool`:
{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}

> Index tables will produce dirty data when use BukloadTool
> ----------------------------------------------------------
>
>                 Key: PHOENIX-4847
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4847
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0, 4.13.0, 4.14.0, 5.0.0
>            Reporter: Jaanai Zhang
>            Priority: Critical
>
> Index tables become inconsistent with the data table when CsvBulkLoadTool
> loads rows that update an existing primary key of the data table.
>
> CSV data:
> {code:sql}
> k1,v1_1,1
> k2,v2_2,2
> k1,v1_1,3
> {code}
>
> Imported table:
> {code:sql}
> DROP TABLE IF EXISTS TABLE1;
> CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
>     SALT_BUCKETS = 8,
>     UPDATE_CACHE_FREQUENCY = 12;
> CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
> CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
> {code}
>
> The following are the query results after executing `CsvBulkLoadTool`:
> {code:sql}
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
> +-----+-------+-----+
> | ID  |  V1   | V2  |
> +-----+-------+-----+
> | k2  | v2_2  | 2   |
> | k1  | v1_1  | 3   |
> +-----+-------+-----+
> 2 rows selected (0.066 seconds)
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
> +-------+------+-------+
> | 0:V1  | :ID  | 0:V2  |
> +-------+------+-------+
> | v1_1  | k1   | 3     |
> | v2_2  | k2   | 2     |
> +-------+------+-------+
> 2 rows selected (0.074 seconds)
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
> +-------+------+-------+
> | 0:V2  | :ID  | 0:V1  |
> +-------+------+-------+
> | 1     | k1   | v1_1  |
> | 3     | k1   | v1_1  |
> | 2     | k2   | v2_2  |
> +-------+------+-------+
> 3 rows selected (0.062 seconds)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
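
The query results above show an asymmetry: V1_IDX agrees with TABLE1, while V2_IDX keeps an extra row (1, k1) for a value that the data table no longer holds. One possible reading, which is an assumption and not something the issue confirms, is that the bulk load writes one index row per CSV line and never deletes the index row derived from an earlier line with the same ID. The Python sketch below is only a toy model of that reading: plain dictionaries stand in for the tables, and `bulk_load` and `stale_index_rows` are hypothetical helpers, not Phoenix APIs.

```python
# Toy model of the reported behaviour; plain dicts stand in for the tables.
# Assumption: each CSV line yields a data row and one row per index, and no
# delete is issued for index rows derived from an earlier line with the same ID.

CSV_ROWS = [("k1", "v1_1", 1), ("k2", "v2_2", 2), ("k1", "v1_1", 3)]


def bulk_load(rows):
    """Return (data, v1_idx, v2_idx) after loading the rows in order."""
    data = {}    # ID -> (V1, V2); a later line overwrites an earlier one
    v1_idx = {}  # (V1, ID) -> V2, models V1_IDX on TABLE1(V1) include(v2)
    v2_idx = {}  # (V2, ID) -> V1, models V2_IDX on TABLE1(V2) include(v1)
    for row_id, v1, v2 in rows:
        data[row_id] = (v1, v2)
        v1_idx[(v1, row_id)] = v2
        v2_idx[(v2, row_id)] = v1
    return data, v1_idx, v2_idx


def stale_index_rows(data, v2_idx):
    """Return V2 index keys whose indexed value no longer matches the data row."""
    return sorted(
        (v2, row_id)
        for (v2, row_id) in v2_idx
        if row_id not in data or data[row_id][1] != v2
    )


data, v1_idx, v2_idx = bulk_load(CSV_ROWS)
print(len(data), len(v1_idx), len(v2_idx))  # 2 2 3
print(stale_index_rows(data, v2_idx))       # [(1, 'k1')]
```

Under this model V1_IDX stays consistent only because both k1 lines carry the same V1 value, so the second line overwrites the first index row. The V2 index keys differ (1 versus 3), so the older row survives, which matches the three rows reported for V2_IDX.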
[jira] [Updated] (PHOENIX-4847) Index tables will produce dirty data when use BukloadTool
     [ https://issues.apache.org/jira/browse/PHOENIX-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaanai Zhang updated PHOENIX-4847:
----------------------------------
    Description:
Index tables become inconsistent with the data table when CsvBulkLoadTool loads rows that update an existing primary key of the data table.

CSV data:
{code:sql}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Imported table:
{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
    SALT_BUCKETS = 8,
    UPDATE_CACHE_FREQUENCY = 12;
CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
upsert into TABLE1 values('k1','v1_1',1);
upsert into TABLE1 values('k2','v2_2',2);
upsert into TABLE1 values('k1','v1_1',3);
{code}

The following are the query results after executing `CsvBulkLoadTool`:
{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}

  was:
Index tables become inconsistent with the data table when CsvBulkLoadTool loads rows that update an existing primary key of the data table.

CSV data:
{code:csv}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Imported table:
{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
    SALT_BUCKETS = 8,
    UPDATE_CACHE_FREQUENCY = 12;
CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
upsert into TABLE1 values('k1','v1_1',1);
upsert into TABLE1 values('k2','v2_2',2);
upsert into TABLE1 values('k1','v1_1',3);
{code}

The following are the query results after executing `CsvBulkLoadTool`:
{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}

> Index tables will produce dirty data when use BukloadTool
> ----------------------------------------------------------
>
>                 Key: PHOENIX-4847
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4847
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0, 4.13.0, 4.14.0, 5.0.0
>            Reporter: Jaanai Zhang
>            Priority: Critical
>
> Index tables become inconsistent with the data table when CsvBulkLoadTool
> loads rows that update an existing primary key of the data table.
>
> CSV data:
> {code:sql}
> k1,v1_1,1
> k2,v2_2,2
> k1,v1_1,3
> {code}
>
> Imported table:
> {code:sql}
> DROP TABLE IF EXISTS TABLE1;
> CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
>     SALT_BUCKETS = 8,
>     UPDATE_CACHE_FREQUENCY = 12;
> CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
> CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
> upsert into TABLE1 values('k1','v1_1',1);
> upsert into TABLE1 values('k2','v2_2',2);
> upsert into TABLE1 values('k1','v1_1',3);
> {code}
>
> The following are the query results after executing `CsvBulkLoadTool`:
> {code:sql}
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
> +-----+-------+-----+
> | ID  |  V1   | V2  |
> +-----+-------+-----+
> | k2  | v2_2  | 2   |
> | k1  | v1_1  | 3   |
> +-----+-------+-----+
> 2 rows selected (0.066 seconds)
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
> +-------+------+-------+
> | 0:V1  | :ID  | 0:V2  |
> +-------+------+-------+
> | v1_1  | k1   | 3     |
> | v2_2  | k2   | 2     |
> +-------+------+-------+
> 2 rows selected (0.074 seconds)
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
> +-------+------+-------+
> | 0:V2  | :ID  | 0:V1  |
> +-------+------+-------+
> | 1     | k1   | v1_1  |
> | 3     | k1   | v1_1  |
> | 2     | k2   | v2_2  |
> +-------+------+
[jira] [Updated] (PHOENIX-4847) Index tables will produce dirty data when use BukloadTool
     [ https://issues.apache.org/jira/browse/PHOENIX-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaanai Zhang updated PHOENIX-4847:
----------------------------------
    Description:
Index tables become inconsistent with the data table when CsvBulkLoadTool loads rows that update an existing primary key of the data table.

CSV data:
{code:csv}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Imported table:
{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
    SALT_BUCKETS = 8,
    UPDATE_CACHE_FREQUENCY = 12;
CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
upsert into TABLE1 values('k1','v1_1',1);
upsert into TABLE1 values('k2','v2_2',2);
upsert into TABLE1 values('k1','v1_1',3);
{code}

The following are the query results after executing `CsvBulkLoadTool`:
{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}

  was:
Index tables become inconsistent with the data table when CsvBulkLoadTool loads rows that update an existing primary key of the data table.

CSV data:
{code:csv}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Imported table:
{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
    SALT_BUCKETS = 8,
    UPDATE_CACHE_FREQUENCY = 12;
CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
upsert into TABLE1 values('k1','v1_1',1);
upsert into TABLE1 values('k2','v2_2',2);
upsert into TABLE1 values('k1','v1_1',3);
{code}

The following are the query results after executing `CsvBulkLoadTool`:
{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}

> Index tables will produce dirty data when use BukloadTool
> ----------------------------------------------------------
>
>                 Key: PHOENIX-4847
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4847
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0, 4.13.0, 4.14.0, 5.0.0
>            Reporter: Jaanai Zhang
>            Priority: Critical
>
> Index tables become inconsistent with the data table when CsvBulkLoadTool
> loads rows that update an existing primary key of the data table.
>
> CSV data:
> {code:csv}
> k1,v1_1,1
> k2,v2_2,2
> k1,v1_1,3
> {code}
>
> Imported table:
> {code:sql}
> DROP TABLE IF EXISTS TABLE1;
> CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
>     SALT_BUCKETS = 8,
>     UPDATE_CACHE_FREQUENCY = 12;
> CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
> CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
> upsert into TABLE1 values('k1','v1_1',1);
> upsert into TABLE1 values('k2','v2_2',2);
> upsert into TABLE1 values('k1','v1_1',3);
> {code}
>
> The following are the query results after executing `CsvBulkLoadTool`:
> {code:sql}
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
> +-----+-------+-----+
> | ID  |  V1   | V2  |
> +-----+-------+-----+
> | k2  | v2_2  | 2   |
> | k1  | v1_1  | 3   |
> +-----+-------+-----+
> 2 rows selected (0.066 seconds)
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
> +-------+------+-------+
> | 0:V1  | :ID  | 0:V2  |
> +-------+------+-------+
> | v1_1  | k1   | 3     |
> | v2_2  | k2   | 2     |
> +-------+------+-------+
> 2 rows selected (0.074 seconds)
> 0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
> +-------+------+-------+
> | 0:V2  | :ID  | 0:V1  |
> +-------+------+-------+
> | 1     | k1   | v1_1  |
> | 3     | k1   | v1_1  |
> | 2     | k2   | v2_2  |
> +-------+------+--