[jira] [Updated] (PHOENIX-4847) Index tables will produce dirty data when using BulkLoadTool

2018-08-13 Thread Jaanai Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaanai Zhang updated PHOENIX-4847:
----------------------------------
Description: 
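Index tables become inconsistent with the data table when CsvBulkLoadTool loads
CSV data that updates an existing primary key of the data table, that is, when
the same primary key appears more than once in the input.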

CSV data:

{code:sql}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Table that the data is imported into:

{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
SALT_BUCKETS = 8,
UPDATE_CACHE_FREQUENCY = 12;

CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);
{code}

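The following are the query results after executing `CsvBulkLoadTool`. TABLE1 
and V1_IDX each return two rows, but V2_IDX returns three, including a stale 
entry for V2 = 1: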

{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}
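
One possible reading of the results above, sketched as a plain-Python model and not as Phoenix code: assume the bulk load derives index cells from each CSV line independently and never deletes the index entry of a row that a later line overwrites. Under that assumption the model reproduces the row counts shown above. The names `bulk_load` and `stale_index_rows` are hypothetical, and this report does not confirm the actual cause inside CsvBulkLoadTool.

```python
import csv
import io

# The three CSV lines from the report; k1 appears twice.
CSV_DATA = """k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
"""


def bulk_load(csv_text):
    """Model a load that writes index cells per input line and never removes old ones.

    This is an assumption about the behaviour, not the tool's real code path.
    """
    table1 = {}  # ID -> (V1, V2); a later line overwrites an earlier one
    v1_idx = {}  # (V1, ID) -> V2
    v2_idx = {}  # (V2, ID) -> V1
    for row_id, v1, v2 in csv.reader(io.StringIO(csv_text)):
        v2 = int(v2)
        table1[row_id] = (v1, v2)
        v1_idx[(v1, row_id)] = v2  # same key for both k1 lines, so it is overwritten
        v2_idx[(v2, row_id)] = v1  # different key per k1 line, so both survive
    return table1, v1_idx, v2_idx


def stale_index_rows(table1, index, column):
    """Return index keys whose indexed value no longer matches the data row."""
    return sorted(
        key
        for key in index
        if key[1] not in table1 or table1[key[1]][column] != key[0]
    )


table1, v1_idx, v2_idx = bulk_load(CSV_DATA)
print(len(table1), len(v1_idx), len(v2_idx))        # 2 2 3
print(stale_index_rows(table1, v2_idx, column=1))   # [(1, 'k1')]
```

In this model V1_IDX stays consistent only because both k1 lines carry the same V1 value; V2_IDX keeps the entry for V2 = 1 because nothing removes it when the line with V2 = 3 arrives.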


  was:
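Index tables become inconsistent with the data table when CsvBulkLoadTool loads
CSV data that updates an existing primary key of the data table, that is, when
the same primary key appears more than once in the input.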

CSV data:

{code:sql}
k1,v1_1,1
k2,v2_2,2
k1,v1_1,3
{code}

Table that the data is imported into:

{code:sql}
DROP TABLE IF EXISTS TABLE1;
CREATE TABLE TABLE1 (ID VARCHAR NOT NULL PRIMARY KEY, V1 VARCHAR, V2 BIGINT)
SALT_BUCKETS = 8,
UPDATE_CACHE_FREQUENCY = 12;

CREATE INDEX V1_IDX on TABLE1(V1) include(v2);
CREATE INDEX V2_IDX on TABLE1(V2) include(v1);

upsert into TABLE1 values('k1','v1_1',1);
upsert into TABLE1 values('k2','v2_2',2);
upsert into TABLE1 values('k1','v1_1',3);
{code}

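The following are the query results after executing `CsvBulkLoadTool`. TABLE1 
and V1_IDX each return two rows, but V2_IDX returns three, including a stale 
entry for V2 = 1: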

{code:sql}
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from table1;
+-----+-------+-----+
| ID  |  V1   | V2  |
+-----+-------+-----+
| k2  | v2_2  | 2   |
| k1  | v1_1  | 3   |
+-----+-------+-----+
2 rows selected (0.066 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V1_IDX;
+-------+------+-------+
| 0:V1  | :ID  | 0:V2  |
+-------+------+-------+
| v1_1  | k1   | 3     |
| v2_2  | k2   | 2     |
+-------+------+-------+
2 rows selected (0.074 seconds)
0: jdbc:phoenix:hb-bp18j460748jq41v0-002.hbas> select * from V2_IDX;
+-------+------+-------+
| 0:V2  | :ID  | 0:V1  |
+-------+------+-------+
| 1     | k1   | v1_1  |
| 3     | k1   | v1_1  |
| 2     | k2   | v2_2  |
+-------+------+-------+
3 rows selected (0.062 seconds)
{code}



> Index tables will produce dirty data when using BulkLoadTool
> ------------------------------------------------------------
>
>                 Key: PHOENIX-4847
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4847
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0, 4.13.0, 4.14.0, 5.0.0
>            Reporter: Jaanai Zhang
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

