[
https://issues.apache.org/jira/browse/PHOENIX-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Istvan Toth updated PHOENIX-5490:
---------------------------------
Affects Version/s: 4.13.0
> Queries with leading part of the primary key doing a parallel scan causes
> missing data
> --------------------------------------------------------------------------------------
>
> Key: PHOENIX-5490
> URL: https://issues.apache.org/jira/browse/PHOENIX-5490
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.13.0
> Reporter: Amarnath Ramamoorthi
> Priority: Major
> Attachments: foo_bigtable.zip
>
>
> Table properties
> * saltbucket_size=10
> * compression=snappy
> * normalization_enabled=true
> * SPLIT POLICY = ConstantSizeRegionSplitPolicy
> * SALT_BUCKETS = 10 (on creation)
> * ENCODING = FAST_DIFF
> PRIMARY KEY ("id_1", "id_2", "id_3", "id_4", "id_5", "id_6", "id")
> Partial primary key lookup is not working as expected.
> On using only one key "id_1" = 60268375; should return 2 rows
> {code:java|title=Parallel scan shows 1 rows}
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> explain SELECT "id_1",
> "id_2", "id_3", "id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE
> "id_1" = 60268375;
> +--------------------------------------------------------------------------------------------------------------------------+
> | PLAN
> |
> +--------------------------------------------------------------------------------------------------------------------------+
> | CLIENT 8-CHUNK 0 ROWS 0 BYTES PARALLEL 8-WAY ROUND ROBIN RANGE SCAN OVER
> foo.BIG_TABLE [0,60268375] - [9,60268375] |
> | SERVER FILTER BY FIRST KEY ONLY
> |
> +--------------------------------------------------------------------------------------------------------------------------+
> 2 rows selected (0.047 seconds)
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2",
> "id_3", "id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" =
> 60268375;
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | id_1 | id_2 | id_3 | id_4 | id_5 | id_6 | id
> |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | 60268375 | 1306309959 | 4 | 365224321 | 359338114 | 523433837 |
> 1052445537 |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> 1 row selected (3.24 seconds)
> {code}
> {code:java|title=Serial scan shows 2 rows}
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> explain SELECT "id_1",
> "id_2", "id_3", "id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE
> "id_1" = 60268375 limit 10;
> +-----------------------------------------------------------------------------------------------------------+
> | PLAN
> |
> +-----------------------------------------------------------------------------------------------------------+
> | CLIENT 10-CHUNK SERIAL 10-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE
> [0,60268375] - [9,60268375] |
> | SERVER FILTER BY FIRST KEY ONLY
> |
> | SERVER 10 ROW LIMIT
> |
> | CLIENT 10 ROW LIMIT
> |
> +-----------------------------------------------------------------------------------------------------------+
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2",
> "id_3", "id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" =
> 60268375 limit 10;
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | id_1 | id_2 | id_3 | id_4 | id_5 | id_6 | id
> |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | 60268375 | 1306309960 | 4 | 365224322 | 359338115 | 523433838 |
> 1052445538 |
> | 60268375 | 1306309959 | 4 | 365224321 | 359338114 | 523433837 |
> 1052445537 |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> 2 rows selected (0.12 seconds)
> {code}
> Now the missing row can be retrieved on using full primary key.
> {code:java}
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> select "id_1", "id_2",
> "id_3", "id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" IN
> (60268375) AND "id_2" IN (1306309960) AND "id_3" = 4 AND "id_4" = 365224322
> AND "id_5" = 359338115 AND "id_6" = 523433838 AND "id" = 1052445538;
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | id_1 | id_2 | id_3 | id_4 | id_5 | id_6 | id
> |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | 60268375 | 1306309960 | 4 | 365224322 | 359338115 | 523433838 |
> 1052445538 |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> 1 row selected (0.07 seconds)
> {code}
> – Phoenix (version 4.13)
> – HBase (Version 1.2.0)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)