[https://issues.apache.org/jira/browse/PHOENIX-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052297#comment-15052297]
Sumit Nigam edited comment on PHOENIX-2514 at 12/11/15 7:29 AM:
----------------------------------------------------------------
The attached file contains 10 sample entries out of the 200K+ entries in the
table. Note that the table schema is:
CREATE TABLE IF NOT EXISTS "ldmns:exDocStore" (CURRENT_TIMESTAMP BIGINT NOT
NULL, ID VARCHAR(96), CURR_EXDOC VARCHAR, CURR_CHECKSUM VARCHAR(32), PREV_EXDOC
VARCHAR, PREV_CHECKSUM VARCHAR(32), PREV_TIMESTAMP BIGINT, SUMMARY VARCHAR,
OBJ_SUMMARY VARCHAR, PARAM_SAMPLES VARCHAR, BULK_PUBLISH_UUID VARCHAR,
CONSTRAINT PK PRIMARY KEY(CURRENT_TIMESTAMP, ID))
COMPRESSION=SNAPPY, SALT_BUCKETS=8, BLOCKCACHE=FALSE
The secondary index on this table is:
CREATE INDEX IF NOT EXISTS "ldmns:indx_exdoc" ON "ldmns:exDocStore"(ID) INCLUDE
(SUMMARY, OBJ_SUMMARY, PARAM_SAMPLES, BULK_PUBLISH_UUID)
was (Author: sumit.nigam):
This file contains 10 sample entries out of 200K+ entries. Also, note that the
table schema for this one is -
CREATE TABLE IF NOT EXISTS "ldmns:exDocStore" (CURRENT_TIMESTAMP BIGINT NOT
NULL, ID VARCHAR(96), CURR_EXDOC VARCHAR, CURR_CHECKSUM VARCHAR(32), PREV_EXDOC
VARCHAR, PREV_CHECKSUM VARCHAR(32), PREV_TIMESTAMP BIGINT, SUMMARY VARCHAR,
OBJ_SUMMARY VARCHAR, PARAM_SAMPLES VARCHAR, BULK_PUBLISH_UUID VARCHAR,
CONSTRAINT PK PRIMARY KEY(CURRENT_TIMESTAMP, ID))
COMPRESSION=SNAPPY, SALT_BUCKETS=8, BLOCKCACHE=FALSE
> Even with ORDER BY clause the LIMIT does not work correctly with salted
> tables containing many records.
> -------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2514
> URL: https://issues.apache.org/jira/browse/PHOENIX-2514
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.5.1
> Environment: HBase-0.98.14
> Reporter: Sumit Nigam
> Priority: Critical
> Labels: LIMIT, hbase, phoenix, salted
> Fix For: 4.5.1
>
> Attachments: data.zip
>
>
> A query such as SELECT CURRENT_TIMESTAMP FROM TBL ORDER BY CURRENT_TIMESTAMP
> DESC LIMIT 1 does not really return the MAX(CURRENT_TIMESTAMP). The table is
> salted and has 200272 records.
> select current_timestamp from TBL order by current_timestamp desc limit 1;
> +------------------------------------------+
> | CURRENT_TIMESTAMP |
> +------------------------------------------+
> | 1448815328556 |
> +------------------------------------------+
> select max(current_timestamp) from TBL;
> +------------------------------------------+
> | MAX("CURRENT_TIMESTAMP") |
> +------------------------------------------+
> | 1449732792090 |
> +------------------------------------------+
> The results differ. MAX, of course, returns the correct value.
> The query above is one example. Other queries with ORDER BY and LIMIT also
> seem to return incorrect records.
> Is it also correct that LIMIT works fine when a WHERE clause restricts the
> number of projected records? That is what I seem to be observing.
> The table DDL is:
> CREATE TABLE IF NOT EXISTS TBL
> (CURRENT_TIMESTAMP BIGINT NOT NULL, ID VARCHAR(96), CURR_EXDOC VARCHAR,
> CURR_CHECKSUM VARCHAR(32), SUMMARY VARCHAR,
> CONSTRAINT PK PRIMARY KEY(CURRENT_TIMESTAMP, ID))
> BLOCKCACHE=FALSE, COMPRESSION=SNAPPY, SALT_BUCKETS=8
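For reference, here is a minimal sketch (plain Python, not Phoenix code) of the
result the two queries above should agree on. It assumes that each salt bucket
keeps its own rows sorted and that the per-bucket results are merged before
LIMIT is applied. The function name and the bucket contents are invented for
illustration, except for the two timestamps taken from the report. It shows the
expected behaviour only and says nothing about the actual cause of the bug.

```python
import heapq
from itertools import islice


def top_n_desc(buckets, n):
    """Return the n largest values across all buckets.

    Each bucket must already be sorted in descending order, which mirrors
    the assumption that rows are ordered within a salt bucket but not
    across buckets.
    """
    # heapq.merge lazily merges pre-sorted inputs; reverse=True tells it
    # the inputs are sorted from largest to smallest.
    merged = heapq.merge(*buckets, reverse=True)
    return list(islice(merged, n))


# Hypothetical layout: three buckets instead of eight, to keep it short.
buckets = [
    [1448815328556, 1448000000000],  # contains the value LIMIT 1 returned
    [1449732792090, 1447000000000],  # contains the value MAX returned
    [1449000000000],
]

# ORDER BY ... DESC LIMIT 1 should match MAX over all buckets.
limit_one = top_n_desc(buckets, 1)
overall_max = max(value for bucket in buckets for value in bucket)
assert limit_one == [overall_max]
```

Taking the first row of any single bucket is not enough: in the layout above,
the first bucket alone would yield 1448815328556 even though a larger value
sits in another bucket.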