Yufan, What version of Phoenix are you using? Thanks, James On Wed, Jul 1, 2015 at 2:34 PM, Yufan Liu <yli...@kent.edu> wrote:
> When I made more tests, I find that this problem happens after table got > split. > > Here is the DDL I use to create table and index: > CREATE TABLE IF NOT EXISTS t1 ( > uid BIGINT NOT NULL, > timestamp BIGINT NOT NULL, > eventName VARCHAR > CONSTRAINT my_pk PRIMARY KEY (uid, timestamp)) COMPRESSION='SNAPPY'; > > CREATE INDEX timestamp_index ON t1 (timestamp) INCLUDE (eventName) > > Attach is the sample data I used for test. It has about 4000 rows, when > the timestamp_index table has one region, the query returns correct result: > 1433334048443, but when I manually split it into 4 regions (use hbase > tool), it returns 1433333024961. > > Let know if you find anything. Thanks! > > > 2015-07-01 11:27 GMT-07:00 James Taylor <jamestay...@apache.org>: > >> If you could put a complete test (including your DDL and upsert of data), >> that would be much appreciated. >> Thanks, >> James >> >> On Wed, Jul 1, 2015 at 11:20 AM, Yufan Liu <yli...@kent.edu> wrote: >> >>> I have tried to use query "SELECT timestamp FROM t1 ORDER BY timestamp >>> DESC NULLS LAST LIMIT 1". But it still returns the same unexpected result. >>> There seems to be some internal problems related. >>> >>> 2015-06-30 18:03 GMT-07:00 James Taylor <jamestay...@apache.org>: >>> >>>> Yes, reverse scan will be leveraged when possible. Make you use NULLS >>>> LAST in your ORDER BY as rows are ordered with nulls first. >>>> >>>> On Tue, Jun 30, 2015 at 5:25 PM, Yufan Liu <yli...@kent.edu> wrote: >>>> >>>>> I used the HBase reverse scan to find the last row on the index table. >>>>> It returned the expected result. I would like to know is Phoenix's "ORDER >>>>> BY" >>>>> and "DESC" implemented based on HBase reverse scan? >>>>> >>>>> 2015-06-26 17:25 GMT-07:00 Yufan Liu <yli...@kent.edu>: >>>>> >>>>>> Thank you anyway, Michael! >>>>>> >>>>>> 2015-06-26 17:21 GMT-07:00 Michael McAllister < >>>>>> mmcallis...@homeaway.com>: >>>>>> >>>>>>> OK, I’m a Phoenix newbie, so that was the extent of the advice I >>>>>>> could give you. There are people here far more experienced than I am who >>>>>>> should be able to give you deeper advice. Have a great weekend! >>>>>>> >>>>>>> >>>>>>> >>>>>>> Mike >>>>>>> >>>>>>> >>>>>>> >>>>>>> *From:* Yufan Liu [mailto:yli...@kent.edu] >>>>>>> *Sent:* Friday, June 26, 2015 7:19 PM >>>>>>> *To:* user@phoenix.apache.org >>>>>>> *Subject:* Re: Problem in finding the largest value of an indexed >>>>>>> column >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hi Michael, >>>>>>> >>>>>>> Thanks for the advice, for the first one, it's "CLIENT 67-CHUNK >>>>>>> PARALLEL 1-WAY FULL SCAN OVER TIMESTAMP_INDEX; SERVER FILTER BY FIRST >>>>>>> KEY >>>>>>> ONLY; SERVER AGGREGATE INTO SINGLE ROW" which is as expected. For the >>>>>>> second one, it's "CLIENT 67-CHUNK SERIAL 1-WAY REVERSE FULL SCAN OVER >>>>>>> TIMESTAMP_INDEX; SERVER FILTER BY FIRST KEY ONLY; SERVER 1 ROW LIMIT" >>>>>>> which >>>>>>> looks correct, but still returns the unexpected result. >>>>>>> >>>>>>> >>>>>>> >>>>>>> 2015-06-26 16:59 GMT-07:00 Michael McAllister < >>>>>>> mmcallis...@homeaway.com>: >>>>>>> >>>>>>> Yufan >>>>>>> >>>>>>> >>>>>>> >>>>>>> Have you tried using the EXPLAIN command to see what plan is being >>>>>>> used to access the data? >>>>>>> >>>>>>> >>>>>>> >>>>>>> Michael McAllister >>>>>>> >>>>>>> Staff Data Warehouse Engineer | Decision Systems >>>>>>> >>>>>>> mmcallis...@homeaway.com | C: 512.423.7447 | skype: >>>>>>> michael.mcallister.ha <zimmk...@hotmail.com> | webex: >>>>>>> https://h.a/mikewebex >>>>>>> >>>>>>> [image: Description: Description: cid:3410354473_30269081] >>>>>>> >>>>>>> This electronic communication (including any attachment) is >>>>>>> confidential. If you are not an intended recipient of this >>>>>>> communication, >>>>>>> please be advised that any disclosure, dissemination, distribution, >>>>>>> copying >>>>>>> or other use of this communication or any attachment is strictly >>>>>>> prohibited. If you have received this communication in error, please >>>>>>> notify the sender immediately by reply e-mail and promptly destroy all >>>>>>> electronic and printed copies of this communication and any attachment. >>>>>>> >>>>>>> >>>>>>> >>>>>>> *From:* Yufan Liu [mailto:yli...@kent.edu] >>>>>>> *Sent:* Friday, June 26, 2015 6:31 PM >>>>>>> *To:* user@phoenix.apache.org >>>>>>> *Subject:* Problem in finding the largest value of an indexed column >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> We have created a table (eg, t1), and a global index of one numeric >>>>>>> column of t1 (eg, timestamp). Now we want to find the largest value of >>>>>>> timestamp, we have tried two approaches: >>>>>>> >>>>>>> >>>>>>> 1. select max(timestamp) from t1; This query takes forever to >>>>>>> finish, so I think it maybe doing a full table scan/comparison . >>>>>>> >>>>>>> 2. select timestamp from t1 order by timestamp desc limit 1; This >>>>>>> query finished fast, but the result it returns is far from the largest >>>>>>> value. It seems it just return the largest value for a certain range of >>>>>>> data. >>>>>>> >>>>>>> Did anyone else encounter this issue/have any suggestion? >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yufan >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> best, >>>>>>> Yufan >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> best, >>>>>> Yufan >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> best, >>>>> Yufan >>>>> >>>>> >>>> >>> >>> >>> -- >>> best, >>> Yufan >>> >>> >> > > > -- > best, > Yufan > >