[jira] [Commented] (PHOENIX-3685) Extra DeleteFamily marker in non tx index table when setting covered column to null
[ https://issues.apache.org/jira/browse/PHOENIX-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884095#comment-15884095 ] James Taylor commented on PHOENIX-3685: --- That sounds like the root cause, [~tdsilva]. Is it easy to fix? I didn't see it happening for transactional tables. > Extra DeleteFamily marker in non tx index table when setting covered column > to null > --- > > Key: PHOENIX-3685 > URL: https://issues.apache.org/jira/browse/PHOENIX-3685 > Project: Phoenix > Issue Type: Bug >Reporter: James Taylor >Assignee: Thomas D'Silva > Attachments: PHOENIX-3685-test.patch > > > Based on some testing (see patch), I noticed a mysterious DeleteFamily marker > when a covered column is set to null. This could potentially delete an actual > row with that row key, so it's bad. > Here's a raw scan dump taken after the MutableIndexIT.testCoveredColumns() > test: > {code} > dumping IDX_T02;hconnection-0x211e75ea ** > \x00a/0:/1487356752097/DeleteFamily/vlen=0/seqid=0 value = > x\x00a/0:0:V2/1487356752231/Put/vlen=1/seqid=0 value = 4 > x\x00a/0:0:V2/1487356752225/Put/vlen=1/seqid=0 value = 4 > x\x00a/0:0:V2/1487356752202/Put/vlen=1/seqid=0 value = 3 > x\x00a/0:0:V2/1487356752149/DeleteColumn/vlen=0/seqid=0 value = > x\x00a/0:0:V2/1487356752097/Put/vlen=1/seqid=0 value = 1 > x\x00a/0:_0/1487356752231/Put/vlen=2/seqid=0 value = _0 > x\x00a/0:_0/1487356752225/Put/vlen=2/seqid=0 value = _0 > x\x00a/0:_0/1487356752202/Put/vlen=2/seqid=0 value = _0 > x\x00a/0:_0/1487356752149/Put/vlen=2/seqid=0 value = _0 > x\x00a/0:_0/1487356752097/Put/vlen=2/seqid=0 value = _0 > --- > {code} > That first DeleteFamily marker shouldn't be there. This occurs for both > global and local indexes, but not for transactional tables. A further > optimization would be not to issue the first Put since the value behind it is > the same. > On the plus side, we're not issuing DeleteFamily markers when only the > covered column is being set which is good. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (PHOENIX-3690) Support declaring default values in Phoenix-Calcite
[ https://issues.apache.org/jira/browse/PHOENIX-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla resolved PHOENIX-3690. -- Resolution: Fixed > Support declaring default values in Phoenix-Calcite > --- > > Key: PHOENIX-3690 > URL: https://issues.apache.org/jira/browse/PHOENIX-3690 > Project: Phoenix > Issue Type: Sub-task >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Labels: calcite > Attachments: PHOENIX-3690.patch, PHOENIX-3690_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PHOENIX-3690) Support declaring default values in Phoenix-Calcite
[ https://issues.apache.org/jira/browse/PHOENIX-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3690: - Attachment: PHOENIX-3690_v2.patch Here is the patch that addresses the review comments. Thanks for the review, [~maryannxue], [~kliew]. Going to commit it.
[jira] [Commented] (PHOENIX-3690) Support declaring default values in Phoenix-Calcite
[ https://issues.apache.org/jira/browse/PHOENIX-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883983#comment-15883983 ] Kevin Liew commented on PHOENIX-3690: - I suggest we set PhoenixSchema.typeFactory and PhoenixTable.initializerExpressionFactory in the constructor and make them immutable. We can make the InitializerExpressionFactory a nested class so it becomes more readable. {code:java} initializerExpressionFactory = typeFactory == null ? null : new PhoenixTableInitializerExpressionFactory(typeFactory); {code} There is some common code in PhoenixPrepareImpl that can be made into a private method.
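The refactoring suggested above can be sketched in isolation. All types here (TypeFactory, PhoenixTableSketch) are simplified hypothetical stand-ins, not the actual Calcite or Phoenix classes; the point is the constructor-assigned final field plus the nested factory class.

```java
// Sketch of the suggestion: assign the factory once in the constructor and
// keep the field final (immutable). The nested class keeps the conditional
// construction readable. Names are hypothetical stand-ins.
public class PhoenixTableSketch {
    interface TypeFactory {}

    /** Nested, per the suggestion, so the table definition reads top-down. */
    static final class PhoenixTableInitializerExpressionFactory {
        final TypeFactory typeFactory;
        PhoenixTableInitializerExpressionFactory(TypeFactory tf) { this.typeFactory = tf; }
    }

    // Immutable: set exactly once in the constructor, never reassigned.
    public final PhoenixTableInitializerExpressionFactory initializerExpressionFactory;

    public PhoenixTableSketch(TypeFactory typeFactory) {
        this.initializerExpressionFactory = typeFactory == null
                ? null
                : new PhoenixTableInitializerExpressionFactory(typeFactory);
    }
}
```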
[jira] [Commented] (PHOENIX-3685) Extra DeleteFamily marker in non tx index table when setting covered column to null
[ https://issues.apache.org/jira/browse/PHOENIX-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883954#comment-15883954 ] Thomas D'Silva commented on PHOENIX-3685: - I think this happens because we always generate a delete along with the put for an index on a mutable table. When we insert a new row (rather than mutate an existing row), the old value of the indexed column in IndexMaintainer.buildDeleteMutation is null, so a family delete is generated to delete the old row. This delete is not required since we are inserting a new row.
{code}
public Delete buildDeleteMutation(KeyValueBuilder kvBuilder, ValueGetter oldState,
        ImmutableBytesWritable dataRowKeyPtr, Collection pendingUpdates, long ts,
        byte[] regionStartKey, byte[] regionEndKey) throws IOException {
    byte[] indexRowKey = this.buildRowKey(oldState, dataRowKeyPtr, regionStartKey, regionEndKey);
    // Delete the entire row if any of the indexed columns changed
    DeleteType deleteType = null;
    if (oldState == null || (deleteType = getDeleteTypeOrNull(pendingUpdates)) != null
            || hasIndexedColumnChanged(oldState, pendingUpdates)) { // Deleting the entire row
        byte[] emptyCF = emptyKeyValueCFPtr.copyBytesIfNecessary();
        Delete delete = new Delete(indexRowKey);
        for (ColumnReference ref : getCoveredColumns()) {
            ColumnReference indexColumn = coveredColumnsMap.get(ref);
            // If table delete was single version, then index delete should be as well
            if (deleteType == DeleteType.SINGLE_VERSION) {
                delete.deleteFamilyVersion(indexColumn.getFamily(), ts);
            } else {
                delete.deleteFamily(indexColumn.getFamily(), ts);
            }
        }
        if (deleteType == DeleteType.SINGLE_VERSION) {
            delete.deleteFamilyVersion(emptyCF, ts);
        } else {
            delete.deleteFamily(emptyCF, ts);
        }
        delete.setDurability(!indexWALDisabled ? Durability.USE_DEFAULT : Durability.SKIP_WAL);
        return delete;
    }
{code}
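The fix implied by the diagnosis above, skipping the family delete when there is no prior row state, can be sketched as a small decision helper. Every name here (RowState, needsIndexRowDelete) is a hypothetical stand-in for illustration, not Phoenix's actual API.

```java
// Sketch of the decision described above: only emit an index row delete when
// there was prior row state whose indexed columns actually changed, or when
// the data mutation itself is a delete. Hypothetical names throughout.
public class IndexDeleteDecision {

    /** Prior state of the data row; null models a brand-new row being inserted. */
    public static final class RowState {
        final boolean indexedColumnsChanged;
        public RowState(boolean changed) { this.indexedColumnsChanged = changed; }
    }

    /**
     * Returns true if a DeleteFamily mutation should be issued against the
     * index row. For a brand-new row (oldState == null and not a row delete)
     * there is nothing to clean up, so no delete marker is needed; emitting
     * one anyway is exactly the extra marker reported in this issue.
     */
    public static boolean needsIndexRowDelete(RowState oldState, boolean isRowDelete) {
        if (isRowDelete) {
            return true; // the data row is going away; delete the index row too
        }
        if (oldState == null) {
            return false; // new row: no old index row exists to delete
        }
        return oldState.indexedColumnsChanged; // mutation: delete only if the key changed
    }
}
```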
[jira] [Updated] (PHOENIX-3695) CaseStatementIT#testUnfoundSingleColumnCaseStatement is failing on encode columns branch
[ https://issues.apache.org/jira/browse/PHOENIX-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-3695: -- Issue Type: Sub-task (was: Bug) Parent: PHOENIX-1598 > CaseStatementIT#testUnfoundSingleColumnCaseStatement is failing on encode > columns branch > > > Key: PHOENIX-3695 > URL: https://issues.apache.org/jira/browse/PHOENIX-3695 > Project: Phoenix > Issue Type: Sub-task >Reporter: Samarth Jain >Assignee: Samarth Jain > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (PHOENIX-3695) CaseStatementIT#testUnfoundSingleColumnCaseStatement is failing on encode columns branch
Samarth Jain created PHOENIX-3695: - Summary: CaseStatementIT#testUnfoundSingleColumnCaseStatement is failing on encode columns branch Key: PHOENIX-3695 URL: https://issues.apache.org/jira/browse/PHOENIX-3695 Project: Phoenix Issue Type: Bug Reporter: Samarth Jain Assignee: Samarth Jain -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (PHOENIX-3695) CaseStatementIT#testUnfoundSingleColumnCaseStatement is failing on encode columns branch
[ https://issues.apache.org/jira/browse/PHOENIX-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883918#comment-15883918 ] Samarth Jain commented on PHOENIX-3695: --- This test seems to be failing for immutable indexes. Need to look deeper.
[jira] [Updated] (PHOENIX-3649) After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on immutable index creation with multiple regions on single RS
[ https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-3649: -- Priority: Blocker (was: Major) > After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on > immutable index creation with multiple regions on single RS > -- > > Key: PHOENIX-3649 > URL: https://issues.apache.org/jira/browse/PHOENIX-3649 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.9.0 >Reporter: Mujtaba Chohan >Priority: Blocker > Fix For: 4.9.1, 4.10.0 > > Attachments: PHOENIX-3649.patch > > > *Configuration* > hbase-0.98.23 standalone > Heap 5GB > *When* > Verified that this happens after PHOENIX-3271 Distribute UPSERT SELECT across > cluster. > https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=accd4a276d1085e5d1069caf93798d8f301e4ed6 > To repro > {noformat} > CREATE TABLE INDEXED_TABLE (HOST CHAR(2) NOT NULL,DOMAIN VARCHAR NOT NULL, > FEATURE VARCHAR NOT NULL,DATE DATE NOT NULL,USAGE.CORE BIGINT,USAGE.DB > BIGINT,STATS.ACTIVE_VISITOR INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN, > FEATURE, DATE)) IMMUTABLE_ROWS=true,MAX_FILESIZE=30485760 > {noformat} > Upsert 2M rows (CSV is available at https://goo.gl/OsTSKB) that will create > ~4 regions on a single RS and then create index with data present > {noformat} > CREATE INDEX idx5 ON INDEXED_TABLE (CORE) INCLUDE (DB,ACTIVE_VISITOR) > {noformat} > From RS log > {noformat} > 2017-02-02 13:29:06,899 WARN [rs,51371,1486070044538-HeapMemoryChore] > regionserver.HeapMemoryManager: heapOccupancyPercent 0.97875696 is above heap > occupancy alarm watermark (0.95) > 2017-02-02 13:29:18,198 INFO [SessionTracker] server.ZooKeeperServer: > Expiring session 0x15a00ad4f31, timeout of 1ms exceeded > 2017-02-02 13:29:18,231 WARN [JvmPauseMonitor] util.JvmPauseMonitor: > Detected pause in JVM or host machine (eg GC): pause of approximately 10581ms > GC pool 'ParNew' had collection(s): count=4 time=139ms > 2017-02-02 
13:29:19,669 FATAL [RS:0;rs:51371-EventThread] > regionserver.HRegionServer: ABORTING region server rs,51371,1486070044538: > regionserver:51371-0x15a00ad4f31, quorum=localhost:2181, baseZNode=/hbase > regionserver:51371-0x15a00ad4f31 received expired from ZooKeeper, aborting > {noformat} > Prior to the change index creation succeeds with as little as 2GB heap. > [~an...@apache.org] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (PHOENIX-3649) After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on immutable index creation with multiple regions on single RS
[ https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor reassigned PHOENIX-3649: - Assignee: Ankit Singhal
[jira] [Commented] (PHOENIX-3649) After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on immutable index creation with multiple regions on single RS
[ https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883829#comment-15883829 ] Mujtaba Chohan commented on PHOENIX-3649: - [~an...@apache.org] Just checking in: when can we expect your commit?
[jira] [Commented] (PHOENIX-3667) Optimize BooleanExpressionFilter for tables with encoded columns
[ https://issues.apache.org/jira/browse/PHOENIX-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883723#comment-15883723 ] James Taylor commented on PHOENIX-3667: --- +1. Looks great, [~samarthjain]. > Optimize BooleanExpressionFilter for tables with encoded columns > > > Key: PHOENIX-3667 > URL: https://issues.apache.org/jira/browse/PHOENIX-3667 > Project: Phoenix > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Samarth Jain > Attachments: PHOENIX-3667.patch, PHOENIX-3667_v2.patch, > PHOENIX-3667_wip.patch, WhereClause.jpg > > > The client side of Phoenix determines the subclass of BooleanExpressionFilter > we use based on how many column families and column qualifiers are being > referenced. The idea is to minimize the lookup cost during filter evaluation. > For encoded columns, instead of using a Map or Set, we can create a few new > subclasses of BooleanExpressionFilter that use an array instead. No need for > any lookups or equality checks - just fill in the position based on the > column qualifier value instead. Since filters are applied on every row > between the start/stop key, this will improve performance quite a bit. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PHOENIX-3667) Optimize BooleanExpressionFilter for tables with encoded columns
[ https://issues.apache.org/jira/browse/PHOENIX-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-3667: -- Attachment: PHOENIX-3667_v2.patch Thanks for the reviews, [~jamestaylor]. Updated patch with the getCellAtIndex() method as you suggested.
[jira] [Commented] (PHOENIX-3667) Optimize BooleanExpressionFilter for tables with encoded columns
[ https://issues.apache.org/jira/browse/PHOENIX-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883678#comment-15883678 ] James Taylor commented on PHOENIX-3667: --- This looks good, except for the EncodedCQIncrementalResultTuple.getValue(int index) call. When used, this call allows the Cells to be iterated over like this:
{code}
for (int i = 0; i < tuple.size(); i++) {
    System.out.println(tuple.getValue(i));
}
{code}
Here's how I'd recommend implementing getCellAtIndex (with a javadoc comment that lets folks know it won't perform well):
{code}
public Cell getCellAtIndex(int index) {
    int bitIndex;
    for (bitIndex = filteredQualifiers.nextSetBit(0); bitIndex >= 0 && index > 0;
            bitIndex = filteredQualifiers.nextSetBit(bitIndex + 1)) {
        index--;
    }
    if (bitIndex < 0) {
        throw new NoSuchElementException();
    }
    return filteredCells[bitIndex];
}
{code}
[jira] [Updated] (PHOENIX-3667) Optimize BooleanExpressionFilter for tables with encoded columns
[ https://issues.apache.org/jira/browse/PHOENIX-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-3667: -- Attachment: WhereClause.jpg My first perf test was to track query performance as we increase the number of columns we filter by in the where clause. As the number of columns increased, the query performance kept getting worse with the older filter. Around 20 columns or so, the perf difference was close to 100%.
[jira] [Commented] (PHOENIX-3694) Drop schema does not invalidate schema from the server cache
[ https://issues.apache.org/jira/browse/PHOENIX-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883660#comment-15883660 ] Hadoop QA commented on PHOENIX-3694: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12854566/PHOENIX-3694.patch against master branch at commit 05c37a91511b21d01b30107b5fd4dc98eacb041f. ATTACHMENT ID: 12854566 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 42 warning messages. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. 
The patch failed these unit tests: ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.RenewLeaseIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.UpsertSelectIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ServerExceptionIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ArithmeticQueryIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SubqueryIT Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/780//testReport/ Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/780//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/780//console This message is automatically generated. > Drop schema does not invalidate schema from the server cache > > > Key: PHOENIX-3694 > URL: https://issues.apache.org/jira/browse/PHOENIX-3694 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Fix For: 4.10.0 > > Attachments: PHOENIX-3694.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PHOENIX-3667) Optimize BooleanExpressionFilter for tables with encoded columns
[ https://issues.apache.org/jira/browse/PHOENIX-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-3667: -- Attachment: PHOENIX-3667.patch Updated patch with review comments. I also added a new filter, EncodedQualifiersProjectionFilter, which removes the need to use a TreeMap for tracking the column qualifiers. I have instead used a BitSet where the index into the bit set is the decoded column qualifier. Will list perf test results in the next comment.
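The BitSet idea described above can be sketched in isolation with java.util.BitSet. The class name below is a hypothetical stand-in, and it assumes qualifiers have already been decoded to small ints; the real codec lives in Phoenix's encoded-columns utilities.

```java
import java.util.BitSet;

// Minimal sketch: membership tests on encoded column qualifiers become a
// single bit lookup rather than a TreeMap search over qualifier byte arrays.
public class QualifierBitSetSketch {
    private final BitSet trackedQualifiers = new BitSet();

    /** Mark a (decoded) qualifier as projected/tracked. */
    public void track(int decodedQualifier) {
        trackedQualifiers.set(decodedQualifier);
    }

    /** O(1) check, versus O(log n) comparisons for a TreeMap keyed on byte[]. */
    public boolean isTracked(int decodedQualifier) {
        return trackedQualifiers.get(decodedQualifier);
    }
}
```

Since filters run on every row between the start/stop key, replacing a per-cell map lookup with a bit test is where the reported perf win comes from.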
[jira] [Commented] (PHOENIX-3663) Implement resource controls in Phoenix JDBC driver
[ https://issues.apache.org/jira/browse/PHOENIX-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883612#comment-15883612 ] James Taylor commented on PHOENIX-3663: --- +1. LGTM. Thanks, [~gjacoby]! > Implement resource controls in Phoenix JDBC driver > -- > > Key: PHOENIX-3663 > URL: https://issues.apache.org/jira/browse/PHOENIX-3663 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.9.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby > Fix For: 4.10.0 > > Attachments: PHOENIX-3663.patch, PHOENIX-3663.v2.patch > > > It would be very useful for service protection to be able to limit how many > Phoenix connections a particular client machine can request at one time. > This feature should be optional, and default to off for backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PHOENIX-3663) Implement resource controls in Phoenix JDBC driver
[ https://issues.apache.org/jira/browse/PHOENIX-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Jacoby updated PHOENIX-3663: - Attachment: PHOENIX-3663.v2.patch Minor revision to the patch to use SQLExceptionInfo.Builder and SQLExceptionCode to generate the exception.
[jira] [Commented] (PHOENIX-3694) Drop schema does not invalidate schema from the server cache
[ https://issues.apache.org/jira/browse/PHOENIX-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883457#comment-15883457 ] Sergey Soldatov commented on PHOENIX-3694: -- LGTM, except for the incorrect indentation (tabs are used instead of spaces).
[jira] [Commented] (PHOENIX-3663) Implement resource controls in Phoenix JDBC driver
[ https://issues.apache.org/jira/browse/PHOENIX-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883451#comment-15883451 ] James Taylor commented on PHOENIX-3663: --- Patch looks good, [~gjacoby]. One minor issue: - Instead of instantiating a SQLException directly, create a new enum in SQLExceptionCode and use SQLExceptionInfo.Builder to create it: {code} @Override public void addConnection(PhoenixConnection connection) throws SQLException { -connectionQueues.get(getQueueIndex(connection)).add(new WeakReference(connection)); -if (returnSequenceValues) { +if (returnSequenceValues || shouldThrottleNumConnections) { synchronized (connectionCountLock) { +if (shouldThrottleNumConnections && connectionCount + 1 > maxConnectionsAllowed){ +GLOBAL_PHOENIX_CONNECTIONS_THROTTLED_COUNTER.increment(); +throw new SQLException(String.format("Could not create connection " + +"because this client already has the maximum number" + +" of %d connections to the target cluster.", maxConnectionsAllowed)); +} + {code} Something like this: {code} throw new SQLExceptionInfo.Builder(SQLExceptionCode.MAX_CONNECTIONS_EXCEEDED) .setMessage("Maximum connections = " + maxConnectionsAllowed) .build().buildException(); {code} That way it'll have a unique error code that clients can match against. You could optionally create a new class derived from SQLException if we think it's likely that clients might catch this exception. You'd then instantiate the class in SQLExceptionCode (see other examples there). > Implement resource controls in Phoenix JDBC driver > -- > > Key: PHOENIX-3663 > URL: https://issues.apache.org/jira/browse/PHOENIX-3663 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.9.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby > Attachments: PHOENIX-3663.patch > > > It would be very useful for service protection to be able to limit how many > Phoenix connections a particular client machine can request at one time. 
> This feature should be optional, and default to off for backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
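The point of the suggested builder pattern is that the failure carries a stable numeric error code and SQLSTATE that clients can match on instead of parsing the message text. A minimal self-contained sketch of that idea follows; the enum, its numeric code/SQLSTATE values, and the Builder class are hypothetical stand-ins for Phoenix's actual SQLExceptionCode and SQLExceptionInfo.Builder (and collapse Phoenix's build().buildException() into a single call):

```java
import java.sql.SQLException;

public class ThrottleExceptionSketch {

    // Hypothetical error-code enum; Phoenix keeps these in SQLExceptionCode.
    // The numeric code and SQLSTATE here are made up for illustration.
    enum Code {
        MAX_CONNECTIONS_EXCEEDED(9001, "53300", "New connection throttled");

        final int errorCode;
        final String sqlState;
        final String message;

        Code(int errorCode, String sqlState, String message) {
            this.errorCode = errorCode;
            this.sqlState = sqlState;
            this.message = message;
        }
    }

    // Hypothetical builder mirroring SQLExceptionInfo.Builder's fluent style.
    static class Builder {
        private final Code code;
        private String detail = "";

        Builder(Code code) { this.code = code; }

        Builder setMessage(String msg) { this.detail = " " + msg; return this; }

        SQLException buildException() {
            // The numeric code and SQLSTATE ride along with the exception,
            // so clients never have to parse the message string.
            return new SQLException(code.message + "." + detail,
                    code.sqlState, code.errorCode);
        }
    }

    static SQLException throttled(int maxConnectionsAllowed) {
        return new Builder(Code.MAX_CONNECTIONS_EXCEEDED)
                .setMessage("Maximum connections = " + maxConnectionsAllowed)
                .buildException();
    }

    public static void main(String[] args) {
        try {
            throw throttled(100);
        } catch (SQLException e) {
            // Client-side matching on the stable code, not the message text.
            if (e.getErrorCode() == Code.MAX_CONNECTIONS_EXCEEDED.errorCode) {
                System.out.println("throttled: " + e.getMessage());
            }
        }
    }
}
```

This is also why a dedicated SQLException subclass is optional: as long as the vendor code is unique, a plain `e.getErrorCode()` check is enough for clients.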
[jira] [Updated] (PHOENIX-3694) Drop schema does not invalidate schema from the server cache
[ https://issues.apache.org/jira/browse/PHOENIX-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-3694: --- Attachment: PHOENIX-3694.patch > Drop schema does not invalidate schema from the server cache > > > Key: PHOENIX-3694 > URL: https://issues.apache.org/jira/browse/PHOENIX-3694 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Fix For: 4.10.0 > > Attachments: PHOENIX-3694.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (PHOENIX-3694) Drop schema does not invalidate schema from the server cache
Ankit Singhal created PHOENIX-3694: -- Summary: Drop schema does not invalidate schema from the server cache Key: PHOENIX-3694 URL: https://issues.apache.org/jira/browse/PHOENIX-3694 Project: Phoenix Issue Type: Bug Reporter: Ankit Singhal Assignee: Ankit Singhal Fix For: 4.10.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (PHOENIX-3686) De-couple PQS's use of Kerberos to talk to HBase and client authentication
[ https://issues.apache.org/jira/browse/PHOENIX-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883421#comment-15883421 ] Josh Elser commented on PHOENIX-3686: - Actually, I should add a unit test for this one too. Should be able to get something given what I already have in place. > De-couple PQS's use of Kerberos to talk to HBase and client authentication > -- > > Key: PHOENIX-3686 > URL: https://issues.apache.org/jira/browse/PHOENIX-3686 > Project: Phoenix > Issue Type: New Feature >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 4.10.0 > > Attachments: PHOENIX-3686.001.patch > > > Was trying to help a user who was using > https://bitbucket.org/lalinsky/python-phoenixdb to talk to PQS. After > upgrading Phoenix (to a version that actually included client > authentication), their application suddenly broke and they were upset. > Because they were running Phoenix/HBase on a cluster with Kerberos > authentication enabled, they suddenly "inherited" this client authentication. > AFAIK, the python-phoenixdb project doesn't presently include the ability to > authenticate via SPNEGO. This means a Phoenix upgrade broke their app, which > stinks. > This happens because, presently, when PQS sees that HBase is configured for > Kerberos auth (via hbase-site.xml), it assumes that clients should be > required to also authenticate via Kerberos to it. In certain circumstances, > users might not actually want to do this. > It's a pretty trivial change I've hacked together which shows that this is > possible, and I think that, with adequate disclaimer/documentation about this > property, it's OK to do. As long as we are very clear about what exactly this > configuration property is doing (allowing *anyone* into your HBase instance > as the PQS Kerberos user), it will unblock these users while the various > client drivers build proper support for authentication. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (PHOENIX-3667) Optimize BooleanExpressionFilter for tables with encoded columns
[ https://issues.apache.org/jira/browse/PHOENIX-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883416#comment-15883416 ] James Taylor commented on PHOENIX-3667: --- A few minor nits on patch for EncodedCQIncrementalResultTuple: - Can we name markTupleAsImmutable method as setImmutable to match the other one? - The refCount member variable goes from 0 to expectedCardinality directly. How about adding a setCell method to EncodedCQIncrementalResultTuple and calling that instead of making the filteredKeyValues.setCell call this directly? Then you could increment refCount when that's called. {code} +if (isQualifierForColumnInWhereExpression(qualifier)) { +filteredKeyValues.setCell(qualifier, cell); {code} - Can the getValue(int index) method call filteredKeyValues.get(index)? I know it won't be efficient, but it's not currently called anyway. > Optimize BooleanExpressionFilter for tables with encoded columns > > > Key: PHOENIX-3667 > URL: https://issues.apache.org/jira/browse/PHOENIX-3667 > Project: Phoenix > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Samarth Jain > Attachments: PHOENIX-3667_wip.patch > > > The client side of Phoenix determines the subclass of BooleanExpressionFilter > we use based on how many column families and column qualifiers are being > referenced. The idea is to minimize the lookup cost during filter evaluation. > For encoded columns, instead of using a Map or Set, we can create a few new > subclasses of BooleanExpressionFilter that use an array instead. No need for > any lookups or equality checks - just fill in the position based on the > column qualifier value instead. Since filters are applied on every row > between the start/stop key, this will improve performance quite a bit. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
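The array-based lookup the issue describes can be sketched as follows: when qualifiers are small dense integers (as with encoded columns), the filter can store cell values in a flat array indexed by qualifier, so filter evaluation needs no hashing or equals() checks per cell. The class and method names below (including setCell and getValue, echoing the review comment) are illustrative assumptions, not Phoenix's actual EncodedCQIncrementalResultTuple API:

```java
import java.util.Arrays;

public class EncodedQualifierArray {
    private final byte[][] values;   // position = qualifier - minQualifier
    private final int minQualifier;

    EncodedQualifierArray(int minQualifier, int maxQualifier) {
        this.minQualifier = minQualifier;
        this.values = new byte[maxQualifier - minQualifier + 1][];
    }

    // O(1) store: fill in the position derived from the encoded qualifier.
    void setCell(int qualifier, byte[] value) {
        values[qualifier - minQualifier] = value;
    }

    // O(1) lookup, no Map/Set involved.
    byte[] getValue(int qualifier) {
        return values[qualifier - minQualifier];
    }

    // Filters run on every row between the start/stop key, so the same
    // array is cleared and reused rather than reallocated per row.
    void reset() {
        Arrays.fill(values, null);
    }

    public static void main(String[] args) {
        EncodedQualifierArray arr = new EncodedQualifierArray(11, 20);
        arr.setCell(12, new byte[] {1});
        System.out.println(arr.getValue(12) != null); // prints true
    }
}
```

The trade-off is the one named in the issue: the array costs O(maxQualifier - minQualifier) memory per filter even for sparse references, in exchange for constant-time access on every cell.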
[jira] [Updated] (PHOENIX-3693) Update to Tephra 0.11.0
[ https://issues.apache.org/jira/browse/PHOENIX-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-3693: -- Fix Version/s: 4.10.0 > Update to Tephra 0.11.0 > --- > > Key: PHOENIX-3693 > URL: https://issues.apache.org/jira/browse/PHOENIX-3693 > Project: Phoenix > Issue Type: Bug >Reporter: James Taylor >Assignee: James Taylor > Fix For: 4.10.0 > > > When Tephra 0.11.0 is released, we should upgrade to it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (PHOENIX-3693) Update to Tephra 0.11.0
James Taylor created PHOENIX-3693: - Summary: Update to Tephra 0.11.0 Key: PHOENIX-3693 URL: https://issues.apache.org/jira/browse/PHOENIX-3693 Project: Phoenix Issue Type: Bug Reporter: James Taylor Assignee: James Taylor When Tephra 0.11.0 is released, we should upgrade to it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PHOENIX-3663) Implement resource controls in Phoenix JDBC driver
[ https://issues.apache.org/jira/browse/PHOENIX-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Jacoby updated PHOENIX-3663: - Attachment: PHOENIX-3663.patch Patch to create an optional per-ConnectionQueryServices max number of PhoenixConnections. I also noticed that the addendum to PHOENIX-3611 to remove the maxSize on the ConnectionInfo/CQSI cache had been reverted or not applied to trunk, so I re-did that as well. > Implement resource controls in Phoenix JDBC driver > -- > > Key: PHOENIX-3663 > URL: https://issues.apache.org/jira/browse/PHOENIX-3663 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.9.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby > Attachments: PHOENIX-3663.patch > > > It would be very useful for service protection to be able to limit how many > Phoenix connections a particular client machine can request at one time. > This feature should be optional, and default to off for backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (PHOENIX-3689) Not determinist order by with limit
[ https://issues.apache.org/jira/browse/PHOENIX-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882399#comment-15882399 ] chenglei edited comment on PHOENIX-3689 at 2/24/17 10:27 AM: - Thank you for adding the DDL, but it seems that the SQL you gave before does not match the DDL; it may be better if you give us a complete test case, just like PHOENIX-3578 does. Thanks. was (Author: comnetwork): Thank your for adding DDL,but it seems that the sql given before can not match the DDL, may be it is better that you give us a complete test case,just like the PHOENIX-3578 does,thanks. > Not determinist order by with limit > --- > > Key: PHOENIX-3689 > URL: https://issues.apache.org/jira/browse/PHOENIX-3689 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.7.0 >Reporter: Arthur > > The following request does not return the last value of myTable: > select * from myTable order by myKey desc limit 1; > Adding a 'group by myKey' clause gets back the good result. > I noticed that an order by with 'limit 10' returns a merge of 10 results from > each region and not 10 results of the whole request. > So 'order by' is not determinist. It is a bug or a feature ? 
> Here is my DDL: > CREATE TABLE TT (dt timestamp NOT NULL, message bigint NOT NULL, id > varchar(20) NOT NULL, version varchar CONSTRAINT PK PRIMARY KEY (dt, message, > id)); > And some data with a dynamic column (I have 2 millions of similar rows sorted > by time) : > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.3730',91,'','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.7170',91,'0001','PO','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:01.9030',91,'0002','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:02.7330',91,'0003','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:03.5470',91,'0004','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:04.7330',91,'0005','POUR','S_052305'); -- This message was sent by Atlassian JIRA (v6.3.15#6346)
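The behavior the reporter expects from ORDER BY ... LIMIT n across regions is a k-way merge of each region's already-sorted top-n into a single global top-n, not a concatenation of the per-region results. A minimal sketch of that merge semantics over plain longs (independent of Phoenix's actual client-side iterators) looks like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class TopNMerge {
    // Each region scan returns its rows already sorted descending; the
    // client must merge those sorted streams and keep only the global top n.
    static List<Long> globalTopN(List<List<Long>> regions, int n) {
        // max-heap of {value, regionIndex, offsetWithinRegion}
        PriorityQueue<long[]> heap =
                new PriorityQueue<>((a, b) -> Long.compare(b[0], a[0]));
        for (int r = 0; r < regions.size(); r++) {
            if (!regions.get(r).isEmpty()) {
                heap.add(new long[] {regions.get(r).get(0), r, 0});
            }
        }
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < n) {
            long[] top = heap.poll();
            out.add(top[0]);
            int r = (int) top[1];
            int next = (int) top[2] + 1;
            if (next < regions.get(r).size()) {
                heap.add(new long[] {regions.get(r).get(next), r, next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two "regions", each already sorted descending.
        List<List<Long>> regions =
                List.of(List.of(9L, 5L, 1L), List.of(8L, 7L, 2L));
        System.out.println(globalTopN(regions, 3)); // prints [9, 8, 7]
    }
}
```

A simple concatenation of per-region limits would instead return [9, 5, 1] here, which is the kind of non-deterministic, region-dependent result the report describes.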
[jira] [Comment Edited] (PHOENIX-3689) Not determinist order by with limit
[ https://issues.apache.org/jira/browse/PHOENIX-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882399#comment-15882399 ] chenglei edited comment on PHOENIX-3689 at 2/24/17 10:27 AM: - Thank you for adding the DDL, but it seems that the SQL given before does not match the DDL; it may be better if you give us a complete test case, just like PHOENIX-3578 does. Thanks. was (Author: comnetwork): Thank your for adding DDL,but it seems that the sql given before can not match the DDL, may be you had better give us a complete test case,just like the PHOENIX-3578 does,thanks. > Not determinist order by with limit > --- > > Key: PHOENIX-3689 > URL: https://issues.apache.org/jira/browse/PHOENIX-3689 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.7.0 >Reporter: Arthur > > The following request does not return the last value of myTable: > select * from myTable order by myKey desc limit 1; > Adding a 'group by myKey' clause gets back the good result. > I noticed that an order by with 'limit 10' returns a merge of 10 results from > each region and not 10 results of the whole request. > So 'order by' is not determinist. It is a bug or a feature ? 
> Here is my DDL: > CREATE TABLE TT (dt timestamp NOT NULL, message bigint NOT NULL, id > varchar(20) NOT NULL, version varchar CONSTRAINT PK PRIMARY KEY (dt, message, > id)); > And some data with a dynamic column (I have 2 millions of similar rows sorted > by time) : > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.3730',91,'','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.7170',91,'0001','PO','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:01.9030',91,'0002','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:02.7330',91,'0003','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:03.5470',91,'0004','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:04.7330',91,'0005','POUR','S_052305'); -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (PHOENIX-3689) Not determinist order by with limit
[ https://issues.apache.org/jira/browse/PHOENIX-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882399#comment-15882399 ] chenglei commented on PHOENIX-3689: --- Thank you for adding the DDL, but it seems that the SQL given before does not match the DDL; you should give us a complete test case, just like PHOENIX-3578 does. Thanks. > Not determinist order by with limit > --- > > Key: PHOENIX-3689 > URL: https://issues.apache.org/jira/browse/PHOENIX-3689 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.7.0 >Reporter: Arthur > > The following request does not return the last value of myTable: > select * from myTable order by myKey desc limit 1; > Adding a 'group by myKey' clause gets back the good result. > I noticed that an order by with 'limit 10' returns a merge of 10 results from > each region and not 10 results of the whole request. > So 'order by' is not determinist. It is a bug or a feature ? > Here is my DDL: > CREATE TABLE TT (dt timestamp NOT NULL, message bigint NOT NULL, id > varchar(20) NOT NULL, version varchar CONSTRAINT PK PRIMARY KEY (dt, message, > id)); > And some data with a dynamic column (I have 2 millions of similar rows sorted > by time) : > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.3730',91,'','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.7170',91,'0001','PO','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:01.9030',91,'0002','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:02.7330',91,'0003','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:03.5470',91,'0004','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:04.7330',91,'0005','POUR','S_052305'); -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PHOENIX-3689) Not determinist order by with limit
[ https://issues.apache.org/jira/browse/PHOENIX-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arthur updated PHOENIX-3689: Description: The following request does not return the last value of myTable: select * from myTable order by myKey desc limit 1; Adding a 'group by myKey' clause returns the correct result. I noticed that an order by with 'limit 10' returns a merge of 10 results from each region and not 10 results for the whole request. So 'order by' is not deterministic. Is it a bug or a feature? Here is my DDL: CREATE TABLE TT (dt timestamp NOT NULL, message bigint NOT NULL, id varchar(20) NOT NULL, version varchar CONSTRAINT PK PRIMARY KEY (dt, message, id)); And some data with a dynamic column (I have 2 million similar rows sorted by time): UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 03:31:00.3730',91,'','POUR','S_052303'); UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 03:31:00.7170',91,'0001','PO','S_052303'); UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 03:31:01.9030',91,'0002','POUR','S_052303'); UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 03:31:02.7330',91,'0003','POUR','S_052303'); UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 03:31:03.5470',91,'0004','POUR','S_052303'); UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 03:31:04.7330',91,'0005','POUR','S_052305'); was: The following request does not return the last value of myTable: select * from myTable order by myKey desc limit 1; Adding a 'group by myKey' clause gets back the good result. I noticed that an order by with 'limit 10' returns a merge of 10 results from each region and not 10 results of the whole request. So 'order by' is not determinist. It is a bug or a feature ? 
> Not determinist order by with limit > --- > > Key: PHOENIX-3689 > URL: https://issues.apache.org/jira/browse/PHOENIX-3689 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.7.0 >Reporter: Arthur > > The following request does not return the last value of myTable: > select * from myTable order by myKey desc limit 1; > Adding a 'group by myKey' clause gets back the good result. > I noticed that an order by with 'limit 10' returns a merge of 10 results from > each region and not 10 results of the whole request. > So 'order by' is not determinist. It is a bug or a feature ? > Here is my DDL: > CREATE TABLE TT (dt timestamp NOT NULL, message bigint NOT NULL, id > varchar(20) NOT NULL, version varchar CONSTRAINT PK PRIMARY KEY (dt, message, > id)); > And some data with a dynamic column (I have 2 millions of similar rows sorted > by time) : > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.3730',91,'','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:00.7170',91,'0001','PO','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:01.9030',91,'0002','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:02.7330',91,'0003','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:03.5470',91,'0004','POUR','S_052303'); > UPSERT INTO TT (dt, message, id, version, seg varchar) VALUES ('2013-12-03 > 03:31:04.7330',91,'0005','POUR','S_052305'); -- This message was sent by Atlassian JIRA (v6.3.15#6346)