[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
[ https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363598#comment-16363598 ]

James Taylor commented on PHOENIX-4344:
---------------------------------------

Phoenix will do a point delete (i.e. the Phoenix client will issue an HBase Delete with the full row key) because it thinks it has values for all the columns that make up the primary key of the base table. In this case, it doesn't need to issue a scan at all. The problem is, Phoenix doesn't know that there are derived views that have extended the PK.

One solution would be to have a declaration on the base table that it would never be used to upsert data directly. Something like declaring it ABSTRACT. In that case, if you deleted from it, Phoenix could know to issue a scan instead of trying to optimize it as a point delete.

Another solution would be to issue the delete statement against the view in the MR job. Since the view has extended the PK, Phoenix wouldn't issue a point delete, but would issue a scan. That might not be feasible, though, as it'd be tricky to know all the views.

> MapReduce Delete Support
> ------------------------
>
>                 Key: PHOENIX-4344
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4344
>             Project: Phoenix
>          Issue Type: New Feature
>    Affects Versions: 4.12.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> Phoenix already has the ability to use MapReduce for asynchronous handling of
> long-running SELECTs. It would be really useful to have this capability for
> long-running DELETEs, particularly of tables with indexes where using HBase's
> own MapReduce integration would be prohibitively complicated.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
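The difference between the two delete paths in the comment above can be illustrated with a small model. This is only a sketch under stated assumptions: `DeletePlanner`, `choosePlan`, and the `isAbstract` flag are hypothetical names invented for illustration, not Phoenix APIs, and ABSTRACT is only the declaration proposed in the comment, not an existing feature.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeletePlanner {
    enum Plan { POINT_DELETE, SCAN_DELETE }

    /**
     * Hypothetical decision rule: a point delete is only safe when every
     * base-table PK column is bound AND the table is known not to have
     * views that extend the PK (modeled here by the isAbstract flag).
     */
    static Plan choosePlan(List<String> basePkColumns, Set<String> boundColumns, boolean isAbstract) {
        if (isAbstract) {
            // Rows may live under longer row keys written through views,
            // so the full row key is not known: fall back to a scan.
            return Plan.SCAN_DELETE;
        }
        return boundColumns.containsAll(basePkColumns) ? Plan.POINT_DELETE : Plan.SCAN_DELETE;
    }

    public static void main(String[] args) {
        List<String> pk = Arrays.asList("ORG_ID", "ID");
        Set<String> bound = new HashSet<>(Arrays.asList("ORG_ID", "ID"));
        System.out.println(choosePlan(pk, bound, false)); // POINT_DELETE
        System.out.println(choosePlan(pk, bound, true));  // SCAN_DELETE
    }
}
```

With such a flag, the MapReduce job could keep issuing deletes against the base table without having to enumerate the views.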
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363594#comment-16363594 ]

chenglei commented on PHOENIX-4602:
-----------------------------------

Pushed to master, 4.x-HBase-1.3, 4.x-HBase-1.2, 4.x-HBase-1.1, 4.x-cdh5.11.2, 4.x-HBase-0.98, and 5.x-HBase-2.0 branches.

> OrExpression should can also push non-leading pk columns to scan
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4602
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4602
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.13.0
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4602_v2.patch
>
>
> Given the following table:
> {code}
> CREATE TABLE test_table (
>     PK1 INTEGER NOT NULL,
>     PK2 INTEGER NOT NULL,
>     PK3 INTEGER NOT NULL,
>     DATA INTEGER,
>     CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
> {code}
> and the following SQL:
> {code}
> select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
> {code}
> This is a typical case where the query should use SkipScanFilter. However,
> the query does not actually use a skip scan; it uses a range scan and pushes
> only the leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan.
> The explain plan is:
> {code:sql}
> CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
>     SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
> {code}
> I think the problem is caused by the
> {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method. At line 763 below,
> because pk2 is not the leading PK column, the method returns null, so the
> expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not
> pushed to the scan:
> {code:java}
> 757    boolean hasFirstSlot = true;
> 758    boolean prevIsNull = false;
> 759    // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
> 760    for (KeySlot slot : childSlot) {
> 761        if (hasFirstSlot) {
> 762            // if the first slot is null, return null immediately
> 763            if (slot == null) {
> 764                return null;
> 765            }
> 766            // mark that we've handled the first slot
> 767            hasFirstSlot = false;
> 768        }
> {code}
> For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it
> seems unnecessary to require that the PK column in the OrExpression be the
> leading PK column; guaranteeing that the OrExpression references only one PK
> column should be enough.
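To make the expected behavior concrete: for the query in the description, a skip scan would need the OR on pk2 expressed as a sorted list of disjoint key ranges, [4, 6) and [8, 9), which can then be combined with the pk1 range [2, 5). The sketch below shows only that range-coalescing step on plain integers. `OrRangeSketch`, `IntRange`, and `coalesce` are hypothetical names for illustration; this is not the code in WhereOptimizer or the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OrRangeSketch {
    /** Half-open integer range [lower, upper). */
    static final class IntRange {
        final int lower;
        final int upper;

        IntRange(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }

        @Override
        public String toString() {
            return "[" + lower + ", " + upper + ")";
        }
    }

    /** Turns the OR-ed ranges on a single column into sorted, disjoint ranges. */
    static List<IntRange> coalesce(List<IntRange> ranges) {
        List<IntRange> sorted = new ArrayList<>(ranges);
        sorted.sort((a, b) -> Integer.compare(a.lower, b.lower));
        List<IntRange> result = new ArrayList<>();
        for (IntRange r : sorted) {
            if (r.lower >= r.upper) {
                continue; // empty range contributes nothing to the OR
            }
            int lastIndex = result.size() - 1;
            if (lastIndex >= 0 && r.lower <= result.get(lastIndex).upper) {
                // Overlapping or adjacent: widen the previous range.
                IntRange last = result.remove(lastIndex);
                result.add(new IntRange(last.lower, Math.max(last.upper, r.upper)));
            } else {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // (pk2 >= 8 AND pk2 < 9) OR (pk2 >= 4 AND pk2 < 6)
        System.out.println(coalesce(Arrays.asList(new IntRange(8, 9), new IntRange(4, 6))));
        // prints [[4, 6), [8, 9)]
    }
}
```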
[jira] [Updated] (PHOENIX-2566) Support NOT NULL constraint for any column for immutable table
[ https://issues.apache.org/jira/browse/PHOENIX-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-2566:
----------------------------------
    Attachment: PHOENIX-2566_v1.patch

> Support NOT NULL constraint for any column for immutable table
> --------------------------------------------------------------
>
>                 Key: PHOENIX-2566
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2566
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-2566_v1.patch
>
>
> Since write-once/append-only tables do not partially update rows, we can
> support NOT NULL constraints for non PK columns.
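The reasoning in the issue description can be illustrated with a small model. This is an interpretation, not Phoenix code: `NotNullCheck` and `validateFullRow` are hypothetical names. The assumption is that when every write carries the complete row, as with write-once/append-only tables, a NOT NULL constraint on any column can be checked against the row being written, without knowing what is already stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NotNullCheck {
    /** Returns the NOT NULL columns that are missing or null in a full-row write. */
    static List<String> validateFullRow(Map<String, Object> row, List<String> notNullColumns) {
        List<String> violations = new ArrayList<>();
        for (String column : notNullColumns) {
            if (row.get(column) == null) {
                violations.add(column);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("PK1", 1);
        row.put("DATA", null);
        // DATA is a non-PK column declared NOT NULL in this example.
        System.out.println(validateFullRow(row, Arrays.asList("PK1", "DATA"))); // prints [DATA]
    }
}
```

For a table that allows partial updates, the same check would be unreliable, because a write that omits a column says nothing about whether the stored row already has a value for it.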
[jira] [Commented] (PHOENIX-2566) Support NOT NULL constraint for any column for immutable table
[ https://issues.apache.org/jira/browse/PHOENIX-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363578#comment-16363578 ]

James Taylor commented on PHOENIX-2566:
---------------------------------------

Please review, [~tdsilva].
[jira] [Assigned] (PHOENIX-2566) Support NOT NULL constraint for any column for immutable table
[ https://issues.apache.org/jira/browse/PHOENIX-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor reassigned PHOENIX-2566:
-------------------------------------
    Assignee: James Taylor  (was: Vincent Poon)
[jira] [Updated] (PHOENIX-2566) Support NOT NULL constraint for any column for immutable table
[ https://issues.apache.org/jira/browse/PHOENIX-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-2566:
----------------------------------
    Fix Version/s: 4.14.0
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363572#comment-16363572 ]

chenglei commented on PHOENIX-4602:
-----------------------------------

Applied the patch to 4.x-HBase-1.3 and ran all the unit tests and IT tests on my local machine; all tests passed. I also added more tests in patch v2.
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenglei updated PHOENIX-4602:
------------------------------
    Attachment:     (was: PHOENIX-4602_v1.patch)
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenglei updated PHOENIX-4602:
------------------------------
    Attachment: PHOENIX-4602_v2.patch
[jira] [Commented] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363355#comment-16363355 ]

Hudson commented on PHOENIX-4603:
---------------------------------

SUCCESS: Integrated in Jenkins build Phoenix-4.x-HBase-1.3 #39 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-1.3/39/])
PHOENIX-4603 Remove check for table existence in (jtaylor: rev 106daa347e89e762c30089023ae8389b95b01fd3)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/DynamicColumnIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/MappingTableDataTypeIT.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/NamespaceSchemaMappingIT.java

> Remove check for table existence in MetaDataClient.createTableInternal()
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-4603
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4603
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>             Fix For: 4.14.0, 5.1.0
>
>         Attachments: PHOENIX-4603_v1.patch, PHOENIX-4603_v2.patch
>
>
> Found some strange code in that should be removed. If a table is being
> created but the HBase metadata already exists, we can't assume one way or the
> other that it's encoded or not encoded. It's on the user to supply the
> correct existing encoding in that case.
> {code}
> byte[] tableNameBytes = SchemaUtil.getTableNameAsBytes(schemaName, tableName);
> boolean tableExists = true;
> try {
>     HTableDescriptor tableDescriptor = connection.getQueryServices().getTableDescriptor(tableNameBytes);
>     if (tableDescriptor == null) { // for connectionless
>         tableExists = false;
>     }
> } catch (org.apache.phoenix.schema.TableNotFoundException e) {
>     tableExists = false;
> }
> if (tableExists) {
>     encodingScheme = NON_ENCODED_QUALIFIERS;
>     immutableStorageScheme = ONE_CELL_PER_COLUMN;
> } else ...
> {code}
Re: phoenix newbie build question
Hi Josh,

Thanks for your reply. I got java1.8 and maven 3.3.9 as below.

Apache Maven 3.3.9
Maven home: /usr/share/maven
Java version: 1.8.0_151, vendor: Oracle Corporation

Ok. Sounds good. Thank you.

Xu

On Tue, Feb 13, 2018 at 4:55 PM, Josh Elser wrote:

> Hi Xu,
>
> What version of Java and Maven are you using?
>
> I wouldn't be super worried about the test failures -- it's likely just an
> indication that the unit test is reliant on something in the local
> environment which isn't there on your computer (e.g. a default krb5.conf).
> Ideally, we can figure out why it failed and fix it for the future, but
> would need to get to the bottom of it.
Re: phoenix newbie build question
Hi Xu,

What version of Java and Maven are you using?

I wouldn't be super worried about the test failures -- it's likely just an indication that the unit test is reliant on something in the local environment which isn't there on your computer (e.g. a default krb5.conf). Ideally, we can figure out why it failed and fix it for the future, but would need to get to the bottom of it.

On 2/13/18 6:51 PM, Xu Cang wrote:
> Hi,
>
> I am trying to build Phoenix (on Ubuntu) and run tests by following
> 'build.txt' instruction from code repo.
>
> Commands I ran:
>
> 1. mvn install -DskipTests
> 2. mvn process-sources
> 3. mvn package
>
> Then I got this error:
>
> [ERROR] testMultipleConnectionsAsSameUserWithoutLogin(org.apache.phoenix.jdbc.SecureUserConnectionsTest)  Time elapsed: 0.013 s  <<< ERROR!
> java.lang.RuntimeException: Couldn't get the current user!!
>     at org.apache.phoenix.jdbc.SecureUserConnectionsTest.testMultipleConnectionsAsSameUserWithoutLogin(SecureUserConnectionsTest.java:378)
>
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Errors:
> [ERROR]   SecureUserConnectionsTest.testMultipleConnectionsAsSameUserWithoutLogin:378 Runtime
> [INFO]
> [ERROR] Tests run: 1592, Failures: 0, Errors: 1, Skipped: 3
> [INFO]
> [INFO]
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Apache Phoenix . SUCCESS [  0.924 s]
> [INFO] Phoenix Core ... FAILURE [ 35.155 s]
>
> The error comes from this code piece:
>
>     try {
>         this.user = User.getCurrent();
>     } catch (IOException e) {
>         throw new RuntimeException("Couldn't get the current user!!");
>     }
>
> My question is, am I missing any dependencies in order to get this user?
> Any pointer or help is appreciated. Thanks,
>
> (BTW, IndexUtilTest.java unit test ran successfully.)
>
> Best Regards,
> Xu
[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
[ https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363286#comment-16363286 ]

Akshita Malhotra commented on PHOENIX-4344:
-------------------------------------------

[~jamestaylor] Can you explain why it would do a point scan? Maybe I am thinking in the wrong direction, but as [~gjacoby] explained, even if the initial delete is filtering on a non-PK column, when the point Phoenix delete query is issued I can provide the PK information (obtained from the MapReduce scan) along with the extra predicate that includes the non-PK column.
phoenix newbie build question
Hi,

I am trying to build Phoenix (on Ubuntu) and run tests by following 'build.txt' instruction from code repo.

Commands I ran:

1. mvn install -DskipTests
2. mvn process-sources
3. mvn package

Then I got this error:

[ERROR] testMultipleConnectionsAsSameUserWithoutLogin(org.apache.phoenix.jdbc.SecureUserConnectionsTest)  Time elapsed: 0.013 s  <<< ERROR!
java.lang.RuntimeException: Couldn't get the current user!!
    at org.apache.phoenix.jdbc.SecureUserConnectionsTest.testMultipleConnectionsAsSameUserWithoutLogin(SecureUserConnectionsTest.java:378)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   SecureUserConnectionsTest.testMultipleConnectionsAsSameUserWithoutLogin:378 Runtime
[INFO]
[ERROR] Tests run: 1592, Failures: 0, Errors: 1, Skipped: 3
[INFO]
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Phoenix . SUCCESS [  0.924 s]
[INFO] Phoenix Core ... FAILURE [ 35.155 s]

The error comes from this code piece:

    try {
        this.user = User.getCurrent();
    } catch (IOException e) {
        throw new RuntimeException("Couldn't get the current user!!");
    }

My question is, am I missing any dependencies in order to get this user? Any pointer or help is appreciated. Thanks,

(BTW, IndexUtilTest.java unit test ran successfully.)

Best Regards,
Xu
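One reason this failure is hard to diagnose is that the catch block quoted in the message discards the original IOException, so the stack trace shows only the wrapper. As a minimal sketch (not the Phoenix or HBase code: `lookupCurrentUser` is a hypothetical stand-in for `User.getCurrent()`, and the failure message is invented), passing the caught exception as the cause keeps the underlying reason visible:

```java
import java.io.IOException;

public class CauseChainingSketch {
    // Hypothetical stand-in for User.getCurrent(); it always fails in this sketch.
    static String lookupCurrentUser() throws IOException {
        throw new IOException("simulated login failure");
    }

    static String currentUserOrThrow() {
        try {
            return lookupCurrentUser();
        } catch (IOException e) {
            // Passing e as the cause preserves the root failure in the stack trace.
            throw new RuntimeException("Couldn't get the current user!!", e);
        }
    }

    public static void main(String[] args) {
        try {
            currentUserOrThrow();
        } catch (RuntimeException e) {
            System.out.println(e.getMessage() + " caused by: " + e.getCause().getMessage());
        }
    }
}
```

With the cause attached, the test report would include a "Caused by:" section pointing at whatever in the local environment made the user lookup fail.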
[jira] [Resolved] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor resolved PHOENIX-4603.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 5.1.0
[jira] [Commented] (PHOENIX-4592) BaseResultIterators.getStatsForParallelizationProp() should use retry looking up the table without tenantId if cannot find the table using the tenantId
[ https://issues.apache.org/jira/browse/PHOENIX-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363175#comment-16363175 ]

James Taylor commented on PHOENIX-4592:
---------------------------------------

+1

> BaseResultIterators.getStatsForParallelizationProp() should use retry looking up the table without tenantId if cannot find the table using the tenantId
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4592
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4592
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Thomas D'Silva
>            Assignee: Thomas D'Silva
>            Priority: Major
>         Attachments: PHOENIX-4592-4.x-HBase-0.98.patch
>
>
> Running a query using a tenant specific connection logs the following warning:
> {code}
> 2018-02-09 17:41:45,497 WARN  [main] iterate.BaseResultIterators - Unable to find parent table "X" of table "X" to determine USE_STATS_FOR_PARALLELIZATION
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table undefined. tableName=X
>     at org.apache.phoenix.schema.PMetaDataImpl.getTableRef(PMetaDataImpl.java:71)
>     at org.apache.phoenix.jdbc.PhoenixConnection.getTable(PhoenixConnection.java:567)
>     at org.apache.phoenix.iterate.BaseResultIterators.getStatsForParallelizationProp(BaseResultIterators.java:1282)
>     at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:500)
>     at org.apache.phoenix.iterate.SerialIterators.<init>(SerialIterators.java:67)
>     at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:240)
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:345)
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:212)
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:207)
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:202)
>     at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:309)
>     at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:289)
>     at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>     at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:288)
>     at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:282)
>     at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1692)
>     at sqlline.Commands.execute(Commands.java:822)
>     at sqlline.Commands.sql(Commands.java:732)
>     at sqlline.SqlLine.dispatch(SqlLine.java:807)
>     at sqlline.SqlLine.begin(SqlLine.java:681)
>     at sqlline.SqlLine.start(SqlLine.java:398)
>     at sqlline.SqlLine.main(SqlLine.java:292)
> {code}
> The following code needs to be modified
> {code}
> if (table.getType() == PTableType.INDEX && table.getParentName() != null) {
>     PhoenixConnection conn = context.getConnection();
>     String parentTableName = table.getParentName().getString();
>     try {
>         PTable parentTable =
>                 conn.getTable(new PTableKey(conn.getTenantId(), parentTableName));
>         useStats = parentTable.useStatsForParallelization();
>         if (useStats != null) {
>             return useStats;
>         }
>     } catch (TableNotFoundException e) {
>         logger.warn("Unable to find parent table \"" + parentTableName + "\" of table \""
>                 + table.getName().getString()
>                 + "\" to determine USE_STATS_FOR_PARALLELIZATION",
>             e);
>     }
> }
> {code}
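The retry described in the issue title can be modeled as follows. This is a sketch under stated assumptions, not the attached patch: `TenantFallbackSketch`, `TableKey`, and `findWithFallback` are hypothetical names, a plain map stands in for the client's metadata cache, and a table without a tenant id is modeled with a null tenant. The lookup first uses the connection's tenant id and, if nothing is found, retries without one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

public class TenantFallbackSketch {
    /** Hypothetical cache key: (tenantId, tableName); a null tenantId means no tenant. */
    static final class TableKey {
        final String tenantId;
        final String name;

        TableKey(String tenantId, String name) {
            this.tenantId = tenantId;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TableKey)) {
                return false;
            }
            TableKey other = (TableKey) o;
            return Objects.equals(tenantId, other.tenantId) && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(tenantId, name);
        }
    }

    /** Looks up with the tenant id first, then retries without it. */
    static <V> Optional<V> findWithFallback(Map<TableKey, V> cache, String tenantId, String name) {
        V found = cache.get(new TableKey(tenantId, name));
        if (found == null && tenantId != null) {
            found = cache.get(new TableKey(null, name));
        }
        return Optional.ofNullable(found);
    }

    public static void main(String[] args) {
        Map<TableKey, String> cache = new HashMap<>();
        cache.put(new TableKey(null, "X"), "table X without tenant");
        // A tenant-specific connection still resolves the parent table.
        System.out.println(findWithFallback(cache, "tenant1", "X").orElse("not found"));
    }
}
```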
[jira] [Commented] (PHOENIX-4592) BaseResultIterators.getStatsForParallelizationProp() should use retry looking up the table without tenantId if cannot find the table using the tenantId
[ https://issues.apache.org/jira/browse/PHOENIX-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363163#comment-16363163 ]

Thomas D'Silva commented on PHOENIX-4592:
-----------------------------------------

[~jamestaylor] Can you please review? I also changed the isMutableOnView property of USE_STATS_FOR_PARALLELIZATION to false.
[jira] [Updated] (PHOENIX-4592) BaseResultIterators.getStatsForParallelizationProp() should use retry looking up the table without tenantId if cannot find the table using the tenantId
[ https://issues.apache.org/jira/browse/PHOENIX-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-4592: Attachment: PHOENIX-4592-4.x-HBase-0.98.patch > BaseResultIterators.getStatsForParallelizationProp() should use retry looking > up the table without tenantId if cannot find the table using the tenantId > --- > > Key: PHOENIX-4592 > URL: https://issues.apache.org/jira/browse/PHOENIX-4592 > Project: Phoenix > Issue Type: Bug >Reporter: Thomas D'Silva >Assignee: Thomas D'Silva >Priority: Major > Attachments: PHOENIX-4592-4.x-HBase-0.98.patch > > > Running a query using a tenant specific connection logs the following warning > : > {code} > 2018-02-09 17:41:45,497 WARN [main] iterate.BaseResultIterators - Unable to > find parent table "X" of table "X" to determine USE_STATS_FOR_PARALLELIZATION > org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table > undefined. tableName=X > at > org.apache.phoenix.schema.PMetaDataImpl.getTableRef(PMetaDataImpl.java:71) > at > org.apache.phoenix.jdbc.PhoenixConnection.getTable(PhoenixConnection.java:567) > at > org.apache.phoenix.iterate.BaseResultIterators.getStatsForParallelizationProp(BaseResultIterators.java:1282) > at > org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:500) > at > org.apache.phoenix.iterate.SerialIterators.(SerialIterators.java:67) > at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:240) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:345) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:212) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:207) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:202) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:309) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:289) > at 
org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:288) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:282) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1692) > at sqlline.Commands.execute(Commands.java:822) > at sqlline.Commands.sql(Commands.java:732) > at sqlline.SqlLine.dispatch(SqlLine.java:807) > at sqlline.SqlLine.begin(SqlLine.java:681) > at sqlline.SqlLine.start(SqlLine.java:398) > at sqlline.SqlLine.main(SqlLine.java:292) > {code} > The following code needs to be modified > {code} > if (table.getType() == PTableType.INDEX && table.getParentName() != null) { > PhoenixConnection conn = context.getConnection(); > String parentTableName = table.getParentName().getString(); > try { > PTable parentTable = > conn.getTable(new PTableKey(conn.getTenantId(), > parentTableName)); > useStats = parentTable.useStatsForParallelization(); > if (useStats != null) { > return useStats; > } > } catch (TableNotFoundException e) { > logger.warn("Unable to find parent table \"" + > parentTableName + "\" of table \"" > + table.getName().getString() > + "\" to determine USE_STATS_FOR_PARALLELIZATION", > e); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
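The retry this issue asks for can be sketched roughly as follows. This is only an illustration, not the actual patch: `TenantFallbackLookup`, `CACHE`, `key` and `useStatsForParent` are hypothetical stand-ins for the client metadata cache and `conn.getTable(new PTableKey(...))`, and the sketch assumes that a global table is stored under a null tenant id.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TenantFallbackLookup {
    // Hypothetical stand-in for the client metadata cache, keyed by (tenantId, tableName).
    // A null tenantId denotes the global entry; the value is the table's
    // USE_STATS_FOR_PARALLELIZATION setting, which may itself be null (unset).
    static final Map<List<String>, Boolean> CACHE = new HashMap<>();

    static List<String> key(String tenantId, String tableName) {
        return Arrays.asList(tenantId, tableName); // Arrays.asList permits null elements
    }

    static boolean tableExists(String tenantId, String tableName) {
        return CACHE.containsKey(key(tenantId, tableName));
    }

    // Try the tenant-specific key first, then retry without a tenant id.
    // Returns null when the parent is missing under both keys or the property is unset.
    static Boolean useStatsForParent(String tenantId, String parentName) {
        if (tableExists(tenantId, parentName)) {
            return CACHE.get(key(tenantId, parentName));
        }
        if (tenantId != null && tableExists(null, parentName)) {
            return CACHE.get(key(null, parentName));
        }
        return null;
    }

    public static void main(String[] args) {
        CACHE.put(key(null, "X"), Boolean.TRUE); // parent table exists only as a global table
        System.out.println(useStatsForParent("tenant1", "X"));
        System.out.println(useStatsForParent("tenant1", "Y"));
    }
}
```

With a fallback of this shape, a tenant-specific connection would only reach the warning path when the parent is missing under both keys.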
[jira] [Commented] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363147#comment-16363147 ]

James Taylor commented on PHOENIX-4603:
---
Thanks for the review, [~tdsilva]. I uploaded the final version of the patch, which fixes a couple of tests that needed to explicitly disable column encoding.

> Remove check for table existence in MetaDataClient.createTableInternal()
>
> Key: PHOENIX-4603
> URL: https://issues.apache.org/jira/browse/PHOENIX-4603
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: James Taylor
> Priority: Major
> Fix For: 4.14.0
> Attachments: PHOENIX-4603_v1.patch, PHOENIX-4603_v2.patch
>
> Found some strange code that should be removed. If a table is being
> created but the HBase metadata already exists, we can't assume one way or the
> other that it's encoded or not encoded. It's on the user to supply the
> correct existing encoding in that case.
> {code}
> byte[] tableNameBytes = SchemaUtil.getTableNameAsBytes(schemaName, tableName);
> boolean tableExists = true;
> try {
>     HTableDescriptor tableDescriptor = connection.getQueryServices().getTableDescriptor(tableNameBytes);
>     if (tableDescriptor == null) { // for connectionless
>         tableExists = false;
>     }
> } catch (org.apache.phoenix.schema.TableNotFoundException e) {
>     tableExists = false;
> }
> if (tableExists) {
>     encodingScheme = NON_ENCODED_QUALIFIERS;
>     immutableStorageScheme = ONE_CELL_PER_COLUMN;
> } else ...
> {code}
[jira] [Updated] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4603: -- Attachment: PHOENIX-4603_v2.patch
[jira] [Commented] (PHOENIX-4605) Add TRANSACTION_PROVIDER and DEFAULT_TRANSACTION_PROVIDER instead of using boolean
[ https://issues.apache.org/jira/browse/PHOENIX-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363046#comment-16363046 ]

James Taylor commented on PHOENIX-4605:
---
FYI, [~ohads]. Not sure what we should do about QueryServices.TRANSACTIONS_ENABLED (currently a boolean as well). Maybe it should contain a list of supported/configured transaction providers? We use that mostly in tests, but we also use it when we open a cluster connection to conditionally establish a connection to the transaction manager. Is there any initialization required for Omid along these lines? If so, should we add a new TAL method?
{code}
private void openConnection() throws SQLException {
    try {
        boolean transactionsEnabled = props.getBoolean(
                QueryServices.TRANSACTIONS_ENABLED,
                QueryServicesOptions.DEFAULT_TRANSACTIONS_ENABLED);
        this.connection = HBaseFactoryProvider.getHConnectionFactory().createConnection(this.config);
        GLOBAL_HCONNECTIONS_COUNTER.increment();
        logger.info("HConnection established. Stacktrace for informational purposes: " + connection + " " + LogUtil.getCallerStackTrace());
        // only initialize the tx service client if needed and if we succeeded in getting a connection
        // to HBase
        if (transactionsEnabled) {
            initTxServiceClient();
        }
    } catch (IOException e) {
        throw new SQLExceptionInfo.Builder(SQLExceptionCode.CANNOT_ESTABLISH_CONNECTION)
            .setRootCause(e).build().buildException();
    }
    if (this.connection.isClosed()) { // TODO: why the heck doesn't this throw above?
        throw new SQLExceptionInfo.Builder(SQLExceptionCode.CANNOT_ESTABLISH_CONNECTION).build().buildException();
    }
}
{code}
One more check needed would be in MutationState to disallow updates to both Tephra and Omid tables in the same transaction.
> Add TRANSACTION_PROVIDER and DEFAULT_TRANSACTION_PROVIDER instead of using > boolean > -- > > Key: PHOENIX-4605 > URL: https://issues.apache.org/jira/browse/PHOENIX-4605 > Project: Phoenix > Issue Type: Bug >Reporter: James Taylor >Priority: Major > > We should deprecate QueryServices.DEFAULT_TABLE_ISTRANSACTIONAL_ATTRIB and > instead have a QueryServices.DEFAULT_TRANSACTION_PROVIDER now that we'll have > two transaction providers: Tephra and Omid. Along the same lines, we should > add a TRANSACTION_PROVIDER column to SYSTEM.CATALOG and stop using the > IS_TRANSACTIONAL table property. For backwards compatibility, we can assume > the provider is Tephra if the existing properties are set to true. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
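The MutationState check mentioned in the comment above might look something like the sketch below. It is a standalone illustration under stated assumptions: `SingleProviderCheck`, its `Provider` enum and `requireSingleProvider` are hypothetical names, and a null entry is assumed to mean a non-transactional table. None of this is existing Phoenix API.

```java
import java.util.Arrays;
import java.util.List;

class SingleProviderCheck {
    // Hypothetical enum; the issue names Tephra and Omid as the two providers.
    enum Provider { TEPHRA, OMID }

    // Returns the one provider shared by all transactional tables in the batch,
    // or null if no table is transactional. A null list entry stands for a
    // non-transactional table. Throws if tables of two providers are mixed.
    static Provider requireSingleProvider(List<Provider> tableProviders) {
        Provider seen = null;
        for (Provider p : tableProviders) {
            if (p == null) {
                continue; // non-transactional table, nothing to check
            }
            if (seen == null) {
                seen = p;
            } else if (seen != p) {
                throw new IllegalStateException(
                    "Cannot update " + seen + " and " + p + " tables in the same transaction");
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(requireSingleProvider(Arrays.asList(Provider.OMID, null, Provider.OMID)));
        try {
            requireSingleProvider(Arrays.asList(Provider.TEPHRA, Provider.OMID));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Rejecting the mix at the point where mutations are collected would keep a single commit from having to coordinate two transaction managers.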
[jira] [Commented] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363045#comment-16363045 ] Thomas D'Silva commented on PHOENIX-4603: - +1
[jira] [Commented] (PHOENIX-4423) Phoenix-hive compilation broken on >=Hive 2.3
[ https://issues.apache.org/jira/browse/PHOENIX-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363038#comment-16363038 ] Sergey Soldatov commented on PHOENIX-4423: -- Ah, hive-it is not published as an official artifact. https://repository.apache.org/content/repositories/releases/org/apache/hive/ I believe that was the main reason why we used our own clone of test util class. > Phoenix-hive compilation broken on >=Hive 2.3 > - > > Key: PHOENIX-4423 > URL: https://issues.apache.org/jira/browse/PHOENIX-4423 > Project: Phoenix > Issue Type: Bug >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Critical > Fix For: 5.0.0 > > Attachments: PHOENIX-4423.002.patch, PHOENIX-4423_wip1.patch > > > HIVE-15167 removed an interface which we're using in Phoenix which obviously > fails compilation. Will need to figure out how to work with Hive 1.x, <2.3.0, > and >=2.3.0. > FYI [~sergey.soldatov] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PHOENIX-4605) Add TRANSACTION_PROVIDER and DEFAULT_TRANSACTION_PROVIDER instead of using boolean
James Taylor created PHOENIX-4605: - Summary: Add TRANSACTION_PROVIDER and DEFAULT_TRANSACTION_PROVIDER instead of using boolean Key: PHOENIX-4605 URL: https://issues.apache.org/jira/browse/PHOENIX-4605 Project: Phoenix Issue Type: Bug Reporter: James Taylor We should deprecate QueryServices.DEFAULT_TABLE_ISTRANSACTIONAL_ATTRIB and instead have a QueryServices.DEFAULT_TRANSACTION_PROVIDER now that we'll have two transaction providers: Tephra and Omid. Along the same lines, we should add a TRANSACTION_PROVIDER column to SYSTEM.CATALOG and stop using the IS_TRANSACTIONAL table property. For backwards compatibility, we can assume the provider is Tephra if the existing properties are set to true. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
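The backward-compatibility rule described in this issue could be sketched as below. The class `TransactionProviderResolver`, its `Provider` enum and the `resolve` method are hypothetical illustrations. Only the rule itself (assume Tephra when the old boolean is true and no provider is given) comes from the issue text.

```java
import java.util.Locale;

class TransactionProviderResolver {
    // Hypothetical enum for the two providers named in the issue.
    enum Provider { TEPHRA, OMID }

    // providerName: value of the proposed TRANSACTION_PROVIDER setting, or null if absent.
    // isTransactional: value of the existing boolean property, or null if absent.
    // Returns null when the table is not transactional.
    static Provider resolve(String providerName, Boolean isTransactional) {
        if (providerName != null) {
            return Provider.valueOf(providerName.toUpperCase(Locale.ROOT));
        }
        if (Boolean.TRUE.equals(isTransactional)) {
            return Provider.TEPHRA; // legacy metadata: only Tephra existed at the time
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("omid", null));
        System.out.println(resolve(null, Boolean.TRUE));
        System.out.println(resolve(null, Boolean.FALSE));
    }
}
```

In this sketch an explicit provider takes precedence over the legacy boolean, so metadata written after the change never depends on the old flag.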
[jira] [Commented] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363015#comment-16363015 ] James Taylor commented on PHOENIX-4603: --- Please review, [~tdsilva] and/or [~samarthjain]. The client-side cache is populated based on the client-side state, so if a table that is encoded already exists, Phoenix would think it's not encoded. I've filed PHOENIX-4604 to do the verification on the server side if the table already exists.
[jira] [Created] (PHOENIX-4604) If table already exists ensure that table metadata matches for non changeable properties
James Taylor created PHOENIX-4604: - Summary: If table already exists ensure that table metadata matches for non changeable properties Key: PHOENIX-4604 URL: https://issues.apache.org/jira/browse/PHOENIX-4604 Project: Phoenix Issue Type: Bug Reporter: James Taylor We should check that the non-changeable properties of a Phoenix table match the metadata passed from the client when the table already exists. Otherwise, we can run into issues with existing data: for example, if it was encoded before and is subsequently declared as not encoded. The same issue applies to SALT_BUCKETS changing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
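A check of this kind might be sketched as follows. The class `ImmutablePropertyCheck`, the `mismatches` method and the use of plain string maps are hypothetical. The list of property names is only an example built from the two cases the issue mentions (encoding and SALT_BUCKETS), not a definitive list of non-changeable properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class ImmutablePropertyCheck {
    // Example list only; the real set of non-changeable properties is an assumption here.
    static final List<String> NON_CHANGEABLE = Arrays.asList("SALT_BUCKETS", "COLUMN_ENCODED_BYTES");

    // Returns the names of non-changeable properties whose requested value differs
    // from the value stored for the existing table. An empty list means they match.
    static List<String> mismatches(Map<String, String> existing, Map<String, String> requested) {
        List<String> result = new ArrayList<>();
        for (String name : NON_CHANGEABLE) {
            if (!Objects.equals(existing.get(name), requested.get(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new HashMap<>();
        existing.put("SALT_BUCKETS", "8");
        existing.put("COLUMN_ENCODED_BYTES", "2");

        Map<String, String> requested = new HashMap<>();
        requested.put("SALT_BUCKETS", "8");
        requested.put("COLUMN_ENCODED_BYTES", "0"); // client now declares the table as not encoded

        System.out.println(mismatches(existing, requested));
    }
}
```

A non-empty result would be the signal for the server to reject the CREATE TABLE rather than silently accept metadata that contradicts the existing data.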
[jira] [Updated] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4603: -- Fix Version/s: 4.14.0
[jira] [Updated] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4603: -- Attachment: PHOENIX-4603_v1.patch
[jira] [Assigned] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
[ https://issues.apache.org/jira/browse/PHOENIX-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor reassigned PHOENIX-4603: - Assignee: James Taylor
[jira] [Created] (PHOENIX-4603) Remove check for table existence in MetaDataClient.createTableInternal()
James Taylor created PHOENIX-4603:
-
Summary: Remove check for table existence in MetaDataClient.createTableInternal()
Key: PHOENIX-4603
URL: https://issues.apache.org/jira/browse/PHOENIX-4603
Project: Phoenix
Issue Type: Bug
Reporter: James Taylor

Found some strange code that should be removed. If a table is being created but the HBase metadata already exists, we can't assume one way or the other that it's encoded or not encoded. It's on the user to supply the correct existing encoding in that case.
{code}
byte[] tableNameBytes = SchemaUtil.getTableNameAsBytes(schemaName, tableName);
boolean tableExists = true;
try {
    HTableDescriptor tableDescriptor = connection.getQueryServices().getTableDescriptor(tableNameBytes);
    if (tableDescriptor == null) { // for connectionless
        tableExists = false;
    }
} catch (org.apache.phoenix.schema.TableNotFoundException e) {
    tableExists = false;
}
if (tableExists) {
    encodingScheme = NON_ENCODED_QUALIFIERS;
    immutableStorageScheme = ONE_CELL_PER_COLUMN;
} else ...
{code}
[jira] [Commented] (PHOENIX-4533) Phoenix Query Server should not use SPNEGO principal to proxy user requests
[ https://issues.apache.org/jira/browse/PHOENIX-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362972#comment-16362972 ] Josh Elser commented on PHOENIX-4533: - bq. I am not sure what should change for building, Nothing to change on that page -- it has the information on where to check out the website's source and how to build it :) > Phoenix Query Server should not use SPNEGO principal to proxy user requests > --- > > Key: PHOENIX-4533 > URL: https://issues.apache.org/jira/browse/PHOENIX-4533 > Project: Phoenix > Issue Type: Improvement >Reporter: Lev Bronshtein >Assignee: Lev Bronshtein >Priority: Minor > Fix For: 5.0.0, 4.14.0 > > Attachments: PHOENIX-4533.1.patch, PHOENIX-4533.2.patch, > PHOENIX-4533.3.patch, PHOENIX-4533.squash.patch > > > Currently the HTTP/ principal is used by various components in the HADOOP > ecosystem to perform SPNEGO authentication. Since there can only be one > HTTP/ per host, even outside of the Hadoop ecosystem, the keytab containing > key material for local HTTP/ principal is shared among a few applications. > With so many applications having access to the HTTP/ credentials, this > increases the chances of an attack on the proxy user capabilities of Hadoop. > This JIRA proposes that two different key tabs can be used to > 1. Authenticate kerberized web requests > 2. Communicate with the phoenix back end -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362926#comment-16362926 ]

James Taylor commented on PHOENIX-4602:
---
Patch looks good, [~comnetwork]. +1 assuming successful {{mvn verify}} run on 4.x-HBase-1.3 branch.

> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Assignee: chenglei
> Priority: Major
> Fix For: 4.14.0
> Attachments: PHOENIX-4602_v1.patch
>
> Given the following table:
> {code}
> CREATE TABLE test_table (
>     PK1 INTEGER NOT NULL,
>     PK2 INTEGER NOT NULL,
>     PK3 INTEGER NOT NULL,
>     DATA INTEGER,
>     CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
> {code}
> and the query:
> {code}
> select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
> {code}
> This is a typical case for SkipScanFilter. However, the query does not
> actually use a skip scan; it uses a range scan and pushes only the
> leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan. The
> explain plan is:
> {code:sql}
> CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
>     SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
> {code}
> I think the problem is caused by the
> {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, at line 763
> below: because pk2 is not the leading PK column, the method returns null,
> so the expression
> {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed
> to the scan:
> {code:java}
> 757 boolean hasFirstSlot = true;
> 758 boolean prevIsNull = false;
> 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
> 760 for (KeySlot slot : childSlot) {
> 761     if (hasFirstSlot) {
> 762         // if the first slot is null, return null immediately
> 763         if (slot == null) {
> 764             return null;
> 765         }
> 766         // mark that we've handled the first slot
> 767         hasFirstSlot = false;
> 768     }
> {code}
> For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it
> does not seem necessary to require that the PK column in the OrExpression be
> the leading PK column; it should be enough to guarantee that the OrExpression
> references only one PK column.
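The condition proposed in the description (an OR can be pushed to the scan if all of its branches constrain the same single PK column, leading or not) can be illustrated with a simplified model. This is not the WhereOptimizer code or the attached patch: `OrRangeSketch`, `Range` and `orRanges` are hypothetical, and each OR branch is reduced to a half-open integer range on one PK position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class OrRangeSketch {
    // Half-open range [lower, upper) on the PK column at zero-based position pkPos.
    // A much simplified, hypothetical stand-in for a key slot.
    static final class Range {
        final int pkPos;
        final int lower;
        final int upper;

        Range(int pkPos, int lower, int upper) {
            this.pkPos = pkPos;
            this.lower = lower;
            this.upper = upper;
        }

        @Override
        public String toString() {
            return "PK" + (pkPos + 1) + "[" + lower + "," + upper + ")";
        }
    }

    // Returns the merged ranges if every OR branch constrains the same PK column
    // (which need not be the leading one); returns null if the branches touch
    // different columns, meaning the OR stays a server-side filter in this model.
    static List<Range> orRanges(List<Range> branches) {
        if (branches.isEmpty()) {
            return null;
        }
        int pos = branches.get(0).pkPos;
        for (Range r : branches) {
            if (r.pkPos != pos) {
                return null;
            }
        }
        List<Range> sorted = new ArrayList<>(branches);
        sorted.sort(Comparator.comparingInt((Range r) -> r.lower));
        List<Range> merged = new ArrayList<>();
        Range current = sorted.get(0);
        for (int i = 1; i < sorted.size(); i++) {
            Range next = sorted.get(i);
            if (next.lower <= current.upper) { // overlapping or adjacent: coalesce
                current = new Range(pos, current.lower, Math.max(current.upper, next.upper));
            } else {
                merged.add(current);
                current = next;
            }
        }
        merged.add(current);
        return merged;
    }

    public static void main(String[] args) {
        // (pk2 >= 4 and pk2 < 6) or (pk2 >= 8 and pk2 < 9): both branches are on PK2
        System.out.println(orRanges(Arrays.asList(new Range(1, 4, 6), new Range(1, 8, 9))));
        // Branches on PK2 and PK3: not a single-column OR
        System.out.println(orRanges(Arrays.asList(new Range(1, 4, 6), new Range(2, 8, 9))));
    }
}
```

In the example query, the two PK2 ranges produced this way, combined with the PK1 range, are what a skip scan would need in order to jump between key ranges instead of filtering every row on the server.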
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4602: -- Fix Version/s: 4.14.0
[jira] [Assigned] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor reassigned PHOENIX-4602: - Assignee: chenglei
[jira] [Commented] (PHOENIX-4533) Phoenix Query Server should not use SPNEGO principal to proxy user requests
[ https://issues.apache.org/jira/browse/PHOENIX-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362911#comment-16362911 ] Lev Bronshtein commented on PHOENIX-4533: - I can do the docs. I am not sure what should change for building, but something definitely should for the server. Where is the source for the doc website?
Re: [VOTE] Apache Phoenix 5.0.0-alpha rc1
FYI, even though 72 hours have already elapsed, I plan to leave this open until Wednesday in the hope that some other folks will take a look before then (as a part of the work-week).

Thanks in advance!

On 2/12/18 10:34 AM, Josh Elser wrote:

s/RC0/RC1/ below. I wasn't very diligent with my copy-paste-fix :) The git-commit SHA1 is correct.

Please take a look if you can today!

On 2/9/18 10:34 AM, Josh Elser wrote:

Hello Everyone,

This is a call for a vote on Apache Phoenix 5.0.0-alpha rc1. Please notice that there are known issues with this release which deserve the "alpha" designation. These are staged on the website[1]. (Atomic upsert does work on my local installation with trivial testing)

Over rc0, this release contains the changes: PHOENIX-4586, PHOENIX-4546, PHOENIX-4549, PHOENIX-4582.

The RC is available at the standard location:
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-5.0.0-alpha-HBase-2.0-rc1

RC0 is based on the following commit: 451d6a37d0d461b60edff36ceb42b17bb9610350

Signed with my key: 9E62822F4668F17B0972ADD9B7D5CD454677D66C, http://pgp.mit.edu/pks/lookup?op=get&search=0xB7D5CD454677D66C

Vote will be open for at least 72 hours (2018/02/12 1600GMT). Please vote:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Thanks,
The Apache Phoenix Team

[1] https://phoenix.apache.org/release_notes.html
[jira] [Commented] (PHOENIX-4423) Phoenix-hive compilation broken on >=Hive 2.3
[ https://issues.apache.org/jira/browse/PHOENIX-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362892#comment-16362892 ] Sergey Soldatov commented on PHOENIX-4423: -- Heh. There were some 'improvements' in HiveTestUtils compared to the default hive-it runner to get it working in our case for several MR/Tez jobs in the query (and joins are the place where we are using it). Let me check it. > Phoenix-hive compilation broken on >=Hive 2.3 > - > > Key: PHOENIX-4423 > URL: https://issues.apache.org/jira/browse/PHOENIX-4423 > Project: Phoenix > Issue Type: Bug >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Critical > Fix For: 5.0.0 > > Attachments: PHOENIX-4423.002.patch, PHOENIX-4423_wip1.patch > > > HIVE-15167 removed an interface which we're using in Phoenix which obviously > fails compilation. Will need to figure out how to work with Hive 1.x, <2.3.0, > and >=2.3.0. > FYI [~sergey.soldatov] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4533) Phoenix Query Server should not use SPNEGO principal to proxy user requests
[ https://issues.apache.org/jira/browse/PHOENIX-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362890#comment-16362890 ] Hudson commented on PHOENIX-4533: - FAILURE: Integrated in Jenkins build Phoenix-master #1936 (See [https://builds.apache.org/job/Phoenix-master/1936/]) PHOENIX-4533 Modified Query Server to use two sets of Kerberos (elserj: rev a71c4b7e3c11f1c7d1955b51929ad65b252feb62) * (edit) phoenix-queryserver/src/it/java/org/apache/phoenix/end2end/HttpParamImpersonationQueryServerIT.java * (edit) phoenix-core/src/main/java/org/apache/phoenix/query/QueryServices.java * (edit) phoenix-queryserver/src/main/java/org/apache/phoenix/queryserver/server/QueryServer.java * (edit) phoenix-queryserver/src/it/java/org/apache/phoenix/end2end/SecureQueryServerIT.java > Phoenix Query Server should not use SPNEGO principal to proxy user requests > --- > > Key: PHOENIX-4533 > URL: https://issues.apache.org/jira/browse/PHOENIX-4533 > Project: Phoenix > Issue Type: Improvement >Reporter: Lev Bronshtein >Assignee: Lev Bronshtein >Priority: Minor > Fix For: 5.0.0, 4.14.0 > > Attachments: PHOENIX-4533.1.patch, PHOENIX-4533.2.patch, > PHOENIX-4533.3.patch, PHOENIX-4533.squash.patch > > > Currently the HTTP/ principal is used by various components in the HADOOP > ecosystem to perform SPNEGO authentication. Since there can only be one > HTTP/ per host, even outside of the Hadoop ecosystem, the keytab containing > key material for local HTTP/ principal is shared among a few applications. > With so many applications having access to the HTTP/ credentials, this > increases the chances of an attack on the proxy user capabilities of Hadoop. > This JIRA proposes that two different key tabs can be used to > 1. Authenticate kerberized web requests > 2. Communicate with the phoenix back end -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (PHOENIX-4592) BaseResultIterators.getStatsForParallelizationProp() should use retry looking up the table without tenantId if cannot find the table using the tenantId
[ https://issues.apache.org/jira/browse/PHOENIX-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva reassigned PHOENIX-4592: --- Assignee: Thomas D'Silva > BaseResultIterators.getStatsForParallelizationProp() should use retry looking > up the table without tenantId if cannot find the table using the tenantId > --- > > Key: PHOENIX-4592 > URL: https://issues.apache.org/jira/browse/PHOENIX-4592 > Project: Phoenix > Issue Type: Bug >Reporter: Thomas D'Silva >Assignee: Thomas D'Silva >Priority: Major > > Running a query using a tenant specific connection logs the following warning > : > {code} > 2018-02-09 17:41:45,497 WARN [main] iterate.BaseResultIterators - Unable to > find parent table "X" of table "X" to determine USE_STATS_FOR_PARALLELIZATION > org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table > undefined. tableName=X > at > org.apache.phoenix.schema.PMetaDataImpl.getTableRef(PMetaDataImpl.java:71) > at > org.apache.phoenix.jdbc.PhoenixConnection.getTable(PhoenixConnection.java:567) > at > org.apache.phoenix.iterate.BaseResultIterators.getStatsForParallelizationProp(BaseResultIterators.java:1282) > at > org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:500) > at > org.apache.phoenix.iterate.SerialIterators.(SerialIterators.java:67) > at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:240) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:345) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:212) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:207) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:202) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:309) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:289) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > 
org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:288) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:282) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1692) > at sqlline.Commands.execute(Commands.java:822) > at sqlline.Commands.sql(Commands.java:732) > at sqlline.SqlLine.dispatch(SqlLine.java:807) > at sqlline.SqlLine.begin(SqlLine.java:681) > at sqlline.SqlLine.start(SqlLine.java:398) > at sqlline.SqlLine.main(SqlLine.java:292) > {code} > The following code needs to be modified > {code} > if (table.getType() == PTableType.INDEX && table.getParentName() != null) { > PhoenixConnection conn = context.getConnection(); > String parentTableName = table.getParentName().getString(); > try { > PTable parentTable = > conn.getTable(new PTableKey(conn.getTenantId(), > parentTableName)); > useStats = parentTable.useStatsForParallelization(); > if (useStats != null) { > return useStats; > } > } catch (TableNotFoundException e) { > logger.warn("Unable to find parent table \"" + > parentTableName + "\" of table \"" > + table.getName().getString() > + "\" to determine USE_STATS_FOR_PARALLELIZATION", > e); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
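Editor's sketch for PHOENIX-4592: the issue title asks for the parent-table lookup to be retried without the tenantId when the tenant-specific lookup fails. The lookup order can be illustrated with a small stand-alone model. Everything below is hypothetical (the class TenantFallbackLookup, its map-backed cache, and the method names are not Phoenix APIs); it only shows "try the tenant key first, then fall back to the global key", and is not the actual patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a metadata cache keyed by (tenantId, tableName).
// None of these names are Phoenix classes; they only model the lookup order.
final class TenantFallbackLookup {

    // A null tenantId in the key denotes the global (tenant-less) entry.
    private final Map<List<String>, Boolean> useStatsByKey = new HashMap<>();

    private static List<String> key(String tenantId, String tableName) {
        // Arrays.asList tolerates null elements and gives value-based equals/hashCode.
        return Arrays.asList(tenantId, tableName);
    }

    void register(String tenantId, String tableName, boolean useStats) {
        useStatsByKey.put(key(tenantId, tableName), useStats);
    }

    // Looks up the parent table with the tenantId first; if nothing is found
    // and the connection is tenant-specific, retries with a null tenantId.
    Optional<Boolean> findUseStats(String tenantId, String parentTableName) {
        Boolean value = useStatsByKey.get(key(tenantId, parentTableName));
        if (value == null && tenantId != null) {
            value = useStatsByKey.get(key(null, parentTableName));
        }
        return Optional.ofNullable(value);
    }
}
```

In the code quoted in the issue, the equivalent change would presumably be a second getTable attempt with a null tenant id before the warning is logged; that placement is an assumption, not something the issue text confirms.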
[jira] [Commented] (PHOENIX-4533) Phoenix Query Server should not use SPNEGO principal to proxy user requests
[ https://issues.apache.org/jira/browse/PHOENIX-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362628#comment-16362628 ] Josh Elser commented on PHOENIX-4533: - Pushed this to the 4.x and 5.x branches. Thanks again, [~lbronshtein]. One final thing: any interest in updating the website with content for the new configuration properties you've added? We'd want to add them to https://phoenix.apache.org/server.html. https://phoenix.apache.org/building_website.html has instructions on how to do this. If you can get a diff against the website, I'd happily apply that too. Else, I'll just throw up something today myself. > Phoenix Query Server should not use SPNEGO principal to proxy user requests > --- > > Key: PHOENIX-4533 > URL: https://issues.apache.org/jira/browse/PHOENIX-4533 > Project: Phoenix > Issue Type: Improvement >Reporter: Lev Bronshtein >Assignee: Lev Bronshtein >Priority: Minor > Fix For: 5.0.0, 4.14.0 > > Attachments: PHOENIX-4533.1.patch, PHOENIX-4533.2.patch, > PHOENIX-4533.3.patch, PHOENIX-4533.squash.patch > > > Currently the HTTP/ principal is used by various components in the HADOOP > ecosystem to perform SPNEGO authentication. Since there can only be one > HTTP/ per host, even outside of the Hadoop ecosystem, the keytab containing > key material for local HTTP/ principal is shared among a few applications. > With so many applications having access to the HTTP/ credentials, this > increases the chances of an attack on the proxy user capabilities of Hadoop. > This JIRA proposes that two different key tabs can be used to > 1. Authenticate kerberized web requests > 2. Communicate with the phoenix back end -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4533) Phoenix Query Server should not use SPNEGO principal to proxy user requests
[ https://issues.apache.org/jira/browse/PHOENIX-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated PHOENIX-4533: Fix Version/s: 4.14.0 5.0.0 > Phoenix Query Server should not use SPNEGO principal to proxy user requests > --- > > Key: PHOENIX-4533 > URL: https://issues.apache.org/jira/browse/PHOENIX-4533 > Project: Phoenix > Issue Type: Improvement >Reporter: Lev Bronshtein >Assignee: Lev Bronshtein >Priority: Minor > Fix For: 5.0.0, 4.14.0 > > Attachments: PHOENIX-4533.1.patch, PHOENIX-4533.2.patch, > PHOENIX-4533.3.patch, PHOENIX-4533.squash.patch > > > Currently the HTTP/ principal is used by various components in the HADOOP > ecosystem to perform SPNEGO authentication. Since there can only be one > HTTP/ per host, even outside of the Hadoop ecosystem, the keytab containing > key material for local HTTP/ principal is shared among a few applications. > With so many applications having access to the HTTP/ credentials, this > increases the chances of an attack on the proxy user capabilities of Hadoop. > This JIRA proposes that two different key tabs can be used to > 1. Authenticate kerberized web requests > 2. Communicate with the phoenix back end -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362469#comment-16362469 ] chenglei edited comment on PHOENIX-4602 at 2/13/18 3:48 PM: [~jleach], in fact, Phoenix does not convert the where predicates expression to CNF expression in step one,but after WhereOptimizer.pushKeyExpressionsToScan method finished, you actually get a CNF for PK Columns in SkipScanFilter.slots, you can make a simple test to verify it , and you can mail to dev@phoenix.apache.org if you have more questions. was (Author: comnetwork): [~jleach], in fact, Phoenix does not convert the where predicates expression to CNF expression in step one,but after WhereOptimizer.pushKeyExpressionsToScan method finished, you actually get a CNF in SkipScanFilter.slots, you can make a simple test to verify it , and you can mail to dev@phoenix.apache.org if you have more questions. > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > Attachments: PHOENIX-4602_v1.patch > > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain > sql is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I 
think the problem is affected by the > {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following > line 763, because the pk2 column is not the leading pk column,so this method > return null, causing the expression > {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed > to scan: > {code:java} > 757 boolean hasFirstSlot = true; > 758 boolean prevIsNull = false; > 759 // TODO: Do the same optimization that we do for IN if the childSlots > specify a fully qualified row key > 760 for (KeySlot slot : childSlot) { > 761 if (hasFirstSlot) { > 762 // if the first slot is null, return null immediately > 763 if (slot == null) { > 764 return null; > 765 } > 766 // mark that we've handled the first slot > 767 hasFirstSlot = false; > 768 } > {code} > For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems > that it is not necessary to make sure the PK Column in OrExpression is > leading PK Column,just guarantee there is only one PK Column in OrExpression > is enough. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362510#comment-16362510 ] John Leach commented on PHOENIX-4602: - [~comnetwork] Thank you for the pointer in the code! > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > Attachments: PHOENIX-4602_v1.patch > > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain > sql is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following > line 763, because the pk2 column is not the leading pk column,so this method > return null, causing the expression > {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed > to scan: > {code:java} > 757 boolean hasFirstSlot = true; > 758 boolean prevIsNull = false; > 759 // TODO: Do the same optimization that we do for IN if the childSlots > specify a fully qualified row key > 760 for (KeySlot slot : childSlot) { > 761 if (hasFirstSlot) { > 762 // if the first slot is null, return null immediately > 763 if (slot == null) { > 764 
return null; > 765 } > 766 // mark that we've handled the first slot > 767 hasFirstSlot = false; > 768 } > {code} > For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems > that it is not necessary to make sure the PK Column in OrExpression is > leading PK Column,just guarantee there is only one PK Column in OrExpression > is enough. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362469#comment-16362469 ] chenglei edited comment on PHOENIX-4602 at 2/13/18 3:41 PM: [~jleach], in fact, Phoenix does not convert the where predicates expression to CNF expression in step one,but after WhereOptimizer.pushKeyExpressionsToScan method finished, you actually get a CNF in SkipScanFilter.slots, you can make a simple test to verify it , and you can mail to dev@phoenix.apache.org if you have more questions. was (Author: comnetwork): [~jleach], Phoenix does not convert the where predicates to CNF, you can make a simple test to verify it , and you can mail to dev@phoenix.apache.org if you have more questions. > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > Attachments: PHOENIX-4602_v1.patch > > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain > sql is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following > line 763, because the pk2 column is not the leading pk 
column,so this method > return null, causing the expression > {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed > to scan: > {code:java} > 757 boolean hasFirstSlot = true; > 758 boolean prevIsNull = false; > 759 // TODO: Do the same optimization that we do for IN if the childSlots > specify a fully qualified row key > 760 for (KeySlot slot : childSlot) { > 761 if (hasFirstSlot) { > 762 // if the first slot is null, return null immediately > 763 if (slot == null) { > 764 return null; > 765 } > 766 // mark that we've handled the first slot > 767 hasFirstSlot = false; > 768 } > {code} > For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems > that it is not necessary to make sure the PK Column in OrExpression is > leading PK Column,just guarantee there is only one PK Column in OrExpression > is enough. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362469#comment-16362469 ] chenglei commented on PHOENIX-4602: --- [~jleach], Phoenix does not convert the where predicates to CNF, you can make a simple test to verify it , and you can mail to dev@phoenix.apache.org if you have more questions. > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > Attachments: PHOENIX-4602_v1.patch > > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain > sql is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following > line 763, because the pk2 column is not the leading pk column,so this method > return null, causing the expression > {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed > to scan: > {code:java} > 757 boolean hasFirstSlot = true; > 758 boolean prevIsNull = false; > 759 // TODO: Do the same optimization that we do for IN if the childSlots > specify a fully qualified row key > 760 for (KeySlot slot : childSlot) 
{ > 761 if (hasFirstSlot) { > 762 // if the first slot is null, return null immediately > 763 if (slot == null) { > 764 return null; > 765 } > 766 // mark that we've handled the first slot > 767 hasFirstSlot = false; > 768 } > {code} > For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems > that it is not necessary to make sure the PK Column in OrExpression is > leading PK Column,just guarantee there is only one PK Column in OrExpression > is enough. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362414#comment-16362414 ] John Leach commented on PHOENIX-4602: - [~comnetwork] I am new to Phoenix, but when I look at the WhereOptimizer.java it is not clear to me how or when the predicates are moved to Conjunctive Normal Form (https://en.wikipedia.org/wiki/Conjunctive_normal_form). I have always seen the following process when dealing with predicates: 1. Move all predicates to Conjunctive Normal Form. 2. Mark predicates on the key. 3. Apply a function on the key predicates to assemble a set of scans. 4. Apply a function on the remaining predicates to assemble a filter (an IN list is usually an exception case). Do you know where the CNF conversion occurs? > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > Attachments: PHOENIX-4602_v1.patch > > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain > sql is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following > line 763, because the pk2
column is not the leading pk column,so this method > return null, causing the expression > {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed > to scan: > {code:java} > 757 boolean hasFirstSlot = true; > 758 boolean prevIsNull = false; > 759 // TODO: Do the same optimization that we do for IN if the childSlots > specify a fully qualified row key > 760 for (KeySlot slot : childSlot) { > 761 if (hasFirstSlot) { > 762 // if the first slot is null, return null immediately > 763 if (slot == null) { > 764 return null; > 765 } > 766 // mark that we've handled the first slot > 767 hasFirstSlot = false; > 768 } > {code} > For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems > that it is not necessary to make sure the PK Column in OrExpression is > leading PK Column,just guarantee there is only one PK Column in OrExpression > is enough. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
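Editor's sketch for PHOENIX-4602: the description proposes that an OR can be pushed to the scan as long as all of its branches constrain one and the same PK column, whether or not that column is the leading one. The stand-alone model below illustrates that rule and the resulting range union. The names OrSlotSketch, Slot and orSlots are hypothetical and are not taken from WhereOptimizer; ranges are simplified to half-open integer intervals, and this is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the rule proposed in the description: an OR is
// pushable when every branch constrains the same single PK column.
final class OrSlotSketch {

    // Half-open range [lower, upper) on the PK column at pkPosition (0-based).
    static final class Slot {
        final int pkPosition;
        final int lower;
        final int upper;

        Slot(int pkPosition, int lower, int upper) {
            this.pkPosition = pkPosition;
            this.lower = lower;
            this.upper = upper;
        }
    }

    // Returns the merged ranges for the shared PK column, or null when the OR
    // cannot be pushed (no children, a null child, or children on different columns).
    static List<int[]> orSlots(List<Slot> children) {
        if (children.isEmpty()) {
            return null;
        }
        int position = -1;
        List<int[]> ranges = new ArrayList<>();
        for (Slot child : children) {
            if (child == null) {
                return null; // this branch is not a key expression
            }
            if (position == -1) {
                position = child.pkPosition; // any PK column, not only the leading one
            } else if (position != child.pkPosition) {
                return null; // branches touch different PK columns
            }
            ranges.add(new int[] {child.lower, child.upper});
        }
        ranges.sort((a, b) -> Integer.compare(a[0], b[0]));
        List<int[]> merged = new ArrayList<>();
        for (int[] range : ranges) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && range[0] <= last[1]) {
                last[1] = Math.max(last[1], range[1]); // overlapping or adjacent: extend
            } else {
                merged.add(range);
            }
        }
        return merged;
    }
}
```

For the query in the description, the two PK2 branches map to position 1 with ranges [4, 6) and [8, 9); they do not overlap, so both survive as separate ranges that a skip scan could use alongside the PK1 range.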
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain sql is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to scan: {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems that it is not necessary to make sure the PK Column in OrExpression is leading PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
was: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain sql is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to scan: {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems that it is not necessary to make sure the PK Column in OrExpression is leading PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
> OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > Attachments: PHOENIX-4602_v1.patch > > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to s
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain sql is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to scan: {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems that it is not necessary to make sure the PK Column in OrExpression is leading PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
was: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain sql is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to scan: {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems that it is not necessary to make sure the PK Column in OrExpression is leading PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description:

Given the following table:
{code}
CREATE TABLE test_table (
    PK1 INTEGER NOT NULL,
    PK2 INTEGER NOT NULL,
    PK3 INTEGER NOT NULL,
    DATA INTEGER,
    CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}
and the following SQL:
{code}
select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}
This is a typical case in which the query should use SkipScanFilter. However, the query does not actually use a skip scan; it uses a range scan and pushes only the leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan. The explain plan is:
{code:sql}
CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
    SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}
I think the problem is caused by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, at line 763 in the following excerpt. Because pk2 is not the leading PK column, the method returns null, so the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to the scan:
{code:java}
757  boolean hasFirstSlot = true;
758  boolean prevIsNull = false;
759  // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
760  for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762          // if the first slot is null, return null immediately
763          if (slot == null) {
764              return null;
765          }
766          // mark that we've handled the first slot
767          hasFirstSlot = false;
768      }
{code}
For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it does not seem necessary to require that the PK column in the OrExpression be the leading PK column; guaranteeing that the OrExpression contains only one PK column is enough.
> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description:

Given the following table:
{code}
CREATE TABLE test_table (
    PK1 INTEGER NOT NULL,
    PK2 INTEGER NOT NULL,
    PK3 INTEGER NOT NULL,
    DATA INTEGER,
    CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}
and the following SQL:
{code}
select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}
This is a typical case in which the query should use SkipScanFilter. However, the query does not actually use a skip scan; it uses a range scan and pushes only the leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan. The explain plan is:
{code:sql}
CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
    SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}
I think the problem is caused by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, at line 763 in the following excerpt. Because pk2 is not the leading PK column, the method returns null, so the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to the scan:
{code:java}
757  boolean hasFirstSlot = true;
758  boolean prevIsNull = false;
759  // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
760  for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762          // if the first slot is null, return null immediately
763          if (slot == null) {
764              return null;
765          }
766          // mark that we've handled the first slot
767          hasFirstSlot = false;
768      }
{code}
For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it does not seem necessary to require that the PK column in the OrExpression be the leading PK column; guaranteeing that the OrExpression contains only one PK column is enough.
> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description:

Given the following table:
{code}
CREATE TABLE test_table (
    PK1 INTEGER NOT NULL,
    PK2 INTEGER NOT NULL,
    PK3 INTEGER NOT NULL,
    DATA INTEGER,
    CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}
and the following SQL:
{code}
select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}
This is a typical case in which the query should use SkipScanFilter. However, the query does not actually use a skip scan; it uses a range scan and pushes only the leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan. The explain plan is:
{code:sql}
CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
    SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}
I think the problem is caused by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, at line 763 in the following excerpt. Because pk2 is not the leading PK column, the method returns null, so the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to the scan:
{code:java}
757  boolean hasFirstSlot = true;
758  boolean prevIsNull = false;
759  // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
760  for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762          // if the first slot is null, return null immediately
763          if (slot == null) {
764              return null;
765          }
766          // mark that we've handled the first slot
767          hasFirstSlot = false;
768      }
{code}
For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it does not seem necessary to require that the PK column in the OrExpression be the leading PK column; guaranteeing that the OrExpression contains only one PK column is enough.
> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description:

Given the following table:
{code}
CREATE TABLE test_table (
    PK1 INTEGER NOT NULL,
    PK2 INTEGER NOT NULL,
    PK3 INTEGER NOT NULL,
    DATA INTEGER,
    CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}
and the following SQL:
{code}
select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}
This is a typical case in which the query should use SkipScanFilter. However, the query does not actually use a skip scan; it uses a range scan and pushes only the leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan. The explain plan is:
{code:sql}
CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
    SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}
I think the problem is caused by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, at line 763 in the following excerpt. Because pk2 is not the leading PK column, the method returns null, so the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to the scan:
{code:java}
757  boolean hasFirstSlot = true;
758  boolean prevIsNull = false;
759  // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
760  for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762          // if the first slot is null, return null immediately
763          if (slot == null) {
764              return null;
765          }
766          // mark that we've handled the first slot
767          hasFirstSlot = false;
768      }
{code}
For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it does not seem necessary to require that the PK column in the OrExpression be the leading PK column; guaranteeing that the OrExpression contains only one PK column is enough.
> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch
[jira] [Commented] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362400#comment-16362400 ] chenglei commented on PHOENIX-4602: ---

I have uploaded my first patch. Please help review it, thanks.

> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
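To make the difference between the two plans concrete, here is a minimal Java sketch for the PHOENIX-4602 query (pk1 in [2, 5), pk2 in [4, 6) or [8, 9)). It is a toy model and not Phoenix code: `SkipScanSketch`, `Range`, and both scan methods are hypothetical names, keys are assumed to be (pk1, pk2) pairs over a small dense integer domain, and the pk2 ranges are assumed not to overlap. It only counts how many keys each strategy would touch.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of a range scan versus a skip scan; hypothetical, not Phoenix code. */
final class SkipScanSketch {

    /** Half-open range [lower, upper) over one INTEGER PK column. */
    static final class Range {
        final int lower;
        final int upper;

        Range(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }

        boolean contains(int value) {
            return value >= lower && value < upper;
        }
    }

    static boolean inAny(List<Range> ranges, int value) {
        for (Range r : ranges) {
            if (r.contains(value)) {
                return true;
            }
        }
        return false;
    }

    /** Range scan on pk1 only; the pk2 condition is applied as a filter afterwards. Returns {keysRead, rowsReturned}. */
    static int[] rangeScanWithFilter(int domain, Range pk1, List<Range> pk2Ranges) {
        int read = 0;
        int returned = 0;
        for (int k1 = 0; k1 < domain; k1++) {
            if (!pk1.contains(k1)) {
                continue; // outside the scan's start/stop key
            }
            for (int k2 = 0; k2 < domain; k2++) {
                read++; // every pk2 value under this pk1 is read
                if (inAny(pk2Ranges, k2)) {
                    returned++;
                }
            }
        }
        return new int[] {read, returned};
    }

    /** Skip scan: both columns constrain the keys, so only matching keys are read. Assumes disjoint pk2 ranges. */
    static int[] skipScan(int domain, Range pk1, List<Range> pk2Ranges) {
        int read = 0;
        for (int k1 = Math.max(0, pk1.lower); k1 < Math.min(domain, pk1.upper); k1++) {
            for (Range r : pk2Ranges) {
                for (int k2 = Math.max(0, r.lower); k2 < Math.min(domain, r.upper); k2++) {
                    read++;
                }
            }
        }
        return new int[] {read, read};
    }

    public static void main(String[] args) {
        Range pk1 = new Range(2, 5);
        List<Range> pk2 = Arrays.asList(new Range(4, 6), new Range(8, 9));
        int[] range = rangeScanWithFilter(10, pk1, pk2);
        int[] skip = skipScan(10, pk1, pk2);
        System.out.println("range scan: read " + range[0] + ", returned " + range[1]);
        System.out.println("skip scan:  read " + skip[0] + ", returned " + skip[1]);
    }
}
```

With a domain of 10 values per column, the range scan in this model reads 30 keys to return 9 rows, while the skip scan reads only those 9 keys. The gap grows with the number of pk2 values under each pk1 value.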
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Attachment: PHOENIX-4602_v1.patch

> OrExpression should can also push non-leading pk columns to scan
>
> Key: PHOENIX-4602
> URL: https://issues.apache.org/jira/browse/PHOENIX-4602
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.13.0
> Reporter: chenglei
> Priority: Major
> Attachments: PHOENIX-4602_v1.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
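The condition proposed in the PHOENIX-4602 description (the OrExpression may be pushed to the scan as long as it constrains exactly one PK column, leading or not) can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch and not the real `orKeySlots` code: `OrSlotSketch`, `Slot`, and `orSlots` are hypothetical names, a null child stands for an OR branch that constrains no PK column, and ranges are plain `{lowerInclusive, upperExclusive}` integer pairs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "only one PK column in the OR" rule; not Phoenix code. */
final class OrSlotSketch {

    /** Stand-in for a key slot: the PK position it constrains and its ranges. */
    static final class Slot {
        final int pkPosition;
        final List<int[]> ranges; // each entry is {lowerInclusive, upperExclusive}

        Slot(int pkPosition, List<int[]> ranges) {
            this.pkPosition = pkPosition;
            this.ranges = ranges;
        }
    }

    /** Returns the combined slot for an OR, or null if the OR cannot be pushed to the scan. */
    static Slot orSlots(List<Slot> children) {
        if (children.isEmpty()) {
            return null;
        }
        Integer position = null;
        List<int[]> union = new ArrayList<>();
        for (Slot child : children) {
            if (child == null) {
                return null; // this branch constrains no PK column
            }
            if (position == null) {
                position = child.pkPosition; // any position is accepted, not only the leading one
            } else if (position != child.pkPosition) {
                return null; // branches constrain different PK columns
            }
            union.addAll(child.ranges);
        }
        return new Slot(position, union);
    }
}
```

In this sketch the two pk2 branches of the example query share PK position 1, so they combine into one slot with two ranges. A branch on a different PK column, or one with no PK constraint, makes the method return null.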
Re: [VOTE] Apache Phoenix 5.0.0-alpha rc1
+1

- Tested basic, index related queries with and without stats on large data. All queries working fine.
- Verified cluster restart, region server restart and compaction related cases they are ok.

Thanks,
Rajeshbabu.

On Tue, Feb 13, 2018 at 4:52 PM, Ankit Singhal wrote:

> +1
> - All the tests are passing(except 2 which are flaky and can be ignored).
> - Tested some basic and complex queries on cluster with large data - Ok
>
> On Tue, Feb 13, 2018 at 12:16 PM, Sergey Soldatov <sergeysolda...@gmail.com> wrote:
>
> > Tested with basic scenarios with a heavy load to salted/unsalted tables.
> > Looks stable.
> >
> > +1
> >
> > On Mon, Feb 12, 2018 at 8:01 AM, Artem Ervits wrote:
> >
> > > Hadoop 2.7.5
> > > HBase 2.0-beta1
> > > downloaded binary release: OK
> > > md5: OK
> > > loaded 1M rows with performance.py: OK
> > > ran queries in sqlline: OK
> > > started PQS and ran queries with phoenixdb python client: OK
> > > ran a java Hello World example: OK
> > >
> > > On Fri, Feb 9, 2018 at 10:34 AM, Josh Elser wrote:
> > >
> > > > Hello Everyone,
> > > >
> > > > This is a call for a vote on Apache Phoenix 5.0.0-alpha rc1. Please notice that there are known issues with this release which deserve the "alpha" designation. These are staged on the website[1]. (Atomic upsert does work on my local installation with trivial testing)
> > > >
> > > > Over rc0, this release contains the changes: PHOENIX-4586, PHOENIX-4546, PHOENIX-4549, PHOENIX-4582.
> > > >
> > > > The RC is available at the standard location:
> > > >
> > > > https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-5.0.0-alpha-HBase-2.0-rc1
> > > >
> > > > RC0 is based on the following commit: 451d6a37d0d461b60edff36ceb42b17bb9610350
> > > >
> > > > Signed with my key: 9E62822F4668F17B0972ADD9B7D5CD454677D66C, http://pgp.mit.edu/pks/lookup?op=get&search=0xB7D5CD454677D66C
> > > >
> > > > Vote will be open for at least 72 hours (2018/02/12 1600GMT). Please vote:
> > > >
> > > > [ ] +1 approve
> > > > [ ] +0 no opinion
> > > > [ ] -1 disapprove (and reason why)
> > > >
> > > > Thanks,
> > > > The Apache Phoenix Team
> > > >
> > > > [1] https://phoenix.apache.org/release_notes.html
[jira] [Updated] (PHOENIX-4423) Phoenix-hive compilation broken on >=Hive 2.3
[ https://issues.apache.org/jira/browse/PHOENIX-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-4423: --- Attachment: PHOENIX-4423_wip1.patch > Phoenix-hive compilation broken on >=Hive 2.3 > - > > Key: PHOENIX-4423 > URL: https://issues.apache.org/jira/browse/PHOENIX-4423 > Project: Phoenix > Issue Type: Bug >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Critical > Fix For: 5.0.0 > > Attachments: PHOENIX-4423.002.patch, PHOENIX-4423_wip1.patch > > > HIVE-15167 removed an interface which we're using in Phoenix which obviously > fails compilation. Will need to figure out how to work with Hive 1.x, <2.3.0, > and >=2.3.0. > FYI [~sergey.soldatov] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4423) Phoenix-hive compilation broken on >=Hive 2.3
[ https://issues.apache.org/jira/browse/PHOENIX-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362176#comment-16362176 ] Ankit Singhal commented on PHOENIX-4423:

[~elserj], [~sergey.soldatov]

Attaching a WIP patch with Hive-3.0.0. Some tests are passing, but two tests related to joins are still failing on both the Tez and MapReduce clusters.

I'm able to run the tests only from Eclipse, by setting JAVA_HOME in the environment; when running through Maven, I was getting an NPE in setup. I have not spent much time on fixing it, but I have sometimes seen the same with the 4.x build as well.

Pending items:
# Fix the failing tests related to joins
# Tests should be runnable with Maven verify
# Need to see if the same build can work with <3.0.0 (or <2.3.0)

> Phoenix-hive compilation broken on >=Hive 2.3
>
> Key: PHOENIX-4423
> URL: https://issues.apache.org/jira/browse/PHOENIX-4423
> Project: Phoenix
> Issue Type: Bug
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Critical
> Fix For: 5.0.0
> Attachments: PHOENIX-4423.002.patch, PHOENIX-4423_wip1.patch
>
> HIVE-15167 removed an interface which we're using in Phoenix which obviously fails compilation. Will need to figure out how to work with Hive 1.x, <2.3.0, and >=2.3.0.
> FYI [~sergey.soldatov]

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] Apache Phoenix 5.0.0-alpha rc1
+1

- All the tests are passing(except 2 which are flaky and can be ignored).
- Tested some basic and complex queries on cluster with large data - Ok

On Tue, Feb 13, 2018 at 12:16 PM, Sergey Soldatov wrote:

> Tested with basic scenarios with a heavy load to salted/unsalted tables.
> Looks stable.
>
> +1
>
> On Mon, Feb 12, 2018 at 8:01 AM, Artem Ervits wrote:
>
> > Hadoop 2.7.5
> > HBase 2.0-beta1
> > downloaded binary release: OK
> > md5: OK
> > loaded 1M rows with performance.py: OK
> > ran queries in sqlline: OK
> > started PQS and ran queries with phoenixdb python client: OK
> > ran a java Hello World example: OK
> >
> > On Fri, Feb 9, 2018 at 10:34 AM, Josh Elser wrote:
> >
> > > Hello Everyone,
> > >
> > > This is a call for a vote on Apache Phoenix 5.0.0-alpha rc1. Please notice that there are known issues with this release which deserve the "alpha" designation. These are staged on the website[1]. (Atomic upsert does work on my local installation with trivial testing)
> > >
> > > Over rc0, this release contains the changes: PHOENIX-4586, PHOENIX-4546, PHOENIX-4549, PHOENIX-4582.
> > >
> > > The RC is available at the standard location:
> > >
> > > https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-5.0.0-alpha-HBase-2.0-rc1
> > >
> > > RC0 is based on the following commit: 451d6a37d0d461b60edff36ceb42b17bb9610350
> > >
> > > Signed with my key: 9E62822F4668F17B0972ADD9B7D5CD454677D66C, http://pgp.mit.edu/pks/lookup?op=get&search=0xB7D5CD454677D66C
> > >
> > > Vote will be open for at least 72 hours (2018/02/12 1600GMT). Please vote:
> > >
> > > [ ] +1 approve
> > > [ ] +0 no opinion
> > > [ ] -1 disapprove (and reason why)
> > >
> > > Thanks,
> > > The Apache Phoenix Team
> > >
> > > [1] https://phoenix.apache.org/release_notes.html
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description:

Given the following table:
{code}
CREATE TABLE test_table (
    PK1 INTEGER NOT NULL,
    PK2 INTEGER NOT NULL,
    PK3 INTEGER NOT NULL,
    DATA INTEGER,
    CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}
and the following SQL:
{code}
select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}
This is a typical case in which the query should use SkipScanFilter. However, the query does not actually use a skip scan; it uses a range scan and pushes only the leading PK column expression {{(t.pk1 >=2 and t.pk1<5)}} to the scan. The explain plan is:
{code:sql}
CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
    SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}
I think the problem is caused by the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, at line 763 in the following excerpt. Because pk2 is not the leading PK column, the method returns null, so the expression {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not pushed to the scan:
{code:java}
757  boolean hasFirstSlot = true;
758  boolean prevIsNull = false;
759  // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key
760  for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762          // if the first slot is null, return null immediately
763          if (slot == null) {
764              return null;
765          }
766          // mark that we've handled the first slot
767          hasFirstSlot = false;
768      }
{code}
For the {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method above, it does not seem necessary to require that the PK column in the OrExpression be the leading PK column; guaranteeing that the OrExpression contains only one PK column is enough.
was: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not pushed to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 
and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the > explain is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not pushed to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} was: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for 
the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not pushed to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the > explain is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE 
SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line > 763, because the pk2 column is not the leading pk column,so this > method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or > (t.pk2 >= 8 and t.pk2 <9))}} does not pushed to scan > {code:java} > 757
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not pushed to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} was: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for 
the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the > explain is : > {code:sql} > CLIENT PARALLEL 1-WAY RANGE 
SCAN OVER TEST_TABLE [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > I think the problem is affected by the > WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line > 763, because the pk2 column is not the leading pk column,so this > method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or > (t.pk2 >= 8 and t.pk2 <9))}} does not pushed to scan > {code:java} > 757
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} was: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case 
for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression \{{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the > explain is : > {code:sql} > CLIENT PARALLEL 1-WAY 
RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > > I think the problem is affected by the > WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line > 763, because the pk2 column is not the leading pk column,so this > method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or > (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan > {code:java} >
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : {code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression \{{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} was: Given following table: {code:sql} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code:sql} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a 
typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : \{code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression \{{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > > Given following table: > {code} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the > explain is : > {code:sql} > CLIENT 
PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > > I think the problem is affected by the > WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line > 763, because the pk2 column is not the leading pk column,so this > method return null, causing the expression \{{ ((t.pk2 >= 4 and t.pk2 <6) or > (t.pk2 >= 8 and t.pk2 <9))}} is not push to sc
[jira] [Updated] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
[ https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-4602: -- Description: Given following table: {code:sql} CREATE TABLE test_table ( PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL, PK3 INTEGER NOT NULL, DATA INTEGER, CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) {code} and a sql: {code:sql} select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) {code} Obviously, it is a typical case for the sql to use SkipScanFilter,however, the sql actually does not use Skip Scan, it use Range Scan and just push the leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain is : \{code:sql} CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) {code} I think the problem is affected by the WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 763, because the pk2 column is not the leading pk column,so this method return null, causing the expression \{{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan {code:java} 757 boolean hasFirstSlot = true; 758 boolean prevIsNull = false; 759 // TODO: Do the same optimization that we do for IN if the childSlots specify a fully qualified row key 760 for (KeySlot slot : childSlot) { 761 if (hasFirstSlot) { 762 // if the first slot is null, return null immediately 763 if (slot == null) { 764 return null; 765 } 766 // mark that we've handled the first slot 767 hasFirstSlot = false; 768 } {code} > OrExpression should can also push non-leading pk columns to scan > > > Key: PHOENIX-4602 > URL: https://issues.apache.org/jira/browse/PHOENIX-4602 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.13.0 >Reporter: chenglei >Priority: Major > > Given following table: > {code:sql} > CREATE TABLE test_table ( > PK1 INTEGER NOT NULL, > PK2 
INTEGER NOT NULL, > PK3 INTEGER NOT NULL, > DATA INTEGER, > CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3)) > {code} > and a sql: > {code:sql} > select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 > and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) > {code} > Obviously, it is a typical case for the sql to use SkipScanFilter,however, > the sql actually does not use Skip Scan, it use Range Scan and just push the > leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the > explain is : > \{code:sql} > CLIENT PARALLEL 1-WAY RANGE SCAN OVER OR_NO_LEADING_PK [2] - [5] > SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9)) > {code} > > I think the problem is affected by the > WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line > 763, because the pk2 column is not the leading pk column,so this > method return null, causing the expression \{{ ((t.pk2 >= 4 and t.pk2 <6) or > (t.pk2 >= 8 and t.pk2 <9))}} is not push to scan > {code:java} > 757 boolean hasFirstSlot = true; > 758 boolean prevIsNull = false; > 759 // TODO: Do the same optimization that we do for IN if the childSlots > specify a fully qualified row key > 760 for (KeySlot slot : childSlot) { > 761 if (hasFirstSlot) { > 762 // if the first slot is null, return null immediately > 763 if (slot == null) { > 764 return null; > 765 } > 766 // mark that we've handled the first slot > 767 hasFirstSlot = false; > 768 } > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
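As a rough illustration of why the issue above matters, the following sketch counts how many rows each strategy would have to read for the example query. It is a toy in-memory model, not Phoenix code: the class SkipScanSketch, its Range type, the 10 x 10 sample table, and the "idealized" skip scan (which reads only keys inside the pushed-down ranges and ignores seek overhead) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (hypothetical, not Phoenix internals) comparing rows read by two scan strategies. */
public class SkipScanSketch {

    /** Half-open integer interval [lower, upper), mirroring "col >= lower AND col < upper". */
    public static final class Range {
        final int lower;
        final int upper;

        public Range(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }

        boolean contains(int v) {
            return v >= lower && v < upper;
        }
    }

    /** Builds an assumed sample table: one row per (pk1, pk2) pair, both in 0..9. */
    public static List<int[]> sampleRows() {
        List<int[]> rows = new ArrayList<>();
        for (int pk1 = 0; pk1 < 10; pk1++) {
            for (int pk2 = 0; pk2 < 10; pk2++) {
                rows.add(new int[] {pk1, pk2});
            }
        }
        return rows;
    }

    private static boolean inAny(List<Range> ranges, int v) {
        for (Range r : ranges) {
            if (r.contains(v)) {
                return true;
            }
        }
        return false;
    }

    /** Range scan: only pk1 bounds the scan, so every row with pk1 in range is read and then filtered. */
    public static int rowsReadByRangeScan(List<int[]> rows, Range pk1) {
        int read = 0;
        for (int[] row : rows) {
            if (pk1.contains(row[0])) {
                read++;
            }
        }
        return read;
    }

    /** Idealized skip scan: pk1 and the OR-ed pk2 ranges both bound the keys that are read. */
    public static int rowsReadBySkipScan(List<int[]> rows, Range pk1, List<Range> pk2Ranges) {
        int read = 0;
        for (int[] row : rows) {
            if (pk1.contains(row[0]) && inAny(pk2Ranges, row[1])) {
                read++;
            }
        }
        return read;
    }

    public static void main(String[] args) {
        List<int[]> rows = sampleRows();
        Range pk1 = new Range(2, 5);                                       // pk1 >= 2 AND pk1 < 5
        List<Range> pk2 = Arrays.asList(new Range(4, 6), new Range(8, 9)); // the OR-ed pk2 ranges
        System.out.println("range scan reads " + rowsReadByRangeScan(rows, pk1) + " rows");
        System.out.println("skip scan reads " + rowsReadBySkipScan(rows, pk1, pk2) + " rows");
    }
}
```

On this sample data the range scan reads 30 rows and leaves the pk2 condition to the server-side filter, while the idealized skip scan reads only the 9 rows that can match. The real saving in Phoenix depends on the data distribution and on seek cost, which this sketch does not model.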
[jira] [Created] (PHOENIX-4602) OrExpression should can also push non-leading pk columns to scan
chenglei created PHOENIX-4602:
---------------------------------

             Summary: OrExpression should can also push non-leading pk columns to scan
                 Key: PHOENIX-4602
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4602
             Project: Phoenix
          Issue Type: Improvement
    Affects Versions: 4.13.0
            Reporter: chenglei

--
This message was sent by Atlassian JIRA (v7.6.3#76005)