[jira] [Comment Edited] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property
[ https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237122#comment-16237122 ]

Ankit Singhal edited comment on PHOENIX-4328 at 11/3/17 5:50 AM:
-----------------------------------------------------------------

bq. We can't do that since the properties are read from ConnectionQueryServicesImpl; the properties object is an instance of class ReadOnlyProps.

You can use an instance variable (isNamespaceMappingEnabled) in ConnectionQueryServicesImpl, set by the current logic in checkClientServerCompatibility, and use it everywhere the conversion of SYSTEM table names is required. We may close https://issues.apache.org/jira/browse/PHOENIX-3288 as a duplicate if this JIRA is trying to do the same.

was (Author: an...@apache.org):
bq. We can't do that since the properties are read from ConnectionQueryServicesImpl; the properties object is an instance of class ReadOnlyProps.

You can use an instance variable (isNamespaceMappingEnabled) in ConnectionQueryServicesImpl, set by the current logic in checkClientServerCompatibility, and use it everywhere the conversion of SYSTEM table names is required.

> Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property
> --------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4328
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4328
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Karan Mehta
>            Priority: Major
>              Labels: namespaces
>             Fix For: 4.13.0
>
> Imagine a scenario where we enable namespaces for Phoenix on the server side and set the property {{phoenix.schema.isNamespaceMappingEnabled}} to true. A bunch of clients are trying to connect to this cluster. All of these clients have {{phoenix.schema.isNamespaceMappingEnabled}} set to true; however, {{phoenix.schema.mapSystemTablesToNamespace}} is set to false for some of them and true for others. (A typical case for a rolling upgrade.)
> The first client with {{phoenix.schema.mapSystemTablesToNamespace}} set to true will acquire the lock on SYSMUTEX and migrate the system tables. As soon as this happens, all the other clients will start failing.
> There are two scenarios here:
> 1. A new client trying to connect to the server without this property set. This will fail since ConnectionQueryServicesImpl checks whether SYSCAT is namespace mapped or not; if there is a mismatch, it throws an exception, so the client doesn't get any connection.
> 2. Clients already connected to the cluster that don't have this property set. This will fail because every query calls the endpoint coprocessor on SYSCAT to determine the PTable of the query table, and the physical HBase table name is resolved based on the properties. Thus, we try to call the method on SYSCAT instead of SYS:CAT and it results in a TableNotFoundException.
> This JIRA is to discuss the potential ways in which we can handle this issue. Some ideas after discussing with [~twdsi...@gmail.com]:
> 1. Build retry logic around the code that works with SYSTEM tables (coprocessor calls etc.): try with SYSCAT and, if it fails, try with SYS:CAT.
> Cons: Difficult to maintain, and code scattered all over.
> 2. Use the SchemaUtil.getPhysicalTableName method to return the table name that actually exists (only for SYSTEM tables): call admin.tableExists to determine whether SYSCAT or SYS:CAT exists and return that name. The client properties get ignored in this case.
> Cons: Expensive call every time, since this method is always called several times.
> [~jamestaylor] [~elserj] [~an...@apache.org] [~apurtell]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
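[Editor's sketch] Ankit's suggestion above, caching the server-side answer once instead of re-reading client properties, could look roughly like this. The class and method names are illustrative only, not Phoenix's actual ConnectionQueryServicesImpl API; it assumes the flag is set once after the compatibility check and consulted wherever a SYSTEM table name is resolved:

```java
// Hypothetical sketch: cache the namespace-mapping result determined by the
// server compatibility check (as checkClientServerCompatibility would) and
// consult the cached flag whenever a SYSTEM table name must be converted.
class NamespaceMappingCache {
    private volatile Boolean systemTablesMapped; // null until the server check has run

    // Would be called once from the compatibility check with the server's answer.
    void setSystemTablesMapped(boolean mapped) {
        this.systemTablesMapped = mapped;
    }

    // Resolve the physical name of a SYSTEM table from the cached flag,
    // e.g. SYSTEM.CATALOG -> SYSTEM:CATALOG when mapping is enabled.
    String physicalSystemTableName(String logicalName) {
        if (systemTablesMapped == null) {
            throw new IllegalStateException("server compatibility check has not run yet");
        }
        return systemTablesMapped ? logicalName.replace('.', ':') : logicalName;
    }
}
```

Used this way, every call site sees one consistent answer for the lifetime of the ConnectionQueryServicesImpl, regardless of what the client's own properties say.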
[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237120#comment-16237120 ]

Ankit Singhal commented on PHOENIX-3460:
----------------------------------------

bq. HBase does not allow creating a table name that contains the namespace separator. We should not allow using the namespace separator in the table or schema name. Instead we should throw a PhoenixParserException.

[~tdsilva], [~jamestaylor], I think we had this to map existing tables with a view/table when namespace mapping is not enabled. So are we making it mandatory for users to have namespace mapping enabled when they want to map tables under a namespace in Phoenix? Then we probably need some documentation around this.

> Namespace separator ":" should not be allowed in table or schema name
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-3460
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3460
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>         Environment: HDP 2.5
>            Reporter: Xindian Long
>            Assignee: Thomas D'Silva
>            Priority: Major
>              Labels: namespaces, phoenix, spark
>             Fix For: 4.13.0
>
>         Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460-v2.patch, PHOENIX-3460-v2.patch, PHOENIX-3460.patch, SchemaUtil.java
>
> I am testing some code that uses the Phoenix Spark plugin to read a Phoenix table with a namespace prefix in the table name (the table is created as a Phoenix table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there, because I can query it using plain Phoenix SQL through Squirrel. In addition, using Spark SQL to query it causes no problem at all.
> I am running on the HDP 2.5 platform, with Phoenix 4.7.0.2.5.0.0-1245.
> The problem does not exist at all when I run the same code on an HDP 2.4 cluster with Phoenix 4.4.
> Neither does the problem occur when I query a table without a namespace prefix in the DB table name on HDP 2.5.
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The weird thing is that in the attached code, if I run testSpark alone it gives the above exception, but if I run testJdbc first, followed by testSpark, both of them work.
> After changing to create the table using
> create table ACME.ENDPOINT_STATUS
> the phoenix-spark plugin seems to work. I also noticed some weird behavior: if I do both of the following
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> both tables show up in Phoenix; the first shows as schema ACME with table name ENDPOINT_STATUS, and the latter shows as schema none with table name ACME:ENDPOINT_STATUS.
> However, in HBase I only see one table, ACME:ENDPOINT_STATUS. In addition, upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and vice versa.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
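[Editor's sketch] The validation being proposed in the quoted comment could be illustrated as below. This helper is hypothetical; the real fix would reject the name in Phoenix's parser and throw PhoenixParserException, whereas this sketch uses a plain IllegalArgumentException to keep the example self-contained:

```java
// Illustrative check: reject the HBase namespace separator ':' in a table or
// schema name at validation time, before anything reaches HBase. The '.'
// schema separator (e.g. ACME.ENDPOINT_STATUS) remains legal.
final class IdentifierValidator {
    private IdentifierValidator() {}

    static void checkNoNamespaceSeparator(String identifier) {
        if (identifier != null && identifier.indexOf(':') >= 0) {
            throw new IllegalArgumentException(
                "Namespace separator ':' is not allowed in table or schema name: " + identifier);
        }
    }
}
```

With a check like this in place, `create table "ACME:ENDPOINT_STATUS"` would fail fast instead of silently aliasing the namespace-mapped table, which is the confusing behavior described above.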
[jira] [Commented] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns
[ https://issues.apache.org/jira/browse/PHOENIX-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237067#comment-16237067 ]

Samarth Jain commented on PHOENIX-4348:
---------------------------------------

+1

> Point deletes do not work when there are immutable indexes with only row key columns
> --------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4348
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4348
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>             Fix For: 4.13.0
>
>         Attachments: PHOENIX-4348.patch
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237075#comment-16237075 ]

Hudson commented on PHOENIX-4287:
---------------------------------

SUCCESS: Integrated in Jenkins build Phoenix-master #1863 (See [https://builds.apache.org/job/Phoenix-master/1863/])
PHOENIX-4287 Add null check for parent name (samarth: rev 895d067974639cd2205b14940e4e46864b4e2060)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java

> Incorrect aggregate query results when stats are disable for parallelization
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4287
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: HBase 1.3.1
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>            Priority: Major
>              Labels: localIndex
>             Fix For: 4.13.0, 4.12.1
>
>         Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, PHOENIX-4287_addendum4.patch, PHOENIX-4287_addendum5.patch, PHOENIX-4287_addendum6.patch, PHOENIX-4287_addendum7.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, an aggregate query returns incorrect results when stats are available.
> With a local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> | PLAN                                                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO  |
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899      | 332170        | 150792825 |
> |     SERVER FILTER BY FIRST KEY ONLY                                                   | 625043899      | 332170        | 150792825 |
> |     SERVER AGGREGATE INTO SINGLE ROW                                                  | 625043899      | 332170        | 150792825 |
>
> select count(*) from TABLE_T;
> | COUNT(1) |
> | 0        |
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                                             | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470      | 332151        | 1507928257617 |
> |     SERVER FILTER BY FIRST KEY ONLY                                              | 438492470      | 332151        | 1507928257617 |
> |     SERVER AGGREGATE INTO SINGLE ROW                                             | 438492470      | 332151        | 1507928257617 |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> | COUNT(1) |
> | 14       |
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                 | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null           | null          | null        |
> |     SERVER FILTER BY FIRST KEY ONLY                  | null           | null          | null        |
> |     SERVER AGGREGATE INTO SINGLE ROW                 | null           | null          | null        |
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns
[ https://issues.apache.org/jira/browse/PHOENIX-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237027#comment-16237027 ]

Hadoop QA commented on PHOENIX-4348:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12895545/PHOENIX-4348.patch
against master branch at commit 895d067974639cd2205b14940e4e46864b4e2060.

ATTACHMENT ID: 12895545

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+    public void testPointDeleteRowFromTableWithImmutableIndex(boolean localIndex, boolean addNonPKIndex) throws Exception {
+        "CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN, FEATURE, \"DATE\")) IMMUTABLE_ROWS=true");
+        stm.execute("CREATE " + (localIndex ? "LOCAL" : "") + " INDEX " + indexName1 + " ON " + tableName + " (\"DATE\", FEATURE)");
+        stm.execute("CREATE " + (localIndex ? "LOCAL" : "") + " INDEX " + indexName2 + " ON " + tableName + " (FEATURE, DOMAIN)");
+        stm.execute("CREATE " + (localIndex ? "LOCAL" : "") + " INDEX " + indexName3 + " ON " + tableName + " (\"DATE\", FEATURE, USAGE.DB)");
+        .prepareStatement("UPSERT INTO " + tableName + "(HOST, DOMAIN, FEATURE, \"DATE\", CORE, DB, ACTIVE_VISITOR) VALUES(?, ?, ?, ?, ?, ?, ?)");
+        String dml = "DELETE FROM " + tableName + " WHERE (HOST, DOMAIN, FEATURE, \"DATE\") = (?,?,?,?)";
+        return new MutationState(plan.getTableRef(), mutation, 0, maxSize, maxSizeBytes, connection);

{color:red}-1 core tests{color}. The patch failed these unit tests:
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1613//testReport/
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1613//console

This message is automatically generated.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237000#comment-16237000 ]

Hudson commented on PHOENIX-4287:
---------------------------------

FAILURE: Integrated in Jenkins build Phoenix-master #1862 (See [https://builds.apache.org/job/Phoenix-master/1862/])
PHOENIX-4287 Make indexes inherit use stats property from their parent (samarth: rev 7d2205d0c9854f61e667a4939eeed645de518f45)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ExplainPlanWithStatsEnabledIT.java

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan
[ https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236997#comment-16236997 ]

Hadoop QA commented on PHOENIX-4342:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12895538/PHOENIX-4342-v2.patch
against master branch at commit 895d067974639cd2205b14940e4e46864b4e2060.

ATTACHMENT ID: 12895538

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+        mutationPlans.add(new SingleRowDeleteMutationPlan(dataPlan, connection, maxSize, maxSizeBytes));
+        return new ServerSelectDeleteMutationPlan(dataPlan, connection, aggPlan, projector, maxSize, maxSizeBytes);
+        return new ClientSelectDeleteMutationPlan(targetTableRef, dataPlan, bestPlan, hasPreOrPostProcessing,
+                parallelIteratorFactory, otherTableRefs, projectedTableRef, maxSize, maxSizeBytes, connection);
+    public SingleRowDeleteMutationPlan(QueryPlan dataPlan, PhoenixConnection connection, int maxSize, int maxSizeBytes) {
+        Map mutation = Maps.newHashMapWithExpectedSize(ranges.getPointLookupCount());
+        mutation.put(new ImmutableBytesPtr(iterator.next().getLowerRange()), new RowMutationState(PRow.DELETE_MARKER, statement.getConnection().getStatementExecutionCounter(), NULL_ROWTIMESTAMP_INFO, null));
+        return new MutationState(context.getCurrentTable(), mutation, 0, maxSize, maxSizeBytes, connection);
+    public ServerSelectDeleteMutationPlan(QueryPlan dataPlan, PhoenixConnection connection, QueryPlan aggPlan,
+                                          RowProjector projector, int maxSize, int maxSizeBytes) {

{color:red}-1 core tests{color}. The patch failed these unit tests:
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.RebuildIndexConnectionPropsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SaltedViewIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1612//testReport/
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1612//console

This message is automatically generated.
> Surface QueryPlan in MutationPlan
> ---------------------------------
>
>                 Key: PHOENIX-4342
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4342
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Geoffrey Jacoby
>            Priority: Minor
>         Attachments: PHOENIX-4342-v2.patch, PHOENIX-4342.patch
>
> For DELETE statements, it'd be good to be able to get at the QueryPlan through the MutationPlan so we can get more structured information at compile time.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
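[Editor's sketch] The improvement requested above can be illustrated with simplified stand-in interfaces; these are not Phoenix's real QueryPlan/MutationPlan signatures, just the shape of the accessor being added:

```java
// Simplified stand-ins for the Phoenix interfaces discussed above.
interface QueryPlanLike {
    String explain(); // placeholder for the structured compile-time info
}

interface MutationPlanLike {
    QueryPlanLike getQueryPlan(); // the new accessor this JIRA proposes
}

// A delete plan exposes the underlying data-scan plan it was compiled from,
// so callers can inspect it at compile time without executing the mutation.
class DeleteMutationPlan implements MutationPlanLike {
    private final QueryPlanLike dataPlan;

    DeleteMutationPlan(QueryPlanLike dataPlan) {
        this.dataPlan = dataPlan;
    }

    @Override
    public QueryPlanLike getQueryPlan() {
        return dataPlan;
    }
}
```

Returning the underlying data plan here mirrors James's later review comment that, when a point delete has multiple query plans, the dataPlan is the natural one to surface.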
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236987#comment-16236987 ]

Hadoop QA commented on PHOENIX-4287:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12895529/PHOENIX-4287_addendum7.patch
against master branch at commit 7d2205d0c9854f61e667a4939eeed645de518f45.

ATTACHMENT ID: 12895529

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.

{color:green}+1 core tests{color}. The patch passed unit tests.

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1611//testReport/
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1611//console

This message is automatically generated.
[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan
[ https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236960#comment-16236960 ]

James Taylor commented on PHOENIX-4342:
---------------------------------------

This looks very good, [~gjacoby]. In the one funky case in DeleteCompiler where we have multiple query plans for a point delete, it might be better to return the dataPlan here:
{code}
+
+    @Override
+    public QueryPlan getQueryPlan() {
+        return firstPlan.getQueryPlan();
+    }
{code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns
[ https://issues.apache.org/jira/browse/PHOENIX-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-4348:
----------------------------------
    Attachment: PHOENIX-4348.patch

Please review, [~samarthjain] or [~tdsilva]. We weren't joining the MutationStates together, so not all the Delete mutations were getting committed.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (PHOENIX-4342) Surface QueryPlan in MutationPlan
[ https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Jacoby updated PHOENIX-4342:
-------------------------------------
    Attachment: PHOENIX-4342-v2.patch

v2 patch with getQueryPlan added to MutationPlan.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns
James Taylor created PHOENIX-4348:
----------------------------------

             Summary: Point deletes do not work when there are immutable indexes with only row key columns
                 Key: PHOENIX-4348
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4348
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor
            Assignee: James Taylor
            Priority: Major

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
[ https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236914#comment-16236914 ]

James Taylor commented on PHOENIX-4344:
---------------------------------------

I see - yes, you're right - that would work. It'd do a point scan for each row if the index has a non-PK column, as it'd need to look up that value to maintain the index. It'd work, it'd just be slow.

> MapReduce Delete Support
> ------------------------
>
>                 Key: PHOENIX-4344
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4344
>             Project: Phoenix
>          Issue Type: New Feature
>    Affects Versions: 4.12.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> Phoenix already has the ability to use MapReduce for asynchronous handling of long-running SELECTs. It would be really useful to have this capability for long-running DELETEs, particularly for tables with indexes, where using HBase's own MapReduce integration would be prohibitively complicated.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
[ https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236907#comment-16236907 ]

Geoffrey Jacoby commented on PHOENIX-4344:
------------------------------------------

I don't see how Option 1 is problematic for indexes on non-PK columns, because it internally uses the Phoenix JDBC API and so goes through all the same index-handling logic that a point-delete query issued from outside MapReduce would. Let's say I have a table ENTITY_HISTORY with a compound primary key (Key1, Key2). I create my MapReduce job with a query like "DELETE FROM ENTITY_HISTORY WHERE Key1 > 'aaa'". That delete would be converted to a select, and the MapReduce job would iterate row by row over the result set. For each row, a new Delete query would be built using that row's PK, e.g. "DELETE FROM ENTITY_HISTORY WHERE Key1 = 'foo' AND Key2 = 'bar'", and executed using a PhoenixConnection (probably with some kind of commit batching). I'm somewhat concerned about the perf, but the correctness seems sound to me -- am I missing an issue?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
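[Editor's sketch] The flow Geoffrey describes above might look roughly like this over plain JDBC. The helper class, table/column names, and batch size are illustrative; `deleteMatching` assumes a live Phoenix connection with autocommit disabled, and is not Phoenix's actual MapReduce integration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class MapReduceStyleDelete {
    // Build the per-row point-delete statement over the full primary key,
    // e.g. "DELETE FROM ENTITY_HISTORY WHERE Key1 = ? AND Key2 = ?".
    static String pointDeleteSql(String table, String[] pkColumns) {
        StringBuilder sb = new StringBuilder("DELETE FROM ").append(table).append(" WHERE ");
        for (int i = 0; i < pkColumns.length; i++) {
            if (i > 0) sb.append(" AND ");
            sb.append(pkColumns[i]).append(" = ?");
        }
        return sb.toString();
    }

    // Run the SELECT form of the delete, and for each row issue a point
    // delete on its PK, committing every batchSize rows (autocommit off).
    static void deleteMatching(Connection conn, String table, String[] pkColumns,
                               String selectSql, int batchSize) throws SQLException {
        try (PreparedStatement delete = conn.prepareStatement(pointDeleteSql(table, pkColumns));
             Statement select = conn.createStatement();
             ResultSet rs = select.executeQuery(selectSql)) {
            int pending = 0;
            while (rs.next()) {
                for (int i = 0; i < pkColumns.length; i++) {
                    delete.setObject(i + 1, rs.getObject(pkColumns[i]));
                }
                delete.executeUpdate();
                if (++pending % batchSize == 0) {
                    conn.commit(); // the commit batching mentioned above
                }
            }
            conn.commit(); // flush any remainder
        }
    }
}
```

Because each point delete goes through the JDBC layer, it exercises the same index-maintenance logic as any other client delete, which is exactly the correctness argument made in the comment; the per-row round trip is also why perf is the concern.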
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236904#comment-16236904 ]

Samarth Jain commented on PHOENIX-4287:
---------------------------------------
Thanks. I added the comment in my commit.

> Incorrect aggregate query results when stats are disable for parallelization
> ----------------------------------------------------------------------------
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.12.0
> Environment: HBase 1.3.1
> Reporter: Mujtaba Chohan
> Assignee: Samarth Jain
> Priority: Major
> Labels: localIndex
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch,
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch,
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_addendum5.patch,
> PHOENIX-4287_addendum6.patch, PHOENIX-4287_addendum7.patch,
> PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch,
> PHOENIX-4287_v4.patch
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> | PLAN                                                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899      | 332170        | 150792825   |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899      | 332170        | 150792825   |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899      | 332170        | 150792825   |
>
> select count(*) from TABLE_T;
> | COUNT(1) |
> | 0        |
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                                             | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470      | 332151        | 1507928257617 |
> | SERVER FILTER BY FIRST KEY ONLY                                                  | 438492470      | 332151        | 1507928257617 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                 | 438492470      | 332151        | 1507928257617 |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> | COUNT(1) |
> | 14       |
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                 | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null           | null          | null        |
> | SERVER FILTER BY FIRST KEY ONLY                      | null           | null          | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null           | null          | null        |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> {noformat}
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236893#comment-16236893 ]

James Taylor commented on PHOENIX-4287:
---------------------------------------
Patch looks good, but please add a comment about needing that extra check for drop of a local index.
[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samarth Jain updated PHOENIX-4287:
----------------------------------
Attachment: PHOENIX-4287_addendum7.patch

Looks like an NPE happens when dropping local indexes. Addressing it in this patch.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236864#comment-16236864 ]

Hadoop QA commented on PHOENIX-4287:
------------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12895514/PHOENIX-4287_addendum5.patch
against master branch at commit 8f9356a2bdd6ba603158899eba38750c85e8e574.

ATTACHMENT ID: 12895514

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.

{color:red}-1 core tests{color}. The patch failed these unit tests:
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.IndexWithTableSchemaChangeIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.DropColumnIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1609//testReport/
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1609//console

This message is automatically generated.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236857#comment-16236857 ]

Hadoop QA commented on PHOENIX-4287:
------------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12895524/PHOENIX-4287_addendum6.patch
against master branch at commit 7d2205d0c9854f61e667a4939eeed645de518f45.

ATTACHMENT ID: 12895524

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1610//console

This message is automatically generated.
[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
[ https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236856#comment-16236856 ]

James Taylor commented on PHOENIX-4344:
---------------------------------------
I'd go with Option #2. Option #1 will be problematic for tables with indexes on non-PK columns. If you can tack on the correct RVC (or perhaps dip below the Phoenix API and set the start/stop row of the Scan) based on the info in the QueryPlan, then the delete logic will all be handled completely by DeleteCompiler. You just need to grab the mutations using PhoenixRuntime.getUncommittedDataIterator(). You might just use FormatToBytesWritableMapper for inspiration/code borrowing.
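A rough client-side sketch of that flow: PhoenixRuntime.getUncommittedDataIterator(conn) is the real API (it yields pairs of table name bytes and the KeyValues buffered for that table), but the runnable helper below uses simplified stand-in types (String table names, String cells) only to show the per-table grouping step an HFile-writing job would need; the real Phoenix calls appear in the trailing comment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Sketch of Option #2's client side: run the DELETE with auto-commit off, drain
// the uncommitted mutations, and group them per target table (data table plus
// each index table) before handing them to the output format. Types here are
// simplified stand-ins for (byte[] tableName, List<KeyValue>).
class UncommittedMutationGrouper {

    static Map<String, List<String>> groupByTable(
            Iterator<Map.Entry<String, List<String>>> uncommitted) {
        Map<String, List<String>> perTable = new HashMap<>();
        while (uncommitted.hasNext()) {
            Map.Entry<String, List<String>> pair = uncommitted.next();
            // Accumulate all cells destined for the same table into one list.
            perTable.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                    .addAll(pair.getValue());
        }
        return perTable;
    }

    /*
     * Real flow (requires Phoenix on the classpath):
     *
     *   conn.setAutoCommit(false);
     *   stmt.execute("DELETE FROM ENTITY_HISTORY WHERE KEY1 > 'aaa'");  // buffered, not committed
     *   Iterator<Pair<byte[], List<KeyValue>>> it =
     *       PhoenixRuntime.getUncommittedDataIterator(conn);
     *   // write each table's KeyValues out through MultiHFileOutputFormat
     *   conn.rollback();  // mutations were captured, not applied via the JDBC path
     */
}
```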
[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan
[ https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236851#comment-16236851 ]

Hadoop QA commented on PHOENIX-4342:
------------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12895489/PHOENIX-4342.patch
against master branch at commit 8f9356a2bdd6ba603158899eba38750c85e8e574.

ATTACHMENT ID: 12895489

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+List<MutationPlan> mutationPlans = Lists.newArrayListWithExpectedSize(queryPlans.size());
+mutationPlans.add(new SingleRowDeleteMutationPlan(dataPlan, connection, maxSize, maxSizeBytes));
+return new ServerSelectDeleteMutationPlan(dataPlan, connection, aggPlan, projector, maxSize, maxSizeBytes);
+return new ClientSelectDeleteMutationPlan(targetTableRef, dataPlan, bestPlan, hasPreOrPostProcessing,
+parallelIteratorFactory, otherTableRefs, projectedTableRef, maxSize, maxSizeBytes, connection);
+public SingleRowDeleteMutationPlan(QueryPlan dataPlan, PhoenixConnection connection, int maxSize, int maxSizeBytes) {
+Map mutation = Maps.newHashMapWithExpectedSize(ranges.getPointLookupCount());
+mutation.put(new ImmutableBytesPtr(iterator.next().getLowerRange()), new RowMutationState(PRow.DELETE_MARKER, statement.getConnection().getStatementExecutionCounter(), NULL_ROWTIMESTAMP_INFO, null));
+return new MutationState(context.getCurrentTable(), mutation, 0, maxSize, maxSizeBytes, connection);
+public ServerSelectDeleteMutationPlan(QueryPlan dataPlan, PhoenixConnection connection, QueryPlan aggPlan,

{color:green}+1 core tests{color}. The patch passed unit tests.

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1608//testReport/
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1608//console

This message is automatically generated.

> Surface QueryPlan in MutationPlan
> ---------------------------------
>
> Key: PHOENIX-4342
> URL: https://issues.apache.org/jira/browse/PHOENIX-4342
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Geoffrey Jacoby
> Priority: Minor
> Attachments: PHOENIX-4342.patch
>
> For DELETE statements, it'd be good to be able to get at the QueryPlan
> through the MutationPlan so we can get more structured information at compile
> time.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236841#comment-16236841 ]

James Taylor commented on PHOENIX-4287:
---------------------------------------
+1
[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan
[ https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236840#comment-16236840 ]

James Taylor commented on PHOENIX-4342:
---------------------------------------
My preference would be to either:
- add MutationPlan.getQueryPlan(), or
- derive MutationPlan from QueryPlan (could use DelegateQueryPlan to help with that)

For the cases that don't issue a query, we could use a new EmptyQueryPlan (or null). Otherwise, we'll end up with lots of {{mutationPlan instanceof DeleteMutationPlan}} checks.
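The first option could look roughly like the sketch below. The interface and method names mirror Phoenix's (MutationPlan, QueryPlan, getQueryPlan()), but these are minimal stand-ins for illustration, not the real interfaces, which carry many more members.

```java
// Simplified stand-in for Phoenix's QueryPlan.
interface QueryPlan {
    String getTableName();
}

// Simplified stand-in for Phoenix's MutationPlan, with the proposed accessor.
// Plans that issue no query could return an EmptyQueryPlan (or null), so
// callers never need an instanceof check on the concrete plan type.
interface MutationPlan {
    QueryPlan getQueryPlan();
}

// A delete plan simply surfaces the QueryPlan that drives it.
class DeleteMutationPlan implements MutationPlan {
    private final QueryPlan dataPlan;

    DeleteMutationPlan(QueryPlan dataPlan) {
        this.dataPlan = dataPlan;
    }

    @Override
    public QueryPlan getQueryPlan() {
        return dataPlan;
    }
}
```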
[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samarth Jain updated PHOENIX-4287:
----------------------------------
Attachment: PHOENIX-4287_addendum6.patch

Updated patch with additional test on view and view index.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236809#comment-16236809 ]

James Taylor commented on PHOENIX-4287:
---------------------------------------
What about a test that turns the use stats property on/off at the view level? You could just add this to your new test:
{code}
+conn.createStatement().execute("ALTER VIEW " + view + " SET USE_STATS_FOR_PARALLELIZATION= " + !useStats);
+// query against the view
+query = "SELECT * FROM " + view;
+rs = conn.createStatement().executeQuery(query).unwrap(PhoenixResultSet.class);
+// assert query is against view
+assertEquals(view, rs.unwrap(PhoenixResultSet.class).getStatement().getQueryPlan()
+    .getTableRef().getTable().getName().getString());
+// stats are being used for parallelization. So number of scans is higher.
+assertEquals(!useStats ? 11 : 1, rs.unwrap(PhoenixResultSet.class).getStatement()
+    .getQueryPlan().getScans().get(0).size());
+
+// query against the view index
+query = "SELECT 1 FROM " + view + " WHERE B > 0";
+rs = conn.createStatement().executeQuery(query).unwrap(PhoenixResultSet.class);
+// assert query is against viewIndex
+assertEquals(viewIndex, rs.unwrap(PhoenixResultSet.class).getStatement().getQueryPlan()
+    .getTableRef().getTable().getName().getString());
+// stats are being used for parallelization. So number of scans is higher.
+assertEquals(!useStats ? 11 : 1, rs.unwrap(PhoenixResultSet.class).getStatement()
+    .getQueryPlan().getScans().get(0).size());
{code}
[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236793#comment-16236793 ]

Hudson commented on PHOENIX-3460:
---------------------------------
SUCCESS: Integrated in Jenkins build Phoenix-master #1861 (See [https://builds.apache.org/job/Phoenix-master/1861/])
PHOENIX-3460 Namespace separator : should not be allowed in table or schema name (tdsilva: rev 8f9356a2bdd6ba603158899eba38750c85e8e574)
* (edit) phoenix-core/src/test/java/org/apache/phoenix/parse/QueryParserTest.java
* (edit) phoenix-core/src/main/antlr3/PhoenixSQL.g

> Namespace separator ":" should not be allowed in table or schema name
> ----------------------------------------------------------------------
>
> Key: PHOENIX-3460
> URL: https://issues.apache.org/jira/browse/PHOENIX-3460
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.8.0
> Environment: HDP 2.5
> Reporter: Xindian Long
> Assignee: Thomas D'Silva
> Priority: Major
> Labels: namespaces, phoenix, spark
> Fix For: 4.13.0
>
> Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460-v2.patch,
> PHOENIX-3460-v2.patch, PHOENIX-3460.patch, SchemaUtil.java
>
> I am testing some code using the Phoenix Spark plugin to read a Phoenix table
> with a namespace prefix in the table name (the table is created as a Phoenix
> table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there, because I can query it using plain Phoenix SQL
> through Squirrel. In addition, using Spark SQL to query it causes no problem
> at all.
> I am running on the HDP 2.5 platform, with Phoenix 4.7.0.2.5.0.0-1245.
> The problem does not exist at all when I run the same code on an HDP 2.4
> cluster, with Phoenix 4.4.
> Nor does the problem occur when I query a table without a namespace
> prefix in the DB table name, on HDP 2.5.
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The odd thing is that in the attached code, if I run testSpark alone it gives
> the above exception, but if I run testJdbc first, followed by testSpark,
> both of them work.
> After changing to create the table using
> create table ACME.ENDPOINT_STATUS
> the phoenix-spark plugin seems to work. I also found some odd behavior:
> if I do both of the following,
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> both tables show up in Phoenix, the first one with schema ACME and table
> name ENDPOINT_STATUS, and the latter with no schema and table name
> ACME:ENDPOINT_STATUS.
> However, in HBase I only see one table, ACME:ENDPOINT_STATUS. In addition,
> upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and
> vice versa.
[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
[ https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236759#comment-16236759 ] Geoffrey Jacoby commented on PHOENIX-4344: -- Some thoughts, [~jamestaylor]: I want this to be usable for generic DELETE queries without the need for hand-written DBWritable subclasses. MapReduce goes line by line, rather than by Mapper Task/Scan, so while the client would be issuing a broad DELETE query, the mapper itself would either be: 1. Issuing point DELETE Phoenix queries by the complete primary key derived from a SELECT the MapReduce is iterating over (Mapper), OR 2. Issuing DELETE mutations down to several HTables via MultiHFileOutputFormat from a DELETE the MapReduce is iterating over (Mapper). FormatToBytesWritableMapper relies heavily on a LineParser interface, and the only choices appear to be CsvLineParser, JsonLineParser, and RegexLineParser. That means that in either case the complete row key would have to be built by a new ResultSetLineParser that can take in a ResultSet and parse it into an intermediate form suitable for making either DELETE DML (Option 1) or Delete Mutations (Option 2). The former would just need to grab the row key components, while the latter would potentially need everything, because an index can be on any column. Also, either way, we need a concrete generalized subclass of the abstract DBWritable. Option 1 seems considerably simpler/higher level, while Option 2 seems more efficient. > MapReduce Delete Support > > > Key: PHOENIX-4344 > URL: https://issues.apache.org/jira/browse/PHOENIX-4344 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.12.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > > Phoenix already has the ability to use MapReduce for asynchronous handling of > long-running SELECTs. 
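As a rough illustration of Option 1, the per-row work of the mapper reduces to building a point DELETE parameterized on the full primary key. The class and method names below are invented for this sketch and are not part of the proposed patch:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of Option 1's per-row work: turn the primary-key
// columns of the driving SELECT into a parameterized point DELETE.
class PointDeleteBuilder {
    static String buildDelete(String table, List<String> pkColumns) {
        String where = pkColumns.stream()
                .map(c -> c + " = ?") // one bind variable per PK component
                .collect(Collectors.joining(" AND "));
        return "DELETE FROM " + table + " WHERE " + where;
    }
}
```

Each mapper row would then bind the PK values from the SELECT's ResultSet into the statement and execute it.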
It would be really useful to have this capability for > long-running DELETEs, particularly of tables with indexes where using HBase's > own MapReduce integration would be prohibitively complicated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4287: -- Attachment: PHOENIX-4287_addendum5.patch Thanks for the code snippet, [~jamestaylor]. Attached is the addendum along with a test. > Incorrect aggregate query results when stats are disable for parallelization > > > Key: PHOENIX-4287 > URL: https://issues.apache.org/jira/browse/PHOENIX-4287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 > Environment: HBase 1.3.1 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Labels: localIndex > Fix For: 4.13.0, 4.12.1 > > Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, > PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, > PHOENIX-4287_addendum4.patch, PHOENIX-4287_addendum5.patch, > PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, > PHOENIX-4287_v4.patch > > > With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query > returns incorrect results when stats are available. 
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +----------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | PLAN                                                                                   | EST_BYTES_READ | EST_ROWS_READ | EST_INFO  |
> +----------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1]  | 625043899      | 332170        | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                        | 625043899      | 332170        | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                       | 625043899      | 332170        | 150792825 |
> +----------------------------------------------------------------------------------------+----------------+---------------+-----------+
> select count(*) from TABLE_T;
> +----------+
> | COUNT(1) |
> +----------+
> | 0        |
> +----------+
> {noformat}
> Using data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------------------------------------------------------------------------------+----------------+---------------+---------------+
> | PLAN                                                                              | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
> +-----------------------------------------------------------------------------------+----------------+---------------+---------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | 438492470      | 332151        | 1507928257617 |
> | SERVER FILTER BY FIRST KEY ONLY                                                   | 438492470      | 332151        | 1507928257617 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                  | 438492470      | 332151        | 1507928257617 |
> +-----------------------------------------------------------------------------------+----------------+---------------+---------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +----------+
> | COUNT(1) |
> +----------+
> | 14       |
> +----------+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +------------------------------------------------------+----------------+---------------+-------------+
> | PLAN                                                 | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> +------------------------------------------------------+----------------+---------------+-------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null           | null          | null        |
> | SERVER FILTER BY FIRST KEY ONLY                      | null           | null          | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null           | null          | null        |
> +------------------------------------------------------+----------------+---------------+-------------+
> select /*+NO_INDEX*/ count(*) from
[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236705#comment-16236705 ] Hadoop QA commented on PHOENIX-3460: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12895480/PHOENIX-3460-v2.patch against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61. ATTACHMENT ID: 12895480 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. The patch failed these unit tests: ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.PartialIndexRebuilderIT Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1607//testReport/ Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1607//console This message is automatically generated. 
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236638#comment-16236638 ] James Taylor commented on PHOENIX-4333: --- Two other corner cases: # Handle the case where there's a single region. In that case, we can use the time estimate from the single row we have in the gps table. # Handle the case where there's a guidepost in the first region, but it's *before* the startKey. We'll need to tweak this loop to stop slightly sooner (when we're past the start key of the first region) so we know if there's a guidepost in the first region. If we enter the loop, then we have a gps for that region. Note too there are a couple of minor changes here that make sense to make, such as setting intersectWithGuidePosts and not checking the key length each time through the loop since it's not changing.
{code}
int startRegionIndex = regionIndex;
boolean gpsForFirstRegion = false;
try {
    if (gpsSize > 0) {
        stream = new ByteArrayInputStream(guidePosts.get(), guidePosts.getOffset(), guidePosts.getLength());
        input = new DataInputStream(stream);
        decoder = new PrefixByteDecoder(gps.getMaxLength());
        try {
            ImmutableBytesWritable firstRegionStartKey =
                new ImmutableBytesWritable(regionLocations.get(regionIndex).getRegionInfo().getStartKey());
            if (firstRegionStartKey.getLength() > 0) {
                // Walk guideposts until we're past the first region start key
                while (firstRegionStartKey.compareTo(currentGuidePost = PrefixByteCodec.decode(decoder, input)) >= 0) {
                    gpsForFirstRegion = true;
                    minGuidePostTimestamp = Math.min(estimateTs, gps.getGuidePostTimestamps()[guideIndex]);
                    guideIndex++;
                }
                // Continue walking guideposts until we get past the currentKey
                while (currentKey.compareTo(currentGuidePost = PrefixByteCodec.decode(decoder, input)) >= 0) {
                    minGuidePostTimestamp = Math.min(estimateTs, gps.getGuidePostTimestamps()[guideIndex]);
                    guideIndex++;
                }
            }
        } catch (EOFException e) {
            // expected. Thrown when we have decoded all guide posts.
            intersectWithGuidePosts = false;
        }
    }
{code}
# Then we'll want to consider {{gpsForFirstRegion}} in our setting of {{gpsAvailableForAllRegions}}. This would be necessary if the currentKey (i.e. the start key) is after the gps, but before the endKey.
{code}
// We have a guide post in the region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the
// first guide post in that region).
gpsAvailableForAllRegions &=
    currentKeyBytes != initialKeyBytes ||
    (gpsForFirstRegion && regionIndex == startRegionIndex) ||
    (endKey == stopKey &&            // If not comparing against region boundary
        (endRegionKey.length == 0 || // then check if gp is in the region
         currentGuidePost.compareTo(endRegionKey) < 0));
{code}
> Stats - Incorrect estimate when stats are updated on a tenant specific view > --- > > Key: PHOENIX-4333 > URL: https://issues.apache.org/jira/browse/PHOENIX-4333 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, > PHOENIX-4333_v2.patch > > > Consider two tenants A, B with tenant specific view on 2 separate > regions/region servers. > {noformat} > Region 1 keys: > A,1 > A,2 > B,1 > Region 2 keys: > B,2 > B,3 > {noformat} > When stats are updated on tenant A view. Querying stats on tenant B view > yield partial results (only contains stats for B,1) which are incorrect even > though it shows updated timestamp as current. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
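Stripped of Phoenix types, the first-region walk in the comment above boils down to checking whether any guidepost sorts at or below the first region's start key. A self-contained model of that flag (illustrative names, not Phoenix's API; note the sketch uses signed byte comparison, while HBase compares bytes unsigned):

```java
import java.util.Arrays;
import java.util.List;

// Minimal stand-alone model of the gpsForFirstRegion flag: true iff at
// least one of the sorted guideposts is <= the first region's start key.
class GuidePostWalk {
    static boolean gpsForFirstRegion(List<byte[]> sortedGuidePosts, byte[] firstRegionStartKey) {
        if (firstRegionStartKey.length == 0) {
            return false; // first region starts at the table start; no walk needed
        }
        for (byte[] gp : sortedGuidePosts) {
            // signed comparison for the sketch; HBase's Bytes.compareTo is unsigned
            if (Arrays.compare(firstRegionStartKey, gp) >= 0) {
                return true; // found a guidepost at or before the start key
            }
        }
        return false;
    }
}
```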
[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property
[ https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236635#comment-16236635 ] Karan Mehta commented on PHOENIX-4328: -- bq. Would it be reasonable just to override the client configuration with the value instead of throwing an exception about inconsistent namespace mapping property? We can't do that, since the properties are read from {{ConnectionQueryServicesImpl}}, whose {{properties}} object is an instance of class {{ReadOnlyProps}}. As [~jamestaylor] suggested, one potential option is to create a {{DelegateHTableInterface}}, since every call to {{SchemaUtil#getPhysicalName()}} is used for getting an {{HTableInterface}}. This happens in the method {{CQSI#getTable()}}. This delegate class can have retry logic built in for certain types of tables (for example, SYSTEM tables). For the rest, we can bubble up the exception. We need to look in detail at potential corner cases here and how they will affect the server and client side. [~twdsi...@gmail.com] FYI. > Support clients having different "phoenix.schema.mapSystemTablesToNamespace" > property > - > > Key: PHOENIX-4328 > URL: https://issues.apache.org/jira/browse/PHOENIX-4328 > Project: Phoenix > Issue Type: Bug >Reporter: Karan Mehta >Priority: Major > Labels: namespaces > Fix For: 4.13.0 > > > Imagine a scenario when we enable namespaces for phoenix on the server side > and set the property {{phoenix.schema.isNamespaceMappingEnabled}} to true. A > bunch of clients are trying to connect to this cluster. All of these clients > have > {{phoenix.schema.isNamespaceMappingEnabled}} to true, however > for some of them {{phoenix.schema.isNamespaceMappingEnabled}} is set to > false and it is true for others. (A typical case for rolling upgrade.) > The first client with {{phoenix.schema.mapSystemTablesToNamespace}} true will > acquire lock in SYSMUTEX and migrate the system tables. As soon as this > happens, all the other clients will start failing. 
> There are two scenarios here. > 1. A new client trying to connect to server without this property set > This will fail since the ConnectionQueryServicesImpl checks if SYSCAT is > namespace mapped or not, If there is a mismatch, it throws an exception, thus > the client doesn't get any connection. > 2. Clients already connected to cluster but don't have this property set > This will fail because every query calls the endpoint coprocessor on SYSCAT > to determine the PTable of the query table and the physical HBase table name > is resolved based on the properties. Thus, we try to call the method on > SYSCAT instead of SYS:CAT and it results in a TableNotFoundException. > This JIRA is to discuss about the potential ways in which we can handle this > issue. > Some ideas around this after discussing with [~twdsi...@gmail.com]: > 1. Build retry logic around the code that works with SYSTEM tables > (coprocessor calls etc.) Try with SYSCAT and if it fails, try with SYS:CAT > Cons: Difficult to maintain and code scattered all over. > 2. Use SchemaUtil.getPhyscialTableName method to return the table name that > actually exists. (Only for SYSTEM tables) > Call admin.tableExists to determine if SYSCAT or SYS:CAT exists and return > that name. The client properties get ignored on this one. > Cons: Expensive call every time, since this method is always called several > times. > [~jamestaylor] [~elserj] [~an...@apache.org] [~apurtell] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
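Idea 2 above (resolve whichever physical SYSTEM table actually exists) can be sketched stand-alone. {{SystemTableResolver}} and its helper are invented names, and the {{Predicate}} stands in for an HBase {{admin.tableExists()}} check:

```java
import java.util.function.Predicate;

// Hedged sketch of idea 2: prefer the namespace-mapped physical name
// (SYSTEM:CATALOG) when it exists, otherwise fall back to SYSTEM.CATALOG.
// Names are illustrative, not Phoenix's real API.
class SystemTableResolver {
    // SYSTEM.CATALOG -> SYSTEM:CATALOG (namespace-mapped physical form)
    static String toNamespaceMapped(String tableName) {
        return tableName.replaceFirst("\\.", ":");
    }

    static String resolve(String tableName, Predicate<String> tableExists) {
        String mapped = toNamespaceMapped(tableName);
        return tableExists.test(mapped) ? mapped : tableName;
    }
}
```

The cost concern in the description still applies: an existence check per resolution is expensive unless the result is cached, e.g. in an instance variable set once during checkClientServerCompatibility.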
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236589#comment-16236589 ] James Taylor commented on PHOENIX-4287: --- Something like this function in BaseResultIterators, [~samarthjain]:
{code}
private boolean useStatsForParallelization(StatementContext context, PTable table) {
    Boolean useStats = table.useStatsForParallelization();
    if (useStats != null) {
        return useStats;
    }
    if (table.getType() == PTableType.INDEX) {
        PhoenixConnection conn = context.getConnection();
        String parentTableName = table.getParentName().getString();
        try {
            PTable parentTable = conn.getTable(new PTableKey(conn.getTenantId(), parentTableName));
            useStats = parentTable.useStatsForParallelization();
            if (useStats != null) {
                return useStats;
            }
        } catch (TableNotFoundException e) {
            Log.warn("Unable to find parent table \"" + parentTableName + "\" of table \""
                + table.getName().getString() + "\" to determine USE_STATS_FOR_PARALLELIZATION", e);
        }
    }
    return context.getConnection().getQueryServices().getConfiguration().getBoolean(
        USE_STATS_FOR_PARALLELIZATION, DEFAULT_USE_STATS_FOR_PARALLELIZATION);
}
{code}
Please add a test too. > Incorrect aggregate query results when stats are disable for parallelization > > > Key: PHOENIX-4287 > URL: https://issues.apache.org/jira/browse/PHOENIX-4287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 > Environment: HBase 1.3.1 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Labels: localIndex > Fix For: 4.13.0, 4.12.1 > > Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, > PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, > PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, > PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch > > > With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query > returns incorrect results when stats are available. 
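The lookup order in the function above reduces to a three-level precedence, sketched here stand-alone (illustrative names, not Phoenix's API):

```java
// Table-level setting wins, then the index's parent table, then the
// cluster config default for USE_STATS_FOR_PARALLELIZATION.
class StatsParallelizationPolicy {
    static boolean useStats(Boolean tableValue, Boolean parentValue, boolean configDefault) {
        if (tableValue != null) return tableValue;   // set on the table/index itself
        if (parentValue != null) return parentValue; // inherited by an index from its parent
        return configDefault;                        // phoenix.use.stats.parallelization default
    }
}
```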
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236580#comment-16236580 ] Samarth Jain commented on PHOENIX-4287: --- Yes, that's correct. Will change the patch to fetch the property from the base table. > Incorrect aggregate query results when stats are disable for parallelization > > > Key: PHOENIX-4287 > URL: https://issues.apache.org/jira/browse/PHOENIX-4287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 > Environment: HBase 1.3.1 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Labels: localIndex > Fix For: 4.13.0, 4.12.1 > > Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, > PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, > PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, > PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch > > > With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query > returns incorrect results when stats are available. 
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236574#comment-16236574 ] Mujtaba Chohan commented on PHOENIX-4287: - Alter on index leads to: {noformat} ERROR 1010 (42M01): Not allowed to mutate table. {noformat} > Incorrect aggregate query results when stats are disable for parallelization > > > Key: PHOENIX-4287 > URL: https://issues.apache.org/jira/browse/PHOENIX-4287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 > Environment: HBase 1.3.1 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Labels: localIndex > Fix For: 4.13.0, 4.12.1 > > Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, > PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, > PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, > PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch > > > With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query > returns incorrect results when stats are available. 
[jira] [Updated] (PHOENIX-4342) Surface QueryPlan in MutationPlan
[ https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Jacoby updated PHOENIX-4342: - Attachment: PHOENIX-4342.patch First cut at this patch. 1. Created "DeleteMutationPlan" which extends MutationPlan and adds a getQueryPlan method 2. Refactored all the anonymous delete MutationPlan classes into real inner classes implementing DeleteMutationPlan 3. Changed DeleteCompiler.compile to return a DeleteMutationPlan. If all MutationPlans contain a QueryPlan, we can dispense with DeleteMutationPlan and just add a getQueryPlan method to the base interface, but I didn't think that was the case. [~jamestaylor], fyi. > Surface QueryPlan in MutationPlan > - > > Key: PHOENIX-4342 > URL: https://issues.apache.org/jira/browse/PHOENIX-4342 > Project: Phoenix > Issue Type: Improvement >Reporter: James Taylor >Assignee: Geoffrey Jacoby >Priority: Minor > Attachments: PHOENIX-4342.patch > > > For DELETE statements, it'd be good to be able to get at the QueryPlan > through the MutationPlan so we can get more structured information at compile > time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
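The shape of the refactor can be sketched with stand-in interfaces (not Phoenix's real ones; bodies here are no-op placeholders):

```java
// Illustrative stand-ins for the types named in the patch notes:
// DeleteMutationPlan layers a getQueryPlan() accessor onto MutationPlan.
class DeleteMutationPlanSketch {
    interface QueryPlan { String explain(); }
    interface MutationPlan { long execute(); }
    interface DeleteMutationPlan extends MutationPlan {
        QueryPlan getQueryPlan(); // the accessor the patch surfaces
    }

    // Minimal concrete plan wiring a query plan into a delete plan.
    static DeleteMutationPlan planFor(String table) {
        QueryPlan qp = () -> "FULL SCAN OVER " + table;
        return new DeleteMutationPlan() {
            public long execute() { return 0L; } // no-op in this sketch
            public QueryPlan getQueryPlan() { return qp; }
        };
    }
}
```

Callers can then inspect the driving query at compile time via getQueryPlan() instead of treating the delete plan as opaque.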
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236557#comment-16236557 ] James Taylor commented on PHOENIX-4287: --- I thought we didn’t have a way of setting properties on indexes? > Incorrect aggregate query results when stats are disable for parallelization > > > Key: PHOENIX-4287 > URL: https://issues.apache.org/jira/browse/PHOENIX-4287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 > Environment: HBase 1.3.1 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Labels: localIndex > Fix For: 4.13.0, 4.12.1 > > Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, > PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, > PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, > PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch > > > With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query > returns incorrect results when stats are available. 
[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236547#comment-16236547 ] Hadoop QA commented on PHOENIX-3460: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12895471/PHOENIX-3460.patch against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61. ATTACHMENT ID: 12895471 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1606//testReport/ Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1606//console This message is automatically generated. > Namespace separator ":" should not be allowed in table or schema name > - > > Key: PHOENIX-3460 > URL: https://issues.apache.org/jira/browse/PHOENIX-3460 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 > Environment: HDP 2.5 >Reporter: Xindian Long >Assignee: Thomas D'Silva >Priority: Major > Labels: namespaces, phoenix, spark > Fix For: 4.13.0 > > Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460-v2.patch, > PHOENIX-3460-v2.patch, PHOENIX-3460.patch, SchemaUtil.java > > > I am testing some code using Phoenix Spark plug in to read a Phoenix table > with a namespace prefix in the table name (the table is created as a phoenix > table not a hbase table), but it returns an TableNotFoundException. 
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236544#comment-16236544 ] Samarth Jain commented on PHOENIX-4287: --- USE_STATS_FOR_PARALLELIZATION can be set at an index/view/base table level. For an index to use parallelization, you need to set USE_STATS_FOR_PARALLELIZATION = true on it; otherwise the default value will be used (which in your case is false).
> Incorrect aggregate query results when stats are disable for parallelization
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-4287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4287
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: HBase 1.3.1
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>            Priority: Major
>              Labels: localIndex
>             Fix For: 4.13.0, 4.12.1
>
>         Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, an aggregate query returns incorrect results when stats are available.
> With a local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> | PLAN                                                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO  |
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899      | 332170        | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899      | 332170        | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899      | 332170        | 150792825 |
>
> select count(*) from TABLE_T;
> | COUNT(1) |
> | 0        |
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                                              | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | 438492470      | 332151        | 1507928257617 |
> | SERVER FILTER BY FIRST KEY ONLY                                                   | 438492470      | 332151        | 1507928257617 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                  | 438492470      | 332151        | 1507928257617 |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> | COUNT(1) |
> | 14       |
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null           | null          | null        |
> | SERVER FILTER BY FIRST KEY ONLY                       | null           | null          | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                      | null           | null          | null        |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> | COUNT(1) |
> | 27       |
> {noformat}
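Samarth's point above — a table-level USE_STATS_FOR_PARALLELIZATION value overrides the global {{phoenix.use.stats.parallelization}} default — can be sketched as follows. This is a minimal illustration only; the class and method names are hypothetical, not actual Phoenix code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the property precedence: a per-table USE_STATS_FOR_PARALLELIZATION
// value wins over the global phoenix.use.stats.parallelization default.
public class UseStatsPrecedence {
    // Global default, e.g. phoenix.use.stats.parallelization=false
    static final boolean GLOBAL_DEFAULT = false;

    // table name -> table-level property value, absent if never set
    static final Map<String, Boolean> tableLevel = new HashMap<>();

    static boolean useStatsForParallelization(String table) {
        // A table-level value overrides the global default when present.
        Boolean b = tableLevel.get(table);
        return (b != null) ? b : GLOBAL_DEFAULT;
    }

    public static void main(String[] args) {
        tableLevel.put("IDX", true); // index explicitly opted in
        System.out.println(useStatsForParallelization("IDX")); // table-level value
        System.out.println(useStatsForParallelization("T"));   // global default
    }
}
```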
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236538#comment-16236538 ] Samarth Jain commented on PHOENIX-4287: --- Just got back. Taking a look.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236516#comment-16236516 ] James Taylor commented on PHOENIX-4287: --- [~mujtabachohan] - can you confirm whether or not the index has stats collected for it? [~samarthjain] - you'll need to check the right table if an index is being used (and with some special logic for an index on a view and local index). You need to trace back to the physical parent table.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236498#comment-16236498 ] Mujtaba Chohan commented on PHOENIX-4287: --- Just figured that out as well :) That index was created with a previous version and it had {{USE_STATS_FOR_PARALLELIZATION}} set to true, causing the index to use parallelization. Here's a case that still doesn't work with a table created with the latest version: Global {{USE_STATS_FOR_PARALLELIZATION}} is set to *false*. Table and global index are created without setting parallelization. Table parallelization is then set to *true* via ALTER ... SET USE_STATS_FOR_PARALLELIZATION=true. Verified in SYSTEM.CATALOG as well that it is set for the base table only. Parallelization is *not* used for queries when they get executed against the *index*; parallelization is correctly used for the base table.
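Mujtaba's scenario above matches James's earlier note that the check has to trace back to the physical parent table: an index with no USE_STATS_FOR_PARALLELIZATION value of its own should inherit the base table's value rather than falling straight through to the global default. A minimal sketch of that intended fallback, with illustrative names only (not actual Phoenix code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the resolution behavior under discussion: an index without its own
// USE_STATS_FOR_PARALLELIZATION value should fall back to its physical parent
// table's value; the reported bug is that it falls back to the global default.
public class ParentFallback {
    static final boolean GLOBAL_DEFAULT = false;

    static final Map<String, Boolean> tableLevel = new HashMap<>(); // explicit per-table values
    static final Map<String, String> parentOf = new HashMap<>();    // index -> physical parent table

    static boolean useStatsForParallelization(String table) {
        Boolean own = tableLevel.get(table);
        if (own != null) {
            return own;
        }
        String parent = parentOf.get(table);
        if (parent != null && tableLevel.containsKey(parent)) {
            return tableLevel.get(parent); // inherit from the physical parent table
        }
        return GLOBAL_DEFAULT;
    }

    public static void main(String[] args) {
        parentOf.put("IDX", "T");
        tableLevel.put("T", true); // ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=true
        System.out.println(useStatsForParallelization("T"));   // base table uses its own value
        System.out.println(useStatsForParallelization("IDX")); // index inherits from T
    }
}
```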
[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236475#comment-16236475 ] James Taylor commented on PHOENIX-3460: --- +1 assuming unit tests were run and are passing.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236468#comment-16236468 ] James Taylor commented on PHOENIX-4287: --- One more question, [~mujtabachohan]. Prior versions of this patch were mistakenly writing the USE_STATS_FOR_PARALLELIZATION value into the table metadata, even when it wasn't set. Is your testing using new tables so that this doesn't impact you? You can query the SYSTEM.CATALOG directly for the table & index to see if there's a value for USE_STATS_FOR_PARALLELIZATION. If there is, this prior issue may be affecting you. If you create a new table and index and you see a value, there's definitely still an issue.
[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-3460: --- Attachment: PHOENIX-3460-v2.patch
[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-3460: --- Attachment: PHOENIX-3460-v2.patch Thanks for the view, attaching a v2 patch that uses QueryConstants.NAMESPACE_SEPARATOR
[jira] [Comment Edited] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236460#comment-16236460 ] Thomas D'Silva edited comment on PHOENIX-3460 at 11/2/17 7:36 PM: --- Thanks for the review, attaching a v2 patch that uses QueryConstants.NAMESPACE_SEPARATOR was (Author: tdsilva): Thanks for the view, attaching a v2 patch that uses QueryConstants.NAMESPACE_SEPARATOR
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236458#comment-16236458 ] James Taylor commented on PHOENIX-4287: --- So something like this:
- Set {{phoenix.use.stats.parallelization}} to false
- Create table: CREATE TABLE t (k VARCHAR PRIMARY KEY, v VARCHAR);
- Create index: CREATE INDEX idx ON t(v);
- Execute query: SELECT v FROM t WHERE v='foo';
- Confirm through the explain plan that a) the index was used, and b) the query isn't chunked up based on stats
I suppose we should also have a test that calls ALTER TABLE t SET USE_STATS_FOR_PARALLELIZATION=false and confirms that the query isn't chunked up based on stats as well. [~samarthjain] - I'm not seeing any tests around indexes for this. Did I miss them?
[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236441#comment-16236441 ] James Taylor commented on PHOENIX-3460: --- Patch looks fine, but one minor nit. It'd be a little more clear *why* we're disallowing ':' if you use the HBase constant here:
{code}
c.contains(TableName.NAMESPACE_DELIM)
{code}
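The check being reviewed amounts to rejecting table and schema names that contain HBase's namespace delimiter. A minimal standalone sketch follows; the constant is inlined here so the example compiles without HBase on the classpath, whereas the real suggestion is to reference TableName.NAMESPACE_DELIM (and the patch uses QueryConstants.NAMESPACE_SEPARATOR). The class and method names are illustrative, not actual Phoenix code.

```java
// Sketch of the validation under review: reject table or schema names that
// contain HBase's namespace delimiter ':', since such names silently alias
// an HBase namespaced table (the ACME.ENDPOINT_STATUS vs "ACME:ENDPOINT_STATUS"
// confusion described in this issue).
public class NamespaceSeparatorCheck {
    // Inlined stand-in for TableName.NAMESPACE_DELIM
    static final String NAMESPACE_DELIM = ":";

    static void validateName(String name) {
        if (name.contains(NAMESPACE_DELIM)) {
            throw new IllegalArgumentException(
                "Table or schema name must not contain '" + NAMESPACE_DELIM + "': " + name);
        }
    }

    public static void main(String[] args) {
        validateName("ENDPOINT_STATUS");          // fine: no namespace delimiter
        try {
            validateName("ACME:ENDPOINT_STATUS"); // rejected by the new check
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```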
[jira] [Comment Edited] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236431#comment-16236431 ] Mujtaba Chohan edited comment on PHOENIX-4287 at 11/2/17 7:24 PM: --- [~jamestaylor] At table creation time I didn't set USE_STATS_FOR_PARALLELIZATION and as global setting it's set to false. Later I turn it off as well with ALTER statement. Index is global. was (Author: mujtabachohan): [~jamestaylor] At table creation time I didn't set USE_STATS_FOR_PARALLELIZATION and as global setting it's set to false. Index is global.
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disabled for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236431#comment-16236431 ] Mujtaba Chohan commented on PHOENIX-4287:
-
[~jamestaylor] At table creation time I didn't set USE_STATS_FOR_PARALLELIZATION, and as a global setting it's set to false. The index is global.
[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-3460:
Fix Version/s: (was: 4.7.0) 4.13.0

> Namespace separator ":" should not be allowed in table or schema name
> ----------------------------------------------------------------------
>
> Key: PHOENIX-3460
> URL: https://issues.apache.org/jira/browse/PHOENIX-3460
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.8.0
> Environment: HDP 2.5
> Reporter: Xindian Long
> Assignee: Thomas D'Silva
> Priority: Major
> Labels: namespaces, phoenix, spark
> Fix For: 4.13.0
>
> Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460.patch, SchemaUtil.java
>
> I am testing some code that uses the Phoenix Spark plugin to read a Phoenix table with a namespace prefix in the table name (the table is created as a Phoenix table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there, because I can query it using plain Phoenix SQL through SQuirreL. In addition, using Spark SQL to query it causes no problem at all.
> I am running on the HDP 2.5 platform, with Phoenix 4.7.0.2.5.0.0-1245.
> The problem does not exist at all when I run the same code on an HDP 2.4 cluster with Phoenix 4.4.
> Neither does the problem occur when I query a table without a namespace prefix in the DB table name on HDP 2.5.
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The weird thing is that in the attached code, if I run testSpark alone it gives the above exception, but if I run testJdbc first, followed by testSpark, both of them work.
> After changing to create the table using
> create table ACME.ENDPOINT_STATUS
> the phoenix-spark plugin seems to work. I also found some weird behavior. If I do both of the following
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> both tables show up in Phoenix: the first one shows as schema ACME and table name ENDPOINT_STATUS, and the latter shows as schema none and table name ACME:ENDPOINT_STATUS.
> However, in HBase I only see one table, ACME:ENDPOINT_STATUS. In addition, upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and vice versa.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
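The collision described above can be sketched in a few lines. This is a hypothetical, simplified model (not Phoenix's actual SchemaUtil): with namespace mapping, schema and table are joined with ":" to form the physical HBase name, so {{ACME.ENDPOINT_STATUS}} and a table literally named {{"ACME:ENDPOINT_STATUS"}} land on the same HBase table — which is why the fix rejects ":" in identifiers (analogous to throwing a PhoenixParserException at parse time):

```java
// Hypothetical sketch (not Phoenix's SchemaUtil) of the name collision and
// the validation the patch adds.
class NamespaceNameCheck {
    static final char NAMESPACE_SEPARATOR = ':';

    // Physical HBase name: schema and table joined with ':' when namespace
    // mapping is enabled; otherwise the Phoenix "schema.table" form.
    static String physicalName(String schema, String table, boolean namespaceMapped) {
        if (schema == null || schema.isEmpty()) {
            return table;
        }
        return namespaceMapped
                ? schema + NAMESPACE_SEPARATOR + table
                : schema + "." + table;
    }

    // Validation sketch: reject identifiers containing the namespace separator.
    static void validateIdentifier(String name) {
        if (name.indexOf(NAMESPACE_SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                "Identifier must not contain '" + NAMESPACE_SEPARATOR + "': " + name);
        }
    }
}
```

Here {{physicalName("ACME", "ENDPOINT_STATUS", true)}} equals the quoted name {{ACME:ENDPOINT_STATUS}}, so both CREATE TABLE statements above write to one HBase table — matching the observed cross-table upserts.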
[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-3460:
Attachment: PHOENIX-3460.patch

HBase does not allow creating a table name that contains the namespace separator. We should not allow using the namespace separator in the table or schema name. Instead we should throw a PhoenixParserException. [~jamestaylor], can you please review?
[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-3460:
Summary: Namespace separator ":" should not be allowed in table or schema name (was: Phoenix Spark plugin cannot find table with a Namespace prefix)
[jira] [Assigned] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name
[ https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva reassigned PHOENIX-3460:
---
Assignee: Thomas D'Silva
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
[jira] [Comment Edited] (PHOENIX-4287) Incorrect aggregate query results when stats are disabled for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236356#comment-16236356 ] Mujtaba Chohan edited comment on PHOENIX-4287 at 11/2/17 6:39 PM:
--
[~samarthjain] With {{ALTER ... SET USE_STATS_FOR_PARALLELIZATION=false}} on the base table, and with the config also set to false globally, stats are correctly not used for parallelization when the query runs on the base table; however, for the index table they are still used. See the explain plan below. This is with https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a
{noformat}
ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=false;
explain select count(*) from T;
+-------------------------------------------------------------------------------------+----------------+---------------+---------------+
| PLAN                                                                                | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
+-------------------------------------------------------------------------------------+----------------+---------------+---------------+
| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY FULL SCAN OVER T_IDX  | 63050353       | 1161114       | 1509646993152 |
| SERVER FILTER BY FIRST KEY ONLY                                                     | 63050353       | 1161114       | 1509646993152 |
| SERVER AGGREGATE INTO SINGLE ROW                                                    | 63050353       | 1161114       | 1509646993152 |
+-------------------------------------------------------------------------------------+----------------+---------------+---------------+
{noformat}

was (Author: mujtabachohan):
[~samarthjain] With {{ALTER ... SET USE_STATS_FOR_PARALLELIZATION=false}} on the base table, and with the config also set to false globally, stats are correctly not used for parallelization when the query runs on the base table; however, for the index it is still used. See the explain plan above. This is with https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a
[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disabled for parallelization
[ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236356#comment-16236356 ] Mujtaba Chohan commented on PHOENIX-4287:
-
[~samarthjain] With {{ALTER ... SET USE_STATS_FOR_PARALLELIZATION=false}} on the base table, and with the config also set to false globally, stats are correctly not used for parallelization when the query runs on the base table; however, for the index it is still used. See the explain plan below. This is with https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a
{noformat}
ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=false;
explain select count(*) from T;
+-------------------------------------------------------------------------------------+----------------+---------------+---------------+
| PLAN                                                                                | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
+-------------------------------------------------------------------------------------+----------------+---------------+---------------+
| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY FULL SCAN OVER T_IDX  | 63050353       | 1161114       | 1509646993152 |
| SERVER FILTER BY FIRST KEY ONLY                                                     | 63050353       | 1161114       | 1509646993152 |
| SERVER AGGREGATE INTO SINGLE ROW                                                    | 63050353       | 1161114       | 1509646993152 |
+-------------------------------------------------------------------------------------+----------------+---------------+---------------+
{noformat}
[jira] [Comment Edited] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236115#comment-16236115 ] James Taylor edited comment on PHOENIX-4333 at 11/2/17 6:25 PM: We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop: {code} endRegionKey = regionInfo.getEndKey(); if (isLocalIndex) { {code} and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key. The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below). {code} // We have a guide post in the region if the above loop was entered // or if the current key is less than the region end key (since the loop // may not have been entered if our scan end key is smaller than the // first guide post in that region). gpsAvailableForAllRegions &= currentKeyBytes != initialKeyBytes || ( endKey == stopKey && // If not comparing against region boundary ( endRegionKey.length == 0 || // then check if gp is in the region currentGuidePost.compareTo(endRegionKey) < 0) ); {code} Does this not pass all of your tests? was (Author: jamestaylor): We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. 
For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop: {code} endRegionKey = regionInfo.getEndKey(); if (isLocalIndex) { {code} and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key. The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below). {code} // We have a guide post in previous region if the above loop was entered // or if the current key is less than the region end key (since the loop // may not have been entered if our scan end key is smaller than the first // guide post in that region gpsAvailableForAllRegions &= currentKeyBytes != initialKeyBytes || (endKey == stopKey && (endRegionKey.length == 0 || currentGuidePost.compareTo(endRegionKey) < 0)); {code} Does this not pass all of your tests? > Stats - Incorrect estimate when stats are updated on a tenant specific view > --- > > Key: PHOENIX-4333 > URL: https://issues.apache.org/jira/browse/PHOENIX-4333 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, > PHOENIX-4333_v2.patch > > > Consider two tenants A, B with tenant specific view on 2 separate > regions/region servers. > {noformat} > Region 1 keys: > A,1 > A,2 > B,1 > Region 2 keys: > B,2 > B,3 > {noformat} > When stats are updated on tenant A view. 
> Querying stats on tenant B view yields partial results (only contains stats for B,1), which are incorrect even though it shows the updated timestamp as current.
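The "is there a guidepost within every region?" check discussed above can be simulated independently of Phoenix. This is a simplified, hypothetical model — plain Strings stand in for Phoenix's byte[] keys, and regions are given by their sorted end keys (empty string = last region) — not the actual BaseResultIterators code:

```java
import java.util.List;

// Simplified model of the gpsAvailableForAllRegions check: true only if
// every region contains at least one guidepost.
class GuidePostCoverage {
    /**
     * regionEndKeys: end key of each region in order; "" marks the last region.
     * guidePosts: sorted guidepost keys.
     */
    static boolean gpsAvailableForAllRegions(List<String> regionEndKeys,
                                             List<String> guidePosts) {
        int gp = 0;
        String regionStart = "";
        for (String regionEnd : regionEndKeys) {
            boolean found = false;
            // Consume every guidepost strictly before this region's end key
            // (or all remaining guideposts for the unbounded last region).
            while (gp < guidePosts.size()
                    && (regionEnd.isEmpty() || guidePosts.get(gp).compareTo(regionEnd) < 0)) {
                // A guidepost in [regionStart, regionEnd) covers this region.
                if (guidePosts.get(gp).compareTo(regionStart) >= 0) {
                    found = true;
                }
                gp++;
            }
            if (!found) {
                return false;
            }
            regionStart = regionEnd;
        }
        return true;
    }
}
```

With the example from this issue — region 1 holding A,1 / A,2 / B,1 and region 2 holding B,2 / B,3 — stats collected only through tenant A's view leave region 2 without a guidepost, so the check correctly reports incomplete coverage and the estimate should not be trusted.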
[jira] [Comment Edited] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236115#comment-16236115 ] James Taylor edited comment on PHOENIX-4333 at 11/2/17 6:19 PM: We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop: {code} endRegionKey = regionInfo.getEndKey(); if (isLocalIndex) { {code} and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key. The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below). {code} // We have a guide post in previous region if the above loop was entered // or if the current key is less than the region end key (since the loop // may not have been entered if our scan end key is smaller than the first // guide post in that region gpsAvailableForAllRegions &= currentKeyBytes != initialKeyBytes || (endKey == stopKey && (endRegionKey.length == 0 || currentGuidePost.compareTo(endRegionKey) < 0)); {code} Does this not pass all of your tests? was (Author: jamestaylor): We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. 
I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key. The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the first
// guide post in that region
hasGuidePostInAllRegions &= currentKeyBytes != initialKeyBytes ||
    (endKey == stopKey && currentGuidePost.compareTo(endRegionKey) < 0);
{code}
Does this not pass all of your tests?
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236313#comment-16236313 ] James Taylor commented on PHOENIX-4333:
---
Also, looking at ExplainPlanWithStatsEnabledIT.testSelectQueriesWithFilters(), the region boundaries are not going to intersect as expected with the guideposts, since the split points are using raw bytes which won't have the sign bit flipped. Below is what you want to do, as Phoenix will do the right thing in that case wrt data types. Some other tests need to be changed as well - I'd recommend just always having the SPLIT clause in the CREATE TABLE statement as it's just more clear.
{code}
private void testSelectQueriesWithFilters(boolean useStatsForParallelization) throws Exception {
    String tableName = generateUniqueName();
    try (Connection conn = DriverManager.getConnection(getUrl())) {
        int guidePostWidth = 20;
        String ddl =
                "CREATE TABLE " + tableName + " (k INTEGER PRIMARY KEY, a bigint, b bigint) "
                        + " GUIDE_POSTS_WIDTH=" + guidePostWidth
                        + ", USE_STATS_FOR_PARALLELIZATION=" + useStatsForParallelization
                        + " SPLIT ON (102,105,108)";
        conn.createStatement().execute(ddl);
        conn.createStatement().execute("upsert into " + tableName + " values (100,100,3)");
{code}
[jira] [Commented] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios
[ https://issues.apache.org/jira/browse/PHOENIX-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236122#comment-16236122 ] Hadoop QA commented on PHOENIX-4346: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12895377/PHOENIX-4346.patch against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61. ATTACHMENT ID: 12895377 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. The patch failed these unit tests: ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.monitoring.PhoenixMetricsIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ColumnEncodedMutableTxStatsCollectorIT Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1604//testReport/ Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1604//console This message is automatically generated. > Add support for UNSIGNED_LONG type in Pherf scenarios > - > > Key: PHOENIX-4346 > URL: https://issues.apache.org/jira/browse/PHOENIX-4346 > Project: Phoenix > Issue Type: Improvement >Reporter: Monani Mihir >Priority: Minor > Attachments: PHOENIX-4346.patch > > > Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would > be good to have UNSIGNED_LONG available for Pherf scenarios. 
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236115#comment-16236115 ] James Taylor commented on PHOENIX-4333: --- We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop: {code} endRegionKey = regionInfo.getEndKey(); if (isLocalIndex) { {code} and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key. The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below). {code} // We have a guide post in previous region if the above loop was entered // or if the current key is less than the region end key (since the loop // may not have been entered if our scan end key is smaller than the first // guide post in that region) hasGuidePostInAllRegions &= currentKeyBytes != initialKeyBytes || (endKey == stopKey && currentGuidePost.compareTo(endRegionKey) < 0); {code} Does this not pass all of your tests? 
> Stats - Incorrect estimate when stats are updated on a tenant specific view > --- > > Key: PHOENIX-4333 > URL: https://issues.apache.org/jira/browse/PHOENIX-4333 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, > PHOENIX-4333_v2.patch > > > Consider two tenants A, B with tenant specific view on 2 separate > regions/region servers. > {noformat} > Region 1 keys: > A,1 > A,2 > B,1 > Region 2 keys: > B,2 > B,3 > {noformat} > When stats are updated on tenant A view, querying stats on tenant B view > yields partial results (containing stats only for B,1) which are incorrect even > though the updated timestamp shows as current. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
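The invariant James describes above — a guidepost present within every region, independent of whether one intersects the scan — can be sketched over sorted keys roughly as follows. This is a hypothetical helper, not the actual BaseResultIterators code; plain integers stand in for byte-comparable row keys, and the end-key comparison mirrors `currentGuidePost.compareTo(endRegionKey) < 0`.

```java
import java.util.List;

public class GuidePostCheck {
    /**
     * Returns true iff every [startKey, endKey) region contains at least
     * one guidepost. Both lists must be sorted ascending; integers stand
     * in for byte-comparable row keys.
     */
    static boolean hasGuidePostInAllRegions(List<Integer> guidePosts,
                                            List<int[]> regions) {
        int i = 0;
        for (int[] region : regions) {
            // Skip guideposts at or before this region's start key; any
            // remaining guidepost is already bigger than the start key.
            while (i < guidePosts.size() && guidePosts.get(i) <= region[0]) {
                i++;
            }
            // The next guidepost must fall before the region's end key.
            if (i == guidePosts.size() || guidePosts.get(i) >= region[1]) {
                return false;
            }
        }
        return true;
    }
}
```

In the issue's example, stats collected only on tenant A's view leave region 2 (keys B,2 and B,3) without any guidepost, so a check of this shape would report the stats as unusable for tenant B rather than returning a partial estimate.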
[jira] [Commented] (PHOENIX-4332) Indexes should inherit guide post width of the base data table
[ https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236045#comment-16236045 ] Hudson commented on PHOENIX-4332: - FAILURE: Integrated in Jenkins build Phoenix-master #1860 (See [https://builds.apache.org/job/Phoenix-master/1860/]) PHOENIX-4332 Indexes should inherit guide post width of the base data (samarth: rev 61684c4431d16deff53adfbb91ea76c13642df61) * (add) phoenix-core/src/it/java/org/apache/phoenix/schema/stats/StatsCollectorIT.java * (delete) phoenix-core/src/it/java/org/apache/phoenix/end2end/StatsCollectorIT.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedMutableNonTxStatsCollectorIT.java * (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/stats/DefaultStatisticsCollector.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/NonColumnEncodedImmutableNonTxStatsCollectorIT.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedImmutableTxStatsCollectorIT.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedImmutableNonTxStatsCollectorIT.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedMutableTxStatsCollectorIT.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/SysTableNamespaceMappedStatsCollectorIT.java * (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/NonColumnEncodedImmutableTxStatsCollectorIT.java > Indexes should inherit guide post width of the base data table > -- > > Key: PHOENIX-4332 > URL: https://issues.apache.org/jira/browse/PHOENIX-4332 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 >Reporter: Mujtaba Chohan >Assignee: Samarth Jain >Priority: Major > Attachments: PHOENIX-4332.patch > > > Altering guide post width on the data table does not propagate to the global index > using the {{ALTER TABLE}} command. > Altering the global index table runs into a not-allowed error. 
> {noformat} > ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1; > Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop > column referenced by VIEW columnName=IDX (state=42M01,code=1010) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
[ https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235907#comment-16235907 ] Josh Mahonin commented on PHOENIX-4347: --- Can you post this question to the phoenix-users mailing list? I suspect someone may have run into this and found a way to do it already. However, if you're able to provide a reproducible unit test in PhoenixSparkIT [ 1 ] which necessitates a patch, a contribution would be most welcome. Thanks! [ 1 ] https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala > Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue > > > Key: PHOENIX-4347 > URL: https://issues.apache.org/jira/browse/PHOENIX-4347 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.11.0 > Environment: CentOS 6.5, Fedora 25 >Reporter: Lokesh Kumar >Priority: Major > Labels: phoenix, spark-sql > > Created a Phoenix table with below schema: > {code:java} > CREATE TABLE IF NOT EXISTS sample_table ( > id VARCHAR NOT NULL, > metricid VARCHAR NOT NULL, > timestamp TIMESTAMP NOT NULL, > metricvalue DOUBLE, > CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; > {code} > Inserted some data into this and loaded as Spark Dataset using the Phoenix > spark datasource ('org.apache.phoenix.spark') options. > The Spark Dataset's schema is as given below: > root > |-- ID: string (nullable = true) > |-- METRICID: string (nullable = true) > |-- TIMESTAMP: timestamp (nullable = true) > |-- METRICVALUE: double (nullable = true) > I apply the Dataset's filter operation on Timestamp column as given below: > {code:java} > Dataset ds = > ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)") > {code} > This operation throws me an exception as: > testPhoenixTimestamp(DatasetTest): > org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): > Syntax error. Mismatched input. 
Expecting "RPAREN", got "00" at line 1, > column 145. > The generated query looks like this: > {code:java} > 2017-11-02 15:29:31,722 INFO [main] > org.apache.phoenix.mapreduce.PhoenixInputFormat > Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM > SAMPLE_TABLE WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 > 00:00:00.0) > {code} > The issue is with Timestamp filter condition, where the timestamp value is > not wrapped in to_timestamp() function. > I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class > compileValue() function, by checking the value's class. If it is > java.sql.Timestamp then I am wrapping the value with to_timestamp() function. > Please let me know if there is another way of correctly querying Timestamp > values in Phoenix through Spark's Dataset API. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
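The fix the reporter describes — special-casing java.sql.Timestamp in compileValue() so that pushed-down filter literals are wrapped in Phoenix's TO_TIMESTAMP() function — might look roughly like the sketch below. This is illustrative Java; the real PhoenixRelation.compileValue is Scala and handles more cases, and the class/method names here are only stand-ins.

```java
import java.sql.Timestamp;

public class FilterValueCompiler {
    // Renders a filter value as a Phoenix SQL literal. Without the
    // Timestamp branch, a value like 2017-10-31 00:00:00.0 would be
    // emitted unquoted and trip the parser ("Expecting RPAREN, got 00").
    static String compileValue(Object value) {
        if (value instanceof Timestamp) {
            // Timestamp.toString() yields "yyyy-mm-dd hh:mm:ss.f...",
            // which TO_TIMESTAMP('...') parses as a timestamp literal.
            return "TO_TIMESTAMP('" + value + "')";
        }
        if (value instanceof String) {
            return "'" + value + "'";
        }
        return String.valueOf(value);
    }
}
```

With such a branch in place, the pushed-down predicate becomes `"TIMESTAMP" >= TO_TIMESTAMP('2017-10-31 00:00:00.0')`, which Phoenix parses correctly.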
[jira] [Commented] (PHOENIX-4303) Replace HTableInterface,HConnection with Table,Connection interfaces respectively
[ https://issues.apache.org/jira/browse/PHOENIX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235892#comment-16235892 ] Hadoop QA commented on PHOENIX-4303: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12895387/PHOENIX-4303.patch against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61. ATTACHMENT ID: 12895387 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 15 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1605//console This message is automatically generated. > Replace HTableInterface,HConnection with Table,Connection interfaces > respectively > - > > Key: PHOENIX-4303 > URL: https://issues.apache.org/jira/browse/PHOENIX-4303 > Project: Phoenix > Issue Type: Sub-task >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla >Priority: Major > Labels: HBase-2.0 > Fix For: 4.13.0 > > Attachments: PHOENIX-4303.patch > > > In latest versions of HBase HTableInterface,HConnection are replaced with > Table and Connection respectively. We can make use of new interfaces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: [VOTE] First hbase-2.0.0-alpha4 Release Candidate is available
+1 (non-binding), found 1 issue. Checked signatures, sums - OK Built from source tar and git tag (Oracle JDK 1.8.0_151, Maven 3.5.2) - OK Rat check - OK Starting standalone server from bin tar - OK LTT with 1M rows - OK LICENSE, NOTICE - OK Problem: Starting standalone server after building from src tar fails: same problem that Guanghao Zhang had. HBASE-18705 On Wed, Nov 1, 2017 at 3:17 PM, Stack wrote: > The first release candidate for HBase 2.0.0-alpha4 is up at: > > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-alpha4RC0/ > > Maven artifacts are available from a staging directory here: > > https://repository.apache.org/content/repositories/orgapachehbase-1178 > > All was signed with my key at 8ACC93D2 [1] > > I tagged the RC as 2.0.0-alpha4RC0 > (5c4b985f89c99cc8b0f8515a4097c811a0848835) > > hbase-2.0.0-alpha4 is our fourth alpha release along our march toward > hbase-2.0.0. It includes all that was in previous alphas (new assignment > manager, offheap read/write path, in-memory compactions, etc.), but had a > focus on "Coprocessor Fixup": We no longer pass Coprocessors > InterfaceAudience.Private parameters and we cut down on the access and > ability to influence hbase core processing (See [2] on why the radical > changes in Coprocessor Interface). If you are a Coprocessor developer or > have Coprocessors to deploy on hbase-2.0.0, we need to hear about your > experience now before we make an hbase-2.0.0 beta. > > hbase-2.0.0-alpha4 is a rough cut ('alpha'), not-for-production preview of > what hbase-2.0.0 will look like. It is meant for devs and downstreamers to > test drive and flag us early if we messed up anything ahead of our rolling > GAs. > > The list of features addressed in 2.0.0 so far can be found here [3]. There > are thousands. The list of ~2k+ fixes in 2.0.0 exclusively can be found > here [4] (My JIRA JQL foo is a bit dodgy -- forgive me if mistakes). > > I've updated our overview doc. on the state of 2.0.0 [6]. 
2.0.0-beta-1 will > be our next release. Its theme is the "Finishing up 2.0.0" release. Here is > the list of what we have targeted for beta-1 [5]. Check it out. Shout if > there is anything missing. We may do a 2.0.0-beta-2 if a need. We'll see. > > Please take this alpha for a spin especially if you are a Coprocessor > developer or have a Coprocessor you want to deploy on hbase-2.0.0. Please > vote on whether it ok to put out this RC as our first alpha (bar is low for > an 'alpha' -- e.g. CHANGES.txt has not been updated). Let the VOTE be open > for 72 hours (Saturday) > > Thanks, > Your 2.0.0 Release Manager > > 1. http://pgp.mit.edu/pks/lookup?op=get=0x9816C7FC8ACC93D2 > 2. Why CPs are Incompatible: > https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ > ktczrlKHK8N4SZzs/edit#heading=h.9k7mjbauv0wj > 3. https://goo.gl/scYjJr > 4. https://goo.gl/tMHkYS > 5. https://issues.apache.org/jira/projects/HBASE/versions/12340861 > 6. > https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ > ktczrlKHK8N4SZzs/ >
[jira] [Commented] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios
[ https://issues.apache.org/jira/browse/PHOENIX-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235768#comment-16235768 ] Monani Mihir commented on PHOENIX-4346: --- [~mujtabachohan] can you review this? > Add support for UNSIGNED_LONG type in Pherf scenarios > - > > Key: PHOENIX-4346 > URL: https://issues.apache.org/jira/browse/PHOENIX-4346 > Project: Phoenix > Issue Type: Improvement >Reporter: Monani Mihir >Priority: Minor > Attachments: PHOENIX-4346.patch > > > Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would > be good to have UNSIGNED_LONG available for Pherf scenarios. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: rename 5.0-HBase-2.0 branch to 5.x-HBase-2.0
Agreed James.. Thanks Josh for renaming it. On Thu, Nov 2, 2017 at 4:18 AM, Josh Elser wrote: > Just went ahead and did it. No problem from my POV. > > > On 10/31/17 1:54 PM, James Taylor wrote: > >> I propose we rename the 5.0-HBase-2.0 branch to 5.x-HBase-2.0 so that we >> can do all 5.x based releases from this branch similar to the way we do >> for >> 4.x-HBase-###. >> >>
[jira] [Updated] (PHOENIX-4303) Replace HTableInterface,HConnection with Table,Connection interfaces respectively
[ https://issues.apache.org/jira/browse/PHOENIX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-4303: - Attachment: PHOENIX-4303.patch > Replace HTableInterface,HConnection with Table,Connection interfaces > respectively > - > > Key: PHOENIX-4303 > URL: https://issues.apache.org/jira/browse/PHOENIX-4303 > Project: Phoenix > Issue Type: Sub-task >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla >Priority: Major > Labels: HBase-2.0 > Fix For: 4.13.0 > > Attachments: PHOENIX-4303.patch > > > In latest versions of HBase HTableInterface,HConnection are replaced with > Table and Connection respectively. We can make use of new interfaces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
[ https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Kumar updated PHOENIX-4347: -- Description: Created a Phoenix table with below schema: {code:java} CREATE TABLE IF NOT EXISTS sample_table ( id VARCHAR NOT NULL, metricid VARCHAR NOT NULL, timestamp TIMESTAMP NOT NULL, metricvalue DOUBLE, CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; {code} Inserted some data into this and loaded as Spark Dataset using the Phoenix spark datasource ('org.apache.phoenix.spark') options. The Spark Dataset's schema is as given below: root |-- ID: string (nullable = true) |-- METRICID: string (nullable = true) |-- TIMESTAMP: timestamp (nullable = true) |-- METRICVALUE: double (nullable = true) I apply the Dataset's filter operation on Timestamp column as given below: {code:java} Dataset ds = ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)") {code} This operation throws me an exception as: testPhoenixTimestamp(DatasetTest): org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145. The generated query looks like this: {code:java} 2017-11-02 15:29:31,722 INFO [main] org.apache.phoenix.mapreduce.PhoenixInputFormat Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM SAMPLE_TABLE WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 00:00:00.0) {code} The issue is with Timestamp filter condition, where the timestamp value is not wrapped in to_timestamp() function. I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class compileValue() function, by checking the value's class. If it is java.sql.Timestamp then I am wrapping the value with to_timestamp() function. Please let me know if there is another way of correctly querying Timestamp values in Phoenix through Spark's Dataset API. 
was: Created a Phoenix table with below schema: {code:java} CREATE TABLE IF NOT EXISTS sample_table ( id VARCHAR NOT NULL, metricid VARCHAR NOT NULL, timestamp TIMESTAMP NOT NULL, metricvalue DOUBLE, CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; {code} Inserted some data into this and loaded as Spark Dataset using the Phoenix spark datasource ('org.apache.phoenix.spark') options. The Spark Dataset's schema is as given below: root |-- ID: string (nullable = true) |-- METRICID: string (nullable = true) |-- TIMESTAMP: timestamp (nullable = true) |-- METRICVALUE: double (nullable = true) I apply the Dataset's filter operation on Timestamp column as given below: {code:java} Dataset ds = ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)") {code} This operation throws me an exception as: testPhoenixTimestamp(DatasetTest): org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145. The generated query looks like this: {code:java} 2017-11-02 15:29:31,722 INFO [main] org.apache.phoenix.mapreduce.PhoenixInputFormat Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 00:00:00.0) {code} The issue is with Timestamp filter condition, where the timestamp value is not wrapped in to_timestamp() function. I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class compileValue() function, by checking the value's class. If it is java.sql.Timestamp then I am wrapping the value with to_timestamp() function. Please let me know if there is another way of correctly querying Timestamp values in Phoenix through Spark's Dataset API. 
> Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue > > > Key: PHOENIX-4347 > URL: https://issues.apache.org/jira/browse/PHOENIX-4347 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.11.0 > Environment: CentOS 6.5, Fedora 25 >Reporter: Lokesh Kumar >Priority: Major > Labels: phoenix, spark-sql > > Created a Phoenix table with below schema: > {code:java} > CREATE TABLE IF NOT EXISTS sample_table ( > id VARCHAR NOT NULL, > metricid VARCHAR NOT NULL, > timestamp TIMESTAMP NOT NULL, > metricvalue DOUBLE, > CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; > {code} > Inserted some data into this and loaded as Spark Dataset using the Phoenix > spark datasource ('org.apache.phoenix.spark') options. > The Spark Dataset's schema is as given below: > root > |-- ID: string (nullable = true) > |-- METRICID: string (nullable = true) > |-- TIMESTAMP:
[jira] [Updated] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
[ https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Kumar updated PHOENIX-4347: -- Description: Created a Phoenix table with below schema: {code:java} CREATE TABLE IF NOT EXISTS sample_table ( id VARCHAR NOT NULL, metricid VARCHAR NOT NULL, timestamp TIMESTAMP NOT NULL, metricvalue DOUBLE, CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; {code} Inserted some data into this and loaded as Spark Dataset using the Phoenix spark datasource ('org.apache.phoenix.spark') options. The Spark Dataset's schema is as given below: root |-- ID: string (nullable = true) |-- METRICID: string (nullable = true) |-- TIMESTAMP: timestamp (nullable = true) |-- METRICVALUE: double (nullable = true) I apply the Dataset's filter operation on Timestamp column as given below: {code:java} Dataset ds = ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)") {code} This operation throws me an exception as: testPhoenixTimestamp(DatasetTest): org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145. The generated query looks like this: {code:java} 2017-11-02 15:29:31,722 INFO [main] org.apache.phoenix.mapreduce.PhoenixInputFormat Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 00:00:00.0) {code} The issue is with Timestamp filter condition, where the timestamp value is not wrapped in to_timestamp() function. I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class compileValue() function, by checking the value's class. If it is java.sql.Timestamp then I am wrapping the value with to_timestamp() function. Please let me know if there is another way of correctly querying Timestamp values in Phoenix through Spark's Dataset API. 
was: Created a Phoenix table with below schema: {code:java} CREATE TABLE IF NOT EXISTS sample_table ( id VARCHAR NOT NULL, metricid VARCHAR NOT NULL, timestamp TIMESTAMP NOT NULL, metricvalue DOUBLE, CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; {code} Inserted some data into this and loaded as Spark Dataset using the Phoenix spark datasource ('org.apache.phoenix.spark') options. The Spark Dataset's schema is as given below: root |-- ID: string (nullable = true) |-- METRICID: string (nullable = true) |-- TIMESTAMP: timestamp (nullable = true) |-- METRICVALUE: double (nullable = true) I apply the Dataset's filter operation on Timestamp column as given below: {code:java} Dataset ds = ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)") {code} This operation throws me an exception as: testPhoenixTimestamp(DatasetTest): org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145. The generated query looks like this: {code:java} 2017-11-02 15:29:31,722 INFO [main] org.apache.phoenix.mapreduce.PhoenixInputFormat Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= *2017-10-31 00:00:00.0*) {code} The issue is highlighted in bold above, where the timestamp value is not wrapped in to_timestamp() function. I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class compileValue() function, by checking the value's class. If it is java.sql.Timestamp then I am wrapping the value with to_timestamp() function. Please let me know if there is another way of correctly querying Timestamp values in Phoenix through Spark's Dataset API. 
> Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue > > > Key: PHOENIX-4347 > URL: https://issues.apache.org/jira/browse/PHOENIX-4347 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.11.0 > Environment: CentOS 6.5, Fedora 25 >Reporter: Lokesh Kumar >Priority: Major > Labels: phoenix, spark-sql > > Created a Phoenix table with below schema: > {code:java} > CREATE TABLE IF NOT EXISTS sample_table ( > id VARCHAR NOT NULL, > metricid VARCHAR NOT NULL, > timestamp TIMESTAMP NOT NULL, > metricvalue DOUBLE, > CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; > {code} > Inserted some data into this and loaded as Spark Dataset using the Phoenix > spark datasource ('org.apache.phoenix.spark') options. > The Spark Dataset's schema is as given below: > root > |-- ID: string (nullable = true) > |-- METRICID: string (nullable = true) > |-- TIMESTAMP: timestamp
[jira] [Created] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
Lokesh Kumar created PHOENIX-4347: - Summary: Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue Key: PHOENIX-4347 URL: https://issues.apache.org/jira/browse/PHOENIX-4347 Project: Phoenix Issue Type: Bug Affects Versions: 4.11.0 Environment: CentOS 6.5, Fedora 25 Reporter: Lokesh Kumar Priority: Major Created a Phoenix table with below schema: {code:java} CREATE TABLE IF NOT EXISTS sample_table ( id VARCHAR NOT NULL, metricid VARCHAR NOT NULL, timestamp TIMESTAMP NOT NULL, metricvalue DOUBLE, CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20; {code} Inserted some data into this and loaded as Spark Dataset using the Phoenix spark datasource ('org.apache.phoenix.spark') options. The Spark Dataset's schema is as given below: root |-- ID: string (nullable = true) |-- METRICID: string (nullable = true) |-- TIMESTAMP: timestamp (nullable = true) |-- METRICVALUE: double (nullable = true) I apply the Dataset's filter operation on Timestamp column as given below: {code:java} Dataset ds = ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)") {code} This operation throws me an exception as: testPhoenixTimestamp(DatasetTest): org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145. The generated query looks like this: {code:java} 2017-11-02 15:29:31,722 INFO [main] org.apache.phoenix.mapreduce.PhoenixInputFormat Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= *2017-10-31 00:00:00.0*) {code} The issue is highlighted in bold above, where the timestamp value is not wrapped in to_timestamp() function. I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class compileValue() function, by checking the value's class. If it is java.sql.Timestamp then I am wrapping the value with to_timestamp() function. 
Please let me know if there is another way of correctly querying Timestamp values in Phoenix through Spark's Dataset API. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios
[ https://issues.apache.org/jira/browse/PHOENIX-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Monani Mihir updated PHOENIX-4346: -- Attachment: PHOENIX-4346.patch Patch for master which adds support for UNSIGNED_LONG type for Pherf scenarios. > Add support for UNSIGNED_LONG type in Pherf scenarios > - > > Key: PHOENIX-4346 > URL: https://issues.apache.org/jira/browse/PHOENIX-4346 > Project: Phoenix > Issue Type: Improvement >Reporter: Monani Mihir >Priority: Minor > Attachments: PHOENIX-4346.patch > > > Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would > be good to have UNSIGNED_LONG available for Pherf scenarios. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios
Monani Mihir created PHOENIX-4346: - Summary: Add support for UNSIGNED_LONG type in Pherf scenarios Key: PHOENIX-4346 URL: https://issues.apache.org/jira/browse/PHOENIX-4346 Project: Phoenix Issue Type: Improvement Reporter: Monani Mihir Priority: Minor Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would be good to have UNSIGNED_LONG available for Pherf scenarios. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] phoenix pull request #280: indextool inedxTable is not an index table for da...
GitHub user xsq0718 opened a pull request: https://github.com/apache/phoenix/pull/280 indextool inedxTable is not an index table for dataTable Phoenix: phoenix-4.8.0-cdh5.8.0 HBase: 1.2.0 Create Phoenix table: CREATE Table "everAp"(pk VARCHAR PRIMARY KEY,"ba"."ap" varchar,"ba"."ft" varchar,"ba"."et" varchar,"ba"."n" varchar); Create index: create local index EVERAP_INDEX_AP on "everAp"("ba"."ap") async; Use IndexTool: ./hbase org.apache.phoenix.mapreduce.index.IndexTool -dt \"\"everAp\"\" -it EVERAP_INDEX_AP -op hdfs:/hbase/data/default/everApIndc /cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/hbase/bin/../lib/native/Linux-amd64-64 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:java.compiler= 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-504.el6.x86_64 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:user.name=root 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:user.home=/root 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/bin 17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=slave1:2181,slave2:2181,master:2181 sessionTimeout=6 watcher=hconnection-0x4470f8a60x0, quorum=slave1:2181,slave2:2181,master:2181, baseZNode=/hbase 17/11/02 15:08:09 INFO zookeeper.ClientCnxn: Opening socket connection to server master/192.168.0.250:2181. 
Will not attempt to authenticate using SASL (unknown error) 17/11/02 15:08:09 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.0.250:53140, server: master/192.168.0.250:2181 17/11/02 15:08:09 INFO zookeeper.ClientCnxn: Session establishment complete on server master/192.168.0.250:2181, sessionid = 0x35f518ca651786a, negotiated timeout = 6 17/11/02 15:08:10 INFO metrics.Metrics: Initializing metrics system: phoenix 17/11/02 15:08:10 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-phoenix.properties,hadoop-metrics2.properties 17/11/02 15:08:10 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 17/11/02 15:08:10 INFO impl.MetricsSystemImpl: phoenix metrics system started 17/11/02 15:08:11 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available 17/11/02 15:08:12 ERROR index.IndexTool: An exception occurred while performing the indexing job: IllegalArgumentException: EVERAP_INDEX_AP is not an index table for everAp at: java.lang.IllegalArgumentException: EVERAP_INDEX_AP is not an index table for everAp at org.apache.phoenix.mapreduce.index.IndexTool.run(IndexTool.java:190) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.phoenix.mapreduce.index.IndexTool.main(IndexTool.java:394) You have mail in /var/spool/mail/root **help!!!** You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/phoenix master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/phoenix/pull/280.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #280 commit 489a945159e08e663dc73d3cd51568e6ba0a0f38 Author: Samarth JainDate: 2017-06-14T18:19:58Z PHOENIX-3890 Disable EncodedQualifierCellsList 
optimization for tables with more than one column family
commit 993164b6388a263f1931eae624693e30bf848d29
Author: Thomas
Date: 2017-06-08T18:20:28Z
PHOENIX-3918 Ensure all function implementations handle null args correctly
commit de6fbc4e2a13cdc482cbc1c91e51c4bc526aa12f
Author: Samarth Jain
Date: 2017-06-14T19:44:03Z
PHOENIX-3937 Remove @AfterClass methods from test classes annotated with @NeedsOwnMiniClusterTest
commit 64121a3c403a3c5206174b33b3c8762d530279f0
Author: Josh Elser
Date: 2017-06-15T20:34:43Z
PHOENIX-3940 Handle PERCENTILE_CONT against no rows
commit 98db5d63bd3572328da6ba52ba53357f692c6222
Author: Samarth Jain
Date: 2017-06-16T17:59:45Z
PHOENIX-3942 Fix failing PhoenixMetricsIT test
commit fba5fa28a03279e3fc427de800774690d280edca
Author: Samarth Jain
Date: 2017-06-19T20:54:39Z
PHOENIX-3930 Move BaseQueryIT to ParallelStatsDisabledIT (Samarth Jain & James Taylor)
commit
[jira] [Commented] (PHOENIX-4332) Indexes should inherit guide post width of the base data table
[ https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235314#comment-16235314 ] Hadoop QA commented on PHOENIX-4332: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12895338/PHOENIX-4332.patch against master branch at commit 82364f6b3083d309f2035f1fd6d132a77ecef71a. ATTACHMENT ID: 12895338 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +serverProps.put(QueryServices.STATS_GUIDEPOST_WIDTH_BYTES_ATTRIB, Long.toString(defaultGuidePostWidth)); +clientProps.put(QueryServices.STATS_GUIDEPOST_WIDTH_BYTES_ATTRIB, Long.toString(defaultGuidePostWidth)); {color:red}-1 core tests{color}. The patch failed these unit tests: ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SetPropertyOnEncodedTableIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.PartialIndexRebuilderIT ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SetPropertyOnNonEncodedTableIT Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1603//testReport/ Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1603//console This message is automatically generated. 
> Indexes should inherit guide post width of the base data table
> --
>
> Key: PHOENIX-4332
> URL: https://issues.apache.org/jira/browse/PHOENIX-4332
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.12.0
> Reporter: Mujtaba Chohan
> Assignee: Samarth Jain
> Priority: Major
> Attachments: PHOENIX-4332.patch
>
> Altering the guide post width on the data table does not propagate to the global index when using the {{ALTER TABLE}} command.
> Altering the global index table directly fails with a not-allowed error.
> {noformat}
> ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1;
> Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column referenced by VIEW columnName=IDX (state=42M01,code=1010)
> {noformat}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
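The fix this issue asks for, an index falling back to the guide post width of its base data table, can be sketched as a small resolution function. This is a minimal model and not Phoenix code; the function name and the default value used here are illustrative assumptions.

```python
# Hypothetical model (not Phoenix internals) of guide post width resolution:
# an index with no width of its own should inherit the base data table's
# setting; otherwise fall back to the cluster default.

DEFAULT_GUIDE_POST_WIDTH = 300 * 1024 * 1024  # stand-in for the server default

def effective_guide_post_width(index_width, base_table_width):
    """Return the guide post width an index should use for stats collection."""
    if index_width is not None:
        return index_width
    if base_table_width is not None:
        return base_table_width  # inherit from the base data table
    return DEFAULT_GUIDE_POST_WIDTH
```

Under this model, `ALTER TABLE T SET GUIDE_POSTS_WIDTH=1` on the data table would take effect for its indexes without having to alter the index table directly.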
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235303#comment-16235303 ] Samarth Jain commented on PHOENIX-4333: ---
Is it a safe assumption to make that if intersectScan is returning a non-null value, then we have an intersection?
{code}
Scan newScan = scanRanges.intersectScan(scan, currentKeyBytes, currentGuidePostBytes, keyOffset, false);
if (newScan != null) {
    // guide post was available in the
}
{code}
> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.12.0
> Reporter: Mujtaba Chohan
> Assignee: Samarth Jain
> Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, PHOENIX-4333_v2.patch
>
> Consider two tenants A, B with tenant-specific views on 2 separate regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view yields partial results (containing stats only for B,1), which are incorrect even though the updated timestamp shows as current.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
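The assumption being asked about, that a non-null return from intersectScan means the guide post actually intersected the scan, can be modeled with plain key intervals. This is a toy sketch, not the actual ScanRanges API: keys are simplified to integers and scans to half-open intervals.

```python
# Toy model of the intersectScan contract discussed above: intersect two
# half-open key intervals, returning None when they do not overlap. A
# non-None result then signals "this guide post intersected the scan".

def intersect(scan, guide_post):
    """Intersect two half-open intervals (lo, hi); None means no overlap."""
    lo = max(scan[0], guide_post[0])
    hi = min(scan[1], guide_post[1])
    return (lo, hi) if lo < hi else None
```

Under this model the caller's null check is exactly an intersection test, which is the property the comment is asking Phoenix to guarantee.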
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235294#comment-16235294 ] Samarth Jain commented on PHOENIX-4333: ---
Good point, [~jamestaylor]. I don't think my check would work in the case below:
REGION 1 - VIEW1 and VIEW2
REGION 2 - VIEW2 and VIEW3
If we collect stats for VIEW1 and VIEW3, then even though both regions have stats, they don't have stats for VIEW2. I think I would also need to check whether any guide post intersected for the region.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4333: -- Attachment: PHOENIX-4333_v2.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4333: -- Attachment: PHOENIX-4333_v2.patch
Updated patch that sets estimate timestamp to null when we don't have guideposts available for all regions.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235275#comment-16235275 ] James Taylor commented on PHOENIX-4333: ---
Does your check handle the case in which multiple regions are scanned and one in the middle has no guide posts? Not sure I understand why the check needs to be in the catch, but not a big deal.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4333: -- Attachment: (was: PHOENIX-4333_v2.patch) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235262#comment-16235262 ] Samarth Jain commented on PHOENIX-4333: ---
Actually, the check needs to be done inside this catch block:
{code}
catch (EOFException e) {
    // We have read all guide posts
}
{code}
And if we are doing it there, I think the check I had makes it easier to understand what's going on, IMHO.
{code}
+if (regionIndex < stopIndex) {
+    /*
+     * We don't have guide posts available for all regions. So in this case we
+     * conservatively say that we cannot provide estimates
+     */
+    gpsAvailableForAllRegions = false;
+}
}
{code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
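The check in the patch above can be modeled end to end: walk guide posts region by region, and if the guide post stream runs out (the EOFException case) before the last region has been visited, conservatively report that estimates are unavailable. This is a simplified Python stand-in for the Java, with regions reduced to their end keys; the function name is illustrative.

```python
# Simplified model of the EOF-time `regionIndex < stopIndex` check: regions
# are represented by their end keys, guide posts by sorted keys.

def gps_available_for_all_regions(region_end_keys, guide_posts):
    gps_available = True
    gp_iter = iter(sorted(guide_posts))
    region_index, stop_index = 0, len(region_end_keys) - 1
    try:
        gp = next(gp_iter)
        while region_index <= stop_index:
            # consume the guide posts falling into the current region
            while gp < region_end_keys[region_index]:
                gp = next(gp_iter)
            region_index += 1
    except StopIteration:
        # analogue of catching EOFException: all guide posts have been read
        if region_index < stop_index:
            gps_available = False
    return gps_available
```

Note that, like the EOF-time check alone, this model does not catch the case James raises above (a region in the middle with no guide posts): with end keys [10, 20, 30] and guide posts [5, 25], the middle region is empty yet the function still returns True, which is why a per-region intersection check is also being discussed.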
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235253#comment-16235253 ] Samarth Jain commented on PHOENIX-4333: --- Ah, I see. Yes, that's true. Let me update the patch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235250#comment-16235250 ] James Taylor commented on PHOENIX-4333: --- Haven’t tested it, but if currentKeyBytes gets set during the inner loop, then that means we’ve found at least one gp, no? Just a bit simpler way to detect that. If that doesn’t work, the way you have it is fine too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4332) Indexes should inherit guide post width of the base data table
[ https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4332: -- Summary: Indexes should inherit guide post width of the base data table (was: Stats - Allow setting guide post width on global indexes) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4343) In CREATE TABLE allow setting guide post width only on base data tables
[ https://issues.apache.org/jira/browse/PHOENIX-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4343: -- Summary: In CREATE TABLE allow setting guide post width only on base data tables (was: In CREATE TABLE only allow setting guide post width on tables and global indexes) > In CREATE TABLE allow setting guide post width only on base data tables > --- > > Key: PHOENIX-4343 > URL: https://issues.apache.org/jira/browse/PHOENIX-4343 > Project: Phoenix > Issue Type: Bug >Reporter: Samarth Jain >Assignee: Samarth Jain >Priority: Major > Attachments: PHOENIX-4343.patch, PHOENIX-4343_v2.patch, > PHOENIX-4343_v3.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235245#comment-16235245 ] Samarth Jain commented on PHOENIX-4333: ---
It might be the late night and lack of coffee, but I am not sure I see the correlation here.
{code}
gpsAvailableForAllRegions &= initialKeyBytes != currentKeyBytes;
{code}
We set currentKeyBytes back to initialKeyBytes when we know we are not using stats for parallelization.
{code}
if (!useStatsForParallelization) {
    /*
     * If we are not using stats for generating parallel scans, we need to reset the
     * currentKey back to what it was at the beginning of the loop.
     */
    currentKeyBytes = initialKeyBytes;
}
{code}
bq. I also think we should set the estimatedRows and estimatedSize to what we've found, but only set estimateInfoTimestamp to null if !gpsAvailableForAllRegions. That way callers can choose to use or not use the partial estimates based on estimateInfoTimestamp.
Makes sense.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
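The concern in this exchange is that `initialKeyBytes != currentKeyBytes` is a Java reference comparison on byte arrays, and the `!useStatsForParallelization` branch assigns the original object back, so the identity check can no longer tell whether a guide post advanced the key. A Python sketch using `is not` as a stand-in for Java's reference `!=` (hypothetical helper name, not Phoenix code):

```python
# `is not` models Java's reference != on byte[]: it asks whether the key
# variable points at a different object, not whether the bytes differ.

def key_advanced(initial_key, current_key):
    return current_key is not initial_key

initial = bytearray(b"\x00")
current = initial

# A guide post advances the key by assigning a new object:
current = bytearray(b"\x05")
assert key_advanced(initial, current)

# But the !useStatsForParallelization branch resets the key to the very
# same original object, so the identity check reports "no advance" even
# though a guide post may have been read in between.
current = initial
assert not key_advanced(initial, current)
```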
[jira] [Commented] (PHOENIX-4343) In CREATE TABLE only allow setting guide post width on tables and global indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235240#comment-16235240 ] James Taylor commented on PHOENIX-4343: --- +1 > In CREATE TABLE only allow setting guide post width on tables and global > indexes > > > Key: PHOENIX-4343 > URL: https://issues.apache.org/jira/browse/PHOENIX-4343 > Project: Phoenix > Issue Type: Bug >Reporter: Samarth Jain >Assignee: Samarth Jain >Priority: Major > Attachments: PHOENIX-4343.patch, PHOENIX-4343_v2.patch, > PHOENIX-4343_v3.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (PHOENIX-4343) In CREATE TABLE only allow setting guide post width on tables and global indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-4343: -- Attachment: PHOENIX-4343_v3.patch
Thanks for the review, [~jamestaylor]. Attached is the updated patch.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)