[jira] [Comment Edited] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-11-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237122#comment-16237122
 ] 

Ankit Singhal edited comment on PHOENIX-4328 at 11/3/17 5:50 AM:
-

bq. We can't do that since the properties are read from 
ConnectionQueryServicesImpl; the properties object is an instance of class 
ReadOnlyProps.
You can use an instance variable (isNamespaceMappingEnabled) in 
ConnectionQueryServicesImpl, set by the current logic in 
checkClientServerCompatibility, and use it everywhere the conversion of 
SYSTEM table names is required?

We may close https://issues.apache.org/jira/browse/PHOENIX-3288 as a 
duplicate if this JIRA is trying to do the same.
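The suggestion above can be sketched roughly as follows. The class and method
signatures here are illustrative stand-ins, not the real Phoenix code: the flag
is derived once during the compatibility check and then reused for every
SYSTEM table name conversion, so the immutable ReadOnlyProps never needs to be
consulted again.

```java
// Hypothetical sketch: cache the namespace-mapping state once, in
// checkClientServerCompatibility, instead of re-reading the immutable
// ReadOnlyProps on every SYSTEM table name conversion.
public class ConnectionServicesSketch {
    private volatile boolean isNamespaceMappingEnabled;

    // Called once while the connection is being established.
    public void checkClientServerCompatibility(boolean serverSystemTablesAreNamespaced) {
        // Derive the flag from what actually exists on the server,
        // not from the client-side property.
        this.isNamespaceMappingEnabled = serverSystemTablesAreNamespaced;
    }

    // Used everywhere a SYSTEM table name must be converted.
    public String physicalSystemTableName(String schema, String table) {
        return isNamespaceMappingEnabled ? schema + ":" + table
                                         : schema + "." + table;
    }

    public static void main(String[] args) {
        ConnectionServicesSketch cqs = new ConnectionServicesSketch();
        cqs.checkClientServerCompatibility(true);
        // With namespace mapping detected on the server, prints SYSTEM:CATALOG
        System.out.println(cqs.physicalSystemTableName("SYSTEM", "CATALOG"));
    }
}
```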


was (Author: an...@apache.org):
bq. We can't do that since the properties are read from 
ConnectionQueryServicesImpl; the properties object is an instance of class 
ReadOnlyProps.
You can use an instance variable (isNamespaceMappingEnabled) in 
ConnectionQueryServicesImpl, set by the current logic in 
checkClientServerCompatibility, and use it everywhere the conversion of 
SYSTEM table names is required?

> Support clients having different "phoenix.schema.mapSystemTablesToNamespace" 
> property
> -
>
> Key: PHOENIX-4328
> URL: https://issues.apache.org/jira/browse/PHOENIX-4328
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Karan Mehta
>Priority: Major
>  Labels: namespaces
> Fix For: 4.13.0
>
>
> Imagine a scenario where we enable namespaces for Phoenix on the server side 
> and set the property {{phoenix.schema.isNamespaceMappingEnabled}} to true. A 
> bunch of clients are trying to connect to this cluster. All of these clients 
> have {{phoenix.schema.isNamespaceMappingEnabled}} set to true; however, 
> {{phoenix.schema.mapSystemTablesToNamespace}} is set to false for some of 
> them and true for others. (A typical case for a rolling upgrade.)
> The first client with {{phoenix.schema.mapSystemTablesToNamespace}} true will 
> acquire the lock in SYSMUTEX and migrate the system tables. As soon as this 
> happens, all the other clients will start failing. 
> There are two scenarios here.
> 1. A new client trying to connect to the server without this property set.
> This will fail since ConnectionQueryServicesImpl checks whether SYSCAT is 
> namespace mapped or not. If there is a mismatch, it throws an exception, so 
> the client doesn't get any connection.
> 2. Clients already connected to the cluster that don't have this property set.
> This will fail because every query calls the endpoint coprocessor on SYSCAT 
> to determine the PTable of the query table, and the physical HBase table name 
> is resolved based on the properties. Thus, we try to call the method on 
> SYSCAT instead of SYS:CAT and it results in a TableNotFoundException.
> This JIRA is to discuss the potential ways in which we can handle this issue.
> Some ideas around this after discussing with [~twdsi...@gmail.com]:
> 1. Build retry logic around the code that works with SYSTEM tables 
> (coprocessor calls etc.): try with SYSCAT and, if it fails, try with SYS:CAT.
> Cons: difficult to maintain, and the code is scattered all over. 
> 2. Use the SchemaUtil.getPhysicalTableName method to return the table name 
> that actually exists (only for SYSTEM tables): call admin.tableExists to 
> determine whether SYSCAT or SYS:CAT exists and return that name. The client 
> properties get ignored in this case. 
> Cons: an expensive call every time, since this method is called many times.
> [~jamestaylor] [~elserj] [~an...@apache.org] [~apurtell] 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-11-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237122#comment-16237122
 ] 

Ankit Singhal commented on PHOENIX-4328:


bq. We can't do that since the properties are read from 
ConnectionQueryServicesImpl; the properties object is an instance of class 
ReadOnlyProps.
You can use an instance variable (isNamespaceMappingEnabled) in 
ConnectionQueryServicesImpl, set by the current logic in 
checkClientServerCompatibility, and use it everywhere the conversion of 
SYSTEM table names is required?



[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237120#comment-16237120
 ] 

Ankit Singhal commented on PHOENIX-3460:


bq. HBase does not allow creating a table name that contains the namespace 
separator. We should not allow using the namespace separator in the table or 
schema name. Instead we should throw a PhoenixParserException.
[~tdsilva], [~jamestaylor], I think we had this to map existing tables to a 
view/table when namespace mapping is not enabled. So are we making it 
mandatory for users to have namespace mapping enabled when they want to map 
tables under a namespace in Phoenix? Then we probably need some documentation 
around this.

> Namespace separator ":" should not be allowed in table or schema name
> -
>
> Key: PHOENIX-3460
> URL: https://issues.apache.org/jira/browse/PHOENIX-3460
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
> Environment: HDP 2.5
>Reporter: Xindian Long
>Assignee: Thomas D'Silva
>Priority: Major
>  Labels: namespaces, phoenix, spark
> Fix For: 4.13.0
>
> Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460-v2.patch, 
> PHOENIX-3460-v2.patch, PHOENIX-3460.patch, SchemaUtil.java
>
>
> I am testing some code using the Phoenix Spark plug-in to read a Phoenix 
> table with a namespace prefix in the table name (the table is created as a 
> Phoenix table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there, because I can query it using plain Phoenix SQL 
> through SQuirreL. In addition, using Spark SQL to query it has no problem at 
> all.
> I am running on the HDP 2.5 platform, with Phoenix 4.7.0.2.5.0.0-1245.
> The problem does not exist at all when I run the same code on an HDP 2.4 
> cluster, with Phoenix 4.4.
> Neither does the problem occur when I query a table without a namespace 
> prefix in the DB table name, on HDP 2.5.
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The weird thing is that in the attached code, if I run testSpark alone it 
> gives the above exception, but if I run testJdbc first, followed by 
> testSpark, both of them work.
> After changing to create the table using
> create table ACME.ENDPOINT_STATUS
> the Phoenix Spark plug-in seems to work. I also found some weird behavior: 
> if I do both of the following
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> both tables show up in Phoenix, the first one with schema ACME and table 
> name ENDPOINT_STATUS, and the latter with no schema and table name 
> ACME:ENDPOINT_STATUS.
> However, in HBase I only see one table, ACME:ENDPOINT_STATUS. In addition, 
> upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and 
> vice versa.
>  





[jira] [Commented] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237067#comment-16237067
 ] 

Samarth Jain commented on PHOENIX-4348:
---

+1

> Point deletes do not work when there are immutable indexes with only row key 
> columns
> 
>
> Key: PHOENIX-4348
> URL: https://issues.apache.org/jira/browse/PHOENIX-4348
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4348.patch
>
>






[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237075#comment-16237075
 ] 

Hudson commented on PHOENIX-4287:
-

SUCCESS: Integrated in Jenkins build Phoenix-master #1863 (See 
[https://builds.apache.org/job/Phoenix-master/1863/])
PHOENIX-4287 Add null check for parent name (samarth: rev 
895d067974639cd2205b14940e4e46864b4e2060)
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java


> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
>  Labels: localIndex
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, 
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, 
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_addendum5.patch, 
> PHOENIX-4287_addendum6.patch, PHOENIX-4287_addendum7.patch, 
> PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, 
> PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899       | 332170         | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | PLAN                                                                              | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +-----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | 438492470       | 332151         | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY                                                   | 438492470       | 332151         | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW                                                  | 438492470       | 332151         | 1507928257617  |
> +-----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-------------------------------------------------------+-----------------+----------------+--------------+
> | PLAN                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +-------------------------------------------------------+-----------------+----------------+--------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null            | null           | null         |
> | SERVER FILTER BY FIRST KEY ONLY                       | null            | null           | null         |
> | SERVER AGGREGATE INTO 

[jira] [Commented] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237027#comment-16237027
 ] 

Hadoop QA commented on PHOENIX-4348:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895545/PHOENIX-4348.patch
  against master branch at commit 895d067974639cd2205b14940e4e46864b4e2060.
  ATTACHMENT ID: 12895545

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+public void testPointDeleteRowFromTableWithImmutableIndex(boolean 
localIndex, boolean addNonPKIndex) throws Exception {
+"CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN, FEATURE, 
\"DATE\")) IMMUTABLE_ROWS=true");
+stm.execute("CREATE " + (localIndex ? "LOCAL" : "") + " INDEX " + 
indexName1 + " ON " + tableName + " (\"DATE\", FEATURE)");
+stm.execute("CREATE " + (localIndex ? "LOCAL" : "") + " INDEX " + 
indexName2 + " ON " + tableName + " (FEATURE, DOMAIN)");
+stm.execute("CREATE " + (localIndex ? "LOCAL" : "") + " INDEX 
" + indexName3 + " ON " + tableName + " (\"DATE\", FEATURE, USAGE.DB)");
+.prepareStatement("UPSERT INTO " + tableName + "(HOST, 
DOMAIN, FEATURE, \"DATE\", CORE, DB, ACTIVE_VISITOR) VALUES(?,?, ? , ?, ?, ?, 
?)");
+String dml = "DELETE FROM " + tableName + " WHERE (HOST, DOMAIN, 
FEATURE, \"DATE\") = (?,?,?,?)";
+return new MutationState(plan.getTableRef(), mutation, 
0, maxSize, maxSizeBytes, connection);

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1613//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1613//console

This message is automatically generated.



[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237000#comment-16237000
 ] 

Hudson commented on PHOENIX-4287:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1862 (See 
[https://builds.apache.org/job/Phoenix-master/1862/])
PHOENIX-4287 Make indexes inherit use stats property from their parent 
(samarth: rev 7d2205d0c9854f61e667a4939eeed645de518f45)
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/ExplainPlanWithStatsEnabledIT.java



[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236997#comment-16236997
 ] 

Hadoop QA commented on PHOENIX-4342:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895538/PHOENIX-4342-v2.patch
  against master branch at commit 895d067974639cd2205b14940e4e46864b4e2060.
  ATTACHMENT ID: 12895538

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+mutationPlans.add(new 
SingleRowDeleteMutationPlan(dataPlan, connection, maxSize, maxSizeBytes));
+return new ServerSelectDeleteMutationPlan(dataPlan, connection, 
aggPlan, projector, maxSize, maxSizeBytes);
+return new ClientSelectDeleteMutationPlan(targetTableRef, 
dataPlan, bestPlan, hasPreOrPostProcessing,
+parallelIteratorFactory, otherTableRefs, 
projectedTableRef, maxSize, maxSizeBytes, connection);
+public SingleRowDeleteMutationPlan(QueryPlan dataPlan, 
PhoenixConnection connection, int maxSize, int maxSizeBytes) {
+Map mutation = 
Maps.newHashMapWithExpectedSize(ranges.getPointLookupCount());
+mutation.put(new 
ImmutableBytesPtr(iterator.next().getLowerRange()), new 
RowMutationState(PRow.DELETE_MARKER, 
statement.getConnection().getStatementExecutionCounter(), 
NULL_ROWTIMESTAMP_INFO, null));
+return new MutationState(context.getCurrentTable(), mutation, 0, 
maxSize, maxSizeBytes, connection);
+public ServerSelectDeleteMutationPlan(QueryPlan dataPlan, 
PhoenixConnection connection, QueryPlan aggPlan,
+  RowProjector projector, int 
maxSize, int maxSizeBytes) {

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.RebuildIndexConnectionPropsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SaltedViewIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1612//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1612//console

This message is automatically generated.

> Surface QueryPlan in MutationPlan
> -
>
> Key: PHOENIX-4342
> URL: https://issues.apache.org/jira/browse/PHOENIX-4342
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Geoffrey Jacoby
>Priority: Minor
> Attachments: PHOENIX-4342-v2.patch, PHOENIX-4342.patch
>
>
> For DELETE statements, it'd be good to be able to get at the QueryPlan 
> through the MutationPlan so we can get more structured information at compile 
> time.





[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236987#comment-16236987
 ] 

Hadoop QA commented on PHOENIX-4287:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12895529/PHOENIX-4287_addendum7.patch
  against master branch at commit 7d2205d0c9854f61e667a4939eeed645de518f45.
  ATTACHMENT ID: 12895529

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1611//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1611//console

This message is automatically generated.


[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236960#comment-16236960
 ] 

James Taylor commented on PHOENIX-4342:
---

This looks very good, [~gjacoby]. In the one funky case in DeleteCompiler where 
we have multiple query plans for a point delete, it might be better to return 
the dataPlan here:
{code}
+
+@Override
+public QueryPlan getQueryPlan() {
+return firstPlan.getQueryPlan();
+}
{code}
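A minimal sketch of that idea, with hypothetical stand-ins for the Phoenix
types: the multi-plan delete keeps a reference to the data table's plan and
surfaces it from getQueryPlan(), rather than delegating to whichever index
plan happens to be first.

```java
// Illustrative stand-ins, not the actual Phoenix classes.
interface QueryPlanSketch {
    String tableName();
}

public class MultiRowDeletePlanSketch {
    private final QueryPlanSketch dataPlan;        // plan over the data table
    private final QueryPlanSketch firstIndexPlan;  // first of several index plans

    public MultiRowDeletePlanSketch(QueryPlanSketch dataPlan,
                                    QueryPlanSketch firstIndexPlan) {
        this.dataPlan = dataPlan;
        this.firstIndexPlan = firstIndexPlan;
    }

    // Suggested behavior: return the stable data-table plan for compile-time
    // introspection instead of firstPlan.getQueryPlan().
    public QueryPlanSketch getQueryPlan() {
        return dataPlan;
    }

    public static void main(String[] args) {
        MultiRowDeletePlanSketch plan =
                new MultiRowDeletePlanSketch(() -> "ENTITY_HISTORY", () -> "IDX_1");
        // Prints the data table's name, not the index's
        System.out.println(plan.getQueryPlan().tableName());
    }
}
```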



[jira] [Updated] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns

2017-11-02 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4348:
--
Attachment: PHOENIX-4348.patch

Please review, [~samarthjain] or [~tdsilva]. We weren't joining the 
MutationStates together, so not all the Delete mutations were getting committed.
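A toy illustration of the failure mode (the real MutationState is far richer; 
the names here are stand-ins): each per-table delete produces its own state, 
and unless the states are joined, only one state's mutations survive to commit.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for Phoenix's MutationState; only the joining behavior
// under discussion is modeled here.
class MutationStateSketch {
    private final List<String> mutations = new ArrayList<>();

    MutationStateSketch(List<String> initial) {
        mutations.addAll(initial);
    }

    // The fix: fold the other state's mutations into this one so that a
    // single commit flushes deletes for the data table and every index.
    void join(MutationStateSketch other) {
        mutations.addAll(other.mutations);
    }

    int size() {
        return mutations.size();
    }
}
```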



[jira] [Updated] (PHOENIX-4342) Surface QueryPlan in MutationPlan

2017-11-02 Thread Geoffrey Jacoby (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby updated PHOENIX-4342:
-
Attachment: PHOENIX-4342-v2.patch

v2 patch with getQueryPlan added to MutationPlan. 



[jira] [Created] (PHOENIX-4348) Point deletes do not work when there are immutable indexes with only row key columns

2017-11-02 Thread James Taylor (JIRA)
James Taylor created PHOENIX-4348:
-

 Summary: Point deletes do not work when there are immutable 
indexes with only row key columns
 Key: PHOENIX-4348
 URL: https://issues.apache.org/jira/browse/PHOENIX-4348
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: James Taylor
Priority: Major








[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236914#comment-16236914
 ] 

James Taylor commented on PHOENIX-4344:
---

I see - yes, you're right - that would work. It'd do a point scan for each row 
if there were a non-PK column, as it'd need to look up that value to maintain 
the index. It'd work, it'd just be slow.

> MapReduce Delete Support
> 
>
> Key: PHOENIX-4344
> URL: https://issues.apache.org/jira/browse/PHOENIX-4344
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.12.0
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
>
> Phoenix already has the ability to use MapReduce for asynchronous handling of 
> long-running SELECTs. It would be really useful to have this capability for 
> long-running DELETEs, particularly of tables with indexes where using HBase's 
> own MapReduce integration would be prohibitively complicated. 





[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support

2017-11-02 Thread Geoffrey Jacoby (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236907#comment-16236907
 ] 

Geoffrey Jacoby commented on PHOENIX-4344:
--

I don't see how Option 1 is problematic for indexes on non-PK columns, because 
it's internally using the Phoenix JDBC API and so going through all the same 
index-handling logic that a point-delete query issued from outside MapReduce 
would go through. 

Let's say that I have a table ENTITY_HISTORY with a compound primary key (Key1, 
Key2). 

I create my MapReduce job with a query like "DELETE FROM ENTITY_HISTORY WHERE 
Key1 > 'aaa'".

That delete would be converted to a select, and the MapReduce job would iterate 
row by row over the result set. For each row, a new DELETE query would be built 
using that row's PK, e.g. "DELETE FROM ENTITY_HISTORY WHERE Key1 = 'foo' AND 
Key2 = 'bar'", and executed using a PhoenixConnection (probably with some kind 
of commit batching).

I'm somewhat concerned about the performance, but the correctness seems sound to 
me -- am I missing an issue? 
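The select-then-point-delete loop described above can be sketched as a self-contained simulation. The table and column names (ENTITY_HISTORY, Key1, Key2) come from the example; the batch size, helper names, and the idea of counting commits are hypothetical illustrations -- the real job would execute each statement on a PhoenixConnection rather than just build strings:

```java
import java.util.ArrayList;
import java.util.List;

// Simulates building a point DELETE per selected row and flushing
// in commit batches, as described in the comment above.
public class BatchedDeleteSketch {
    static final int BATCH_SIZE = 2; // hypothetical commit batch size

    // Build a point delete from one selected row's compound PK.
    static String buildDelete(String key1, String key2) {
        return "DELETE FROM ENTITY_HISTORY WHERE Key1 = '" + key1
                + "' AND Key2 = '" + key2 + "'";
    }

    // Returns how many commits a batched executor would issue.
    static int flushInBatches(List<String[]> selectedRows) {
        int pending = 0;
        int commits = 0;
        for (String[] pk : selectedRows) {
            String delete = buildDelete(pk[0], pk[1]);
            // In the real job this statement would be executed on a PhoenixConnection.
            pending++;
            if (pending == BATCH_SIZE) {
                commits++; // conn.commit() in the real job
                pending = 0;
            }
        }
        if (pending > 0) {
            commits++; // flush the final partial batch
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"foo", "bar"});
        rows.add(new String[] {"foo", "baz"});
        rows.add(new String[] {"qux", "bar"});
        System.out.println(buildDelete("foo", "bar"));
        System.out.println(flushInBatches(rows)); // 2 commits for 3 rows with batch size 2
    }
}
```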



[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236904#comment-16236904
 ] 

Samarth Jain commented on PHOENIX-4287:
---

Thanks. I added the comment in my commit.

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
>  Labels: localIndex
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, 
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, 
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_addendum5.patch, 
> PHOENIX-4287_addendum6.patch, PHOENIX-4287_addendum7.patch, 
> PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, 
> PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +----------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | PLAN                                                                                   | EST_BYTES_READ | EST_ROWS_READ | EST_INFO  |
> +----------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1]  | 625043899      | 332170        | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                        | 625043899      | 332170        | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                       | 625043899      | 332170        | 150792825 |
> +----------------------------------------------------------------------------------------+----------------+---------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +------------------------------------------------------------------------------------+----------------+---------------+----------------+
> | PLAN                                                                               | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS    |
> +------------------------------------------------------------------------------------+----------------+---------------+----------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T   | 438492470      | 332151        | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY                                                    | 438492470      | 332151        | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW                                                   | 438492470      | 332151        | 1507928257617  |
> +------------------------------------------------------------------------------------+----------------+---------------+----------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-------------------------------------------------------+----------------+---------------+--------------+
> | PLAN                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS  |
> +-------------------------------------------------------+----------------+---------------+--------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null           | null          | null         |
> | SERVER FILTER BY FIRST KEY ONLY                       | null           | null          | null         |
> | SERVER AGGREGATE INTO SINGLE ROW                      | null           | null          | null         |
> +-------------------------------------------------------+----------------+---------------+--------------+
> select /*+NO_INDEX*/ count(*) 

[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236893#comment-16236893
 ] 

James Taylor commented on PHOENIX-4287:
---

Patch looks good, but please add a comment about needing that extra check for 
drop of a local index.


[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4287:
--
Attachment: PHOENIX-4287_addendum7.patch

Looks like an NPE happens when dropping local indexes. Addressing it in this 
patch.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236864#comment-16236864
 ] 

Hadoop QA commented on PHOENIX-4287:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12895514/PHOENIX-4287_addendum5.patch
  against master branch at commit 8f9356a2bdd6ba603158899eba38750c85e8e574.
  ATTACHMENT ID: 12895514

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.IndexWithTableSchemaChangeIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.DropColumnIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1609//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1609//console

This message is automatically generated.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236857#comment-16236857
 ] 

Hadoop QA commented on PHOENIX-4287:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12895524/PHOENIX-4287_addendum6.patch
  against master branch at commit 7d2205d0c9854f61e667a4939eeed645de518f45.
  ATTACHMENT ID: 12895524

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1610//console

This message is automatically generated.


[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236856#comment-16236856
 ] 

James Taylor commented on PHOENIX-4344:
---

I'd go with Option #2. Option #1 will be problematic for tables with indexes on 
non-PK columns. If you can tack on the correct RVC (or perhaps dip below the 
Phoenix API and set the start/stop row of the Scan) based on the info in the 
QueryPlan, then the delete logic will be handled completely by 
DeleteCompiler. You just need to grab the mutations using 
PhoenixRuntime.getUncommittedDataIterator(). You might use 
FormatToBytesWritableMapper for inspiration/code borrowing.
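The "tack on the correct RVC" idea amounts to bounding the DELETE to one mapper's key range with a row value constructor predicate. A minimal string-building sketch of that predicate shape follows; the helper and the (Key1, Key2) columns are hypothetical illustrations -- the real job would derive the range from the QueryPlan's splits and let DeleteCompiler handle the rest:

```java
// Sketches how a DELETE could be restricted to one mapper's key
// range [start, end) on a compound PK via a row value constructor.
public class RvcDeleteSketch {
    // Build a DELETE bounded to [start, end) on the compound PK (Key1, Key2).
    static String boundedDelete(String table, String[] start, String[] end) {
        return "DELETE FROM " + table
                + " WHERE (Key1, Key2) >= ('" + start[0] + "', '" + start[1] + "')"
                + " AND (Key1, Key2) < ('" + end[0] + "', '" + end[1] + "')";
    }

    public static void main(String[] args) {
        // One mapper's split of the example range Key1 > 'aaa'.
        System.out.println(boundedDelete("ENTITY_HISTORY",
                new String[] {"aaa", ""}, new String[] {"mmm", ""}));
    }
}
```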





[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236851#comment-16236851
 ] 

Hadoop QA commented on PHOENIX-4342:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895489/PHOENIX-4342.patch
  against master branch at commit 8f9356a2bdd6ba603158899eba38750c85e8e574.
  ATTACHMENT ID: 12895489

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+List mutationPlans = 
Lists.newArrayListWithExpectedSize(queryPlans.size());
+mutationPlans.add(new SingleRowDeleteMutationPlan(dataPlan, 
connection, maxSize, maxSizeBytes));
+return new ServerSelectDeleteMutationPlan(dataPlan, connection, 
aggPlan, projector, maxSize, maxSizeBytes);
+return new ClientSelectDeleteMutationPlan(targetTableRef, 
dataPlan, bestPlan, hasPreOrPostProcessing,
+parallelIteratorFactory, otherTableRefs, 
projectedTableRef, maxSize, maxSizeBytes, connection);
+public SingleRowDeleteMutationPlan(QueryPlan dataPlan, 
PhoenixConnection connection, int maxSize, int maxSizeBytes) {
+Map mutation = 
Maps.newHashMapWithExpectedSize(ranges.getPointLookupCount());
+mutation.put(new 
ImmutableBytesPtr(iterator.next().getLowerRange()), new 
RowMutationState(PRow.DELETE_MARKER, 
statement.getConnection().getStatementExecutionCounter(), 
NULL_ROWTIMESTAMP_INFO, null));
+return new MutationState(context.getCurrentTable(), mutation, 0, 
maxSize, maxSizeBytes, connection);
+public ServerSelectDeleteMutationPlan(QueryPlan dataPlan, 
PhoenixConnection connection, QueryPlan aggPlan,

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1608//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1608//console

This message is automatically generated.

> Surface QueryPlan in MutationPlan
> -
>
> Key: PHOENIX-4342
> URL: https://issues.apache.org/jira/browse/PHOENIX-4342
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Geoffrey Jacoby
>Priority: Minor
> Attachments: PHOENIX-4342.patch
>
>
> For DELETE statements, it'd be good to be able to get at the QueryPlan 
> through the MutationPlan so we can get more structured information at compile 
> time.





[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236841#comment-16236841
 ] 

James Taylor commented on PHOENIX-4287:
---

+1


[jira] [Commented] (PHOENIX-4342) Surface QueryPlan in MutationPlan

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236840#comment-16236840
 ] 

James Taylor commented on PHOENIX-4342:
---

My preference would be to either:
- add MutationPlan.getQueryPlan() or
- derive MutationPlan from QueryPlan (could use DelegateQueryPlan to help with 
that)

For the cases that don't issue a query, we could use a new EmptyQueryPlan (or 
null). Otherwise, we'll end up with lots of {{mutationPlan instanceof 
DeleteMutationPlan}} checks.
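The first option could take the following shape, sketched here with drastically simplified stand-in interfaces (the real Phoenix MutationPlan and QueryPlan carry many more methods, and the class names below are illustrative only). The null-object EmptyQueryPlan is what lets callers avoid the instanceof checks:

```java
// Simplified stand-ins for the Phoenix interfaces discussed above.
// Sketches the proposed MutationPlan.getQueryPlan() accessor plus a
// null-object EmptyQueryPlan for plans that issue no query.
public class MutationPlanSketch {
    interface QueryPlan {
        String explain();
    }

    // Null object returned instead of null by query-less plans.
    static final QueryPlan EMPTY_QUERY_PLAN = () -> "EMPTY";

    interface MutationPlan {
        QueryPlan getQueryPlan(); // the proposed accessor
    }

    static class DeleteMutationPlan implements MutationPlan {
        private final QueryPlan queryPlan;
        DeleteMutationPlan(QueryPlan queryPlan) { this.queryPlan = queryPlan; }
        public QueryPlan getQueryPlan() { return queryPlan; }
    }

    static class UpsertValuesMutationPlan implements MutationPlan {
        // No underlying query, so no instanceof check is needed by callers.
        public QueryPlan getQueryPlan() { return EMPTY_QUERY_PLAN; }
    }

    public static void main(String[] args) {
        MutationPlan del = new DeleteMutationPlan(() -> "FULL SCAN OVER T");
        MutationPlan ups = new UpsertValuesMutationPlan();
        System.out.println(del.getQueryPlan().explain()); // FULL SCAN OVER T
        System.out.println(ups.getQueryPlan().explain()); // EMPTY
    }
}
```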



[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4287:
--
Attachment: PHOENIX-4287_addendum6.patch

Updated patch with additional test on view and view index.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236809#comment-16236809
 ] 

James Taylor commented on PHOENIX-4287:
---

What about a test that turns the use-stats property on/off at the view level? 
You could just add this to your new test:
{code}
conn.createStatement().execute("ALTER VIEW " + view + " SET USE_STATS_FOR_PARALLELIZATION= " + !useStats);
// query against the view
query = "SELECT * FROM " + view;
rs = conn.createStatement().executeQuery(query).unwrap(PhoenixResultSet.class);
// assert query is against view
assertEquals(view, rs.unwrap(PhoenixResultSet.class).getStatement().getQueryPlan()
        .getTableRef().getTable().getName().getString());
// stats are being used for parallelization. So number of scans is higher.
assertEquals(!useStats ? 11 : 1, rs.unwrap(PhoenixResultSet.class).getStatement()
        .getQueryPlan().getScans().get(0).size());

// query against the view index
query = "SELECT 1 FROM " + view + " WHERE B > 0";
rs = conn.createStatement().executeQuery(query).unwrap(PhoenixResultSet.class);
// assert query is against viewIndex
assertEquals(viewIndex, rs.unwrap(PhoenixResultSet.class).getStatement().getQueryPlan()
        .getTableRef().getTable().getName().getString());
// stats are being used for parallelization. So number of scans is higher.
assertEquals(!useStats ? 11 : 1, rs.unwrap(PhoenixResultSet.class).getStatement()
        .getQueryPlan().getScans().get(0).size());
{code}


[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236793#comment-16236793
 ] 

Hudson commented on PHOENIX-3460:
-

SUCCESS: Integrated in Jenkins build Phoenix-master #1861 (See 
[https://builds.apache.org/job/Phoenix-master/1861/])
PHOENIX-3460 Namespace separator : should not be allowed in table or (tdsilva: 
rev 8f9356a2bdd6ba603158899eba38750c85e8e574)
* (edit) 
phoenix-core/src/test/java/org/apache/phoenix/parse/QueryParserTest.java
* (edit) phoenix-core/src/main/antlr3/PhoenixSQL.g


> Namespace separator ":" should not be allowed in table or schema name
> -
>
> Key: PHOENIX-3460
> URL: https://issues.apache.org/jira/browse/PHOENIX-3460
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
> Environment: HDP 2.5
>Reporter: Xindian Long
>Assignee: Thomas D'Silva
>Priority: Major
>  Labels: namespaces, phoenix, spark
> Fix For: 4.13.0
>
> Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460-v2.patch, 
> PHOENIX-3460-v2.patch, PHOENIX-3460.patch, SchemaUtil.java
>
>
> I am testing some code using the Phoenix Spark plugin to read a Phoenix table 
> with a namespace prefix in the table name (the table is created as a Phoenix 
> table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there because I can query it using plain phoenix sql 
> through Squirrel. In addition, using spark sql to query it has no problem at 
> all.
> I am running on the HDP 2.5 platform, with phoenix 4.7.0.2.5.0.0-1245
> The problem does not exist at all when I was running the same code on HDP 2.4 
> cluster, with phoenix 4.4.
> Neither does the problem occur when I query a table without a namespace 
> prefix in the DB table name, on HDP 2.5
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The weird thing is that in the attached code, if I run testSpark alone it gives 
> the above exception, but if I run testJdbc first, followed by testSpark, both 
> of them work.
>  After changing to create table by using
> create table ACME.ENDPOINT_STATUS
> The phoenix-spark plugin seems to work. I also found some weird behavior:
> If I do both the following
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> Both tables show up in Phoenix: the first one shows as schema ACME and table 
> name ENDPOINT_STATUS, and the latter shows as schema none and table name 
> ACME:ENDPOINT_STATUS.
> However, in HBase, I only see one table, ACME:ENDPOINT_STATUS. In addition, 
> upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and 
> vice versa.
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4344) MapReduce Delete Support

2017-11-02 Thread Geoffrey Jacoby (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236759#comment-16236759
 ] 

Geoffrey Jacoby commented on PHOENIX-4344:
--

Some thoughts, [~jamestaylor]

I want this to be usable for generic DELETE queries without the need for 
hand-written DBWritable subclasses.

MapReduce goes line by line, rather than by Mapper Task/Scan, so while the 
client would be issuing a broad DELETE query, the mapper itself would either be:

1. Issuing point DELETE Phoenix queries by the complete primary key derived 
from a SELECT the MapReduce is iterating over 
(Mapper)
OR
2. Issuing DELETE mutations down to several HTables via MultiHFileOutputFormat 
from a DELETE the MapReduce is iterating over
(Mapper)

FormatToBytesWritableMapper relies heavily on a LineParser interface, and the 
only choices appear to be CsvLineParser, JsonLineParser, and RegexLineParser. 
That means that in either case the complete row key would have to be built by a 
new ResultSetLineParser that can take in a ResultSet and parse it into an 
intermediate form suitable for building either DELETE DML (Option 1) or Delete 
Mutations (Option 2). The former would just need to grab the row key 
components, while the latter would potentially need everything, because an 
index can be on any column. 

Also, either way, we need a concrete, generalized subclass of the abstract 
DBWritable. 

Option 1 seems considerably simpler and higher level, while Option 2 seems more 
efficient.
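Option 1's point deletes could be sketched, setting aside the Hadoop plumbing, as a helper that turns a table name and its primary-key columns into a parameterized DELETE statement; the class and method names below are hypothetical illustrations, not existing Phoenix API:

```java
import java.util.List;

class PointDeleteSql {
    // Build a parameterized point DELETE over the complete primary key,
    // e.g. DELETE FROM T WHERE PK1 = ? AND PK2 = ?
    // The mapper would bind each row's PK values from the iterated SELECT.
    static String buildDelete(String table, List<String> pkColumns) {
        StringBuilder sql = new StringBuilder("DELETE FROM ").append(table).append(" WHERE ");
        for (int i = 0; i < pkColumns.size(); i++) {
            if (i > 0) sql.append(" AND ");
            sql.append(pkColumns.get(i)).append(" = ?");
        }
        return sql.toString();
    }
}
```

Each mapper would prepare such a statement once and execute it per input row, committing in batches.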

> MapReduce Delete Support
> 
>
> Key: PHOENIX-4344
> URL: https://issues.apache.org/jira/browse/PHOENIX-4344
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.12.0
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
>
> Phoenix already has the ability to use MapReduce for asynchronous handling of 
> long-running SELECTs. It would be really useful to have this capability for 
> long-running DELETEs, particularly of tables with indexes where using HBase's 
> own MapReduce integration would be prohibitively complicated. 





[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4287:
--
Attachment: PHOENIX-4287_addendum5.patch

Thanks for the code snippet, [~jamestaylor]. Attached is the addendum along 
with a test.

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
>  Labels: localIndex
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, 
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, 
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_addendum5.patch, 
> PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, 
> PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +---+-++---+
> | PLAN
>   | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO |
> +---+-++---+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER 
> TABLE_T [1]  | 625043899   | 332170 | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY 
>   | 625043899   | 332170 | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW
>   | 625043899   | 332170 | 150792825 |
> +---+-++---+
> select count(*) from TABLE_T;
> +---+
> | COUNT(1)  |
> +---+
> | 0 |
> +---+
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--+-+++
> |   PLAN  
>  | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +--+-+++
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER 
> TABLE_T  | 438492470   | 332151 | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY 
>  | 438492470   | 332151 | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW
>  | 438492470   | 332151 | 1507928257617  |
> +--+-+++
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +---+
> | COUNT(1)  |
> +---+
> | 14|
> +---+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--+-++--+
> | PLAN | 
> EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +--+-++--+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null| 
> null   | null |
> | SERVER FILTER BY FIRST KEY ONLY  | null 
>| null   | null |
> | SERVER AGGREGATE INTO SINGLE ROW | null 
>| null   | null |
> +--+-++--+
> select /*+NO_INDEX*/ count(*) from 

[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236705#comment-16236705
 ] 

Hadoop QA commented on PHOENIX-3460:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895480/PHOENIX-3460-v2.patch
  against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61.
  ATTACHMENT ID: 12895480

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.PartialIndexRebuilderIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1607//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1607//console

This message is automatically generated.






[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236638#comment-16236638
 ] 

James Taylor commented on PHOENIX-4333:
---

Two other corner cases:
# Handle the case where there's a single region. In that case, we can use the 
time estimate from the single row we have in the gps table.
# Handle the case where there's a guidepost in the first region, but it's *before* 
the startKey. We'll need to tweak this loop to stop slightly sooner (when 
we're past the start key of the first region) so we know whether there's a 
guidepost in the first region. If we enter the loop, then we have a gps for that 
region. Note too that there are a couple of minor changes here that make sense to 
make, such as setting intersectWithGuidePosts and not checking the key length 
each time through the loop, since it's not changing.
{code}
int startRegionIndex = regionIndex;
boolean gpsForFirstRegion = false;
try {
    if (gpsSize > 0) {
        stream = new ByteArrayInputStream(guidePosts.get(), guidePosts.getOffset(), guidePosts.getLength());
        input = new DataInputStream(stream);
        decoder = new PrefixByteDecoder(gps.getMaxLength());
        try {
            // Wrap the region start key so it can be compared against the
            // decoded guideposts (a raw byte[] has neither getLength() nor
            // compareTo()).
            ImmutableBytesWritable firstRegionStartKey = new ImmutableBytesWritable(
                    regionLocations.get(regionIndex).getRegionInfo().getStartKey());
            if (firstRegionStartKey.getLength() > 0) {
                // Walk guideposts until we're past the first region start key
                while (firstRegionStartKey.compareTo(currentGuidePost =
                        PrefixByteCodec.decode(decoder, input)) >= 0) {
                    gpsForFirstRegion = true;
                    minGuidePostTimestamp = Math.min(estimateTs,
                            gps.getGuidePostTimestamps()[guideIndex]);
                    guideIndex++;
                }
                // Continue walking guideposts until we get past the currentKey
                while (currentKey.compareTo(currentGuidePost =
                        PrefixByteCodec.decode(decoder, input)) >= 0) {
                    minGuidePostTimestamp = Math.min(estimateTs,
                            gps.getGuidePostTimestamps()[guideIndex]);
                    guideIndex++;
                }
            }
        } catch (EOFException e) {
            // expected. Thrown when we have decoded all guide posts.
            intersectWithGuidePosts = false;
        }
    }
{code}
# Then we'll want to consider {{gpsForFirstRegion}} in our setting of 
{{gpsAvailableForAllRegions}}. This would be necessary if the currentKey (i.e. 
the start key) is after the gps, but before the endKey.
{code}
// We have a guide post in the region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the
// first guide post in that region).
gpsAvailableForAllRegions &=
    currentKeyBytes != initialKeyBytes ||
    (gpsForFirstRegion && regionIndex == startRegionIndex) ||
    (endKey == stopKey && // If not comparing against region boundary
        (endRegionKey.length == 0 || // then check if gp is in the region
         currentGuidePost.compareTo(endRegionKey) < 0));
{code}

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on tenant A view. Querying stats on tenant B view 
> yield partial results (only contains stats for B,1) which are incorrect even 
> though it shows updated timestamp as current.





[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-11-02 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236635#comment-16236635
 ] 

Karan Mehta commented on PHOENIX-4328:
--

bq. Would it be reasonable just to override the client configuration with the 
value instead of throwing an exception about inconsistent namespace mapping 
property?

We can't do that, since the properties are read from the 
{{ConnectionQueryServicesImpl}} {{properties}} object, which is an instance of 
the class {{ReadOnlyProps}}.

As [~jamestaylor] suggested, one potential option is to create a 
{{DelegateHTableInterface}}, since every call to 
{{SchemaUtil#getPhysicalName()}} is used to get an {{HTableInterface}}. 
This happens in the method {{CQSI#getTable()}}. This delegate class can have 
retry logic built in for certain types of tables (for example, SYSTEM tables). 
For the rest, we can bubble up the exception. 
We need to look in detail at potential corner cases here and at how they will 
affect the server and client side. 

[~twdsi...@gmail.com] FYI.
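The retry idea could look roughly like the following: try the lookup with the namespace-mapped name first, and fall back to the unmapped name on failure. All names here are hypothetical stand-ins for the Phoenix/HBase types, not actual API:

```java
import java.util.function.Function;

class SystemTableNameFallback {
    // Try the namespace-mapped SYSTEM table name first (e.g. SYS:CAT); on a
    // lookup failure (a RuntimeException here stands in for
    // TableNotFoundException), retry with the unmapped name
    // (e.g. SYSTEM.CATALOG). A real delegate would restrict this fallback to
    // SYSTEM tables and bubble up the exception for everything else.
    static <T> T lookupWithFallback(String mapped, String unmapped, Function<String, T> lookup) {
        try {
            return lookup.apply(mapped);
        } catch (RuntimeException e) {
            return lookup.apply(unmapped);
        }
    }
}
```

A delegate table class built this way would keep the retry logic in one place instead of scattering it across every SYSTEM-table call site.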

> Support clients having different "phoenix.schema.mapSystemTablesToNamespace" 
> property
> -
>
> Key: PHOENIX-4328
> URL: https://issues.apache.org/jira/browse/PHOENIX-4328
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Karan Mehta
>Priority: Major
>  Labels: namespaces
> Fix For: 4.13.0
>
>
> Imagine a scenario when we enable namespaces for phoenix on the server side 
> and set the property {{phoenix.schema.isNamespaceMappingEnabled}} to true. A 
> bunch of clients are trying to connect to this cluster. All of these clients 
> have {{phoenix.schema.isNamespaceMappingEnabled}} set to true; however, 
> for some of them {{phoenix.schema.mapSystemTablesToNamespace}} is set to 
> false while it is true for others. (A typical case for a rolling upgrade.)
> The first client with {{phoenix.schema.mapSystemTablesToNamespace}} true will 
> acquire lock in SYSMUTEX and migrate the system tables. As soon as this 
> happens, all the other clients will start failing. 
> There are two scenarios here.
> 1. A new client trying to connect to server without this property set
> This will fail, since ConnectionQueryServicesImpl checks whether SYSCAT is 
> namespace mapped or not. If there is a mismatch, it throws an exception, and 
> thus the client doesn't get any connection.
> 2. Clients already connected to cluster but don't have this property set
> This will fail because every query calls the endpoint coprocessor on SYSCAT 
> to determine the PTable of the query table and the physical HBase table name 
> is resolved based on the properties. Thus, we try to call the method on 
> SYSCAT instead of SYS:CAT and it results in a TableNotFoundException.
> This JIRA is to discuss the potential ways in which we can handle this 
> issue.
> Some ideas around this after discussing with [~twdsi...@gmail.com]:
> 1. Build retry logic around the code that works with SYSTEM tables 
> (coprocessor calls, etc.). Try with SYSCAT and, if it fails, try with SYS:CAT.
> Cons: Difficult to maintain, and code scattered all over. 
> 2. Use the SchemaUtil.getPhysicalTableName method to return the table name that 
> actually exists. (Only for SYSTEM tables.)
> Call admin.tableExists to determine if SYSCAT or SYS:CAT exists and return 
> that name. The client properties get ignored on this one. 
> Cons: Expensive call every time, since this method is always called several 
> times.
> [~jamestaylor] [~elserj] [~an...@apache.org] [~apurtell] 





[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236589#comment-16236589
 ] 

James Taylor commented on PHOENIX-4287:
---

Something like this function in BaseResultIterators, [~samarthjain]:
{code}
private boolean useStatsForParallelization(StatementContext context, PTable table) {
    Boolean useStats = table.useStatsForParallelization();
    if (useStats != null) {
        return useStats;
    }
    if (table.getType() == PTableType.INDEX) {
        // An index may not have the property set; fall back to its parent table.
        PhoenixConnection conn = context.getConnection();
        String parentTableName = table.getParentName().getString();
        try {
            PTable parentTable = conn.getTable(new PTableKey(conn.getTenantId(), parentTableName));
            useStats = parentTable.useStatsForParallelization();
            if (useStats != null) {
                return useStats;
            }
        } catch (TableNotFoundException e) {
            Log.warn("Unable to find parent table \"" + parentTableName + "\" of table \""
                    + table.getName().getString() + "\" to determine USE_STATS_FOR_PARALLELIZATION", e);
        }
    }
    return context.getConnection().getQueryServices().getConfiguration().getBoolean(
            USE_STATS_FOR_PARALLELIZATION, DEFAULT_USE_STATS_FOR_PARALLELIZATION);
}
{code}
Please add a test too.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236580#comment-16236580
 ] 

Samarth Jain commented on PHOENIX-4287:
---

Yes, that's correct. Will change the patch to fetch the property from the base 
table.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236574#comment-16236574
 ] 

Mujtaba Chohan commented on PHOENIX-4287:
-

Alter on index leads to:
{noformat}
ERROR 1010 (42M01): Not allowed to mutate table.
{noformat}


[jira] [Updated] (PHOENIX-4342) Surface QueryPlan in MutationPlan

2017-11-02 Thread Geoffrey Jacoby (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby updated PHOENIX-4342:
-
Attachment: PHOENIX-4342.patch

First cut at this patch. 

1. Created "DeleteMutationPlan" which extends MutationPlan and adds a 
getQueryPlan method
2. Refactored all the anonymous delete MutationPlan classes into real inner 
classes implementing DeleteMutationPlan
3. Changed DeleteCompiler.compile to return a DeleteMutationPlan.

If all MutationPlans contain a QueryPlan, we can dispense with 
DeleteMutationPlan and just add a getQueryPlan method to the base interface, 
but I didn't think that was the case. 

[~jamestaylor], fyi. 
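The shape of the refactor described above can be sketched with minimal, hypothetical stand-ins for Phoenix's compile-time plan interfaces:

```java
// Minimal stand-ins for Phoenix's plan interfaces (hypothetical, not the
// real signatures).
interface QueryPlan {}
interface MutationPlan {}

// The delete-specific sub-interface from the patch: it surfaces the
// QueryPlan the delete iterates over, without touching the base
// MutationPlan interface.
interface DeleteMutationPlan extends MutationPlan {
    QueryPlan getQueryPlan();
}

// A trivial implementation, as the refactored inner classes would provide.
class SimpleDeletePlan implements DeleteMutationPlan {
    private final QueryPlan queryPlan;
    SimpleDeletePlan(QueryPlan queryPlan) { this.queryPlan = queryPlan; }
    public QueryPlan getQueryPlan() { return queryPlan; }
}
```

Callers that compile a DELETE can then inspect the underlying QueryPlan at compile time, while other MutationPlan implementations are unaffected.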

> Surface QueryPlan in MutationPlan
> -
>
> Key: PHOENIX-4342
> URL: https://issues.apache.org/jira/browse/PHOENIX-4342
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Geoffrey Jacoby
>Priority: Minor
> Attachments: PHOENIX-4342.patch
>
>
> For DELETE statements, it'd be good to be able to get at the QueryPlan 
> through the MutationPlan so we can get more structured information at compile 
> time.





[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236557#comment-16236557
 ] 

James Taylor commented on PHOENIX-4287:
---

I thought we didn’t have a way of setting properties on indexes?


[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236547#comment-16236547
 ] 

Hadoop QA commented on PHOENIX-3460:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895471/PHOENIX-3460.patch
  against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61.
  ATTACHMENT ID: 12895471

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1606//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1606//console

This message is automatically generated.

> Namespace separator ":" should not be allowed in table or schema name
> -
>
> Key: PHOENIX-3460
> URL: https://issues.apache.org/jira/browse/PHOENIX-3460
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
> Environment: HDP 2.5
>Reporter: Xindian Long
>Assignee: Thomas D'Silva
>Priority: Major
>  Labels: namespaces, phoenix, spark
> Fix For: 4.13.0
>
> Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460-v2.patch, 
> PHOENIX-3460-v2.patch, PHOENIX-3460.patch, SchemaUtil.java
>
>
> I am testing some code that uses the Phoenix Spark plugin to read a Phoenix table 
> with a namespace prefix in the table name (the table is created as a Phoenix 
> table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there, because I can query it using plain Phoenix SQL 
> through SQuirreL. In addition, querying it with Spark SQL works without any 
> problem.
> I am running on the HDP 2.5 platform, with Phoenix 4.7.0.2.5.0.0-1245.
> The problem does not occur when running the same code on an HDP 2.4 
> cluster with Phoenix 4.4.
> Neither does the problem occur when I query a table without a namespace 
> prefix in the DB table name, on HDP 2.5.
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The odd thing is that, in the attached code, running testSpark alone gives 
> the above exception, but running testJdbc first followed by testSpark makes 
> both of them work.
> After changing the table creation to
> create table ACME.ENDPOINT_STATUS
> the phoenix-spark plugin seems to work. I also found some odd behavior:
> if I run both of the following
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> both tables show up in Phoenix: the first with schema ACME and table name 
> ENDPOINT_STATUS, the latter with no schema and table name ACME:ENDPOINT_STATUS.
> However, in HBase I see only one table, ACME:ENDPOINT_STATUS. In addition, 
> upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and 
> vice versa.
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236544#comment-16236544
 ] 

Samarth Jain commented on PHOENIX-4287:
---

USE_STATS_FOR_PARALLELIZATION can be set at the index/view/base-table level. For 
an index to use parallelization, you need to set USE_STATS_FOR_PARALLELIZATION = 
true on it; otherwise the default value is used (which in your case is false).
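The lookup order described here (an explicit property on the index/view/table, falling back to the global default when unset) can be sketched in plain Java. This is an illustration of the semantics only; the class and method names below are hypothetical, not Phoenix's actual API.

```java
import java.util.HashMap;
import java.util.Map;

public class StatsPropResolver {
    // Stands in for the cluster-wide default, e.g. phoenix.use.stats.parallelization
    static final boolean GLOBAL_DEFAULT = false;

    // tableProps maps a table/index name to its explicitly declared value,
    // or has no entry when the property was never set on that object.
    static boolean useStatsForParallelization(Map<String, Boolean> tableProps, String name) {
        Boolean declared = tableProps.get(name);
        // An explicit per-object setting wins; otherwise fall back to the default.
        return declared != null ? declared : GLOBAL_DEFAULT;
    }

    public static void main(String[] args) {
        Map<String, Boolean> props = new HashMap<>();
        props.put("TABLE_T", true); // set on the base table only
        // The index "IDX_T" is left unset, so it falls back to the global
        // default (false) -- matching the behavior reported in this thread.
        System.out.println(useStatsForParallelization(props, "TABLE_T")); // true
        System.out.println(useStatsForParallelization(props, "IDX_T"));  // false
    }
}
```

This is why setting the property only on the base table does not change how queries against the index are parallelized.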


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236538#comment-16236538
 ] 

Samarth Jain commented on PHOENIX-4287:
---

Just got back. Taking a look.

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
>  Labels: localIndex
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, 
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, 
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, 
> PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> | PLAN                                                                                   | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO   |
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1]  | 625043899       | 332170         | 150792825  |
> | SERVER FILTER BY FIRST KEY ONLY                                                        | 625043899       | 332170         | 150792825  |
> | SERVER AGGREGATE INTO SINGLE ROW                                                       | 625043899       | 332170         | 150792825  |
>
> select count(*) from TABLE_T;
> | COUNT(1)  |
> | 0         |
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                                              | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS    |
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | 438492470       | 332151         | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY                                                   | 438492470       | 332151         | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW                                                  | 438492470       | 332151         | 1507928257617  |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> | COUNT(1)  |
> | 14        |
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> | PLAN                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null            | null           | null         |
> | SERVER FILTER BY FIRST KEY ONLY                       | null            | null           | null         |
> | SERVER AGGREGATE INTO SINGLE ROW                      | null            | null           | null         |
>
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> | COUNT(1)  |
> | 27        |
> {noformat}


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236516#comment-16236516
 ] 

James Taylor commented on PHOENIX-4287:
---

[~mujtabachohan] - can you confirm whether or not the index has stats collected 
for it?

[~samarthjain] - you'll need to check the right table if an index is being used 
(with some special logic for an index on a view and for a local index). You need 
to trace back to the physical parent table.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236498#comment-16236498
 ] 

Mujtaba Chohan commented on PHOENIX-4287:
-

Just figured that out as well :) That index was created with a previous version 
and had {{USE_STATS_FOR_PARALLELIZATION}} set to true, causing the index to use 
parallelization. 

Here's a case that still doesn't work with a table created with the latest version:
Global {{USE_STATS_FOR_PARALLELIZATION}} is set to *false*.
Table and global index are created without setting parallelization.
Table parallelization is then set to *true* via ALTER ... SET 
USE_STATS_FOR_PARALLELIZATION=true. Verified in SYSTEM.CATALOG that it is set 
for the base table only.
Parallelization is *not* used when queries are executed against the *index*; 
parallelization is correctly used for the base table.





[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236475#comment-16236475
 ] 

James Taylor commented on PHOENIX-3460:
---

+1 assuming unit tests were run and are passing.



[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236468#comment-16236468
 ] 

James Taylor commented on PHOENIX-4287:
---

One more question, [~mujtabachohan]. Prior versions of this patch were 
mistakenly writing the USE_STATS_FOR_PARALLELIZATION value into the table 
metadata, even when it wasn't set. Is your testing using new tables so that 
this doesn't impact you? You can query the SYSTEM.CATALOG directly for the 
table & index to see if there's a value for USE_STATS_FOR_PARALLELIZATION. If 
there is, this prior issue may be affecting you. If you create a new table and 
index and you see a value, there's definitely still an issue.


[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3460:

Attachment: PHOENIX-3460-v2.patch



[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3460:

Attachment: PHOENIX-3460-v2.patch

Thanks for the review, attaching a v2 patch that uses 
QueryConstants.NAMESPACE_SEPARATOR



[jira] [Comment Edited] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236460#comment-16236460
 ] 

Thomas D'Silva edited comment on PHOENIX-3460 at 11/2/17 7:36 PM:
--

Thanks for the review, attaching a v2 patch that uses 
QueryConstants.NAMESPACE_SEPARATOR


was (Author: tdsilva):
Thanks for the view, attaching a v2 patch that uses 
QueryConstants.NAMESPACE_SEPARATOR



[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236458#comment-16236458
 ] 

James Taylor commented on PHOENIX-4287:
---

So something like this:
- Set {{phoenix.use.stats.parallelization}} to false
- Create table: CREATE TABLE t (k VARCHAR PRIMARY KEY, v VARCHAR);
- Create index: CREATE INDEX idx ON t(v);
- Execute query: SELECT v FROM t WHERE v='foo';
- Confirm through explain plan that a) index was used, and b) query isn't 
chunked up based on stats

I suppose we should also have a test that calls ALTER TABLE t SET 
USE_STATS_FOR_PARALLELIZATION=false and confirms that the query isn't chunked 
up based on stats either.

[~samarthjain] - I'm not seeing any tests around indexes for this. Did I miss 
them?



> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
>  Labels: localIndex
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, 
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, 
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, 
> PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +---+-++---+
> | PLAN
>   | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO |
> +---+-++---+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER 
> TABLE_T [1]  | 625043899   | 332170 | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY 
>   | 625043899   | 332170 | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW
>   | 625043899   | 332170 | 150792825 |
> +---+-++---+
> select count(*) from TABLE_T;
> +---+
> | COUNT(1)  |
> +---+
> | 0 |
> +---+
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--+-+++
> | PLAN | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +--+-+++
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | 438492470  | 332151  | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY  | 438492470  | 332151  | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW | 438492470  | 332151  | 1507928257617  |
> +--+-+++
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +---+
> | COUNT(1)  |
> +---+
> | 14|
> +---+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--+-++--+
> | PLAN | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +--+-++--+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null | null | null |
> | SERVER FILTER BY FIRST KEY ONLY  | null | null | null |
> | SERVER AGGREGATE INTO SINGLE ROW | null | null | null |
> +--+-++--+
> {noformat}

[jira] [Commented] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236441#comment-16236441
 ] 

James Taylor commented on PHOENIX-3460:
---

Patch looks fine, but one minor nit. It'd be a little more clear *why* we're 
disallowing ':' if you use the HBase constant here:
{code}
c.contains(TableName.NAMESPACE_DELIM)
{code}
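As a rough illustration of the suggested check, here is a minimal standalone sketch. The class and method names are hypothetical, and the delimiter constant is inlined to mirror HBase's {{TableName.NAMESPACE_DELIM}} (':') so the snippet compiles without HBase on the classpath:

```java
// Hypothetical sketch of the suggested validation: reject schema/table
// identifiers containing the HBase namespace delimiter. NAMESPACE_DELIM
// mirrors org.apache.hadoop.hbase.TableName.NAMESPACE_DELIM (':'), inlined
// here so this compiles without HBase on the classpath.
public class NamespaceDelimCheck {
    static final char NAMESPACE_DELIM = ':';

    /** Returns true if the identifier contains no namespace delimiter. */
    static boolean isValidIdentifier(String name) {
        return name.indexOf(NAMESPACE_DELIM) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidIdentifier("ENDPOINT_STATUS"));      // true
        System.out.println(isValidIdentifier("ACME:ENDPOINT_STATUS")); // false
    }
}
```

Using the named constant rather than a literal ':' makes it clear *why* the character is disallowed: it is HBase's namespace separator.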

> Namespace separator ":" should not be allowed in table or schema name
> -
>
> Key: PHOENIX-3460
> URL: https://issues.apache.org/jira/browse/PHOENIX-3460
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
> Environment: HDP 2.5
>Reporter: Xindian Long
>Assignee: Thomas D'Silva
>Priority: Major
>  Labels: namespaces, phoenix, spark
> Fix For: 4.13.0
>
> Attachments: 0001-Phoenix-fix.patch, PHOENIX-3460.patch, 
> SchemaUtil.java
>
>
> I am testing some code using the Phoenix Spark plugin to read a Phoenix table 
> with a namespace prefix in the table name (the table is created as a Phoenix 
> table, not an HBase table), but it returns a TableNotFoundException.
> The table is obviously there, because I can query it using plain Phoenix SQL 
> through Squirrel. In addition, using Spark SQL to query it causes no problem at 
> all.
> I am running on the HDP 2.5 platform, with Phoenix 4.7.0.2.5.0.0-1245.
> The problem does not exist at all when I run the same code on an HDP 2.4 
> cluster with Phoenix 4.4.
> Neither does the problem occur when I query a table without a namespace 
> prefix in the DB table name, on HDP 2.5.
> The log is in the attached file: tableNoFound.txt
> My testing code is also attached.
> The weird thing is that in the attached code, if I run testSpark alone it gives 
> the above exception, but if I run testJdbc first, followed by 
> testSpark, both of them work.
>  After changing to create the table using
> create table ACME.ENDPOINT_STATUS
> the phoenix-spark plugin seems to work. I also found some weird behavior:
> if I do both of the following
> create table ACME.ENDPOINT_STATUS ...
> create table "ACME:ENDPOINT_STATUS" ...
> both tables show up in Phoenix; the first one shows as schema ACME and table 
> name ENDPOINT_STATUS, and the latter shows as schema none and table name 
> ACME:ENDPOINT_STATUS.
> However, in HBase I only see one table, ACME:ENDPOINT_STATUS. In addition, 
> upserts into the table ACME.ENDPOINT_STATUS show up in the other table, and 
> vice versa.
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236431#comment-16236431
 ] 

Mujtaba Chohan edited comment on PHOENIX-4287 at 11/2/17 7:24 PM:
--

[~jamestaylor] At table creation time I didn't set 
USE_STATS_FOR_PARALLELIZATION, and as a global setting it's set to false. Later I 
turned it off as well with an ALTER statement. The index is global.
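For readers following the thread, the precedence being discussed (a table-level USE_STATS_FOR_PARALLELIZATION property, set at CREATE or via ALTER, overriding the global {{phoenix.use.stats.parallelization}} config) can be sketched as below. The names and resolution logic are illustrative assumptions, not Phoenix's actual internals:

```java
// Illustrative sketch (not Phoenix internals): the effective setting is the
// table-level property if one was declared via CREATE/ALTER TABLE, otherwise
// the global configuration default.
public class UseStatsResolution {
    static boolean effectiveUseStats(Boolean tableProperty, boolean globalDefault) {
        return tableProperty != null ? tableProperty : globalDefault;
    }

    public static void main(String[] args) {
        // The scenario above: no table property at creation, global = false.
        System.out.println(effectiveUseStats(null, false));  // false
        // After ALTER TABLE ... SET USE_STATS_FOR_PARALLELIZATION=false.
        System.out.println(effectiveUseStats(false, false)); // false
    }
}
```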


was (Author: mujtabachohan):
[~jamestaylor] At table creation time I didn't set 
USE_STATS_FOR_PARALLELIZATION and as global setting it's set to false. Index is 
global.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236431#comment-16236431
 ] 

Mujtaba Chohan commented on PHOENIX-4287:
-

[~jamestaylor] At table creation time I didn't set 
USE_STATS_FOR_PARALLELIZATION, and as a global setting it's set to false. The 
index is global.


[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3460:

Fix Version/s: (was: 4.7.0)
   4.13.0



[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3460:

Attachment: PHOENIX-3460.patch

HBase does not allow creating a table whose name contains the namespace 
separator, so we should not allow the namespace separator in the table or 
schema name either. Instead, we should throw a PhoenixParserException.

[~jamestaylor], can you please review?



[jira] [Updated] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3460:

Summary: Namespace separator ":" should not be allowed in table or schema 
name  (was: Phoenix Spark plugin cannot find table with a Namespace prefix)



[jira] [Assigned] (PHOENIX-3460) Namespace separator ":" should not be allowed in table or schema name

2017-11-02 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva reassigned PHOENIX-3460:
---

Assignee: Thomas D'Silva



[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236403#comment-16236403
 ] 

James Taylor commented on PHOENIX-4287:
---

[~mujtabachohan] - can you outline a simple unit test to make sure we have 
coverage for this? Is the base table initially set up with 
USE_STATS_FOR_PARALLELIZATION=true at creation time, and then later you alter 
the table to turn it off? Is the index a global index? The default, global 
setting for {{phoenix.use.stats.parallelization}} is false, right?


[jira] [Comment Edited] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236356#comment-16236356
 ] 

Mujtaba Chohan edited comment on PHOENIX-4287 at 11/2/17 6:39 PM:
--

[~samarthjain] With {{ALTER ... SET USE_STATS_FOR_PARALLELIZATION=false}} on the 
base table, and with the config also set to false globally, stats are correctly not 
used for parallelization when the query runs on the base table; however, for the 
index table they are still used. See the explain plan below. This is with 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a

{noformat}
ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=false;
explain select count(*) from T;
+--+-+++
| PLAN | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+--+-+++
| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY FULL SCAN OVER T_IDX  | 63050353  | 1161114  | 1509646993152  |
| SERVER FILTER BY FIRST KEY ONLY  | 63050353  | 1161114  | 1509646993152  |
| SERVER AGGREGATE INTO SINGLE ROW | 63050353  | 1161114  | 1509646993152  |
+--+-+++
{noformat}


was (Author: mujtabachohan):
[~samarthjain] With {{ALTER ...SET USE_STATS_FOR_PARALLELIZATION=false}} on 
base table and also config set to false globally, stats are correctly not used 
for parallelization when query runs on base table however on for index it is 
still used. See explain plan below. This is with 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a

{noformat}
ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=false;
explain select count(*) from T;
+--+-+++
|   PLAN
   | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+--+-+++
| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY FULL SCAN OVER 
T_IDX  | 63050353| 1161114| 1509646993152  |
| SERVER FILTER BY FIRST KEY ONLY   
   | 63050353| 1161114| 1509646993152  |
| SERVER AGGREGATE INTO SINGLE ROW  
   | 63050353| 1161114| 1509646993152  |
+--+-+++
{noformat}


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-11-02 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236356#comment-16236356
 ] 

Mujtaba Chohan commented on PHOENIX-4287:
-

[~samarthjain] With {{ALTER ... SET USE_STATS_FOR_PARALLELIZATION=false}} on the 
base table, and with the config also set to false globally, stats are correctly not 
used for parallelization when the query runs on the base table; however, for the 
index they are still used. See the explain plan below. This is with 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a

{noformat}
ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=false;
explain select count(*) from T;
+--+-+++
| PLAN | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+--+-+++
| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY FULL SCAN OVER T_IDX  | 63050353  | 1161114  | 1509646993152  |
| SERVER FILTER BY FIRST KEY ONLY  | 63050353  | 1161114  | 1509646993152  |
| SERVER AGGREGATE INTO SINGLE ROW | 63050353  | 1161114  | 1509646993152  |
+--+-+++
{noformat}


[jira] [Comment Edited] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236115#comment-16236115
 ] 

James Taylor edited comment on PHOENIX-4333 at 11/2/17 6:25 PM:


We really want to answer the question "Is there a guidepost within every 
region?". Whether a guidepost then intersects the scan is not the check we 
need. For example, you may have a query doing a skip scan which would fail the 
intersection test, but still have a guidepost in the region.

I think if you always set the endRegionKey (instead of only when it's a local 
index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means 
we entered the loop) or that the currentGuidePost is less than the region end 
key, then that's enough, since we know that the currentGuidePost is already 
bigger than the start region key. The check for endKey == stopKey is a small 
optimization, since we don't need to do the key comparison again if that's not 
the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in the region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the
// first guide post in that region).
gpsAvailableForAllRegions &= 
currentKeyBytes != initialKeyBytes || 
( endKey == stopKey && // If not comparing against region boundary
  ( endRegionKey.length == 0 || // then check if gp is in the region
currentGuidePost.compareTo(endRegionKey) < 0) );
{code}

Does this not pass all of your tests?
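The per-region condition described above can be sketched in isolation. This is a simplified illustration, not the actual patch: the {{endKey == stopKey}} optimization is omitted, and Java's Arrays.compareUnsigned stands in for HBase's Bytes.compareTo:

```java
import java.util.Arrays;

// Simplified sketch of the "is there a guidepost in this region?" condition
// described above. Arrays.compareUnsigned (Java 9+) stands in for HBase's
// Bytes.compareTo; the endKey == stopKey optimization is omitted.
public class GuidePostCheck {
    static boolean regionHasGuidePost(boolean innerLoopEntered,
                                      byte[] endRegionKey,
                                      byte[] currentGuidePost) {
        // A guidepost exists in the region if the inner loop advanced, or the
        // current guidepost still precedes the region end key (an empty end
        // key means the last region, which extends to the end of the table).
        return innerLoopEntered
            || endRegionKey.length == 0
            || Arrays.compareUnsigned(currentGuidePost, endRegionKey) < 0;
    }
}
```

Since the outer loop already guarantees the current guidepost is past the region start key, checking it against the region end key is enough to place it inside the region.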


was (Author: jamestaylor):
We really want to answer the question "Is there a guidepost within every 
region?". Whether a guidepost then intersects the scan is not the check we 
need. For example, you may have a query doing a skip scan which would fail the 
intersection test, but still have a guidepost in the region.

I think if you always set the endRegionKey (instead of only when it's a local 
index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means 
we entered the loop) or that the currentGuidePost is less than the region end 
key, then that's enough, since we know that the currentGuidePost is already 
bigger than the start region key. The check for endKey == stopKey is a small 
optimization, since we don't need to do the key comparison again if that's not 
the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop 
was entered
// or if the current key is less than the region end key (since 
the loop
// may not have been entered if our scan end key is smaller 
than the first
// guide post in that region
gpsAvailableForAllRegions &= currentKeyBytes != initialKeyBytes 
|| 
(endKey == stopKey && (endRegionKey.length == 0 || 
currentGuidePost.compareTo(endRegionKey) < 0));
{code}

Does this not pass all of your tests?

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B
> view yields partial results (containing only stats for B,1), which are
> incorrect even though the updated timestamp shows as current.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236115#comment-16236115
 ] 

James Taylor edited comment on PHOENIX-4333 at 11/2/17 6:19 PM:


We really want to answer the question "Is there a guidepost within every 
region?". Whether a guidepost then intersects the scan is not the check we 
need. For example, you may have a query doing a skip scan which would fail the 
intersection test, but still have a guidepost in the region.

I think if you always set the endRegionKey (instead of only when it's a local 
index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means 
we entered the loop) or that the currentGuidePost is less than the region end 
key, then that's enough, since we know that the currentGuidePost is already 
bigger than the start region key. The check for endKey == stopKey is a small 
optimization, since we don't need to do the key comparison again if that's not 
the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the first
// guide post in that region)
gpsAvailableForAllRegions &= currentKeyBytes != initialKeyBytes ||
        (endKey == stopKey && (endRegionKey.length == 0 ||
                currentGuidePost.compareTo(endRegionKey) < 0));
{code}

Does this not pass all of your tests?


was (Author: jamestaylor):
We really want to answer the question "Is there a guidepost within every 
region?". Whether a guidepost then intersects the scan is not the check we 
need. For example, you may have a query doing a skip scan which would fail the 
intersection test, but still have a guidepost in the region.

I think if you always set the endRegionKey (instead of only when it's a local 
index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means 
we entered the loop) or that the currentGuidePost is less than the region end 
key, then that's enough, since we know that the currentGuidePost is already 
bigger than the start region key. The check for endKey == stopKey is a small 
optimization, since we don't need to do the key comparison again if that's not 
the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the first
// guide post in that region)
hasGuidePostInAllRegions &= currentKeyBytes != initialKeyBytes
        || (endKey == stopKey && currentGuidePost.compareTo(endRegionKey) < 0);
{code}

Does this not pass all of your tests?

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B
> view yields partial results (containing only stats for B,1), which are
> incorrect even though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236313#comment-16236313
 ] 

James Taylor commented on PHOENIX-4333:
---

Also, looking at ExplainPlanWithStatsEnabledIT.testSelectQueriesWithFilters(), 
the region boundaries are not going to intersect the guideposts as expected, 
since the split points are using raw bytes which won't have the sign bit 
flipped. Below is what you want to do instead, as Phoenix will then do the 
right thing with respect to data types. Some other tests need to be changed as 
well - I'd recommend always having the SPLIT clause in the CREATE TABLE 
statement as it's clearer.
{code}
private void testSelectQueriesWithFilters(boolean useStatsForParallelization) throws Exception {
    String tableName = generateUniqueName();
    try (Connection conn = DriverManager.getConnection(getUrl())) {
        int guidePostWidth = 20;
        String ddl =
                "CREATE TABLE " + tableName + " (k INTEGER PRIMARY KEY, a bigint, b bigint) "
                        + " GUIDE_POSTS_WIDTH=" + guidePostWidth
                        + ", USE_STATS_FOR_PARALLELIZATION=" + useStatsForParallelization
                        + " SPLIT ON (102,105,108)";
        conn.createStatement().execute(ddl);
        conn.createStatement().execute("upsert into " + tableName + " values (100,100,3)");
{code}
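The sign-bit point can be shown with a small standalone sketch. This is a hypothetical illustration (the {{encode}} and {{compareUnsigned}} helpers are invented; this is not Phoenix's actual PInteger serialization): flipping the sign bit makes signed integers sort correctly as unsigned byte arrays, which is why raw-byte split points don't line up with Phoenix-encoded row keys.

```java
import java.nio.ByteBuffer;

public class SignBitDemo {
    // Hypothetical sketch: encode an int so its big-endian bytes sort
    // correctly as unsigned values, by flipping the sign bit first.
    static byte[] encode(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Lexicographic unsigned byte comparison, as HBase compares row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < 4; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return 0;
    }

    public static void main(String[] args) {
        // With the sign bit flipped, -1 correctly sorts before 102
        if (compareUnsigned(encode(-1), encode(102)) >= 0) {
            throw new AssertionError("encoded -1 should sort before 102");
        }
        // Raw big-endian bytes of -1 would (wrongly) sort after 102
        byte[] rawNeg = ByteBuffer.allocate(4).putInt(-1).array();
        byte[] rawPos = ByteBuffer.allocate(4).putInt(102).array();
        if (compareUnsigned(rawNeg, rawPos) <= 0) {
            throw new AssertionError("raw -1 should (wrongly) sort after 102");
        }
        System.out.println("ok");
    }
}
```

This is why splitting through the SPLIT ON clause, which goes through Phoenix's type encoding, gives boundaries that line up with the guideposts, while pre-splitting with raw bytes does not.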

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B
> view yields partial results (containing only stats for B,1), which are
> incorrect even though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236122#comment-16236122
 ] 

Hadoop QA commented on PHOENIX-4346:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895377/PHOENIX-4346.patch
  against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61.
  ATTACHMENT ID: 12895377

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.monitoring.PhoenixMetricsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ColumnEncodedMutableTxStatsCollectorIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1604//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1604//console

This message is automatically generated.

> Add support for UNSIGNED_LONG type in Pherf scenarios
> -
>
> Key: PHOENIX-4346
> URL: https://issues.apache.org/jira/browse/PHOENIX-4346
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Monani Mihir
>Priority: Minor
> Attachments: PHOENIX-4346.patch
>
>
> Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would
> be good to have UNSIGNED_LONG available for Pherf scenarios.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236115#comment-16236115
 ] 

James Taylor commented on PHOENIX-4333:
---

We really want to answer the question "Is there a guidepost within every 
region?". Whether a guidepost then intersects the scan is not the check we 
need. For example, you may have a query doing a skip scan which would fail the 
intersection test, but still have a guidepost in the region.

I think if you always set the endRegionKey (instead of only when it's a local 
index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means 
we entered the loop) or that the currentGuidePost is less than the region end 
key, then that's enough, since we know that the currentGuidePost is already 
bigger than the start region key. The check for endKey == stopKey is a small 
optimization, since we don't need to do the key comparison again if that's not 
the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the first
// guide post in that region)
hasGuidePostInAllRegions &= currentKeyBytes != initialKeyBytes
        || (endKey == stopKey && currentGuidePost.compareTo(endRegionKey) < 0);
{code}

Does this not pass all of your tests?

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B
> view yields partial results (containing only stats for B,1), which are
> incorrect even though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4332) Indexes should inherit guide post width of the base data table

2017-11-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236045#comment-16236045
 ] 

Hudson commented on PHOENIX-4332:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1860 (See 
[https://builds.apache.org/job/Phoenix-master/1860/])
PHOENIX-4332 Indexes should inherit guide post width of the base data (samarth: 
rev 61684c4431d16deff53adfbb91ea76c13642df61)
* (add) 
phoenix-core/src/it/java/org/apache/phoenix/schema/stats/StatsCollectorIT.java
* (delete) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/StatsCollectorIT.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedMutableNonTxStatsCollectorIT.java
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/schema/stats/DefaultStatisticsCollector.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/NonColumnEncodedImmutableNonTxStatsCollectorIT.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedImmutableTxStatsCollectorIT.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedImmutableNonTxStatsCollectorIT.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/ColumnEncodedMutableTxStatsCollectorIT.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/SysTableNamespaceMappedStatsCollectorIT.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/NonColumnEncodedImmutableTxStatsCollectorIT.java


> Indexes should inherit guide post width of the base data table
> --
>
> Key: PHOENIX-4332
> URL: https://issues.apache.org/jira/browse/PHOENIX-4332
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4332.patch
>
>
> Altering the guide post width on a data table does not propagate to its
> global index when using the {{ALTER TABLE}} command.
> Altering the global index table directly runs into a not-allowed error.
> {noformat}
> ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1;
> Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop 
> column referenced by VIEW columnName=IDX (state=42M01,code=1010)
> {noformat}





[jira] [Commented] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue

2017-11-02 Thread Josh Mahonin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235907#comment-16235907
 ] 

Josh Mahonin commented on PHOENIX-4347:
---

Can you post this question to the phoenix-users mailing list? I suspect someone 
may have run into this and found a way to do it already. However, if you're 
able to provide a reproducible unit test in PhoenixSparkIT [ 1 ] which 
necessitates a patch, a contribution would be most welcome.

Thanks! 

[ 1 ] 
https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala

> Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
> 
>
> Key: PHOENIX-4347
> URL: https://issues.apache.org/jira/browse/PHOENIX-4347
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
> Environment: CentOS 6.5, Fedora 25
>Reporter: Lokesh Kumar
>Priority: Major
>  Labels: phoenix, spark-sql
>
> Created a Phoenix table with below schema:
> {code:java}
> CREATE TABLE IF NOT EXISTS sample_table (
>   id VARCHAR NOT NULL, 
>   metricid VARCHAR NOT NULL,
>   timestamp TIMESTAMP NOT NULL,
>   metricvalue DOUBLE, 
>   CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
> {code}
> Inserted some data into this and loaded as Spark Dataset using the Phoenix 
> spark datasource  ('org.apache.phoenix.spark') options.
> The Spark Dataset's schema is as given below:
> root
>  |-- ID: string (nullable = true)
>  |-- METRICID: string (nullable = true)
>  |-- TIMESTAMP: timestamp (nullable = true)
>  |-- METRICVALUE: double (nullable = true)
> I apply the Dataset's filter operation on Timestamp column as given below:
> {code:java}
> Dataset ds = 
> ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)")
> {code}
> This operation throws an exception:
>  testPhoenixTimestamp(DatasetTest): 
> org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): 
> Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, 
> column 145.
> The generated query looks like this:
> {code:java}
> 2017-11-02 15:29:31,722 INFO  [main] 
> org.apache.phoenix.mapreduce.PhoenixInputFormat
> Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM 
> SAMPLE_TABLE WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 
> 00:00:00.0)
> {code}
> The issue is with Timestamp filter condition, where the timestamp value is 
> not wrapped in to_timestamp() function.
> I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class 
> compileValue() function, by checking the value's class. If it is 
> java.sql.Timestamp then I am wrapping the value with to_timestamp() function.
> Please let me know if there is another way of correctly querying Timestamp 
> values in Phoenix through Spark's Dataset API.





[jira] [Commented] (PHOENIX-4303) Replace HTableInterface,HConnection with Table,Connection interfaces respectively

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235892#comment-16235892
 ] 

Hadoop QA commented on PHOENIX-4303:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895387/PHOENIX-4303.patch
  against master branch at commit 61684c4431d16deff53adfbb91ea76c13642df61.
  ATTACHMENT ID: 12895387

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 15 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1605//console

This message is automatically generated.

> Replace HTableInterface,HConnection with Table,Connection interfaces 
> respectively
> -
>
> Key: PHOENIX-4303
> URL: https://issues.apache.org/jira/browse/PHOENIX-4303
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
>  Labels: HBase-2.0
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4303.patch
>
>
> In latest versions of HBase HTableInterface,HConnection are replaced with 
> Table and Connection respectively. We can make use of new interfaces.





Re: [VOTE] First hbase-2.0.0-alpha4 Release Candidate is available

2017-11-02 Thread Peter Somogyi
+1 (non-binding), found 1 issue.

Checked signatures, sums - OK
Built from source tar and git tag (Oracle JDK 1.8.0_151, Maven 3.5.2) - OK
Rat check - OK
Starting standalone server from bin tar - OK
LTT with 1M rows - OK
LICENSE, NOTICE - OK

Problem:
Starting standalone server after building from src tar fails: same problem
that Guanghao Zhang had. HBASE-18705

On Wed, Nov 1, 2017 at 3:17 PM, Stack  wrote:

> The first release candidate for HBase 2.0.0-alpha4 is up at:
>
>   https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-alpha4RC0/
>
> Maven artifacts are available from a staging directory here:
>
>   https://repository.apache.org/content/repositories/orgapachehbase-1178
>
> All was signed with my key at 8ACC93D2 [1]
>
> I tagged the RC as 2.0.0-alpha4RC0
> (5c4b985f89c99cc8b0f8515a4097c811a0848835)
>
> hbase-2.0.0-alpha4 is our fourth alpha release along our march toward
> hbase-2.0.0. It includes all that was in previous alphas (new assignment
> manager, offheap read/write path, in-memory compactions, etc.), but had a
> focus on "Coprocessor Fixup": We no longer pass Coprocessors
> InterfaceAudience.Private parameters and we cut down on the access and
> ability to influence hbase core processing (See [2] on why the radical
> changes in Coprocessor Interface). If you are a Coprocessor developer or
> have Coprocessors to deploy on hbase-2.0.0, we need to hear about your
> experience now before we make an hbase-2.0.0 beta.
>
> hbase-2.0.0-alpha4 is a rough cut ('alpha'), not-for-production preview of
> what hbase-2.0.0 will look like. It is meant for devs and downstreamers to
> test drive and flag us early if we messed up anything ahead of our rolling
> GAs.
>
> The list of features addressed in 2.0.0 so far can be found here [3]. There
> are thousands. The list of ~2k+ fixes in 2.0.0 exclusively can be found
> here [4] (My JIRA JQL foo is a bit dodgy -- forgive me if mistakes).
>
> I've updated our overview doc. on the state of 2.0.0 [6]. 2.0.0-beta-1 will
> be our next release. Its theme is the "Finishing up 2.0.0" release. Here is
> the list of what we have targeted for beta-1 [5]. Check it out. Shout if
> there is anything missing. We may do a 2.0.0-beta-2 if there's a need. We'll see.
>
> Please take this alpha for a spin especially if you are a Coprocessor
> developer or have a Coprocessor you want to deploy on hbase-2.0.0. Please
> vote on whether it's ok to put out this RC as our first alpha (bar is low for
> an 'alpha' -- e.g. CHANGES.txt has not been updated). Let the VOTE be open
> for 72 hours (Saturday)
>
> Thanks,
> Your 2.0.0 Release Manager
>
> 1. http://pgp.mit.edu/pks/lookup?op=get=0x9816C7FC8ACC93D2
> 2. Why CPs are Incompatible:
> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_
> ktczrlKHK8N4SZzs/edit#heading=h.9k7mjbauv0wj
> 3. https://goo.gl/scYjJr
> 4. https://goo.gl/tMHkYS
> 5. https://issues.apache.org/jira/projects/HBASE/versions/12340861
> 6.
> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_
> ktczrlKHK8N4SZzs/
>


[jira] [Commented] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios

2017-11-02 Thread Monani Mihir (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235768#comment-16235768
 ] 

Monani Mihir commented on PHOENIX-4346:
---

[~mujtabachohan] can you review this?

> Add support for UNSIGNED_LONG type in Pherf scenarios
> -
>
> Key: PHOENIX-4346
> URL: https://issues.apache.org/jira/browse/PHOENIX-4346
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Monani Mihir
>Priority: Minor
> Attachments: PHOENIX-4346.patch
>
>
> Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would
> be good to have UNSIGNED_LONG available for Pherf scenarios.





Re: rename 5.0-HBase-2.0 branch to 5.x-HBase-2.0

2017-11-02 Thread Ankit Singhal
Agreed James.. Thanks Josh for renaming it.

On Thu, Nov 2, 2017 at 4:18 AM, Josh Elser  wrote:

> Just went ahead and did it. No problem from my POV.
>
>
> On 10/31/17 1:54 PM, James Taylor wrote:
>
>> I propose we rename the 5.0-HBase-2.0 branch to 5.x-HBase-2.0 so that we
>> can do all 5.x based releases from this branch similar to the way we do
>> for
>> 4.x-HBase-###.
>>
>>


[jira] [Updated] (PHOENIX-4303) Replace HTableInterface,HConnection with Table,Connection interfaces respectively

2017-11-02 Thread Rajeshbabu Chintaguntla (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajeshbabu Chintaguntla updated PHOENIX-4303:
-
Attachment: PHOENIX-4303.patch

> Replace HTableInterface,HConnection with Table,Connection interfaces 
> respectively
> -
>
> Key: PHOENIX-4303
> URL: https://issues.apache.org/jira/browse/PHOENIX-4303
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
>  Labels: HBase-2.0
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4303.patch
>
>
> In latest versions of HBase HTableInterface,HConnection are replaced with 
> Table and Connection respectively. We can make use of new interfaces.





[jira] [Updated] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue

2017-11-02 Thread Lokesh Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Kumar updated PHOENIX-4347:
--
Description: 
Created a Phoenix table with below schema:

{code:java}
CREATE TABLE IF NOT EXISTS sample_table (
  id VARCHAR NOT NULL, 
  metricid VARCHAR NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  metricvalue DOUBLE, 
  CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
{code}

Inserted some data into this and loaded as Spark Dataset using the Phoenix 
spark datasource  ('org.apache.phoenix.spark') options.

The Spark Dataset's schema is as given below:

root
 |-- ID: string (nullable = true)
 |-- METRICID: string (nullable = true)
 |-- TIMESTAMP: timestamp (nullable = true)
 |-- METRICVALUE: double (nullable = true)

I apply the Dataset's filter operation on Timestamp column as given below:

{code:java}
Dataset ds = 
ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)")
{code}

This operation throws an exception:

 testPhoenixTimestamp(DatasetTest): 
org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax 
error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145.

The generated query looks like this:

{code:java}
2017-11-02 15:29:31,722 INFO  [main] 
org.apache.phoenix.mapreduce.PhoenixInputFormat
Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM 
SAMPLE_TABLE WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 
00:00:00.0)
{code}

The issue is with Timestamp filter condition, where the timestamp value is not 
wrapped in to_timestamp() function.

I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class 
compileValue() function, by checking the value's class. If it is 
java.sql.Timestamp then I am wrapping the value with to_timestamp() function.

Please let me know if there is another way of correctly querying Timestamp 
values in Phoenix through Spark's Dataset API.

  was:
Created a Phoenix table with below schema:

{code:java}
CREATE TABLE IF NOT EXISTS sample_table (
  id VARCHAR NOT NULL, 
  metricid VARCHAR NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  metricvalue DOUBLE, 
  CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
{code}

Inserted some data into this and loaded as Spark Dataset using the Phoenix 
spark datasource  ('org.apache.phoenix.spark') options.

The Spark Dataset's schema is as given below:

root
 |-- ID: string (nullable = true)
 |-- METRICID: string (nullable = true)
 |-- TIMESTAMP: timestamp (nullable = true)
 |-- METRICVALUE: double (nullable = true)

I apply the Dataset's filter operation on Timestamp column as given below:

{code:java}
Dataset ds = 
ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)")
{code}

This operation throws an exception:

 testPhoenixTimestamp(DatasetTest): 
org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax 
error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145.

The generated query looks like this:

{code:java}
2017-11-02 15:29:31,722 INFO  [main] 
org.apache.phoenix.mapreduce.PhoenixInputFormat
Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM 
METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 
00:00:00.0)
{code}

The issue is with Timestamp filter condition, where the timestamp value is not 
wrapped in to_timestamp() function.

I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class 
compileValue() function, by checking the value's class. If it is 
java.sql.Timestamp then I am wrapping the value with to_timestamp() function.

Please let me know if there is another way of correctly querying Timestamp 
values in Phoenix through Spark's Dataset API.


> Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
> 
>
> Key: PHOENIX-4347
> URL: https://issues.apache.org/jira/browse/PHOENIX-4347
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
> Environment: CentOS 6.5, Fedora 25
>Reporter: Lokesh Kumar
>Priority: Major
>  Labels: phoenix, spark-sql
>
> Created a Phoenix table with below schema:
> {code:java}
> CREATE TABLE IF NOT EXISTS sample_table (
>   id VARCHAR NOT NULL, 
>   metricid VARCHAR NOT NULL,
>   timestamp TIMESTAMP NOT NULL,
>   metricvalue DOUBLE, 
>   CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
> {code}
> Inserted some data into this and loaded as Spark Dataset using the Phoenix 
> spark datasource  ('org.apache.phoenix.spark') options.
> The Spark Dataset's schema is as given below:
> root
>  |-- ID: string (nullable = true)
>  |-- METRICID: string (nullable = true)
>  |-- TIMESTAMP: 

[jira] [Updated] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue

2017-11-02 Thread Lokesh Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Kumar updated PHOENIX-4347:
--
Description: 
Created a Phoenix table with below schema:

{code:java}
CREATE TABLE IF NOT EXISTS sample_table (
  id VARCHAR NOT NULL, 
  metricid VARCHAR NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  metricvalue DOUBLE, 
  CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
{code}

Inserted some data into this and loaded as Spark Dataset using the Phoenix 
spark datasource  ('org.apache.phoenix.spark') options.

The Spark Dataset's schema is as given below:

root
 |-- ID: string (nullable = true)
 |-- METRICID: string (nullable = true)
 |-- TIMESTAMP: timestamp (nullable = true)
 |-- METRICVALUE: double (nullable = true)

I apply the Dataset's filter operation on Timestamp column as given below:

{code:java}
Dataset ds = 
ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)")
{code}

This operation throws an exception:

 testPhoenixTimestamp(DatasetTest): 
org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax 
error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145.

The generated query looks like this:

{code:java}
2017-11-02 15:29:31,722 INFO  [main] 
org.apache.phoenix.mapreduce.PhoenixInputFormat
Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM 
METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 
00:00:00.0)
{code}

The issue is with Timestamp filter condition, where the timestamp value is not 
wrapped in to_timestamp() function.

I have fixed this locally in org.apache.phoenix.spark.PhoenixRelation class 
compileValue() function, by checking the value's class. If it is 
java.sql.Timestamp then I am wrapping the value with to_timestamp() function.

Please let me know if there is another way of correctly querying Timestamp 
values in Phoenix through Spark's Dataset API.

  was:
Created a Phoenix table with below schema:

{code:java}
CREATE TABLE IF NOT EXISTS sample_table (
  id VARCHAR NOT NULL, 
  metricid VARCHAR NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  metricvalue DOUBLE, 
  CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
{code}

Inserted some data into this and loaded as Spark Dataset using the Phoenix 
spark datasource  ('org.apache.phoenix.spark') options.

The Spark Dataset's schema is as given below:

root
 |-- ID: string (nullable = true)
 |-- METRICID: string (nullable = true)
 |-- TIMESTAMP: timestamp (nullable = true)
 |-- METRICVALUE: double (nullable = true)

I apply the Dataset's filter operation on the Timestamp column as shown below:

{code:java}
Dataset<Row> ds = ...; // loaded via the Phoenix Spark datasource
ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)")
{code}

This operation throws the following exception:

 testPhoenixTimestamp(DatasetTest): 
org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax 
error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145.

The generated query looks like this:

{code:java}
2017-11-02 15:29:31,722 INFO  [main] 
org.apache.phoenix.mapreduce.PhoenixInputFormat
Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM 
METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= *2017-10-31 
00:00:00.0*)
{code}

The issue is highlighted in bold above: the timestamp value is not wrapped in 
the to_timestamp() function.

I have fixed this locally in the org.apache.phoenix.spark.PhoenixRelation 
class's compileValue() function by checking the value's class: if it is 
java.sql.Timestamp, I wrap the value with the to_timestamp() function.

Please let me know if there is another way of correctly querying Timestamp 
values in Phoenix through Spark's Dataset API.


> Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
> 
>
> Key: PHOENIX-4347
> URL: https://issues.apache.org/jira/browse/PHOENIX-4347
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
> Environment: CentOS 6.5, Fedora 25
>Reporter: Lokesh Kumar
>Priority: Major
>  Labels: phoenix, spark-sql
>
> Created a Phoenix table with below schema:
> {code:java}
> CREATE TABLE IF NOT EXISTS sample_table (
>   id VARCHAR NOT NULL, 
>   metricid VARCHAR NOT NULL,
>   timestamp TIMESTAMP NOT NULL,
>   metricvalue DOUBLE, 
>   CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
> {code}
> Inserted some data into this table and loaded it as a Spark Dataset using the 
> Phoenix Spark datasource ('org.apache.phoenix.spark').
> The Spark Dataset's schema is as given below:
> root
>  |-- ID: string (nullable = true)
>  |-- METRICID: string (nullable = true)
>  |-- TIMESTAMP: timestamp 

[jira] [Created] (PHOENIX-4347) Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue

2017-11-02 Thread Lokesh Kumar (JIRA)
Lokesh Kumar created PHOENIX-4347:
-

 Summary: Spark Dataset loaded using Phoenix Spark Datasource - 
Timestamp filter issue
 Key: PHOENIX-4347
 URL: https://issues.apache.org/jira/browse/PHOENIX-4347
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.11.0
 Environment: CentOS 6.5, Fedora 25
Reporter: Lokesh Kumar
Priority: Major


Created a Phoenix table with below schema:

{code:java}
CREATE TABLE IF NOT EXISTS sample_table (
  id VARCHAR NOT NULL, 
  metricid VARCHAR NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  metricvalue DOUBLE, 
  CONSTRAINT st_pk PRIMARY KEY(id,metricid,timestamp)) SALT_BUCKETS = 20;
{code}

Inserted some data into this table and loaded it as a Spark Dataset using the 
Phoenix Spark datasource ('org.apache.phoenix.spark').

The Spark Dataset's schema is as given below:

root
 |-- ID: string (nullable = true)
 |-- METRICID: string (nullable = true)
 |-- TIMESTAMP: timestamp (nullable = true)
 |-- METRICVALUE: double (nullable = true)

I apply the Dataset's filter operation on the Timestamp column as shown below:

{code:java}
Dataset<Row> ds = ...; // loaded via the Phoenix Spark datasource
ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)")
{code}

This operation throws the following exception:

 testPhoenixTimestamp(DatasetTest): 
org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax 
error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145.

The generated query looks like this:

{code:java}
2017-11-02 15:29:31,722 INFO  [main] 
org.apache.phoenix.mapreduce.PhoenixInputFormat
Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM 
METRIC_TBR_DATA WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= *2017-10-31 
00:00:00.0*)
{code}

The issue is highlighted in bold above: the timestamp value is not wrapped in 
the to_timestamp() function.

I have fixed this locally in the org.apache.phoenix.spark.PhoenixRelation 
class's compileValue() function by checking the value's class: if it is 
java.sql.Timestamp, I wrap the value with the to_timestamp() function.

Please let me know if there is another way of correctly querying Timestamp 
values in Phoenix through Spark's Dataset API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios

2017-11-02 Thread Monani Mihir (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Monani Mihir updated PHOENIX-4346:
--
Attachment: PHOENIX-4346.patch

Patch for master which adds support for UNSIGNED_LONG type for Pherf scenarios.

> Add support for UNSIGNED_LONG type in Pherf scenarios
> -
>
> Key: PHOENIX-4346
> URL: https://issues.apache.org/jira/browse/PHOENIX-4346
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Monani Mihir
>Priority: Minor
> Attachments: PHOENIX-4346.patch
>
>
> Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would 
> be good to have UNSIGNED_LONG available for Pherf scenarios.





[jira] [Created] (PHOENIX-4346) Add support for UNSIGNED_LONG type in Pherf scenarios

2017-11-02 Thread Monani Mihir (JIRA)
Monani Mihir created PHOENIX-4346:
-

 Summary: Add support for UNSIGNED_LONG type in Pherf scenarios
 Key: PHOENIX-4346
 URL: https://issues.apache.org/jira/browse/PHOENIX-4346
 Project: Phoenix
  Issue Type: Improvement
Reporter: Monani Mihir
Priority: Minor


Currently Pherf supports INTEGER, CHAR, VARCHAR, DATE and DECIMAL. It would 
be good to have UNSIGNED_LONG available for Pherf scenarios.





[GitHub] phoenix pull request #280: indextool inedxTable is not an index table for da...

2017-11-02 Thread xsq0718
GitHub user xsq0718 opened a pull request:

https://github.com/apache/phoenix/pull/280

indextool inedxTable is not an index table for dataTable

Phoenix: phoenix-4.8.0-cdh5.8.0
HBase: 1.2.0

Create Phoenix table:
CREATE TABLE "everAp"(pk VARCHAR PRIMARY KEY, "ba"."ap" VARCHAR, "ba"."ft" 
VARCHAR, "ba"."et" VARCHAR, "ba"."n" VARCHAR);

Create index:
create local index EVERAP_INDEX_AP on "everAp"("ba"."ap") async;

Run IndexTool:
./hbase org.apache.phoenix.mapreduce.index.IndexTool -dt \"\"everAp\"\" -it 
EVERAP_INDEX_AP -op hdfs:/hbase/data/default/everApIndc



/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/hbase/bin/../lib/native/Linux-amd64-64
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client 
environment:java.io.tmpdir=/tmp
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client 
environment:java.compiler=
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client 
environment:os.version=2.6.32-504.el6.x86_64
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client 
environment:user.name=root
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client 
environment:user.home=/root
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Client 
environment:user.dir=/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/bin
17/11/02 15:08:09 INFO zookeeper.ZooKeeper: Initiating client connection, 
connectString=slave1:2181,slave2:2181,master:2181 sessionTimeout=6 
watcher=hconnection-0x4470f8a60x0, quorum=slave1:2181,slave2:2181,master:2181, 
baseZNode=/hbase
17/11/02 15:08:09 INFO zookeeper.ClientCnxn: Opening socket connection to 
server master/192.168.0.250:2181. Will not attempt to authenticate using SASL 
(unknown error)
17/11/02 15:08:09 INFO zookeeper.ClientCnxn: Socket connection established, 
initiating session, client: /192.168.0.250:53140, server: 
master/192.168.0.250:2181
17/11/02 15:08:09 INFO zookeeper.ClientCnxn: Session establishment complete 
on server master/192.168.0.250:2181, sessionid = 0x35f518ca651786a, negotiated 
timeout = 6
17/11/02 15:08:10 INFO metrics.Metrics: Initializing metrics system: phoenix
17/11/02 15:08:10 WARN impl.MetricsConfig: Cannot locate configuration: 
tried hadoop-metrics2-phoenix.properties,hadoop-metrics2.properties
17/11/02 15:08:10 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 
10 second(s).
17/11/02 15:08:10 INFO impl.MetricsSystemImpl: phoenix metrics system 
started
17/11/02 15:08:11 INFO Configuration.deprecation: hadoop.native.lib is 
deprecated. Instead, use io.native.lib.available
17/11/02 15:08:12 ERROR index.IndexTool: An exception occurred while 
performing the indexing job: IllegalArgumentException:  EVERAP_INDEX_AP is not 
an index table for everAp  at:
java.lang.IllegalArgumentException:  EVERAP_INDEX_AP is not an index table 
for everAp 
at org.apache.phoenix.mapreduce.index.IndexTool.run(IndexTool.java:190)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.phoenix.mapreduce.index.IndexTool.main(IndexTool.java:394)

You have mail in /var/spool/mail/root

**help!!!**

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/phoenix master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/phoenix/pull/280.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #280


commit 489a945159e08e663dc73d3cd51568e6ba0a0f38
Author: Samarth Jain 
Date:   2017-06-14T18:19:58Z

PHOENIX-3890 Disable EncodedQualifierCellsList optimization for tables with 
more than one column family

commit 993164b6388a263f1931eae624693e30bf848d29
Author: Thomas 
Date:   2017-06-08T18:20:28Z

PHOENIX-3918 Ensure all function implementations handle null args correctly

commit de6fbc4e2a13cdc482cbc1c91e51c4bc526aa12f
Author: Samarth Jain 
Date:   2017-06-14T19:44:03Z

PHOENIX-3937 Remove @AfterClass methods from test classes annotated with 
@NeedsOwnMiniClusterTest

commit 64121a3c403a3c5206174b33b3c8762d530279f0
Author: Josh Elser 
Date:   2017-06-15T20:34:43Z

PHOENIX-3940 Handle PERCENTILE_CONT against no rows

commit 98db5d63bd3572328da6ba52ba53357f692c6222
Author: Samarth Jain 
Date:   2017-06-16T17:59:45Z

PHOENIX-3942 Fix failing PhoenixMetricsIT test

commit fba5fa28a03279e3fc427de800774690d280edca
Author: Samarth Jain 
Date:   2017-06-19T20:54:39Z

PHOENIX-3930 Move BaseQueryIT to ParallelStatsDisabledIT (Samarth Jain & 
James Taylor)

commit 

[jira] [Commented] (PHOENIX-4332) Indexes should inherit guide post width of the base data table

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235314#comment-16235314
 ] 

Hadoop QA commented on PHOENIX-4332:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12895338/PHOENIX-4332.patch
  against master branch at commit 82364f6b3083d309f2035f1fd6d132a77ecef71a.
  ATTACHMENT ID: 12895338

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+serverProps.put(QueryServices.STATS_GUIDEPOST_WIDTH_BYTES_ATTRIB, 
Long.toString(defaultGuidePostWidth));
+clientProps.put(QueryServices.STATS_GUIDEPOST_WIDTH_BYTES_ATTRIB, 
Long.toString(defaultGuidePostWidth));

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SetPropertyOnEncodedTableIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.PartialIndexRebuilderIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SetPropertyOnNonEncodedTableIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1603//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1603//console

This message is automatically generated.

> Indexes should inherit guide post width of the base data table
> --
>
> Key: PHOENIX-4332
> URL: https://issues.apache.org/jira/browse/PHOENIX-4332
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4332.patch
>
>
> Altering the guide post width on the data table does not propagate to the 
> global index using the {{ALTER TABLE}} command.
> Altering the global index table runs into a "not allowed" error.
> {noformat}
> ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1;
> Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop 
> column referenced by VIEW columnName=IDX (state=42M01,code=1010)
> {noformat}





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235303#comment-16235303
 ] 

Samarth Jain commented on PHOENIX-4333:
---

Is it a safe assumption to make that if intersectScan is returning a non-null 
value, then we have an intersection? 

{code}
Scan newScan = scanRanges.intersectScan(scan, currentKeyBytes,
        currentGuidePostBytes, keyOffset, false);
if (newScan != null) {
    // guide post was available in the 
}
{code}
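A toy illustration of that contract (hypothetical types and names, not Phoenix's actual API): a range-intersection helper that returns null exactly when there is no overlap, so a non-null result can safely be read as "a guide post intersected this scan":

```java
// Toy null-means-no-intersection helper over half-open [start, end) ranges,
// mirroring the intersectScan contract discussed above (illustrative only).
public class RangeSketch {
    public static int[] intersect(int[] a, int[] b) {
        int lo = Math.max(a[0], b[0]);
        int hi = Math.min(a[1], b[1]);
        // null signals "no intersection"; non-null means the ranges overlap
        return lo < hi ? new int[]{lo, hi} : null;
    }
}
```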

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235294#comment-16235294
 ] 

Samarth Jain commented on PHOENIX-4333:
---

Good point, [~jamestaylor]. I don't think my check would work in the case below:

REGION 1 - VIEW1 and VIEW2
REGION 2 - VIEW2 and VIEW3

If we collect stats for VIEW1 and VIEW3, then even though both regions have 
stats, they don't have stats for VIEW2. I think I would also need to check 
whether any guide post intersected for the region.

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Updated] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4333:
--
Attachment: PHOENIX-4333_v2.patch

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, 
> PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Updated] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4333:
--
Attachment: PHOENIX-4333_v2.patch

Updated patch that sets the estimate timestamp to null when we don't have 
guide posts available for all regions.

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235275#comment-16235275
 ] 

James Taylor commented on PHOENIX-4333:
---

Does your check handle the case in which multiple regions are scanned and one 
in the middle has no guide posts? Not sure I understand why the check needs to 
be in the catch, but not a big deal.

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Updated] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4333:
--
Attachment: (was: PHOENIX-4333_v2.patch)

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235262#comment-16235262
 ] 

Samarth Jain commented on PHOENIX-4333:
---

Actually, the check needs to be done inside this catch block:

{code}
catch (EOFException e) {
// We have read all guide posts

}
{code}

And if we are doing it there, I think the check I had makes it easier to 
understand what's going on, IMHO.

{code}
+if (regionIndex < stopIndex) {
+    /*
+     * We don't have guide posts available for all regions.
+     * So in this case we conservatively say that we cannot
+     * provide estimates.
+     */
+    gpsAvailableForAllRegions = false;
+}
 }
{code}



> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235253#comment-16235253
 ] 

Samarth Jain commented on PHOENIX-4333:
---

Ah, I see. Yes, that's true. Let me update the patch.

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235250#comment-16235250
 ] 

James Taylor commented on PHOENIX-4333:
---

Haven’t tested it, but if currentKeyBytes gets set during the inner loop, then 
that means we’ve found at least one gp, no? Just a bit simpler way to detect 
that. If that doesn’t work, the way you have it is fine too.

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Updated] (PHOENIX-4332) Indexes should inherit guide post width of the base data table

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4332:
--
Summary: Indexes should inherit guide post width of the base data table  
(was: Stats - Allow setting guide post width on global indexes)

> Indexes should inherit guide post width of the base data table
> --
>
> Key: PHOENIX-4332
> URL: https://issues.apache.org/jira/browse/PHOENIX-4332
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4332.patch
>
>
> Altering guidepost with on data table does not propagate to global index 
> using {{ALTER TABLE}} command.
> Altering global index table runs in not allowed error.
> {noformat}
> ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1;
> Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop 
> column referenced by VIEW columnName=IDX (state=42M01,code=1010)
> {noformat}





[jira] [Updated] (PHOENIX-4343) In CREATE TABLE allow setting guide post width only on base data tables

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4343:
--
Summary: In CREATE TABLE allow setting guide post width only on base data 
tables  (was: In CREATE TABLE only allow setting guide post width on tables and 
global indexes)

> In CREATE TABLE allow setting guide post width only on base data tables
> ---
>
> Key: PHOENIX-4343
> URL: https://issues.apache.org/jira/browse/PHOENIX-4343
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4343.patch, PHOENIX-4343_v2.patch, 
> PHOENIX-4343_v3.patch
>
>






[jira] [Commented] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-11-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235245#comment-16235245
 ] 

Samarth Jain commented on PHOENIX-4333:
---

It might be the late night and lack of coffee, but I am not sure I see the 
correlation here.
{code}
gpsAvailableForAllRegions &= initialKeyBytes != currentKeyBytes;
{code}

We reset currentKeyBytes to initialKeyBytes when we know we are not using stats 
for parallelisation.
{code}
if (!useStatsForParallelization) {
    /*
     * If we are not using stats for generating parallel scans,
     * we need to reset the currentKey back to what it was at
     * the beginning of the loop.
     */
    currentKeyBytes = initialKeyBytes;
}
{code}

bq. I also think we should set the estimatedRows and estimatedSize to what 
we've found, but only set estimateInfoTimestamp to null if 
!gpsAvailableForAllRegions. That way callers can choose to use or not use the 
partial estimates based on estimateInfoTimestamp.

Makes sense.
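The agreed behavior could be sketched like this (hypothetical holder class; the field names are illustrative, not Phoenix's actual ones): keep the partial row/byte estimates, but null out the info timestamp when guide posts were missing for some region, so callers can detect a partial estimate:

```java
// Illustrative sketch of the proposal above: partial estimates are kept, but
// the timestamp is nulled so callers know the estimate is incomplete.
public class EstimateSketch {
    public final Long estimatedRows;
    public final Long estimatedSize;
    public final Long estimateInfoTimestamp; // null => partial estimate

    public EstimateSketch(long rows, long size, long ts,
                          boolean gpsAvailableForAllRegions) {
        this.estimatedRows = rows;
        this.estimatedSize = size;
        // Only advertise a timestamp when every region contributed guide posts
        this.estimateInfoTimestamp =
                gpsAvailableForAllRegions ? Long.valueOf(ts) : null;
    }
}
```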


> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---
>
> Key: PHOENIX-4333
> URL: https://issues.apache.org/jira/browse/PHOENIX-4333
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate 
> regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view 
> yields partial results (containing stats only for B,1), which are incorrect even 
> though the updated timestamp shows as current.





[jira] [Commented] (PHOENIX-4343) In CREATE TABLE only allow setting guide post width on tables and global indexes

2017-11-02 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235240#comment-16235240
 ] 

James Taylor commented on PHOENIX-4343:
---

+1

> In CREATE TABLE only allow setting guide post width on tables and global 
> indexes
> 
>
> Key: PHOENIX-4343
> URL: https://issues.apache.org/jira/browse/PHOENIX-4343
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4343.patch, PHOENIX-4343_v2.patch, 
> PHOENIX-4343_v3.patch
>
>






[jira] [Updated] (PHOENIX-4343) In CREATE TABLE only allow setting guide post width on tables and global indexes

2017-11-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4343:
--
Attachment: PHOENIX-4343_v3.patch

Thanks for the review, [~jamestaylor]. Attached is the updated patch.

> In CREATE TABLE only allow setting guide post width on tables and global 
> indexes
> 
>
> Key: PHOENIX-4343
> URL: https://issues.apache.org/jira/browse/PHOENIX-4343
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
>Priority: Major
> Attachments: PHOENIX-4343.patch, PHOENIX-4343_v2.patch, 
> PHOENIX-4343_v3.patch
>
>



