date:20171030

[jira] [Commented] (PHOENIX-4234) Unable to find failed csv records in phoenix logs

2017-10-30 Thread suprita (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224439#comment-16224439
 ] 

suprita commented on PHOENIX-4234:
--

Hi Ankit,

You responded to me once before regarding Phoenix.

I need a quick response, so I am sending you my query directly as well as 
posting it on the JIRA group. Please answer it if you can.

My question is described below:

I am using Apache Phoenix to create a table and then load CSV data into it.
Now I want to change one column's length from varchar(7) to varchar(14) 
without losing the existing data in the table.

Can it be done?
If yes, how?

I tried the following command to address this:
ALTER TABLE G1V3IN_ADITI ALTER "INVOICE"."SBNUM" set data type varchar(15), 
column "INVOICE"."SBNUM" drop default;

Here SBNUM is the column whose existing length is 7 and which we want to 
change to 15.
G1V3IN_ADITI is the table name.

But I am getting the error below:
Error: ERROR 601 (42P00): Syntax error. Encountered "ALTER" at line 1, column 26. (state=42P00,code=601)
org.apache.phoenix.exception.PhoenixParserException: ERROR 601 (42P00): Syntax error. Encountered "ALTER" at line 1, column 26.
    at org.apache.phoenix.exception.PhoenixParserException.newException(PhoenixParserException.java:33)
    at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:111)
    at org.apache.phoenix.jdbc.PhoenixStatement$PhoenixStatementParser.parseStatement(PhoenixStatement.java:1283)
    at org.apache.phoenix.jdbc.PhoenixStatement.parseStatement(PhoenixStatement.java:1364)
    at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1427)
    at sqlline.Commands.execute(Commands.java:822)
    at sqlline.Commands.sql(Commands.java:732)
    at sqlline.SqlLine.dispatch(SqlLine.java:808)
    at sqlline.SqlLine.begin(SqlLine.java:681)
    at sqlline.SqlLine.start(SqlLine.java:398)
    at sqlline.SqlLine.main(SqlLine.java:292)
Caused by: NoViableAltException(7@[])
    at org.apache.phoenix.parse.PhoenixSQLParser.from_table_name(PhoenixSQLParser.java:9081)
    at org.apache.phoenix.parse.PhoenixSQLParser.alter_table_node(PhoenixSQLParser.java:3229)
    at org.apache.phoenix.parse.PhoenixSQLParser.oneStatement(PhoenixSQLParser.java:846)
    at org.apache.phoenix.parse.PhoenixSQLParser.statement(PhoenixSQLParser.java:499)
    at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:108)
    ... 9 more

Please help me solve this.

It would be a great help if I could get a solution for this as soon as possible.


Thanks
Suprita Bothra






> Unable to find failed csv records in phoenix logs
> -
>
> Key: PHOENIX-4234
> URL: https://issues.apache.org/jira/browse/PHOENIX-4234
> Project: Phoenix
>  Issue Type: Bug
>Reporter: suprita bothra
>
> We are unable to fetch information about records missing from a Phoenix 
> table. How can we fetch it?
> For example, when bulk loading a CSV into HBase via MapReduce with the 
> --ignore-errors option, CSV records that have errors are skipped, but we are 
> unable to fetch information about the records that were skipped or failed 
> and didn't go into the table.
> There must be logs of such information. Please help in identifying whether 
> we can get logs of failed records.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Created] (PHOENIX-4330) To alter colomn length of a phoenix table withouht losing existing data in table

2017-10-30 Thread suprita bothra (JIRA)

suprita bothra created PHOENIX-4330:
---

 Summary: To alter colomn length of a phoenix table withouht losing 
existing data in table
 Key: PHOENIX-4330
 URL: https://issues.apache.org/jira/browse/PHOENIX-4330
 Project: Phoenix
  Issue Type: Bug
Reporter: suprita bothra


Hi,

I am using Apache Phoenix to create a table and then load CSV data into it.
Now I want to change one column's length from varchar(7) to varchar(14) 
without losing the existing data in the table.

Can it be done?
If yes, how?

I tried the following command to address this:
ALTER TABLE G1V3IN_ADITI ALTER "INVOICE"."SBNUM" set data type varchar(15), 
column "INVOICE"."SBNUM" drop default;

Here SBNUM is the column whose existing length is 7 and which we want to 
change to 15.
G1V3IN_ADITI is the table name.

But I am getting the error below:
Error: ERROR 601 (42P00): Syntax error. Encountered "ALTER" at line 1, column 26. (state=42P00,code=601)
org.apache.phoenix.exception.PhoenixParserException: ERROR 601 (42P00): Syntax error. Encountered "ALTER" at line 1, column 26.
    at org.apache.phoenix.exception.PhoenixParserException.newException(PhoenixParserException.java:33)
    at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:111)
    at org.apache.phoenix.jdbc.PhoenixStatement$PhoenixStatementParser.parseStatement(PhoenixStatement.java:1283)
    at org.apache.phoenix.jdbc.PhoenixStatement.parseStatement(PhoenixStatement.java:1364)
    at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1427)
    at sqlline.Commands.execute(Commands.java:822)
    at sqlline.Commands.sql(Commands.java:732)
    at sqlline.SqlLine.dispatch(SqlLine.java:808)
    at sqlline.SqlLine.begin(SqlLine.java:681)
    at sqlline.SqlLine.start(SqlLine.java:398)
    at sqlline.SqlLine.main(SqlLine.java:292)
Caused by: NoViableAltException(7@[])
    at org.apache.phoenix.parse.PhoenixSQLParser.from_table_name(PhoenixSQLParser.java:9081)
    at org.apache.phoenix.parse.PhoenixSQLParser.alter_table_node(PhoenixSQLParser.java:3229)
    at org.apache.phoenix.parse.PhoenixSQLParser.oneStatement(PhoenixSQLParser.java:846)
    at org.apache.phoenix.parse.PhoenixSQLParser.statement(PhoenixSQLParser.java:499)
    at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:108)
    ... 9 more
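
The parser error above is consistent with the Phoenix ALTER TABLE grammar 
having no clause for changing a column's type: it rejects the second ALTER 
keyword at column 26. One possible workaround, sketched below under stated 
assumptions, is to create a second table with the wider column and copy the 
rows across with UPSERT ... SELECT. Only G1V3IN_ADITI and "INVOICE"."SBNUM" 
come from the report; the target table name, the ID primary-key column, and 
the helper class are hypothetical, and the real table's full column list 
would have to be reproduced in the DDL.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds the two Phoenix SQL statements for a copy-based widening.
// The target table name and the ID primary-key column are hypothetical.
final class WidenColumnSketch {

    // DDL for a new table whose SBNUM column is wider than the original.
    static String createTargetTable(String target, int newLength) {
        return "CREATE TABLE " + target + " ("
                + "ID VARCHAR PRIMARY KEY, "
                + "\"INVOICE\".\"SBNUM\" VARCHAR(" + newLength + "))";
    }

    // Copies every row from the old table into the new one.
    // Assumes both tables declare their columns in the same order.
    static String copyRows(String source, String target) {
        return "UPSERT INTO " + target + " SELECT * FROM " + source;
    }

    // The statements in the order they would be executed.
    static List<String> statements(String source, String target, int newLength) {
        return Arrays.asList(createTargetTable(target, newLength), copyRows(source, target));
    }

    public static void main(String[] args) {
        for (String sql : statements("G1V3IN_ADITI", "G1V3IN_ADITI_V2", 15)) {
            System.out.println(sql + ";");
        }
    }
}
```

The generated statements could be pasted into sqlline or run through a JDBC 
Statement; the old table would be dropped only after the copy is verified.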




Thanks
Suprita Bothra





[jira] [Commented] (PHOENIX-4289) UPDATE STATISTICS command does not collect stats for local indexes

2017-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224457#comment-16224457
 ] 

Hudson commented on PHOENIX-4289:
-

SUCCESS: Integrated in Jenkins build Phoenix-master #1849 (See 
[https://builds.apache.org/job/Phoenix-master/1849/])
PHOENIX-4289 UPDATE STATISTICS command does not collect stats for local 
(samarth: rev 60a9b099eccaf328fd796b93176d8ac665fe039c)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/index/BaseLocalIndexIT.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ExplainPlanWithStatsEnabledIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/index/LocalIndexIT.java


> UPDATE STATISTICS command does not collect stats for local indexes
> --
>
> Key: PHOENIX-4289
> URL: https://issues.apache.org/jira/browse/PHOENIX-4289
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1, Phoenix 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Attachments: PHOENIX-4289.patch, PHOENIX-4289_v2.patch, 
> PHOENIX-4289_v3.patch, PHOENIX-4289_v4.patch
>
>
> With clean {{SYSTEM.STATS}} table and restarted HBase server+Phoenix client. 
> Ran {{UPDATE STATISTICS T ALL}} command. Global guidepost width is set to 
> 100M. No stats are generated for any of the local indexes on table T.
> {noformat}
> explain select count(*) from T;
> +-----------------------------------------------------+-----------------+----------------+--------------+
> | PLAN                                                | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +-----------------------------------------------------+-----------------+----------------+--------------+
> | CLIENT 8-CHUNK PARALLEL 8-WAY RANGE SCAN OVER T [1] | null            | null           | null         |
> | SERVER FILTER BY FIRST KEY ONLY                     | null            | null           | null         |
> | SERVER AGGREGATE INTO SINGLE ROW                    | null            | null           | null         |
> +-----------------------------------------------------+-----------------+----------------+--------------+
> select * from system.stats;
> +---------------+---------------+----------------+-------------------+-------------------------+-----------------------+
> | PHYSICAL_NAME | COLUMN_FAMILY | GUIDE_POST_KEY | GUIDE_POSTS_WIDTH | LAST_STATS_UPDATE_TIME  | GUIDE_POSTS_ROW_COUNT |
> +---------------+---------------+----------------+-------------------+-------------------------+-----------------------+
> | T             |               |                | null              | 2017-10-16 18:36:57.884 | null                  |
> | T             | 0             | [B@9bd0fa6     | 10099             |                         | 75756                 |
> | T             | 0             | [B@59d2103b    | 10057             |                         | 75748                 |
> | T             | 0             | [B@39dcf4b0    | 10058             |                         | 75748                 |
> | T             | 0             | [B@6e4de19b    | 10081             |                         | 75743                 |
> | T             | 0             | [B@f6c03cb     | 10044             |                         | 75744                 |
> | T             | 0             | [B@46f699d5    | 10023             |                         | 75741                 |
> | T             | 0             | [B@18518ccf    | 10019             |                         | 75749                 |
> | T             | 0             | [B@1991f767    | 10097             |                         | 75740                 |
> | T             | 0             | [B@768ccdc5    | 10092             |                         | 75740                 |
> | T             | 0             | [B@4c6daf0     | 10026             |                         | 75739                 |
> | T             | 0             | [B@10650953    | 10054             |                         | 75731                 |
> | T             | 0             | [B@659eef7     | 10092             |                         | 75741                 |
> | T             | 0             | [B@162be91c    | 10023             |                         | 75752                 |
> | T   | 0  | [B@

[jira] [Created] (PHOENIX-4331) Missing version tag for apache-rat-plugin in pom.xml

2017-10-30 Thread Shehzaad Nakhoda (JIRA)

Shehzaad Nakhoda created PHOENIX-4331:
-

 Summary: Missing version tag for apache-rat-plugin in pom.xml
 Key: PHOENIX-4331
 URL: https://issues.apache.org/jira/browse/PHOENIX-4331
 Project: Phoenix
  Issue Type: Bug
Reporter: Shehzaad Nakhoda


Some internal build processes require maven plugins (under 
build->plugins->plugin) to have version numbers. apache-rat-plugin is used in a 
number of pom.xml's under phoenix but it doesn't have a version number 
specified.
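
The requirement described above can be checked mechanically. The following 
is a minimal sketch, not part of the attached patch: the class 
PluginVersionCheck, its method names, and the sample POM in main are 
hypothetical. It lists every plugin element in a POM that has no version 
element of its own. It looks only at the file it is given and does not 
resolve versions inherited from a parent POM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: reports plugins that declare no <version> in the given POM text.
final class PluginVersionCheck {

    // Returns the artifactId of every <plugin> element without a direct <version> child.
    // Note: this also visits plugins declared under <pluginManagement>.
    static List<String> pluginsWithoutVersion(String pomXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse POM", e);
        }
        List<String> missing = new ArrayList<String>();
        NodeList plugins = doc.getElementsByTagName("plugin");
        for (int i = 0; i < plugins.getLength(); i++) {
            Element plugin = (Element) plugins.item(i);
            if (directChildText(plugin, "version") == null) {
                missing.add(directChildText(plugin, "artifactId"));
            }
        }
        return missing;
    }

    // Text of the first direct child element with the given tag name, or null if absent.
    static String directChildText(Element parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && tag.equals(((Element) n).getTagName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Sample POM fragment made up for illustration.
        String pom = "<project><build><plugins>"
                + "<plugin><groupId>org.apache.rat</groupId>"
                + "<artifactId>apache-rat-plugin</artifactId></plugin>"
                + "<plugin><groupId>org.apache.maven.plugins</groupId>"
                + "<artifactId>maven-compiler-plugin</artifactId>"
                + "<version>3.0</version></plugin>"
                + "</plugins></build></project>";
        System.out.println(pluginsWithoutVersion(pom));
    }
}
```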




[jira] [Updated] (PHOENIX-4331) Missing version tag for apache-rat-plugin in pom.xml files

2017-10-30 Thread Shehzaad Nakhoda (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shehzaad Nakhoda updated PHOENIX-4331:
--
Summary: Missing version tag for apache-rat-plugin in pom.xml files  (was: 
Missing version tag for apache-rat-plugin in pom.xml)

> Missing version tag for apache-rat-plugin in pom.xml files
> --
>
> Key: PHOENIX-4331
> URL: https://issues.apache.org/jira/browse/PHOENIX-4331
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Shehzaad Nakhoda
> Attachments: PHOENIX-4331_v1.patch
>
>
> Some internal build processes require maven plugins (under 
> build->plugins->plugin) to have version numbers. apache-rat-plugin is used in 
> a number of pom.xml's under phoenix but it doesn't have a version number 
> specified.




[jira] [Updated] (PHOENIX-4331) Missing version tag for apache-rat-plugin in pom.xml

2017-10-30 Thread Shehzaad Nakhoda (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shehzaad Nakhoda updated PHOENIX-4331:
--
Attachment: PHOENIX-4331_v1.patch

> Missing version tag for apache-rat-plugin in pom.xml
> 
>
> Key: PHOENIX-4331
> URL: https://issues.apache.org/jira/browse/PHOENIX-4331
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Shehzaad Nakhoda
> Attachments: PHOENIX-4331_v1.patch
>
>
> Some internal build processes require maven plugins (under 
> build->plugins->plugin) to have version numbers. apache-rat-plugin is used in 
> a number of pom.xml's under phoenix but it doesn't have a version number 
> specified.




[jira] [Comment Edited] (PHOENIX-4331) Missing version tag for apache-rat-plugin in pom.xml

2017-10-30 Thread Shehzaad Nakhoda (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224477#comment-16224477
 ] 

Shehzaad Nakhoda edited comment on PHOENIX-4331 at 10/30/17 7:57 AM:
-

[^PHOENIX-4331_v1.patch] is a patch that declares 0.12 as the apache-rat-plugin 
version


was (Author: shehzaadn):
Here is a patch that declares 0.12 as the apache-rat-plugin version


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224517#comment-16224517
 ] 

Hadoop QA commented on PHOENIX-4287:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12894689/PHOENIX-4287_v2.patch
  against master branch at commit 60a9b099eccaf328fd796b93176d8ac665fe039c.
  ATTACHMENT ID: 12894689

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+while (useStatsForParallelization && intersectWithGuidePosts && (endKey.length == 0 || currentGuidePost.compareTo(endKey) <= 0)) {
+scans = addNewScan(parallelScans, scans, newScan, currentGuidePostBytes, false, regionLocation);
+ByteArrayInputStream stream = new ByteArrayInputStream(guidePosts.get(), guidePosts.getOffset(), guidePosts.getLength());
+while (endGuideIndex < numGuidePosts && endKey.compareTo(PrefixByteCodec.decode(decoder, input)) >= 0) {

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ContextClassloaderIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1592//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1592//console

This message is automatically generated.
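
The lineLengths result above can be reproduced locally before uploading a 
patch. The following is a rough sketch, not the QA bot's actual script: the 
class LongLineCheck is hypothetical, and it is an assumption that the limit 
applies to the line content without the leading "+" of the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds added lines in a unified diff that exceed 100 characters.
final class LongLineCheck {
    static final int MAX_LENGTH = 100;

    // Returns the content of every added line longer than MAX_LENGTH.
    static List<String> longAddedLines(String patch) {
        List<String> result = new ArrayList<String>();
        for (String line : patch.split("\n")) {
            // Added lines start with '+'; "+++" introduces a file header.
            // Simplification: an added line whose own content starts with "++" is skipped too.
            if (line.startsWith("+") && !line.startsWith("+++")) {
                String content = line.substring(1);
                if (content.length() > MAX_LENGTH) {
                    result.add(content);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Build a made-up patch with one short added line and one 101-character added line.
        StringBuilder longLine = new StringBuilder("+");
        for (int i = 0; i < 101; i++) {
            longLine.append('x');
        }
        String patch = "+++ b/Example.java\n+int a = 1;\n" + longLine + "\n-int b = 2;\n";
        System.out.println(longAddedLines(patch).size());
    }
}
```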

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
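> {noformat}
> explain select count(*) from TABLE_T;
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO  |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899       | 332170         | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {noformat}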
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--+-+++
> |   PLAN  
>  | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +--+-+++
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER 
> TABLE_T  | 438492470   | 332151 | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY 
>

[jira] [Commented] (PHOENIX-4331) Missing version tag for apache-rat-plugin in pom.xml files

2017-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224717#comment-16224717
 ] 

Hadoop QA commented on PHOENIX-4331:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12894695/PHOENIX-4331_v1.patch
  against master branch at commit 60a9b099eccaf328fd796b93176d8ac665fe039c.
  ATTACHMENT ID: 12894695

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build, or dev patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.PartialIndexRebuilderIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.IndexToolIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1593//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1593//console

This message is automatically generated.


[jira] [Commented] (PHOENIX-4331) Missing version tag for apache-rat-plugin in pom.xml files

2017-10-30 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225074#comment-16225074
 ] 

Josh Elser commented on PHOENIX-4331:
-

[~shehzaadn], Phoenix has the Apache Pom listed as its parent: 
https://github.com/apache/phoenix/blob/master/pom.xml#L49-L53

The parent defines the version of the apache-rat-plugin - 
https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom

Why do you think it's necessary to explicitly state it in Phoenix's pom?




[jira] [Resolved] (PHOENIX-4329) Test IndexScrutinyTool while table is taking writes

2017-10-30 Thread James Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor resolved PHOENIX-4329.
---
Resolution: Fixed

Thanks for the patch, [~vincentpoon]. I committed to 4.x-HBase-0.98 and master 
branches.

> Test IndexScrutinyTool while table is taking writes
> ---
>
> Key: PHOENIX-4329
> URL: https://issues.apache.org/jira/browse/PHOENIX-4329
> Project: Phoenix
>  Issue Type: Test
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4329.patch
>
>
> Create a test that confirms that after PHOENIX-4277, an index scrutiny can be 
> successfully done on a table taking writes.




[jira] [Updated] (PHOENIX-4329) Test IndexScrutinyTool while table is taking writes

2017-10-30 Thread James Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4329:
--
Fix Version/s: 4.13.0


[jira] [Commented] (PHOENIX-3757) System mutex table not being created in SYSTEM namespace when namespace mapping is enabled

2017-10-30 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225261#comment-16225261
 ] 

Karan Mehta commented on PHOENIX-3757:
--

Thank you [~elserj]

> System mutex table not being created in SYSTEM namespace when namespace 
> mapping is enabled
> --
>
> Key: PHOENIX-3757
> URL: https://issues.apache.org/jira/browse/PHOENIX-3757
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Karan Mehta
>Priority: Critical
>  Labels: namespaces
> Fix For: 4.13.0
>
> Attachments: PHOENIX-3757.001.patch, PHOENIX-3757.002.patch, 
> PHOENIX-3757.003.patch, PHOENIX-3757.004.patch, PHOENIX-3757.005.patch
>
>
> Noticed this issue while writing a test for PHOENIX-3756:
> The SYSTEM.MUTEX table is always created in the default namespace, even when 
> {{phoenix.schema.isNamespaceMappingEnabled=true}}. At a glance, it looks like 
> the logic for the other system tables isn't applied to the mutex table.
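
To illustrate the naming difference at issue: with namespace mapping 
enabled, a Phoenix table in schema SYSTEM is expected to live in the HBase 
namespace SYSTEM (written SYSTEM:MUTEX), whereas without it the table is 
named SYSTEM.MUTEX in the default namespace. The sketch below is a 
simplified model only; NamespaceNameSketch and physicalName are hypothetical 
and are not Phoenix's own utility code.

```java
// Hypothetical model of how a SCHEMA.TABLE name maps to a physical HBase table name.
final class NamespaceNameSketch {

    // With namespace mapping, the schema becomes an HBase namespace (':' separator);
    // without it, the schema is just a prefix of a table in the default namespace.
    static String physicalName(String schema, String table, boolean namespaceMappingEnabled) {
        String separator = namespaceMappingEnabled ? ":" : ".";
        return schema + separator + table;
    }

    public static void main(String[] args) {
        System.out.println(physicalName("SYSTEM", "MUTEX", true));
        System.out.println(physicalName("SYSTEM", "MUTEX", false));
    }
}
```

In these terms, the bug report says the mutex table was created as if the 
flag were false regardless of the configured value.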




[jira] [Commented] (PHOENIX-4277) Treat delete markers consistently with puts for point-in-time scans

2017-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225416#comment-16225416
 ] 

Hudson commented on PHOENIX-4277:
-

SUCCESS: Integrated in Jenkins build Phoenix-master #1850 (See 
[https://builds.apache.org/job/Phoenix-master/1850/])
PHOENIX-4277 Treat delete markers consistently with puts for (jtaylor: rev 
438ac5676e8e8f0a69875d9b91acaf5c8ac6201c)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/BaseScannerRegionObserver.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/TransactionUtil.java
* (add) phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/ScanInfoUtil.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/PointInTimeQueryIT.java


> Treat delete markers consistently with puts for point-in-time scans
> ---
>
> Key: PHOENIX-4277
> URL: https://issues.apache.org/jira/browse/PHOENIX-4277
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4277.test.patch, PHOENIX-4277_v2.patch, 
> PHOENIX-4277_v3.patch, PHOENIX-4277_wip.patch
>
>
> The IndexScrutinyTool relies on doing point-in-time scans to determine 
> consistency between the index and data tables. Unfortunately, deletes to the 
> tables cause a problem with this approach, since delete markers take effect 
> even if they're at a later time stamp than the point-in-time at which the 
> scan is being done (unless KEEP_DELETED_CELLS is true). The logic of this is 
> that scans should get the same results before and after a compaction takes 
> place.
> Taking snapshots does not help with this since they cannot be taken at a 
> point-in-time and the delete markers will act the same way - there's no way 
> to guarantee that the index and data table snapshots have the same "logical" 
> set of data.
> Using raw scans would allow us to see the delete markers and do the correct 
> point-in-time filtering ourselves. We'd need to write the filters to do this 
> correctly (see the Tephra TransactionVisibilityFilter for an implementation 
> of this that could be adapted). We'd also need to hook this into Phoenix or 
> potentially dip down to the HBase level to do this.
> Thanks for brainstorming on this with me, [~lhofhansl].
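
The filtering idea in the description can be illustrated with a small model. 
This is a sketch under simplifying assumptions, not Phoenix or HBase code: 
Entry, visibleAt, and the inclusive cutoff are hypothetical, only a single 
column of a single row is modelled, and each delete marker is treated as 
masking every put at or before its own timestamp. The point is that a delete 
marker written after the scan time is ignored, just as a later put would be.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of point-in-time filtering over raw scan results.
final class PointInTimeSketch {

    // One raw entry for a single column of a single row: a put or a delete marker.
    static final class Entry {
        final long timestamp;
        final boolean deleteMarker;
        final String value; // null for delete markers

        Entry(long timestamp, boolean deleteMarker, String value) {
            this.timestamp = timestamp;
            this.deleteMarker = deleteMarker;
            this.value = value;
        }
    }

    // Value visible at scanTime, or null if the column is deleted or not yet written.
    static String visibleAt(List<Entry> rawEntries, long scanTime) {
        Entry latest = null;
        for (Entry e : rawEntries) {
            if (e.timestamp > scanTime) {
                continue; // later puts and later delete markers are both ignored
            }
            // Keep the newest entry; on a timestamp tie the delete marker wins.
            if (latest == null || e.timestamp > latest.timestamp
                    || (e.timestamp == latest.timestamp && e.deleteMarker)) {
                latest = e;
            }
        }
        return (latest == null || latest.deleteMarker) ? null : latest.value;
    }

    public static void main(String[] args) {
        List<Entry> raw = Arrays.asList(
                new Entry(10, false, "a"),
                new Entry(30, true, null),
                new Entry(40, false, "b"));
        System.out.println(visibleAt(raw, 20)); // delete marker at 30 is not yet visible
        System.out.println(visibleAt(raw, 30));
        System.out.println(visibleAt(raw, 45));
    }
}
```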




[jira] [Commented] (PHOENIX-4329) Test IndexScrutinyTool while table is taking writes

2017-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225417#comment-16225417
 ] 

Hudson commented on PHOENIX-4329:
-

SUCCESS: Integrated in Jenkins build Phoenix-master #1850 (See 
[https://builds.apache.org/job/Phoenix-master/1850/])
PHOENIX-4329 Test IndexScrutinyTool while table is taking writes (jtaylor: rev 
0c38f493ca4e35eefa2297f62cbe56cca47bb81d)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexScrutinyToolIT.java



[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225429#comment-16225429
 ] 

James Taylor commented on PHOENIX-4287:
---

Does the v2 patch contain the diffs against master now that PHOENIX-4289 is 
checked in, [~samarthjain]?

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
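> {noformat}
> explain select count(*) from TABLE_T;
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO  |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899       | 332170         | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +----------------------------------------------------------------------------------+-----------------+----------------+---------------+
> | PLAN                                                                             | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS   |
> +----------------------------------------------------------------------------------+-----------------+----------------+---------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470       | 332151         | 1507928257617 |
> | SERVER FILTER BY FIRST KEY ONLY                                                  | 438492470       | 332151         | 1507928257617 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                 | 438492470       | 332151         | 1507928257617 |
> +----------------------------------------------------------------------------------+-----------------+----------------+---------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +------------------------------------------------------+-----------------+----------------+--------------+
> | PLAN                                                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +------------------------------------------------------+-----------------+----------------+--------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null            | null           | null         |
> | SERVER FILTER BY FIRST KEY ONLY                      | null            | null           | null         |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null            | null           | null         |
> +------------------------------------------------------+-----------------+----------------+--------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 27        |
> +-----------+
> {noformat}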




[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread Samarth Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225434#comment-16225434
 ] 

Samarth Jain commented on PHOENIX-4287:
---

Yes, v2 just has changes relevant to this JIRA.


[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225495#comment-16225495
 ] 

James Taylor commented on PHOENIX-4287:
---

It looks like we're duplicating the same logic to come up with the estimates. 
Why can't we do that in getParallelScans()? Or refactor it so we can?

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
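> +---------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO  |
> +---------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899      | 332170        | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899      | 332170        | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899      | 332170        | 150792825 |
> +---------------------------------------------------------------------------------------+----------------+---------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+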
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
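> +----------------------------------------------------------------------------------+----------------+---------------+---------------+
> | PLAN                                                                             | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
> +----------------------------------------------------------------------------------+----------------+---------------+---------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470      | 332151        | 1507928257617 |
> | SERVER FILTER BY FIRST KEY ONLY                                                  | 438492470      | 332151        | 1507928257617 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                 | 438492470      | 332151        | 1507928257617 |
> +----------------------------------------------------------------------------------+----------------+---------------+---------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+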
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
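> +------------------------------------------------------+----------------+---------------+-------------+
> | PLAN                                                 | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> +------------------------------------------------------+----------------+---------------+-------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null           | null          | null        |
> | SERVER FILTER BY FIRST KEY ONLY                      | null           | null          | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null           | null          | null        |
> +------------------------------------------------------+----------------+---------------+-------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 27        |
> +-----------+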
> {noformat}




[jira] [Commented] (PHOENIX-4198) Remove the need for users to have access to the Phoenix SYSTEM tables to create tables

2017-10-30 Thread Thomas D'Silva (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225560#comment-16225560
 ] 

Thomas D'Silva commented on PHOENIX-4198:
-

> Actually, this is needed when a NEW user is creating a view and Admin has 
> just given READ/EXEC access to the user on the data table.

[~karanmehta93] is working on PHOENIX-672 which will handle this case, so maybe 
some of this code can be used for that JIRA.

As part of that JIRA we will only allow grant/revoke on the parent physical 
table for views and indexes, and keep the permissions on the index and view 
index physical tables in sync with the parent table, so I don't think we need 
to have an automatic grant option. We should always keep the index tables in 
sync with the parent. FYI [~jamestaylor]

In MetaDataEndpointImpl, you should always check that the user has the required 
permissions on the parent table's indexes (since they are added to the PTable 
of child views; see MetaDataClient.addIndexesFromParentTable).

{code}
+        if (parentPhysicalSchemaTableNames[1] != null) {
+            parentTableKey = SchemaUtil.getTableKey(ByteUtil.EMPTY_BYTE_ARRAY,
+                    parentPhysicalSchemaTableNames[0], parentPhysicalSchemaTableNames[1]);
+            PTable parentTable = loadTable(env, parentTableKey, new ImmutableBytesPtr(parentTableKey),
+                    clientTimeStamp, clientTimeStamp, clientVersion);
+            cParentPhysicalName = parentTable.getPhysicalName().getBytes();
+            if (parentSchemaTableNames[1] != null
+                    && Bytes.compareTo(parentSchemaTableNames[1], parentPhysicalSchemaTableNames[1]) != 0) {
+                // parent table is a view
+                indexes.add(TableName.valueOf(MetaDataUtil.getViewIndexPhysicalName(cParentPhysicalName)));
+            } else {
+                for (PTable index : parentTable.getIndexes()) {
+                    indexes.add(TableName.valueOf(index.getPhysicalName().getBytes()));
+                }
+            }
+
+        } else {
+            // Mapped View
+            cParentPhysicalName = SchemaUtil.getTableNameAsBytes(schemaName, tableName);
         }
{code}

Also the view index physical table might not exist (if that view doesn't have 
an index on it), so you only need to check for the permission if it exists. 
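
A minimal sketch of that last point, in Java. Everything in it is hypothetical 
(the class, the method, the tableExists predicate and the example table names); 
it is not the code in the patch. It only shows the shape of the check: always 
include the parent table and its indexes, and include the view index physical 
table only when it actually exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Phoenix.
final class PermissionTargetsSketch {

    /**
     * Returns the physical tables on which the caller's permissions should be verified.
     * The view index table is optional: it only exists once some view has an index.
     */
    static List<String> tablesToCheck(String parentTable,
                                      List<String> parentIndexTables,
                                      String viewIndexTable,
                                      Predicate<String> tableExists) {
        List<String> targets = new ArrayList<>();
        targets.add(parentTable);
        targets.addAll(parentIndexTables);
        // Skip the view index table when it has never been created.
        if (tableExists.test(viewIndexTable)) {
            targets.add(viewIndexTable);
        }
        return targets;
    }

    public static void main(String[] args) {
        // Assumed cluster state: the view index table "T_VIEW_IDX" does not exist.
        List<String> existing = Arrays.asList("T", "T_IDX");
        System.out.println(
                tablesToCheck("T", Arrays.asList("T_IDX"), "T_VIEW_IDX", existing::contains));
    }
}
```

Passing the existence check in as a predicate keeps the sketch independent of 
any particular admin API.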

Apart from these, LGTM. Great work!





> Remove the need for users to have access to the Phoenix SYSTEM tables to 
> create tables
> --
>
> Key: PHOENIX-4198
> URL: https://issues.apache.org/jira/browse/PHOENIX-4198
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>  Labels: namespaces, security
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4198.patch, PHOENIX-4198_v2.patch, 
> PHOENIX-4198_v3.patch, PHOENIX-4198_v4.patch, PHOENIX-4198_v5.patch
>
>
> Problem statement:-
> A user who doesn't have access to a table should also not be able to modify 
> Phoenix metadata. Currently, every user is required to have write permission 
> on the SYSTEM tables, which is a security concern because they can 
> create/alter/drop/corrupt the metadata of any other table without proper 
> access to the corresponding physical tables.
> [~devaraj] recommended a solution as below.
> 1. A coprocessor endpoint would be implemented and all write accesses to the 
> catalog table would have to necessarily go through that. The 'hbase' user 
> would own that table. Today, there is MetaDataEndpointImpl that's run on the 
> RS where the catalog is hosted, and that could be enhanced to serve the 
> purpose we need.
> 2. The regionserver hosting the catalog table would do the needful for all 
> catalog updates - creating the mutations as needed, that is.
> 3. The coprocessor endpoint could use Ranger to do necessary authorization 
> checks before updating the catalog table. So for example, if a user doesn't 
> have authorization to create a table in a certain namespace, or update the 
> schema, etc., it can reject such requests outright. Only after successful 
> validations, does it perform the operations (physical operations to do with 
> creating the table, and updating the catalog table with the necessary 
> mutations).
> 4. In essence, the code that implements dealing with DDLs, would be hosted in 
> the catalog table endpoint. The client code would be really thin, and it 
> would just invoke the endpoint with the necessary info. The additional thing 
> that needs to be done in the endpoint is the validation of authorization to 
> prevent unauthorized users from making changes to som

[jira] [Created] (PHOENIX-4332) Stats - Altering guidepost width on base table does not propagate to global index

2017-10-30 Thread Mujtaba Chohan (JIRA)

Mujtaba Chohan created PHOENIX-4332:
---

 Summary: Stats - Altering guidepost width on base table does not 
propagate to global index
 Key: PHOENIX-4332
 URL: https://issues.apache.org/jira/browse/PHOENIX-4332
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Mujtaba Chohan


Altering the guidepost width on a data table using the {{ALTER TABLE}} command 
does not propagate to its global index.

Altering the global index table directly runs into a "not allowed" error.
{noformat}
ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1;
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column 
referenced by VIEW columnName=IDX (state=42M01,code=1010)
{noformat}





[jira] [Assigned] (PHOENIX-4332) Stats - Altering guidepost width on base table does not propagate to global index

2017-10-30 Thread Mujtaba Chohan (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mujtaba Chohan reassigned PHOENIX-4332:
---

Assignee: Samarth Jain

> Stats - Altering guidepost width on base table does not propagate to global 
> index
> -
>
> Key: PHOENIX-4332
> URL: https://issues.apache.org/jira/browse/PHOENIX-4332
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>
> Altering the guidepost width on a data table using the {{ALTER TABLE}} 
> command does not propagate to its global index.
> Altering the global index table directly runs into a "not allowed" error.
> {noformat}
> ALTER TABLE IDX SET GUIDE_POSTS_WIDTH=1;
> Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop 
> column referenced by VIEW columnName=IDX (state=42M01,code=1010)
> {noformat}




[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread Samarth Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225583#comment-16225583
 ] 

Samarth Jain commented on PHOENIX-4287:
---

There is some duplication, but generating the estimates when 
statsParallelization is off is relatively simple. We only need to intersect 
the scan start and stop keys with the guideposts, without worrying about 
region boundaries and everything else that the code in getParallelScans() 
handles. My previous attempt at using the existing code to generate estimates 
without generating intra-region scans failed miserably. I will sync with you 
offline to see what we can do to reuse the existing code.
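
The intersection described above can be sketched roughly as follows. This is 
an illustration under stated assumptions, not the patch: the Guidepost class 
and the estimate method are hypothetical names, each guidepost is assumed to 
carry the rows and bytes it accounts for, and an empty key is taken to mean 
"unbounded" as in HBase scans. Boundary handling in the real code may differ.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only; none of these names come from the Phoenix code base.
final class GuidepostEstimateSketch {

    /** One guidepost: its row key plus the rows and bytes it accounts for. */
    static final class Guidepost {
        final byte[] key;
        final long rows;
        final long bytes;

        Guidepost(byte[] key, long rows, long bytes) {
            this.key = key;
            this.rows = rows;
            this.bytes = bytes;
        }
    }

    /** Lexicographic comparison of byte arrays, treating each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns {rows, bytes} summed over the guideposts with startKey <= key < stopKey.
     * An empty startKey or stopKey means the range is unbounded on that side.
     */
    static long[] estimate(List<Guidepost> guideposts, byte[] startKey, byte[] stopKey) {
        long rows = 0;
        long bytes = 0;
        for (Guidepost gp : guideposts) {
            boolean afterStart = startKey.length == 0 || compare(gp.key, startKey) >= 0;
            boolean beforeStop = stopKey.length == 0 || compare(gp.key, stopKey) < 0;
            if (afterStart && beforeStop) {
                rows += gp.rows;
                bytes += gp.bytes;
            }
        }
        return new long[] {rows, bytes};
    }

    public static void main(String[] args) {
        List<Guidepost> gps = Arrays.asList(
                new Guidepost(new byte[] {'b'}, 10, 100),
                new Guidepost(new byte[] {'d'}, 20, 200),
                new Guidepost(new byte[] {'f'}, 30, 300));
        long[] est = estimate(gps, new byte[] {'c'}, new byte[] {'f'});
        System.out.println("rows=" + est[0] + " bytes=" + est[1]);
    }
}
```

Note that no region boundaries appear anywhere in the sketch, which is the 
simplification the comment is pointing at.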


[jira] [Commented] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225669#comment-16225669
 ] 

Hudson commented on PHOENIX-4322:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1851 (See 
[https://builds.apache.org/job/Phoenix-master/1851/])
PHOENIX-4322 DESC primary key column with variable length does not work 
(maryannxue: rev b0220fa7522fd7e1848ad428a47121b205dec504)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/SortOrderIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ScanUtil.java


> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
>     String table = generateUniqueName();
>     String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
>     Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, {"o3", "3"}};
>     runQueryTest(ddl, upsert("oid", "code"), insertedRows, new Object[][]{{"o2", "2"}, {"o1", "1"}},
>             new WhereCondition("(oid, code)", "IN", "(('o2', '2'), ('o1', '1'))"), table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.
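
As background on why variable-length DESC key columns need special care, here 
is a small standalone sketch. It illustrates the general encoding problem 
only; it is not Phoenix's key format and not the fix made in ScanUtil. The 
class name, the byte-inversion scheme and the 0xFF terminator are assumptions 
chosen for the example, and values are assumed to contain no zero bytes.

```java
import java.nio.charset.StandardCharsets;

// Illustration of the general encoding problem only; not Phoenix's key format.
final class DescKeySketch {

    /**
     * Inverts every byte so that ascending byte order of the encoded keys
     * corresponds to descending order of the original values.
     */
    static byte[] encodeDesc(String value, boolean terminate) {
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[raw.length + (terminate ? 1 : 0)];
        for (int i = 0; i < raw.length; i++) {
            out[i] = (byte) (raw[i] ^ 0xFF);
        }
        if (terminate) {
            // Assumed terminator: 0xFF sorts after any inverted non-zero byte.
            out[raw.length] = (byte) 0xFF;
        }
        return out;
    }

    /** Lexicographic comparison of byte arrays, treating each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // "ab" > "a", so in descending order "ab" must sort first (negative result).
        int withoutTerminator = compare(encodeDesc("ab", false), encodeDesc("a", false));
        int withTerminator = compare(encodeDesc("ab", true), encodeDesc("a", true));
        System.out.println("without terminator: " + Integer.signum(withoutTerminator));
        System.out.println("with terminator:    " + Integer.signum(withTerminator));
    }
}
```

Without a terminator, a value that is a prefix of another sorts first, which 
is the wrong order for DESC. Any code that builds or compares such keys has to 
account for the extra byte, including when the column is the last one in the 
primary key.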




[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-10-30 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225706#comment-16225706
 ] 

Andrew Purtell commented on PHOENIX-4328:
-

bq.  Cons: Expensive call every time, since this method is always called 
several times.

Can the result be cached in a static location? The check only needs to happen 
once. We aren't going to switch mapping strategy at runtime.
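
A rough sketch of what such caching could look like, assuming the existence 
check is passed in as a predicate. The class and method names are hypothetical 
and this is not Phoenix code; SYSTEM.CATALOG and SYSTEM:CATALOG are the 
spelled-out forms of what the description abbreviates as SYSCAT and SYS:CAT. 
A cached answer would go stale if the SYSTEM tables were migrated while the 
client is running, so the sketch also exposes an invalidate() hook.

```java
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Phoenix.
final class CachedSystemCatalogName {
    static final String NAMESPACE_MAPPED = "SYSTEM:CATALOG";
    static final String UNMAPPED = "SYSTEM.CATALOG";

    private final Predicate<String> tableExists; // stands in for an admin existence check
    private volatile String cachedName;

    CachedSystemCatalogName(Predicate<String> tableExists) {
        this.tableExists = tableExists;
    }

    /** Resolves the physical name once and reuses the answer afterwards. */
    String get() {
        String name = cachedName;
        if (name == null) {
            synchronized (this) {
                name = cachedName;
                if (name == null) {
                    name = tableExists.test(NAMESPACE_MAPPED) ? NAMESPACE_MAPPED : UNMAPPED;
                    cachedName = name;
                }
            }
        }
        return name;
    }

    /** Drops the cached answer, for example after a TableNotFoundException. */
    void invalidate() {
        cachedName = null;
    }

    public static void main(String[] args) {
        // Assumed cluster state: the namespace-mapped catalog exists.
        CachedSystemCatalogName resolver =
                new CachedSystemCatalogName(NAMESPACE_MAPPED::equals);
        System.out.println(resolver.get());
        System.out.println(resolver.get()); // served from the cache
    }
}
```

A single instance could be held in a static field so the expensive check runs 
at most once per JVM until it is invalidated.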

> Support clients having different "phoenix.schema.mapSystemTablesToNamespace" 
> property
> -
>
> Key: PHOENIX-4328
> URL: https://issues.apache.org/jira/browse/PHOENIX-4328
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Karan Mehta
>
> Imagine a scenario where we enable namespaces for Phoenix on the server side 
> and set the property {{phoenix.schema.isNamespaceMappingEnabled}} to true. A 
> bunch of clients are trying to connect to this cluster. All of these clients 
> have {{phoenix.schema.isNamespaceMappingEnabled}} set to true; however, for 
> some of them {{phoenix.schema.mapSystemTablesToNamespace}} is set to false 
> and it is true for others. (A typical case for a rolling upgrade.)
> The first client with {{phoenix.schema.mapSystemTablesToNamespace}} set to 
> true will acquire the lock in SYSMUTEX and migrate the system tables. As soon 
> as this happens, all the other clients will start failing. 
> There are two scenarios here.
> 1. A new client trying to connect to the server without this property set
> This will fail since ConnectionQueryServicesImpl checks whether SYSCAT is 
> namespace mapped or not. If there is a mismatch, it throws an exception, so 
> the client doesn't get a connection.
> 2. Clients already connected to the cluster that don't have this property set
> This will fail because every query calls the endpoint coprocessor on SYSCAT 
> to determine the PTable of the query table, and the physical HBase table name 
> is resolved based on the properties. Thus, we try to call the method on 
> SYSCAT instead of SYS:CAT and it results in a TableNotFoundException.
> This JIRA is to discuss the potential ways in which we can handle this 
> issue.
> Some ideas around this after discussing with [~twdsi...@gmail.com]:
> 1. Build retry logic around the code that works with SYSTEM tables 
> (coprocessor calls etc.). Try with SYSCAT and if it fails, try with SYS:CAT.
> Cons: Difficult to maintain, and the code is scattered all over. 
> 2. Use the SchemaUtil.getPhysicalTableName method to return the table name 
> that actually exists. (Only for SYSTEM tables)
> Call admin.tableExists to determine if SYSCAT or SYS:CAT exists and return 
> that name. The client properties get ignored on this one. 
> Cons: Expensive call every time, since this method is always called several 
> times.
> [~jamestaylor] [~elserj] [~an...@apache.org] [~apurtell] 
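
Idea 1 from the description could look roughly like the following. This is a 
sketch under assumptions, not a proposed patch: callWithFallback and the 
TableNotFound exception are hypothetical stand-ins, and the table names are the 
spelled-out forms of SYSCAT and SYS:CAT.

```java
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Phoenix.
final class SystemTableFallbackSketch {

    /** Stand-in for the TableNotFoundException mentioned in the description. */
    static final class TableNotFound extends RuntimeException {
        TableNotFound(String table) {
            super("Table not found: " + table);
        }
    }

    /**
     * Runs the call against the first table name and, only if that table
     * is missing, retries once against the second name.
     */
    static <T> T callWithFallback(String first, String second, Function<String, T> call) {
        try {
            return call.apply(first);
        } catch (TableNotFound e) {
            return call.apply(second);
        }
    }

    public static void main(String[] args) {
        // Assumed cluster state: only the namespace-mapped catalog exists.
        Function<String, String> call = name -> {
            if (!name.equals("SYSTEM:CATALOG")) {
                throw new TableNotFound(name);
            }
            return "called endpoint on " + name;
        };
        System.out.println(callWithFallback("SYSTEM.CATALOG", "SYSTEM:CATALOG", call));
    }
}
```

The drawback listed under idea 1 shows up directly: every call site that 
touches a SYSTEM table would need to be wrapped this way.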




[jira] [Created] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

2017-10-30 Thread Mujtaba Chohan (JIRA)

Mujtaba Chohan created PHOENIX-4333:
---

 Summary: Stats - Incorrect estimate when stats are updated on a 
tenant specific view
 Key: PHOENIX-4333
 URL: https://issues.apache.org/jira/browse/PHOENIX-4333
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Mujtaba Chohan
Assignee: Samarth Jain


Consider two tenants A, B with tenant specific view on 2 separate 
regions/region servers.

{noformat}
Region 1 keys:
A,1
A,2
B,1
Region 2 keys:
B,2
B,3
{noformat}

When stats are updated on the tenant A view, querying stats on the tenant B 
view yields partial results (they contain stats only for B,1). These are 
incorrect, even though the stats show the last-updated timestamp as current.




[jira] [Created] (PHOENIX-4334) Unable to update stats on views that reside on separate regions before phoenix.stats.updateFrequency has elapsed

2017-10-30 Thread Mujtaba Chohan (JIRA)

Mujtaba Chohan created PHOENIX-4334:
---

 Summary: Unable to update stats on views that reside on separate 
regions before phoenix.stats.updateFrequency has elapsed
 Key: PHOENIX-4334
 URL: https://issues.apache.org/jira/browse/PHOENIX-4334
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Mujtaba Chohan
Assignee: Samarth Jain


Consider multiple tenant views that each reside on a separate region/region 
server. Updating stats on any one of the views causes the other views to 
report the estimated stats' last update time as current. As a result, the 
update stats command is ignored for the other views until 
{{phoenix.stats.updateFrequency}} has elapsed.
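
One way to picture the reported behaviour is a frequency gate keyed too 
coarsely. The sketch below is a model of the symptom only: the class, the 
method and the key strings are hypothetical, and whether Phoenix really tracks 
the last update time per physical table rather than per view is an assumption 
made for the example, not something this report confirms.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the symptom; not Phoenix code.
final class StatsUpdateGateSketch {
    private final long minFrequencyMillis; // plays the role of phoenix.stats.updateFrequency
    private final Map<String, Long> lastUpdateByKey = new HashMap<>();

    StatsUpdateGateSketch(long minFrequencyMillis) {
        this.minFrequencyMillis = minFrequencyMillis;
    }

    /** Returns true if the update ran, false if it was skipped as too recent. */
    boolean tryUpdate(String key, long nowMillis) {
        Long last = lastUpdateByKey.get(key);
        if (last != null && nowMillis - last < minFrequencyMillis) {
            return false;
        }
        lastUpdateByKey.put(key, nowMillis);
        return true;
    }

    public static void main(String[] args) {
        // Keyed by physical table: view B's request is swallowed by view A's update.
        StatsUpdateGateSketch perTable = new StatsUpdateGateSketch(60_000);
        System.out.println(perTable.tryUpdate("T", 1_000)); // request from view A
        System.out.println(perTable.tryUpdate("T", 2_000)); // request from view B

        // Keyed by view: both requests run.
        StatsUpdateGateSketch perView = new StatsUpdateGateSketch(60_000);
        System.out.println(perView.tryUpdate("T/viewA", 1_000));
        System.out.println(perView.tryUpdate("T/viewB", 2_000));
    }
}
```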




[jira] [Created] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread Mujtaba Chohan (JIRA)

Mujtaba Chohan created PHOENIX-4335:
---

 Summary: System catalog snapshot created each time a new 
connection is created
 Key: PHOENIX-4335
 URL: https://issues.apache.org/jira/browse/PHOENIX-4335
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan


With current head of 4.x, System Catalog snapshot is created on each new 
connection.




[jira] [Commented] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread Mujtaba Chohan (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225876#comment-16225876
 ] 

Mujtaba Chohan commented on PHOENIX-4335:
-

This affects 4.12.0 release as well.

> System catalog snapshot created each time a new connection is created
> -
>
> Key: PHOENIX-4335
> URL: https://issues.apache.org/jira/browse/PHOENIX-4335
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>
> With current head of 4.x, System Catalog snapshot is created on each new 
> connection.




[jira] [Updated] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread Mujtaba Chohan (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mujtaba Chohan updated PHOENIX-4335:

Affects Version/s: 4.12.0

> System catalog snapshot created each time a new connection is created
> -
>
> Key: PHOENIX-4335
> URL: https://issues.apache.org/jira/browse/PHOENIX-4335
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>
> With current head of 4.x, System Catalog snapshot is created on each new 
> connection.




[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-10-30 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225997#comment-16225997
 ] 

Karan Mehta commented on PHOENIX-4328:
--

bq. Can the result be cached in a static location? The check only needs to 
happen once. We aren't going to switch mapping strategy at runtime.

This is not a one-time check; the concern is specifically at run time. We 
might have some clients (app-servers) with the property 
{{phoenix.schema.isNamespaceMappingEnabled}} SET and others with 
{{phoenix.schema.isNamespaceMappingEnabled}} UNSET, so there is a mismatch. 
We need a runtime resolution if we don't want downtime. Currently, if there is 
a mismatch, an exception is thrown and the client is not allowed to connect.


[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-10-30 Thread Ethan Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226068#comment-16226068
 ] 

Ethan Wang commented on PHOENIX-4328:
-

bq. The first client with phoenix.schema.mapSystemTablesToNamespace true will 
acquire lock in SYSMUTEX and migrate the system tables. As soon as this 
happens, all the other clients will start failing.

So will all other clients, whether "phoenix.schema.isNamespaceMappingEnabled" 
is set to true or false, fail while one client is upgrading from sys. to sys: ?


[jira] [Commented] (PHOENIX-4328) Support clients having different "phoenix.schema.mapSystemTablesToNamespace" property

2017-10-30 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226074#comment-16226074
 ] 

Karan Mehta commented on PHOENIX-4328:
--

bq. So will all other clients, whether 
"phoenix.schema.isNamespaceMappingEnabled" is set to true or false, fail while 
one client is upgrading from sys. to sys: ?

Clients with {{phoenix.schema.isNamespaceMappingEnabled}} set to true will 
only fail if two of them try to acquire a connection at the same time and both 
try to migrate the System tables to the System namespace. A client that 
connects to the cluster after the migration has completed will be fine.
Clients with {{phoenix.schema.isNamespaceMappingEnabled}} set to false will 
fail because of the two scenarios mentioned in the description.
Clients will fail anyway, since this involves disabling the SYSCAT table, 
which means downtime for sure.


[jira] [Commented] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226085#comment-16226085
 ] 

Hudson commented on PHOENIX-4322:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1852 (See 
[https://builds.apache.org/job/Phoenix-master/1852/])
PHOENIX-4322 DESC primary key column with variable length does not work 
(maryannxue: rev 45a9c275dbbf9206264236c690f40c309d97da3c)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ScanUtil.java


> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Updated] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread James Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4335:
--
Fix Version/s: 4.13.0

> System catalog snapshot created each time a new connection is created
> -
>
> Key: PHOENIX-4335
> URL: https://issues.apache.org/jira/browse/PHOENIX-4335
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
> Fix For: 4.13.0
>
>
> With current head of 4.x, System Catalog snapshot is created on each new 
> connection.




[jira] [Updated] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread James Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4335:
--
Priority: Blocker  (was: Major)

> System catalog snapshot created each time a new connection is created
> -
>
> Key: PHOENIX-4335
> URL: https://issues.apache.org/jira/browse/PHOENIX-4335
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Priority: Blocker
> Fix For: 4.13.0
>
>
> With current head of 4.x, System Catalog snapshot is created on each new 
> connection.




[jira] [Resolved] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread Maryann Xue (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue resolved PHOENIX-4322.
--
Resolution: Fixed

> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Updated] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread Maryann Xue (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-4322:
-
Fix Version/s: 4.13.0

> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
> Fix For: 4.13.0
>
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Reopened] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread James Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor reopened PHOENIX-4322:
---

Can we explore adjusting the RVC row key generation code instead of changing 
ScanUtil since this appears to be the only case that sometimes generates a 
trailing null byte? I'm a bit nervous to change the ScanUtil code if this is a 
very special case as that code is super critical.

Also, it's fine to submit patches that'll kick off a Jenkins run without a code 
review (see https://phoenix.apache.org/contributing.html#Local_Git_workflow), 
but let's make sure patches get reviewed before committing to branches that 
releases come out of. Or we could alternatively change our policy - feel free 
to start a discussion thread.

Since we're pretty close to getting a 4.13 release out, I'm going to revert 
this for now. FYI, the active release branches are 4.x-HBase-0.98 and master 
(which is used for HBase-1.3 releases).

> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
> Fix For: 4.13.0
>
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread Samarth Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4287:
--
Attachment: PHOENIX-4287_v3_wip.patch

WIP patch that attempts to reuse the existing code. It doesn't work yet.

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch, 
> PHOENIX-4287_v3_wip.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
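> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO  |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899       | 332170         | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+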
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
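> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | PLAN                                                                             | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS    |
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470       | 332151         | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY                                                  | 438492470       | 332151         | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW                                                 | 438492470       | 332151         | 1507928257617  |
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+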
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
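> +------------------------------------------------------+-----------------+----------------+-------------+
> | PLAN                                                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS |
> +------------------------------------------------------+-----------------+----------------+-------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null            | null           | null        |
> | SERVER FILTER BY FIRST KEY ONLY                      | null            | null           | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null            | null           | null        |
> +------------------------------------------------------+-----------------+----------------+-------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 27        |
> +-----------+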
> {noformat}




[jira] [Commented] (PHOENIX-4323) LocalIndexes could fail if your data row is not in the same region as your index region

2017-10-30 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226136#comment-16226136
 ] 

James Taylor commented on PHOENIX-4323:
---

If you want to ensure that local index always sorts before table data, you'd 
need to look at the schema of the Phoenix table. You could use an algorithm 
like this:
- while leading fixed length types, append 0 byte for width of type
- while variable length type, append 0 byte
- append one more 0 byte

Otherwise, it can still happen. If the above is prohibitively expensive, then 
can we live with the write failing for this corner case (and can we make it 
fail in a reasonable way)?
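
The three steps listed above can be sketched as follows. This is an illustrative reading of the list, not code from Phoenix: MinimalRowKey, the VARIABLE marker, and representing the schema as a list of primary key column widths are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the key-prefix algorithm described above.
final class MinimalRowKey {
    // Marker for a variable-length primary key column (assumption of this sketch).
    static final int VARIABLE = -1;

    // pkColumnWidths: byte width of each PK column in schema order,
    // or VARIABLE for variable-length columns.
    static byte[] build(List<Integer> pkColumnWidths) {
        int length = 0;
        int i = 0;
        // Leading fixed-length columns: one zero byte per byte of width.
        while (i < pkColumnWidths.size() && pkColumnWidths.get(i) != VARIABLE) {
            length += pkColumnWidths.get(i);
            i++;
        }
        // Variable-length columns that follow: a single zero byte each.
        while (i < pkColumnWidths.size() && pkColumnWidths.get(i) == VARIABLE) {
            length += 1;
            i++;
        }
        // One more trailing zero byte.
        length += 1;
        return new byte[length]; // Java zero-initialises byte arrays
    }

    public static void main(String[] args) {
        byte[] key = build(Arrays.asList(4, 8, VARIABLE, VARIABLE));
        System.out.println(key.length + " zero bytes");
    }
}
```

The sketch stops at the first fixed-length column that follows a variable-length one, which is one literal reading of the two "while" steps; whether that is the intended behaviour is not settled by the comment.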

> LocalIndexes could fail if your data row is not in the same region as your 
> index region
> ---
>
> Key: PHOENIX-4323
> URL: https://issues.apache.org/jira/browse/PHOENIX-4323
> Project: Phoenix
>  Issue Type: Bug
>Reporter: churro morales
>Assignee: Vincent Poon
> Attachments: LocalIndexIT.java
>
>
> This is not likely to happen, but if it does, your data table and index 
> write will never succeed. 
> In HRegion.doMiniBatchMutation() 
> You create index rows in the preBatchMutate() then when you call checkRow() 
> on that index row the exception will bubble up if the index row is not in the 
> same region as your data row.  
> Like I said this is unlikely, but you would have to do a region merge to fix 
> this issue if encountered.  
> [~vincentpoon] has a test which he will attach to this JIRA showing an 
> example how this can happen. The write will never succeed unless you merge 
> regions if this ever happens. 




[jira] [Commented] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226193#comment-16226193
 ] 

Hudson commented on PHOENIX-4322:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1853 (See 
[https://builds.apache.org/job/Phoenix-master/1853/])
Revert "PHOENIX-4322 DESC primary key column with variable length does 
(jtaylor: rev 6b24e0d5869839f861f3b7069e865e71d1fc61c6)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ScanUtil.java
Revert "PHOENIX-4322 DESC primary key column with variable length does 
(jtaylor: rev a7af29f9e90308d5a2805cc3eabf4e607fbe3cb2)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/SortOrderIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ScanUtil.java


> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
> Fix For: 4.13.0
>
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226207#comment-16226207
 ] 

Hadoop QA commented on PHOENIX-4287:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12894902/PHOENIX-4287_v3_wip.patch
  against master branch at commit a7af29f9e90308d5a2805cc3eabf4e607fbe3cb2.
  ATTACHMENT ID: 12894902

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+//boolean startNewScan = scanGrouper.shouldStartNewScan(plan, 
scans, startKey, crossedRegionBoundary);
+//
scan.setAttribute(BaseScannerRegionObserver.SCAN_REGION_SERVER, 
regionLocation.getServerName().getVersionedBytes());
+scans = addNewScan(parallelScans, scans, newScan, endKey, true, 
regionLocation, null, null, null);
+scans = addNewScan(parallelScans, scans, newScan, endKey, 
true, regionLocation, null, null, null);

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ExplainPlanWithStatsEnabledIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1594//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1594//console

This message is automatically generated.

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch, 
> PHOENIX-4287_v3_wip.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
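> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO  |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899       | 332170         | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+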
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--+-+++
> |   PLAN  
>  | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +--+-+++
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER 
> TABLE_T  | 438492470   | 332151 | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY 
>

[jira] [Commented] (PHOENIX-4290) Full table scan performed for DELETE with table having immutable indexes

2017-10-30 Thread Thomas D'Silva (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226235#comment-16226235
 ] 

Thomas D'Silva commented on PHOENIX-4290:
-

+1 LGTM

nit: fix this comment
{code}
+// The data table is always the last one in the list, but if 
an i
{code}


> Full table scan performed for DELETE with table having immutable indexes
> 
>
> Key: PHOENIX-4290
> URL: https://issues.apache.org/jira/browse/PHOENIX-4290
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4290_v1.patch, PHOENIX-4290_v2.patch, 
> PHOENIX-4290_wip1.patch, PHOENIX-4290_wip10.patch, PHOENIX-4290_wip2.patch, 
> PHOENIX-4290_wip3.patch, PHOENIX-4290_wip4.patch, PHOENIX-4290_wip5.patch, 
> PHOENIX-4290_wip6.patch, PHOENIX-4290_wip7.patch, PHOENIX-4290_wip8.patch, 
> PHOENIX-4290_wip9.patch
>
>
> If a DELETE command is issued with a partial match for the leading part of 
> the primary key, instead of using the data table, when the table has 
> immutable indexes, a full scan will occur against the index.




[jira] [Commented] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread Maryann Xue (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226282#comment-16226282
 ] 

Maryann Xue commented on PHOENIX-4322:
--

Sorry, [~jamestaylor], for checking that in without code being reviewed. I was 
thinking PHOENIX-3050 depended on this issue and the impact of this issue might 
be limited.

I first thought the issue lay in RVC evaluation, but found out later that RVC 
deliberately drops the ASC separators but keeps the DESC separators. 
I think it's just trying to match the way we organize PK 
columns into a rowkey. Since there's no way for RVC itself to know what context 
it is used in, I thought it would be better to look into the caller, which in 
this case is ScanUtil.
The condition I proposed to add to the "if" clause in my patch is now strict 
enough I think: we only avoid adding an extra separator if "this current value 
is not empty && it's an RVC && its last byte is already the same separator".
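
The condition quoted above can be written as a small predicate. This is a hedged sketch only: SeparatorGuard and its method are hypothetical and do not mirror the actual ScanUtil code, and the separator values (a zero byte for ASC, its bitwise inverse for DESC) are assumptions of the example.

```java
// Hypothetical sketch of the guard described above; names are illustrative.
final class SeparatorGuard {
    // Assumed separator bytes: 0x00 for ASC columns, inverted (0xFF) for DESC.
    static final byte ASC_SEPARATOR = (byte) 0x00;
    static final byte DESC_SEPARATOR = (byte) 0xFF;

    // Returns true when the caller should still append a separator byte.
    // The separator is skipped only if the value is non-empty, comes from a
    // RowValueConstructor, and already ends with that same separator.
    static boolean shouldAppendSeparator(byte[] value, boolean isRowValueConstructor,
                                         byte separator) {
        boolean alreadyTerminated = value.length > 0
                && isRowValueConstructor
                && value[value.length - 1] == separator;
        return !alreadyTerminated;
    }

    public static void main(String[] args) {
        byte[] rvcKey = {(byte) ~'o', (byte) ~'2', DESC_SEPARATOR};
        System.out.println(shouldAppendSeparator(rvcKey, true, DESC_SEPARATOR));
    }
}
```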

> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
> Fix For: 4.13.0
>
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Updated] (PHOENIX-3050) Handle DESC columns in child/parent join optimization

2017-10-30 Thread Maryann Xue (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-3050:
-
Attachment: PHOENIX-3050.patch

This patch will work after PHOENIX-4322 gets in. Existing test 
{{HashJoinMoreIT#testBug2961()}} verifies this issue.

> Handle DESC columns in child/parent join optimization
> -
>
> Key: PHOENIX-3050
> URL: https://issues.apache.org/jira/browse/PHOENIX-3050
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
> Attachments: PHOENIX-3050.patch
>
>
> We found that child/parent join optimization would not work with DESC pk 
> columns. So as a quick fix for PHOENIX-3029, we simply avoid DESC columns 
> when optimizing, which would have no impact on the overall approach and no 
> impact on ASC columns.
>  
> But eventually we need to make the optimization work with DESC columns too.




[jira] [Updated] (PHOENIX-4290) Full table scan performed for DELETE with table having immutable indexes

2017-10-30 Thread James Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4290:
--
Attachment: PHOENIX-4290_v3.patch

Thanks for the review, [~tdsilva]. Attaching final patch with code comment 
updated.

> Full table scan performed for DELETE with table having immutable indexes
> 
>
> Key: PHOENIX-4290
> URL: https://issues.apache.org/jira/browse/PHOENIX-4290
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.13.0, 4.12.1
>
> Attachments: PHOENIX-4290_v1.patch, PHOENIX-4290_v2.patch, 
> PHOENIX-4290_v3.patch, PHOENIX-4290_wip1.patch, PHOENIX-4290_wip10.patch, 
> PHOENIX-4290_wip2.patch, PHOENIX-4290_wip3.patch, PHOENIX-4290_wip4.patch, 
> PHOENIX-4290_wip5.patch, PHOENIX-4290_wip6.patch, PHOENIX-4290_wip7.patch, 
> PHOENIX-4290_wip8.patch, PHOENIX-4290_wip9.patch
>
>
> If a DELETE command is issued with a partial match for the leading part of 
> the primary key, instead of using the data table, when the table has 
> immutable indexes, a full scan will occur against the index.




[jira] [Commented] (PHOENIX-4322) DESC primary key column with variable length does not work in SkipScanFilter

2017-10-30 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226320#comment-16226320
 ] 

James Taylor commented on PHOENIX-4322:
---

No problem, [~maryannxue]. It doesn't seem right that the RVC implementation 
has trailing separator bytes. I'd take a look there and try to understand if 
that can be changed.

> DESC primary key column with variable length does not work in SkipScanFilter
> 
>
> Key: PHOENIX-4322
> URL: https://issues.apache.org/jira/browse/PHOENIX-4322
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Minor
> Fix For: 4.13.0
>
>
> Example:
> {code}
> @Test
> public void inDescCompositePK3() throws Exception {
> String table = generateUniqueName();
> String ddl = "CREATE table " + table + " (oid VARCHAR NOT NULL, code 
> VARCHAR NOT NULL constraint pk primary key (oid DESC, code DESC))";
> Object[][] insertedRows = new Object[][]{{"o1", "1"}, {"o2", "2"}, 
> {"o3", "3"}};
> runQueryTest(ddl, upsert("oid", "code"), insertedRows, new 
> Object[][]{{"o2", "2"}, {"o1", "1"}}, new WhereCondition("(oid, code)", "IN", 
> "(('o2', '2'), ('o1', '1'))"),
> table);
> }
> {code}
> Here the last column in primary key is in DESC order and has variable length, 
> and WHERE clause involves an "IN" operator with RowValueConstructor 
> specifying all PK columns. We get no results.
> This ends up being the root cause for not being able to use child/parent join 
> optimization on DESC pk columns as described in PHOENIX-3050.




[jira] [Updated] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread Samarth Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4287:
--
Attachment: PHOENIX-4287_v3.patch

I think I figured out what was going on. When we are not using stats for 
parallelization, we need to reset the start key of the scan to either the 
original scan's start key (if we are looking at the first region) or to the end 
key of the previous region.

[~jamestaylor] - your keen eyes would be much appreciated. It is tricky to get 
this stuff right.
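
The start-key rule described above can be sketched as follows. RegionScanSplitter is a hypothetical helper written for illustration; it assumes the caller already has the end keys of the regions the scan touches, in order, and it is not the code in the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: one scan range per region, without using stats.
final class RegionScanSplitter {
    // Unsigned lexicographic comparison of row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    // regionEndKeys: end keys of the regions the scan touches, in order; an
    // empty array means "end of table". An empty scanStop means unbounded.
    // Each returned element is a {startKey, endKey} pair.
    static List<byte[][]> split(byte[] scanStart, byte[] scanStop, List<byte[]> regionEndKeys) {
        List<byte[][]> ranges = new ArrayList<>();
        byte[] currentStart = scanStart; // first region: original scan start key
        for (byte[] regionEnd : regionEndKeys) {
            boolean scanEndsHere = regionEnd.length == 0
                    || (scanStop.length != 0 && compare(regionEnd, scanStop) >= 0);
            ranges.add(new byte[][] {currentStart, scanEndsHere ? scanStop : regionEnd});
            if (scanEndsHere) break;
            currentStart = regionEnd; // next region: end key of the previous region
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<byte[][]> ranges = split(
                "b".getBytes(StandardCharsets.UTF_8),
                "x".getBytes(StandardCharsets.UTF_8),
                Arrays.asList("g".getBytes(StandardCharsets.UTF_8),
                        "p".getBytes(StandardCharsets.UTF_8), new byte[0]));
        for (byte[][] r : ranges) {
            System.out.println(new String(r[0], StandardCharsets.UTF_8) + " .. "
                    + new String(r[1], StandardCharsets.UTF_8));
        }
    }
}
```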

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch, 
> PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query 
> returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
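> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO  |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899       | 332170         | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+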
> {noformat}
> Using data table
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
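> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | PLAN                                                                             | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS    |
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470       | 332151         | 1507928257617  |
> | SERVER FILTER BY FIRST KEY ONLY                                                  | 438492470       | 332151         | 1507928257617  |
> | SERVER AGGREGATE INTO SINGLE ROW                                                 | 438492470       | 332151         | 1507928257617  |
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+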
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
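> +------------------------------------------------------+-----------------+----------------+-------------+
> | PLAN                                                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS |
> +------------------------------------------------------+-----------------+----------------+-------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null            | null           | null        |
> | SERVER FILTER BY FIRST KEY ONLY                      | null            | null           | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null            | null           | null        |
> +------------------------------------------------------+-----------------+----------------+-------------+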

[jira] [Commented] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226336#comment-16226336
 ] 

James Taylor commented on PHOENIX-4335:
---

I think this may occur because I increment the MIN_SYSTEM_TABLE_TIMESTAMP, but 
there's no upgrade code that would ever increase the timestamp (because no 
upgrade code is required for 4.12).

WDYT, [~samarthjain]? It'd be good to have a test that fails if the upgrade is 
done again the second time you get a connection.

> System catalog snapshot created each time a new connection is created
> -
>
> Key: PHOENIX-4335
> URL: https://issues.apache.org/jira/browse/PHOENIX-4335
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Priority: Blocker
> Fix For: 4.13.0
>
>
> With current head of 4.x, System Catalog snapshot is created on each new 
> connection.




[jira] [Commented] (PHOENIX-4335) System catalog snapshot created each time a new connection is created

2017-10-30 Thread Samarth Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226347#comment-16226347
 ] 

Samarth Jain commented on PHOENIX-4335:
---

Would a straightforward change be to revert the MIN_SYSTEM_TABLE_TIMESTAMP 
increment? We rely on the system table's timestamp to check whether we need to 
create a snapshot.
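
A small model makes the suspected failure mode concrete. Everything here is hypothetical (the class, the constant's value, the upgradeBumpsTimestamp switch); it only assumes, as stated above, that a snapshot is taken whenever the system table's timestamp is below MIN_SYSTEM_TABLE_TIMESTAMP.

```java
// Hypothetical model of the check discussed above; not Phoenix code.
final class SystemCatalogUpgradeModel {
    // Illustrative value only.
    static final long MIN_SYSTEM_TABLE_TIMESTAMP = 200L;

    private long systemTableTimestamp;
    private final boolean upgradeBumpsTimestamp;
    private int snapshotsTaken = 0;

    SystemCatalogUpgradeModel(long initialTimestamp, boolean upgradeBumpsTimestamp) {
        this.systemTableTimestamp = initialTimestamp;
        this.upgradeBumpsTimestamp = upgradeBumpsTimestamp;
    }

    // Called once per new connection.
    void openConnection() {
        if (systemTableTimestamp < MIN_SYSTEM_TABLE_TIMESTAMP) {
            snapshotsTaken++; // snapshot taken before the upgrade path runs
            if (upgradeBumpsTimestamp) {
                // An upgrade step that raises the timestamp stops the loop.
                systemTableTimestamp = MIN_SYSTEM_TABLE_TIMESTAMP;
            }
            // Otherwise the timestamp never moves and the check stays true.
        }
    }

    int snapshotsTaken() {
        return snapshotsTaken;
    }

    public static void main(String[] args) {
        SystemCatalogUpgradeModel model = new SystemCatalogUpgradeModel(100L, false);
        for (int i = 0; i < 3; i++) {
            model.openConnection();
        }
        System.out.println("snapshots: " + model.snapshotsTaken());
    }
}
```

In this model, either reverting the constant or adding an upgrade step that raises the stored timestamp makes the condition false after the first connection.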

> System catalog snapshot created each time a new connection is created
> -
>
> Key: PHOENIX-4335
> URL: https://issues.apache.org/jira/browse/PHOENIX-4335
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Priority: Blocker
> Fix For: 4.13.0
>
>
> With current head of 4.x, System Catalog snapshot is created on each new 
> connection.




[jira] [Commented] (PHOENIX-4334) Unable to update stats on views that reside on separate regions before phoenix.stats.updateFrequency has elapsed

2017-10-30 Thread Samarth Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226349#comment-16226349
 ] 

Samarth Jain commented on PHOENIX-4334:
---

We store last_update_time at the physical table level. So if we end up 
collecting stats for view1, we will have to wait for 
phoenix.stats.updateFrequency to elapse before update stats on view2 has any 
effect. An alternative would be to set phoenix.stats.updateFrequency to 0. 

I will take a look at why view2 is reporting the estimate time as the current 
time.
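
The gating described above can be sketched roughly as follows. This is a toy 
model under stated assumptions, not Phoenix's implementation: the class, the 
method, and the in-memory map are hypothetical, and the only facts taken from 
this thread are that the last update time is kept per physical table and that 
a frequency of 0 is a possible workaround.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not Phoenix's actual stats collection code.
final class StatsUpdateGateSketch {
    private final long updateFrequencyMs;
    // Keyed by physical table, so all views on that table share one entry.
    private final Map<String, Long> lastUpdateByPhysicalTable = new HashMap<>();

    StatsUpdateGateSketch(long updateFrequencyMs) {
        this.updateFrequencyMs = updateFrequencyMs;
    }

    /** Returns true if the stats run is allowed (and records it), false if it is skipped. */
    boolean tryUpdateStats(String physicalTable, long nowMs) {
        Long last = lastUpdateByPhysicalTable.get(physicalTable);
        if (last != null && nowMs - last < updateFrequencyMs) {
            return false; // too soon since the last run on this physical table
        }
        lastUpdateByPhysicalTable.put(physicalTable, nowMs);
        return true;
    }

    public static void main(String[] args) {
        StatsUpdateGateSketch gate = new StatsUpdateGateSketch(1000L);
        System.out.println(gate.tryUpdateStats("BASE_TABLE", 0L));  // run for view1
        System.out.println(gate.tryUpdateStats("BASE_TABLE", 10L)); // run for view2, same physical table

        StatsUpdateGateSketch noDelay = new StatsUpdateGateSketch(0L);
        System.out.println(noDelay.tryUpdateStats("BASE_TABLE", 0L));
        System.out.println(noDelay.tryUpdateStats("BASE_TABLE", 10L));
    }
}
```

In this model, keying the map by logical table (view) name instead would let 
each view run independently, which corresponds to the non-trivial alternative 
of storing the time at the logical table level.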

> Unable to update stats on views that reside on separate regions before 
> phoenix.stats.updateFrequency has elapsed
> 
>
> Key: PHOENIX-4334
> URL: https://issues.apache.org/jira/browse/PHOENIX-4334
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>
> Consider multiple tenant views that all reside on unique region/region 
> servers. Updating stats on any one of the views causes the other views to 
> report their estimated stats last update time as the current time, so the 
> stats command is ignored for the other views until 
> {{phoenix.stats.updateFrequency}} has elapsed.




[jira] [Commented] (PHOENIX-4334) Unable to update stats on views that reside on separate regions before phoenix.stats.updateFrequency has elapsed

2017-10-30 Thread Samarth Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226350#comment-16226350
 ] 

Samarth Jain commented on PHOENIX-4334:
---

[~jamestaylor] - any other ideas on how we can prevent update stats on view2 
from being blocked when update stats on view1 has already run? We could 
possibly store last_update_stats_time at the logical table level too. But 
that would be a non-trivial change.





[jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization

2017-10-30 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226354#comment-16226354
 ] 

James Taylor commented on PHOENIX-4287:
---

Good catch, [~samarthjain]. Minor nit: it might be a little clearer to add a 
new local variable, such as initialKeyBytes, before the inner while loop to 
store the current value of currentKeyBytes, and then set currentKeyBytes back 
to it if you're not using stats for parallelization.
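
A rough sketch of that save-and-restore pattern, using a toy model rather than 
the actual patch: the class, the method, the string keys, and the range format 
are all hypothetical, and currentKey and initialKey stand in for 
currentKeyBytes and initialKeyBytes. The toy model also assumes that 
guideposts are walked even when stats are not used for parallelization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the code from the PHOENIX-4287 patch.
final class KeyResetSketch {

    /** Builds "start..end" scan ranges from region end keys and guideposts. */
    static List<String> scanRanges(String startKey, String[] regionEndKeys,
                                   String[] guideposts,
                                   boolean useStatsForParallelization) {
        List<String> ranges = new ArrayList<>();
        String currentKey = startKey;
        int gp = 0;
        for (String regionEnd : regionEndKeys) {
            // Saved before the inner loop so it can be restored afterwards.
            String initialKey = currentKey;
            while (gp < guideposts.length
                    && guideposts[gp].compareTo(regionEnd) < 0) {
                if (useStatsForParallelization) {
                    ranges.add(currentKey + ".." + guideposts[gp]);
                }
                currentKey = guideposts[gp]; // advanced while walking guideposts
                gp++;
            }
            if (!useStatsForParallelization) {
                // Restore, so the single range for this region starts at the
                // key the region walk began with, not at the last guidepost.
                currentKey = initialKey;
            }
            ranges.add(currentKey + ".." + regionEnd);
            currentKey = regionEnd;
        }
        return ranges;
    }

    public static void main(String[] args) {
        String[] regionEnds = {"m", "z"};
        String[] guideposts = {"f", "p"};
        System.out.println(scanRanges("a", regionEnds, guideposts, true));
        System.out.println(scanRanges("a", regionEnds, guideposts, false));
    }
}
```

In this toy model, dropping the restore step would make the ranges for the 
no-stats case start at the last guidepost of each region, leaving part of the 
key space unscanned.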

> Incorrect aggregate query results when stats are disable for parallelization
> 
>
> Key: PHOENIX-4287
> URL: https://issues.apache.org/jira/browse/PHOENIX-4287
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
> Environment: HBase 1.3.1
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
>  Labels: localIndex
> Fix For: 4.12.1
>
> Attachments: PHOENIX-4287.patch, PHOENIX-4287_v2.patch, 
> PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, an aggregate 
> query returns incorrect results when stats are available.
> With a local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
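> +---------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | PLAN                                                                                  | EST_BYTES_READ | EST_ROWS_READ | EST_INFO  |
> +---------------------------------------------------------------------------------------+----------------+---------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899      | 332170        | 150792825 |
> | SERVER FILTER BY FIRST KEY ONLY                                                       | 625043899      | 332170        | 150792825 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                      | 625043899      | 332170        | 150792825 |
> +---------------------------------------------------------------------------------------+----------------+---------------+-----------+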
> select count(*) from TABLE_T;
> +---+
> | COUNT(1)  |
> +---+
> | 0 |
> +---+
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
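> +----------------------------------------------------------------------------------+----------------+---------------+---------------+
> | PLAN                                                                             | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS   |
> +----------------------------------------------------------------------------------+----------------+---------------+---------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470      | 332151        | 1507928257617 |
> | SERVER FILTER BY FIRST KEY ONLY                                                  | 438492470      | 332151        | 1507928257617 |
> | SERVER AGGREGATE INTO SINGLE ROW                                                 | 438492470      | 332151        | 1507928257617 |
> +----------------------------------------------------------------------------------+----------------+---------------+---------------+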
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +---+
> | COUNT(1)  |
> +---+
> | 14|
> +---+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
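> +------------------------------------------------------+----------------+---------------+-------------+
> | PLAN                                                 | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> +------------------------------------------------------+----------------+---------------+-------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null           | null          | null        |
> | SERVER FILTER BY FIRST KEY ONLY                      | null           | null          | null        |
> | SERVER AGGREGATE INTO SINGLE ROW                     | null           | null          | null        |
> +------------------------------------------------------+----------------+---------------+-------------+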
> select /*+NO_INDEX*/ count(*) from
