[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354681#comment-15354681
 ] 

Josh Elser commented on PHOENIX-2724:
-

bq. It didn't make a difference either with phoenix.stats.cache.maxSize set to 
1G

Interesting, and peculiar. Do you have a representative test to simulate 
this that isn't 100's of GB? Is it possible to repro this on a smaller 
dataset by artificially reducing the guidepost size to something extremely 
small?

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1MB guidepost width on a ~900GB/500M row table, queries with a short 
> scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf

2016-06-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354600#comment-15354600
 ] 

Josh Elser commented on PHOENIX-2931:
-

Two more nits:

{code}
-print "Zookeeper not specified. \nUsage: sqlline.py  \
-\nExample: \n 1. sqlline.py localhost:2181:/hbase \n 2. sqlline.py \
-localhost:2181:/hbase ../examples/stock_symbol.sql"
-sys.exit()
+def printUsage():
+    print "\nUsage: sqlline.py [zookeeper_quorum_port] \
{code}

{{zookeeper_quorum_port}} seems inaccurate. I think the original {{zookeeper}} 
is better.

{code}
-if len(sys.argv) > 2:
-    sqlfile = "--run=" + phoenix_utils.shell_quote([sys.argv[2]])
+sqlfile = ""
+zookeeper = ""
 
 # HBase configuration folder path (where hbase-site.xml reside) for
 # HBase/Phoenix client side property override
 hbase_config_path = os.getenv('HBASE_CONF_DIR', phoenix_utils.current_dir)
 
+if len(sys.argv) == 2:
+    if os.path.isfile(sys.argv[1]):
+        sqlfile = sys.argv[1]
+    else:
+        zookeeper = sys.argv[1]
+
+if len(sys.argv) == 3:
+    if os.path.isfile(sys.argv[1]):
+        printUsage()
+    else:
+        zookeeper = sys.argv[1]
+        sqlfile = sys.argv[2]
+
+if sqlfile:
+    sqlfile = "--run=" + sqlfile
+
{code}

It seems like you dropped a {{phoenix_utils.shell_quote}} invocation on 
{{sqlfile}}.
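For reference, the breakage that quoting guards against can be sketched in isolation; here Python's {{shlex.quote}} stands in for {{phoenix_utils.shell_quote}}, and {{build_run_arg}} is an illustrative name, not the actual sqlline.py code:

```python
import shlex

def build_run_arg(sqlfile):
    # Quote the user-supplied file name so spaces and shell
    # metacharacters survive when the sqlline command line is built
    # (shlex.quote stands in for phoenix_utils.shell_quote here).
    return "--run=" + shlex.quote(sqlfile)

print(build_run_arg("my queries.sql"))  # --run='my queries.sql'
```

Without the quoting step, a file name containing a space would be split into two arguments by the shell.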

I can fix both of these on commit if you agree, [~aliciashu].

> Phoenix client asks users to provide configs in cli that are present on the 
> machine in hbase conf
> -
>
> Key: PHOENIX-2931
> URL: https://issues.apache.org/jira/browse/PHOENIX-2931
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Alicia Ying Shu
>Assignee: Alicia Ying Shu
>Priority: Minor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, 
> PHOENIX-2931-v3.patch, PHOENIX-2931-v4.patch, PHOENIX-2931.patch
>
>
> Users have complained about running commands like
> {code}
> phoenix-sqlline 
> pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure
>  service-logs.sql
> {code}
> However, the zookeeper quorum and port are available in the hbase configs. 
> Phoenix should read these configs from the system instead of having the user 
> supply them every time.
> What we can do is introduce a keyword "default". If it is specified, the 
> default zookeeper quorum and port will be taken from the hbase configs. 
> Otherwise, users can specify their own.
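The "read it from hbase configs" idea could be sketched like this on the Python side (an illustrative sketch, not the actual patch; the property names are HBase's standard {{hbase.zookeeper.quorum}} and {{hbase.zookeeper.property.clientPort}}):

```python
import xml.etree.ElementTree as ET

def zk_quorum_from_hbase_site(path):
    # Read the zookeeper quorum and client port from hbase-site.xml,
    # falling back to HBase's defaults when a property is absent.
    props = {}
    for prop in ET.parse(path).getroot().findall("property"):
        props[prop.findtext("name")] = prop.findtext("value")
    quorum = props.get("hbase.zookeeper.quorum", "localhost")
    port = props.get("hbase.zookeeper.property.clientPort", "2181")
    return "%s:%s" % (quorum, port)
```

With something like this, passing "default" on the command line could map to the value read from {{HBASE_CONF_DIR}}.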



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3037) Setup proper security context in compaction/split coprocessor hooks

2016-06-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354536#comment-15354536
 ] 

Lars Hofhansl commented on PHOENIX-3037:


[~samarthjain], [~giacomotaylor]

> Setup proper security context in compaction/split coprocessor hooks
> ---
>
> Key: PHOENIX-3037
> URL: https://issues.apache.org/jira/browse/PHOENIX-3037
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> See HBASE-16115 for a discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-3037) Setup proper security context in compaction/split coprocessor hooks

2016-06-28 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created PHOENIX-3037:
--

 Summary: Setup proper security context in compaction/split 
coprocessor hooks
 Key: PHOENIX-3037
 URL: https://issues.apache.org/jira/browse/PHOENIX-3037
 Project: Phoenix
  Issue Type: Bug
Reporter: Lars Hofhansl


See HBASE-16115 for a discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2999) Upgrading Multi-tenant table to map with namespace using upgradeUtil

2016-06-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354530#comment-15354530
 ] 

Hadoop QA commented on PHOENIX-2999:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12814571/PHOENIX-2999_v3.patch
  against master branch at commit 93e7c1b30d5934a0c2c668f79a0db8ebac5a92fb.
  ATTACHMENT ID: 12814571

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
34 warning messages.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+Map> tenantViewNames = 
MetaDataUtil.getTenantViewNames(phxConn, phoenixFullTableName);
+phxConn = DriverManager.getConnection(getUrl(), 
props).unwrap(PhoenixConnection.class);
+public void testMapMultiTenantTableToNamespaceDuringUpgrade() throws 
SQLException, SnapshotCreationException,
+String hbaseTableName = 
SchemaUtil.getPhysicalTableName(Bytes.toBytes(phoenixFullTableName), true)
++ "(k VARCHAR not null, v INTEGER not null, f INTEGER, g 
INTEGER NULL, h INTEGER NULL CONSTRAINT pk PRIMARY KEY(k,v)) 
MULTI_TENANT=true");
+.prepareStatement("UPSERT INTO " + phoenixFullTableName + 
" VALUES(?, ?, 0, 0, 0)");
+conn.createStatement().execute("create index " + indexName + " on 
" + phoenixFullTableName + "(f)");
+conn.createStatement().execute("CREATE VIEW diff.v (col VARCHAR) 
AS SELECT * FROM " + phoenixFullTableName);
+conn.createStatement().execute("CREATE VIEW test.v (col VARCHAR) 
AS SELECT * FROM " + phoenixFullTableName);
+conn.createStatement().execute("CREATE VIEW v (col VARCHAR) AS 
SELECT * FROM " + phoenixFullTableName);

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/420//testReport/
Javadoc warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/420//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/420//console

This message is automatically generated.

> Upgrading Multi-tenant table to map with namespace using upgradeUtil
> 
>
> Key: PHOENIX-2999
> URL: https://issues.apache.org/jira/browse/PHOENIX-2999
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2999.patch, PHOENIX-2999_v1.patch, 
> PHOENIX-2999_v2.patch, PHOENIX-2999_v3.patch
>
>
> Currently upgradeUtil doesn't handle multi-tenant tables with tenant views 
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3018) Write local updates to region than HTable in master branch

2016-06-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354502#comment-15354502
 ] 

Hudson commented on PHOENIX-3018:
-

FAILURE: Integrated in Phoenix-master #1299 (See 
[https://builds.apache.org/job/Phoenix-master/1299/])
PHOENIX-3018 Write local updates to region than HTable in master (rajeshbabu: 
rev 93e7c1b30d5934a0c2c668f79a0db8ebac5a92fb)
* phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
* 
phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/ParallelWriterIndexCommitter.java
* 
phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/recovery/TrackingParallelWriterIndexCommitter.java


> Write local updates to region than HTable in master branch
> --
>
> Key: PHOENIX-3018
> URL: https://issues.apache.org/jira/browse/PHOENIX-3018
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3018.patch, PHOENIX-3018_v2.patch
>
>
> Currently the master branch writes local index updates through HTable rather 
> than Region. We can make it write to the Region directly so the updates stay 
> local. This change is needed for the master branch only; other branches are 
> fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2999) Upgrading Multi-tenant table to map with namespace using upgradeUtil

2016-06-28 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354461#comment-15354461
 ] 

Ankit Singhal edited comment on PHOENIX-2999 at 6/29/16 3:22 AM:
-

[~samarthjain], I updated my patch with the fix to support the new local index 
implementation.
I know you are highly tied up with last-minute fixes, but would you mind also 
taking a look at this?
We need to get this into 4.8 as well, since the namespace upgrade is not 
working with the new local index implementation on views.


There is still an issue with the local index upgrade of a multi-tenant table 
having local indexes on multiple views (PHOENIX-3002). I pinged [~rajeshbabu] 
internally with reproduction steps and he is looking into it.


was (Author: an...@apache.org):
[~samarthjain], I updated my patch with the fix to support the new local index 
implementation.
I know you are highly tied up with last-minute fixes, but would you mind also 
taking a look at this?
We need to get this into 4.8 as well, since the namespace upgrade is not 
working with the new local index implementation on views.


> Upgrading Multi-tenant table to map with namespace using upgradeUtil
> 
>
> Key: PHOENIX-2999
> URL: https://issues.apache.org/jira/browse/PHOENIX-2999
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2999.patch, PHOENIX-2999_v1.patch, 
> PHOENIX-2999_v2.patch, PHOENIX-2999_v3.patch
>
>
> Currently upgradeUtil doesn't handle multi-tenant tables with tenant views 
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-2999) Upgrading Multi-tenant table to map with namespace using upgradeUtil

2016-06-28 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-2999:
---
Attachment: PHOENIX-2999_v3.patch

[~samarthjain], I updated my patch with the fix to support the new local index 
implementation.
I know you are highly tied up with last-minute fixes, but would you mind also 
taking a look at this?
We need to get this into 4.8 as well, since the namespace upgrade is not 
working with the new local index implementation on views.


> Upgrading Multi-tenant table to map with namespace using upgradeUtil
> 
>
> Key: PHOENIX-2999
> URL: https://issues.apache.org/jira/browse/PHOENIX-2999
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2999.patch, PHOENIX-2999_v1.patch, 
> PHOENIX-2999_v2.patch, PHOENIX-2999_v3.patch
>
>
> Currently upgradeUtil doesn't handle multi-tenant tables with tenant views 
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-2999) Upgrading Multi-tenant table to map with namespace using upgradeUtil

2016-06-28 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-2999:
---
Priority: Critical  (was: Major)

> Upgrading Multi-tenant table to map with namespace using upgradeUtil
> 
>
> Key: PHOENIX-2999
> URL: https://issues.apache.org/jira/browse/PHOENIX-2999
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2999.patch, PHOENIX-2999_v1.patch, 
> PHOENIX-2999_v2.patch, PHOENIX-2999_v3.patch
>
>
> Currently upgradeUtil doesn't handle multi-tenant tables with tenant views 
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-2968) Minimize RPCs for ALTER statement over APPEND_ONLY_SCHEMA

2016-06-28 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-2968:

Attachment: PHOENIX-2968.patch

[~jamestaylor]

Can you please review?

> Minimize RPCs for ALTER statement over APPEND_ONLY_SCHEMA
> -
>
> Key: PHOENIX-2968
> URL: https://issues.apache.org/jira/browse/PHOENIX-2968
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Mujtaba Chohan
>Assignee: Thomas D'Silva
>  Labels: Argus
> Fix For: 4.9.0
>
> Attachments: PHOENIX-2968.patch
>
>
> A view is created with {{CREATE VIEW IF NOT EXISTS ... APPEND_ONLY_SCHEMA = 
> true, UPDATE_CACHE_FREQUENCY=90}}.
> If {{ALTER VIEW  ADD IF NOT EXISTS MYCOL VARCHAR}} is executed 
> before each upsert statement, then performance for each 1K-row upsert batch 
> degrades by ~40X (50ms per 1K batch without the alter statement vs 2300ms 
> with it).
> This is for the case when MYCOL already exists. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf

2016-06-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353885#comment-15353885
 ] 

James Taylor commented on PHOENIX-2931:
---

Would you mind committing this, [~elserj], if you're good with it for 4.8?

> Phoenix client asks users to provide configs in cli that are present on the 
> machine in hbase conf
> -
>
> Key: PHOENIX-2931
> URL: https://issues.apache.org/jira/browse/PHOENIX-2931
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Alicia Ying Shu
>Assignee: Alicia Ying Shu
>Priority: Minor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, 
> PHOENIX-2931-v3.patch, PHOENIX-2931-v4.patch, PHOENIX-2931.patch
>
>
> Users have complained about running commands like
> {code}
> phoenix-sqlline 
> pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure
>  service-logs.sql
> {code}
> However, the zookeeper quorum and port are available in the hbase configs. 
> Phoenix should read these configs from the system instead of having the user 
> supply them every time.
> What we can do is introduce a keyword "default". If it is specified, the 
> default zookeeper quorum and port will be taken from the hbase configs. 
> Otherwise, users can specify their own.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: URIException with PhoenixHBaseLoader

2016-06-28 Thread James Taylor
Have you filed a JIRA yet and do you have a patch available?
Thanks,
James

On Monday, June 27, 2016, Prashant Kommireddi  wrote:

> Agreed. This method call isn't needed for phoenix loader (or any such
> non-direct-fs loaders). You should allow a config to handle it.
>
> On Mon, Jun 27, 2016 at 12:14 PM, Siddhi Mehta  > wrote:
>
> > Hello All,
> >
> > I am getting a URISyntaxException when I try to execute my pig script using
> > PhoenixHBaseLoader. Trace attached below.
> > Looking through the code, Pig splits the multiple paths provided to it on
> > commas (','), and during the query parsing step
> > QueryParserUtils.setHdfsServers(absolutePath, pigContext) tries to split
> > paths on commas and create URIs/Paths for them.
> >
> > Certain loaders like 'PhoenixHBaseLoader' do not pass HDFS locations and
> > instead pass a Phoenix query statement in the location, e.g.
> > *A = load 'hbase://query/SELECT ID,NAME,DATE FROM HIRES WHERE DATE >
> > TO_DATE('1990-12-21 05:55:00.000')*
> >
> > These locations need not be parsed to extract HDFS server paths.
> > Does it make sense to introduce a config/loader property to annotate
> > whether the loader/store is dealing with HDFS locations, and based on
> > that property decide whether to call
> > QueryParserUtils.setHdfsServers(absolutePath, pigContext)?
> >
> > *Thoughts?*
> >
> > * Stack trace *
> >
> > Caused by: Failed to parse: Pig script failed to parse:
> >  pig script failed to validate:
> > java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative
> > path in absolute URI: CREATED_DATE FROM HIRES WHERE
> > CREATED_DATE>=TO_DATE('1990-12-21
> >
> 05:55:00.000')%20AND%20CREATED_DATE%3CTO_DATE('2016-03-08%2008:00:00.000')
> > at
> > org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:199)
> > at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1712)
> > ... 30 more
> > Caused by:
> >  pig script failed to validate:
> > java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative
> > path in absolute URI: CREATED_DATE FROM HIRES WHERE
> > CREATED_DATE>=TO_DATE('1990-12-21
> >
> 05:55:00.000')%20AND%20CREATED_DATE%3CTO_DATE('2016-03-08%2008:00:00.000')
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:897)
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
> > at
> > org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
> > ... 31 more
> > Caused by: java.lang.IllegalArgumentException:
> java.net.URISyntaxException:
> > Relative path in absolute URI: CREATED_DATE FROM HIRES WHERE
> > CREATED_DATE>=TO_DATE('1990-12-21
> >
> 05:55:00.000')%20AND%20CREATED_DATE%3CTO_DATE('2016-03-08%2008:00:00.000')
> > at org.apache.hadoop.fs.Path.initialize(Path.java:206)
> > at org.apache.hadoop.fs.Path.(Path.java:172)
> > at
> >
> >
> org.apache.pig.parser.QueryParserUtils.getRemoteHosts(QueryParserUtils.java:138)
> > at
> >
> >
> org.apache.pig.parser.QueryParserUtils.setHdfsServers(QueryParserUtils.java:104)
> > at
> >
> >
> org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:892)
> > ... 37 more
> > Caused by: java.net.URISyntaxException: Relative path in absolute URI:
> > CREATED_DATE FROM HIRES WHERE CREATED_DATE>=TO_DATE('1990-12-21
> >
> 05:55:00.000')%20AND%20CREATED_DATE%3CTO_DATE('2016-03-08%2008:00:00.000')
> > at java.net.URI.checkPath(URI.java:1823)
> > at java.net.URI.(URI.java:745)
> > at org.apache.hadoop.fs.Path.initialize(Path.java:203)
> > ... 41 more
> >
> > Thanks,
> > Siddhi
> >
>
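The comma-splitting failure described in the thread can be reproduced in isolation; a sketch using the location string from the example above:

```python
# The location handed to Pig's loader is a SQL query, not an HDFS path
# (taken from the example in the thread):
location = ("hbase://query/SELECT ID,NAME,DATE FROM HIRES "
            "WHERE DATE > TO_DATE('1990-12-21 05:55:00.000')")

# Pig splits load locations on commas to support multi-path loads, so
# the SQL column list is misread as three separate "paths"; turning
# any of these fragments into a java.net.URI then fails exactly as in
# the stack trace above.
parts = location.split(",")
print(parts)
```

A loader-level flag, as proposed, would let Pig skip this splitting for locations that are queries rather than filesystem paths.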


[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353744#comment-15353744
 ] 

Mujtaba Chohan commented on PHOENIX-2724:
-

[~jamestaylor] It didn't make a difference either with 
{{phoenix.stats.cache.maxSize}} set to 1G

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1MB guidepost width on a ~900GB/500M row table, queries with a short 
> scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353722#comment-15353722
 ] 

James Taylor commented on PHOENIX-2724:
---

[~mujtabachohan] - did you get a chance to perf test this with the config 
mentioned above?

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1MB guidepost width on a ~900GB/500M row table, queries with a short 
> scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353721#comment-15353721
 ] 

James Taylor commented on PHOENIX-2724:
---

I'd suggest making the configuration for whether a query runs serially be a 
percentage of MAX_FILESIZE so it's not tied to guideposts. It's just a check 
used when a LIMIT is present, and 99% of the time the scan is going to be 
smaller than that.

Point queries will be faster if we use guideposts. We should remove that 
comment - we've perf tested in the past and verified this. I'm not sure how 
significant it is, but I think for 4.8 we should continue to use guideposts for 
point queries.

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1MB guidepost width on a ~900GB/500M row table, queries with a short 
> scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353718#comment-15353718
 ] 

Josh Elser commented on PHOENIX-2724:
-

bq. Mujtaba Chohan - as a first test, can you try increasing that cache size 
phoenix.stats.cache.maxSize? This will be an important config parameter. We 
might want to switch it to being a percentage of the heap instead of an 
absolute size.

Additionally, enabling {{TRACE}} on 
{{org.apache.phoenix.query.TableStatsCache}} will tell you when entries are 
added to or evicted from the client-side cache.
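Assuming the client's logging goes through a standard log4j.properties, that could be turned on with a line like this (illustrative snippet):

```properties
log4j.logger.org.apache.phoenix.query.TableStatsCache=TRACE
```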

bq. Mujtaba Chohan did try updating the client side cache by adjusting 
phoenix.client.maxMetaDataCacheSize to 1GB but that didn't help either.

If it is related to the stats not being cached, altering that property wouldn't 
change anything.

bq. Previously, the server-side cache was being used (which I think is bigger). 
If the cache is too small, we end up making an RPC each time to get the stats.

I'm also wondering if there's an optimization to be had in avoiding this case 
in TableStatsCache. We should be able to determine when this is happening (the 
cache not actually acting as a cache for some configuration reason) and just 
short-circuit the RPC, sending back {{EMPTY_STATS}}.
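The proposed short-circuit could be sketched as follows (illustrative Python, not Phoenix's actual Java TableStatsCache; all names here are stand-ins):

```python
EMPTY_STATS = {}  # stand-in for the EMPTY_STATS value mentioned above

class StatsCache:
    """Illustrative sketch of the proposed short-circuit."""

    def __init__(self, max_size_bytes, fetch_rpc):
        # A zero-sized cache can never retain entries, so every get()
        # would otherwise pay a stats RPC for nothing.
        self.enabled = max_size_bytes > 0
        self.fetch_rpc = fetch_rpc
        self.entries = {}

    def get(self, table):
        if not self.enabled:
            # Short-circuit: skip the RPC and report empty stats.
            return EMPTY_STATS
        if table not in self.entries:
            self.entries[table] = self.fetch_rpc(table)
        return self.entries[table]
```

The point of the sketch: when the cache is misconfigured into uselessness, the client detects that once and stops issuing per-query stats RPCs.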

[~samarthjain], [~mujtabachohan], sorry you both got sucked into debugging this 
one. I'm lamenting even more the lack of insight we have into this (ideally, it 
should have been very easy to tell after the fact). I've been rolling around an 
idea about some mechanism we can plug into on the client to better understand 
execution (nothing fancy). Maybe we need to think about this soon after 4.8.0.

I'm at a conference most of this week, but I'll try to keep an eye on my inbox 
and help out where possible.

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1MB guidepost width on a ~900GB/500M row table, queries with a short 
> scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-3036) Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT

2016-06-28 Thread Samarth Jain (JIRA)
Samarth Jain created PHOENIX-3036:
-

 Summary: Modify phoenix IT tests to extend 
BaseHBaseManagedTimeTableReuseIT
 Key: PHOENIX-3036
 URL: https://issues.apache.org/jira/browse/PHOENIX-3036
 Project: Phoenix
  Issue Type: Improvement
Reporter: Samarth Jain






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2990) Ensure documentation on "time/date" datatypes/functions acknowledge lack of JDBC compliance

2016-06-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353573#comment-15353573
 ] 

Josh Elser commented on PHOENIX-2990:
-

Gotcha. On the same page :)

> Ensure documentation on "time/date" datatypes/functions acknowledge lack of 
> JDBC compliance
> ---
>
> Key: PHOENIX-2990
> URL: https://issues.apache.org/jira/browse/PHOENIX-2990
> Project: Phoenix
>  Issue Type: Task
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2990.diff
>
>
> In talking with [~speleato] about some differences in test cases between the 
> thick and thin driver and DATE/TIMESTAMP datatypes, Sergio asked me if the 
> docs were accurate on the Phoenix website about this.
> Taking a look at Data Types and Functions documentation, we don't outwardly 
> warn users that these are not 100% compliant with the JDBC APIs.
> We do have the issue tracked in JIRA in PHOENIX-868 (and more, i'm sure), but 
> it would be good to make sure the website is also forward in warning users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353533#comment-15353533
 ] 

Samarth Jain commented on PHOENIX-2724:
---

Currently our way of determining whether a query should be executed serially 
relies on whether the amount of data we need to scan is below a threshold. 

So with what you are suggesting, we would decide to execute a query serially 
using a check like this:

{code}
if (perScanLimit == null || scan.getFilter() != null) {
    return false;
}
estRowSize = SchemaUtil.estimateRowSize(table);
return (perScanLimit * estRowSize < threshold);
{code}

This kind of check also means that we don't need to fetch guide post info to 
determine whether to execute a query serially or in parallel; the mode will 
then just be governed by a static threshold config. We would likely need to 
set the threshold to something like 500 MB or so, since having to scan a 20GB 
region with a single scan will likely cause queries to run slower.
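In isolation, the proposed condition amounts to the following (illustrative names; the 500 MB figure is just the example threshold above):

```python
def use_serial_scan(per_scan_limit, est_row_size, threshold_bytes):
    # Run serially only when the bytes a LIMIT query can touch
    # (limit rows x estimated row size) stay under a static threshold;
    # no guide posts are needed to make this decision.
    if per_scan_limit is None:
        return False
    return per_scan_limit * est_row_size < threshold_bytes
```

For example, a LIMIT 10 over ~1KB rows is far below a 500 MB threshold and would run serially, while a million-row limit would not.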

It also looks like we use guide posts today for executing point look up 
queries. 

{code}
private boolean useStats() {
    boolean isPointLookup = context.getScanRanges().isPointLookup();
    /*
     * Don't use guide posts if:
     * 1) We're doing a point lookup, as HBase is fast enough at those
     *    to not need them to be further parallelized. TODO: perf test to verify
     * 2) We're collecting stats, as in this case we need to scan entire
     *    regions worth of data to track where to put the guide posts.
     */
    if (isPointLookup || ScanUtil.isAnalyzeTable(scan)) {
        return false;
    }
    return true;
}

{code}

So it seems like we shouldn't fetch guide posts for point queries either.
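Putting the two observations together, one hedged sketch of a single guide-post-fetch gate, skipping the fetch for point lookups, stats collection, and serial scans alike, might look like this (the class and method names are illustrative, not Phoenix's actual API):

```java
// Illustrative only: a single gate for whether guide posts are worth fetching.
// Mirrors the useStats() logic quoted above, with the proposed extra case for
// scans that have already been chosen to run serially.
public class GuidePostGate {

    static boolean shouldFetchGuidePosts(boolean isPointLookup,
                                         boolean isAnalyzeTable,
                                         boolean isSerialScan) {
        // Point lookups: HBase is fast enough without extra parallelization.
        // ANALYZE: whole regions are scanned anyway to place guide posts.
        // Serial scans: region boundaries alone are sufficient.
        if (isPointLookup || isAnalyzeTable || isSerialScan) {
            return false;
        }
        // Parallel range scans are the case that benefits from guide posts.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldFetchGuidePosts(false, false, false)); // true
        System.out.println(shouldFetchGuidePosts(true, false, false));  // false
    }
}
```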







> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1 MB guidepost width on a ~900 GB / 500M-row table, queries with a 
> short scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}





[jira] [Resolved] (PHOENIX-3002) Upgrading to 4.8 doesn't recreate local indexes

2016-06-28 Thread Rajeshbabu Chintaguntla (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajeshbabu Chintaguntla resolved PHOENIX-3002.
--
Resolution: Fixed

Added variables for rs.getString(2), rs.getString(4), etc., and pushed the 
addendum to the master and 4.x branches. Thanks [~samarthjain] for the review.

> Upgrading to 4.8 doesn't recreate local indexes
> ---
>
> Key: PHOENIX-3002
> URL: https://issues.apache.org/jira/browse/PHOENIX-3002
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Rajeshbabu Chintaguntla
>Priority: Blocker
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3002.patch, PHOENIX-3002_addendum.patch, 
> PHOENIX-3002_v0.patch, PHOENIX-3002_v1.patch, PHOENIX-3002_v2.patch, 
> PHOENIX-3002_v3.patch
>
>
> [~rajeshbabu] - I noticed that when upgrading to 4.8, local indexes created 
> with 4.7 or before aren't getting recreated with the new local indexes 
> implementation.  I am not seeing the metadata rows for the recreated indices 
> in SYSTEM.CATALOG.





[jira] [Resolved] (PHOENIX-3027) Upserting rows to a table with a mutable index using a tenant specific connection fails

2016-06-28 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva resolved PHOENIX-3027.
-
Resolution: Fixed

> Upserting rows to a table with a mutable index using a tenant specific 
> connection fails
> ---
>
> Key: PHOENIX-3027
> URL: https://issues.apache.org/jira/browse/PHOENIX-3027
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
>Reporter: Thomas D'Silva
>Assignee: Thomas D'Silva
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3027-v2.patch, PHOENIX-3027.patch
>
>
> With the following exception
> org.apache.phoenix.execute.CommitException: java.sql.SQLException: ERROR 2008 
> (INT10): Unable to find cached index metadata.  ERROR 2008 (INT10): ERROR 
> 2008 (INT10): Unable to find cached index metadata.  key=-2923123348037284635 
> region=T_1466804698521,,1466804698532.3d1ab071438dad421af3f78e8af3530d. Index 
> update failed
>   at org.apache.phoenix.execute.MutationState.send(MutationState.java:984)
>   at 
> org.apache.phoenix.execute.MutationState.send(MutationState.java:1317)
>   at 
> org.apache.phoenix.execute.MutationState.commit(MutationState.java:1149)
>   at 
> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:524)
>   at 
> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:1)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:521)
>   at 
> org.apache.phoenix.end2end.index.MutableIndexIT.testTenantSpecificConnection(MutableIndexIT.java:671)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>   at 
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
> Caused by: java.sql.SQLException: ERROR 2008 (INT10): Unable to find cached 
> index metadata.  ERROR 2008 (INT10): ERROR 20

[jira] [Updated] (PHOENIX-3027) Upserting rows to a table with a mutable index using a tenant specific connection fails

2016-06-28 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3027:

Attachment: PHOENIX-3027-v2.patch

Sure, I made the change. I have attached the final patch.

> Upserting rows to a table with a mutable index using a tenant specific 
> connection fails
> ---
>
> Key: PHOENIX-3027
> URL: https://issues.apache.org/jira/browse/PHOENIX-3027
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
>Reporter: Thomas D'Silva
>Assignee: Thomas D'Silva
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3027-v2.patch, PHOENIX-3027.patch
>
>
> With the following exception
> org.apache.phoenix.execute.CommitException: java.sql.SQLException: ERROR 2008 
> (INT10): Unable to find cached index metadata.  ERROR 2008 (INT10): ERROR 
> 2008 (INT10): Unable to find cached index metadata.  key=-2923123348037284635 
> region=T_1466804698521,,1466804698532.3d1ab071438dad421af3f78e8af3530d. Index 
> update failed
>   at org.apache.phoenix.execute.MutationState.send(MutationState.java:984)
>   at 
> org.apache.phoenix.execute.MutationState.send(MutationState.java:1317)
>   at 
> org.apache.phoenix.execute.MutationState.commit(MutationState.java:1149)
>   at 
> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:524)
>   at 
> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:1)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:521)
>   at 
> org.apache.phoenix.end2end.index.MutableIndexIT.testTenantSpecificConnection(MutableIndexIT.java:671)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>   at 
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
>   at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
> Caused by: java.sql.SQLException: ERROR 2008 (INT10

[jira] [Commented] (PHOENIX-3028) StatisticsWriter should update the stats collection timestamp asynchronously

2016-06-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353506#comment-15353506
 ] 

Hudson commented on PHOENIX-3028:
-

SUCCESS: Integrated in Phoenix-master #1297 (See 
[https://builds.apache.org/job/Phoenix-master/1297/])
PHOENIX-3028 StatisticsWriter should update the stats collection (samarth: rev 
f12bda4b45100622b4c1faf59e46a5d4493a1d29)
* 
phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsWriter.java
* 
phoenix-core/src/main/java/org/apache/phoenix/schema/stats/DefaultStatisticsCollector.java
* 
phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollectionRunTracker.java
* 
phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsScanner.java


> StatisticsWriter should update the stats collection timestamp asynchronously
> 
>
> Key: PHOENIX-3028
> URL: https://issues.apache.org/jira/browse/PHOENIX-3028
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3028.patch, PHOENIX-3028_v2.patch
>
>






[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353472#comment-15353472
 ] 

James Taylor commented on PHOENIX-2724:
---

[~samarthjain] - what do you think of the idea of not attempting to get 
guideposts just for serial queries? Seems like an easy fix and worthwhile 
regardless.

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1 MB guidepost width on a ~900 GB / 500M-row table, queries with a 
> short scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}





[jira] [Commented] (PHOENIX-2990) Ensure documentation on "time/date" datatypes/functions acknowledge lack of JDBC compliance

2016-06-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353465#comment-15353465
 ] 

James Taylor commented on PHOENIX-2990:
---

It's fine to commit now, [~elserj]. I meant that it'd be good to make our 
date/time types JDBC compliant in a future major release, but *not* our next 
4.8 release.

> Ensure documentation on "time/date" datatypes/functions acknowledge lack of 
> JDBC compliance
> ---
>
> Key: PHOENIX-2990
> URL: https://issues.apache.org/jira/browse/PHOENIX-2990
> Project: Phoenix
>  Issue Type: Task
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2990.diff
>
>
> In talking with [~speleato] about some differences in test cases between the 
> thick and thin driver and DATE/TIMESTAMP datatypes, Sergio asked me if the 
> docs were accurate on the Phoenix website about this.
> Taking a look at Data Types and Functions documentation, we don't outwardly 
> warn users that these are not 100% compliant with the JDBC APIs.
> We do have the issue tracked in JIRA in PHOENIX-868 (and more, i'm sure), but 
> it would be good to make sure the website is also forward in warning users.





[jira] [Resolved] (PHOENIX-3028) StatisticsWriter should update the stats collection timestamp asynchronously

2016-06-28 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain resolved PHOENIX-3028.
---
   Resolution: Fixed
Fix Version/s: 4.8.0

> StatisticsWriter should update the stats collection timestamp asynchronously
> 
>
> Key: PHOENIX-3028
> URL: https://issues.apache.org/jira/browse/PHOENIX-3028
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3028.patch, PHOENIX-3028_v2.patch
>
>






Re: where are we at with the RC?

2016-06-28 Thread Samarth Jain
PHOENIX-3028 is in. PHOENIX-2724 is a blocker as of now. I will work on it
today to see if it is just a matter of tuning the Phoenix config or something
more. If time permits, I would like to opportunistically get PHOENIX-3035 in.

On Mon, Jun 27, 2016 at 10:56 PM, James Taylor 
wrote:

> What are the outstanding JIRAs? Would it be possible to update this daily
> so we can zero in on getting an RC up? If folks could commit their
> outstanding patches (or find a committer to do it for you), that'd be much
> appreciated.
>
> Thanks,
> James
>


[jira] [Commented] (PHOENIX-2990) Ensure documentation on "time/date" datatypes/functions acknowledge lack of JDBC compliance

2016-06-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353320#comment-15353320
 ] 

Josh Elser commented on PHOENIX-2990:
-

Oh, ok. I can hold off on updating the website until after 4.8 goes out. Thanks 
James. 

> Ensure documentation on "time/date" datatypes/functions acknowledge lack of 
> JDBC compliance
> ---
>
> Key: PHOENIX-2990
> URL: https://issues.apache.org/jira/browse/PHOENIX-2990
> Project: Phoenix
>  Issue Type: Task
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2990.diff
>
>
> In talking with [~speleato] about some differences in test cases between the 
> thick and thin driver and DATE/TIMESTAMP datatypes, Sergio asked me if the 
> docs were accurate on the Phoenix website about this.
> Taking a look at Data Types and Functions documentation, we don't outwardly 
> warn users that these are not 100% compliant with the JDBC APIs.
> We do have the issue tracked in JIRA in PHOENIX-868 (and more, i'm sure), but 
> it would be good to make sure the website is also forward in warning users.





[jira] [Commented] (PHOENIX-2966) Implement mechanism to build async indexes when running locally

2016-06-28 Thread Loknath Priyatham Teja Singamsetty (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353063#comment-15353063
 ] 

Loknath Priyatham Teja Singamsetty  commented on PHOENIX-2966:
--

[~jamestaylor][~samarthjain] 

a) I borrowed the query from PhoenixMRJobSubmitter's CANDIDATE_INDEX_INFO_QUERY, 
which is also missing the column_family and async_created_date columns. I will 
correct the query, add the required columns for both use cases, and move it to a 
common constants file instead of defining it in two places.
b) The change to PhoenixEmbeddedDriver.java is a minor typo correction, renaming 
the zookeeper-related variable from "QUARUM" to "QUORUM".
c) On the unit test front, I don't see any unit test for BuildIndexScheduleTask, 
the other timer task inside MetaDataRegionObserver.java. I will see if I can add 
one, or cover it as part of a workflow or functional test.



> Implement mechanism to build async indexes when running locally
> ---
>
> Key: PHOENIX-2966
> URL: https://issues.apache.org/jira/browse/PHOENIX-2966
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Loknath Priyatham Teja Singamsetty 
> Fix For: 4.8.0
>
> Attachments: phoenix-2966.patch
>
>
> For local, non-distributed environments, we need to build indexes that were 
> built asynchronously.





[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352576#comment-15352576
 ] 

Samarth Jain commented on PHOENIX-2724:
---

[~mujtabachohan] did try updating the client-side cache by adjusting 
phoenix.client.maxMetaDataCacheSize to 1 GB, but that didn't help either. We 
will try playing with the server-side cache size tomorrow and see if it helps. 
I will also do some microbenchmarking to better understand the bottleneck.
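For reference, a hedged sketch of how the two cache sizes under discussion might be set in hbase-site.xml. The property names come from this thread (phoenix.client.maxMetaDataCacheSize here, phoenix.stats.cache.maxSize earlier); the 1 GB values are simply the ones being tried, not recommendations.

```xml
<!-- Illustrative hbase-site.xml fragment; values are examples only. -->
<configuration>
  <!-- Server-side stats cache size, in bytes. -->
  <property>
    <name>phoenix.stats.cache.maxSize</name>
    <value>1073741824</value> <!-- 1 GB -->
  </property>
  <!-- Client-side metadata cache size, in bytes. -->
  <property>
    <name>phoenix.client.maxMetaDataCacheSize</name>
    <value>1073741824</value> <!-- 1 GB -->
  </property>
</configuration>
```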

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1 MB guidepost width on a ~900 GB / 500M-row table, queries with a 
> short scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}





[jira] [Created] (PHOENIX-3035) Prevent statistics collection from failing major compaction

2016-06-28 Thread Samarth Jain (JIRA)
Samarth Jain created PHOENIX-3035:
-

 Summary: Prevent statistics collection from failing major 
compaction
 Key: PHOENIX-3035
 URL: https://issues.apache.org/jira/browse/PHOENIX-3035
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain
Assignee: Samarth Jain








[jira] [Updated] (PHOENIX-3028) StatisticsWriter should update the stats collection timestamp asynchronously

2016-06-28 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3028:
--
Summary: StatisticsWriter should update the stats collection timestamp 
asynchronously  (was: StatisticsWriter shouldn't fail compaction in case of 
errors)

> StatisticsWriter should update the stats collection timestamp asynchronously
> 
>
> Key: PHOENIX-3028
> URL: https://issues.apache.org/jira/browse/PHOENIX-3028
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-3028.patch, PHOENIX-3028_v2.patch
>
>






[jira] [Updated] (PHOENIX-3028) StatisticsWriter shouldn't fail compaction in case of errors

2016-06-28 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3028:
--
Attachment: PHOENIX-3028_v2.patch

Thanks for the review, [~jamestaylor]. I updated the patch to address test 
failures arising from not using the correct timestamp when generating the put. 

I will address preventing stats collection from failing compaction, along with 
the custom config, in a separate JIRA. If time permits, I will try to get that 
one in for 4.8 too.

> StatisticsWriter shouldn't fail compaction in case of errors
> 
>
> Key: PHOENIX-3028
> URL: https://issues.apache.org/jira/browse/PHOENIX-3028
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-3028.patch, PHOENIX-3028_v2.patch
>
>






Re: [GSOC 2016] [Status Update]

2016-06-28 Thread Ayola Jayamaha
Hi All,

There are some improvements to the Phoenix UI on the query-builder branch [1],
based on the JIRAs [2,3]. Screenshots are attached to the respective JIRA
issues.

I am currently fixing the trace web application so that the zookeeper host
and port can be changed from their default values.

HTrace also enables Dapper-like tracing, as in Zipkin [4]. To integrate
Zipkin, I am planning to use the instrumented library Brave or Dropwizard
Zipkin.

Thank you.

[1] https://github.com/AyolaJayamaha/phoenix/tree/query-builder
[2] https://issues.apache.org/jira/browse/PHOENIX-2208
[3] https://issues.apache.org/jira/browse/PHOENIX-2701

[4] https://issues.apache.org/jira/browse/HBASE-6449


On Wed, May 25, 2016 at 10:45 AM, Ayola Jayamaha 
wrote:

>
> Hi All,
>
> I went through the HTrace and Zipkin documentation and experimented with the
> Zipkin Collector, Storage, Query Service, and Web UI. I have shared a summary
> of my findings in these blog posts [1-3].
>
> This is my proposal for improving the Phoenix UI [4]. Currently I am working
> through:
>
> - Completing an industrial-standard way to embed the tracing web application
>   as a service
> - Improving the web application with a better package management system
> - Setting a common Jetty version for Phoenix
> - Creating a page which, under the covers, turns on tracing
>
> I have started on setting a common Jetty version [5]
>
> Thank you.
>
> [1] http://ayolajayamaha.blogspot.com/2016/05/zipkin-architecture.html
> [2]
> http://ayolajayamaha.blogspot.com/2016/05/zipkin-distributed-tracing-system.html
> [3]
> http://ayolajayamaha.blogspot.com/2016/05/zipkin-integration-with-htrace.html
> [4]
> https://docs.google.com/document/d/1Mvcae5JLws_ivpiWP8PuAqUhA27k1H_I9_OVEhJTzOY/
> [5] https://issues.apache.org/jira/browse/PHOENIX-2211
> --
> Best Regards,
> Ayola Jayamaha
> http://ayolajayamaha.blogspot.com/
>
>
>


-- 
Best Regards,
Ayola Jayamaha
http://ayolajayamaha.blogspot.com/


[jira] [Created] (PHOENIX-3034) Passing zookeeper quorum host and port numbers as commandline arguments

2016-06-28 Thread Nishani (JIRA)
Nishani  created PHOENIX-3034:
-

 Summary: Passing zookeeper quorum host and port numbers as 
commandline arguments
 Key: PHOENIX-3034
 URL: https://issues.apache.org/jira/browse/PHOENIX-3034
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.8.0
Reporter: Nishani 
Priority: Minor


Currently the host and port default to localhost:2181. They need to be 
configurable when the user requires different values.





[jira] [Commented] (PHOENIX-2724) Query with large number of guideposts is slower compared to no stats

2016-06-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352489#comment-15352489
 ] 

James Taylor commented on PHOENIX-2724:
---

bq. Another pretty easy fix would be to not try to get the guideposts for 
certain queries - point lookups and serial queries. We're fine just using the 
region boundaries in those cases.
On second thought, probably just for serial queries (but I don't recall if we 
get the guideposts for these). FWIW, I don't think we need to. It's useful to 
get guideposts for point lookups so we maximize the parallelism.

> Query with large number of guideposts is slower compared to no stats
> 
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
>Reporter: Mujtaba Chohan
>Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch, 
> PHOENIX-2724_v2.patch
>
>
> With a 1 MB guidepost width on a ~900 GB / 500M-row table, queries with a 
> short scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}


