[jira] [Updated] (PHOENIX-5432) Refactor LiteralExpression to use the builder pattern

2020-03-06 Thread Christine Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Feng updated PHOENIX-5432:

Description: 
LiteralExpression is a mess. While it provides newConstant() APIs to build the 
object, it also provides two public constructors. There are 10 overloaded 
newConstant() methods and it is unclear which API to use in which case.

This should be refactored to use the builder pattern and final member 
variables. Ideally, getters such as getMaxLength() should be simple member 
variable accessors and other ad-hoc logic surrounding those variables should be 
handled correctly when setting their respective values.

 

Two solutions:
 # -Consolidate the LiteralExpression newConstant() methods down into a single 
build() method-
 ** -Pros: easy to use since one build method can create all LiteralExpression 
objects-
 ** -Cons: requires adding 'throws SQLException' to a lot of method signatures 
where it wasn't necessary before, which can be confusing for future developers 
and potentially cause problems if a SQLException is actually thrown in some of 
these cases-
 # Create two build() methods: one for LiteralExpressions that require deriving 
the value (which could throw a SQLException) and one for those that don't
 ** Pros: don't need to change any existing method signatures
 ** Cons: requires future developers to know which of the two build methods to 
use

NOTE: No backward-compatibility testing is to be done on the master branch; 
waiting on review.
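
As a rough illustration of the second option, the following is a minimal sketch of a builder with final member variables and two build methods. The class name LiteralSketch, the method names build() and buildDerived(), and the length-derivation rule are all hypothetical stand-ins chosen for this example; none of them is taken from the Phoenix code base or from the attached patches.

```java
import java.sql.SQLException;

// Hypothetical stand-in for LiteralExpression; NOT the actual Phoenix class.
final class LiteralSketch {
    private final Object value;       // final members, assigned once by the builder
    private final Integer maxLength;

    private LiteralSketch(Object value, Integer maxLength) {
        this.value = value;
        this.maxLength = maxLength;
    }

    Object getValue() { return value; }

    // Plain accessor: any ad-hoc logic ran when the value was set, not here.
    Integer getMaxLength() { return maxLength; }

    static final class Builder {
        private Object value;
        private Integer maxLength;

        Builder setValue(Object value) { this.value = value; return this; }
        Builder setMaxLength(Integer maxLength) { this.maxLength = maxLength; return this; }

        // Build method 1 (assumed name): nothing is derived, so no checked exception.
        LiteralSketch build() {
            return new LiteralSketch(value, maxLength);
        }

        // Build method 2 (assumed name): derives maxLength from the value and may throw.
        LiteralSketch buildDerived() throws SQLException {
            Integer derived = maxLength;
            if (value instanceof String) {
                int length = ((String) value).length();
                if (maxLength != null && length > maxLength) {
                    throw new SQLException("Value longer than maxLength " + maxLength);
                }
                if (derived == null) {
                    derived = length;
                }
            }
            return new LiteralSketch(value, derived);
        }
    }
}
```

With this shape, callers that never derive a value keep their existing signatures by calling build(), and only callers of buildDerived() need to handle SQLException.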

 

 

  was:
LiteralExpression is a mess. While it provides newConstant() APIs to build the 
object, it also provides two public constructors. There are 10 overloaded 
newConstant() methods and it is unclear which API to use in which case.

This should be refactored to use the builder pattern and final member 
variables. Ideally, getters such as getMaxLength() should be simple member 
variable accessors and other ad-hoc logic surrounding those variables should be 
handled correctly when setting their respective values.

 

Two solutions:
 # Consolidate the LiteralExpression newConstant() methods down into a single 
build() method
 ** Pros: easy to use since one build method can create all LiteralExpression 
objects
 ** Cons: requires adding 'throws SQLException' to a lot of method signatures 
where it wasn't necessary before, which can be confusing for future developers 
and potentially cause problems if a SQLException is actually thrown in some of 
these cases
 # Create two build() methods: one for LiteralExpressions that require deriving 
the value (which could throw a SQLException) and one for those that don't
 ** Pros: don't need to change any existing method signatures
 ** Cons: requires future developers to know which of the two build methods to 
use

 

 


> Refactor LiteralExpression to use the builder pattern
> -
>
> Key: PHOENIX-5432
> URL: https://issues.apache.org/jira/browse/PHOENIX-5432
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.15.0, 5.1.0
>Reporter: Chinmay Kulkarni
>Assignee: Christine Feng
>Priority: Major
> Attachments: PHOENIX-5432-master-v1.patch, 
> PHOENIX-5432.master.v10.patch, PHOENIX-5432.master.v11.patch, 
> PHOENIX-5432.master.v12.patch, PHOENIX-5432.master.v13.patch, 
> PHOENIX-5432.master.v14.patch, PHOENIX-5432.master.v2.patch, 
> PHOENIX-5432.master.v3.patch, PHOENIX-5432.master.v4.patch, 
> PHOENIX-5432.master.v5.patch, PHOENIX-5432.master.v6.patch, 
> PHOENIX-5432.master.v7.patch, PHOENIX-5432.master.v8.patch, 
> PHOENIX-5432.master.v9.patch, PHOENIX-5432.patch
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> LiteralExpression is a mess. While it provides newConstant() APIs to build 
> the object, it also provides two public constructors. There are 10 overloaded 
> newConstant() methods and it is unclear which API to use in which case.
> This should be refactored to use the builder pattern and final member 
> variables. Ideally, getters such as getMaxLength() should be simple member 
> variable accessors and other ad-hoc logic surrounding those variables should 
> be handled correctly when setting their respective values.
>  
> Two solutions:
>  # -Consolidate the LiteralExpression newConstant() methods down into a 
> single build() method-
>  ** -Pros: easy to use since one build method can create all 
> LiteralExpression objects-
>  ** -Cons: requires adding 'throws SQLException' to a lot of method 
> signatures where it wasn't necessary before, which can be confusing for 
> future developers and potentially cause problems if a SQLException is 
> actually thrown in some of these cases-
>  # Create two build() methods: one for LiteralExpressions that require 
> deriving the value (which could throw a SQLException) and one for those that don't
>  ** Pros: 

[jira] [Updated] (PHOENIX-4521) Allow Pherf scenario to define per table max allowed query duration after which thread is interrupted

2020-03-06 Thread Christine Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Feng updated PHOENIX-4521:

Description: 
Some clients interrupt the client thread if it doesn't complete in a required 
amount of time. It would be good if Pherf supported setting this up so we mimic 
client behavior more closely, as we're theorizing this may be causing some 
issues.

 

PLAN
 # Make necessary changes so new timeoutDuration property is recognized and 
parsed correctly from the scenario .xml file (completed)
 # Implement a per-query, per-iteration timeout
 # Test
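
One possible way to implement a per-query, per-iteration timeout is sketched below using java.util.concurrent. The class QueryTimeoutSketch and its methods are hypothetical and are not Pherf code; the sketch only assumes that each query execution can be wrapped in a Callable and that timeoutDuration is available in milliseconds.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; names are illustrative and not taken from Pherf.
final class QueryTimeoutSketch {

    // Runs one query execution and interrupts its thread if it exceeds timeoutMs.
    // Returns true if the query finished in time, false if it was interrupted.
    static boolean runWithTimeout(Callable<Object> query, long timeoutMs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Object> future = executor.submit(query);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // delivers an interrupt to the worker thread
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    // Applies the timeout separately to every iteration and counts the timeouts.
    static int countTimeouts(Callable<Object> query, int iterations, long timeoutMs)
            throws Exception {
        int timedOut = 0;
        for (int i = 0; i < iterations; i++) {
            if (!runWithTimeout(query, timeoutMs)) {
                timedOut++;
            }
        }
        return timedOut;
    }
}
```

Because the timeout is applied per call, a slow iteration is interrupted on its own and does not shorten the time available to later iterations.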

  was:
Some clients interrupt the client thread if it doesn't complete in a required 
amount of time. It would be good if Pherf supported setting this up so we mimic 
client behavior more closely, as we're theorizing this may be causing some 
issues.

 

PLAN
 # Make necessary changes so new timeoutDuration property is recognized and 
parsed correctly from the scenario .xml file (completed)
 # Implement a timeout for query execution stage based on each table's 
timeoutDuration
 ## Serial execution: each thread should be interrupted after exceeding 
timeoutDuration
 ## Parallel execution: all threads should be interrupted after one thread 
exceeds timeoutDuration
 # Test


> Allow Pherf scenario to define per table max allowed query duration after 
> which thread is interrupted
> -
>
> Key: PHOENIX-4521
> URL: https://issues.apache.org/jira/browse/PHOENIX-4521
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James R. Taylor
>Assignee: Christine Feng
>Priority: Major
>  Labels: phoenix-hardening
>
> Some clients interrupt the client thread if it doesn't complete in a required 
> amount of time. It would be good if Pherf supported setting this up so we 
> mimic client behavior more closely, as we're theorizing this may be causing 
> some issues.
>  
> PLAN
>  # Make necessary changes so new timeoutDuration property is recognized and 
> parsed correctly from the scenario .xml file (completed)
>  # Implement a per-query, per-iteration timeout
>  # Test



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies

2020-03-06 Thread Daniel Wong (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Wong updated PHOENIX-4845:
-
Attachment: PHOENIX-4845.patch

> Support using Row Value Constructors in OFFSET clause for paging in tables 
> where the sort order of PK columns varies
> 
>
> Key: PHOENIX-4845
> URL: https://issues.apache.org/jira/browse/PHOENIX-4845
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Thomas D'Silva
>Assignee: Daniel Wong
>Priority: Major
>  Labels: DESC, SFDC
> Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, 
> PHOENIX-4845-4.x-HBase-1.3.v2.patch, PHOENIX-4845-4.x-HBase-1.3.v3.patch, 
> PHOENIX-4845.patch, PHOENIX-offset.txt
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> RVCs along with the LIMIT clause are useful for efficiently paging through 
> rows (see [http://phoenix.apache.org/paged.html]). This works well if the PK 
> columns are sorted ascending: we can always use the > operator to query for 
> the next batch of rows.
> However, if the PK of a table is (A DESC, B DESC), we cannot use the following 
> query to page through the data:
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> Since the rows are sorted by A descending and then by B descending, we need to 
> change the comparison operator:
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> If the PK of a table contains columns with mixed sort order, e.g. (A DESC, 
> B), then we cannot use an RVC to page through the data.
> If we supported RVCs in the OFFSET clause, we could use the offset to 
> set the start row of the scan. Clients would not need logic to 
> determine the comparison operator. This would also support paging through 
> data for tables where the PK columns are sorted in mixed order.
> {code:java}
> SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?)
> {code}
> We would only allow using the offset if the rows are ordered by the sort 
> order of the PK columns of an index or the primary table.
> Note that some care is needed when using OFFSET with indexes. If 
> the OFFSET is coercible to multiple indexes or the base table, it could mean 
> very different positions depending on the key. To handle this, the INDEX hint 
> needs to be used to specify an index offset for safety.
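
To see why neither > nor < works for a mixed-order key, the sketch below models rows as (A, B) pairs ordered as a PK declared (A DESC, B) would order them, and selects the rows that sort after the last row seen. The class MixedOrderPagingSketch is purely illustrative and is not Phoenix code, and treating the offset as exclusive (the page starts strictly after the given row) is an assumption made for this example, not something the issue specifies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative in-memory model only; not how Phoenix evaluates OFFSET.
final class MixedOrderPagingSketch {

    // Orders (a, b) rows the way a PK declared as (A DESC, B ASC) would.
    static final Comparator<int[]> PK_ORDER =
            Comparator.<int[]>comparingInt(row -> row[0]).reversed()
                    .thenComparingInt(row -> row[1]);

    // Returns up to 'limit' rows that sort strictly after lastSeen in PK order
    // (exclusive start row is an assumption of this sketch).
    static List<int[]> nextPage(List<int[]> rows, int[] lastSeen, int limit) {
        List<int[]> sorted = new ArrayList<>(rows);
        sorted.sort(PK_ORDER);
        List<int[]> page = new ArrayList<>();
        for (int[] row : sorted) {
            if (page.size() < limit && PK_ORDER.compare(row, lastSeen) > 0) {
                page.add(row);
            }
        }
        return page;
    }
}
```

For the rows (3,1), (3,2), (2,1), (2,5), (1,0) and a last-seen row of (3,2), the next page of two rows is (2,1), (2,5). A plain (A, B) > (3, 2) comparison matches none of these rows, and (A, B) < (3, 2) also matches (3,1), which belongs to an earlier page.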





[jira] [Updated] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies

2020-03-06 Thread Daniel Wong (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Wong updated PHOENIX-4845:
-
Attachment: (was: PHOENIX-4845.patch)

> Support using Row Value Constructors in OFFSET clause for paging in tables 
> where the sort order of PK columns varies
> 
>
> Key: PHOENIX-4845
> URL: https://issues.apache.org/jira/browse/PHOENIX-4845
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Thomas D'Silva
>Assignee: Daniel Wong
>Priority: Major
>  Labels: DESC, SFDC
> Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, 
> PHOENIX-4845-4.x-HBase-1.3.v2.patch, PHOENIX-4845-4.x-HBase-1.3.v3.patch, 
> PHOENIX-offset.txt
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> RVCs along with the LIMIT clause are useful for efficiently paging through 
> rows (see [http://phoenix.apache.org/paged.html]). This works well if the PK 
> columns are sorted ascending: we can always use the > operator to query for 
> the next batch of rows.
> However, if the PK of a table is (A DESC, B DESC), we cannot use the following 
> query to page through the data:
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> Since the rows are sorted by A descending and then by B descending, we need to 
> change the comparison operator:
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> If the PK of a table contains columns with mixed sort order, e.g. (A DESC, 
> B), then we cannot use an RVC to page through the data.
> If we supported RVCs in the OFFSET clause, we could use the offset to 
> set the start row of the scan. Clients would not need logic to 
> determine the comparison operator. This would also support paging through 
> data for tables where the PK columns are sorted in mixed order.
> {code:java}
> SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?)
> {code}
> We would only allow using the offset if the rows are ordered by the sort 
> order of the PK columns of an index or the primary table.
> Note that some care is needed when using OFFSET with indexes. If 
> the OFFSET is coercible to multiple indexes or the base table, it could mean 
> very different positions depending on the key. To handle this, the INDEX hint 
> needs to be used to specify an index offset for safety.





[jira] [Created] (PHOENIX-5760) Pherf Support Sequential Datatypes for INTEGER type fields and have fixed row distribution

2020-03-06 Thread Daniel Wong (Jira)
Daniel Wong created PHOENIX-5760:


 Summary: Pherf Support Sequential Datatypes for INTEGER type 
fields and have fixed row distribution
 Key: PHOENIX-5760
 URL: https://issues.apache.org/jira/browse/PHOENIX-5760
 Project: Phoenix
  Issue Type: Improvement
Reporter: Daniel Wong


To run Pherf tests closer to how users do QueryMore, we want to extend the 
Sequential datatype to INTEGER fields with guaranteed values. 

In general, we want to be able to generate contiguous data for a column, from 
1..N, in Pherf.
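
A minimal sketch of what contiguous 1..N generation could look like is shown below. The class SequentialIntSketch is hypothetical and is not part of Pherf; it assumes that several writer threads may draw values concurrently and that exactly N rows are to be produced.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical generator; not an existing Pherf class.
final class SequentialIntSketch {
    private final AtomicInteger next = new AtomicInteger(1);
    private final int rowCount;

    SequentialIntSketch(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        this.rowCount = rowCount;
    }

    // Hands out 1, 2, ..., rowCount exactly once each, even across threads.
    int nextValue() {
        int value = next.getAndIncrement();
        if (value > rowCount) {
            throw new IllegalStateException("All " + rowCount + " values already generated");
        }
        return value;
    }
}
```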





[jira] [Updated] (PHOENIX-5673) The mutation state is silently getting cleared on the execution of any DDL

2020-03-06 Thread Siddhi Mehta (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddhi Mehta updated PHOENIX-5673:
--
Attachment: PHOENIX-5673.4.x-HBase-1.3.v3.patch

> The mutation state is silently getting cleared on the execution of any DDL
> --
>
> Key: PHOENIX-5673
> URL: https://issues.apache.org/jira/browse/PHOENIX-5673
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.15.0
>Reporter: Sandeep Guggilam
>Assignee: Siddhi Mehta
>Priority: Critical
>  Labels: beginner, newbie
> Fix For: 4.16.0
>
> Attachments: PHOENIX-5673.4.x-HBase-1.3.v1.patch, 
> PHOENIX-5673.4.x-HBase-1.3.v2.patch, PHOENIX-5673.4.x-HBase-1.3.v3.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When we execute any DDL statement, the mutation state is rolled back 
> silently without informing the user. It should probably throw an exception 
> saying that the mutation state is not empty when executing any DDL. See the 
> example below:
>  
> Steps to reproduce:
> create table t1 (pk varchar not null primary key, mycol varchar);
> upsert into t1 (pk, mycol) values ('x','x');
> create table t2 (pk varchar not null primary key, mycol varchar);
> If we execute the above statements and call conn.commit() at the 
> end, the upsert is silently rolled back when the second create statement 
> executes, so the ('x', 'x') row never appears in the first table. 
> Instead, it should probably throw an exception saying that the 
> mutation state is not empty.
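
The suggested behavior can be sketched with a small in-memory model, shown below. DdlGuardSketch is a hypothetical class written for this example; it does not use the Phoenix connection or mutation-state classes and only illustrates failing a DDL loudly while uncommitted mutations exist.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a connection; not Phoenix code.
final class DdlGuardSketch {
    private final List<String> pendingMutations = new ArrayList<>();

    // Buffers an upsert until commit, as an uncommitted mutation would be.
    void upsert(String statement) {
        pendingMutations.add(statement);
    }

    // Rejects the DDL instead of silently discarding buffered mutations.
    void executeDdl(String ddl) throws SQLException {
        if (!pendingMutations.isEmpty()) {
            throw new SQLException("Mutation state is not empty ("
                    + pendingMutations.size() + " pending); commit or roll back before: " + ddl);
        }
        // A real implementation would run the DDL here.
    }

    // Returns how many buffered mutations were committed.
    int commit() {
        int committed = pendingMutations.size();
        pendingMutations.clear();
        return committed;
    }
}
```

In this model the second create statement from the steps above fails with an exception, the pending upsert survives, and a later commit still writes it.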





[jira] [Created] (PHOENIX-5759) Reduce thin client JAR size / classpath noise

2020-03-06 Thread Istvan Toth (Jira)
Istvan Toth created PHOENIX-5759:


 Summary: Reduce thin client JAR size / classpath noise
 Key: PHOENIX-5759
 URL: https://issues.apache.org/jira/browse/PHOENIX-5759
 Project: Phoenix
  Issue Type: Wish
Reporter: Istvan Toth


The phoenix thin client is ridiculously huge for what it is. The shaded Avatica 
client JAR is 6MB, the thin client JAR is 28MB. 

This is of course caused by pulling in hadoop-common.

Some ideas for a smaller/better client
 * Provide a client JAR that does not try to expand on the Kerberos capabilities 
of Avatica. 
 ** this would remove the hadoop dependency
 ** The use case for the thin client is usually _outside_ the cluster, where 
the referred config files may not even be available.
 ** Access through Knox usually doesn't use kerberos at all.
 ** cleaner client classpath
 * Shade with _minimizeJar_
 ** 28->11MB, though I did not test if it actually works
 * Use hadoop-client-api/runtime (from 3.x)
 ** less noise on the classpath
 ** 28->40MB without _minimizeJar_
 ** 28->16MB with _minimizeJar_
 ** Did not test either

My preferred solution is the first one, where we could additionally look into 
shading protobuf to further clean up the classpath. This could be an additional 
artifact, so that we do not break backwards compatibility either way.

 


