[jira] [Updated] (PHOENIX-6448) ConnectionQueryServicesImpl init failure may cause Full GC.

2021-04-19 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-6448:
---
Description: 
In ConnectionQueryServicesImpl.init():

In some cases (e.g. the user does not have permission to create SYSTEM.CATALOG), 
there is only a LOGGER.warn() call and the method returns null directly.
{code:java}
// Inside ConnectionQueryServicesImpl.init()
{
  ...
  if (inspectIfAnyExceptionInChain(e,
      Collections.<Class<? extends Exception>> singletonList(AccessDeniedException.class))) {
    // Pass
    LOGGER.warn("Could not check for Phoenix SYSTEM tables," +
        " assuming they exist and are properly configured");
    checkClientServerCompatibility(SchemaUtil.getPhysicalName(SYSTEM_CATALOG_NAME_BYTES,
        getProps()).getName());
    success = true;
  }
  ...
  return null;
}
...
scheduleRenewLeaseTasks();
{code}
Therefore, the subsequent call to scheduleRenewLeaseTasks() is skipped and no 
exception is thrown.

 

1. scheduleRenewLeaseTasks() is not called.

2. No renew-lease task is started.

3. Queries still call PhoenixConnection.addIteratorForLeaseRenewal() as usual.

4. The scannerQueue is unbounded, so it keeps accumulating new items.

5. Full GC (one possible mitigation is sketched below).
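
A hypothetical mitigation sketch (the class, field, and method names below are illustrative placeholders, not the actual ConnectionQueryServicesImpl members): only enqueue iterators for lease renewal when the renewal task has actually been scheduled, so a failed init() cannot cause unbounded queue growth.
{code:java}
import java.lang.ref.WeakReference;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; names are illustrative, not the real Phoenix fields.
class LeaseRenewalRegistry<T> {
    private final Queue<WeakReference<T>> scannerQueue = new ConcurrentLinkedQueue<>();
    private volatile boolean renewLeaseTaskScheduled = false;

    void markRenewLeaseTasksScheduled() {
        renewLeaseTaskScheduled = true;
    }

    void addIteratorForLeaseRenewal(T iterator) {
        // Without a running renewal task nothing ever drains scannerQueue,
        // so skip enqueuing to avoid unbounded growth and an eventual Full GC.
        if (renewLeaseTaskScheduled) {
            scannerQueue.add(new WeakReference<>(iterator));
        }
    }
}
{code}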

  was:
in ConnectionQueryServicesImpl.init()

In some cases(e.g. the user has not permissions to create SYSTEM.CATALOG), 
there's only LOGGER.WARN and return null directly.
{code:java}
// Some comments here
if (inspectIfAnyExceptionInChain(e,
    Collections.<Class<? extends Exception>> singletonList(AccessDeniedException.class))) {
  // Pass
  LOGGER.warn("Could not check for Phoenix SYSTEM tables," +
      " assuming they exist and are properly configured");
  checkClientServerCompatibility(SchemaUtil.getPhysicalName(SYSTEM_CATALOG_NAME_BYTES,
      getProps()).getName());
  success = true;
}
...
return null;
{code}
Therefore, the following scheduleRenewLeaseTasks will be skipped and no 
exception is thrown.

 

1. scheduleRenewLeaseTasks not called

2. no renew task started

3. query will call PhoenixConection.addIteratorForLeaseRenewal() as usual

4. the scannerQueue is unlimited therefore it will always adding new items.

5. Full GC.


> ConnectionQueryServicesImpl init failure may cause Full GC.
> ---
>
> Key: PHOENIX-6448
> URL: https://issues.apache.org/jira/browse/PHOENIX-6448
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Chen Feng
>Priority: Major
>
> In ConnectionQueryServicesImpl.init():
> In some cases (e.g. the user does not have permission to create SYSTEM.CATALOG), 
> there is only a LOGGER.warn() call and the method returns null directly.
> {code:java}
> // Inside ConnectionQueryServicesImpl.init()
> {
>   ...
>   if (inspectIfAnyExceptionInChain(e,
>       Collections.<Class<? extends Exception>> singletonList(AccessDeniedException.class))) {
>     // Pass
>     LOGGER.warn("Could not check for Phoenix SYSTEM tables," +
>         " assuming they exist and are properly configured");
>     checkClientServerCompatibility(SchemaUtil.getPhysicalName(SYSTEM_CATALOG_NAME_BYTES,
>         getProps()).getName());
>     success = true;
>   }
>   ...
>   return null;
> }
> ...
> scheduleRenewLeaseTasks();
> {code}
> Therefore, the subsequent call to scheduleRenewLeaseTasks() is skipped and no 
> exception is thrown.
>  
> 1. scheduleRenewLeaseTasks() is not called.
> 2. No renew-lease task is started.
> 3. Queries still call PhoenixConnection.addIteratorForLeaseRenewal() as usual.
> 4. The scannerQueue is unbounded, so it keeps accumulating new items.
> 5. Full GC.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6448) ConnectionQueryServicesImpl init failure may cause Full GC.

2021-04-19 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-6448:
--

 Summary: ConnectionQueryServicesImpl init failure may cause Full 
GC.
 Key: PHOENIX-6448
 URL: https://issues.apache.org/jira/browse/PHOENIX-6448
 Project: Phoenix
  Issue Type: Bug
Reporter: Chen Feng


In ConnectionQueryServicesImpl.init():

In some cases (e.g. the user does not have permission to create SYSTEM.CATALOG), 
there is only a LOGGER.warn() call and the method returns null directly.
{code:java}
// Some comments here
if (inspectIfAnyExceptionInChain(e,
    Collections.<Class<? extends Exception>> singletonList(AccessDeniedException.class))) {
  // Pass
  LOGGER.warn("Could not check for Phoenix SYSTEM tables," +
      " assuming they exist and are properly configured");
  checkClientServerCompatibility(SchemaUtil.getPhysicalName(SYSTEM_CATALOG_NAME_BYTES,
      getProps()).getName());
  success = true;
}
...
return null;
{code}
Therefore, the subsequent call to scheduleRenewLeaseTasks() is skipped and no 
exception is thrown.

 

1. scheduleRenewLeaseTasks() is not called.

2. No renew-lease task is started.

3. Queries still call PhoenixConnection.addIteratorForLeaseRenewal() as usual.

4. The scannerQueue is unbounded, so it keeps accumulating new items.

5. Full GC.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (PHOENIX-6243) ValueBitSet can be "true" for incorrect columns

2020-12-03 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng reassigned PHOENIX-6243:
--

Assignee: (was: Chen Feng)

> ValueBitSet can be "true" for incorrect columns
> ---
>
> Key: PHOENIX-6243
> URL: https://issues.apache.org/jira/browse/PHOENIX-6243
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.15.0
>Reporter: Chen Feng
>Priority: Major
> Attachments: PHOENIX-6243-4.15-HBase-1.5-v1.patch
>
>
> ValueBitSet can be "true" for incorrect columns. Executing the following SQL statements 
> CREATE TABLE A(ID UNSIGNED_LONG NOT NULL PRIMARY KEY, COL_0 UNSIGNED_LONG, 
> COL_1 UNSIGNED_LONG, COL_2 UNSIGNED_LONG, COL_3 UNSIGNED_LONG, COL_4 
> UNSIGNED_LONG, COL_5 UNSIGNED_LONG, COL_6 UNSIGNED_LONG, COL_7 UNSIGNED_LONG, 
> COL_8 UNSIGNED_LONG, COL_9 UNSIGNED_LONG, COL_10 UNSIGNED_LONG, COL_11 
> UNSIGNED_LONG, COL_12 UNSIGNED_LONG, COL_13 UNSIGNED_LONG, COL_14 
> UNSIGNED_LONG)
> CREATE TABLE B(ID UNSIGNED_LONG NOT NULL PRIMARY KEY, S_ VARCHAR, UL_ 
> UNSIGNED_LONG)
> UPSERT INTO A VALUES(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
> UPSERT INTO B(ID, UL_) VALUES(1, 1)
> SELECT /*+ USE_SORT_MERGE_JOIN */ J_A.ID, J_A.COL_0, J_A.COL_1, J_A.COL_2, 
> J_A.COL_3, J_A.COL_4, J_A.COL_5, J_A.COL_6, J_A.COL_7, J_A.COL_8, J_A.COL_9, 
> J_A.COL_10, J_A.COL_11, J_A.COL_12, J_A.COL_13, J_A.COL_14, J_B.S_, J_B.UL_ 
> FROM ( SELECT ID, COL_0, COL_1, COL_2, COL_3, COL_4, COL_5, COL_6, COL_7, 
> COL_8, COL_9, COL_10, COL_11, COL_12, COL_13, COL_14 FROM A) J_A JOIN ( 
> SELECT ID, S_, UL_ FROM B) J_B ON J_A.ID = J_B.ID
>  
> triggers the following exception:
> Error: ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes, 
> but had 7 (state=22000,code=201)
> java.sql.SQLException: ERROR 201 (22000): Illegal data. Expected length of at 
> least 8 bytes, but had 7



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6243) ValueBitSet can be "true" for incorrect columns

2020-12-03 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-6243:
--

 Summary: ValueBitSet can be "true" for incorrect columns
 Key: PHOENIX-6243
 URL: https://issues.apache.org/jira/browse/PHOENIX-6243
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.15.0
Reporter: Chen Feng
Assignee: Chen Feng


ValueBitSet can be "true" for incorrect columns. Executing the following SQL statements 

CREATE TABLE A(ID UNSIGNED_LONG NOT NULL PRIMARY KEY, COL_0 UNSIGNED_LONG, 
COL_1 UNSIGNED_LONG, COL_2 UNSIGNED_LONG, COL_3 UNSIGNED_LONG, COL_4 
UNSIGNED_LONG, COL_5 UNSIGNED_LONG, COL_6 UNSIGNED_LONG, COL_7 UNSIGNED_LONG, 
COL_8 UNSIGNED_LONG, COL_9 UNSIGNED_LONG, COL_10 UNSIGNED_LONG, COL_11 
UNSIGNED_LONG, COL_12 UNSIGNED_LONG, COL_13 UNSIGNED_LONG, COL_14 UNSIGNED_LONG)
CREATE TABLE B(ID UNSIGNED_LONG NOT NULL PRIMARY KEY, S_ VARCHAR, UL_ 
UNSIGNED_LONG)
UPSERT INTO A VALUES(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
UPSERT INTO B(ID, UL_) VALUES(1, 1)
SELECT /*+ USE_SORT_MERGE_JOIN */ J_A.ID, J_A.COL_0, J_A.COL_1, J_A.COL_2, 
J_A.COL_3, J_A.COL_4, J_A.COL_5, J_A.COL_6, J_A.COL_7, J_A.COL_8, J_A.COL_9, 
J_A.COL_10, J_A.COL_11, J_A.COL_12, J_A.COL_13, J_A.COL_14, J_B.S_, J_B.UL_ 
FROM ( SELECT ID, COL_0, COL_1, COL_2, COL_3, COL_4, COL_5, COL_6, COL_7, 
COL_8, COL_9, COL_10, COL_11, COL_12, COL_13, COL_14 FROM A) J_A JOIN ( SELECT 
ID, S_, UL_ FROM B) J_B ON J_A.ID = J_B.ID

 

triggers the following exception:

Error: ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes, 
but had 7 (state=22000,code=201)
java.sql.SQLException: ERROR 201 (22000): Illegal data. Expected length of at 
least 8 bytes, but had 7



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6012) make USE_OF_QUERY_GUIDE_POSTS configurable to improve query performance.

2020-07-15 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-6012:
--

 Summary: make USE_OF_QUERY_GUIDE_POSTS configurable to improve 
query performance.
 Key: PHOENIX-6012
 URL: https://issues.apache.org/jira/browse/PHOENIX-6012
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng


Loading guide posts from SYSTEM.STATS can take a lot of time.

When we already know the size of the data covered by a query's scan, we can disable 
the use of guide posts to improve query performance.
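
A hedged sketch of how such a knob might be used from client code. The property name phoenix.query.use.guideposts below is purely hypothetical, chosen only to illustrate the proposal; it is not an existing Phoenix configuration key.
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class NoGuidePostsExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical property name illustrating the proposal; not an existing Phoenix key.
        props.setProperty("phoenix.query.use.guideposts", "false");
        try (Connection conn =
                DriverManager.getConnection("jdbc:phoenix:localhost:2181", props)) {
            // Queries on this connection would then skip loading guide posts from SYSTEM.STATS.
        }
    }
}
{code}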



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5987) PointLookup may cost too much memory

2020-07-03 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5987:
--

 Summary: PointLookup may cost too much memory
 Key: PHOENIX-5987
 URL: https://issues.apache.org/jira/browse/PHOENIX-5987
 Project: Phoenix
  Issue Type: Bug
Reporter: Chen Feng


When all row key columns are covered by the WHERE conditions, Phoenix uses point lookups 
to turn "one huge range scan" into "many single-key scans".
However, when the number of single-key scans is very large, this quickly exhausts the memory.

We met this condition in our production environment as follows:
We have a table with five primary key columns k1, k2, ..., k5, all of type UNSIGNED_LONG.
The query SQL is like "... where k1 = 1 and k2 = 1 and k3 in (1,2,3,...,l) and k4 
in (1,2,3,...,m) and k5 in (1,2,3,...,n)".
With l=600, m=800 and n=1000, the possible number of lookup scans is 
1*1*600*800*1000 = 480,000,000.
Each scan row key costs 5*8 = 40 bytes (roughly 50 bytes per scan with overhead), so the 
total memory cost is about 480,000,000 * 50 bytes = 24 GB.
This exceeds the configured JVM memory limit and causes an OOM exception.
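
A quick back-of-the-envelope check of these numbers (the 50-bytes-per-scan figure is the report's own estimate, presumably covering per-scan overhead on top of the 40-byte key):
{code:java}
public class PointLookupMemoryEstimate {
    public static void main(String[] args) {
        long scans = 1L * 1 * 600 * 800 * 1000;   // number of point-lookup scans
        long keyBytes = 5 * 8;                    // five UNSIGNED_LONG PK columns = 40 bytes per key
        long perScanBytes = 50;                   // ~50 bytes per scan including overhead (estimate)

        System.out.printf("scans            = %,d%n", scans);
        System.out.printf("raw key memory   = %.1f GB%n", scans * keyBytes / 1e9);
        System.out.printf("estimated memory = %.1f GB%n", scans * perScanBytes / 1e9);
    }
}
{code}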



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5987) PointLookup may cost too much memory

2020-07-03 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5987:
---
Description: 
When all row key columns are covered by the WHERE conditions, Phoenix uses point lookups 
to turn "one huge range scan" into "many single-key scans".
However, when the number of scans is very large, Phoenix quickly exhausts the memory.

We met this condition in our production environment as follows:
We have a table with five primary key columns k1, k2, ..., k5, all of type UNSIGNED_LONG.
The query SQL is like "... where k1 = 1 and k2 = 1 and k3 in (1,2,3,...,l) and 
k4 in (1,2,3,...,m) and k5 in (1,2,3,...,n)".
With l=600, m=800 and n=1000, the number of lookup scans is 
1*1*600*800*1000 = 480,000,000.
Each scan row key costs 5*8 = 40 bytes (roughly 50 bytes per scan with overhead), so the 
total memory cost is about 480,000,000 * 50 bytes = 24 GB.
This exceeds the configured JVM memory limit and causes an OOM exception.

  was:
When all rowkeys are covered in where conditions, Phoenix use point look up to 
switch "a huge range scan" to "multi single key scans".
However, the number single key scans are too huge, it quick exhausts the memory.

We meet such condition in hour product environment as follow:
We have a table with five primary keys like k1, k2, ..., k5, all key types are 
UNSIGNED_LONG.
The query sql like " ... where k1 = 1 and k2 = 1 and k3 in (1,2,3,...,l) and k4 
in (1,2,3,...,m) and k5 in (1,2,3,...,n)"
We have l=600, m=800 and n=1000, so the possible number of look up scans is 
1*1*600*800*1000=480,000,000.
Each scan rowkey costs 5*8=40 bytes. Therefore the total memory cost is 
480,000,000 * 50bytes = 25GB.
25GB exceeds the JMX configuration and causes OOM exception.


> PointLookup may cost too much memory
> 
>
> Key: PHOENIX-5987
> URL: https://issues.apache.org/jira/browse/PHOENIX-5987
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Chen Feng
>Priority: Major
>
> When all row key columns are covered by the WHERE conditions, Phoenix uses point lookups 
> to turn "one huge range scan" into "many single-key scans".
>  However, when the number of scans is very large, Phoenix quickly exhausts the memory.
> We met this condition in our production environment as follows:
>  We have a table with five primary key columns k1, k2, ..., k5, all of type 
> UNSIGNED_LONG.
>  The query SQL is like "... where k1 = 1 and k2 = 1 and k3 in (1,2,3,...,l) and 
> k4 in (1,2,3,...,m) and k5 in (1,2,3,...,n)".
>  With l=600, m=800 and n=1000, the number of lookup scans is 
> 1*1*600*800*1000 = 480,000,000.
>  Each scan row key costs 5*8 = 40 bytes (roughly 50 bytes per scan with overhead), so 
> the total memory cost is about 480,000,000 * 50 bytes = 24 GB.
>  This exceeds the configured JVM memory limit and causes an OOM exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5897) SingleKeyValueTuple.toString() returns unexpected result

2020-05-15 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5897:
--

 Summary: SingleKeyValueTuple.toString() returns unexpected result
 Key: PHOENIX-5897
 URL: https://issues.apache.org/jira/browse/PHOENIX-5897
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng
Assignee: Chen Feng


In SingleKeyValueTuple.toString(), the code is as follows:

return "SingleKeyValueTuple[" + cell == null ? keyPtr.get() == UNITIALIZED_KEY_BUFFER ? "null" : 
Bytes.toStringBinary(keyPtr.get(), keyPtr.getOffset(), keyPtr.getLength()) : 
cell.toString() + "]";

Because of operator precedence, the expression is actually evaluated in the following order:

("SingleKeyValueTuple[" + cell) == null ? (keyPtr.get() == UNITIALIZED_KEY_BUFFER ? "null" : 
Bytes.toStringBinary(...)) : (cell.toString() + "]");

Therefore the result is unexpected.

BTW, value = condition1 ? condition2 ? X : Y : Z is also hard to read; using an if statement 
would be clearer.
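
A minimal sketch of the presumably intended behavior, rewritten as an if/else chain over the class's existing cell and keyPtr fields; this only illustrates the direction of the fix and is not the actual patch:
{code:java}
@Override
public String toString() {
    // Hedged sketch only; mirrors the apparent intent of the original expression.
    if (cell != null) {
        return "SingleKeyValueTuple[" + cell + "]";
    }
    if (keyPtr.get() == UNITIALIZED_KEY_BUFFER) {
        return "SingleKeyValueTuple[null]";
    }
    return "SingleKeyValueTuple["
            + Bytes.toStringBinary(keyPtr.get(), keyPtr.getOffset(), keyPtr.getLength())
            + "]";
}
{code}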



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5850) Set scan id for hbase side log.

2020-04-15 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5850:
--

 Summary: Set scan id for hbase side log.
 Key: PHOENIX-5850
 URL: https://issues.apache.org/jira/browse/PHOENIX-5850
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng
Assignee: Chen Feng


Adding a scan id can help to find slow queries effectively.

It is helpful for debugging and diagnosis.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-04-13 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: (was: PHOENIX-5793-v5.patch)

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5793-v2.patch, PHOENIX-5793-v3.patch, 
> PHOENIX-5793-v4.patch
>
>
> For a join SQL like "A join B", the implementation of SortMergeJoinPlan 
> currently initializes the two child iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects:
> 1) Overlapping the initialization time of the two children.
> 2) If one child query produces no results, the other child query can be canceled since 
> the final result must be empty as well.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-31 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: (was: PHOENIX-5793-v5.patch)

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5793-v2.patch, PHOENIX-5793-v3.patch, 
> PHOENIX-5793-v4.patch
>
>
> For a join SQL like "A join B", the implementation of SortMergeJoinPlan 
> currently initializes the two child iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects:
> 1) Overlapping the initialization time of the two children.
> 2) If one child query produces no results, the other child query can be canceled since 
> the final result must be empty as well.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: PHOENIX-5793-v3.patch

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5793-v2.patch, PHOENIX-5793-v3.patch
>
>
> For a join SQL like "A join B", the implementation of SortMergeJoinPlan 
> currently initializes the two child iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects:
> 1) Overlapping the initialization time of the two children.
> 2) If one child query produces no results, the other child query can be canceled since 
> the final result must be empty as well.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: PHOENIX-5793-v2.patch

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5793-v2.patch
>
>
> For a join SQL like "A join B", the implementation of SortMergeJoinPlan 
> currently initializes the two child iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects:
> 1) Overlapping the initialization time of the two children.
> 2) If one child query produces no results, the other child query can be canceled since 
> the final result must be empty as well.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: (was: PHOENIX-5793-v1.patch)

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
>
> For a join SQL like "A join B", the implementation of SortMergeJoinPlan 
> currently initializes the two child iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects:
> 1) Overlapping the initialization time of the two children.
> 2) If one child query produces no results, the other child query can be canceled since 
> the final result must be empty as well.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5793:
--

 Summary: Support parallel init and fast null return for 
SortMergeJoinPlan.
 Key: PHOENIX-5793
 URL: https://issues.apache.org/jira/browse/PHOENIX-5793
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng
Assignee: Chen Feng


For a join SQL like "A join B", the implementation of SortMergeJoinPlan currently 
initializes the two child iterators A and B one by one.

By initializing A and B in parallel, we can improve performance in two aspects:

1) Overlapping the initialization time of the two children.

2) If one child query produces no results, the other child query can be canceled since the 
final result must be empty as well (a rough sketch of the idea follows below).
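
A rough, hedged sketch of the idea using a plain ExecutorService and CompletableFuture; the child initialization below is a placeholder and does not reflect the actual SortMergeJoinPlan API:
{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelJoinInitSketch {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Initialize both child plans concurrently, overlapping their startup cost.
            CompletableFuture<List<String>> left =
                    CompletableFuture.supplyAsync(() -> initChild("A"), pool);
            CompletableFuture<List<String>> right =
                    CompletableFuture.supplyAsync(() -> initChild("B"), pool);

            List<String> l = left.join();
            if (l.isEmpty()) {
                // Fast empty return: a join with an empty side is empty,
                // so the other child's work can be cancelled (best effort).
                right.cancel(true);
                System.out.println("join result is empty");
                return;
            }
            List<String> r = right.join();
            System.out.println("both children initialized: " + l + " / " + r);
        } finally {
            pool.shutdownNow();
        }
    }

    // Placeholder standing in for a child query's iterator initialization.
    private static List<String> initChild(String name) {
        return Arrays.asList(name + "-row1", name + "-row2");
    }
}
{code}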

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5600) "case when" of casting INTEGER to UNSIGNED_LONG throws exception

2019-12-03 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5600:
---
Summary: "case when" of casting INTEGER to UNSIGNED_LONG throws exception  
(was: case when for INTEGER to UNSIGNED_LONG throws exception)

> "case when" of casting INTEGER to UNSIGNED_LONG throws exception
> 
>
> Key: PHOENIX-5600
> URL: https://issues.apache.org/jira/browse/PHOENIX-5600
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Priority: Minor
>
> Executing "case when" of casting INTEGER to UNSIGNED_LONG throws exception.
> An example is shown as follows:
> CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
> FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
>  UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
>  UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);
> #this works correctly
> SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
> END FROM TBL_TEST_CASE_WHEN;
> #this throws exception
> SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
> END FROM TBL_TEST_CASE_WHEN;
>  
> By checking the code of PUnsignedLong.java and PInteger.java, I found the following comment 
> in PLong.isCoercibleTo(); it says:
> // In general, don't allow conversion of LONG to INTEGER. There are times when
>  // we check isComparableTo for a more relaxed check and then throw a runtime
>  // exception if we overflow
> However, in this example, allowing a constant INTEGER to be cast to UNSIGNED_LONG 
> would be more intuitive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5600) case when for INTEGER to UNSIGNED_LONG throws exception

2019-12-03 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5600:
---
Description: 
Executing "case when" of casting INTEGER to UNSIGNED_LONG throws exception.

An example is shown as follows:

CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
 UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
 UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);

#this works correctly

SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

#this throws exception

SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

 

By checking the code of PUnsignedLong.java and PInteger.java, I found the following comment in 
PLong.isCoercibleTo(); it says:

// In general, don't allow conversion of LONG to INTEGER. There are times when
 // we check isComparableTo for a more relaxed check and then throw a runtime
 // exception if we overflow

However, in this example, allowing a constant INTEGER to be cast to UNSIGNED_LONG 
would be more intuitive.

  was:
Execute select of case when for casting INTEGER to UNSIGNED_LONG throws 
exception.

A example is shown as follows

CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
 UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
 UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);

#this works correctly
 SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
END FROM TBL_TEST_CASE_WHEN;

#this throws exception

SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

 

By checking code of PUnsignedLong.java and PInteger.java, I see such comment in 
PLong.isCoercibleTo(isCoercibleTo), it says

// In general, don't allow conversion of LONG to INTEGER. There are times when
 // we check isComparableTo for a more relaxed check and then throw a runtime
 // exception if we overflow

However, in this example, enable casting a const INTEGER to UNSIGNED_LONG is 
more comprehensible.


> case when for INTEGER to UNSIGNED_LONG throws exception
> ---
>
> Key: PHOENIX-5600
> URL: https://issues.apache.org/jira/browse/PHOENIX-5600
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Priority: Minor
>
> Executing "case when" of casting INTEGER to UNSIGNED_LONG throws exception.
> An example is shown as follows:
> CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
> FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
>  UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
>  UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);
> #this works correctly
> SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
> END FROM TBL_TEST_CASE_WHEN;
> #this throws exception
> SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
> END FROM TBL_TEST_CASE_WHEN;
>  
> By checking the code of PUnsignedLong.java and PInteger.java, I found the following comment 
> in PLong.isCoercibleTo(); it says:
> // In general, don't allow conversion of LONG to INTEGER. There are times when
>  // we check isComparableTo for a more relaxed check and then throw a runtime
>  // exception if we overflow
> However, in this example, allowing a constant INTEGER to be cast to UNSIGNED_LONG 
> would be more intuitive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5600) case when for INTEGER to UNSIGNED_LONG throws exception

2019-12-02 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5600:
---
Description: 
Executing a SELECT with a "case when" that casts INTEGER to UNSIGNED_LONG throws an 
exception.

An example is shown as follows:

CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
 UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
 UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);

#this works correctly
 SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
END FROM TBL_TEST_CASE_WHEN;

#this throws exception

SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

 

By checking the code of PUnsignedLong.java and PInteger.java, I found the following comment in 
PLong.isCoercibleTo(); it says:

// In general, don't allow conversion of LONG to INTEGER. There are times when
 // we check isComparableTo for a more relaxed check and then throw a runtime
 // exception if we overflow

However, in this example, allowing a constant INTEGER to be cast to UNSIGNED_LONG 
would be more intuitive.

  was:
Execute select of case when for casting INTEGER to UNSIGNED_LONG throws 
exception.

A example is shown as follows

CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);

# this works correctly
SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

# exception as follows
# Error: ERROR 203 (22005): Type mismatch. Case expressions must have common 
type: INTEGER cannot be coerced to UNSIGNED_LONG (state=22005,code=203)
# org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type 
mismatch. Case expressions must have common type: INTEGER cannot be coerced to 
UNSIGNED_LONG

SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

 

By checking code of PUnsignedLong.java and PInteger.java, I see such comment in 
PLong.isCoercibleTo(isCoercibleTo), it says

// In general, don't allow conversion of LONG to INTEGER. There are times when
// we check isComparableTo for a more relaxed check and then throw a runtime
// exception if we overflow

However, in this example, enable casting a const INTEGER to UNSIGNED_LONG is 
more comprehensible.


> case when for INTEGER to UNSIGNED_LONG throws exception
> ---
>
> Key: PHOENIX-5600
> URL: https://issues.apache.org/jira/browse/PHOENIX-5600
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Priority: Minor
>
> Executing a SELECT with a "case when" that casts INTEGER to UNSIGNED_LONG throws an 
> exception.
> An example is shown as follows:
> CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
> FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
>  UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
>  UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);
> #this works correctly
>  SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
> END FROM TBL_TEST_CASE_WHEN;
> #this throws exception
> SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y 
> END FROM TBL_TEST_CASE_WHEN;
>  
> By checking the code of PUnsignedLong.java and PInteger.java, I found the following comment 
> in PLong.isCoercibleTo(); it says:
> // In general, don't allow conversion of LONG to INTEGER. There are times when
>  // we check isComparableTo for a more relaxed check and then throw a runtime
>  // exception if we overflow
> However, in this example, allowing a constant INTEGER to be cast to UNSIGNED_LONG 
> would be more intuitive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5600) case when for INTEGER to UNSIGNED_LONG throws exception

2019-12-02 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5600:
--

 Summary: case when for INTEGER to UNSIGNED_LONG throws exception
 Key: PHOENIX-5600
 URL: https://issues.apache.org/jira/browse/PHOENIX-5600
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng


Executing a SELECT with a "case when" that casts INTEGER to UNSIGNED_LONG throws an 
exception.

An example is shown as follows:

CREATE TABLE IF NOT EXISTS TBL_TEST_CASE_WHEN (A UNSIGNED_LONG NOT NULL, Y 
FLOAT, Z UNSIGNED_LONG CONSTRAINT pk PRIMARY KEY (A));
UPSERT INTO TBL_TEST_CASE_WHEN VALUES (1, 1, 1);
UPSERT INTO TBL_TEST_CASE_WHEN VALUES (2, 2, 2);

# this works correctly
SELECT CASE WHEN A = 1 THEN A+9 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

# exception as follows
# Error: ERROR 203 (22005): Type mismatch. Case expressions must have common 
type: INTEGER cannot be coerced to UNSIGNED_LONG (state=22005,code=203)
# org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type 
mismatch. Case expressions must have common type: INTEGER cannot be coerced to 
UNSIGNED_LONG

SELECT CASE WHEN A = 1 THEN 10 ELSE A END, CASE WHEN Y > 1 THEN 5.5 ELSE Y END 
FROM TBL_TEST_CASE_WHEN;

 

By checking the code of PUnsignedLong.java and PInteger.java, I found the following comment in 
PLong.isCoercibleTo(); it says:

// In general, don't allow conversion of LONG to INTEGER. There are times when
// we check isComparableTo for a more relaxed check and then throw a runtime
// exception if we overflow

However, in this example, allowing a constant INTEGER to be cast to UNSIGNED_LONG 
would be more intuitive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-10-29 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v6.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v4.patch, PHOENIX-5117-4.x-HBase-1.4-v5.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v6.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-10-21 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v5.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v4.patch, PHOENIX-5117-4.x-HBase-1.4-v5.patch, 
> PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5491) Improve performance of InListExpression.hashCode

2019-09-26 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5491:
---
Priority: Minor  (was: Critical)

> Improve performance of InListExpression.hashCode
> 
>
> Key: PHOENIX-5491
> URL: https://issues.apache.org/jira/browse/PHOENIX-5491
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5491-v2.patch, PHOENIX-5491.patch
>
>
> WhereOptimizer runs very slowly when parsing SQL like "where A in (a1, a2, ..., 
> a_N) and B = X" when N is very large. In our environment, it runs > 90s for N 
> = 14.
> This is because, for the same instance of InListExpression, 
> InListExpression.hashCode() is recalculated every time, and each call traverses 
> InListExpression.values.
> For the previous SQL, InListExpression.hashCode() will be called N times, and 
> InListExpression.values has N elements, so the total complexity is N^2.
> Saving the hashCode of InListExpression reduces the complexity to N. Our 
> test shows the large SQL finishes within 5 seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5491) Improve performance of InListExpression.hashCode

2019-09-25 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5491:
---
Description: 
WhereOptimizer runs very slowly when parsing SQL like "where A in (a1, a2, ..., 
a_N) and B = X" when N is very large. In our environment, it runs > 90s for N = 
14.

This is because, for the same instance of InListExpression, 
InListExpression.hashCode() is recalculated every time, and each call traverses 
InListExpression.values.

For the previous SQL, InListExpression.hashCode() will be called N times, and 
InListExpression.values has N elements, so the total complexity is N^2.

Saving the hashCode of InListExpression reduces the complexity to N. Our 
test shows the large SQL finishes within 5 seconds.
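
A hedged sketch of the caching idea only; the field names below are illustrative and do not reflect the actual InListExpression implementation:
{code:java}
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of caching an expensive hashCode, not the actual Phoenix class.
class CachedHashCodeExample {
    private final List<byte[]> values;
    private volatile int cachedHashCode = 0;   // 0 is used here to mean "not computed yet"

    CachedHashCodeExample(List<byte[]> values) {
        this.values = values;
    }

    @Override
    public int hashCode() {
        // Compute the hash over all values only once and reuse it afterwards, so that
        // N hashCode() calls cost O(N) total work instead of O(N^2).
        if (cachedHashCode == 0) {
            int h = 1;
            for (byte[] v : values) {
                h = 31 * h + Arrays.hashCode(v);
            }
            cachedHashCode = h;
        }
        return cachedHashCode;
    }
}
{code}
One caveat of this particular sketch: if the computed hash happens to be 0 it is recomputed on every call; a separate "computed" flag would avoid that.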

  was:
In WhereOptimizer.pushKeyExpressionsToScan(), has a line of code: 
"extractNodes.addAll(nodesToExtract)"

When executing sqls like "select * from ... where A in (a1, a2, ..., a_n) and B 
= X", saying A in N (N > 100,000) elements, previous code execution will slow 
(> 90s in our environment).

This is because in such case, extractNodes is a HashSet, nodesToExtract is a 
List with N InListExpression (the N InListExpressions are the same instance), 
each InListExpression.values has N elements as well.

HashSet.addAll(list) will call N times of 
InListExpression.hashCode(). Each time, InListExpression.hashCode() will 
calculate hashCode for every value. Therefore, the time complexity will be N^2.

A simple way to solve it is to remember of the hashCode of InListExpression and 
returns it directly if calculated once. The query will finish in 5 seconds.


> Improve performance of InListExpression.hashCode
> 
>
> Key: PHOENIX-5491
> URL: https://issues.apache.org/jira/browse/PHOENIX-5491
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Critical
> Attachments: PHOENIX-5491-v2.patch, PHOENIX-5491.patch
>
>
> WhereOptimizer runs very slowly when parsing SQL like "where A in (a1, a2, ..., 
> a_N) and B = X" when N is very large. In our environment, it runs > 90s for N 
> = 14.
> This is because, for the same instance of InListExpression, 
> InListExpression.hashCode() is recalculated every time, and each call traverses 
> InListExpression.values.
> For the previous SQL, InListExpression.hashCode() will be called N times, and 
> InListExpression.values has N elements, so the total complexity is N^2.
> Saving the hashCode of InListExpression reduces the complexity to N. Our 
> test shows the large SQL finishes within 5 seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5491) Improve performance of InListExpression.hashCode

2019-09-24 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5491:
---
Attachment: PHOENIX-5491-v2.patch

> Improve performance of InListExpression.hashCode
> 
>
> Key: PHOENIX-5491
> URL: https://issues.apache.org/jira/browse/PHOENIX-5491
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Critical
> Attachments: PHOENIX-5491-v2.patch, PHOENIX-5491.patch
>
>
> In WhereOptimizer.pushKeyExpressionsToScan() there is a line of code: 
> "extractNodes.addAll(nodesToExtract)".
> When executing SQL like "select * from ... where A in (a1, a2, ..., a_n) and 
> B = X", with A in N (N > 100,000) elements, this code executes slowly 
> (> 90s in our environment).
> This is because in such a case extractNodes is a HashSet and nodesToExtract is a 
> List with N InListExpression entries (the N InListExpressions are the same instance), 
> and each InListExpression.values has N elements as well.
> HashSet.addAll(list) will call InListExpression.hashCode() N times, and each call 
> calculates the hashCode over every value. Therefore, the time complexity is N^2.
> A simple way to solve it is to remember the hashCode of InListExpression 
> and return it directly once it has been calculated. The query then finishes in 5 
> seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5491) Improve performance of InListExpression.hashCode

2019-09-24 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5491:
---
Description: 
In WhereOptimizer.pushKeyExpressionsToScan() there is a line of code: 
"extractNodes.addAll(nodesToExtract)".

When executing SQL like "select * from ... where A in (a1, a2, ..., a_n) and B 
= X", with A in N (N > 100,000) elements, this code executes slowly 
(> 90s in our environment).

This is because in such a case extractNodes is a HashSet and nodesToExtract is a 
List with N InListExpression entries (the N InListExpressions are the same instance), 
and each InListExpression.values has N elements as well.

HashSet.addAll(list) will call InListExpression.hashCode() N times, and each call 
calculates the hashCode over every value. Therefore, the time complexity is N^2.

A simple way to solve it is to remember the hashCode of InListExpression and 
return it directly once it has been calculated. The query then finishes in 5 seconds.

  was:
In WhereOptimizer.pushKeyExpressionsToScan(), has a line of code: 
"extractNodes.addAll(nodesToExtract)" When executing sqls like "select * from 
... where A in (a1, a2, ..., a_n) and B = X", saying A in N (N > 100,000) 
elements, previous code execution will slow (> 90s in our environment).

This is because in such case, extractNodes is a HashSet, nodesToExtract is a 
List with N InListExpression (the N InListExpressions are the same instance), 
each InListExpression.values has N elements as well.

HashSet.addAll(list) will call N times of 
InListExpression.hashCode(). Each time, InListExpression.hashCode() will 
calculate hashCode for every value. Therefore, the time complexity will be N^2.

A simple way to solve it is to remember of the hashCode of InListExpression and 
returns it directly if calculated once.


> Improve performance of InListExpression.hashCode
> 
>
> Key: PHOENIX-5491
> URL: https://issues.apache.org/jira/browse/PHOENIX-5491
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Critical
> Attachments: PHOENIX-5491.patch
>
>
> In WhereOptimizer.pushKeyExpressionsToScan() there is a line of code: 
> "extractNodes.addAll(nodesToExtract)".
> When executing SQL like "select * from ... where A in (a1, a2, ..., a_n) and 
> B = X", with A in N (N > 100,000) elements, this code executes slowly 
> (> 90s in our environment).
> This is because in such a case extractNodes is a HashSet and nodesToExtract is a 
> List with N InListExpression entries (the N InListExpressions are the same instance), 
> and each InListExpression.values has N elements as well.
> HashSet.addAll(list) will call InListExpression.hashCode() N times, and each call 
> calculates the hashCode over every value. Therefore, the time complexity is N^2.
> A simple way to solve it is to remember the hashCode of InListExpression 
> and return it directly once it has been calculated. The query then finishes in 5 
> seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5491) Improve performance of InListExpression.hashCode

2019-09-24 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5491:
--

 Summary: Improve performance of InListExpression.hashCode
 Key: PHOENIX-5491
 URL: https://issues.apache.org/jira/browse/PHOENIX-5491
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng
Assignee: Chen Feng


In WhereOptimizer.pushKeyExpressionsToScan() there is a line of code: 
"extractNodes.addAll(nodesToExtract)". When executing SQL like "select * from 
... where A in (a1, a2, ..., a_n) and B = X", with A in N (N > 100,000) 
elements, this code executes slowly (> 90s in our environment).

This is because in such a case extractNodes is a HashSet and nodesToExtract is a 
List with N InListExpression entries (the N InListExpressions are the same instance), 
and each InListExpression.values has N elements as well.

HashSet.addAll(list) will call InListExpression.hashCode() N times, and each call 
calculates the hashCode over every value. Therefore, the time complexity is N^2.

A simple way to solve it is to remember the hashCode of InListExpression and 
return it directly once it has been calculated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5405) Enable timerange in sql hint

2019-08-04 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5405:
---
Attachment: PHOENIX-5405-v3.patch

> Enable timerange in sql hint
> 
>
> Key: PHOENIX-5405
> URL: https://issues.apache.org/jira/browse/PHOENIX-5405
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5405-v1.patch, PHOENIX-5405-v2.patch, 
> PHOENIX-5405-v3.patch
>
>
> PHOENIX-914 enabled Native HBase timestamp support to optimize date range 
> queries in Phoenix.
> However, the end time is set at the beginning of the connection. For a long-lived 
> connection, we cannot change the SCN while it is in use.
> If we know the underlying HBase timestamps, we can query the target data in a 
> connection-independent way by adding a hint to the SQL.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (PHOENIX-5405) Enable timerange in sql hint

2019-08-04 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5405:
---
Attachment: PHOENIX-5405-v2.patch

> Enable timerange in sql hint
> 
>
> Key: PHOENIX-5405
> URL: https://issues.apache.org/jira/browse/PHOENIX-5405
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5405-v1.patch, PHOENIX-5405-v2.patch
>
>
> PHOENIX-914 enabled Native HBase timestamp support to optimize date range 
> queries in Phoenix.
> However, the end time is set at the beginning of the connection. For a long-lived 
> connection, we cannot change the SCN while it is in use.
> If we know the underlying HBase timestamps, we can query the target data in a 
> connection-independent way by adding a hint to the SQL.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (PHOENIX-5405) Enable timerange in sql hint

2019-07-20 Thread Chen Feng (JIRA)
Chen Feng created PHOENIX-5405:
--

 Summary: Enable timerange in sql hint
 Key: PHOENIX-5405
 URL: https://issues.apache.org/jira/browse/PHOENIX-5405
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng


PHOENIX-914 enabled Native HBase timestamp support to optimize date range 
queries in Phoenix.
However, the end time is set at the beginning of the connection. For a long-lived 
connection, we cannot change the SCN while it is in use.
If we know the underlying HBase timestamps, we can query the target data in a 
connection-independent way by adding a hint to the SQL.
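
A purely hypothetical illustration of what such a hint could look like from client code; the TIMERANGE hint syntax and the MY_TABLE table below are invented for illustration and are not an existing Phoenix feature:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TimeRangeHintExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement();
             // Hypothetical hint: restrict the underlying HBase scan to a raw timestamp range
             // without opening a new connection with a different SCN.
             ResultSet rs = stmt.executeQuery(
                 "SELECT /*+ TIMERANGE(1563600000000, 1563686400000) */ * FROM MY_TABLE")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
{code}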



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-03 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v4.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v4.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-03 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v3.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-03 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: (was: PHOENIX-5117-4.x-HBase-1.4-v3.patch)

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-02 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v3.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-01 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v2.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-01 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: (was: PHOENIX-5117-4.x-HBase-1.4-v2.patch)

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-01 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v2.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-07-01 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-4.x-HBase-1.4-v1.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.1, 5.1.1
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2019-05-16 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng reassigned PHOENIX-5051:
--

Assignee: Chen Feng  (was: Cheng Fan)

> ScanningResultIterator metric "RowsScanned" not set
> ---
>
> Key: PHOENIX-5051
> URL: https://issues.apache.org/jira/browse/PHOENIX-5051
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Major
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-5051-4.x-HBase-1.2-v1.patch, 
> PHOENIX-5051-v1.patch, PHOENIX-5051-v2.patch
>
>
> In ScanningResultIterator, the metric "RowsScanned" was not set, while 
> "RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-05-16 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng reassigned PHOENIX-5117:
--

Assignee: Chen Feng

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2019-05-16 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-4296:
---
Attachment: PHOENIX-4296-4.x-HBase-1.2-v4.patch

> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>Assignee: Chen Feng
>Priority: Major
> Fix For: 4.14.2
>
> Attachments: PHOENIX-4296-4.x-HBase-1.2-v2.patch, 
> PHOENIX-4296-4.x-HBase-1.2-v3.patch, PHOENIX-4296-4.x-HBase-1.2-v4.patch, 
> PHOENIX-4296-4.x-HBase-1.2.patch, PHOENIX-4296.patch
>
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-04-27 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: (was: PHOENIX-5117-v1.patch)

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Minor
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2019-04-27 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-4296:
---
Attachment: (was: PHOENIX-4296-4.x-HBase-1.2-v3.patch)

> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>Priority: Major
> Attachments: PHOENIX-4296-4.x-HBase-1.2-v2.patch, 
> PHOENIX-4296-4.x-HBase-1.2.patch, PHOENIX-4296.patch
>
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2019-04-27 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-4296:
---
Attachment: PHOENIX-4296-4.x-HBase-1.2-v3.patch

> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>Priority: Major
> Attachments: PHOENIX-4296-4.x-HBase-1.2-v2.patch, 
> PHOENIX-4296-4.x-HBase-1.2-v3.patch, PHOENIX-4296-4.x-HBase-1.2.patch, 
> PHOENIX-4296.patch
>
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5217) Incorrect result for COUNT DISTINCT ... limit ...

2019-03-28 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5217:
---
Description: 
For table t1(pk1, col1, CONSTRAINT(pk1))

upsert into "t1" values (1, 1);
 upsert into "t1" values (2, 2);

sql A: select count("pk1") from "t1" limit 1 returns 2 [correct]

sql B: select count(distinct("pk1")) from "t1" limit 1 returns 1 [incorrect]

  was:
For table t1(pk1, col1, CONSTRAINT(pk1))

upsert into "t1" values (1, 1);
 upsert into "t1" values (2, 2);

sql1: select count("pk1") from "t1" limit 1 returns 2 [correct]

sql2: select count(distinct("pk1")) from "t1" limit 1 returns 1 [incorrect]


> Incorrect result for COUNT DISTINCT ... limit ...
> -
>
> Key: PHOENIX-5217
> URL: https://issues.apache.org/jira/browse/PHOENIX-5217
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
> Environment: 4.14.1: incorrect
> 4.6: correct.
>  
>Reporter: Chen Feng
>Priority: Critical
>
> For table t1(pk1, col1, CONSTRAINT(pk1))
> upsert into "t1" values (1, 1);
>  upsert into "t1" values (2, 2);
> sql A: select count("pk1") from "t1" limit 1 returns 2 [correct]
> sql B: select count(distinct("pk1")) from "t1" limit 1 returns 1 [incorrect]
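
A small JDBC reproduction sketch of the two statements above; the connection URL, 
table DDL, and column types are assumptions for illustration, and the results in 
the comments are the reported ones.
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CountDistinctLimitRepro {
    public static void main(String[] args) throws Exception {
        // Assumed Phoenix JDBC URL pointing at the ZooKeeper quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS \"t1\" (\"pk1\" INTEGER NOT NULL, "
                + "\"col1\" INTEGER, CONSTRAINT pk PRIMARY KEY (\"pk1\"))");
            stmt.execute("UPSERT INTO \"t1\" VALUES (1, 1)");
            stmt.execute("UPSERT INTO \"t1\" VALUES (2, 2)");
            conn.commit();

            try (ResultSet rs = stmt.executeQuery(
                    "SELECT COUNT(\"pk1\") FROM \"t1\" LIMIT 1")) {          // sql A
                rs.next();
                System.out.println("count = " + rs.getLong(1));              // returns 2 [correct]
            }
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT COUNT(DISTINCT \"pk1\") FROM \"t1\" LIMIT 1")) { // sql B
                rs.next();
                System.out.println("count distinct = " + rs.getLong(1));     // reported as 1 on 4.14.1 [incorrect]
            }
        }
    }
}
{code}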



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5217) Incorrect result for COUNT DISTINCT ... limit ...

2019-03-28 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5217:
---
Environment: 
4.14.1: incorrect

4.6: correct.

 

  was:
Phoenix 4.14.1 returns the incorrect answer.

Phoenix 4.6 is correct.

No other versions have been tested.

 


> Incorrect result for COUNT DISTINCT ... limit ...
> -
>
> Key: PHOENIX-5217
> URL: https://issues.apache.org/jira/browse/PHOENIX-5217
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
> Environment: 4.14.1: incorrect
> 4.6: correct.
>  
>Reporter: Chen Feng
>Priority: Critical
>
> For table t1(pk1, col1, CONSTRAINT(pk1))
> upsert into "t1" values (1, 1);
>  upsert into "t1" values (2, 2);
> sql1: select count("pk1") from "t1" limit 1 returns 2 [correct]
> sql2: select count(distinct("pk1")) from "t1" limit 1 returns 1 [incorrect]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-5217) Incorrect result for COUNT DISTINCT ... limit ...

2019-03-28 Thread Chen Feng (JIRA)
Chen Feng created PHOENIX-5217:
--

 Summary: Incorrect result for COUNT DISTINCT ... limit ...
 Key: PHOENIX-5217
 URL: https://issues.apache.org/jira/browse/PHOENIX-5217
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.14.1
 Environment: Phoenix 4.14.1 returns the incorrect answer.

Phoenix 4.6 is correct.

No other versions have been tested.

 
Reporter: Chen Feng


For table t1(pk1, col1, CONSTRAINT(pk1))

upsert into "t1" values (1, 1);
 upsert into "t1" values (2, 2);

sql1: select count("pk1") from "t1" limit 1 returns 2 [correct]

sql2: select count(distinct("pk1")) from "t1" limit 1 returns 1 [incorrect]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-01-31 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Description: 
HBASE-5980 provides the ability to return the number of rows scanned. Such 
metrics should also be returned by Phoenix.

 

HBASE-21815 is required.

  was:HBASE-5980 provides the ability to return the number of rows scanned. 
Such metrics should also be returned by Phoenix


> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
>  
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-01-31 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Description: 
HBASE-5980 provides the ability to return the number of rows scanned. Such 
metrics should also be returned by Phoenix.

HBASE-21815 is required.

  was:
HBASE-5980 provides the ability to return the number of rows scanned. Such 
metrics should also be returned by Phoenix.

 

HBASE-21815 is required.


> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-01-31 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Attachment: PHOENIX-5117-v1.patch

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-01-31 Thread Chen Feng (JIRA)
Chen Feng created PHOENIX-5117:
--

 Summary: Return the count of rows scanned in HBase
 Key: PHOENIX-5117
 URL: https://issues.apache.org/jira/browse/PHOENIX-5117
 Project: Phoenix
  Issue Type: New Feature
Affects Versions: 4.14.1
Reporter: Chen Feng


HBASE-5980 provides the ability to return the number of rows scanned. Here we 
return the count by Phoenix metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2019-01-31 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5117:
---
Description: HBASE-5980 provides the ability to return the number of rows 
scanned. Such metrics should also be returned by Phoenix  (was: HBASE-5980 
provides the ability to return the number of rows scanned. Here we return the 
count by Phoenix metrics.)

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Minor
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2019-01-23 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-4296:
---
Attachment: PHOENIX-4296-4.x-HBase-1.2-v2.patch

> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>Priority: Major
> Attachments: PHOENIX-4296-4.x-HBase-1.2-v2.patch, 
> PHOENIX-4296-4.x-HBase-1.2.patch, PHOENIX-4296.patch
>
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2019-01-03 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-4296:
---
Attachment: PHOENIX-4296-4.x-HBase-1.2.patch

> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>Priority: Major
> Attachments: PHOENIX-4296-4.x-HBase-1.2.patch, PHOENIX-4296.patch
>
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2019-01-03 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-4296:
---
Attachment: PHOENIX-4296.patch

> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>Priority: Major
> Attachments: PHOENIX-4296.patch
>
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2018-12-04 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5051:
---
Attachment: PHOENIX-5051-4.x-HBase-1.2-v1.patch

> ScanningResultIterator metric "RowsScanned" not set
> ---
>
> Key: PHOENIX-5051
> URL: https://issues.apache.org/jira/browse/PHOENIX-5051
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Cheng Fan
>Priority: Major
> Fix For: 4.14.2
>
> Attachments: PHOENIX-5051-4.x-HBase-1.2-v1.patch, 
> PHOENIX-5051-v1.patch, PHOENIX-5051-v2.patch
>
>
> In ScanningResultIterator, the metric "RowsScanned" was not set, while
> "RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2018-12-04 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5051:
---
Affects Version/s: (was: 5.0.0)

> ScanningResultIterator metric "RowsScanned" not set
> ---
>
> Key: PHOENIX-5051
> URL: https://issues.apache.org/jira/browse/PHOENIX-5051
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Major
> Fix For: 4.14.2
>
> Attachments: PHOENIX-5051-v1.patch, PHOENIX-5051-v2.patch
>
>
> In ScanningResultIterator, the metric "RowsScanned" was not set, while
> "RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2018-12-04 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5051:
---
Fix Version/s: (was: 5.1.0)
   (was: 4.15.0)

> ScanningResultIterator metric "RowsScanned" not set
> ---
>
> Key: PHOENIX-5051
> URL: https://issues.apache.org/jira/browse/PHOENIX-5051
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Priority: Major
> Fix For: 4.14.2
>
> Attachments: PHOENIX-5051-v1.patch, PHOENIX-5051-v2.patch
>
>
> In ScanningResultIterator, the metric "RowsScanned" was not set, while
> "RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2018-11-30 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5051:
---
Attachment: PHOENIX-5051-v1.patch

> ScanningResultIterator metric "RowsScanned" not set
> ---
>
> Key: PHOENIX-5051
> URL: https://issues.apache.org/jira/browse/PHOENIX-5051
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Chen Feng
>Priority: Major
> Attachments: PHOENIX-5051-v1.patch, row_scanned.patch
>
>
> In ScanningResultIterator, the metric "RowsScanned" was not set, while
> "RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2018-11-30 Thread Chen Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5051:
---
Attachment: (was: row_scanned.patch)

> ScanningResultIterator metric "RowsScanned" not set
> ---
>
> Key: PHOENIX-5051
> URL: https://issues.apache.org/jira/browse/PHOENIX-5051
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Chen Feng
>Priority: Major
> Attachments: PHOENIX-5051-v1.patch
>
>
> In ScanningResultIterator, the metric "RowsScanned" was not set, while
> "RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-5051) ScanningResultIterator metric "RowsScanned" not set

2018-11-30 Thread Chen Feng (JIRA)
Chen Feng created PHOENIX-5051:
--

 Summary: ScanningResultIterator metric "RowsScanned" not set
 Key: PHOENIX-5051
 URL: https://issues.apache.org/jira/browse/PHOENIX-5051
 Project: Phoenix
  Issue Type: Bug
Reporter: Chen Feng
 Attachments: row_scanned.patch

In ScanningResultIterator, the metric "RowsScanned" was not set, while
"RowsFiltered" was set twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4296) Dead loop in HBase reverse scan when amount of scan data is greater than SCAN_RESULT_CHUNK_SIZE

2017-10-18 Thread Chen Feng (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209010#comment-16209010
 ] 

Chen Feng commented on PHOENIX-4296:


For example, suppose there are 3001 rows with ids [1, 2, 3, ..., 3001].
In the first chunk, with scan.startRow=1 and stopRow=3001, the reverse scan
returns lastKey values 3001, 3000, 2999, ..., 3. At the end of that chunk,
lastKey is 2.

For the next chunk, the code
scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey)); sets
scan.startRow=2 while stopRow stays 3001.

Therefore the outer scan never terminates: each inner scan re-reads the same
rows from 3 to 3001.
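
A standalone simulation of the walkthrough above; it is illustrative only and does 
not use the actual Phoenix iterator classes. Integer keys stand in for row keys and 
CHUNK plays the role of SCAN_RESULT_CHUNK_SIZE: shrinking the stop side finishes 
after two chunks, while the buggy variant that moves the start side never does.
{code:java}
public class ReverseChunkSimulation {
    static final int CHUNK = 2999;

    public static void main(String[] args) {
        int lo = 1;      // startRow (inclusive, for simplicity)
        int hi = 3001;   // stopRow  (inclusive, for simplicity)
        int chunks = 0;
        // Correct reverse-scan chunking: pull the stop side down past the last
        // key returned by the previous chunk.
        while (hi >= lo) {
            int chunkLow = Math.max(lo, hi - CHUNK + 1); // this chunk returns rows [chunkLow, hi]
            chunks++;
            hi = chunkLow - 1;                           // next chunk must end below this one
        }
        System.out.println("reverse scan finished in " + chunks + " chunks");
        // The buggy variant instead raises 'lo' (the start side) and leaves 'hi'
        // at 3001, so every chunk re-reads the rows up to 3001 and the outer
        // loop never terminates.
    }
}
{code}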


> Dead loop in HBase reverse scan when amount of scan data is greater than 
> SCAN_RESULT_CHUNK_SIZE
> ---
>
> Key: PHOENIX-4296
> URL: https://issues.apache.org/jira/browse/PHOENIX-4296
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.6.0
>Reporter: rukawakang
>
> This problem seems to occur only with reverse scans, not forward scans. When the
> amount of scanned data is greater than SCAN_RESULT_CHUNK_SIZE (default 2999),
> ChunkedResultIteratorFactory calls getResultIterator multiple times. But
> getResultIterator always readjusts startRow; for a reverse scan it should
> readjust stopRow instead. For example:
> {code:java}
> if (ScanUtil.isReversed(scan)) {
> scan.setStopRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> } else {
> scan.setStartRow(ByteUtil.copyKeyBytesIfNecessary(lastKey));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)