[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666402#comment-15666402
 ] 

chenglei edited comment on PHOENIX-3451 at 11/15/16 7:42 AM:
-

[~jamestaylor], thank you for your suggestion. My considerations are as follows:

1. If the GroupBy is "GROUP BY pkCol1 + 1, TRUNC(pkCol2)", the OrderBy must be 
"ORDER BY pkCol1 + 1" or "ORDER BY TRUNC(pkCol2)"; the OrderBy columns must 
match the GroupBy columns.
2. Only when all GROUP BY/ORDER BY expressions are simple row key columns (i.e. 
GROUP BY pkCol1, pkCol2 or ORDER BY pkCol1, pkCol2) do we need to go further 
and check whether the GROUP BY/ORDER BY is "isOrderPreserving". If the GROUP 
BY/ORDER BY expressions are not simple row key columns (i.e. GROUP BY pkCol1 + 1, 
TRUNC(pkCol2) or ORDER BY pkCol1 + 1, TRUNC(pkCol2)), the GROUP BY/ORDER BY 
certainly should not be "isOrderPreserving".

So I think my patch is OK. As the following code shows, it only needs to 
consider RowKeyColumnExpression. RowKeyColumnExpression is enough for checking 
whether the ORDER BY is "isOrderPreserving"; for any other type of Expression, 
the visit method below returns null, and the 
OrderPreservingTracker.isOrderPreserving method then returns false, which is as 
expected.

 {code:borderStyle=solid} 
@Override
public Info visit(RowKeyColumnExpression node) {
    // No GROUP BY: the row key position can be used directly.
    if (groupBy == null || groupBy.isEmpty()) {
        return new Info(node.getPosition());
    }
    int pkPosition = node.getPosition();
    assert pkPosition < groupBy.getExpressions().size();
    Expression groupByExpression = groupBy.getExpressions().get(pkPosition);
    // Only a GROUP BY directly on row key columns can preserve order.
    if (!(groupByExpression instanceof RowKeyColumnExpression)) {
        return null;
    }
    // Map the position within the GROUP BY back to the position in the
    // original row key.
    int originalPkPosition = ((RowKeyColumnExpression) groupByExpression).getPosition();
    return new Info(originalPkPosition);
}
 {code} 

By the way, I had already considered a modification along the lines of your 
suggestion when I made my patch. I finally chose the current patch because it 
is simpler, and the change is confined to the single OrderPreservingTracker 
class. FYI.



[jira] [Commented] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666402#comment-15666402
 ] 

chenglei commented on PHOENIX-3451:
---


> Secondary index and query using distinct: LIMIT doesn't return the first rows
> -
>
> Key: PHOENIX-3451
> URL: https://issues.apache.org/jira/browse/PHOENIX-3451
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Joel Palmert
>Assignee: chenglei
> Attachments: PHOENIX-3451.diff
>
>
> This may be related to PHOENIX-3452 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
> ORGANIZATION_ID CHAR(15) NOT NULL,
> CONTAINER_ID CHAR(15) NOT NULL,
> ENTITY_ID CHAR(15) NOT NULL,
> SCORE DOUBLE,
> CONSTRAINT TEST_PK PRIMARY KEY (
> ORGANIZATION_ID,
> CONTAINER_ID,
> ENTITY_ID
> )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
> UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
> EXPLAIN
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org2'
> AND container_id IN ( 'container1','container2','container3' )
> ORDER BY score DESC
> LIMIT 2
> OUTPUT
> entityId5  1.2
> entityId3  1.4
> The expected output would be
> entityId8  1.45
> entityId3  1.4
> You will get the expected output if you remove the secondary index from the 
> table or remove distinct from the query.
> As described in PHOENIX-3452, if you run the query without the LIMIT the 
> ordering is not correct. However, the first 2 results in that ordering are 
> still not the ones returned by the LIMIT clause, which makes me think there 
> are multiple issues here and is why I filed both separately. The rows being 
> returned are the ones assigned to container1. It looks like Phoenix first 
> gets the rows from the first container and, when it finds that to be enough, 
> stops the scan. What it should be doing is getting 2 results for each 
> container, then merging them, and then applying the limit again.
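The merge-then-limit behavior the reporter describes can be sketched independently of Phoenix. The class and method names below are illustrative, not Phoenix APIs; the sketch assumes each container's index scan already yields rows sorted by score descending, as the TEST_SCORE index would.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MergeThenLimit {
    // One (entityId, score) row from a container's index scan.
    record Row(String entityId, double score) {}

    // Take up to `limit` rows from each container's scan (each already
    // sorted by score DESC), merge them all, then apply the limit again.
    static List<Row> topN(Map<String, List<Row>> byContainer, int limit) {
        return byContainer.values().stream()
                .flatMap(rows -> rows.stream().limit(limit))
                .sorted(Comparator.comparingDouble(Row::score).reversed())
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The data from the repro steps, grouped by CONTAINER_ID.
        Map<String, List<Row>> scans = Map.of(
            "container1", List.of(new Row("entityId3", 1.4), new Row("entityId5", 1.2)),
            "container2", List.of(new Row("entityId4", 1.3), new Row("entityId6", 1.1)),
            "container3", List.of(new Row("entityId8", 1.45), new Row("entityId7", 1.35)));
        // Yields entityId8 (1.45) then entityId3 (1.4), the expected output.
        System.out.println(topN(scans, 2));
    }
}
```

Taking only the first container's rows, as the buggy plan does, would return entityId3 and entityId5 instead.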



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-3482) Provide a work around for HBASE-17096

2016-11-14 Thread Samarth Jain (JIRA)
Samarth Jain created PHOENIX-3482:
-

 Summary: Provide a work around for HBASE-17096
 Key: PHOENIX-3482
 URL: https://issues.apache.org/jira/browse/PHOENIX-3482
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain


HBASE-17096 causes failures in UpgradeIT#testAcquiringAndReleasingUpgradeMutex. 
Essentially, releasing the upgrade mutex using the checkAndMutate API isn't 
working correctly. A simple, though not ideal, workaround would be to not call 
releaseMutex() and let the lock expire by virtue of the TTL set on the cell. 
The side effect is that if a client encounters an exception while executing the 
upgrade code, a new client won't be able to initiate the upgrade until the TTL 
expires.
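The TTL-expiry semantics of that workaround can be illustrated with a minimal in-memory mutex; this is only a sketch of the idea, not Phoenix's or HBase's actual implementation, and all names in it are made up for illustration.

```java
public class TtlMutex {
    private long expiresAtMillis = 0;

    // Acquire succeeds only when no unexpired lock is held, mirroring a
    // checkAndPut that expects the mutex cell to be absent. There is no
    // explicit release: the lock simply lapses once its TTL has passed.
    synchronized boolean acquire(long nowMillis, long ttlMillis) {
        if (nowMillis < expiresAtMillis) {
            return false; // another client still holds an unexpired lock
        }
        expiresAtMillis = nowMillis + ttlMillis;
        return true;
    }

    public static void main(String[] args) {
        TtlMutex mutex = new TtlMutex();
        System.out.println(mutex.acquire(0, 1000));    // true: lock acquired
        System.out.println(mutex.acquire(500, 1000));  // false: lock still held
        System.out.println(mutex.acquire(1500, 1000)); // true: TTL has expired
    }
}
```

The second call shows the side effect described above: a client that dies mid-upgrade blocks all others until the TTL elapses.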





[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666301#comment-15666301
 ] 

Samarth Jain commented on PHOENIX-3481:
---

bq. Actually, you're better off if you can track down and change it on the 
server-side as otherwise we'll need to check-in a new Phoenix jar to core 
(assuming this is potentially a real issue rather than a test only issue).

[~jamestaylor], I looked at the changes in HBase code between the versions 
0.98.17 and 0.98.23 and I see that HBASE-15245 added new checks that were not 
present before. Specifically this change in HRegionServer:

{code}
+ByteString value = regionSpecifier.getValue();
+RegionSpecifierType type = regionSpecifier.getType();
+switch (type) {
+case REGION_NAME:
+  byte[] regionName = value.toByteArray();
+  String encodedRegionName = HRegionInfo.encodeRegionName(regionName);
+  return getRegionByEncodedName(regionName, encodedRegionName);
+case ENCODED_REGION_NAME:
+  return getRegionByEncodedName(value.toStringUtf8());
+default:
+  throw new DoNotRetryIOException(
+"Unsupported region specifier type: " + type);
+}
{code}

The call 

{code}
getRegionByEncodedName(regionName, encodedRegionName);
{code}

throws a NotServingRegionException. This exception is thrown before the Phoenix 
coprocessor hook for doPostScannerOpen() can throw a 
StaleRegionBoundaryCacheException for the Phoenix client to retry. FWIW, this 
change has also been made to HBase 1.3.0 as part of HBASE-15177.

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch, 
> PHOENIX-3481_master_v2.patch, PHOENIX-3481_v3_0.98.patch
>
>






Re: [ANNOUNCE] New Apache Phoenix committer - Kevin Liew

2016-11-14 Thread Kevin Liew
Thank you James and PMC. I'm excited to see how Apache Phoenix will evolve, and 
I am grateful for the opportunity to contribute as a committer. 

On 2016-11-10 11:07 (-0800), James Taylor  wrote: 
> On behalf of the Apache Phoenix PMC, I'm pleased to announce that Kevin
> Liew has accepted our invitation to become a committer on the Apache
> Phoenix project. He's done a great job finding and fixing many important
> Phoenix JIRAs as well as most recently implementing support for default
> column value declarations [1] in our upcoming 4.9.0 release.
> 
> Welcome aboard, Kevin. Looking forward to many more contributions!
> 
> Regards,
> James
> 
> [1] https://issues.apache.org/jira/browse/PHOENIX-476
> 


[jira] [Commented] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666071#comment-15666071
 ] 

James Taylor commented on PHOENIX-3451:
---

I think we're on the same page. I'm just suggesting we track the original PK 
position in the GroupByCompiler as we have the info there already.






[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665917#comment-15665917
 ] 

Samarth Jain commented on PHOENIX-3481:
---

It turns out that upgrading to HBase 0.98.23 may not be straightforward after 
all. I am running into a test failure in 
UpgradeIT#testAcquiringAndReleasingUpgradeMutex because of a bug in the HBase 
checkAndMutate API. See HBASE-17096 for details.







[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665874#comment-15665874
 ] 

chenglei edited comment on PHOENIX-3451 at 11/15/16 3:30 AM:
-

[~jamestaylor], it seems you did not look at my uploaded PHOENIX-3451.diff; the 
patch actually does what you said.

My patch does not change the final OrderBy's orderByExpressions; an 
orderByExpression's position is still its position in the GroupBy. My patch 
only changes the Info's pkPosition used in the 
OrderPreservingTracker.isOrderPreserving method. The final OrderBy's 
orderByExpressions are not created from the Info's pkPosition; the Info's 
pkPosition only affects the OrderPreservingTracker.isOrderPreserving method. 
In that method, the pkPosition must be the position in the original row key 
columns if the SQL has a GroupBy.
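As a standalone illustration of that remapping (the names here are made up for the example, not the patch's actual code): with row key (ORGANIZATION_ID, CONTAINER_ID, ENTITY_ID) and GROUP BY CONTAINER_ID, ENTITY_ID, position 0 within the GroupBy refers to original row key position 1.

```java
import java.util.List;

public class PkPositionRemap {
    // Given, for each GROUP BY expression, the position of the row key
    // column it references, map a position within the GROUP BY back to a
    // position in the original row key.
    static int originalPkPosition(List<Integer> groupByPkPositions, int positionInGroupBy) {
        return groupByPkPositions.get(positionInGroupBy);
    }

    public static void main(String[] args) {
        // GROUP BY CONTAINER_ID, ENTITY_ID over row key
        // (ORGANIZATION_ID, CONTAINER_ID, ENTITY_ID): the GROUP BY
        // expressions reference row key positions 1 and 2.
        List<Integer> groupByPkPositions = List.of(1, 2);
        System.out.println(originalPkPosition(groupByPkPositions, 0)); // 1
        System.out.println(originalPkPosition(groupByPkPositions, 1)); // 2
    }
}
```

It is these original positions, not the positions within the GroupBy, that an order-preservation check over the row key has to use.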












[jira] [Commented] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665899#comment-15665899
 ] 

James Taylor commented on PHOENIX-3451:
---

Sorry about that. I based my feedback on the description in your very helpful 
analysis. Your patch is indeed on the right track, but it makes some 
assumptions. For example, it would only work if the GROUP BY is done directly 
on a RowKeyColumnExpression (i.e. GROUP BY pkCol1, pkCol2), but not for cases 
like GROUP BY pkCol1 + 1, TRUNC(pkCol2).

I'd recommend adding information to the GroupBy object to capture the pk 
position of each Info. You only need to do that here in GroupByCompiler (by 
adding a new setter to the GroupByBuilder):
{code}
if (isOrderPreserving || isUngroupedAggregate) {
    return new GroupBy.GroupByBuilder(this)
            .setIsOrderPreserving(isOrderPreserving)
            .setOrderPreservingColumnCount(orderPreservingColumnCount)
            .build();
}
{code}
At that point you still have the tracker, so you can add a method like 
tracker.getPkPositions() which returns a list or array of ints: the primary 
key position of each Info, which gets passed into the new builder setter 
method. Then you can use that information in the OrderByCompiler when an 
aggregation is being done.
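A standalone sketch of the suggested accessor (the names follow the suggestion above but are hypothetical; the real OrderPreservingTracker carries more state per tracked expression):

```java
import java.util.ArrayList;
import java.util.List;

public class OrderTracker {
    // Hypothetical stand-in for OrderPreservingTracker.Info: just the
    // primary key position the tracked expression resolved to.
    record Info(int pkPosition) {}

    private final List<Info> orderPreservingInfos = new ArrayList<>();

    void track(Info info) {
        orderPreservingInfos.add(info);
    }

    // The suggested accessor: the pk position of each tracked Info, in
    // order, so a compiler could hand them to a builder for later use.
    int[] getPkPositions() {
        return orderPreservingInfos.stream()
                .mapToInt(Info::pkPosition)
                .toArray();
    }

    public static void main(String[] args) {
        OrderTracker tracker = new OrderTracker();
        tracker.track(new Info(1));
        tracker.track(new Info(2));
        System.out.println(java.util.Arrays.toString(tracker.getPkPositions()));
    }
}
```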









[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665874#comment-15665874
 ] 

chenglei edited comment on PHOENIX-3451 at 11/15/16 3:25 AM:
-

[~jamestaylor], it seems you did not look at my uploaded PHOENIX-3451.diff; I 
did indeed patch it as you suggested.

My patch does not change the final OrderBy's orderByExpressions; each 
orderByExpression's position is still its position in the GroupBy. My patch 
only changes the Info's pkPosition used in OrderPreservingTracker's 
isOrderPreserving method. In that method, the pkPosition must be the position 
in the original RowKey columns if the SQL has a GroupBy.







> Secondary index and query using distinct: LIMIT doesn't return the first rows
> -
>
> Key: PHOENIX-3451
> URL: https://issues.apache.org/jira/browse/PHOENIX-3451
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Joel Palmert
>Assignee: chenglei
> Attachments: PHOENIX-3451.diff
>
>
> This may be related to PHOENIX-3452 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
> ORGANIZATION_ID CHAR(15) NOT NULL,
> CONTAINER_ID CHAR(15) NOT NULL,
> ENTITY_ID CHAR(15) NOT NULL,
> SCORE DOUBLE,
> CONSTRAINT TEST_PK PRIMARY KEY (
> ORGANIZATION_ID,
> CONTAINER_ID,
> ENTITY_ID
> )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
> UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
> EXPLAIN
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org2'
> AND container_id IN ( 'container1','container2','container3' )
> ORDER BY score DESC
> LIMIT 2
> OUTPUT
> entityId5  1.2
> entityId3  1.4
> The expected output would be
> entityId8  1.45
> entityId3  1.4
> You will get the expected output if you remove the secondary index from the 
> table or remove distinct from the query.
> As described in PHOENIX-3452, if you run the query without the LIMIT the 
> ordering is not correct. However, the first 2 results in that ordering are 
> still not the ones returned by the limit clause, which makes me think there 
> are multiple issues here and why I filed both separately. The rows being 
> returned are the ones assigned to container1. It looks like Phoenix is first 
> getting the rows from the first container, and when it finds that to be enough 
> it stops the scan. What it should be doing is getting 2 results for each 
> container, then merging them, and then applying the limit again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665874#comment-15665874
 ] 

chenglei commented on PHOENIX-3451:
---

[~jamestaylor], it seems you did not look at my uploaded PHOENIX-3451.diff; I 
did indeed patch it as you suggested.






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665868#comment-15665868
 ] 

James Taylor commented on PHOENIX-3451:
---

[~comnetwork] - good, helpful analysis, but your conclusion isn't quite right. 
We want to use the pk position according to the GROUP BY, because the rows 
returned from the server are *aggregated* rows and the sort will be done on 
the client. The problem appears to be that our hasEqualityConstraints check 
isn't taking this into account. It should be determining whether there's an 
equality constraint for the GROUP BY expressions in those positions (instead 
of treating them as positions in the original schema).

Probably the easiest fix would be to hold on to the OrderPreservingTracker.Info 
objects in a list for the GroupByCompiler. Then in the OrderByCompiler, we 
could look at that list from the GroupBy, index it by position, and translate 
the position based on OrderPreservingTracker.Info.pkPosition. That would be 
the correct index to use when calling hasEqualityConstraints. If there's no 
list (i.e. it's not an aggregation), we'd just use the pkPosition to directly 
index the ScanRanges as we're doing now. 
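The position translation being proposed can be sketched as follows. This is a hypothetical illustration of the idea only, not Phoenix's actual API: the Info stand-in, the translate method, and the example positions are all invented for this sketch.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed fix: when the query aggregates, an ORDER BY
// expression's position refers to a GROUP BY output column and must be
// translated back to the original row key position (Info.pkPosition) before
// consulting the scan's equality constraints. Names are illustrative.
public class PositionTranslationSketch {
    // Stand-in for OrderPreservingTracker.Info
    static final class Info {
        final int pkPosition; // position in the original row key
        Info(int pkPosition) { this.pkPosition = pkPosition; }
    }

    static int translate(int orderByPosition, List<Info> groupByInfos) {
        if (groupByInfos == null) {
            // Not an aggregate query: the position already indexes the
            // row key / ScanRanges directly, as today.
            return orderByPosition;
        }
        // Aggregate query: map the GROUP BY output position back to the
        // original row key position.
        return groupByInfos.get(orderByPosition).pkPosition;
    }

    public static void main(String[] args) {
        // Suppose GROUP BY output column 0 came from row key column 2,
        // and output column 1 came from row key column 0.
        List<Info> infos = Arrays.asList(new Info(2), new Info(0));
        System.out.println(translate(0, infos)); // 2
        System.out.println(translate(1, infos)); // 0
        System.out.println(translate(1, null));  // 1
    }
}
```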




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665767#comment-15665767
 ] 

James Taylor commented on PHOENIX-3481:
---

Actually, you're better off if you can track it down and change it on the 
server side, as otherwise we'll need to check in a new Phoenix jar to core 
(assuming this is potentially a real issue rather than a test-only issue).

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch, 
> PHOENIX-3481_master_v2.patch, PHOENIX-3481_v3_0.98.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3333) Support Spark 2.0

2016-11-14 Thread DEQUN (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665762#comment-15665762
 ] 

DEQUN commented on PHOENIX-3333:
--------------------------------

I'm also confused. Can you supply more details? Thanks! :-)

> Support Spark 2.0
> -
>
> Key: PHOENIX-3333
> URL: https://issues.apache.org/jira/browse/PHOENIX-3333
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.8.0
> Environment: spark 2.0 ,phoenix 4.8.0 , os is centos 6.7 ,hadoop is 
> hdp 2.5
>Reporter: dalin qin
> Attachments: PHOENIX--interim.patch
>
>
> spark version is  2.0.0.2.5.0.0-1245
> As mentioned by Josh, I believe Spark 2.0 changed its API in a way that 
> breaks Phoenix. Please come up with an updated version that adapts to 
> Spark's change.
> In [1]: df = sqlContext.read \
>...:   .format("org.apache.phoenix.spark") \
>...:   .option("table", "TABLE1") \
>...:   .option("zkUrl", "namenode:2181:/hbase-unsecure") \
>...:   .load()
> ---
> Py4JJavaError Traceback (most recent call last)
>  in ()
> > 1 df = sqlContext.read   .format("org.apache.phoenix.spark")   
> .option("table", "TABLE1")   .option("zkUrl", 
> "namenode:2181:/hbase-unsecure")   .load()
> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/readwriter.pyc in load(self, 
> path, format, schema, **options)
> 151 return 
> self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
> 152 else:
> --> 153 return self._df(self._jreader.load())
> 154
> 155 @since(1.4)
> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py
>  in __call__(self, *args)
> 931 answer = self.gateway_client.send_command(command)
> 932 return_value = get_return_value(
> --> 933 answer, self.gateway_client, self.target_id, self.name)
> 934
> 935 for temp_arg in temp_args:
> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/utils.pyc in deco(*a, **kw)
>  61 def deco(*a, **kw):
>  62 try:
> ---> 63 return f(*a, **kw)
>  64 except py4j.protocol.Py4JJavaError as e:
>  65 s = e.java_exception.toString()
> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py 
> in get_return_value(answer, gateway_client, target_id, name)
> 310 raise Py4JJavaError(
> 311 "An error occurred while calling {0}{1}{2}.\n".
> --> 312 format(target_id, ".", name), value)
> 313 else:
> 314 raise Py4JError(
> Py4JJavaError: An error occurred while calling o43.load.
> : java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.getDeclaredMethod(Class.java:2128)
> at 
> java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
> at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.ObjectStreamClass.(ObjectStreamClass.java:472)
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
> at 
> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
> at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at 
> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at 
> org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
> at 
> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
> at 
> org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
> at 
> org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
> at 
> org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
> at org.apache.spark.SparkContext.clean(SparkContext.scala:2037)
> at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:366)
> at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:365)
> at 
> 

[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665754#comment-15665754
 ] 

Samarth Jain commented on PHOENIX-3481:
---

bq. The intent is that NotServingRegionException is translated on the server 
side to StaleRegionBoundaryCacheException
I see. I wasn't sure if I should make the change in the ServerUtil method 
itself since it will make the change across the board. But from your comment it 
looks like it should be. Will commit the v2 patch then (provided my local run 
is successful). 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3333) Support Spark 2.0

2016-11-14 Thread lichenglin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665749#comment-15665749
 ] 

lichenglin commented on PHOENIX-3333:
-------------------------------------

In fact, there is no need to build Spark.

Just add the Spark jars to the Phoenix project.


[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665743#comment-15665743
 ] 

James Taylor commented on PHOENIX-3481:
---

v2 patch was simpler. Any reason why you changed it? The intent is that 
NotServingRegionException is translated on the server side to 
StaleRegionBoundaryCacheException, but it seems that there's a new code path 
where this is not being done. [~rajeshbabu] is the real expert in this area of 
the code.

My preference is v2 as it'd be good to be consistent across the board with this.
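The translation being described can be sketched as a simple exception-mapping step. This is a hedged illustration only: the exception classes are stubbed here, and the real Phoenix/HBase types and the actual translation site differ.

```java
// Sketch of the server-side translation under discussion: a
// NotServingRegionException surfaced during a scan is rethrown as a
// StaleRegionBoundaryCacheException so the client knows to refresh its
// region boundary cache and retry, instead of failing the query.
// Exception types are local stubs, not the real Phoenix/HBase classes.
public class ExceptionTranslationSketch {
    static class NotServingRegionException extends RuntimeException {}
    static class StaleRegionBoundaryCacheException extends RuntimeException {}

    static RuntimeException translate(RuntimeException e) {
        if (e instanceof NotServingRegionException) {
            // Region moved or split: signal a stale boundary cache so the
            // caller retries with fresh region boundaries.
            return new StaleRegionBoundaryCacheException();
        }
        return e; // everything else propagates unchanged
    }

    public static void main(String[] args) {
        RuntimeException out = translate(new NotServingRegionException());
        System.out.println(out instanceof StaleRegionBoundaryCacheException);
        // Prints: true
    }
}
```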





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3481:
--
Attachment: PHOENIX-3481_v3_0.98.patch

v3 patch that makes the change localized in TableResultIterator. 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665671#comment-15665671
 ] 

Samarth Jain commented on PHOENIX-3481:
---

[~jamestaylor], need your keen eyes again on this v2 patch. I am running tests 
locally on the 0.98 branch.

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch, 
> PHOENIX-3481_master_v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3481:
--
Attachment: PHOENIX-3481_master_v2.patch

v2 patch for master branch. Hopefully this will trigger the QA run.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665597#comment-15665597
 ] 

Andrew Purtell commented on PHOENIX-3481:
-

Thanks [~samarth.j...@gmail.com]

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665569#comment-15665569
 ] 

Samarth Jain commented on PHOENIX-3481:
---

[~apurtell] - 
https://github.com/apache/hbase/commit/381fcdcfdfd3bac274090cacfeea7c132ba8dd1e 
looks like the change that you are looking for. 

[~jamestaylor], I ran into another test failure locally with my patch: 
SkipScanAfterManualSplitIT#testManualSplit().

{code}
org.apache.phoenix.exception.PhoenixIOException: 
org.apache.phoenix.exception.PhoenixIOException: callTimeout=120, 
callDuration=9000109: row ' c' on table 'T02' at 
region=T02,,1479169996157.dd7e5b63ff5fdc6ef21fd3e26a6400a5., 
hostname=localhost,54017,1479169931042, seqNum=1
at 
org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
at 
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:752)
at 
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:696)
at 
org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:50)
at 
org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:97)
at 
org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:117)
at 
org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:64)
at 
org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
at 
org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:778)
at 
org.apache.phoenix.end2end.SkipScanAfterManualSplitIT.testManualSplit(SkipScanAfterManualSplitIT.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at 
org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.util.concurrent.ExecutionException: 
org.apache.phoenix.exception.PhoenixIOException: callTimeout=120, 
callDuration=9000109: row ' c' on table 'T02' at 
region=T02,,1479169996157.dd7e5b63ff5fdc6ef21fd3e26a6400a5., 
hostname=localhost,54017,1479169931042, seqNum=1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:202)
at 
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:747)
... 34 more
Caused by: org.apache.phoenix.exception.PhoenixIOException: 
callTimeout=120, callDuration=9000109: row ' c' on table 'T02' at 
region=T02,,1479169996157.dd7e5b63ff5fdc6ef21fd3e26a6400a5., 
hostname=localhost,54017,1479169931042, seqNum=1
at 
org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
at 

[jira] [Updated] (PHOENIX-3469) Incorrect sort order for DESC primary key for NULLS LAST/NULLS FIRST

2016-11-14 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-3469:
--
Attachment: PHOENIX-3469_v3.patch

Patch on top of PHOENIX-3452

> Incorrect sort order for DESC primary key for NULLS LAST/NULLS FIRST
> 
>
> Key: PHOENIX-3469
> URL: https://issues.apache.org/jira/browse/PHOENIX-3469
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: chenglei
>Assignee: chenglei
> Attachments: PHOENIX-3469_v2.patch, PHOENIX-3469_v3.patch
>
>
> This problem can be reproduced as following:
> {code:borderStyle=solid} 
>CREATE TABLE  DESC_TEST (
> ORGANIZATION_ID VARCHAR,
> CONTAINER_ID VARCHAR,
> ENTITY_ID VARCHAR NOT NULL,
> CONSTRAINT TEST_PK PRIMARY KEY ( 
>   ORGANIZATION_ID DESC,
>   CONTAINER_ID DESC,
>   ENTITY_ID
>   ))
>   UPSERT INTO DESC_TEST VALUES ('a',null,'11')
>   UPSERT INTO DESC_TEST VALUES (null,'2','22')
>   UPSERT INTO DESC_TEST VALUES ('c','3','33')
> {code} 
> For the following sql:
> {code:borderStyle=solid}
>   SELECT CONTAINER_ID,ORGANIZATION_ID FROM DESC_TEST  order by 
> CONTAINER_ID ASC NULLS LAST
> {code} 
> the expected result is:
> {code:borderStyle=solid}
>  2,   null 
>  3,c   
> null,  a
> {code} 
> but the actual result is:
> {code:borderStyle=solid}
>   null,  a 
>   2,   null 
>   3,c 
> {code} 
> By debugging the source code, I found that the ScanPlan passes the 
> OrderByExpression to both the ScanRegionObserver and the 
> MergeSortTopNResultIterator at lines 100 and 232, but the OrderByExpression's 
> "isNullsLast" property is false, while for the SQL "order by CONTAINER_ID ASC 
> NULLS LAST" the "isNullsLast" property should be true.
> {code:borderStyle=solid}
>  90private ScanPlan(StatementContext context, FilterableStatement 
> statement, TableRef table, RowProjector projector, Integer limit, Integer 
> offset, OrderBy orderBy, ParallelIteratorFactory parallelIteratorFactory, 
> boolean allowPageFilter, Expression dynamicFilter) throws SQLException {
>  ..   
> 95  boolean isOrdered = !orderBy.getOrderByExpressions().isEmpty();
> 96 if (isOrdered) { // TopN
> 97   int thresholdBytes = 
> context.getConnection().getQueryServices().getProps().getInt(
> 98   QueryServices.SPOOL_THRESHOLD_BYTES_ATTRIB, 
> QueryServicesOptions.DEFAULT_SPOOL_THRESHOLD_BYTES);
> 99   ScanRegionObserver.serializeIntoScan(context.getScan(), 
> thresholdBytes, 
> 100  limit == null ? -1 : QueryUtil.getOffsetLimit(limit, 
> offset),  orderBy.getOrderByExpressions(),
> 101  projector.getEstimatedRowByteSize());
> 102   }
> ..
> 231} else if (isOrdered) {
> 232scanner = new MergeSortTopNResultIterator(iterators, limit, 
> offset, orderBy.getOrderByExpressions());
> {code} 
> So the problem is caused by the OrderByCompiler: at line 144 it should not 
> negate "isNullsLast", because "isNullsLast" should not be influenced by the 
> SortOrder, no matter whether it is DESC or ASC:
> {code:borderStyle=solid}
> 142  if (expression.getSortOrder() == SortOrder.DESC) {
> 143 isAscending = !isAscending;
> 144 isNullsLast = !isNullsLast;
> 145 }
> {code} 
> I have included more IT test cases in my patch. 
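The fix described above can be reduced to a small, self-contained sketch (a hypothetical class, not the actual Phoenix OrderByCompiler): a DESC-stored column inverts the scan direction, but must leave the NULLS FIRST/NULLS LAST preference untouched.

```java
// Hypothetical sketch of the corrected OrderByCompiler logic (not Phoenix
// source): only the direction flips for a DESC-stored column; the buggy
// code also flipped isNullsLast here.
class OrderByFlags {
    final boolean isAscending;
    final boolean isNullsLast;

    OrderByFlags(boolean isAscending, boolean isNullsLast) {
        this.isAscending = isAscending;
        this.isNullsLast = isNullsLast;
    }

    /** sortOrderDesc is true when the column is stored DESC in the row key. */
    static OrderByFlags resolve(boolean requestedAscending,
                                boolean requestedNullsLast,
                                boolean sortOrderDesc) {
        boolean isAscending = requestedAscending;
        if (sortOrderDesc) {
            // Invert only the scan direction, never the NULLS preference.
            isAscending = !isAscending;
        }
        return new OrderByFlags(isAscending, requestedNullsLast);
    }
}
```

With this logic, "ORDER BY col ASC NULLS LAST" over a DESC-stored column becomes a descending scan that still honors NULLS LAST.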





[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665530#comment-15665530
 ] 

Andrew Purtell commented on PHOENIX-3481:
-

I will bisect to find the change that causes it, so we know

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch
>
>






[jira] [Updated] (PHOENIX-3452) NULLS FIRST/NULL LAST should not impact whether GROUP BY is order preserving

2016-11-14 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-3452:
--
Summary: NULLS FIRST/NULL LAST should not impact whether GROUP BY is order 
preserving  (was: Secondary index and query using distinct: ORDER BY doesn't 
work correctly)

> NULLS FIRST/NULL LAST should not impact whether GROUP BY is order preserving
> 
>
> Key: PHOENIX-3452
> URL: https://issues.apache.org/jira/browse/PHOENIX-3452
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Joel Palmert
>Assignee: chenglei
> Attachments: PHOENIX-3452_v2.patch, PHOENIX-3452_v3.patch
>
>
> This may be related to PHOENIX-3451 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
> ORGANIZATION_ID CHAR(15) NOT NULL,
> CONTAINER_ID CHAR(15) NOT NULL,
> ENTITY_ID CHAR(15) NOT NULL,
> SCORE DOUBLE,
> CONSTRAINT TEST_PK PRIMARY KEY (
> ORGANIZATION_ID,
> CONTAINER_ID,
> ENTITY_ID
> )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org1','container1','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org1','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org1','container1','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org1','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org1','container1','entityId2',1.5);
> UPSERT INTO test.test VALUES ('org1','container1','entityId1',1.6);
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org1'
> AND container_id = 'container1'
> ORDER BY score DESC
> Notice that the returned results are not returned in descending score order. 
> Instead they are returned in descending entity_id order. If I remove the 
> DISTINCT or remove the secondary index the result is correct.





[jira] [Updated] (PHOENIX-3469) Incorrect sort order for DESC primary key for NULLS LAST/NULLS FIRST

2016-11-14 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-3469:
--
Summary: Incorrect sort order for DESC primary key for NULLS LAST/NULLS 
FIRST  (was: Once a column in primary key or index is DESC, the corresponding 
order by  NULLS LAST/NULLS FIRST may work incorrectly)

> Incorrect sort order for DESC primary key for NULLS LAST/NULLS FIRST
> 
>
> Key: PHOENIX-3469
> URL: https://issues.apache.org/jira/browse/PHOENIX-3469
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: chenglei
>Assignee: chenglei
> Attachments: PHOENIX-3469_v2.patch
>
>
> This problem can be reproduced as following:
> {code:borderStyle=solid} 
>CREATE TABLE  DESC_TEST (
> ORGANIZATION_ID VARCHAR,
> CONTAINER_ID VARCHAR,
> ENTITY_ID VARCHAR NOT NULL,
> CONSTRAINT TEST_PK PRIMARY KEY ( 
>   ORGANIZATION_ID DESC,
>   CONTAINER_ID DESC,
>   ENTITY_ID
>   ))
>   UPSERT INTO DESC_TEST VALUES ('a',null,'11')
>   UPSERT INTO DESC_TEST VALUES (null,'2','22')
>   UPSERT INTO DESC_TEST VALUES ('c','3','33')
> {code} 
> For the following sql:
> {code:borderStyle=solid}
>   SELECT CONTAINER_ID,ORGANIZATION_ID FROM DESC_TEST  order by 
> CONTAINER_ID ASC NULLS LAST
> {code} 
> the expected result is:
> {code:borderStyle=solid}
>  2,   null 
>  3,c   
> null,  a
> {code} 
> but the actual result is:
> {code:borderStyle=solid}
>   null,  a 
>   2,   null 
>   3,c 
> {code} 
> By debugging the source code, I found that the ScanPlan passes the 
> OrderByExpression to both the ScanRegionObserver and the 
> MergeSortTopNResultIterator at lines 100 and 232, but the OrderByExpression's 
> "isNullsLast" property is false, while for the SQL "order by CONTAINER_ID ASC 
> NULLS LAST" the "isNullsLast" property should be true.
> {code:borderStyle=solid}
>  90private ScanPlan(StatementContext context, FilterableStatement 
> statement, TableRef table, RowProjector projector, Integer limit, Integer 
> offset, OrderBy orderBy, ParallelIteratorFactory parallelIteratorFactory, 
> boolean allowPageFilter, Expression dynamicFilter) throws SQLException {
>  ..   
> 95  boolean isOrdered = !orderBy.getOrderByExpressions().isEmpty();
> 96 if (isOrdered) { // TopN
> 97   int thresholdBytes = 
> context.getConnection().getQueryServices().getProps().getInt(
> 98   QueryServices.SPOOL_THRESHOLD_BYTES_ATTRIB, 
> QueryServicesOptions.DEFAULT_SPOOL_THRESHOLD_BYTES);
> 99   ScanRegionObserver.serializeIntoScan(context.getScan(), 
> thresholdBytes, 
> 100  limit == null ? -1 : QueryUtil.getOffsetLimit(limit, 
> offset),  orderBy.getOrderByExpressions(),
> 101  projector.getEstimatedRowByteSize());
> 102   }
> ..
> 231} else if (isOrdered) {
> 232scanner = new MergeSortTopNResultIterator(iterators, limit, 
> offset, orderBy.getOrderByExpressions());
> {code} 
> So the problem is caused by the OrderByCompiler: at line 144 it should not 
> negate "isNullsLast", because "isNullsLast" should not be influenced by the 
> SortOrder, no matter whether it is DESC or ASC:
> {code:borderStyle=solid}
> 142  if (expression.getSortOrder() == SortOrder.DESC) {
> 143 isAscending = !isAscending;
> 144 isNullsLast = !isNullsLast;
> 145 }
> {code} 
> I have included more IT test cases in my patch. 





[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665462#comment-15665462
 ] 

James Taylor commented on PHOENIX-3481:
---

+1. Thanks, [~samarthjain]. Weird that this just started happening for 0.98.21+.

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch
>
>






[jira] [Updated] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3481:
--
Attachment: PHOENIX-3481_master.patch

Attaching patch for master branch to get a qa run. 

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch, PHOENIX-3481_master.patch
>
>






[jira] [Updated] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3481:
--
Attachment: PHOENIX-3481.patch

The reason is that there is a race condition between the call initiated by 
Phoenix to asynchronously modify the HBase metadata for SYSTEM.CATALOG and the 
call to put the Phoenix metadata for SYSTEM.CATALOG. The simple fix is to wait 
for the async call to finish.

[~jamestaylor], please review.
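As a rough illustration of the fix (a hypothetical helper, not the actual patch), waiting for an asynchronous metadata change amounts to polling a completion check until it succeeds or a deadline passes:

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not Phoenix source): polls a completion check until
// it returns true or the timeout elapses, mirroring the idea of waiting for
// the async HBase metadata change before writing the Phoenix metadata row.
class AsyncWait {
    static boolean waitFor(BooleanSupplier done, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (done.getAsBoolean()) {
                return true;       // async operation finished
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;      // interrupted; caller surfaces an error
            }
        }
        return false;              // timed out; caller surfaces an error
    }
}
```

The caller would pass a check that inspects whether the table modification has completed, and fail the connection initialization if the wait times out.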

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3481.patch
>
>






[jira] [Commented] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665378#comment-15665378
 ] 

Samarth Jain commented on PHOENIX-3481:
---

To reproduce this issue, run any IT test. The startup fails with the error 
below:

{code}
2016-11-14 15:27:05,427 WARN  [main] 
org.apache.hadoop.hbase.client.HTable(1725): Error calling coprocessor service 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row 
\x00SYSTEM\x00CATALOG
java.util.concurrent.ExecutionException: java.net.SocketTimeoutException: 
callTimeout=120, callDuration=9000103: row 'SYSTEMCATALOG' on table 
'SYSTEM.CATALOG' at 
region=SYSTEM.CATALOG,,1479166015987.0c94299fe4e5271c0b7a12aba74707d7., 
hostname=localhost,52826,1479165992749, seqNum=1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at 
org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1723)
at 
org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1680)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1283)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1263)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1448)
at 
org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2275)
at 
org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:939)
at 
org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:211)
at 
org.apache.phoenix.jdbc.PhoenixStatement$3.call(PhoenixStatement.java:355)
at 
org.apache.phoenix.jdbc.PhoenixStatement$3.call(PhoenixStatement.java:1)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:337)
at 
org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1440)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2410)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:1)
at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2358)
at 
org.apache.phoenix.jdbc.PhoenixTestDriver.getConnectionQueryServices(PhoenixTestDriver.java:96)
at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:147)
at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:141)
at 
org.apache.phoenix.jdbc.PhoenixTestDriver.connect(PhoenixTestDriver.java:83)
at 
org.apache.phoenix.query.BaseTest.initAndRegisterTestDriver(BaseTest.java:680)
at org.apache.phoenix.query.BaseTest.setUpTestDriver(BaseTest.java:564)
at org.apache.phoenix.query.BaseTest.setUpTestDriver(BaseTest.java:557)
at 
org.apache.phoenix.end2end.ParallelStatsDisabledIT.doSetup(ParallelStatsDisabledIT.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at 
org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.net.SocketTimeoutException: 

[jira] [Updated] (PHOENIX-3481) Phoenix initialization fails for HBase 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3481:
--
Summary: Phoenix initialization fails for HBase 0.98.21 and beyond  (was: 
Phoenix initialization fails for 0.98.21 and beyond)

> Phoenix initialization fails for HBase 0.98.21 and beyond
> -
>
> Key: PHOENIX-3481
> URL: https://issues.apache.org/jira/browse/PHOENIX-3481
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
>






[jira] [Created] (PHOENIX-3481) Phoenix initialization fails for 0.98.21 and beyond

2016-11-14 Thread Samarth Jain (JIRA)
Samarth Jain created PHOENIX-3481:
-

 Summary: Phoenix initialization fails for 0.98.21 and beyond
 Key: PHOENIX-3481
 URL: https://issues.apache.org/jira/browse/PHOENIX-3481
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain
Assignee: Samarth Jain
 Fix For: 4.9.0








Re: Coprocessor metrics

2016-11-14 Thread Enis Söztutar
Thanks everyone, this is really helpful. If we can leverage in HBase the work
Josh/Nick already did in Avatica, that will be really good.

It seems the consensus is to follow approach #2 and pay the price of
replicating the API layer in HBase, for the convenience of coprocessors and
to avoid tying ourselves to a third-party API + implementation.

However, even if we do an Avatica release, HBase depending on the metrics API
in Avatica is the same thing as HBase depending on dropwizard directly,
since HBase does not "control" the Avatica API either. At this point,
blindly forking the code inside HBase seems like the way to go (possibly in
its own module).

Let me poke around, and fork the code if possible inside HBase. I'll send
reviews your way.

Enis

On Mon, Nov 14, 2016 at 7:58 AM, Josh Elser  wrote:

> Yep -- see avatica-metrics[1], avatica-dropwizard-metrics3[2], and my
> dropwizard-hadoop-metrics2[3] project for what Nick is referring to.
>
> What I ended up doing in Calcite/Avatica was a step beyond your #3, Enis.
> Instead of choosing a subset of some standard metrics library to expose, I
> "re-built" the actual API that I wanted to expose. At the end of the day,
> the API I "built" was nearly 100% what dropwizard metrics' API was. I like
> the dropwizard-metrics API; however, we wanted to avoid the strong coupling
> to a single metrics implementation.
>
> My current feeling is that external API should never include
> classes/interfaces which you don't "own". Re-building the API that already
> exists is pedantic, but I think it's a really good way to pay down the
> maintenance debt (whenever the next metrics library "hotness" takes off).
>
> If it's amenable to you, Enis, I'm happy to work with you to do whatever
> decoupling of this metrics abstraction away from the "core" of Avatica
> (e.g. presently, a new update of the library would also require a full
> release of Avatica which is no-good for HBase). I think a lot of the
> lifting I've done already would be reusable by you and help make a better
> product at the end of the day.
>
> - Josh
>
> [1] https://github.com/apache/calcite/tree/master/avatica/metrics
> [2] https://github.com/apache/calcite/tree/master/avatica/metrics-dropwizardmetrics3
> [3] https://github.com/joshelser/dropwizard-hadoop-metrics2
>
>
> Nick Dimiduk wrote:
>
>> IIRC, the plan is to get off of Hadoop Metrics2, so I am in favor of
>> either
>> (2) or (3). Specifically for (3), I believe there is an implementation for
>> translating Dropwizard Metrics to Hadoop Metrics2, in or around Avatica
>> and/or Phoenix Query Server.
>>
>> On Fri, Nov 11, 2016 at 3:15 PM, Enis Söztutar  wrote:
>>
>> HBase / Phoenix devs,
>>>
>>> I would like to solicit early feedback on the design approach that we
>>> would
>>> pursue for exposing coprocessor metrics. It has implications for our
>>> compatibility, so lets try to have some consensus. Added Phoenix devs as
>>> well since this will affect how coprocessors can emit metrics via region
>>> server metrics bus.
>>>
>>> The issue is HBASE-9774 [1].
>>>
>>>
>>> We have a couple of options:
>>>
>>> (1) Expose Hadoop Metrics2 + HBase internal classes (like BaseSourceImpl,
>>> MutableFastCounter, FastLongHistogram, etc). This option is the least
>>> amount of work in terms of defining the API. We would mark the important
>>> classes with LimitedPrivate(Coprocessor) and have the coprocessors each
>>> write their metrics source classes separately. The disadvantage would be
>>> that some of the internal APIs become public and have to be evolved with
>>> regard to coprocessor API compatibility. It would also make it easier to
>>> break coprocessors across minor releases.
>>> (2) Build a Metrics subset API in HBase to abstract away HBase metrics
>>> classes and Hadoop2 metrics classes and expose this API only. The API
>>> will
>>> probably be limited and will be a small subset. HBase internals do not
>>> need
>>> to be changed that much, but the API has to be kept
>>> LimitedPrivate(Coprocessor) with the compatibility implications.
>>> (3) Expose (a limited subset of) third-party API to the coprocessors
>>> (like
>>> Yammer metrics) and never expose internal HBase / Hadoop implementation.
>>> Build a translation layer between the yammer metrics and our Hadoop
>>> metrics
>>> 2 implementation so that things will still work. If we end up changing
>>> the
>>> implementation, existing coprocessors will not be affected. The downside
>>> is
>>> that whatever API we agree to expose becomes our compatibility
>>> point.
>>> We cannot change that dependency version unless it is acceptable via our
>>> compatibility guidelines.
>>>
>>> Personally, I would like to pursue option (3) especially with Yammer
>>> metrics since we do not have to build yet another API endpoint. Hadoop's
>>> metrics API is not the best and we do not know whether we will end up
>>> changing that dependency. What do you guys think?
>>>
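The approach #2 that Josh describes, re-building a small metrics API that HBase itself owns, can be sketched roughly like this (hypothetical names, not actual HBase interfaces): coprocessors see only the owned types, while any backend (Hadoop metrics2, dropwizard, ...) is adapted behind them.

```java
import java.util.concurrent.atomic.AtomicLong;

// API that the project would "own" and expose to coprocessors.
interface Counter {
    void increment();
    long getCount();
}

// Factory owned by the project; backends plug in behind this interface.
interface MetricRegistry {
    Counter counter(String name);
}

// One possible in-memory backend; a dropwizard or metrics2 adapter would
// implement the same interfaces without leaking their types into the API.
class SimpleRegistry implements MetricRegistry {
    @Override
    public Counter counter(String name) {
        final AtomicLong value = new AtomicLong();
        return new Counter() {
            public void increment() { value.incrementAndGet(); }
            public long getCount() { return value.get(); }
        };
    }
}
```

Swapping the metrics implementation then only requires a new `MetricRegistry`, never a change to coprocessor code.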

Re: Coprocessor metrics

2016-11-14 Thread Andrew Purtell
+1

We also need to get away from Guava's Service interface for replication
endpoint plugins using this same approach (#2): HBASE-15982

On Mon, Nov 14, 2016 at 10:37 AM, Gary Helmling  wrote:

> >
> >
> > My current feeling is that external API should never include
> > classes/interfaces which you don't "own". Re-building the API that
> > already exists is pedantic, but I think it's a really good way to pay
> > down the maintenance debt (whenever the next metrics library "hotness"
> > takes off).
> >
> >
> +1 to this.  I'd be very hesitant to tie ourselves too strongly to a
> specific implementation, even if it is just copying an interface.
>
> For coprocessors specifically, I think we can start with a limited API
> exposing common metric types and evolve it from there.  But starting simple
> seems key.
>
> So #2 seems like the right approach to me.
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)


Re: Coprocessor metrics

2016-11-14 Thread Gary Helmling
>
>
> My current feeling is that external API should never include
> classes/interfaces which you don't "own". Re-building the API that
> already exists is pedantic, but I think it's a really good way to pay
> down the maintenance debt (whenever the next metrics library "hotness"
> takes off).
>
>
+1 to this.  I'd be very hesitant to tie ourselves too strongly to a
specific implementation, even if it is just copying an interface.

For coprocessors specifically, I think we can start with a limited API
exposing common metric types and evolve it from there.  But starting simple
seems key.

So #2 seems like the right approach to me.


[jira] [Commented] (PHOENIX-3241) Convert_tz doesn't allow timestamp data type

2016-11-14 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664597#comment-15664597
 ] 

Josh Elser commented on PHOENIX-3241:
-

Thanks, [~an...@apache.org]!

[~jamestaylor], unless I hear otherwise from you, I will wait for 4.9.0 to 
finish before landing this.
[~lhofhansl] ditto for 4.8.2

> Convert_tz doesn't allow timestamp data type
> 
>
> Key: PHOENIX-3241
> URL: https://issues.apache.org/jira/browse/PHOENIX-3241
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Josh Elser
> Fix For: 4.10.0
>
> Attachments: PHOENIX-3241.002.patch, PHOENIX-3241.003.patch, 
> PHOENIX-3241.patch
>
>
> As per the documentation, CONVERT_TZ allows the TIMESTAMP data type, but as 
> per the code only the DATE data type is allowed.





[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664118#comment-15664118
 ] 

chenglei edited comment on PHOENIX-3451 at 11/14/16 4:19 PM:
-

This bug is caused by the OrderByCompiler; my analysis is as follows:

Take the following table, described by [~jpalmert], as an example:
{code:borderStyle=solid} 
 CREATE TABLE IF NOT EXISTS TEST.TEST (
ORGANIZATION_ID CHAR(15) NOT NULL,
CONTAINER_ID CHAR(15) NOT NULL,
ENTITY_ID CHAR(15) NOT NULL,
SCORE DOUBLE,
CONSTRAINT TEST_PK PRIMARY KEY (
   ORGANIZATION_ID,
   CONTAINER_ID,
   ENTITY_ID
 )
 )

CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);

UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
{code} 

For the following SQL query:
{code:borderStyle=solid} 
SELECT DISTINCT entity_id, score
FROM test.test
WHERE organization_id = 'org2'
AND container_id IN ( 'container1','container2','container3' )
ORDER BY score DESC
LIMIT 2
{code} 

Phoenix would use the index table TEST_SCORE to execute the query:
{code:borderStyle=solid} 
CREATE INDEX IF NOT EXISTS TEST_SCORE ON 
TEST.TEST(ORGANIZATION_ID, CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);
{code} 

Using that index is fine; the problem is that the OrderByCompiler treats the 
OrderBy as OrderBy.FWD_ROW_KEY_ORDER_BY. Because the WHERE condition is 
"container_id IN ( 'container1','container2','container3' )", the OrderBy is 
obviously not OrderBy.FWD_ROW_KEY_ORDER_BY.
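To see why this matters, here is a minimal sketch (illustrative data only, not Phoenix internals): the index rows are sorted by (CONTAINER_ID, SCORE DESC), so a scan over several CONTAINER_ID ranges is not globally sorted by SCORE, and applying LIMIT 2 in raw scan order returns the wrong rows.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch, not Phoenix code: rows appear in index row-key order
// (sorted by container, then score desc WITHIN each container), which is not
// a global score ordering.
public class ScanOrderDemo {
    // rows as {containerId, entityId, score}, in index scan order
    static final Object[][] ROWS = {
        {"container1", "entityId3", 1.4},
        {"container1", "entityId5", 1.2},
        {"container2", "entityId4", 1.3},
        {"container2", "entityId6", 1.1},
        {"container3", "entityId8", 1.45},
        {"container3", "entityId7", 1.35},
    };

    // What a plan wrongly marked FWD_ROW_KEY_ORDER_BY does: stop after 2 rows.
    static String[] firstTwoInScanOrder() {
        return new String[] { (String) ROWS[0][1], (String) ROWS[1][1] };
    }

    // What the query actually asks for: global ORDER BY score DESC LIMIT 2.
    static String[] topTwoByScore() {
        Object[][] sorted = ROWS.clone();
        Arrays.sort(sorted,
            Comparator.comparingDouble((Object[] r) -> (double) r[2]).reversed());
        return new String[] { (String) sorted[0][1], (String) sorted[1][1] };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(firstTwoInScanOrder())); // [entityId3, entityId5]
        System.out.println(Arrays.toString(topTwoByScore()));       // [entityId8, entityId3]
    }
}
```

Both LIMIT-2 answers come from the same six rows, yet they differ, so the scan order cannot be taken as the query's order.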

When we look into OrderByCompiler's compile method, at line 123 the "score" 
ColumnParseNode in "ORDER BY score DESC" accepts an ExpressionCompiler 
visitor:
{code:borderStyle=solid} 
123    expression = node.getNode().accept(compiler);
124    // Detect mix of aggregate and non aggregates (i.e. ORDER BY txns, SUM(txns)
125    if (!expression.isStateless() && !compiler.isAggregate()) {
126        if (statement.isAggregate() || statement.isDistinct()) {
127            // Detect ORDER BY not in SELECT DISTINCT: SELECT DISTINCT count(*) FROM t ORDER BY x
128            if (statement.isDistinct()) {
129                throw new SQLExceptionInfo.Builder(SQLExceptionCode.ORDER_BY_NOT_IN_SELECT_DISTINCT)
130                    .setMessage(expression.toString()).build().buildException();
131            }
132            ExpressionCompiler.throwNonAggExpressionInAggException(expression.toString());
{code} 
In ExpressionCompiler's visit method, the "score" ColumnParseNode is converted 
to a KeyValueColumnExpression at line 408; then at line 409 the 
wrapGroupByExpression method is invoked:
{code:borderStyle=solid} 
393    public Expression visit(ColumnParseNode node) throws SQLException {
...
408        Expression expression = ref.newColumnExpression(node.isTableNameCaseSensitive(), node.isCaseSensitive());
409        Expression wrappedExpression = wrapGroupByExpression(expression);
{code} 
In the wrapGroupByExpression method, because the "score" column is in 
groupBy.getExpressions(), which is "[ENTITY_ID, SCORE]", the 
KeyValueColumnExpression is replaced by a RowKeyColumnExpression at line 291; 
and because the index of the "score" column in "[ENTITY_ID, SCORE]" is 1, the 
return value of RowKeyColumnExpression's position method is 1:

{code:borderStyle=solid} 
282    private Expression wrapGroupByExpression(Expression expression) {
...
286        if (aggregateFunction == null) {
287            int index = groupBy.getExpressions().indexOf(expression);
288            if (index >= 0) {
289                isAggregate = true;
290                RowKeyValueAccessor accessor = new RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
291                expression = new RowKeyColumnExpression(expression, accessor, groupBy.getKeyExpressions().get(index).getDataType());
292            }
293        }
294        return expression;
295    }
{code} 
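The position computed here can be shown with a minimal sketch (class and method names are illustrative, not Phoenix classes): after GROUP BY, the ORDER BY column is rewritten as a row-key reference whose position is its index within the GROUP BY expression list [ENTITY_ID, SCORE], not its position in the index table's row key.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: the "pk position" assigned to an ORDER BY column is
// simply its indexOf() within the GROUP BY expression list.
public class GroupByPositionDemo {
    static int pkPositionAfterGroupBy(List<String> groupByExprs, String orderByExpr) {
        return groupByExprs.indexOf(orderByExpr); // -1 if not a GROUP BY expression
    }

    public static void main(String[] args) {
        List<String> groupBy = Arrays.asList("ENTITY_ID", "SCORE");
        System.out.println(pkPositionAfterGroupBy(groupBy, "SCORE")); // 1
    }
}
```

So "SCORE" gets position 1 relative to the post-aggregation key, which is what the tracker later sees.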

So when OrderByCompiler's compile method invokes OrderPreservingTracker's 
track method, at line 108 the returned Info's pkPosition is 1:

{code:borderStyle=solid} 
106    public void track(Expression node, SortOrder sortOrder, boolean isNullsLast) {
107        if (isOrderPreserving) {
108            Info info = 

[jira] [Updated] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated PHOENIX-3451:
--
Attachment: PHOENIX-3451.diff

> Secondary index and query using distinct: LIMIT doesn't return the first rows
> -
>
> Key: PHOENIX-3451
> URL: https://issues.apache.org/jira/browse/PHOENIX-3451
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Joel Palmert
>Assignee: chenglei
> Attachments: PHOENIX-3451.diff
>
>
> This may be related to PHOENIX-3452 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
> ORGANIZATION_ID CHAR(15) NOT NULL,
> CONTAINER_ID CHAR(15) NOT NULL,
> ENTITY_ID CHAR(15) NOT NULL,
> SCORE DOUBLE,
> CONSTRAINT TEST_PK PRIMARY KEY (
> ORGANIZATION_ID,
> CONTAINER_ID,
> ENTITY_ID
> )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
> UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
> EXPLAIN
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org2'
> AND container_id IN ( 'container1','container2','container3' )
> ORDER BY score DESC
> LIMIT 2
> OUTPUT
> entityId5 1.2
> entityId3 1.4
> The expected output would be
> entityId8 1.45
> entityId3 1.4
> You will get the expected output if you remove the secondary index from the 
> table or remove distinct from the query.
> As described in PHOENIX-3452, if you run the query without the LIMIT the 
> ordering is not correct. However, the first 2 results in that ordering are 
> still not the ones returned by the LIMIT clause, which makes me think there 
> are multiple issues here and why I filed both separately. The rows being 
> returned are the ones assigned to container1. It looks like Phoenix first 
> gets the rows from the first container and, when it finds that to be enough, 
> stops the scan. What it should be doing is getting 2 results for each 
> container, then merging them, and then applying the limit again.
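The merge-then-limit behavior the reporter expects can be sketched as follows (illustrative code, not Phoenix internals): take the top-2 scores per container, merge them, then apply the global ORDER BY score DESC LIMIT 2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: per-container top-N results must be merged and
// re-sorted globally before the final LIMIT is applied.
public class MergeLimitDemo {
    static List<Double> topTwoAcrossContainers(Map<String, double[]> top2PerContainer) {
        List<Double> merged = new ArrayList<>();
        for (double[] top2 : top2PerContainer.values())
            for (double s : top2) merged.add(s);    // merge per-container top-2
        merged.sort(Collections.reverseOrder());    // global ORDER BY score DESC
        return merged.subList(0, 2);                // LIMIT 2
    }

    public static void main(String[] args) {
        Map<String, double[]> perContainer = new LinkedHashMap<>();
        perContainer.put("container1", new double[]{1.4, 1.2});
        perContainer.put("container2", new double[]{1.3, 1.1});
        perContainer.put("container3", new double[]{1.45, 1.35});
        System.out.println(topTwoAcrossContainers(perContainer)); // [1.45, 1.4]
    }
}
```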



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Coprocessor metrics

2016-11-14 Thread Josh Elser
Yep -- see avatica-metrics[1], avatica-dropwizard-metrics3[2], and my 
dropwizard-hadoop-metrics2[3] project for what Nick is referring to.


What I ended up doing in Calcite/Avatica was a step beyond your #3, 
Enis. Instead of choosing a subset of some standard metrics library to 
expose, I "re-built" the actual API that I wanted to expose. At the end 
of the day, the API I "built" was nearly 100% what dropwizard metrics' 
API was. I like the dropwizard-metrics API; however, we wanted to avoid 
the strong coupling to a single metrics implementation.


My current feeling is that an external API should never include 
classes/interfaces which you don't "own". Re-building the API that 
already exists is pedantic, but I think it's a really good way to pay 
down the maintenance debt (whenever the next metrics library "hotness" 
takes off).


If it's amenable to you, Enis, I'm happy to work with you to do whatever 
decoupling of this metrics abstraction away from the "core" of Avatica 
(e.g. presently, a new update of the library would also require a full 
release of Avatica which is no-good for HBase). I think a lot of the 
lifting I've done already would be reusable by you and help make a 
better product at the end of the day.


- Josh

[1] https://github.com/apache/calcite/tree/master/avatica/metrics
[2] 
https://github.com/apache/calcite/tree/master/avatica/metrics-dropwizardmetrics3

[3] https://github.com/joshelser/dropwizard-hadoop-metrics2

Nick Dimiduk wrote:

IIRC, the plan is to get off of Hadoop Metrics2, so I am in favor of either
(2) or (3). Specifically for (3), I believe there is an implementation for
translating Dropwizard Metrics to Hadoop Metrics2, in or around Avatica
and/or Phoenix Query Server.

On Fri, Nov 11, 2016 at 3:15 PM, Enis Söztutar  wrote:


HBase / Phoenix devs,

I would like to solicit early feedback on the design approach that we would
pursue for exposing coprocessor metrics. It has implications for our
compatibility, so let's try to have some consensus. Added Phoenix devs as
well since this will affect how coprocessors can emit metrics via region
server metrics bus.

The issue is HBASE-9774 [1].


We have a couple of options:

(1) Expose Hadoop Metrics2 + HBase internal classes (like BaseSourceImpl,
MutableFastCounter, FastLongHistogram, etc). This option is the least
amount of work in terms of defining the API. We would mark the important
classes with LimitedPrivate(Coprocessor) and have the coprocessors each
write their metrics source classes separately. The disadvantage would be
that some of the internal APIs become public and have to be evolved with 
regard to coprocessor API compatibility. It would also make it easier to 
break coprocessors across minor releases.
(2) Build a Metrics subset API in HBase to abstract away HBase metrics
classes and Hadoop2 metrics classes and expose this API only. The API will
probably be limited and will be a small subset. HBase internals do not need
to be changed that much, but the API has to be kept
LimitedPrivate(Coprocessor) with the compatibility implications.
(3) Expose (a limited subset of) third-party API to the coprocessors (like
Yammer metrics) and never expose internal HBase / Hadoop implementation.
Build a translation layer between the yammer metrics and our Hadoop metrics
2 implementation so that things will still work. If we end up changing the
implementation, existing coprocessors will not be affected. The downside is
that whatever API that we agree to expose becomes our compatibility point.
We cannot change that dependency version unless it is acceptable via our
compatibility guidelines.
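The decoupling behind options (2) and (3) can be sketched with an owned facade (all names here are hypothetical, not HBase or Avatica code): coprocessors program against the Counter and MetricsRegistry interfaces only, while the backing store (here a plain AtomicLong; in practice e.g. Dropwizard metrics or Hadoop Metrics2) can be swapped without touching coprocessor code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical facade sketch: only the two interfaces are "API";
// the implementations are an internal detail that can be replaced.
public class MetricsFacadeDemo {
    public interface Counter { void increment(); long getCount(); }
    public interface MetricsRegistry { Counter counter(String name); }

    // One possible backing implementation; never exposed to callers.
    static final class AtomicCounter implements Counter {
        private final AtomicLong value = new AtomicLong();
        @Override public void increment() { value.incrementAndGet(); }
        @Override public long getCount() { return value.get(); }
    }

    static final class SimpleRegistry implements MetricsRegistry {
        private final Map<String, Counter> counters = new ConcurrentHashMap<>();
        @Override public Counter counter(String name) {
            // Same name always yields the same counter instance.
            return counters.computeIfAbsent(name, n -> new AtomicCounter());
        }
    }

    public static void main(String[] args) {
        MetricsRegistry registry = new SimpleRegistry();
        Counter calls = registry.counter("coprocessor.invocations");
        calls.increment();
        calls.increment();
        System.out.println(calls.getCount()); // 2
    }
}
```

The trade-off named in option (3) still applies: whatever surface these interfaces expose becomes the compatibility point.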

Personally, I would like to pursue option (3) especially with Yammer
metrics since we do not have to build yet another API endpoint. Hadoop's
metrics API is not the best and we do not know whether we will end up
changing that dependency. What do you guys think?


[1] https://issues.apache.org/jira/browse/HBASE-9774





[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664118#comment-15664118
 ] 

chenglei edited comment on PHOENIX-3451 at 11/14/16 3:21 PM:
-

This bug is caused by the OrderByCompiler,my analysis of this bug is as follows:

take following table which describe by [~jpalmert]  as a example : 
{code:borderStyle=solid} 
 CREATE TABLE IF NOT EXISTS TEST.TEST (
ORGANIZATION_ID CHAR(15) NOT NULL,
CONTAINER_ID CHAR(15) NOT NULL,
ENTITY_ID CHAR(15) NOT NULL,
SCORE DOUBLE,
CONSTRAINT TEST_PK PRIMARY KEY (
   ORGANIZATION_ID,
   CONTAINER_ID,
   ENTITY_ID
 )
 )

CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);

UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
{code} 

for the following query sql :
{code:borderStyle=solid} 
SELECT DISTINCT entity_id, score
FROM test.test
WHERE organization_id = 'org2'
AND container_id IN ( 'container1','container2','container3' )
ORDER BY score DESC
LIMIT 2
{code} 

the phoenix would use the following index table  TEST_SCORE to do the query:
{code:borderStyle=solid} 
CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);
{code} 

Using that index is good,the problem is that the OrderByCompiler thinking the 
OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY,but because we can see the where 
condition is "container_id IN ( 'container1','container2','container3' )", 
obviously OrderBy is not OrderBy.FWD_ROW_KEY_ORDER_BY.

When we look into  OrderByCompiler's compile method, in line 123,  the  "score" 
ColumnParseNode in  "ORDER BY score DESC" accepts a   ExpressionCompiler 
visitor:
{code:borderStyle=solid} 
123expression = node.getNode().accept(compiler);
124// Detect mix of aggregate and non aggregates (i.e. ORDER BY 
txns, SUM(txns)
125if (!expression.isStateless() && !compiler.isAggregate()) {
126if (statement.isAggregate() || statement.isDistinct()) {
127// Detect ORDER BY not in SELECT DISTINCT: SELECT 
DISTINCT count(*) FROM t ORDER BY x
128if (statement.isDistinct()) {
129throw new 
SQLExceptionInfo.Builder(SQLExceptionCode.ORDER_BY_NOT_IN_SELECT_DISTINCT)
130
.setMessage(expression.toString()).build().buildException();
131}
132
ExpressionCompiler.throwNonAggExpressionInAggException(expression.toString());
{code} 
In ExpressionCompiler 's visit method,the "score" ColumnParseNode is converted 
to a KeyValueColumnExpression in line 408,then in line 
409,wrapGroupByExpression method is invoked:
{code:borderStyle=solid} 
393  public Expression visit(ColumnParseNode node) throws SQLException {
 
408   Expression expression = 
ref.newColumnExpression(node.isTableNameCaseSensitive(), 
node.isCaseSensitive());
409   Expression wrappedExpression = 
wrapGroupByExpression(expression);
{code} 
in wrapGroupByExpression method,because the "score" column  is in 
groupBy.getExpressions(),which is "[ENTITY_ID, SCORE]",so 
KeyValueColumnExpression  is replaced by RowKeyColumnExpression in line 291, 
and because the index of "score" column in "[ENTITY_ID, SCORE]" is 1,so the 
return value of RowKeyColumnExpression 's position method is 1 :

{code:borderStyle=solid} 
282   private Expression wrapGroupByExpression(Expression expression) {
.
286if (aggregateFunction == null) {
287int index = groupBy.getExpressions().indexOf(expression);
288if (index >= 0) {
289isAggregate = true;
290RowKeyValueAccessor accessor = new 
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
291expression = new RowKeyColumnExpression(expression, 
accessor, groupBy.getKeyExpressions().get(index).getDataType());
292}
293}
294return expression;
295}
 
{code} 

so when OrderByCompiler's compile method invokes OrderPreservingTracker's track 
method, in line 108,the return Info's pkPosition is 1:

{code:borderStyle=solid} 
106public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
107 if (isOrderPreserving) {
108 Info info = 

[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664118#comment-15664118
 ] 

chenglei edited comment on PHOENIX-3451 at 11/14/16 3:19 PM:
-

This bug is caused by the OrderByCompiler,my analysis of this bug is as follows:

take following table which describe by [~jpalmert]  as a example : 
{code:borderStyle=solid} 
 CREATE TABLE IF NOT EXISTS TEST.TEST (
ORGANIZATION_ID CHAR(15) NOT NULL,
CONTAINER_ID CHAR(15) NOT NULL,
ENTITY_ID CHAR(15) NOT NULL,
SCORE DOUBLE,
CONSTRAINT TEST_PK PRIMARY KEY (
   ORGANIZATION_ID,
   CONTAINER_ID,
   ENTITY_ID
 )
 )

CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);

UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
{code} 

for the following query sql :
{code:borderStyle=solid} 
SELECT DISTINCT entity_id, score
FROM test.test
WHERE organization_id = 'org2'
AND container_id IN ( 'container1','container2','container3' )
ORDER BY score DESC
LIMIT 2
{code} 

the phoenix would use the following index table  TEST_SCORE to do the query:
{code:borderStyle=solid} 
CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);
{code} 

Using that index is good,the problem is that the OrderByCompiler thinking the 
OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY,but because we can see the where 
condition is "container_id IN ( 'container1','container2','container3' )", 
obviously OrderBy is not OrderBy.FWD_ROW_KEY_ORDER_BY.

When we look into  OrderByCompiler's compile method, in line 123,  the  "score" 
ColumnParseNode in  "ORDER BY score DESC" accepts a   ExpressionCompiler 
visitor:
{code:borderStyle=solid} 
123expression = node.getNode().accept(compiler);
124// Detect mix of aggregate and non aggregates (i.e. ORDER BY 
txns, SUM(txns)
125if (!expression.isStateless() && !compiler.isAggregate()) {
126if (statement.isAggregate() || statement.isDistinct()) {
127// Detect ORDER BY not in SELECT DISTINCT: SELECT 
DISTINCT count(*) FROM t ORDER BY x
128if (statement.isDistinct()) {
129throw new 
SQLExceptionInfo.Builder(SQLExceptionCode.ORDER_BY_NOT_IN_SELECT_DISTINCT)
130
.setMessage(expression.toString()).build().buildException();
131}
132
ExpressionCompiler.throwNonAggExpressionInAggException(expression.toString());
{code} 
In ExpressionCompiler 's visit method,the "score" ColumnParseNode is converted 
to a KeyValueColumnExpression in line 408,then in line 
409,wrapGroupByExpression method is invoked:
{code:borderStyle=solid} 
393  public Expression visit(ColumnParseNode node) throws SQLException {
 
408   Expression expression = 
ref.newColumnExpression(node.isTableNameCaseSensitive(), 
node.isCaseSensitive());
409   Expression wrappedExpression = 
wrapGroupByExpression(expression);
{code} 
in wrapGroupByExpression method,because the "score" column  is in 
groupBy.getExpressions(),which is "[ENTITY_ID, SCORE]",so 
KeyValueColumnExpression  is replaced by RowKeyColumnExpression in line 291, 
and because the index of "score" column in "[ENTITY_ID, SCORE]" is 1,so the 
return value of RowKeyColumnExpression 's position method is 1 :

{code:borderStyle=solid} 
282   private Expression wrapGroupByExpression(Expression expression) {
.
286if (aggregateFunction == null) {
287int index = groupBy.getExpressions().indexOf(expression);
288if (index >= 0) {
289isAggregate = true;
290RowKeyValueAccessor accessor = new 
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
291expression = new RowKeyColumnExpression(expression, 
accessor, groupBy.getKeyExpressions().get(index).getDataType());
292}
293}
294return expression;
295}
 
{code} 

so when OrderByCompiler's compile method invokes OrderPreservingTracker's track 
method, in line 108,the return Info's pkPosition is 1:

{code:borderStyle=solid} 
106public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
107 if (isOrderPreserving) {
108 Info info = 

[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664118#comment-15664118
 ] 

chenglei edited comment on PHOENIX-3451 at 11/14/16 3:13 PM:
-

This bug is caused by the OrderByCompiler,my analysis of this bug is as follows:

take following table which describe by [~jpalmert]  as a example : 
{code:borderStyle=solid} 
 CREATE TABLE IF NOT EXISTS TEST.TEST (
ORGANIZATION_ID CHAR(15) NOT NULL,
CONTAINER_ID CHAR(15) NOT NULL,
ENTITY_ID CHAR(15) NOT NULL,
SCORE DOUBLE,
CONSTRAINT TEST_PK PRIMARY KEY (
   ORGANIZATION_ID,
   CONTAINER_ID,
   ENTITY_ID
 )
 )

CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);

UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
{code} 

for the following query sql :
{code:borderStyle=solid} 
SELECT DISTINCT entity_id, score
FROM test.test
WHERE organization_id = 'org2'
AND container_id IN ( 'container1','container2','container3' )
ORDER BY score DESC
LIMIT 2
{code} 

the phoenix would use the following index table  TEST_SCORE to do the query:
{code:borderStyle=solid} 
CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);
{code} 

Using that index is good,the problem is that the OrderByCompiler thinking the 
OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY,but because we can see the where 
condition is "container_id IN ( 'container1','container2','container3' )", 
obviously OrderBy is not OrderBy.FWD_ROW_KEY_ORDER_BY.

When we look into  OrderByCompiler's compile method, in line 123,  the  "score" 
ColumnParseNode in  "ORDER BY score DESC" accepts a   ExpressionCompiler 
visitor:
{code:borderStyle=solid} 
123expression = node.getNode().accept(compiler);
124// Detect mix of aggregate and non aggregates (i.e. ORDER BY 
txns, SUM(txns)
125if (!expression.isStateless() && !compiler.isAggregate()) {
126if (statement.isAggregate() || statement.isDistinct()) {
127// Detect ORDER BY not in SELECT DISTINCT: SELECT 
DISTINCT count(*) FROM t ORDER BY x
128if (statement.isDistinct()) {
129throw new 
SQLExceptionInfo.Builder(SQLExceptionCode.ORDER_BY_NOT_IN_SELECT_DISTINCT)
130
.setMessage(expression.toString()).build().buildException();
131}
132
ExpressionCompiler.throwNonAggExpressionInAggException(expression.toString());
{code} 
In ExpressionCompiler 's visit method,the "score" ColumnParseNode is converted 
to a KeyValueColumnExpression in line 408,then in line 
409,wrapGroupByExpression method is invoked:
{code:borderStyle=solid} 
393  public Expression visit(ColumnParseNode node) throws SQLException {
 
408   Expression expression = 
ref.newColumnExpression(node.isTableNameCaseSensitive(), 
node.isCaseSensitive());
409   Expression wrappedExpression = 
wrapGroupByExpression(expression);
{code} 
in wrapGroupByExpression method,because the "score" column  is in 
groupBy.getExpressions(),which is "[ENTITY_ID, SCORE]",so 
KeyValueColumnExpression  is replaced by RowKeyColumnExpression in line 291, 
and because the index of "score" column in "[ENTITY_ID, SCORE]" is 1,so the 
return value of RowKeyColumnExpression 's position method is 1 :

{code:borderStyle=solid} 
282   private Expression wrapGroupByExpression(Expression expression) {
.
286if (aggregateFunction == null) {
287int index = groupBy.getExpressions().indexOf(expression);
288if (index >= 0) {
289isAggregate = true;
290RowKeyValueAccessor accessor = new 
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
291expression = new RowKeyColumnExpression(expression, 
accessor, groupBy.getKeyExpressions().get(index).getDataType());
292}
293}
294return expression;
295}
 
{code} 

so when OrderByCompiler's compile method invokes OrderPreservingTracker's track 
method, in line 108,the return Info's pkPosition is 1:

{code:borderStyle=solid} 
106public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
107 if (isOrderPreserving) {
108 Info info = 

[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664118#comment-15664118
 ] 

chenglei edited comment on PHOENIX-3451 at 11/14/16 3:11 PM:
-

This bug is caused by the OrderByCompiler,my analysis of this bug is as follows:

take following table which describe by [~jpalmert]  as a example : 
{code:borderStyle=solid} 
 CREATE TABLE IF NOT EXISTS TEST.TEST (
ORGANIZATION_ID CHAR(15) NOT NULL,
CONTAINER_ID CHAR(15) NOT NULL,
ENTITY_ID CHAR(15) NOT NULL,
SCORE DOUBLE,
CONSTRAINT TEST_PK PRIMARY KEY (
   ORGANIZATION_ID,
   CONTAINER_ID,
   ENTITY_ID
 )
 )

CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);

UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
{code} 

for the following query sql :
{code:borderStyle=solid} 
SELECT DISTINCT entity_id, score
FROM test.test
WHERE organization_id = 'org2'
AND container_id IN ( 'container1','container2','container3' )
ORDER BY score DESC
LIMIT 2
{code} 

the phoenix would use the following index table  TEST_SCORE to do the query:
{code:borderStyle=solid} 
CREATE INDEX IF NOT EXISTS TEST_SCORE ON  
TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);
{code} 

Using that index is good,the problem is that the OrderByCompiler thinking the 
OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY,but because we can see the where 
condition is "container_id IN ( 'container1','container2','container3' )", 
obviously OrderBy is not OrderBy.FWD_ROW_KEY_ORDER_BY.

When we look into  OrderByCompiler's compile method, in line 123,  the  "score" 
ColumnParseNode in  "ORDER BY score DESC" accepts a   ExpressionCompiler 
visitor:
{code:borderStyle=solid} 
123expression = node.getNode().accept(compiler);
124// Detect mix of aggregate and non aggregates (i.e. ORDER BY 
txns, SUM(txns)
125if (!expression.isStateless() && !compiler.isAggregate()) {
126if (statement.isAggregate() || statement.isDistinct()) {
127// Detect ORDER BY not in SELECT DISTINCT: SELECT 
DISTINCT count(*) FROM t ORDER BY x
128if (statement.isDistinct()) {
129throw new 
SQLExceptionInfo.Builder(SQLExceptionCode.ORDER_BY_NOT_IN_SELECT_DISTINCT)
130
.setMessage(expression.toString()).build().buildException();
131}
132
ExpressionCompiler.throwNonAggExpressionInAggException(expression.toString());
{code} 
In ExpressionCompiler 's visit method,the "score" ColumnParseNode is converted 
to a KeyValueColumnExpression in line 408,then in line 
409,wrapGroupByExpression method is invoked:
{code:borderStyle=solid} 
393  public Expression visit(ColumnParseNode node) throws SQLException {
 
408   Expression expression = 
ref.newColumnExpression(node.isTableNameCaseSensitive(), 
node.isCaseSensitive());
409   Expression wrappedExpression = 
wrapGroupByExpression(expression);
{code} 
in wrapGroupByExpression method,because the "score" column  is in 
groupBy.getExpressions(),which is "[ENTITY_ID, SCORE]",so 
KeyValueColumnExpression  is replaced by RowKeyColumnExpression in line 291, 
and because the index of "score" column in "[ENTITY_ID, SCORE]" is 1,so the 
return value of RowKeyColumnExpression 's position method is 1 :

{code:borderStyle=solid} 
282   private Expression wrapGroupByExpression(Expression expression) {
.
286if (aggregateFunction == null) {
287int index = groupBy.getExpressions().indexOf(expression);
288if (index >= 0) {
289isAggregate = true;
290RowKeyValueAccessor accessor = new 
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
291expression = new RowKeyColumnExpression(expression, 
accessor, groupBy.getKeyExpressions().get(index).getDataType());
292}
293}
294return expression;
295}
 
{code} 

So when OrderByCompiler's compile method invokes OrderPreservingTracker's track method, in line 108 the returned Info's pkPosition is 1:

{code:borderStyle=solid} 
106    public void track(Expression node, SortOrder sortOrder, boolean isNullsLast) {
107        if (isOrderPreserving) {
108            Info info = node.accept(visitor);
{code}

[jira] [Comment Edited] (PHOENIX-3451) Secondary index and query using distinct: LIMIT doesn't return the first rows

2016-11-14 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664118#comment-15664118
 ] 

chenglei edited comment on PHOENIX-3451 at 11/14/16 3:00 PM:
-

This bug is caused by the OrderByCompiler. My analysis is as follows:

Take the following table, described by [~jpalmert], as an example:
{code:borderStyle=solid} 
CREATE TABLE IF NOT EXISTS TEST.TEST (
    ORGANIZATION_ID CHAR(15) NOT NULL,
    CONTAINER_ID CHAR(15) NOT NULL,
    ENTITY_ID CHAR(15) NOT NULL,
    SCORE DOUBLE,
    CONSTRAINT TEST_PK PRIMARY KEY (
        ORGANIZATION_ID,
        CONTAINER_ID,
        ENTITY_ID
    )
)

CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST(ORGANIZATION_ID, CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);

UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
{code} 

For the following SELECT SQL:
{code:borderStyle=solid} 
SELECT DISTINCT entity_id, score
FROM test.test
WHERE organization_id = 'org2'
AND container_id IN ( 'container1','container2','container3' )
ORDER BY score DESC
LIMIT 2
{code} 

Phoenix uses the following index table, TEST_SCORE, to execute the query:
{code:borderStyle=solid} 
CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST(ORGANIZATION_ID, CONTAINER_ID, SCORE DESC, ENTITY_ID DESC);
{code} 

Using that index is fine; the problem is that the OrderByCompiler decides the OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY. Because of the WHERE condition "container_id IN ( 'container1','container2','container3' )", the OrderBy is obviously not OrderBy.FWD_ROW_KEY_ORDER_BY.
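To see concretely why the stored scan order is not the requested order, here is a minimal sketch (plain Java, not Phoenix code) using the scores from the UPSERTs above: each container_id range of the TEST_SCORE index is sorted by SCORE DESC on its own, but the concatenation of the three ranges is not globally sorted.

```java
import java.util.*;

public class OrderByDemo {
    // Index rows for org2 as TEST_SCORE stores them: grouped by CONTAINER_ID,
    // each group sorted by SCORE DESC. Scores come from the UPSERTs above.
    static final double[][] PER_CONTAINER = {
        {1.4, 1.2},   // container1
        {1.3, 1.1},   // container2
        {1.45, 1.35}  // container3
    };

    // What a forward scan + LIMIT does under the (wrong) FWD_ROW_KEY_ORDER_BY
    // assumption: take the first `limit` rows in stored order, no re-sort.
    static List<Double> scanOrderTopN(int limit) {
        List<Double> all = new ArrayList<>();
        for (double[] range : PER_CONTAINER)
            for (double s : range) all.add(s);
        return all.subList(0, limit);
    }

    // What the query actually asks for: global ORDER BY score DESC LIMIT n.
    static List<Double> trueTopN(int limit) {
        List<Double> all = new ArrayList<>();
        for (double[] range : PER_CONTAINER)
            for (double s : range) all.add(s);
        all.sort(Comparator.reverseOrder());
        return all.subList(0, limit);
    }

    public static void main(String[] args) {
        System.out.println(scanOrderTopN(2)); // [1.4, 1.2]  -- the buggy result
        System.out.println(trueTopN(2));      // [1.45, 1.4] -- the expected result
    }
}
```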

Looking into OrderByCompiler's compile method: in line 123, the "score" ColumnParseNode in "ORDER BY score DESC" accepts an ExpressionCompiler visitor:
{code:borderStyle=solid} 
123    expression = node.getNode().accept(compiler);
124    // Detect mix of aggregate and non aggregates (i.e. ORDER BY txns, SUM(txns))
125    if (!expression.isStateless() && !compiler.isAggregate()) {
126        if (statement.isAggregate() || statement.isDistinct()) {
127            // Detect ORDER BY not in SELECT DISTINCT: SELECT DISTINCT count(*) FROM t ORDER BY x
128            if (statement.isDistinct()) {
129                throw new SQLExceptionInfo.Builder(SQLExceptionCode.ORDER_BY_NOT_IN_SELECT_DISTINCT)
130                    .setMessage(expression.toString()).build().buildException();
131            }
132            ExpressionCompiler.throwNonAggExpressionInAggException(expression.toString());
{code} 
In ExpressionCompiler's visit method, the "score" ColumnParseNode is converted to a KeyValueColumnExpression in line 408; then in line 409, the wrapGroupByExpression method is invoked:
{code:borderStyle=solid} 
393    public Expression visit(ColumnParseNode node) throws SQLException {
...
408        Expression expression = ref.newColumnExpression(node.isTableNameCaseSensitive(), node.isCaseSensitive());
409        Expression wrappedExpression = wrapGroupByExpression(expression);
{code} 
In the wrapGroupByExpression method, because "score" is in groupBy.getExpressions(), which is "[ENTITY_ID, SCORE]", the KeyValueColumnExpression is replaced by a RowKeyColumnExpression; and because the index of "score" in "[ENTITY_ID, SCORE]" is 1, the RowKeyColumnExpression's position method returns 1:

{code:borderStyle=solid} 
282    private Expression wrapGroupByExpression(Expression expression) {
...
286        if (aggregateFunction == null) {
287            int index = groupBy.getExpressions().indexOf(expression);
288            if (index >= 0) {
289                isAggregate = true;
290                RowKeyValueAccessor accessor = new RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
291                expression = new RowKeyColumnExpression(expression, accessor, groupBy.getKeyExpressions().get(index).getDataType());
292            }
293        }
294        return expression;
295    }
 
{code} 

So when OrderByCompiler's compile method invokes OrderPreservingTracker's track method, in line 108 the returned Info's pkPosition is 1:

{code:borderStyle=solid} 
106    public void track(Expression node, SortOrder sortOrder, boolean isNullsLast) {
107        if (isOrderPreserving) {
108            Info info = node.accept(visitor);
109            if (info == null) {
110                isOrderPreserving = false;
{code}

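A hypothetical sketch (simplified types, not the actual Phoenix classes) of the remapping needed here: pkPosition 1 is only the slot of "score" within the GROUP BY key [ENTITY_ID, SCORE]; its original position in the TEST_SCORE row key (ORGANIZATION_ID, CONTAINER_ID, SCORE, ENTITY_ID) is 2, and that is the position an order-preserving check would have to look at.

```java
public class RemapDemo {
    // Original row-key positions in the TEST_SCORE index:
    // 0 = ORGANIZATION_ID, 1 = CONTAINER_ID, 2 = SCORE, 3 = ENTITY_ID.
    // The GROUP BY key [ENTITY_ID, SCORE] maps to original positions [3, 2].
    static final int[] GROUP_BY_ORIGINAL_PK_POSITIONS = {3, 2};

    // Given an ORDER BY expression's slot within the GROUP BY key,
    // return the underlying column's original row-key position.
    static int remap(int slotInGroupByKey) {
        return GROUP_BY_ORIGINAL_PK_POSITIONS[slotInGroupByKey];
    }

    public static void main(String[] args) {
        // "score" sits at slot 1 of the GROUP BY key, but its original
        // row-key position is 2, not 1.
        System.out.println(remap(1)); // 2
    }
}
```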
[jira] [Comment Edited] (PHOENIX-3333) Support Spark 2.0

2016-11-14 Thread DEQUN (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663791#comment-15663791
 ] 

DEQUN edited comment on PHOENIX-3333 at 11/14/16 12:25 PM:
---

I'm new to git, maven and phoenix. Could you show me how to use this patch or how to build the spark_2.0 branch?


was (Author: sd10):
I'm new to git, maven and phoenix. Could you show me  tutorial how to use this 
path or how to build spark_2.0 branch?

> Support Spark 2.0
> -
>
> Key: PHOENIX-3333
> URL: https://issues.apache.org/jira/browse/PHOENIX-3333
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.8.0
> Environment: spark 2.0 ,phoenix 4.8.0 , os is centos 6.7 ,hadoop is 
> hdp 2.5
>Reporter: dalin qin
> Attachments: PHOENIX-3333-interim.patch
>
>
> spark version is  2.0.0.2.5.0.0-1245
> As mentioned by Josh, I believe Spark 2.0 changed its API, which broke 
> Phoenix. Please come up with an updated version to adapt to Spark's change.
> In [1]: df = sqlContext.read \
>...:   .format("org.apache.phoenix.spark") \
>...:   .option("table", "TABLE1") \
>...:   .option("zkUrl", "namenode:2181:/hbase-unsecure") \
>...:   .load()
> ---
> Py4JJavaError Traceback (most recent call last)
>  in ()
> > 1 df = sqlContext.read   .format("org.apache.phoenix.spark")   
> .option("table", "TABLE1")   .option("zkUrl", 
> "namenode:2181:/hbase-unsecure")   .load()
> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/readwriter.pyc in load(self, 
> path, format, schema, **options)
> 151 return 
> self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
> 152 else:
> --> 153 return self._df(self._jreader.load())
> 154
> 155 @since(1.4)
> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py
>  in __call__(self, *args)
> 931 answer = self.gateway_client.send_command(command)
> 932 return_value = get_return_value(
> --> 933 answer, self.gateway_client, self.target_id, self.name)
> 934
> 935 for temp_arg in temp_args:
> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/utils.pyc in deco(*a, **kw)
>  61 def deco(*a, **kw):
>  62 try:
> ---> 63 return f(*a, **kw)
>  64 except py4j.protocol.Py4JJavaError as e:
>  65 s = e.java_exception.toString()
> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py 
> in get_return_value(answer, gateway_client, target_id, name)
> 310 raise Py4JJavaError(
> 311 "An error occurred while calling {0}{1}{2}.\n".
> --> 312 format(target_id, ".", name), value)
> 313 else:
> 314 raise Py4JError(
> Py4JJavaError: An error occurred while calling o43.load.
> : java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.getDeclaredMethod(Class.java:2128)
> at 
> java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
> at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.ObjectStreamClass.(ObjectStreamClass.java:472)
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
> at 
> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
> at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at 
> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at 
> org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
> at 
> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
> at 
> org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
> at 
> org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
> at 
> org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
> at 

[jira] [Commented] (PHOENIX-3333) Support Spark 2.0

2016-11-14 Thread DEQUN (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663791#comment-15663791
 ] 

DEQUN commented on PHOENIX-3333:


I'm new to git, maven and phoenix. Could you show me a tutorial on how to use 
this patch or how to build the spark_2.0 branch?


[jira] [Commented] (PHOENIX-3465) order by incorrect when array column be used

2016-11-14 Thread Yuan Kang (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663737#comment-15663737
 ] 

Yuan Kang commented on PHOENIX-3465:


Can someone help?

> order by incorrect when array column be used
> 
>
> Key: PHOENIX-3465
> URL: https://issues.apache.org/jira/browse/PHOENIX-3465
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.0
>Reporter: Yuan Kang
>  Labels: bug
>
> when I create a table like that:
> create table "TABLE_A"
> (
> task_id varchar not null,
> date varchar not null,
> dim varchar not null,
> valueArray double array,
> dimNameArray varchar array,
> constraint pk primary key (task_id, date, dim)
> ) SALT_BUCKETS = 4, COMPRESSION='SNAPPY';
> upsert some data, then query the SQL below:
> select date, sum(valueArray[16]) as val1
> from TABLE_A 
> where date = '2016-11-01' and task_id = '4692' order by val1 desc limit 50;"
> the result is incorrect. A similar issue was announced as fixed in 4.5.0; this 
> issue happens again in 4.8.0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3478) UPDATE STATISTICS SET syntax error

2016-11-14 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663468#comment-15663468
 ] 

Ankit Singhal commented on PHOENIX-3478:


You need to use quotes if you have SQL reserved characters (like ".") in your 
property name (https://phoenix.apache.org/language/index.html#name). 

{code}
UPDATE STATISTICS my_table SET "phoenix.stats.guidepost.width"=5000
{code}

Let me see if I can update the documentation to represent this better.


> UPDATE STATISTICS SET syntax error
> --
>
> Key: PHOENIX-3478
> URL: https://issues.apache.org/jira/browse/PHOENIX-3478
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.8.1
>Reporter: Alex Batyrshin
>Priority: Minor
>
> According to 
> [documentation|https://phoenix.apache.org/language/index.html#update_statistics]
>  example:
> bq. UPDATE STATISTICS my_table SET phoenix.stats.guidepost.width=5000
> And this is real world:
> {code}
> 0: jdbc:phoenix:> UPDATE STATISTICS "xxx" SET 
> phoenix.stats.guidepost.width=3000;
> Error: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "EQ", got 
> "." at line 1, column 52. (state=42P00,code=604)
> {code}
> Looks like SET parameter should be quoted.





[jira] [Resolved] (PHOENIX-3478) UPDATE STATISTICS SET syntax error

2016-11-14 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal resolved PHOENIX-3478.

Resolution: Not A Problem



[jira] [Updated] (PHOENIX-3241) Convert_tz doesn't allow timestamp data type

2016-11-14 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-3241:
---
Assignee: Josh Elser  (was: Ankit Singhal)

> Convert_tz doesn't allow timestamp data type
> 
>
> Key: PHOENIX-3241
> URL: https://issues.apache.org/jira/browse/PHOENIX-3241
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Josh Elser
> Fix For: 4.10.0
>
> Attachments: PHOENIX-3241.002.patch, PHOENIX-3241.003.patch, 
> PHOENIX-3241.patch
>
>
> As per the documentation, we allow the timestamp data type for convert_tz, but 
> as per the code only the DATE data type is allowed





[jira] [Comment Edited] (PHOENIX-3241) Convert_tz doesn't allow timestamp data type

2016-11-14 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663426#comment-15663426
 ] 

Ankit Singhal edited comment on PHOENIX-3241 at 11/14/16 10:39 AM:
---

[~elserj], +1, adding a test case is always better.
And extending credit (assigning to you) as you have spent more time on this 
than me. :)


was (Author: an...@apache.org):
[~elserj], +1 , Adding a test case is always better.
And, Extending credit(Assigning to you) as you have spent more time on this 
than me.

> Convert_tz doesn't allow timestamp data type
> 
>
> Key: PHOENIX-3241
> URL: https://issues.apache.org/jira/browse/PHOENIX-3241
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
> Fix For: 4.10.0
>
> Attachments: PHOENIX-3241.002.patch, PHOENIX-3241.003.patch, 
> PHOENIX-3241.patch
>
>
> As per the documentation, we allow the timestamp data type for convert_tz, but 
> as per the code only the DATE data type is allowed





[jira] [Comment Edited] (PHOENIX-3241) Convert_tz doesn't allow timestamp data type

2016-11-14 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663426#comment-15663426
 ] 

Ankit Singhal edited comment on PHOENIX-3241 at 11/14/16 10:38 AM:
---

[~elserj], +1 , Adding a test case is always better.
And, Extending credit(Assigning to you) as you have spent more time on this 
than me.


was (Author: an...@apache.org):
[~elserj], +1 , Adding a test case is always better.



[jira] [Commented] (PHOENIX-3241) Convert_tz doesn't allow timestamp data type

2016-11-14 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663426#comment-15663426
 ] 

Ankit Singhal commented on PHOENIX-3241:


[~elserj], +1 , Adding a test case is always better.



[jira] [Commented] (PHOENIX-3461) Statistics collection broken if name space mapping enabled for SYSTEM tables

2016-11-14 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663418#comment-15663418
 ] 

Ankit Singhal commented on PHOENIX-3461:


bq. I think if we have any hope of namespaces not regressing, we need better 
test coverage.
Yes [~jamestaylor], created PHOENIX-3480 for increasing the test coverage. 
And I think it will be great if we can reach a level where all the tests can be 
run with namespaces both enabled and disabled.

> Statistics collection broken if name space mapping enabled for SYSTEM tables
> 
>
> Key: PHOENIX-3461
> URL: https://issues.apache.org/jira/browse/PHOENIX-3461
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.9.0
>
> Attachments: PHOENIX-3461-v3.patch, PHOENIX-3461_master.patch, 
> PHOENIX-3461_v2.patch, PHOENIX-3461_v4.patch, PHOENIX-3461_v5.patch
>
>






[jira] [Created] (PHOENIX-3480) Extend test coverage of namespace mapping feature to avoid regression

2016-11-14 Thread Ankit Singhal (JIRA)
Ankit Singhal created PHOENIX-3480:
--

 Summary: Extend test coverage of namespace mapping feature to 
avoid regression
 Key: PHOENIX-3480
 URL: https://issues.apache.org/jira/browse/PHOENIX-3480
 Project: Phoenix
  Issue Type: Bug
Reporter: Ankit Singhal
Assignee: Ankit Singhal


We should increase test coverage of the namespace mapping feature to ensure that 
new changes use the right set of APIs to get the physical name for user and 
system tables.


