[jira] [Commented] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

2017-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961806#comment-15961806
 ] 

ASF GitHub Bot commented on DRILL-5395:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/803


> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---
>
> Key: DRILL-5395
> URL: https://issues.apache.org/jira/browse/DRILL-5395
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - MapRDB
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Abhishek Girish
>Assignee: Padma Penumarthy
>  Labels: MapR-DB-Binary, ready-to-commit
> Fix For: 1.11.0
>
> Attachments: drillbit.log.txt
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 
> million rows). A partial WIP fix for DRILL-5394 caused the number of rows to 
> be reported incorrectly (~300,000). 2 minor fragments were created (due to 
> filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, 
> possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>AS customer, 
>Cast(Convert_from(a.data .` l_discount ` , 'double_be') AS 
> FLOAT) 
>AS availability 
> FROM   db.tpch_maprdb.lineitem_1 a 
> WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>join 
>   (SELECT Convert_from(b.row_key, 'UTF8') 
>   AS customer, 
>Cast( 
>Convert_from(b.data .` l_discount ` , 'double_be') AS FLOAT) AS 
>availability 
> FROM   db.tpch_maprdb.lineitem_1 b 
> WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>  ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> {code}
> Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

2017-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946528#comment-15946528
 ] 

ASF GitHub Bot commented on DRILL-5395:
---

Github user Agirish commented on the issue:

https://github.com/apache/drill/pull/803
  
Can confirm that with this fix, the previously observed issue (NPE) is no 
longer reproducible. Thanks for making this change. 


> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---
>
> Key: DRILL-5395
> URL: https://issues.apache.org/jira/browse/DRILL-5395
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - MapRDB
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Abhishek Girish
>Assignee: Padma Penumarthy
>  Labels: MapR-DB-Binary
> Fix For: 1.11.0
>
> Attachments: drillbit.log.txt
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 
> million rows). A partial WIP fix for DRILL-5394 caused the number of rows to 
> be reported incorrectly (~300,000). 2 minor fragments were created (due to 
> filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, 
> possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>AS customer, 
>Cast(Convert_from(a.data .` l_discount ` , 'double_be') AS 
> FLOAT) 
>AS availability 
> FROM   db.tpch_maprdb.lineitem_1 a 
> WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>join 
>   (SELECT Convert_from(b.row_key, 'UTF8') 
>   AS customer, 
>Cast( 
>Convert_from(b.data .` l_discount ` , 'double_be') AS FLOAT) AS 
>availability 
> FROM   db.tpch_maprdb.lineitem_1 b 
> WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>  ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> {code}
> Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

2017-03-28 Thread Abhishek Girish (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946517#comment-15946517
 ] 

Abhishek Girish commented on DRILL-5395:


Thanks for such detailed explanation of the underlying issue, [~ppenumarthy]. 
Very useful! 

> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---
>
> Key: DRILL-5395
> URL: https://issues.apache.org/jira/browse/DRILL-5395
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - MapRDB
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Abhishek Girish
>Assignee: Padma Penumarthy
>  Labels: MapR-DB-Binary
> Fix For: 1.11.0
>
> Attachments: drillbit.log.txt
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 
> million rows). A partial WIP fix for DRILL-5394 caused the number of rows to 
> be reported incorrectly (~300,000). 2 minor fragments were created (due to 
> filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, 
> possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>AS customer, 
>Cast(Convert_from(a.data .` l_discount ` , 'double_be') AS 
> FLOAT) 
>AS availability 
> FROM   db.tpch_maprdb.lineitem_1 a 
> WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>join 
>   (SELECT Convert_from(b.row_key, 'UTF8') 
>   AS customer, 
>Cast( 
>Convert_from(b.data .` l_discount ` , 'double_be') AS FLOAT) AS 
>availability 
> FROM   db.tpch_maprdb.lineitem_1 b 
> WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>  ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> {code}
> Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

2017-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946514#comment-15946514
 ] 

ASF GitHub Bot commented on DRILL-5395:
---

GitHub user ppadma opened a pull request:

https://github.com/apache/drill/pull/803

DRILL-5395: Query on MapR-DB table fails with NPE due to an issue wit…

… counts
See [DRILL-5395](https://issues.apache.org/jira/browse/DRILL-5395) for 
details.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppadma/drill DRILL-5395

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/803.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #803


commit 1a21d4a9ddfeb5f06a3224016d7925ccfc135e75
Author: Padma Penumarthy 
Date:   2017-03-28T22:08:54Z

DRILL-5395: Query on MapR-DB table fails with NPE due to an issue with 
assignment logic




> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---
>
> Key: DRILL-5395
> URL: https://issues.apache.org/jira/browse/DRILL-5395
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - MapRDB
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Abhishek Girish
>Assignee: Padma Penumarthy
>  Labels: MapR-DB-Binary
> Fix For: 1.11.0
>
> Attachments: drillbit.log.txt
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 
> million rows). A partial WIP fix for DRILL-5394 caused the number of rows to 
> be reported incorrectly (~300,000). 2 minor fragments were created (due to 
> filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, 
> possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>AS customer, 
>Cast(Convert_from(a.data .` l_discount ` , 'double_be') AS 
> FLOAT) 
>AS availability 
> FROM   db.tpch_maprdb.lineitem_1 a 
> WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>join 
>   (SELECT Convert_from(b.row_key, 'UTF8') 
>   AS customer, 
>Cast( 
>Convert_from(b.data .` l_discount ` , 'double_be') AS FLOAT) AS 
>availability 
> FROM   db.tpch_maprdb.lineitem_1 b 
> WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>  ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> {code}
> Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

2017-03-28 Thread Padma Penumarthy (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946511#comment-15946511
 ] 

Padma Penumarthy commented on DRILL-5395:
-

Based on number of fragments (decided by parallelization) and HBase regions to 
assign to these fragments, we calculate minimum and maximum regions to assign 
per fragment. 
Each region is assigned to a fragment on the node it is located on, fragments 
being picked in a round robin fashion. At the end, we make adjustments so each 
fragment is assigned at least the minimum and not more than maximum regions per 
fragment.
We can  have a case where we assigned up to minimum per fragment and still have 
regions to assign. In that case, minHeap will be null. We try to assign the 
unassigned regions to fragments with less than minimum by accessing minHeap, 
which is null and thus causing NPE.

In this case, 
we have 5 regions with ~300K rows. We apply selectivity of 0.5 and determine 
the cost as ~150,000. With slice target of 100K, we  create 2 fragments. On a 3 
node cluster, we have 2 fragments assigned to 2 nodes and none assigned to 3rd 
node. With 5 regions to assign, minimum regions to assign per fragment is 
calculated as 5/2 i.e. 2 and maximum regions to assign per fragment is 3.  In 
the first round,  each region is allocated to a fragment on the node on which 
it is located. 4 regions got assigned to 2 fragments i.e. both the fragments 
have 2 each. One of the regions is located on the node which does not have any 
fragments and it does not get assigned to a fragment in the first round. Once 
initial assignment based on region location is done, we try to adjust the 
assignments so that each fragment gets assigned at least the minimum and at 
most the maximum. For this adjustment phase, we build a minHeap which is 
supposed to contain fragments which have less than minimum assigned. In this 
case, since we assigned 2 regions each to 2 fragments, minHeap is null. For 
assigning the 5th region, we try to get fragment which has least number of 
regions assigned by accessing minHeap, which is NULL.  This causes Null Pointer 
Exception.

Fix is to include the fragments which have minimum assigned in minHeap (not 
just less than minimum).




> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---
>
> Key: DRILL-5395
> URL: https://issues.apache.org/jira/browse/DRILL-5395
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - MapRDB
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Abhishek Girish
>Assignee: Padma Penumarthy
>  Labels: MapR-DB-Binary
> Fix For: 1.11.0
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 
> million rows). A partial WIP fix for DRILL-5394 caused the number of rows to 
> be reported incorrectly (~300,000). 2 minor fragments were created (due to 
> filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, 
> possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>AS customer, 
>Cast(Convert_from(a.data .` l_discount ` , 'double_be') AS 
> FLOAT) 
>AS availability 
> FROM   db.tpch_maprdb.lineitem_1 a 
> WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>join 
>   (SELECT Convert_from(b.row_key, 'UTF8') 
>   AS customer, 
>Cast( 
>Convert_from(b.data .` l_discount ` , 'double_be') AS FLOAT) AS 
>availability 
> FROM   db.tpch_maprdb.lineitem_1 b 
> WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>  ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> {code}
> Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)