[jira] [Updated] (DRILL-6686) Exception happens when trying to filter by id from a MaprDB json table

Anton Gozhiy (JIRA) Tue, 14 Aug 2018 10:28:19 -0700


     [ 
https://issues.apache.org/jira/browse/DRILL-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Anton Gozhiy updated DRILL-6686:
--------------------------------
    Description: 
*Prerequisites:*
- Put the attached JSON file into DFS:
{noformat}
hadoop fs -put -f ./lineitem.json /tmp/
{noformat}
- Import it into MapR-DB:
{noformat}
mapr importJSON -idField "l_orderkey" -src /tmp/lineitem.json -dst /tmp/lineitem
{noformat}
- Create a Hive external table:
{noformat}
CREATE EXTERNAL TABLE lineitem ( 
l_orderkey string, 
l_comment string, 
l_commitdate string,
l_discount string,
l_extendedprice string,
l_linenumber string,
l_linestatus string,
l_partkey string,
l_quantity string,
l_receiptdate string,
l_returnflag string,
l_shipdate string,
l_shipinstruct string,
l_shipmode string,
l_suppkey string,
l_tax int
) 
STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler' 
TBLPROPERTIES("maprdb.table.name" = "/tmp/lineitem","maprdb.column.id" = "l_orderkey");
{noformat}
- In Drill:
{noformat}
set store.hive.maprdb_json.optimize_scan_with_native_reader = true;
{noformat}

*Query:*
{code:sql}
select * from hive.`lineitem` where l_orderkey < 100
{code}

*Expected result:*
The query should return results.

*Actual result:*
An exception is thrown:
{noformat}
SYSTEM ERROR: IllegalArgumentException: A INT value can not be used for '_id' field.

  (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: Error while applying rule MapRDBPushFilterIntoScan:Filter_On_Scan, args [rel#1751:FilterPrel.PHYSICAL.SINGLETON([]).[](input=rel#1746:Subset#3.PHYSICAL.SINGLETON([]).[],condition=<($0, 100)), rel#1745:ScanPrel.PHYSICAL.SINGLETON([]).[](groupscan=JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/lineitem, condition=null], columns=[`_id`, `l_comment`, `l_commitdate`, `l_discount`, `l_extendedprice`, `l_linenumber`, `l_linestatus`, `l_partkey`, `l_quantity`, `l_receiptdate`, `l_returnflag`, `l_shipdate`, `l_shipinstruct`, `l_shipmode`, `l_suppkey`, `l_tax`, `**`]])]
    org.apache.drill.exec.work.foreman.Foreman.run():294
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
  Caused By (java.lang.RuntimeException) Error while applying rule MapRDBPushFilterIntoScan:Filter_On_Scan, args [rel#1751:FilterPrel.PHYSICAL.SINGLETON([]).[](input=rel#1746:Subset#3.PHYSICAL.SINGLETON([]).[],condition=<($0, 100)), rel#1745:ScanPrel.PHYSICAL.SINGLETON([]).[](groupscan=JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/lineitem, condition=null], columns=[`_id`, `l_comment`, `l_commitdate`, `l_discount`, `l_extendedprice`, `l_linenumber`, `l_linestatus`, `l_partkey`, `l_quantity`, `l_receiptdate`, `l_returnflag`, `l_shipdate`, `l_shipinstruct`, `l_shipmode`, `l_suppkey`, `l_tax`, `**`]])]
    org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():236
    org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():652
    org.apache.calcite.tools.Programs$RuleSetProgram.run():368
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():430
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():460
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():182
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
    org.apache.drill.exec.work.foreman.Foreman.runSQL():567
    org.apache.drill.exec.work.foreman.Foreman.run():266
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
  Caused By (java.lang.IllegalArgumentException) A INT value can not be used for '_id' field.
    com.mapr.db.impl.ConditionLeaf.checkArgs():308
    com.mapr.db.impl.ConditionLeaf.<init>():100
    com.mapr.db.impl.ConditionLeaf.<init>():86
    com.mapr.db.impl.ConditionLeaf.<init>():82
    com.mapr.db.impl.ConditionImpl.is():407
    com.mapr.db.impl.ConditionImpl.is():402
    com.mapr.db.impl.ConditionImpl.is():43
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.setIsCondition():127
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.createJsonScanSpec():181
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.visitFunctionCall():80
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.visitFunctionCall():33
    org.apache.drill.common.expression.FunctionCall.accept():60
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.parseTree():48
    org.apache.drill.exec.store.mapr.db.MapRDBPushFilterIntoScan.doPushFilterIntoJsonGroupScan():135
    org.apache.drill.exec.store.mapr.db.MapRDBPushFilterIntoScan$1.onMatch():64
    org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():212
    org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():652
    org.apache.calcite.tools.Programs$RuleSetProgram.run():368
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():430
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():460
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():182
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
    org.apache.drill.exec.work.foreman.Foreman.runSQL():567
    org.apache.drill.exec.work.foreman.Foreman.run():266
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
{noformat}

*Notes:*
- The same query works fine if store.hive.maprdb_json.optimize_scan_with_native_reader=false
- The same exception occurs when the table is queried directly through dfs:
{code:sql}
select * from dfs.tmp.`lineitem` where _id < 100
{code}
- The dfs query above works fine if filter pushdown is disabled in the maprdb format plugin:
{code:json}
    "maprdb": {
      "type": "maprdb",
      "enablePushdown": false
    }
{code}
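
The notes above suggest the failure is tied to filter pushdown: the INT literal {{100}} appears to be passed unchanged into a condition on {{_id}}, and the MapR-DB condition builder rejects it. The following is only a minimal sketch of that failure mode and of one possible guard. It assumes that conditions on {{_id}} accept only string or binary values. The class {{IdPushdownSketch}} and the methods {{isValidIdValue}}, {{buildCondition}} and {{tryPushDown}} are hypothetical stand-ins, not Drill or MapR-DB APIs.

```java
import java.util.Optional;

/** Hypothetical stand-in for the pushdown step; no name here comes from Drill or MapR-DB. */
class IdPushdownSketch {

    /** Assumption: a condition on `_id` accepts only string or binary values. */
    public static boolean isValidIdValue(Object value) {
        return value instanceof String || value instanceof byte[];
    }

    /** Label used in the error message, e.g. INT for an Integer literal. */
    private static String typeName(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Integer) {
            return "INT";
        }
        return value.getClass().getSimpleName().toUpperCase();
    }

    /** Strict variant: mimics the reported behaviour, where a bad literal type aborts planning. */
    public static String buildCondition(String field, String op, Object value) {
        if ("_id".equals(field) && !isValidIdValue(value)) {
            throw new IllegalArgumentException(
                "A " + typeName(value) + " value can not be used for '_id' field.");
        }
        return field + " " + op + " " + value;
    }

    /**
     * Guarded variant: returns an empty Optional when the literal cannot be used on `_id`,
     * so the caller can leave the filter in the query engine instead of failing.
     */
    public static Optional<String> tryPushDown(String field, String op, Object value) {
        if ("_id".equals(field) && !isValidIdValue(value)) {
            return Optional.empty();
        }
        return Optional.of(buildCondition(field, op, value));
    }

    public static void main(String[] args) {
        // INT literal on `_id`: the guarded variant declines to push the filter down.
        System.out.println(tryPushDown("_id", "<", 100).isPresent());
        // String literal on `_id`: a condition is produced.
        System.out.println(tryPushDown("_id", "<", "100").isPresent());
        // The strict variant throws, as in the stack trace above.
        try {
            buildCondition("_id", "<", 100);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the actual fix should skip pushdown, cast the literal, or do something else is not stated in this issue; the guard is shown only because disabling pushdown is the reported workaround.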


> Exception happens when trying to filter by id from a MaprDB json table
> ----------------------------------------------------------------------
>
>                 Key: DRILL-6686
>                 URL: https://issues.apache.org/jira/browse/DRILL-6686
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.15.0
>            Reporter: Anton Gozhiy
>            Priority: Major
>         Attachments: lineitem.json



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
