[ https://issues.apache.org/jira/browse/DRILL-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anton Gozhiy updated DRILL-6686:
--------------------------------
    Description:
*Prerequisites:*
 - Put the attached JSON file into dfs:
{noformat}
hadoop fs -put -f ./lineitem.json /tmp/
{noformat}
 - Import it into MapRDB:
{noformat}
mapr importJSON -idField "l_orderkey" -src /tmp/lineitem.json -dst /tmp/lineitem
{noformat}
 - Create a Hive external table:
{noformat}
CREATE EXTERNAL TABLE lineitem (
  l_orderkey string,
  l_comment string,
  l_commitdate string,
  l_discount string,
  l_extendedprice string,
  l_linenumber string,
  l_linestatus string,
  l_partkey string,
  l_quantity string,
  l_receiptdate string,
  l_returnflag string,
  l_shipdate string,
  l_shipinstruct string,
  l_shipmode string,
  l_suppkey string,
  l_tax int
)
STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/tmp/lineitem","maprdb.column.id" = "l_orderkey");
{noformat}
 - In Drill:
{noformat}
set store.hive.maprdb_json.optimize_scan_with_native_reader = true;
{noformat}

*Query:*
{code:sql}
select * from hive.`lineitem` where l_orderkey < 100
{code}

*Expected results:*
The query should return results.

*Actual result:*
An exception is thrown:
{noformat}
SYSTEM ERROR: IllegalArgumentException: A INT value can not be used for '_id' field.

  (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: Error while applying rule MapRDBPushFilterIntoScan:Filter_On_Scan, args [rel#1751:FilterPrel.PHYSICAL.SINGLETON([]).[](input=rel#1746:Subset#3.PHYSICAL.SINGLETON([]).[],condition=<($0, 100)), rel#1745:ScanPrel.PHYSICAL.SINGLETON([]).[](groupscan=JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/lineitem, condition=null], columns=[`_id`, `l_comment`, `l_commitdate`, `l_discount`, `l_extendedprice`, `l_linenumber`, `l_linestatus`, `l_partkey`, `l_quantity`, `l_receiptdate`, `l_returnflag`, `l_shipdate`, `l_shipinstruct`, `l_shipmode`, `l_suppkey`, `l_tax`, `**`]])]
    org.apache.drill.exec.work.foreman.Foreman.run():294
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
  Caused By (java.lang.RuntimeException) Error while applying rule MapRDBPushFilterIntoScan:Filter_On_Scan, args [rel#1751:FilterPrel.PHYSICAL.SINGLETON([]).[](input=rel#1746:Subset#3.PHYSICAL.SINGLETON([]).[],condition=<($0, 100)), rel#1745:ScanPrel.PHYSICAL.SINGLETON([]).[](groupscan=JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/lineitem, condition=null], columns=[`_id`, `l_comment`, `l_commitdate`, `l_discount`, `l_extendedprice`, `l_linenumber`, `l_linestatus`, `l_partkey`, `l_quantity`, `l_receiptdate`, `l_returnflag`, `l_shipdate`, `l_shipinstruct`, `l_shipmode`, `l_suppkey`, `l_tax`, `**`]])]
    org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():236
    org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():652
    org.apache.calcite.tools.Programs$RuleSetProgram.run():368
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():430
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():460
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():182
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
    org.apache.drill.exec.work.foreman.Foreman.runSQL():567
    org.apache.drill.exec.work.foreman.Foreman.run():266
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
  Caused By (java.lang.IllegalArgumentException) A INT value can not be used for '_id' field.
    com.mapr.db.impl.ConditionLeaf.checkArgs():308
    com.mapr.db.impl.ConditionLeaf.<init>():100
    com.mapr.db.impl.ConditionLeaf.<init>():86
    com.mapr.db.impl.ConditionLeaf.<init>():82
    com.mapr.db.impl.ConditionImpl.is():407
    com.mapr.db.impl.ConditionImpl.is():402
    com.mapr.db.impl.ConditionImpl.is():43
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.setIsCondition():127
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.createJsonScanSpec():181
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.visitFunctionCall():80
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.visitFunctionCall():33
    org.apache.drill.common.expression.FunctionCall.accept():60
    org.apache.drill.exec.store.mapr.db.json.JsonConditionBuilder.parseTree():48
    org.apache.drill.exec.store.mapr.db.MapRDBPushFilterIntoScan.doPushFilterIntoJsonGroupScan():135
    org.apache.drill.exec.store.mapr.db.MapRDBPushFilterIntoScan$1.onMatch():64
    org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():212
    org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():652
    org.apache.calcite.tools.Programs$RuleSetProgram.run():368
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():430
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():460
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():182
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
    org.apache.drill.exec.work.foreman.Foreman.runSQL():567
    org.apache.drill.exec.work.foreman.Foreman.run():266
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
{noformat}

*Notes:*
 - The same query works fine if store.hive.maprdb_json.optimize_scan_with_native_reader=false
 - The same exception occurs when the table is queried through dfs:
{code:sql}
select * from dfs.tmp.`lineitem` where _id < 100
{code}
 - The last query works fine if filter pushdown is disabled in the maprdb format plugin:
{code:json}
"maprdb": {
  "type": "maprdb",
  "enablePushdown": false
}
{code}

> Exception happens when trying to filter by id from a MaprDB json table
> ----------------------------------------------------------------------
>
>                 Key: DRILL-6686
>                 URL: https://issues.apache.org/jira/browse/DRILL-6686
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.15.0
>            Reporter: Anton Gozhiy
>            Priority: Major
>         Attachments: lineitem.json

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
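One possible reading of the trace, offered as a sketch and not as the actual Drill or MapR code: MapRDBPushFilterIntoScan passes the INT literal 100 to the MapR-DB condition builder for the `_id` field, and ConditionLeaf.checkArgs() rejects it. The self-contained Java below models how a pushdown rule could guard against that, assuming `_id` accepts only string or binary values. The class name, `LiteralType`, and `canPushDown` are hypothetical and exist in neither code base.

```java
public class IdFilterPushdownSketch {

    // Hypothetical stand-in for the literal types a filter condition can carry.
    enum LiteralType { INT, BIGINT, DOUBLE, STRING, BINARY }

    // Assumption: the row key field is "_id" and accepts only STRING or BINARY values.
    static final String ID_FIELD = "_id";

    /**
     * Returns true when a comparison on the given field may be pushed into the scan.
     * A filter on "_id" with any other literal type is kept out of the pushdown,
     * so it never reaches the storage-side condition builder.
     */
    static boolean canPushDown(String field, LiteralType literal) {
        if (!ID_FIELD.equals(field)) {
            return true; // other fields are not restricted in this sketch
        }
        return literal == LiteralType.STRING || literal == LiteralType.BINARY;
    }

    public static void main(String[] args) {
        // `_id < 100` from the failing query: INT literal on the id field.
        System.out.println(canPushDown("_id", LiteralType.INT));    // false
        // The same field compared with a string literal.
        System.out.println(canPushDown("_id", LiteralType.STRING)); // true
        // A non-id column compared with an INT literal.
        System.out.println(canPushDown("l_tax", LiteralType.INT));  // true
    }
}
```

Under this assumption the predicate `_id < 100` would stay a Drill-side filter, which is consistent with the query succeeding when "enablePushdown" is false. Whether the real fix should skip the pushdown or cast the literal is not something the trace settles.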