Tugdual Grall created DRILL-3443:
------------------------------------

Summary: Flatten function raise exception when JSON files have different schema
Key: DRILL-3443
URL: https://issues.apache.org/jira/browse/DRILL-3443
Project: Apache Drill
Issue Type: Bug
Components: Execution - Data Types
Affects Versions: 1.0.0
Environment: DRILL 1.0 Embedded (running on OSX with Java 8)
             DRILL 1.0 Deployed on MapR 4.1 Sandbox
Reporter: Tugdual Grall
Assignee: Daniel Barclay (Drill)
Priority: Critical

I have 2 JSON documents:

{code}
{
  "name" : "PPRODUCT_002",
  "price" : 200.00,
  "tags" : ["sports" , "cool", "ocean"]
}

{
  "name" : "PPRODUCT_001",
  "price" : 100.00
}
{code}

And I execute this query:

{code}
SELECT name, flatten(tags)
FROM dfs.`data/json_array/*.json`
{code}

If the JSON documents are located in 2 different files and the first file does not contain the "tags" attribute (product 001 in 001.json), the following exception is raised:

{code}
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: java.lang.ClassCastException: Cannot cast org.apache.drill.exec.vector.NullableIntVector to org.apache.drill.exec.vector.RepeatedValueVector

Fragment 0:0

[Error Id: 4bb5b9e4-0de1-48e9-a0f3-956339608903 on 192.168.99.13:31010]
{code}

The query works if:
* All the JSON documents are in a single JSON file (order is not important)
* The product with the "tags" attribute comes first on the file system, for example if you put product 02 in 000.json (which will be read before 001.json)

This is similar to the bug [DRILL-3334].

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)