Amit Katti created DRILL-1058:
---------------------------------
Summary: Unable to import nested/repeated data from JSON into PARQUET table
Key: DRILL-1058
URL: https://issues.apache.org/jira/browse/DRILL-1058
Project: Apache Drill
Issue Type: Bug
Components: Storage - Writer
Environment: CentOS release 6.5
Reporter: Amit Katti
I have a JSON file with nested data (a sample record is shown below):
{
"rownum": 1,
"name": "fred ovid",
"age": 76,
"gpa": 1.55,
"studentnum": 692315658449,
"create_time": "2014-05-27 00:26:07",
"interests": [
"Reading",
"Mountain Biking",
"Hacking"
]
}
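For anyone reproducing this, the record above can be written to a file with a few lines of Python. This is a minimal sketch and not part of the original setup: the helper name `write_sample` and the local output path are assumptions, and the file would still need to be copied to the location that the query reads from.

```python
import json

# Sample record copied from the report; "interests" is the repeated field.
record = {
    "rownum": 1,
    "name": "fred ovid",
    "age": 76,
    "gpa": 1.55,
    "studentnum": 692315658449,
    "create_time": "2014-05-27 00:26:07",
    "interests": ["Reading", "Mountain Biking", "Hacking"],
}


def write_sample(path="nested_student.json"):
    """Write the single sample record to `path` as one JSON object."""
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
        f.write("\n")
    return path


if __name__ == "__main__":
    print("wrote", write_sample())
```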
I am able to read this JSON file successfully from Drill and access the nested
values. However, when I try to import this data by creating a table in PARQUET
format, the query fails with an error:
QUERY: create table test as select * from
`/user/root/sample-data/nested_student.json`;
ERROR: Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure
while running query.[error_id: "3ce3dc1e-d920-4262-ae2d-28bd2d034597"
endpoint {
address: "perfnode154.perf.lab"
user_port: 31010
control_port: 31011
data_port: 31012
}
error_type: 0
message: "Failure while running fragment. < ParquetEncodingException:[ error
starting field interests at 6 ] < ClassCastException:[
parquet.io.PrimitiveColumnIO cannot be cast to parquet.io.GroupColumnIO ]"
]
Error: exception while executing query (state=,code=0)
{code}
2014-06-24 00:41:18,646 [b10db58d-8d4d-4d02-9fb5-a5081e5cb254:frag:0:0] ERROR
o.a.d.e.w.f.AbstractStatusReporter - Error
48602de2-8306-47d2-875f-8ad2cd2e964a: Failure while running fragment.
java.lang.ClassCastException: parquet.io.PrimitiveColumnIO cannot be cast to
parquet.io.GroupColumnIO
at
parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.startField(MessageColumnIO.java:171)
~[parquet-column-1.5.0-20140513.004024-1.jar:na]
at
org.apache.drill.exec.store.ParquetOutputRecordWriter.addRepeatedVarCharHolder(ParquetOutputRecordWriter.java:761)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.store.EventBasedRecordWriter$RepeatedVarCharFieldWriter.writeField(EventBasedRecordWriter.java:1156)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.store.EventBasedRecordWriter.write(EventBasedRecordWriter.java:150)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:111)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:91)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:72)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:65)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:45)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:94)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:91)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:56)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:85)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:46)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
at
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:100)
~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
{code}
--
This message was sent by Atlassian JIRA (v6.2#6252)