Abhishek Girish created DRILL-1616:
--------------------------------------
Summary: Drill throws "Schema is currently null" error on
count(field) when field is a JSON object/array
Key: DRILL-1616
URL: https://issues.apache.org/jira/browse/DRILL-1616
Project: Apache Drill
Issue Type: Bug
Components: Storage - JSON
Reporter: Abhishek Girish
Assignee: Jason Altekruse
count(field) throws an error on fields that are objects or arrays, and the
error messages are not clean: they do not indicate an error in usage. Also,
count on objects/arrays should be supported.
> select * from `abc.json`;
+------------+------------+------------+------------+------------+
| field_1 | field_2 | field_3 | field_4 | field_5 |
+------------+------------+------------+------------+------------+
| ["1"] | null | {"inner_3":[]} | {"inner_1":[],"inner_3":{}} | [] |
| ["5"] | 2 | {"inner_1":"2","inner_3":[]} | {"inner_1":["1","2","3"],"inner_2":"3","inner_3":{"inner_object_field_1":"2"}} | [{"inner_list":["1","null","6"],"inner_ |
| ["5","10","15"] | A wild string appears! | {"inner_1":"5","inner_2":"3","inner_3":[{},{"inner_object_field_1":"10"}]} | {"inner_1":["4","5","6"],"inner_2":"3","inner_3":{}} | [{ |
+------------+------------+------------+------------+------------+
3 rows selected (0.081 seconds)
> select count(field_1) from `abc.json`;
Query failed: Failure while running fragment., Schema is currently null. You
must call buildSchema(SelectionVectorMode) before this container can return a
schema. [ b6f021f9-213e-475e-83f4-a6facf6fd76d on abhi7.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query.
(state=,code=0)
The error is seen on field_1, field_3, field_4, and field_5.
The issue is not seen when an array index is specified:
> select count(field_1[0]) from `abc.json`;
+------------+
| EXPR$0 |
+------------+
| 3 |
+------------+
1 row selected (0.152 seconds)
Or when the element in the object is specified:
> select count(t.field_3.inner_3) from `textmode.json` as t;
+------------+
| EXPR$0 |
+------------+
| 3 |
+------------+
1 row selected (0.155 seconds)
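For reference, the behavior requested above (count on objects/arrays) can be sketched outside Drill. The snippet below is only an illustration of the usual SQL meaning of count(field), that is, the number of records where the field is present and not null, whatever its type. It is not Drill code. The RECORDS data and the count_field helper are hypothetical; only the field_1 and field_2 values are copied from the query output above.

```python
import json

# Hypothetical stand-in for abc.json: only field_1 and field_2 are copied
# from the query output above; the remaining fields are omitted.
RECORDS = """
{"field_1": ["1"], "field_2": null}
{"field_1": ["5"], "field_2": 2}
{"field_1": ["5", "10", "15"], "field_2": "A wild string appears!"}
"""


def count_field(records, name):
    """Count records whose value for `name` is present and not null.

    Arrays and objects count like any other non-null value.
    """
    return sum(1 for rec in records if rec.get(name) is not None)


records = [json.loads(line) for line in RECORDS.strip().splitlines()]

print(count_field(records, "field_1"))  # array-valued field
print(count_field(records, "field_2"))  # scalar field with one null
```

Under this reading, count(field_1) would match the 3 returned by count(field_1[0]) above, since every record has a non-empty array in field_1.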
LOG:
2014-10-30 13:28:20,286 [a90cc246-e60b-452b-ba96-7f79709f5ffa:frag:0:0] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error bc438332-0828-4a86-8063-9dc8c5a703d9: Failure while running fragment.
java.lang.NullPointerException: Schema is currently null. You must call buildSchema(SelectionVectorMode) before this container can return a schema.
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) ~[guava-14.0.1.jar:na]
    at org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:273) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.getSchema(AbstractRecordBatch.java:116) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.getSchema(IteratorValidatorBatchIterator.java:75) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema(ScreenCreator.java:100) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:103) ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
    at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:249) [drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]