Abhishek Girish created DRILL-1559:
--------------------------------------

             Summary: Writing to JSON from Parquet fails when the Parquet file is created from JSON
                 Key: DRILL-1559
                 URL: https://issues.apache.org/jira/browse/DRILL-1559
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - JSON
            Reporter: Abhishek Girish
            Assignee: Steven Phillips


Succeeds: 
> alter session set `store.format` = 'parquet';
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | store.format updated. |
+------------+------------+
1 row selected (0.038 seconds)

> create table `yelp_academic_dataset_review_parquet` as select * from `yelp_academic_dataset_review.json`;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 0_0        | 1125458                   |
+------------+---------------------------+
1 row selected (163.893 seconds)

hadoop fs -ls /jsondata/yelp_academic_dataset_review_parquet
Found 2 items
-rwxr-xr-x   3 mapr mapr  535544902 2014-10-20 17:08 /jsondata/yelp_academic_dataset_review_parquet/0_0_0.parquet
-rwxr-xr-x   3 mapr mapr   29696406 2014-10-20 17:09 /jsondata/yelp_academic_dataset_review_parquet/0_0_1.parquet

Fails:
> alter session set `store.format` = 'json';
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | store.format updated. |
+------------+------------+
1 row selected (0.033 seconds)

> create table `yelp_academic_dataset_review_json` as select * from yelp_academic_dataset_review_parquet;

Query failed: Failure while running fragment. Schema is currently null.  You must call buildSchema(SelectionVectorMode) before this container can return a schema. [b96dc570-77f2-46db-b9e6-8215e2062b15]

LOG entry:
2014-10-20 17:10:47,785 [cbccfeb9-a235-4ea7-9bcc-56d35daf4827:frag:1:0] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error de3eb523-3924-4941-8cf4-eb7a71a2df2d: Failure while running fragment.
java.lang.NullPointerException: Schema is currently null.  You must call buildSchema(SelectionVectorMode) before this container can return a schema.
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) ~[guava-14.0.1.jar:na]
        at org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:220) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.getSchema(AbstractRecordBatch.java:115) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.getSchema(IteratorValidatorBatchIterator.java:74) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:101) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:104) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:250) [drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
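
Note on the trace: the NullPointerException is raised by Guava's Preconditions.checkNotNull inside VectorContainer.getSchema, reached from SingleSenderRootExec.innerNext, so a schema was requested before one had been built. The following is only a minimal sketch of that failure pattern, not Drill's code. The class SchemaContainerSketch, its String-typed schema, and the "NONE" mode value are hypothetical stand-ins, and java.util.Objects.requireNonNull is used in place of Guava so the example has no dependencies.

```java
import java.util.Objects;

// Hypothetical stand-in for a record container; NOT Drill's VectorContainer.
public class SchemaContainerSketch {
    // Stays null until buildSchema() is called (assumed lifecycle).
    private String schema;

    // Hypothetical: the real method takes a SelectionVectorMode.
    public void buildSchema(String mode) {
        this.schema = "schema[" + mode + "]";
    }

    // Fails with a NullPointerException carrying the message seen in the log
    // when no schema has been built yet.
    public String getSchema() {
        return Objects.requireNonNull(schema,
            "Schema is currently null.  You must call "
            + "buildSchema(SelectionVectorMode) before this container "
            + "can return a schema.");
    }

    public static void main(String[] args) {
        SchemaContainerSketch container = new SchemaContainerSketch();
        try {
            container.getSchema(); // called too early
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
        container.buildSchema("NONE");
        System.out.println(container.getSchema());
    }
}
```

The sketch only shows why the error surfaces as a NullPointerException with this message. It does not say which operator in the Parquet-to-JSON plan failed to build its schema.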



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
