Hi,

Can you query the JSON directly? Does the query below return results?
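select * from `test.json`;

Can you please paste the content of your JSON file here? There is no
attachment; I believe attachments will not go through. Please share the
contents of your JSON file, and the version of Drill you are on.

In the meantime, a hedged guess based on the symptoms: this
SingleListWriter error usually shows up when a field changes shape
between records, e.g. {"a": "x"} on one line but {"a": ["x", "y"]} on
another, which would also explain why each line converts fine on its
own but not together. The "Unsupported type LIST" failure you see under
`exec.enable_union_type` is the same mismatch surfacing at write time,
since the Parquet writer cannot emit Drill's LIST type.

If that is what your first line does, one possible workaround is to
project the offending field explicitly and FLATTEN it, so the Parquet
writer only ever sees flat VARCHAR columns. This is only a sketch: `id`
and `vals` are hypothetical field names standing in for whatever your
JSON actually contains.

    -- Read all scalars as VARCHAR so numeric/string conflicts disappear.
    ALTER SESSION SET `store.json.all_text_mode` = true;
    USE dfs.tmpp;
    ALTER SESSION SET `store.format` = 'parquet';

    -- `id` and `vals` are placeholders for your real field names.
    -- FLATTEN emits one output row per element of t.vals, so no LIST
    -- type ever reaches the Parquet writer. FLATTEN does assume the
    -- field is an array in every record; a record where `vals` is a
    -- bare scalar would still need to be fixed upstream.
    CREATE TABLE `testParquetFlat` AS
    SELECT t.id, FLATTEN(t.vals) AS val
    FROM `test.json` t;

Whether this applies depends on what the real file looks like, so the
contents will tell us more.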
Thanks,
Khurram

On Thu, Aug 30, 2018 at 7:54 PM, Sri Krishna <[email protected]> wrote:

> Hi,
>
> I am trying to convert a JSON file into Parquet files using Drill. The
> query is:
>
> ALTER SESSION SET `store.json.all_text_mode` = true;
> USE dfs.tmpp;
> ALTER SESSION SET `store.format` = 'parquet';
> CREATE TABLE `testParquet` as select * from `test.json`;
>
> The first line is there so that we don't have to worry about numbers,
> integers, etc.; for now, reading them all as strings works. When I run
> this query I get this error message (which is not clear):
>
> Error: INTERNAL_ERROR ERROR: You tried to start when you are using a
> ValueWriter of type SingleListWriter.
>
> Attached is the JSON file; the trouble is with the first line. That
> line by itself can be folded into a Parquet file (the CTAS above
> works), and so can the rest of the lines by themselves, but together
> they give this error. I also ran a query that just reads the file, and
> I get the same error with this line combined with the others but not
> with either alone (just like the CTAS). I can get around reading the
> file by enabling mixed mode:
>
> ALTER SESSION SET `exec.enable_union_type` = true;
>
> But then I get an error that the LIST type isn't supported (I assume
> this refers to mixed types in an array).
>
> Here is the stack trace (with verbose enabled) for the write failure:
>
> Error: INTERNAL_ERROR ERROR: You tried to start when you are using a
> ValueWriter of type SingleListWriter.
>
> Fragment 0:0
>
> [Error Id: 1ae5c2ce-e1ef-40f9-afce-d1e00ac9fa15 on IMC28859.imc2.com:31010]
>
> (java.lang.IllegalStateException) You tried to start when you are
> using a ValueWriter of type SingleListWriter.
>   org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.start():78
>   org.apache.drill.exec.vector.complex.impl.SingleListWriter.start():71
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():430
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():462
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():462
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():462
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():462
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():462
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch():311
>   org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector():254
>   org.apache.drill.exec.vector.complex.fn.JsonReader.write():209
>   org.apache.drill.exec.store.easy.json.JSONRecordReader.next():214
>   org.apache.drill.exec.physical.impl.ScanBatch.next():177
>   org.apache.drill.exec.record.AbstractRecordBatch.next():119
>   org.apache.drill.exec.record.AbstractRecordBatch.next():109
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():142
>   org.apache.drill.exec.record.AbstractRecordBatch.next():172
>   org.apache.drill.exec.record.AbstractRecordBatch.next():119
>   org.apache.drill.exec.record.AbstractRecordBatch.next():109
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():142
>   org.apache.drill.exec.record.AbstractRecordBatch.next():172
>   org.apache.drill.exec.record.AbstractRecordBatch.next():119
>   org.apache.drill.exec.record.AbstractRecordBatch.next():109
>   org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():90
>   org.apache.drill.exec.record.AbstractRecordBatch.next():172
>   org.apache.drill.exec.record.AbstractRecordBatch.next():119
>   org.apache.drill.exec.record.AbstractRecordBatch.next():109
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():142
>   org.apache.drill.exec.record.AbstractRecordBatch.next():172
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():103
>   org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():93
>   org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
>   org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
>   java.security.AccessController.doPrivileged():-2
>   javax.security.auth.Subject.doAs():422
>   org.apache.hadoop.security.UserGroupInformation.doAs():1657
>   org.apache.drill.exec.work.fragment.FragmentExecutor.run():281
>   org.apache.drill.common.SelfCleaningRunnable.run():38
>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>   java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>   java.lang.Thread.run():745 (state=,code=0)
> 0: jdbc:drill:zk=local>
>
> Here is the trace with mixed mode enabled:
>
> 0: jdbc:drill:zk=local> ALTER SESSION SET `exec.enable_union_type` = true;
> +-------+----------------------------------+
> |  ok   |             summary              |
> +-------+----------------------------------+
> | true  | exec.enable_union_type updated.  |
> +-------+----------------------------------+
> 1 row selected (0.173 seconds)
> 0: jdbc:drill:zk=local> CREATE TABLE `test.parquet` as select * from `test.json`;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST
>
> Fragment 0:0
>
> [Error Id: 04db6e6a-aa66-4c4f-9573-23fc1215c638 on IMC28859.imc2.com:31010]
>
> (java.lang.UnsupportedOperationException) Unsupported type LIST
>   org.apache.drill.exec.store.parquet.ParquetRecordWriter.getType():295
>   org.apache.drill.exec.store.parquet.ParquetRecordWriter.getType():291
>   org.apache.drill.exec.store.parquet.ParquetRecordWriter.getType():291
>   org.apache.drill.exec.store.parquet.ParquetRecordWriter.getType():291
>   org.apache.drill.exec.store.parquet.ParquetRecordWriter.newSchema():226
>   org.apache.drill.exec.store.parquet.ParquetRecordWriter.updateSchema():211
>   org.apache.drill.exec.physical.impl.WriterRecordBatch.setupNewSchema():162
>   org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():108
>   org.apache.drill.exec.record.AbstractRecordBatch.next():172
>   org.apache.drill.exec.record.AbstractRecordBatch.next():119
>   org.apache.drill.exec.record.AbstractRecordBatch.next():109
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():142
>   org.apache.drill.exec.record.AbstractRecordBatch.next():172
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():103
>   org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():93
>   org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
>   org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
>   java.security.AccessController.doPrivileged():-2
>   javax.security.auth.Subject.doAs():422
>   org.apache.hadoop.security.UserGroupInformation.doAs():1657
>   org.apache.drill.exec.work.fragment.FragmentExecutor.run():281
>   org.apache.drill.common.SelfCleaningRunnable.run():38
>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>   java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>   java.lang.Thread.run():745 (state=,code=0)
> 0: jdbc:drill:zk=local>
>
> All lines are well-formed on their own, and I could infer a schema from
> them (via jsonSchema.net). The JSON is generated from XML and contains
> some irrelevant bits, such as namespace declarations, but as far as I
> can tell those should not interfere.
>
> Any help would be appreciated.
>
> Sri
