Re: Parquet vectorized reader DELTA_BYTE_ARRAY

2017-05-25 Thread andreiL
I took a closer look and, yes, the files were written with Parquet v2. For some reason Parquet v2 was set as the default; I set it back to Parquet v1. Thanks, Michael and Ryan, for the info. Andrei.
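The message does not say how the writer version was changed. One common route, sketched here as an assumption, is the Parquet property parquet.writer.version, passed to Spark with the spark.hadoop. prefix so that it reaches the Hadoop configuration used by the Parquet output format. The helper name parquet_writer_conf is hypothetical; it only builds the key/value pair.

```python
# Property read by the Parquet output format to pick a writer version.
PARQUET_WRITER_VERSION_KEY = "parquet.writer.version"

# Enum names that parquet-mr uses for the two writer versions.
_WRITER_VERSIONS = {"v1": "PARQUET_1_0", "v2": "PARQUET_2_0"}


def parquet_writer_conf(version="v1"):
    """Return Spark config entries that select a Parquet writer version.

    Hypothetical helper: the `spark.hadoop.` prefix is assumed to be how
    the setting reaches the Hadoop configuration in this deployment.
    """
    if version not in _WRITER_VERSIONS:
        raise ValueError("version must be 'v1' or 'v2', got %r" % (version,))
    key = "spark.hadoop." + PARQUET_WRITER_VERSION_KEY
    return {key: _WRITER_VERSIONS[version]}


# Usage (assumes pyspark is installed):
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder
#   for key, value in parquet_writer_conf("v1").items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
```

Files written with v1 avoid DELTA_BYTE_ARRAY, which is the encoding the Spark 2.1 vectorized reader rejected in this thread.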

Re: Parquet vectorized reader DELTA_BYTE_ARRAY

2017-05-22 Thread Ryan Blue
Michael is right, the delta byte array encoding is a Parquet v2 feature. Parquet v2 isn't finished yet, though some features are in releases and those features will be supported in future releases. In other words, Parquet will maintain backward-compatibility for any released v2 features. I don't r

Re: Parquet vectorized reader DELTA_BYTE_ARRAY

2017-05-22 Thread Michael Allman
Hi AndreiL,

Were these files written with the Parquet V2 writer? The Spark 2.1 vectorized reader does not appear to support that format.

Michael

> On May 9, 2017, at 11:04 AM, andreiL wrote:
>
> Hi, I am getting an exception in Spark 2.1 reading parquet files where some
> columns are DELTA_B

Parquet vectorized reader DELTA_BYTE_ARRAY

2017-05-09 Thread andreiL
Hi, I am getting an exception in Spark 2.1 reading parquet files where some columns are DELTA_BYTE_ARRAY encoded.

java.lang.UnsupportedOperationException: Unsupported encoding: DELTA_BYTE_ARRAY

Is this exception by design, or am I missing something? If I turn off the vectorized reader, reading t
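
For reference, the workaround mentioned in this message (turning off the vectorized reader) corresponds to the Spark SQL option spark.sql.parquet.enableVectorizedReader. A minimal sketch follows, assuming `spark` is an existing SparkSession; the helper name disable_vectorized_parquet_reader is hypothetical and the read path is a placeholder.

```python
# Spark SQL option that switches the vectorized Parquet reader on or off.
VECTORIZED_READER_KEY = "spark.sql.parquet.enableVectorizedReader"


def disable_vectorized_parquet_reader(spark):
    """Turn off the vectorized Parquet reader for this session.

    `spark` is assumed to be an existing SparkSession; only its
    `conf.set(key, value)` method is used here.
    """
    spark.conf.set(VECTORIZED_READER_KEY, "false")
    return spark


# Usage (assumes pyspark is installed; the path is a placeholder):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   disable_vectorized_parquet_reader(spark)
#   df = spark.read.parquet("/path/to/files")
```

The non-vectorized reader is generally the slower path, so this trades read speed for the ability to read the DELTA_BYTE_ARRAY columns.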