As you suggest, 700 rows is tiny. If you want to benchmark this, you'd want a file around 1 GB. Also, expect the savings to be larger when you read only a subset of the columns. To answer your question: Parquet handles VARCHAR fields fine.
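For example, a minimal benchmark sketch along these lines (assuming a Python environment with pyarrow and pandas; the file and column names here are placeholders):

    # Hypothetical benchmark: compare reading all columns vs. a column subset
    # from a Parquet file, with a CSV read as a baseline.
    import time
    import pandas as pd
    import pyarrow.parquet as pq

    def time_read(fn, label):
        start = time.perf_counter()
        result = fn()
        print(f"{label}: {time.perf_counter() - start:.3f}s, {len(result)} rows")

    # Full scan of every column.
    time_read(lambda: pq.read_table("data.parquet").to_pandas(),
              "parquet, all columns")

    # Projection: only two columns, so the VARCHAR column is never decoded.
    time_read(lambda: pq.read_table("data.parquet", columns=["col1", "col2"]).to_pandas(),
              "parquet, 2 columns")

    # CSV baseline for comparison.
    time_read(lambda: pd.read_csv("data.csv"), "csv, all columns")

At ~1 GB the column projection should make the difference obvious; at 700 rows you're mostly measuring fixed per-file overhead (footer, metadata, decoder setup).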
On Sep 17, 2014, at 6:36 PM, pratik khadloya wrote:

> Hello,
>
> Does anyone know if the Parquet format is generally not suited well or slow
> for reading and writing VARCHAR fields? I am currently investigating why it
> takes longer to read a parquet file which has 5 cols BIGINT(20),
> BIGINT(20), SMALLINT(6), SMALLINT(6), VARCHAR(255) than reading a simple
> csv file.
>
> For reading ALL the columns, it takes about 2ms to read a csv file vs 650ms
> for a Parquet file with the same data. There are only 700 rows in the table.
>
> Does anyone have any information about it?
> I suspect the overhead of parquet format is more for smaller files.
>
> Thanks,
> Pratik
