As you suggest, 700 rows is tiny.
If you want to benchmark this, use a file around 1GB.
Also expect to save more time when you read only a subset of the columns.
To answer your question, Parquet supports VARCHARs fine.
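
For what it's worth, a benchmark along those lines might look roughly like the
sketch below. It uses pyarrow purely for illustration (an assumption, not
necessarily what's in use here), and "data.csv", "data.parquet", and the column
names are placeholders; the point is timing all columns versus a subset on a
file large enough for the format differences to matter.

    import time
    import pyarrow.csv as csv
    import pyarrow.parquet as pq

    def timed(label, fn):
        # Run one read and report wall-clock time plus row count.
        start = time.perf_counter()
        table = fn()
        print(f"{label}: {time.perf_counter() - start:.3f}s, {table.num_rows} rows")

    # Reading every column: CSV vs. Parquet on the same (large) data set.
    timed("csv, all columns", lambda: csv.read_csv("data.csv"))
    timed("parquet, all columns", lambda: pq.read_table("data.parquet"))

    # Reading only two columns: this is where Parquet's columnar layout
    # should pay off, since the other columns are never touched.
    timed("parquet, two columns",
          lambda: pq.read_table("data.parquet", columns=["id", "name"]))
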

On Sep 17, 2014, at 6:36 PM, pratik khadloya wrote:

> Hello,
> 
> Does anyone know if the Parquet format is generally not well suited to, or
> slow at, reading and writing VARCHAR fields? I am currently investigating why
> it takes longer to read a Parquet file that has 5 columns: BIGINT(20),
> BIGINT(20), SMALLINT(6), SMALLINT(6), VARCHAR(255), than reading a simple
> CSV file.
> 
> For reading ALL the columns, it takes about 2ms to read a CSV file vs 650ms
> for a Parquet file with the same data. There are only 700 rows in the table.
> 
> Does anyone have any information about this?
> I suspect the overhead of the Parquet format is higher for smaller files.
> 
> Thanks,
> Pratik
