Hello, Does anyone know whether the Parquet format is generally poorly suited to, or slow at, reading and writing VARCHAR fields? I am investigating why reading a Parquet file with 5 columns (BIGINT(20), BIGINT(20), SMALLINT(6), SMALLINT(6), VARCHAR(255)) takes longer than reading a plain CSV file.
Reading ALL the columns takes about 2 ms for the CSV file versus 650 ms for a Parquet file with the same data. There are only 700 rows in the table. Does anyone have any information about this? I suspect the Parquet format's overhead is proportionally larger for small files. Thanks, Pratik
