[ https://issues.apache.org/jira/browse/PARQUET-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated PARQUET-1594:
------------------------------------
    Fix Version/s:     (was: format-2.3.1)

> Parquet File is not able to read from Spark and Hive
> ----------------------------------------------------
>
>                 Key: PARQUET-1594
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1594
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Prashanth pampanna desai
>            Priority: Major
>
> Issue: Caused by: java.io.IOException: Expected 35393 values in column chunk
> at maprfs:////path/date=20190605/caa63aa9-abfa-46e1-8221-10f6c669512d.parquet
> offset 4 but got 46402 values instead over 2 pages ending at file offset
> 341624
>
> We receive Avro-serialized messages from Kafka, which are consumed by
> Spring-kafka, converted to Parquet, and persisted to the MaprFS (HDFS)
> file system.
>
> I tried replicating the issue locally with the same Avro file, but I was
> able to read the Parquet file successfully. I am not sure why the Parquet
> file is being corrupted in HDFS.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)