[ https://issues.apache.org/jira/browse/ARROW-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284762#comment-17284762 ]
Matthias Rosenthaler edited comment on ARROW-11629 at 3/29/21, 6:54 AM:
------------------------------------------------------------------------

[~jorisvandenbossche],

In Apache Drill the values are all displaced. The columns don't line up.

[https://github.com/mukunku/ParquetViewer] v. 2.3.0 returns 0 values although the values are not 0 or empty in the CSV.

An older version of this Parquet viewer throws the attached exception.

!image-2021-02-15-15-49-41-908.png!


was (Author: matthros):
[~jorisvandenbossche],

In Apache Drill the values are all displaced. The columns don't line up.

[https://github.com/mukunku/ParquetViewer] v. 2.0.2 returns 0 values although the values are not 0 or empty in the CSV.

An older version of this Parquet viewer throws the attached exception.

!image-2021-02-15-15-49-41-908.png!

> [C++] Writing float32 values with "Dictionary Encoding" makes parquet files
> not readable for some tools
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-11629
>                 URL: https://issues.apache.org/jira/browse/ARROW-11629
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 3.0.0
>            Reporter: Matthias Rosenthaler
>            Priority: Major
>         Attachments: foo.parquet, image-2021-02-15-15-49-41-908.png,
> output.csv, output.parquet
>
>
> If I read the attached CSV file with pyarrow, change the float64
> columns to float32, and export it to Parquet, the Parquet file gets corrupted.
> It is no longer readable by Apache Drill or Parquet.Net.
>
> Update: Bug in "*Dictionary Encoding*" feature. If I switch it off for
> float32 columns, everything works as expected.