Kevin created ARROW-14432:
-----------------------------

             Summary: created_by is not exposed in the python wrapper, creating reader side issue.
                 Key: ARROW-14432
                 URL: https://issues.apache.org/jira/browse/ARROW-14432
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Kevin

The current Python wrapper does NOT expose created_by:
[https://github.com/apache/arrow/blob/master/python/pyarrow/_parquet.pxd#L361]

However, it is available in the C++ version:
[https://github.com/apache/arrow/blob/4591d76fce2846a29dac33bf01e9ba0337b118e9/cpp/src/parquet/properties.h#L249]
[https://github.com/apache/arrow/blob/master/python/pyarrow/_parquet.pxd#L320]

This causes a problem when the Hadoop Parquet reader reads a Parquet file written by pyarrow. See Stack Overflow:
[https://stackoverflow.com/questions/69658140/how-to-save-a-parquet-with-pandas-using-same-header-than-hadoop-spark-parquet?noredirect=1#comment123131862_69658140]

Development effort should be minimal.