[
https://issues.apache.org/jira/browse/PARQUET-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821152#comment-16821152
]
Hatem Helal commented on PARQUET-1565:
--------------------------------------
This is a somewhat esoteric problem, but the fix seems to be to extend [this
switch
case|https://github.com/apache/arrow/blob/master/cpp/src/parquet/arrow/schema.cc#L174]
to handle the corrupted Thrift metadata.
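To illustrate the idea, here is a minimal, self-contained sketch of a switch that rejects unrecognized enum values instead of falling through. It is not the actual code in schema.cc: the names PhysicalType, ConversionResult, and ToTypeName are hypothetical, and the enum values are chosen only for illustration. The assumption is that corrupt metadata can decode to an integer that matches no known enumerator.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a type enum decoded from file metadata.
// The numeric values are illustrative only.
enum class PhysicalType : int32_t { BOOLEAN = 0, INT32 = 1, INT64 = 2, DOUBLE = 5 };

// Hypothetical result type: either a type name or an error message.
struct ConversionResult {
  bool ok;
  std::string message;
};

// Maps a raw integer read from (possibly corrupt) metadata to a type name.
// The default branch reports an error for any value that is not a known
// enumerator, so the caller never continues with an unset result.
ConversionResult ToTypeName(int32_t raw_type) {
  switch (static_cast<PhysicalType>(raw_type)) {
    case PhysicalType::BOOLEAN:
      return {true, "bool"};
    case PhysicalType::INT32:
      return {true, "int32"};
    case PhysicalType::INT64:
      return {true, "int64"};
    case PhysicalType::DOUBLE:
      return {true, "double"};
    default:
      return {false, "unrecognized physical type: " + std::to_string(raw_type)};
  }
}
```

With this shape, a call such as ToTypeName(99) yields a result with ok set to false and a descriptive message, which the caller can turn into an error status rather than dereferencing a null pointer later on.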
> [C++] SEGV in FromParquetSchema with corrupt file from PARQUET-1481
> -------------------------------------------------------------------
>
> Key: PARQUET-1565
> URL: https://issues.apache.org/jira/browse/PARQUET-1565
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Affects Versions: cpp-1.6.0
> Reporter: Hatem Helal
> Assignee: Hatem Helal
> Priority: Minor
>
> Calling {{parquet::arrow::FromParquetSchema}} when reading the corrupt file
> attached to PARQUET-1481 results in a SEGV. I'm not sure when this was
> introduced but I didn't observe this problem with our app that uses
> parquet-cpp v1.4.0. Our team caught this while integrating Arrow 0.12.1 into
> MATLAB.
> To reproduce this, add the following lines to
> [parquet-reader.cc|https://github.com/apache/arrow/blob/master/cpp/tools/parquet/parquet-reader.cc#L66],
> build, and try to read the corrupt file attached to PARQUET-1481.
> {code:java}
> const auto parquet_schema = reader->metadata()->schema();
> std::shared_ptr<::arrow::Schema> arrow_schema;
> PARQUET_THROW_NOT_OK(
>     parquet::arrow::FromParquetSchema(parquet_schema, &arrow_schema));
> {code}
>