Dave Challis created DRILL-6670:
-----------------------------------

             Summary: Error in parquet record reader - previously readable file fails to be read in 1.14
                 Key: DRILL-6670
                 URL: https://issues.apache.org/jira/browse/DRILL-6670
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.14.0
            Reporter: Dave Challis


A Parquet file generated by PyArrow was readable in Apache Drill 1.12 
and 1.13, but fails to be read with 1.14.

Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
error message from the Drill web query UI:

{code}
{"code":500,"message":"SQL error while querying Drill DB Failed to create 
prepared statement: INTERNAL_ERROR ERROR: Error in parquet record 
reader.\nMessage: Failure in setting up reader\nParquet Metadata: 
ParquetMetaData{FileMetaData{schema: message schema {\n  optional binary name 
(UTF8);\n  optional binary creation_parameters (UTF8);\n  optional int64 
creation_date (TIMESTAMP_MICROS);\n  optional int32 data_version;\n  optional 
int32 schema_version;\n}\n, metadata: {pandas={\"index_columns\": [], 
\"column_indexes\": [], \"columns\": [{\"name\": \"name\", \"field_name\": 
\"name\", \"pandas_type\": \"unicode\", \"numpy_type\": \"object\", 
\"metadata\": null}, {\"name\": \"creation_parameters\", \"field_name\": 
\"creation_parameters\", \"pandas_type\": \"unicode\", \"numpy_type\": 
\"object\", \"metadata\": null}, {\"name\": \"creation_date\", \"field_name\": 
\"creation_date\", \"pandas_type\": \"datetime\", \"numpy_type\": 
\"datetime64[ns]\", \"metadata\": null}, {\"name\": \"data_version\", 
\"field_name\": \"data_version\", \"pandas_type\": \"int32\", \"numpy_type\": 
\"int32\", \"metadata\": null}, {\"name\": \"schema_version\", \"field_name\": 
\"schema_version\", \"pandas_type\": \"int32\", \"numpy_type\": \"int32\", 
\"metadata\": null}], \"pandas_version\": \"0.22.0\"}}}, blocks: 
[BlockMetaData{1, 27142 [ColumnMetaData{SNAPPY [name] optional binary name 
(UTF8)  [PLAIN, RLE], 4}, ColumnMetaData{SNAPPY [creation_parameters] optional 
binary creation_parameters (UTF8)  [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
[creation_date] optional int64 creation_date (TIMESTAMP_MICROS)  [PLAIN, RLE], 
46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version  
[PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
schema_version  [PLAIN, RLE], 46593}]}]}\n\nFragment 0:0\n\n[Error Id: 
7c76ae97-03e3-4fab-9125-ec19fc572bf5 on f9d0456cddd2:31010]"}
{code}
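
For anyone trying to reproduce this outside the web query UI, the same query could be submitted through Drill's REST API. The snippet below is only a minimal sketch: it assumes a Drillbit reachable on localhost with the default web port 8047 and no authentication, and the helper names (build_query_request, run_query) are made up for illustration. The file name foo.parquet is the one from the query above.

```python
import json
import urllib.request

# Assumption: Drill's web/REST interface is listening on the default port 8047.
DRILL_URL = "http://localhost:8047/query.json"


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the decoded JSON response."""
    with urllib.request.urlopen(build_query_request(sql, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling run_query("SELECT * FROM dfs.`foo.parquet`") against 1.13 and then 1.14 with the same file should make it easier to confirm that the failure depends on the Drill version rather than on the web UI.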



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
