Andrei Stankevich created PARQUET-1046:
------------------------------------------

             Summary: Impossible to read thrift object from parquet file if it has List<Enum> field that was removed from thrift schema.
                 Key: PARQUET-1046
                 URL: https://issues.apache.org/jira/browse/PARQUET-1046
             Project: Parquet
          Issue Type: Bug
            Reporter: Andrei Stankevich

If a Thrift class has a field of type List<some_enum>, ParquetReader reports the type of the list's elements as enum (type id = 16), but it should report Int32.

In the Java classes generated from a Thrift schema file, every field declared as an enum has field type Int32. The same is true for List fields whose elements are enums. But when ParquetReader creates an object, it uses type enum for the list's elements instead of Int32.

Because of this, we cannot remove a list field if it has enum elements. If we remove such a field from the schema file but it is still present in the Parquet file, ParquetReader tries to skip it when reading, because the field is not in the schema. It calls TProtocolUtil.skip with type = 15 for the list, and then calls the same method for each list element with type = 16 for enum. TProtocolUtil.skip does not have this type in its switch-case, so it does not skip the list elements, and an exception is thrown later when it tries to skip the list end.
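
The failure mode described in this report can be modeled with a small standalone sketch. It does not use the real Thrift or Parquet classes: `SkipSketch`, `Token`, `skipList`, and `skipSucceeds` are hypothetical names made up for illustration, and the `treatEnumAsI32` flag is only one possible remedy, not necessarily the fix the project would adopt. Only the type ids (I32 = 8, LIST = 15, ENUM = 16) are taken from Thrift's `TType`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of skipping an unknown list field. NOT the real
// TProtocolUtil; all names here are hypothetical.
final class SkipSketch {
    // Type ids as defined in Thrift's TType.
    static final byte I32 = 8, LIST = 15, ENUM = 16;

    // One token per value or marker in the (hypothetical) input stream.
    enum Token { LIST_BEGIN, I32_VALUE, LIST_END }

    static void expect(Deque<Token> in, Token wanted) {
        Token got = in.poll();
        if (got != wanted) {
            throw new IllegalStateException("expected " + wanted + " but got " + got);
        }
    }

    // Skips one element. A type id without a matching case consumes nothing,
    // which is the behavior the report describes for ENUM (16).
    static void skipElement(Deque<Token> in, byte type, boolean treatEnumAsI32) {
        if (type == ENUM && treatEnumAsI32) {
            type = I32; // assumed remedy: enums are stored as I32 values
        }
        switch (type) {
            case I32:
                expect(in, Token.I32_VALUE);
                break;
            default:
                break; // unknown type id: element is left in the stream
        }
    }

    static void skipList(Deque<Token> in, byte elemType, int size, boolean treatEnumAsI32) {
        expect(in, Token.LIST_BEGIN);
        for (int i = 0; i < size; i++) {
            skipElement(in, elemType, treatEnumAsI32);
        }
        expect(in, Token.LIST_END); // fails if elements were not consumed
    }

    // Builds a stream holding a two-element list and tries to skip it.
    static boolean skipSucceeds(byte elemType, boolean treatEnumAsI32) {
        Deque<Token> in = new ArrayDeque<>();
        in.add(Token.LIST_BEGIN);
        in.add(Token.I32_VALUE);
        in.add(Token.I32_VALUE);
        in.add(Token.LIST_END);
        try {
            skipList(in, elemType, 2, treatEnumAsI32);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(skipSucceeds(ENUM, false)); // false: list end not found
        System.out.println(skipSucceeds(ENUM, true));  // true
        System.out.println(skipSucceeds(I32, false));  // true
    }
}
```

In this model the skip succeeds either when the reader reports the element type as I32 or when the skip routine maps ENUM to I32, which matches the report's claim that the element type should be Int32.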