garyanaplan commented on issue #349: URL: https://github.com/apache/arrow-rs/issues/349#issuecomment-858440550
Yep. If I update my test to remove BOOLEAN from the schema, the problem goes away. I've done some digging around today and noticed that it looks like the problem might lie in the generation of the file. I previously reported that parquet-tools dump <file> would happily process the file, however I trimmed down the example to just include BOOLEAN field in schema and increased the number of rows in the group and noted the following when dumping: `value 2039: R:0 D:0 V:true value 2040: R:0 D:0 V:false value 2041: R:0 D:0 V:true value 2042: R:0 D:0 V:false value 2043: R:0 D:0 V:true value 2044: R:0 D:0 V:false value 2045: R:0 D:0 V:true value 2046: R:0 D:0 V:false value 2047: R:0 D:0 V:true value 2048: R:0 D:0 V:false value 2049: R:0 D:0 V:false value 2050: R:0 D:0 V:false value 2051: R:0 D:0 V:false value 2052: R:0 D:0 V:false value 2053: R:0 D:0 V:false value 2054: R:0 D:0 V:false value 2055: R:0 D:0 V:false ` All the values after 2048 are false and they continue to be false until the end of the file. It looks like the generated input file is invalid, so I'm going to poke around there a little next. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org