theosib-amazon commented on PR #957: URL: https://github.com/apache/parquet-mr/pull/957#issuecomment-1126305627
I won't be able to add a test any time soon. Here's why.

First, take note of the two Parquet files attached to https://issues.apache.org/jira/browse/PARQUET-2069. With my own Parquet reader implementation, the fix in this PR makes the "modified.parquet" file readable by ParquetMR.

So what I did was copy org.apache.parquet.avro.TestBackwardCompatibility and modify it to read a new Parquet file that I added to the resources folder. If I point my new test, TestArrayListCompatibility, at original.parquet, it reads just fine and the test passes. But if I point it at modified.parquet, I get an exception whether or not this PR's fix is applied. And the exception thrown is not the one described in the bug report. Instead, I get this:

org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field 'elements' not found

This has exposed some other bug in Parquet/Avro.

The thing is, since this isn't reproducible with my own reader, the only way to reproduce it is to use the tests built into ParquetMR. But due to ParquetMR's unfortunate reliance on runtime-generated code, it's impossible to run tests from the IDE, which makes them incredibly difficult to debug. If someone has a solution to that problem, I'd really appreciate some help.