theosib-amazon commented on PR #957:
URL: https://github.com/apache/parquet-mr/pull/957#issuecomment-1126305627

   I won't be able to add a test any time soon. Here's why.
   
   First, take note of the two Parquet files attached to 
https://issues.apache.org/jira/browse/PARQUET-2069.
   
   When I use my own Parquet reader, the fix in this PR makes the 
"modified.parquet" file readable by ParquetMR. To turn that into a test, I 
copied org.apache.parquet.avro.TestBackwardCompatibility and modified it to 
read a new Parquet file that I added to the resources folder. If I point my new 
test, TestArrayListCompatibility, at original.parquet, it reads just fine and 
the test passes. But if I point it at modified.parquet, I get an exception 
whether or not this PR's fix is applied, and the exception thrown is not the 
one described in the bug report. Instead, I get this:
   
   org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field 'elements' not found
   
   This exposes a separate bug in Parquet/Avro. Since it isn't reproducible 
with my own reader, the only way to reproduce it is through the tests built 
into ParquetMR. But because of ParquetMR's unfortunate reliance on 
runtime-generated code, it's impossible to run the tests from the IDE, which 
makes them incredibly difficult to debug. If someone has a solution to that 
problem, I'd really appreciate some help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
