wjones127 commented on pull request #12216:
URL: https://github.com/apache/arrow/pull/12216#issuecomment-1022834715


   @emkornfield I have debugged further and believe I have narrowed down the 
approximate place where the data is being corrupted, though it's very strange. 
I added two `ValidateFull()` calls that appear to bracket the corruption: the 
one at `parquet/arrow/reader_internal.cc:780` passes, but the one at 
`parquet/arrow/reader.cc:482` fails. This is the error I get when I run:
   
   ```
   56: /Users/willjones/Documents/arrows/arrow/cpp/src/parquet/arrow/reader.cc:482:  Check failed: _s.ok() Operation failed: out_->ValidateFull()
   56: Bad status: Invalid: In chunk 0: Invalid: null_count value (854) doesn't match actual number of nulls in array (861)
   56: /Users/willjones/Documents/arrows/arrow/cpp/src/arrow/array/validate.cc:118  ValidateNulls(*data.type)
   ```
   
   The problem is that between those two points I see nothing that could alter 
the array. I am running this with `OMP_NUM_THREADS=1` and `OMP_THREAD_LIMIT=1`, 
and I can confirm that in my debugger I see only two threads (one worker and 
one worker loop). So I'm very confused about what could be going on between the 
two checks.
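
For readers unfamiliar with the check behind the error above: it compares the array's cached `null_count` with the number of unset bits in its validity bitmap. The following is a minimal standalone sketch of that idea, not Arrow's actual implementation. `ToyArray`, `CountNulls`, and `NullCountMatches` are hypothetical names invented for illustration, and the bitmap is assumed to be least-significant-bit first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an array's validity data; NOT an Arrow type.
struct ToyArray {
  std::vector<uint8_t> validity;  // one bit per slot, LSB first; set bit = valid
  int64_t length;                 // number of slots
  int64_t null_count;             // cached count, stored separately from the bitmap
};

// Count the slots whose validity bit is unset.
int64_t CountNulls(const ToyArray& a) {
  int64_t nulls = 0;
  for (int64_t i = 0; i < a.length; ++i) {
    const bool valid = (a.validity[static_cast<size_t>(i / 8)] >> (i % 8)) & 1;
    if (!valid) ++nulls;
  }
  return nulls;
}

// True when the cached count agrees with the bitmap.
bool NullCountMatches(const ToyArray& a) {
  return a.null_count == CountNulls(a);
}
```

In this toy model, a mismatch like the one reported (854 cached versus 861 counted) would mean that either the bitmap or the cached count changed after the earlier check passed, without the other being updated.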


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
