thisisnic opened a new pull request, #49338:
URL: https://github.com/apache/arrow/pull/49338

   ### Rationale for this change
   
   When reading a CSV with sparse data (many missing values followed by actual 
values), Arrow can infer a column type as `null` based on the first block of 
data. When non-null values appear later, conversion to `null` fails, and the 
error message misleadingly suggests using `skip = 1` to skip a header row.
   
   ### What changes are included in this PR?
   
   Adds a specific check for "conversion error to null" that provides a helpful 
message explaining the cause (type inference from sparse data) and the solution 
(specify column types explicitly via `col_types` or `schema`).
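   
   As a rough illustration of that workaround, the sketch below declares the 
type of a sparse column up front so the reader does not have to infer it. The 
file, the column names `id` and `x`, and the choice of `int64()` are 
hypothetical, and the file is far too small to trigger the original error. It 
only shows the shape of the fix, assuming an arrow R version in which 
`open_dataset()` accepts `col_types` for CSV sources.
   
   ```r
   library(arrow)

   # Hypothetical sparse file: `x` is empty for the first rows, then holds an integer.
   csv_path <- tempfile(fileext = ".csv")
   writeLines(c("id,x", paste0(1:5, ","), "6,42"), csv_path)

   # Declare the type of `x` explicitly instead of relying on inference
   # from the first block of data; other columns are still inferred.
   ds <- open_dataset(
     csv_path,
     format = "csv",
     col_types = schema(x = int64())
   )

   # Inspect the resulting type of `x` without reading the data.
   x_type <- ds$schema$GetFieldByName("x")$type$ToString()
   print(x_type)
   ```
   
   A partial schema like this leaves the remaining columns to be inferred as 
usual, so only the columns known to be sparse need to be listed.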
   
   ### Are these changes tested?
   
   Yes, added a test in `test-dataset-csv.R`.
   
   ### Are there any user-facing changes?
   
   Yes, improved error message when CSV type inference fails due to sparse data.
   
   ---
   
   This PR was authored by Claude (Opus 4.5) and reviewed by @thisisnic.
   
   🤖 Generated with [Claude Code](https://claude.ai/code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
