Prathamesh9284 opened a new pull request, #692:
URL: https://github.com/apache/wayang/pull/692

   ## Summary
   
   Closes #690
   
   Improves CSV error handling in the SQL API filesystem source 
(`JavaCSVTableSource`) to provide clear, actionable error messages when CSV 
files are malformed or misconfigured.
   
   ### Changes
   
   - **Improved error message in `parseLine`**: Shows expected vs actual column 
count, the separator used, the offending line, and a hint about the required 
Calcite header format (`name:type`)
   - **Added `validateHeaderLine` method**: Validates the CSV header before 
data parsing begins. It checks that the comma-separated column count matches the 
table schema and that each column follows the `name:type` format
   - **Added empty file detection in `streamLines`**: Throws a clear error if 
the CSV file has no lines at all
   - **Removed `static` from `streamLines`**: Required to access instance 
fields (`fieldTypes`, `sourcePath`) for header validation
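
The header check described in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: the class name `HeaderValidationSketch`, the method signature, and the exception type are assumptions made for illustration. Only the message wording follows the examples quoted in this description.

```java
public class HeaderValidationSketch {

    /**
     * Hypothetical stand-in for validateHeaderLine. Throws if the header does not
     * have the expected number of comma-separated columns, or if any column is
     * not written as name:type.
     */
    public static void validateHeaderLine(String header, int expectedColumns, String sourcePath) {
        // The header is split on commas, independent of the data-row separator.
        String[] columns = header.split(",", -1);
        if (columns.length != expectedColumns) {
            throw new IllegalStateException(String.format(
                    "CSV file '%s': header has %d comma-separated columns but table schema expects %d.",
                    sourcePath, columns.length, expectedColumns));
        }
        for (String column : columns) {
            String trimmed = column.trim();
            int colon = trimmed.indexOf(':');
            // Require a non-empty name before the colon and a non-empty type after it.
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalStateException(String.format(
                        "CSV file '%s': header column '%s' missing required type. Expected 'name:type' format.",
                        sourcePath, trimmed));
            }
        }
    }

    public static void main(String[] args) {
        // A well-formed typed header passes silently.
        validateHeaderLine("id:int,name:string,email:string", 3, "users.csv");
        try {
            // A header written with ';' is seen as a single comma-separated column.
            validateHeaderLine("id;name;email", 3, "users.csv");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running the check once on the first line, before any data row is parsed, means a misconfigured file is reported as a header problem rather than as a parse failure on some later row.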
   
   ### Context
   
   Calcite's CSV adapter requires a typed header row (e.g., 
`id:int,name:string,email:string`) using **commas**, while data rows use 
Wayang's configurable separator (default `;`). Without a proper header, the 
previous error was:
   
   > `Error while parsing CSV file ... at line ..., using separator ;`
   
   This gave no indication of what was actually wrong. The new errors name the 
specific problem:
   
   - `CSV file '...': header has 1 comma-separated columns but table schema 
expects 4.`
   - `CSV file '...': header column 'NAMEA' missing required type. Expected 
'name:type' format.`
   - `Column count mismatch in CSV file '...': expected 4 columns but found 1 
(separator ';').`
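
To illustrate the column count mismatch message for data rows, here is a hedged sketch of a separator-aware check. The class `ParseLineSketch`, its signature, the exception type, and the wording of the trailing "Line:" part are assumptions, not the PR's actual `parseLine`. The split is deliberately naive and does not handle quoted fields.

```java
import java.util.regex.Pattern;

public class ParseLineSketch {

    /**
     * Hypothetical stand-in for parseLine. Splits a data row on the configured
     * separator and fails with a descriptive message if the column count is wrong.
     */
    public static String[] parseLine(String line, char separator, int expectedColumns, String sourcePath) {
        // Pattern.quote keeps separators such as '|' or '.' from being read as regex syntax.
        String[] tokens = line.split(Pattern.quote(String.valueOf(separator)), -1);
        if (tokens.length != expectedColumns) {
            throw new IllegalStateException(String.format(
                    "Column count mismatch in CSV file '%s': expected %d columns but found %d (separator '%c'). Line: '%s'",
                    sourcePath, expectedColumns, tokens.length, separator, line));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Data rows use the configurable separator (';' here), unlike the comma-separated header.
        String[] row = parseLine("1;alice;alice@example.org", ';', 3, "users.csv");
        System.out.println(row.length);
        try {
            // A comma-separated data row collapses into one column when ';' is expected.
            parseLine("1,alice,alice@example.org", ';', 3, "users.csv");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```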

