officialasishkumar opened a new pull request, #4375:
URL: https://github.com/apache/texera/pull/4375

   ### What changes were proposed in this PR?
   
   `ParallelCSVScanSourceOpDesc.getPhysicalOp` called `customDelimiter.get` without
   first checking whether the `Option` is defined. When `customDelimiter` is `None`
   (the field's default), this throws a `NoSuchElementException` before the fallback
   comma delimiter can be applied.
   
   **Before:**
   ```scala
   if (customDelimiter.get.isEmpty) {   // throws NoSuchElementException when None
     customDelimiter = Option(",")
   }
   ```
   
   **After:**
   ```scala
   if (customDelimiter.isEmpty || customDelimiter.get.isEmpty) {
     customDelimiter = Option(",")
   }
   ```
   
   This brings the parallel variant in line with `CSVScanSourceOpDesc`, which has
   always used the correct two-part guard.
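   
   For reviewers who want to reproduce the failure mode outside the operator, here is a minimal standalone sketch. `DelimiterGuardSketch` and its methods are hypothetical names used only for illustration and are not part of Texera. They operate on a plain `Option[String]` rather than the descriptor's mutable field. `effectiveDelimiter` shows one possible more compact spelling of the same fallback, not what this PR changes.
   
   ```scala
   import scala.util.Try
   
   // Hypothetical standalone sketch; none of these names exist in Texera.
   object DelimiterGuardSketch {
     // Old check: calls .get unconditionally, so None throws NoSuchElementException.
     def unguarded(customDelimiter: Option[String]): Option[String] =
       if (customDelimiter.get.isEmpty) Option(",") else customDelimiter
   
     // Two-part guard as in this PR: isEmpty short-circuits before .get is reached.
     def guarded(customDelimiter: Option[String]): Option[String] =
       if (customDelimiter.isEmpty || customDelimiter.get.isEmpty) Option(",")
       else customDelimiter
   
     // Possible alternative spelling: drop empty strings, then fall back to a comma.
     def effectiveDelimiter(customDelimiter: Option[String]): String =
       customDelimiter.filter(_.nonEmpty).getOrElse(",")
   
     def main(args: Array[String]): Unit = {
       println(Try(unguarded(None)).isFailure) // the old check fails on None
       println(guarded(None))                  // falls back to the comma
       println(guarded(Some("")))              // falls back to the comma
       println(guarded(Some(";")))             // keeps the custom delimiter
     }
   }
   ```
   
   The guard relies on `||` short-circuiting: when `customDelimiter.isEmpty` is true, `customDelimiter.get` is never evaluated.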
   
   ### Any related issues, documentation, discussions?
   
   Closes #4374
   
   ### How was this PR tested?
   
   Two new test cases were added to `CSVScanSourceOpDescSpec`:
   
   1. `"use comma as the default delimiter when customDelimiter is not set for parallel CSV"`: verifies that `getPhysicalOp` does not throw when `customDelimiter` is `None` and that the default `,` is applied.
   2. `"use comma as the default delimiter when customDelimiter is empty string for parallel CSV"`: same verification for `Some("")`.
   
   The existing parallel-CSV schema-inference tests continue to pass unchanged.
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   No.

