Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-29 Thread via GitHub


mbutrovich commented on PR #1765:
URL: 
https://github.com/apache/datafusion-comet/pull/1765#issuecomment-2919178189

   It's changing an exception message that it isn't even matching:
   ```
   Expected :"...o bypass this error.[]"
   Actual   :"...o bypass this error.[ (of class java.lang.String)]"
   ```
   I'll look at it this morning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-28 Thread via GitHub


parthchandra commented on PR #1765:
URL: 
https://github.com/apache/datafusion-comet/pull/1765#issuecomment-2917912411

   @mbutrovich looks like this is causing ci failures.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-28 Thread via GitHub


codecov-commenter commented on PR #1765:
URL: 
https://github.com/apache/datafusion-comet/pull/1765#issuecomment-2917808380

   ## 
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   Attention: Patch coverage is `34.21053%` with `25 lines` in your changes 
missing coverage. Please review.
   > Project coverage is 33.05%. Comparing base 
[(`f09f8af`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/f09f8af64c6599255e116a376f4f008f2fd63b43?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`5591758`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/5591758672846a571d0094d04beaefccc7f14a1d?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 222 commits behind head on main.
   
   | [Files with missing 
lines](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...ain/scala/org/apache/comet/CometExecIterator.scala](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?src=pr&el=tree&filepath=spark%2Fsrc%2Fmain%2Fscala%2Forg%2Fapache%2Fcomet%2FCometExecIterator.scala&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9jb21ldC9Db21ldEV4ZWNJdGVyYXRvci5zY2FsYQ==)
 | 37.50% | [15 Missing :warning: 
](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   | 
[...n/scala/org/apache/comet/rules/CometScanRule.scala](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?src=pr&el=tree&filepath=spark%2Fsrc%2Fmain%2Fscala%2Forg%2Fapache%2Fcomet%2Frules%2FCometScanRule.scala&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9jb21ldC9ydWxlcy9Db21ldFNjYW5SdWxlLnNjYWxh)
 | 28.57% | [6 Missing and 4 partials :warning: 
](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ##   main#1765   +/-   ##
   =
   - Coverage 56.12%   33.05%   -23.08% 
   + Complexity  976  785  -191 
   =
 Files   119  129   +10 
 Lines 1174312535  +792 
 Branches   2251 2354  +103 
   =
   - Hits   6591 4143 -2448 
   - Misses 4012 7432 +3420 
   + Partials   1140  960  -180 
   ```
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1765?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
:rocket: New features to boost your workflow: 
   
   - :snowflake: [Test 
Analytics](https://docs.codecov.com/docs/test-analytics): Detect flaky tests, 
report on failures, and find test suite problems.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-22 Thread via GitHub


mbutrovich commented on code in PR #1765:
URL: https://github.com/apache/datafusion-comet/pull/1765#discussion_r2102461675


##
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##
@@ -133,23 +136,51 @@ class CometExecIterator(
 }
   }
 
-  def getNextBatch(): Option[ColumnarBatch] = {
+  private def getNextBatch: Option[ColumnarBatch] = {
 assert(partitionIndex >= 0 && partitionIndex < numParts)
 
 if (tracingEnabled) {
   traceMemoryUsage()
 }
 
 val ctx = TaskContext.get()
-withTrace(
-  s"getNextBatch[JVM] stage=${ctx.stageId()}",
-  tracingEnabled, {
-nativeUtil.getNextBatch(
-  numOutputCols,
-  (arrayAddrs, schemaAddrs) => {
-nativeLib.executePlan(ctx.stageId(), partitionIndex, plan, 
arrayAddrs, schemaAddrs)
-  })
-  })
+
+try {

Review Comment:
   We'll have to think about that design since every error handler could 
necessitate a different set of configs/context to be sent over for correct 
behavior on the native side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-22 Thread via GitHub


parthchandra commented on code in PR #1765:
URL: https://github.com/apache/datafusion-comet/pull/1765#discussion_r2103085875


##
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##
@@ -133,23 +136,51 @@ class CometExecIterator(
 }
   }
 
-  def getNextBatch(): Option[ColumnarBatch] = {
+  private def getNextBatch: Option[ColumnarBatch] = {
 assert(partitionIndex >= 0 && partitionIndex < numParts)
 
 if (tracingEnabled) {
   traceMemoryUsage()
 }
 
 val ctx = TaskContext.get()
-withTrace(
-  s"getNextBatch[JVM] stage=${ctx.stageId()}",
-  tracingEnabled, {
-nativeUtil.getNextBatch(
-  numOutputCols,
-  (arrayAddrs, schemaAddrs) => {
-nativeLib.executePlan(ctx.stageId(), partitionIndex, plan, 
arrayAddrs, schemaAddrs)
-  })
-  })
+
+try {

Review Comment:
   Maybe log an issue to see if the error/exception translation can be 
improved? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-22 Thread via GitHub


mbutrovich commented on code in PR #1765:
URL: https://github.com/apache/datafusion-comet/pull/1765#discussion_r2102461675


##
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##
@@ -133,23 +136,51 @@ class CometExecIterator(
 }
   }
 
-  def getNextBatch(): Option[ColumnarBatch] = {
+  private def getNextBatch: Option[ColumnarBatch] = {
 assert(partitionIndex >= 0 && partitionIndex < numParts)
 
 if (tracingEnabled) {
   traceMemoryUsage()
 }
 
 val ctx = TaskContext.get()
-withTrace(
-  s"getNextBatch[JVM] stage=${ctx.stageId()}",
-  tracingEnabled, {
-nativeUtil.getNextBatch(
-  numOutputCols,
-  (arrayAddrs, schemaAddrs) => {
-nativeLib.executePlan(ctx.stageId(), partitionIndex, plan, 
arrayAddrs, schemaAddrs)
-  })
-  })
+
+try {

Review Comment:
   We'l have to think about that design since every error handler could 
necessitate a different set of configs/context to be sent over for correct 
behavior on the native side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-22 Thread via GitHub


mbutrovich commented on PR #1765:
URL: 
https://github.com/apache/datafusion-comet/pull/1765#issuecomment-2901080407

   The Spark SQL test "Enabling/disabling ignoreMissingFiles using Parquet" 
specifically tests for a SparkFileNotFoundException. That class is private to 
Spark so Comet can't throw one. Their helper class that can throw these 
exceptions are also private. Is this just a "change the test diff and move on"?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: translate missing or corrupt file exceptions in NativeUtil, fall back native scans if asked to ignore [datafusion-comet]

2025-05-21 Thread via GitHub


parthchandra commented on code in PR #1765:
URL: https://github.com/apache/datafusion-comet/pull/1765#discussion_r2101431726


##
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##
@@ -133,23 +136,51 @@ class CometExecIterator(
 }
   }
 
-  def getNextBatch(): Option[ColumnarBatch] = {
+  private def getNextBatch: Option[ColumnarBatch] = {
 assert(partitionIndex >= 0 && partitionIndex < numParts)
 
 if (tracingEnabled) {
   traceMemoryUsage()
 }
 
 val ctx = TaskContext.get()
-withTrace(
-  s"getNextBatch[JVM] stage=${ctx.stageId()}",
-  tracingEnabled, {
-nativeUtil.getNextBatch(
-  numOutputCols,
-  (arrayAddrs, schemaAddrs) => {
-nativeLib.executePlan(ctx.stageId(), partitionIndex, plan, 
arrayAddrs, schemaAddrs)
-  })
-  })
+
+try {

Review Comment:
   This will work for the time being, but perhaps we should trap the error in 
`errors.rs`? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]