Iskander14yo commented on issue #2035: URL: https://github.com/apache/datafusion-comet/issues/2035#issuecomment-3099044466
Thanks for the feedback!

**On the failing query:** Appreciate the reminder; I had forgotten that Comet can use different readers. To avoid extra tuning, I’ll set `spark.comet.scan.allowIncompatible = true`, letting Comet pick the most suitable reader automatically. This fixes the issue and should also add flexibility, since the native readers have their own limitations and we’re not concerned with Spark compatibility in this context.

**On the error:** I managed to trace the issue to this expression: `CASE WHEN (SearchEngineID = 0 AND AdvEngineID = 0) THEN Referer ELSE '' END AS Src`. It seems that Comet falls back to Spark execution (`Comet native execution is disabled due to: unsupported Spark partitioning: ArrayBuffer(PageViews#463L DESC NULLS LAST)`), which leads to `CometExecIterator` being used to execute a plan. That plan includes a cast, and Arrow fails on it. The open question (for me) is why the cast is even inserted here. Unfortunately, I don’t know Comet/Arrow internals well enough to debug this further or suggest a proper fix.

Also, I read your note about heap vs off-heap memory; I had similar thoughts. I’ll try allocating more memory to Comet and see how it goes.
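
For reference, a minimal sketch of the settings mentioned above, rendered as `spark-submit` flags. The helper names `build_conf` and `to_submit_flags` are hypothetical, the `16g` size is a placeholder, and treating "more memory for Comet" as Spark's off-heap settings (`spark.memory.offHeap.enabled`, `spark.memory.offHeap.size`) is an assumption rather than something confirmed in this thread.

```python
def build_conf(offheap_size="16g"):
    """Collect the Spark settings discussed in the comment.

    Only spark.comet.scan.allowIncompatible comes from the comment itself;
    the two off-heap keys are standard Spark settings, assumed here to be
    the way extra memory would be given to Comet.
    """
    return {
        "spark.comet.scan.allowIncompatible": "true",
        "spark.memory.offHeap.enabled": "true",   # assumption: off-heap mode
        "spark.memory.offHeap.size": offheap_size,  # placeholder size
    }


def to_submit_flags(conf):
    """Turn a config dict into a flat list of spark-submit arguments."""
    flags = []
    for key, value in sorted(conf.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    # Print the flags on one line so they can be pasted after `spark-submit`.
    print(" ".join(to_submit_flags(build_conf())))
```

The same keys could equally be passed through `SparkSession.builder.config(...)`; the flag form is used here only because it needs no Spark installation to run.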