Iskander14yo commented on issue #2035:
URL: 
https://github.com/apache/datafusion-comet/issues/2035#issuecomment-3099044466

   Thanks for the feedback!
   
   **On the failing query:**
   Appreciate the reminder; I had forgotten that Comet can use different 
readers. To avoid extra tuning, I’ll set `spark.comet.scan.allowIncompatible = 
true`, letting Comet pick the most suitable reader automatically. This fixes 
the failing query and should also add flexibility, since the native readers 
have their own limitations and we’re not concerned with Spark compatibility in 
this context.
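
   For reference, a minimal sketch of how that setting could be passed to Spark. 
Only the key `spark.comet.scan.allowIncompatible` comes from this thread; the 
helper names (`COMET_SCAN_CONF`, `to_submit_args`, `apply_to_builder`) are 
hypothetical, and the rest of the Comet setup (plugin, jar, memory) is assumed 
to be configured already.

```python
# Sketch only: the single Comet setting discussed above, as a key/value pair.
# Other Comet settings are assumed to be configured elsewhere.
COMET_SCAN_CONF = {
    "spark.comet.scan.allowIncompatible": "true",
}


def to_submit_args(conf):
    """Render a config dict as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def apply_to_builder(builder, conf=COMET_SCAN_CONF):
    """Apply the settings to a pyspark `SparkSession.builder` supplied by the caller."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


print(" ".join(to_submit_args(COMET_SCAN_CONF)))
```

   With pyspark installed, `apply_to_builder(SparkSession.builder).getOrCreate()` 
would create a session with the same setting applied.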
   
   **On the error:**
   I managed to trace the issue to this expression:
   `CASE WHEN (SearchEngineID = 0 AND AdvEngineID = 0) THEN Referer ELSE '' END 
AS Src`.
   It seems that Comet falls back to Spark execution (`Comet native execution 
is disabled due to: unsupported Spark partitioning: ArrayBuffer(PageViews#463L 
DESC NULLS LAST)`), which leads to `CometExecIterator` being used to execute a 
plan. That plan includes a cast, and Arrow fails on it.
   The open question (for me) is why the cast is even inserted here. 
Unfortunately, I don’t know Comet/Arrow internals well enough to debug this 
further or suggest a proper fix.
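
   In case it helps with narrowing this down, here is a rough sketch of a 
reduced query for inspecting the plan. Only the `CASE` expression and the 
`PageViews DESC` ordering are taken from the report above; the table name 
`hits`, the aggregation, the `LIMIT`, and the helper names are assumptions, and 
this reduced form may or may not trigger the same fallback.

```python
# Expression taken verbatim from the report (without the alias).
CASE_EXPR = (
    "CASE WHEN (SearchEngineID = 0 AND AdvEngineID = 0) "
    "THEN Referer ELSE '' END"
)


def build_repro_query(table="hits", limit=10):
    """Build a reduced query around the CASE expression.

    The table name, COUNT(*) aggregation, and LIMIT are assumptions made
    for illustration; they are not taken from the original failing query.
    """
    return (
        f"SELECT {CASE_EXPR} AS Src, COUNT(*) AS PageViews "
        f"FROM {table} "
        f"GROUP BY {CASE_EXPR} "
        "ORDER BY PageViews DESC "
        f"LIMIT {limit}"
    )


def show_plan(spark, query):
    """Print the formatted physical plan; `spark` is an existing SparkSession."""
    spark.sql(query).explain(mode="formatted")


print(build_repro_query())
```

   Calling `show_plan(spark, build_repro_query())` on a Comet-enabled session 
should show which operators stay in Spark and whether a cast appears in the 
plan.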
   
   Also, I read your note about heap vs. off-heap memory and had similar 
thoughts. I’ll try allocating more memory to Comet and see how it goes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

