Dandandan commented on PR #16599: URL: https://github.com/apache/datafusion/pull/16599#issuecomment-3016886359
> > hm yeah that makes sense, perhaps we could find out what we need for this.
>
> > I guess it would be relatively powerful with predicate pushdown as well: we don't have to decode / validate the data to create the filter.
>
> The thing is that I am not sure how common it is for users to want to apply string predicates on binary columns
>
> The clickbench single parquet file is correctly annotated so that the relevant columns are strings. The clickbench_partitioned dataset is the only dataset I have ever seen that has columns marked as `binary` that are intended to be treated as strings (and I am pretty sure it was a bug, not on purpose)
>
> Thus I am not sure how important optimizing this case is

I agree, not so sure either :) though I was also thinking about other areas where we might cast in filter expressions and therefore limit the usefulness of pushdown. Needs some examples though to show this is happening.
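
To make the concern about casts in filter expressions concrete, here is a toy sketch in plain Python. It is not DataFusion code and does not describe what DataFusion actually does. `RowGroup`, `prune_raw_equals`, `prune_cast_equals`, and `unwrap_cast` are hypothetical names invented for the illustration. It assumes min/max statistics are kept per row group for a binary column, that the pruner cannot reason through a cast wrapped around the column, and that the column holds valid UTF-8.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    # Hypothetical per-row-group min/max statistics for one binary column.
    min_value: bytes
    max_value: bytes


def prune_raw_equals(groups, literal: bytes):
    """Predicate `col = literal`, with the literal already in the column's type.

    Statistics apply directly: keep only groups whose range may contain it.
    """
    return [g for g in groups if g.min_value <= literal <= g.max_value]


def prune_cast_equals(groups, literal: str):
    """Predicate `CAST(col AS string) = literal`.

    This toy pruner does not look through the cast, so it conservatively
    keeps every row group.
    """
    return list(groups)


def unwrap_cast(literal: str) -> bytes:
    # Move the cast from the column to the literal, so that
    # CAST(col AS string) = 'h' becomes col = b'h'.
    # Assumption: the column holds valid UTF-8.
    return literal.encode("utf-8")


groups = [RowGroup(b"a", b"f"), RowGroup(b"g", b"m"), RowGroup(b"n", b"z")]

kept_with_cast = prune_cast_equals(groups, "h")
kept_unwrapped = prune_raw_equals(groups, unwrap_cast("h"))

print(len(kept_with_cast), len(kept_unwrapped))  # prints: 3 1
```

In this toy model the cast on the column keeps all three row groups, while the same predicate with the cast moved onto the literal keeps only one. Whether real queries hit this pattern is the open question above.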