alamb commented on issue #3249: URL: https://github.com/apache/arrow-datafusion/issues/3249#issuecomment-1236347394
> Hi @alamb, I am a little confused. Do you mean the behaviour of create view should be equivalent of subquery? I think I would say it like "create view" makes a entry in the catalog and then when *querying* if there is a view reference for that query the view should be be treated like a subquery > And create view is just to give a name binding to a query expression. And we always unfold (or inline) the view in the logical plan as a subquery, so that we could do different optimizations in different contexts. Is my understanding correct? Yes, I think that is the ideal behavior for views > And this should be an optimization of logical plan, right? I think that would be a good implementation, as it would work for both SQL and DataFrame interfaces. > Another question is that should we treat treat view as a lazy value or an expression? I think the answer is expression (even though that would mean `v` is recomputed each query in your example). This is the core difference between views and tables. If a user wants to calculate the result once, they can use a table (or `CREATE TABLE AS SELECT...`. If they want a convenient alias that is expanded at query time, they use a view. Trying to save the output of a view has at least two potential challenges: 1. When the the data in `t` changes (a new row got added, say) then the output needs to be updated (like some sort of cache invalidation) 2. The size of the view `v` may be very large and so caching it might be infeasible. As an example, in TPCH schema maybe you don't want to have to explicitly write the joins from `lineitem` to order each time so you could define a view like ```sql create view lineitems_detailed as select * from lineitems l JOIN orders o on l.o_key = o.o_key; ``` And then let's say you write queries with predicates like ```sql select count(*) from lineitems_detailed where order_type = 'foo' AND o_date between '1/1/2022' AND '1/1/2023'; ``` So in this case the actual values of the view are never "materialized" -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org