On the subject of CAST pushdown to Scans, there are potential drawbacks ...
- In general, the planner will see a Scan-Project where the Project has CAST functions. But the Project can have arbitrary expressions, e.g CAST(a as INT) * 5 or a combination of 2 CAST functions or non-CAST functions etc. It would be quite expensive to examine each expression (there could be hundreds) to determine whether it is eligible to be pushed to the Scan. - Expressing Nullability is not possible with CAST. If a column should be tagged as (not)nullable, CAST syntax does not allow that. - Drill currently supports CASTing to a SQL data type, but not to the complex types such as arrays and maps. We would have to add support for that from a language perspective as well as the run-time. This would be non-trivial effort. -Aman On Fri, Apr 6, 2018 at 4:59 PM, Hanumath Rao Maduri <hanu....@gmail.com> wrote: > Hello, > > Thanks for Ted & Paul for clarifying my questions. > Sorry for not being clear in my previous post, When I said create view I > was under the impression for simple views where we use cast expressions > currently to cast them to types. In this case planner can use this > information to force the scans to use this as the schema. > > If the query fails then it fails at the scan and not after inferring the > schema by the scanner. > > I know that views can get complicated with joins and expressions. For > schema hinting through views I assume they should be created on single > tables with corresponding columns one wants to project from the table. > > > Regarding the same question, today we had a discussion with Aman. Here view > can be considered as a "view" of the table with schema in place. > > We can change some syntax to suite it for specifying schema. something like > this. > > create schema[optional] view(/virtual table ) v1 as (a: int, b : int) > select a, b from t1 with some other rules as to conversion of scalar to > complex types. > > Then the queries when used on this view (below) should enable the scanner > to use this type information and then use it to convert the data into the > appropriate types. > select * from v1 > > For the possibility of schema information not being known by the user, may > be use something like this. > > create schema[optional] view(/virtual table) v1 as select a, b from t1 > infer schema. > > This view when used to query the table should trigger the logic of > inferring and consolidating the schema and attaching that inferred schema > to the view. In future when we use the same view, we should be using the > inferred schema. This view either can be local view pertaining to the > session or a global view so that other queries across sessions can use > them. > > > By default we can apply certain rules such as converting simple scalar > values to other scalar values (like int to double etc). But we should be > also able to give option to the customer to enable rules such as scalar int > to array[int] when creating the view itself. > > > Thanks, > -Hanu > > > On Fri, Apr 6, 2018 at 3:10 PM, Paul Rogers <par0...@yahoo.com.invalid> > wrote: > > > Ted, this is why your participation in Drill is such a gift: cast > > push-down is an elegant, simple solution that even works in views. > > Beautiful. > > > > Thanks, > > - Paul > > > > > > > > On Friday, April 6, 2018, 11:35:37 AM PDT, Ted Dunning < > > ted.dunn...@gmail.com> wrote: > > > > On Thu, Apr 5, 2018 at 9:43 PM, Paul Rogers <par0...@yahoo.com.invalid> > > wrote: > > > > > Great discussion. Really appreciate the insight from the Drill users! > > > > > > To Ted's points: the simplest possible solution is to allow a table > > > function to express types. Just making stuff up: > > > > > > SELECT a FROM schema(myTable, (a: INT)) > > > > > > > Why not just allow cast to be pushed down to the reader? > > > > Why invent new language features? > > > > Or, really ugly, a session option: > > > > > > ALTER SESSION SET schema.myTable="a: INT" > > > > > > > These are a big problem. > > > > >