/me sighs in relief

On Mon, Dec 14, 2015 at 7:28 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Actually, even without multiple storage types, this could be radically
> confusing.
>
> If I have many avro files that are partitioned into directories, then
> queries that use the partitioning to limit the files that I see could
> include or exclude more recent files that have added a new field.
>
> That means that a query would succeed or fail according to which date range
> I use for the query.
>
> That seems pretty radically bad.
>
>
>
>
> On Mon, Dec 14, 2015 at 9:33 AM, Stefán Baxter <ste...@activitystream.com>
> wrote:
>
> > Hi,
> >
> > This simply cannot be the desired behavior!
> >
> > This prevents using a field from a changing schema with dir0
> > sub-selection (directory pruning), as the altered/full schema is never
> > part of the query and the query subsequently fails.
> >
> > Drill should, IMO, never have rules that depend on the underlying
> > storage type. If the query runs with JSON and Parquet then it should work
> > for Avro as well.
> >
> > I'm hoping this strict schema validation is all just a misunderstanding.
> >
> > Regards,
> >  -Stefán
> >
> > On Mon, Dec 14, 2015 at 3:28 PM, Kamesh <kamesh.had...@gmail.com> wrote:
> >
> > > For Avro files, we first construct the schema, and this schema is used
> > > for validating queries. So, if there are any errors in the query (like
> > > invalid field references), it will fail fast. As of now, for other file
> > > formats, query validation (checking for invalid field references) does
> > > not happen; the schema is constructed at run time, and hence invalid
> > > fields come back as nulls.
> > >
> > >
> > > On Mon, Dec 14, 2015 at 2:36 PM, Stefán Baxter
> > > <ste...@activitystream.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm getting the following error when querying Avro files:
> > > >
> > > > Error: VALIDATION ERROR: From line 1, column 48 to line 1, column 57:
> > > > Column 'some_col' not found in any table
> > > >
> > > > It's true that the field is in none of the tables I'm targeting, in
> > > > that particular query, but that does not mean that it is in none of
> > > > the possible files I could be querying.
> > > >
> > > > We use Avro to get the benefits of the schema but I never expected
> > > > Drill to enforce it this way.
> > > >
> > > > Why do unresolved columns not return null?
> > > >
> > > > This makes no sense to me as I think a fundamental trait of Drill,
> > > > when trying to eliminate ETL, is to return null for any missing fields.
> > > >
> > > > Please advise.
> > > >
> > > > Regards,
> > > >  -Stefán
> > > >
> > >
> > >
> > >
> > > --
> > > Kamesh.
> > >
> >
>
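
For anyone reading this thread later, here is a rough Python sketch of the two behaviours discussed above: validating a column against a schema built up front (what Kamesh describes for Avro) versus resolving it at run time and returning nulls (what he describes for the other formats). It is a toy model, not Drill code. The names PARTITIONS, select and ValidationError are made up for illustration, and it assumes the up-front schema is built only from the files left after directory pruning, which the thread implies but does not spell out.

```python
# Toy model only: hypothetical names, not Drill's implementation.

class ValidationError(Exception):
    """Stands in for the VALIDATION ERROR quoted in the thread."""


# One "file" per directory; the newer directory has gained a field.
PARTITIONS = {
    "2015-11": {"schema": {"id", "name"},
                "rows": [{"id": 1, "name": "a"}]},
    "2015-12": {"schema": {"id", "name", "some_col"},
                "rows": [{"id": 2, "name": "b", "some_col": "x"}]},
}


def select(column, dirs, strict):
    """Return the values of `column` from the directories in `dirs`.

    strict=True  : build the schema first and fail fast on unknown columns.
    strict=False : resolve columns row by row, missing ones become None.
    """
    # Directory pruning: only the selected directories are ever looked at.
    files = [PARTITIONS[d] for d in dirs]

    if strict:
        # Assumption: the schema is the union of the pruned files' schemas.
        known = set().union(*(f["schema"] for f in files))
        if column not in known:
            raise ValidationError(
                f"Column '{column}' not found in any table")

    return [row.get(column) for f in files for row in f["rows"]]
```

With strict=False, asking for some_col over 2015-11 alone yields a null. With strict=True the same query raises, yet widening the range to include 2015-12 makes it succeed, which is the date-range dependence Ted points out.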
