How about giving all those “implicit” columns an “unconventional” prefix,
such as an underscore (or two); e.g., _suffix, _dir0, etc.?
With such a change we may need to handle the transition for existing users’
code; e.g., maybe change the priority (mentioned below) so that an existing
“suffix” table column takes precedence over the implicit one.
Or just go “cold turkey” and force the users to change.
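A rough sketch of the transitional precedence rule, written in Python for brevity rather than Drill's Java. The function name `resolve_column` and the `_`-prefixed names are hypothetical and do not exist in Drill; the set of legacy names is only the subset discussed in this thread.

```python
import re

# Hypothetical sketch of the proposal; none of these names exist in Drill.
# Legacy (unprefixed) implicit column names, limited to those in this thread.
LEGACY_IMPLICIT = {"filename", "suffix", "filepath"}


def is_legacy_implicit(name):
    """True for a legacy implicit name or a partition column such as dir0."""
    return name in LEGACY_IMPLICIT or re.fullmatch(r"dir\d+", name) is not None


def resolve_column(name, table_columns):
    """Return 'table', 'implicit', or 'missing' for a projected column name."""
    # Transition rule: a real table column always wins over an implicit one.
    if name in table_columns:
        return "table"
    # New style: the underscore-prefixed name selects the implicit column.
    if name.startswith("_") and is_legacy_implicit(name[1:]):
        return "implicit"
    # Old style: the bare name still works when no table column shadows it.
    if is_legacy_implicit(name):
        return "implicit"
    return "missing"
```

With this rule, a query that selects “suffix” from a table that has such a column gets the table column, while _suffix always reaches the implicit one. The “cold turkey” variant would simply drop the last `if`.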
Just an idea,
Boaz
On 10/9/17, 10:45 AM, "Paul Rogers" <[email protected]> wrote:
Hi All,
Drill provides a set of “implicit” columns to describe files: filename,
suffix, fqn and filepath. Drill also provides an open-ended set of partition
columns: dir0, dir1, dir2, etc.
Not all readers support the above: some do and some don’t.
Drill semantics seem to treat these as semi-reserved words when a reader
supports implicit columns. If a table has a “suffix” column, then Drill will
treat “suffix” as an implicit column, ignoring the table column. If the user
wants that table column, they can use a session option to temporarily rename
the implicit column. A bit odd, perhaps, but it is our solution.
What is our desired behavior, however, if the user asks for a column that
includes an implicit column as a prefix: “suffix.a”? Clearly, here, “suffix” is
a map (i.e. structure) and “a” is a field within that map. Since the implicit
“suffix” is never a map, should we:
1) Assume that, here, “suffix” is a map column projected from the table?
2) Issue an error?
3) Ignore the “.a” part and just return “suffix” as an implicit column?
4) Something else?
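To make the first three options concrete, here is a minimal sketch in Python (not Drill's Java). The names `Policy` and `classify` are invented for illustration and do not correspond to any Drill class; the implicit-name set is only a subset of the names mentioned above, and partition columns are left out.

```python
from enum import Enum

# Hypothetical illustration only; not Drill code.
IMPLICIT = {"filename", "suffix", "filepath"}


class Policy(Enum):
    TABLE_MAP = 1      # option 1: "suffix.a" refers to a map in the table
    ERROR = 2          # option 2: reject the projection
    IGNORE_MEMBER = 3  # option 3: drop ".a" and return the implicit column


def classify(path, policy):
    """Classify a projected column path as ('table', path) or ('implicit', name)."""
    root, _, member = path.partition(".")
    if root not in IMPLICIT:
        return ("table", path)       # no clash with an implicit name
    if not member:
        return ("implicit", root)    # plain "suffix": current behavior
    if policy is Policy.TABLE_MAP:
        return ("table", path)
    if policy is Policy.IGNORE_MEMBER:
        return ("implicit", root)
    raise ValueError(f"'{root}' is an implicit column and has no member '{member}'")
```

Whichever policy is chosen, putting the decision in one shared function like this, instead of in each reader, is what would give consistent behavior across JSON, text and the other formats.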
The code is murky on this point because JSON is implemented far differently
than text files and so on. Each has its own rules. Do we need consistency of
behavior, or is reader-specific behavior the expected design?
Thanks,
- Paul