I am looking for discussion here. A colleague was asking me how to add
comments to the metadata of a view. (He's new to Drill, so the idea of
a table not having metadata is one he's still warming up to.)

That got me thinking... why couldn't we use Drill views to store
table/field comments?  This could be a great way to add contextual
information for users. Here are some observations from what I see today
when I issue a DESCRIBE view_myview:


1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE.
2. Even though the underlying Parquet table has types, the view does not
pass the types from the underlying Parquet files through.  (The type is ANY.)
3. The data for the view is just a JSON file that could easily be
extended.
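
To make point 3 concrete, here is a minimal sketch in Python. It assumes
the view file is JSON with a "fields" list of name/type/isNullable
entries, which is my reading of the file rather than a documented format.
The "description" key, the add_column_comment helper, and the sample
columns and SQL are all hypothetical; Drill does not read any of this
today.

```python
import json

# Assumed shape of a view definition file; column names and SQL are made up.
VIEW_JSON = """
{
  "name": "view_myview",
  "sql": "SELECT `id`, `amount` FROM dfs.`/data/mytable`",
  "fields": [
    {"name": "id", "type": "ANY", "isNullable": true},
    {"name": "amount", "type": "ANY", "isNullable": true}
  ]
}
"""


def add_column_comment(view, column, text):
    """Return a copy of the view with a hypothetical 'description' key on one field."""
    updated = json.loads(json.dumps(view))  # cheap deep copy via a JSON round trip
    for field in updated["fields"]:
        if field["name"] == column:
            field["description"] = text
            return updated
    raise KeyError(f"no such column in view: {column}")


view = json.loads(VIEW_JSON)
commented = add_column_comment(view, "amount", "Order total in USD")
print(json.dumps(commented, indent=2))
```

The existing keys are left alone, so anything that ignores unknown keys
would keep working; whether Drill's own view reader tolerates extra keys
is something I have not checked.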


So, a few things would be nice to have:

1. Table comments.  When I issue a DESCRIBE on a view that has a
"Description" field, having that print out as a description for the
whole view would be nice.  I think this is harder, because it's not just
a matter of extending the view information.

2. Column comments.  A text field that could be added to each column in the
view, and printed by DESCRIBE as another column with the description.  This
would be very helpful. While Drill being schemaless is awesome, the ability
to add information to known data is huge.

3. Ability to use the types from the Parquet files (without manually
specifying each type).  If we could provide an option at view creation to
attempt to infer types, that would be handy. I realize that folks are using
LIMIT 0 queries to get metadata, but DESCRIBE could do this well too.

4. Ability, using ANSI SQL, to update the view column descriptions and the
description for the view itself.

5. I believe Avro has the ability to add this information to the files, so
if the data exists outside of views (such as in Avro files), should we
present it to the user in DESCRIBE output as well?
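
As a rough illustration of items 1 and 2, here is a small Python sketch of
what the output could look like. It is not how Drill implements DESCRIBE;
the metadata dictionary, the "description" keys, the DESCRIPTION column,
and the describe_rows/render helpers are all hypothetical.

```python
# Hypothetical view metadata carrying the proposed comments.
view = {
    "name": "view_myview",
    "description": "Daily order totals",
    "fields": [
        {"name": "id", "type": "ANY", "isNullable": True, "description": "Order key"},
        {"name": "amount", "type": "ANY", "isNullable": True},
    ],
}


def describe_rows(view):
    """Build describe-style rows with an extra DESCRIPTION column."""
    rows = [("COLUMN_NAME", "DATA_TYPE", "IS_NULLABLE", "DESCRIPTION")]
    for field in view["fields"]:
        rows.append((
            field["name"],
            field["type"],
            "YES" if field["isNullable"] else "NO",
            field.get("description", ""),  # columns without a comment stay blank
        ))
    return rows


def render(view):
    """Format the rows as aligned text, with the view comment on the first line."""
    rows = describe_rows(view)
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = []
    if view.get("description"):
        lines.append(f"{view['name']}: {view['description']}")
    for row in rows:
        cells = (cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)


print(render(view))
```

Columns without a comment simply show an empty DESCRIPTION, so existing
views would describe the same as they do now, plus one blank column.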

Curious if folks think this would be valuable, how much work an addition
like this would be to Drill, and other thoughts in general.


John
