Hi,

I have a table that has ended up with 3K+ columns. Building the table
wasn't that painful, but creating VIEWs on top of it is where things get
slow.

One of the views I want to create needs complex operations and references
almost all of the columns.

When I apply this view to Hive, it takes over 25 minutes for the view
definition to be applied. That would be acceptable if the view didn't need
frequent updates, but it is not acceptable if we plan to change the view
often or have multiple such views.

So, the questions:
1) Should it take this long for Hive to create a view that has so many
columns? If not, should we open a JIRA to investigate this issue?
2) The underlying tables are CSV (raw data) or ORC (after some
processing). Would we benefit from changing the layout from 3K+ columns to
a single column holding a List<Object> or Map<String, Object> of all the
values, and then extracting only the required columns in the view?
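
To make option 2 concrete, here is a rough sketch of the two view shapes
being compared. It is only an illustration: the table, view and column names
(wide_table, kv_table, vals, c0..c2999) are made up, and it assumes the
values would be stored in a MAP<STRING, STRING> column, since Hive has no
generic Object type. It just generates the DDL text; it says nothing about
how fast Hive would apply either view.

```python
# Sketch only. All names below (wide_table, kv_table, vals, c0..c2999)
# are hypothetical and do not come from our real schema.

def wide_view_ddl(view, table, columns):
    """Build a CREATE VIEW that names every column (the current layout)."""
    select_list = ",\n  ".join(columns)
    return "CREATE VIEW {0} AS\nSELECT\n  {1}\nFROM {2}".format(
        view, select_list, table)

def map_view_ddl(view, table, map_col, keys):
    """Build a CREATE VIEW that pulls only the needed keys from one MAP column."""
    select_list = ",\n  ".join(
        "{0}['{1}'] AS {1}".format(map_col, k) for k in keys)
    return "CREATE VIEW {0} AS\nSELECT\n  {1}\nFROM {2}".format(
        view, select_list, table)

# 3000 placeholder column names standing in for the real 3K+ columns.
columns = ["c{0}".format(i) for i in range(3000)]

wide = wide_view_ddl("v_wide", "wide_table", columns)
narrow = map_view_ddl("v_map", "kv_table", "vals", ["c1", "c42"])

print(len(wide.splitlines()), "lines in the wide view DDL")
print(narrow)
```

With the map layout the view only mentions the keys it needs, so the view
definition itself stays small. Whether that actually helps depends on where
the 25 minutes are being spent, which is the first question above.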

We are on Hive 0.13.0, and our metastore is backed by MariaDB 10.

Thanks,
Viral
