Hi all,
I have a large dataset of Parquet files nested within several levels of
subdirectories. For example:
study1
|----data1
|    |----2020-01-01
|         |----0001.parquet
|----data2
study2
|----dataA
|----dataB
Is it possible for Drill to report the "directories" back as "tables"? For
example, could I run a query that returns something describing the directory
structure?
I've read about creating workspaces, but doing that for each directory seems
onerous, and it also requires editing the storage plugin configuration.
The alternative would be to write some logic outside of Drill that traverses
the file system, and then use that information to drive the "tables" for the
queries. That seems unintuitive, though, given Drill's ability to traverse
the file system, infer schemas, create caches, and so on.
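To make that alternative concrete, here is a rough sketch of the kind of
external traversal I have in mind. It is only an illustration: the function
names are made up, and the "dfs.root" workspace name is an assumption about
how the storage plugin is set up.

```python
import os


def list_parquet_dirs(root):
    """Return relative paths of directories under root that directly
    contain at least one .parquet file, sorted alphabetically."""
    tables = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(name.endswith(".parquet") for name in filenames):
            rel = os.path.relpath(dirpath, root)
            # Use forward slashes so the path can go into a query as-is.
            tables.append(rel.replace(os.sep, "/"))
    return sorted(tables)


def to_drill_table(workspace, rel_path):
    """Format a directory as a backtick-quoted table reference.
    The workspace name (e.g. "dfs.root") is an assumption."""
    return "{}.`{}`".format(workspace, rel_path)
```

With the layout above, list_parquet_dirs would return only
"study1/data1/2020-01-01", and to_drill_table("dfs.root", ...) would turn that
into something I could place in a FROM clause.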
Thanks,
Rafael