Hi Team,
I am looking into using Apache Drill to query HDF5 (.h5) data.
The documentation provided on the site,
https://drill.apache.org/docs/hdf5-format-plugin/, works for version 1.19.0,
but no longer as of 1.20.0.
The column into which the actual data is mapped seems to be no longer available.
For example, the column int_data shown in the example below is no longer there.
apache drill> select * from dfs.test.`dset.h5`;
|-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
| path  | data_type | file_name | data_size | element_count | is_timestamp | is_time_duration | dataset_data_type | dimensions | int_data                                                                 |
|-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
| /dset | DATASET   | dset.h5   | 96        | 24            | false        | false            | INTEGER           | [4, 6]     | [[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15,16,17,18],[19,20,21,22,23,24]] |
|-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
I have read somewhere that setting the parameter "showPreview" : true in the
workspace definition should restore the original behaviour. However, when I
try to save this parameter, it is automatically removed.
(Remark: the environment runs the apache/drill image in a Docker
container, and the config is stored on a mounted drive.)
The reason I need the int_data / double_data column is that the datasets often
contain too many values, and it is not known up front how many values
will be in the field.
Hence the "column" approach of select * from table(xyz) is not workable.
I need to be able to do e.g.
select flatten(int_data) as int_data from dfs.test.`dset.h5`;
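
To make the result I am after concrete, here is a minimal Python sketch (not Drill code) of what I would expect from flattening the sample row above. It assumes that FLATTEN unnests exactly one level of a list column per call; the helper name flatten_rows is only for illustration.

```python
def flatten_rows(rows, column):
    """Emit one output row per element of the list stored in `column`.

    Illustrative stand-in for one level of unnesting, assuming that is
    how FLATTEN treats a list-valued column.
    """
    out = []
    for row in rows:
        for element in row[column]:
            new_row = dict(row)        # copy the other columns unchanged
            new_row[column] = element  # replace the list by one element
            out.append(new_row)
    return out


# The single row from the example output above (other columns omitted).
sample = [{
    "path": "/dset",
    "int_data": [
        [1, 2, 3, 4, 5, 6],
        [7, 8, 9, 10, 11, 12],
        [13, 14, 15, 16, 17, 18],
        [19, 20, 21, 22, 23, 24],
    ],
}]

once = flatten_rows(sample, "int_data")   # one row per inner list
twice = flatten_rows(once, "int_data")    # one row per integer value

print(len(once), len(twice))
```

Under that assumption, one pass gives one row per inner list and a second pass gives one row per value, without knowing the number of values in advance.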
Is there a way to get this (re)activated in Apache Drill 1.22 and later?
All help is much appreciated.
Kind regards
Tore