Dear Charles,

I just tried:

docker run -it --name drill_test -p 8047:8047 -p 31010:31010 apache/drill:1.23.0-SNAPSHOT

and got:

docker: Error response from daemon: manifest for apache/drill:1.23.0-SNAPSHOT not found: manifest unknown: manifest unknown.

Maybe it still needs to sync somewhere.
I will give it a retry later.

Kind regards

Tore



From: Charles Givre <[email protected]>
Sent: Monday, 9 March 2026 19:06
To: Tore Van Grembergen <[email protected]>
Cc: [email protected]
Subject: Re: quering H5 "flatten" data in apache drill in releases after 1.19.0

Hi Tore,
We pushed an update to Drill which fixes the issue. I think you should be able
to pull version 1.23.0-SNAPSHOT from Docker Hub. Alternatively, you can build
from source and it will have the fix.
Best,
— C


On Mar 7, 2026, at 03:06, Tore Van Grembergen 
<[email protected]<mailto:[email protected]>> wrote:

Dear Charles,

Good that you found it.
I am no Java specialist, but I am willing to give it a try.
Will the fix be added in the upcoming release?

Kind regards

Tore

From: Charles Givre <[email protected]<mailto:[email protected]>>
Sent: Friday, 6 March 2026 22:49
To: Tore Van Grembergen <[email protected]<mailto:[email protected]>>
Cc: [email protected]<mailto:[email protected]>
Subject: Re: quering H5 "flatten" data in apache drill in releases after 1.19.0

Hi Tore,
As I suspected, there was a small bug in the config file. Are you comfortable
with Java? I’m in the middle of doing some significant Drill rework, so
submitting a fix would be a little difficult. I can send you the fix, however.
Best,
— C



On Mar 6, 2026, at 15:11, Tore Van Grembergen 
<[email protected]<mailto:[email protected]>> wrote:

Thanks for following up on this.
Much appreciated.

Kind regards

Tore

From: Charles Givre <[email protected]<mailto:[email protected]>>
Sent: Friday, 6 March 2026 21:10
To: Tore Van Grembergen <[email protected]<mailto:[email protected]>>
Cc: [email protected]<mailto:[email protected]>
Subject: Re: quering H5 "flatten" data in apache drill in releases after 1.19.0

Hi Tore,
I’ll look into this.  I did look at the logic and everything is still there.  
There are current unit tests which test all this functionality and they are 
passing.  I have a theory about what may be happening and that is that the 
parameter may be getting dropped in the UI.
Best,
— C




On Mar 6, 2026, at 15:01, Tore Van Grembergen 
<[email protected]<mailto:[email protected]>> wrote:

Dear Charles,

The same happens as with "showPreview": true.
Apache Drill allows saving this config and does not give an error.
However, when you look at the config again, this parameter has disappeared.

Kind regards

Tore




From: Charles Givre <[email protected]<mailto:[email protected]>>
Sent: Friday, 6 March 2026 06:43
To: Tore Van Grembergen <[email protected]<mailto:[email protected]>>
Cc: [email protected]<mailto:[email protected]>
Subject: Re: quering H5 "flatten" data in apache drill in releases after 1.19.0

What happens when you set it to false?





On Mar 6, 2026, at 00:28, Tore Van Grembergen 
<[email protected]<mailto:[email protected]>> wrote:

Dear Charles,

Thanks for coming back on this.

I tried the "showPreview": true.
Apache Drill allows saving this config and does not give an error.
However, when you look at the config again, this parameter has disappeared.
It is as if it is filtered out when the config file is saved.

Kind regards

Tore

From: Charles Givre <[email protected]<mailto:[email protected]>>
Sent: Friday, 6 March 2026 03:47
To: [email protected]<mailto:[email protected]>
Cc: Tore Van Grembergen <[email protected]<mailto:[email protected]>>
Subject: Re: quering H5 "flatten" data in apache drill in releases after 1.19.0

Hi Tore,
Thanks for your interest and use of Drill.  Could you try this:

1.  In the configuration for your dfs plugin, make sure that the config for the 
hdf5 format is as shown below:


"hdf5": {
  "type": "hdf5",
  "extensions": [
    "h5"
  ],
  "showPreview": true
}
2.  Run a SELECT * query on your HDF5 file and report back what the results
look like.
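
The check in step 1 could also be scripted rather than done by eye in the web UI. The following is only a minimal Python sketch, assuming Drill's web server is reachable on localhost:8047 and that the plugin definition can be read from /storage/dfs.json with the format options nested under "config" and "formats". The helper names (hdf5_format_options, show_preview_saved, fetch_plugin) are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: Drill's web UI/REST port, as published by the docker run command.
DRILL_URL = "http://localhost:8047"


def hdf5_format_options(plugin: dict) -> dict:
    """Return the hdf5 format block of a storage plugin definition, or {}.

    Accepts either the full definition ({"name": ..., "config": {...}})
    or just the inner config object.
    """
    config = plugin.get("config", plugin)
    return config.get("formats", {}).get("hdf5", {})


def show_preview_saved(plugin: dict) -> bool:
    """True if the showPreview key is still present after saving."""
    return "showPreview" in hdf5_format_options(plugin)


def fetch_plugin(name: str = "dfs") -> dict:
    """Read a storage plugin definition from a running Drill instance."""
    with urlopen(f"{DRILL_URL}/storage/{name}.json") as resp:
        return json.load(resp)
```

Calling show_preview_saved(fetch_plugin("dfs")) after saving the config would then show whether the parameter survived the save or was dropped.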

A word about the HDF5 plugin.  The preview you are looking for is really just 
meant to give a sample of the data.  If your data set is really large, it will 
get truncated in that view.   Also, if I remember correctly, the name 
“int_data” is the actual name of that column from the dataset.

Really the better way to query your data is to use the defaultPath option.  
This allows you to query tables within HDF5 files.


SELECT int_col_0, int_col_1
FROM table(dfs.`hdf5/scalar.h5` (type => 'hdf5', defaultPath => '/nd/3D'))
Best,
— C







On Mar 5, 2026, at 15:46, Tore Van Grembergen via user 
<[email protected]<mailto:[email protected]>> wrote:

Hi Team,

I am looking into using Apache Drill for querying H5 data.
The documentation provided on the site
https://drill.apache.org/docs/hdf5-format-plugin/ works for version 1.19.0,
but no longer as of 1.20.0.
The column into which the actual data is mapped seems to be no longer available.

E.g. the column int_data in the example below is no longer there.

apache drill> select * from dfs.test.`dset.h5`;
|-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
| path  | data_type | file_name | data_size | element_count | is_timestamp | is_time_duration | dataset_data_type | dimensions | int_data                                                                 |
|-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
| /dset | DATASET   | dset.h5   | 96        | 24            | false        | false            | INTEGER           | [4, 6]     | [[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15,16,17,18],[19,20,21,22,23,24]] |
|-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|


I have read somewhere that the parameter "showPreview": true in the workspace
definition should restore the original behaviour. However, when I try to save
this parameter, it is automagically removed.
(Remark: the environment runs the apache/drill image in a Docker
container, and the config is stored on a mounted drive.)

The reason for needing the int_data and double_data columns is that a field
often holds too many values, and it is not known up front how many values
it will contain.
Hence the "column" approach of select * from table(xyz) is not workable.
It is necessary to be able to run, e.g.,
select flatten(int_data) as int_data from dfs.test.`dset.h5`;
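
For reference, a query like the one above could be submitted from a script through Drill's REST interface instead of the shell. This is only a rough Python sketch, assuming the web server listens on localhost:8047 and accepts SQL as a JSON POST to /query.json. The helper names build_query_request and run_query are illustrative, and the query only returns data if the int_data column is actually exposed.

```python
import json
from urllib.request import Request, urlopen

# Assumption: Drill's web/REST port as exposed by the container.
DRILL_URL = "http://localhost:8047"

# The flatten query from this message, with the file name in backticks.
FLATTEN_SQL = "SELECT FLATTEN(int_data) AS int_data FROM dfs.test.`dset.h5`"


def build_query_request(sql: str, base_url: str = DRILL_URL) -> Request:
    """Build (but do not send) the POST request carrying a SQL query."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return Request(
        f"{base_url}/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str) -> list:
    """Send the query to a running Drill instance and return its rows."""
    with urlopen(build_query_request(sql)) as resp:
        return json.load(resp).get("rows", [])
```

run_query(FLATTEN_SQL) would then return one entry per flattened element, without the number of values having to be known in advance.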

Is there a way to get this (re)activated in Apache Drill 1.22 and later?

All help is much appreciated.

Kind regards

Tore
