I have a Parquet file which has a field defined as a struct:
    workEmail: struct<address: string>
      child 0, address: string
        -- field metadata --
        PARQUET:field_id: '13'
      -- field metadata --
      PARQUET:field_id: '1'
I am trying to write a filter in DNF (disjunctive normal form) that matches a
specific value of workEmail.address, but pyarrow does not seem to accept it:
    import pyarrow.parquet

    tbl = pyarrow.parquet.read_table(filename, use_legacy_dataset=False,
        columns=["workEmail"], filters=[("workEmail.address", "=", "[email protected]")])
Is this supported? If not, any other workarounds?
--
Partha Dutta
[email protected]