Hi Partha,

Selecting a nested field is supported in the C++ library, but as far as
I know it is not yet exposed in the Python bindings, so the filter in
your example is not supported for now.

I opened a JIRA to track this feature:
https://issues.apache.org/jira/browse/ARROW-11259

Best,
Joris

On Wed, 13 Jan 2021 at 19:57, PARTHA DUTTA <[email protected]> wrote:

> I have a Parquet file which has a field defined as a struct:
> workEmail: struct<address: string>
>   child 0, address: string
>     -- field metadata --
>     PARQUET:field_id: '13'
>   -- field metadata --
>   PARQUET:field_id: '1'
>
> I am trying to write a filter as a DNF to query a specific value for
> workEmail.address but pyarrow does not seem to accept the DNF:
>
> tbl = pyarrow.parquet.read_table(
>     filename,
>     use_legacy_dataset=False,
>     columns=["workEmail"],
>     filters=[("workEmail.address", "=", "[email protected]")],
> )
>
> Is this supported? If not, any other workarounds?
>
> --
> Partha Dutta
> [email protected]
>
