Hi,

I'm trying to use an array in Parquet to store a list of IDs (a 1:* scenario)
rather than putting each ID in a separate field. (The array contains 1-10 values.)

This requires me to use REPEATED_CONTAINS to search for these values.

I was expecting a performance penalty, but it turns out that searching with
REPEATED_CONTAINS is 20x slower than looking for a single value.

My guess is that it has to do with scan optimization and regular expressions
being used for the comparison, but I wonder whether that is so and whether
there are any tricks available to speed this up.

Any suggestions?

Regards,
 -Stefán
