I've noticed that Drill offers a REPEATED_CONTAINS function, which can be applied to array fields.
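For reference, here is a minimal sketch of the kind of query where it does work. The `dfs.tmp` workspace, the file `donuts.json`, and the `toppings` array of plain strings are all hypothetical names for illustration, not my actual data:

```sql
-- Hypothetical record: {"name":"classic","toppings":["sugar","sprinkles"]}
-- REPEATED_CONTAINS(array, keyword) is true when the array holds the keyword.
SELECT t.name
FROM dfs.tmp.`donuts.json` t
WHERE REPEATED_CONTAINS(t.toppings, 'sugar');
```

This only covers arrays of scalar values, which is why it does not help with my case described next.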
Docs: https://drill.apache.org/docs/repeated-contains/

I have a schema stored in Parquet files that contains a repeated field whose elements each hold a key and a value. However, such structures can't be queried using REPEATED_CONTAINS.

I was thinking of writing a user-defined function (UDF) to look through the array. My question is: is it worth it? Will it be faster than flattening the array and filtering, as in the example below?

Sample record:

    {"name":"classic","fillings":[ {"name":"sugar","cal":500}, {"name":"flour","cal":300} ]}

Query:

    SELECT flat.fill
    FROM (SELECT FLATTEN(t.fillings) AS fill FROM dfs.flatten.`test.json` t) flat
    WHERE flat.fill.name LIKE 'sug%';

Specifically, what is the cost of using FLATTEN compared to iterating over the array directly in a UDF?

Thanks,
Jean-Claude