One solution seems to be to pre-flatten the data in a CTE, resulting in
dramatically lower runtimes:
~~~
WITH flat AS (SELECT id, FLATTEN(data) AS data)
SELECT id, data[0] AS dttm, data)[1] AS result
FROM flat
~~~
This was tested on a single node, and each JSON array to be flattened
has 1,440
Matt,
Having multiple flatten's in your query leads to cross-join between the
output of each flatten. So a performance hit is expected with the addition
of each flatten. And there could also be a genuine performance bug for this
scenario. To be sure it is a bug we need more information as
I have JSON data with with a nested list and am using FLATTEN to extract
two of three list elements as:
~~~
SELECT id, FLATTEN(data)[0] AS dttm, FLATTEN(data)[1] AS result FROM ...
~~~
This works, but each FLATTEN seems to slow the query down dramatically,
3x slower with the second flatten.