Hi David & Indhumathi,
Storing an Array of String as a plain String column in SI by flattening
[with row level position reference] has the following problems:
* Queries with multiple array_contains() or multiple array[0] = 'x'
filters can be slow.
* The join solution mentioned can result in multiple scans (one for every
complex filter condition), which can slow down the SI performance.
* Row level SI can slow down SI performance when the filter matches a huge
number of rows.
* To support multiple SIs on a single table, the complex SI will use row
level position references while the primitive SI will use blocklet level
position references, so the join needs extra logic and time.
* Solution 2 cannot support SI on struct columns in the future, so it
cannot be a generic solution.
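
To make the first two points concrete, here is a rough Python sketch. It is
only a toy model, not CarbonData code: the table data, build_flattened_si()
and array_contains_all() are hypothetical names invented for illustration,
and a plain dict stands in for the SI table. It assumes each array element
becomes its own SI entry pointing back to a main-table row.

```python
from collections import defaultdict

# Toy main table (hypothetical data): row id -> array<string> column.
main_table = {
    0: ["a", "b"],
    1: ["b", "c"],
    2: ["a", "c"],
    3: ["c"],
}


def build_flattened_si(table):
    """Flatten every array element into its own entry: value -> row ids."""
    si = defaultdict(set)
    for row_id, values in table.items():
        for value in values:
            si[value].add(row_id)
    return si


def array_contains_all(si, wanted):
    """Evaluate one array_contains() per wanted value.

    Each condition needs its own SI lookup (one scan), and the per-condition
    row sets are then intersected, which plays the role of the join.
    Returns the matching row ids and the number of scans performed.
    """
    scans = 0
    result = None
    for value in wanted:
        rows = si.get(value, set())
        scans += 1
        result = set(rows) if result is None else result & rows
    return (result or set()), scans


si = build_flattened_si(main_table)
rows, scans = array_contains_all(si, ["a", "c"])
si_entries = sum(len(ids) for ids in si.values())
print(rows, scans, si_entries)
```

In this toy model two array_contains() conditions cost two lookups plus an
intersection, and the flattened index holds more entries than the main
table has rows, which is why a filter matching many rows gets expensive.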

Considering the above points, *solution2 is a very good solution if only
one filter exists* on the complex column. *But it is not a good solution
for all scenarios.*

*So, I have to go with solution1, or wait for other people's opinions or
new solutions.*

Thanks,
Ajantha

On Thu, Jul 30, 2020 at 1:19 PM David CaiQiang <david.c...@gmail.com> wrote:

> +1 for solution2
>
> Can we support more than one array_contains by using SI join (like SI on
> primitive data type)?
>
>
>
> -----
> Best Regards
> David Cai
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>
