Nick Dimiduk created SPARK-19638:
------------------------------------

             Summary: Filter pushdown not working for struct fields
                 Key: SPARK-19638
                 URL: https://issues.apache.org/jira/browse/SPARK-19638
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Nick Dimiduk


Working with a dataset containing struct fields, and enabling debug logging in 
the ES connector, I'm seeing the following behavior. The dataframe is created 
over the ES connector and then the schema is extended with a couple column 
aliases, such as.

{noformat}
df.withColumn("f2", df("foo"))
{noformat}

Queries vs those alias columns work as expected for fields that are non-struct 
members.

{noformat}
scala> df.withColumn("f2", df("foo")).where("f2 == '1'").limit(0).show
17/02/16 15:06:49 DEBUG DataSource: Pushing down filters 
[IsNotNull(foo),EqualTo(foo,1)]
17/02/16 15:06:49 TRACE DataSource: Transformed filters into DSL 
[{"exists":{"field":"foo"}},{"match":{"foo":"1"}}]
{noformat}

However, try the same with an alias over a struct field, and no filters are 
pushed down.

{noformat}
scala> df.withColumn("bar_baz", df("bar.baz")).where("bar_baz == 
'1'").limit(1).show
{noformat}

In fact, this is the case even when no alias is used at all.

{noformat}
scala> df.where("bar.baz == '1'").limit(1).show
{noformat}

Basically, pushdown for structs doesn't work at all.

Maybe this is specific to the ES connector?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to