[ https://issues.apache.org/jira/browse/SPARK-18593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724129#comment-15724129 ]
Dongjoon Hyun commented on SPARK-18593:
---------------------------------------

Oops. Thank you for the correction.

> JDBCRDD returns incorrect results for filters on CHAR of PostgreSQL
> -------------------------------------------------------------------
>
>                 Key: SPARK-18593
>                 URL: https://issues.apache.org/jira/browse/SPARK-18593
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.2, 1.6.3
>            Reporter: Durga Prasad Gunturu
>            Assignee: Takeshi Yamamuro
>            Priority: Minor
>              Labels: correctness
>             Fix For: 2.0.0
>
>
> In Apache Spark 1.6.x, JDBCRDD returns incorrect results for queries that
> filter on a PostgreSQL CHAR column. The root cause is that PostgreSQL
> returns a space-padded string for CHAR values, so the post-processing
> filter `Filter (a#0 = A)` evaluates to false. Spark 2.0.0 removes the
> post-processing filter because the predicate is already handled in the
> database by `PushedFilters: [EqualTo(a,A)]`.
> {code}
> scala> val t_char = sqlContext.read.option("user", "postgres").option("password", "rootpass").jdbc("jdbc:postgresql://localhost:5432/postgres", "t_char", new java.util.Properties())
> t_char: org.apache.spark.sql.DataFrame = [a: string]
> scala> val t_varchar = sqlContext.read.option("user", "postgres").option("password", "rootpass").jdbc("jdbc:postgresql://localhost:5432/postgres", "t_varchar", new java.util.Properties())
> t_varchar: org.apache.spark.sql.DataFrame = [a: string]
> scala> t_char.show
> +----------+
> |         a|
> +----------+
> |A         |
> |AA        |
> |AAA       |
> +----------+
> scala> t_varchar.show
> +---+
> |  a|
> +---+
> |  A|
> | AA|
> |AAA|
> +---+
> scala> t_char.filter(t_char("a")==="A").show
> +---+
> |  a|
> +---+
> +---+
> scala> t_char.filter(t_char("a")==="A         ").show
> +----------+
> |         a|
> +----------+
> |A         |
> +----------+
> scala> t_varchar.filter(t_varchar("a")==="A").show
> +---+
> |  a|
> +---+
> |  A|
> +---+
> scala> t_char.filter(t_char("a")==="A").explain
> == Physical Plan ==
> Filter (a#0 = A)
> +- Scan JDBCRelation(jdbc:postgresql://localhost:5432/postgres,t_char,[Lorg.apache.spark.Partition;@2f65c341,{user=postgres, password=rootpass})[a#0] PushedFilters: [EqualTo(a,A)]
> {code}
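
The mismatch described in the issue can be reproduced without Spark or a database. The following is only a minimal sketch in plain Scala, assuming the column is CHAR(10) as the `show` output suggests. The names `CharPaddingSketch`, `padChar`, and `rtrim` are hypothetical helpers made up for this illustration and are not part of Spark or the PostgreSQL driver.

```scala
// Sketch only: models PostgreSQL CHAR(n) padding in plain Scala.
// No Spark or JDBC is involved; all names here are hypothetical.
object CharPaddingSketch {
  // Pad a value with trailing spaces to the CHAR(n) width, which is
  // how a CHAR(n) value arrives on the client side.
  def padChar(value: String, width: Int): String = value.padTo(width, ' ')

  // Drop trailing spaces only; leading spaces are kept.
  def rtrim(value: String): String = value.reverse.dropWhile(_ == ' ').reverse

  // Rows as they would be returned from a CHAR(10) column (assumed width).
  val rows: Seq[String] = Seq("A", "AA", "AAA").map(padChar(_, 10))

  // Mirrors the Spark 1.6.x post-processing filter `a = A` applied to
  // the padded values: exact string equality, so nothing matches.
  val strictMatch: Seq[String] = rows.filter(_ == "A")

  // Comparing after stripping trailing spaces finds the expected row.
  val trimmedMatch: Seq[String] = rows.filter(r => rtrim(r) == "A")

  def main(args: Array[String]): Unit = {
    println(s"strict:  ${strictMatch.size} row(s)")
    println(s"trimmed: ${trimmedMatch.size} row(s)")
  }
}
```

Running `CharPaddingSketch.main(Array.empty)` reports zero rows for the strict comparison and one row for the trimmed comparison, which matches the empty result of `t_char.filter(t_char("a")==="A")` above. On the Spark side, the analogous comparison would presumably apply `rtrim` from `org.apache.spark.sql.functions` to the column before comparing; that variant is not tested here.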