[ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417025#comment-15417025 ]
Kalyan commented on PHOENIX-2336:
---------------------------------

Hi Josh Mahonin, the same patch works for PHOENIX-2336, PHOENIX-2290, and PHOENIX-2547. I added the unit tests with proper comments as well.

Comments:
Limitation: filter / where expressions are not allowed with "double quotes"; pass them as column expressions instead.
Reason: if the expression contains "double quotes", the Spark SQL parser skips evaluating it and hands it to the next level to handle.

Please review this patch:
https://github.com/kalyanhadooptraining/phoenix/commit/81df0c698ba4155a8f73ffe0ad657e9a5640d811

> Queries with small case column-names return empty result-set when working
> with Spark Datasource Plugin
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2336
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2336
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Suhas Nalapure
>            Assignee: Josh Mahonin
>              Labels: verify
>             Fix For: 4.9.0
>
>
> Hi,
> The Spark DataFrame filter operation returns an empty result-set when the
> column name is in lower case. Example below:
> DataFrame df =
> sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
> df.filter("\"col1\" = '5.0'").show();
> Result:
> +---+----+---+---+---+---+
> | ID|col1| c1| d2| d3| d4|
> +---+----+---+---+---+---+
> +---+----+---+---+---+---+
> Whereas the table actually has some rows matching the filter condition. And
> if the double quotes are removed from around the column name, i.e.
> df.filter("col1 = '5.0'").show();, a ColumnNotFoundException is thrown:
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703):
> Undefined column. columnName=D1
> 	at
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
> 	at
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
> 	at
> org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> 	at scala.Option.getOrElse(Option.scala:120)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
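For readers hitting this before the patch lands, the workaround described in the comment above (pass a column expression rather than a double-quoted identifier inside a filter string) might look like the sketch below. It is only a sketch against the Spark 1.x Java DataFrame API used in the report; the table name, ZooKeeper URL, and class name are placeholders, not part of the original issue.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class PhoenixFilterExample {

    public static DataFrame loadAndFilter(SQLContext sqlContext) {
        // Placeholder connection options; "MY_TABLE" and the zkUrl are assumptions.
        Map<String, String> params = new HashMap<>();
        params.put("table", "MY_TABLE");
        params.put("zkUrl", "localhost:2181");

        DataFrame df = sqlContext.read()
                .format("org.apache.phoenix.spark")
                .options(params)
                .load();

        // Problematic: a double-quoted column name inside a filter string is
        // handed to the Spark SQL parser, which does not resolve it as intended.
        // df.filter("\"col1\" = '5.0'").show();

        // Workaround per the comment above: build a Column expression instead of
        // embedding the quoted identifier in a SQL string.
        return df.filter(df.col("col1").equalTo("5.0"));
    }
}
```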