Re: Spark sql not pushing down timestamp range queries

2016-04-15 Thread Kiran Chitturi
Mich, I am also curious about how Spark casts between types for different filters. For example, the conversion happens implicitly for the 'EqualTo' filter: scala> sqlContext.sql("SELECT * from events WHERE `registration` = '2015-05-28'").explain() 16/04/15 11:44:15 INFO ParseDriver: Parsing

Re: Spark sql not pushing down timestamp range queries

2016-04-15 Thread Kiran Chitturi
Thanks Hyukjin for the suggestion. I will take a look at implementing the Solr datasource with CatalystScan.

Re: Spark sql not pushing down timestamp range queries

2016-04-15 Thread Mich Talebzadeh
Thanks Takeshi, I did check it. I believe you are referring to this statement "This is likely because we cast this expression weirdly to be compatible with Hive. Specifically I think this turns into, CAST(c_date AS STRING) >= "2016-01-01", and we don't push down casts down into data sources.

Re: Spark sql not pushing down timestamp range queries

2016-04-14 Thread Takeshi Yamamuro
Hi Mich, did you check the URL Josh referred to? The cast for string comparisons is needed to accept `c_date >= "2016"`. // maropu

Re: Spark sql not pushing down timestamp range queries

2016-04-14 Thread Hyukjin Kwon
Hi, String comparison itself is pushed down fine, but the problem is dealing with Cast. It was pushed down before, but that was reverted (https://github.com/apache/spark/pull/8049). Several fixes were tried, for example https://github.com/apache/spark/pull/11005, but there were no changes to

Re: Spark sql not pushing down timestamp range queries

2016-04-14 Thread Mich Talebzadeh
Hi Josh, Can you please clarify whether date comparisons as two strings work at all? I was under the impression that with string comparison only the first characters are compared. Thanks, Dr Mich Talebzadeh
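On the string-comparison question above, a minimal sketch in plain Scala (no Spark involved) may help. It assumes the values are zero-padded, ISO-style strings such as `2015-05-28 12:30:00`; the object and method names are made up for illustration. JVM string ordering is lexicographic, so characters are compared left to right until the first difference, and for zero-padded dates that order matches chronological order.

```scala
object IsoDateStringComparison {
  // Lexicographic comparison: characters are compared left to right until the
  // first difference, so the whole string can matter, not only its start.
  def inRange(value: String, lower: String, upper: String): Boolean =
    value >= lower && value <= upper

  def main(args: Array[String]): Unit = {
    // A timestamp rendered as a string, similar to cast(registration as string).
    val registration = "2015-05-28 12:30:00"
    println(inRange(registration, "2015-05-28", "2015-05-29"))

    // Without zero padding, string order and chronological order can disagree.
    println("2015-5-28" < "2015-05-29")
  }
}
```

The first check passes because the value shares the prefix `2015-05-2` with both bounds and the next character decides the order. The second comparison shows the caveat: once a field is not zero-padded, the string order no longer follows the calendar.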

Re: Spark sql not pushing down timestamp range queries

2016-04-14 Thread Josh Rosen
AFAIK this is not being pushed down because it involves an implicit cast and we currently don't push casts into data sources or scans; see https://github.com/databricks/spark-redshift/issues/155 for a possibly-related discussion.
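To illustrate the point about casts, here is a toy model in plain Scala. It is only a sketch: the case classes and the `translate` function are hypothetical stand-ins, not Spark's Catalyst expressions or its data source filter classes. The idea it tries to show is that a translator which only recognises "bare column compared with literal" has nothing to emit once the column is wrapped in a cast.

```scala
object PushdownSketch {
  // Hypothetical, simplified expression tree; these are NOT Spark's Catalyst classes.
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class Literal(value: String) extends Expr
  final case class Cast(child: Expr, toType: String) extends Expr
  final case class GreaterThanOrEqual(left: Expr, right: Expr) extends Expr

  // Hypothetical source-side filter: it can only name a column, an operator and a value.
  final case class SourceFilter(column: String, op: String, value: String)

  // Translate a predicate only when it is a bare column compared with a literal.
  // Anything else, including a cast wrapped around the column, is not translated
  // and would have to be evaluated by the engine after the scan.
  def translate(predicate: Expr): Option[SourceFilter] = predicate match {
    case GreaterThanOrEqual(Column(name), Literal(value)) =>
      Some(SourceFilter(name, ">=", value))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    val bare = GreaterThanOrEqual(Column("registration"), Literal("2015-05-28"))
    val casted =
      GreaterThanOrEqual(Cast(Column("registration"), "string"), Literal("2015-05-28"))
    println(translate(bare))   // a filter the source could apply
    println(translate(casted)) // nothing to push down
  }
}
```

In this model the cast version falls through to `None`, which mirrors the behaviour described in the thread, where the filter on `cast(registration as string)` is applied at the Spark layer instead of being handed to the scan.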

Re: Spark sql not pushing down timestamp range queries

2016-04-14 Thread Mich Talebzadeh
Are you comparing strings here or timestamps? Filter ((cast(registration#37 as string) >= 2015-05-28) && (cast(registration#37 as string) <= 2015-05-29))

Spark sql not pushing down timestamp range queries

2016-04-14 Thread Kiran Chitturi
Hi, Timestamp range filter queries in SQL are not getting pushed down to the PrunedFilteredScan instances. The filtering is happening at the Spark layer. The physical plan for timestamp range queries does not show the pushed filters, whereas range queries on other types work fine, as the