Hi,

String comparison itself is pushed down fine, but the problem is how to
deal with the Cast.
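Incidentally, the string form of the comparison is still correct for zero-padded ISO-8601 dates, since lexicographic order then matches chronological order. A plain-Scala illustration (no Spark involved):

```scala
// Strings compare character by character, so zero-padded
// "yyyy-MM-dd" dates sort chronologically.
println("2015-05-28" < "2015-05-29")  // true
println("2015-05-28" < "2015-12-01")  // true

// Without zero padding the lexicographic order breaks down:
println("2015-5-9" < "2015-12-01")    // false: '5' sorts after '1'
```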


It was pushed down before, but that was reverted (
https://github.com/apache/spark/pull/8049).

Several fixes were tried, e.g. https://github.com/apache/spark/pull/11005,
but none of them made it in.


To cut it short, it is not being pushed down because it is unsafe to
resolve the cast (e.g. long to integer).
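For illustration, the kind of unsafety involved in the long-to-integer case (plain Scala, nothing Spark-specific):

```scala
// Narrowing a Long to an Int silently wraps around modulo 2^32,
// so rewriting a predicate through such a cast can change its result.
val big: Long = 3000000000L   // exceeds Int.MaxValue (2147483647)
println(big.toInt)            // -1294967296, the sign flipped
println(big > 0L)             // true
println(big.toInt > 0)        // false: the "same" predicate now disagrees
```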

As a workaround, the Solr data source implementation could be changed to
one using CatalystScan, which receives all the filters.

CatalystScan is not designed to be binary compatible across releases;
however, some think it is stable now, as mentioned here:
https://github.com/apache/spark/pull/10750#issuecomment-175400704.
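To make the difference concrete, here is a minimal self-contained sketch. The `SourceFilter`/`Expression` types below are stand-ins I made up for the real `org.apache.spark.sql.sources.Filter` and Catalyst `Expression` trees, so treat this as an illustration of the mechanism, not the actual Spark API:

```scala
// Stand-ins for the real Spark types (illustrative only).
sealed trait SourceFilter                            // sources.Filter: a small, fixed language
case class GreaterThanOrEqual(attr: String, v: Any) extends SourceFilter

sealed trait Expression                              // Catalyst expression tree: arbitrary
case class AttributeRef(name: String)               extends Expression
case class Cast(child: Expression, toType: String)  extends Expression
case class Literal(v: Any)                          extends Expression
case class GtEq(left: Expression, right: Expression) extends Expression

// PrunedFilteredScan-style: Spark hands the relation only the filters it
// could translate into the simple filter language; a Cast around the
// column blocks the translation, so the predicate never arrives.
def translate(e: Expression): Option[SourceFilter] = e match {
  case GtEq(AttributeRef(a), Literal(v)) => Some(GreaterThanOrEqual(a, v))
  case _                                 => None  // Cast(...) falls through here
}

val pred = GtEq(Cast(AttributeRef("registration"), "string"), Literal("2015-05-28"))
println(translate(pred))  // None: filtering stays at the Spark layer

// CatalystScan-style: the relation receives `pred` itself, Cast included,
// and can translate it into a Solr date-range query on its own.
```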


Thanks!


2016-04-15 3:30 GMT+09:00 Mich Talebzadeh <mich.talebza...@gmail.com>:

> Hi Josh,
>
> Can you please clarify whether date comparisons as two strings work at all?
>
> I was under the impression that with string comparison only the first
> characters are compared?
>
> Thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 14 April 2016 at 19:26, Josh Rosen <joshro...@databricks.com> wrote:
>
>> AFAIK this is not being pushed down because it involves an implicit cast
>> and we currently don't push casts into data sources or scans; see
>> https://github.com/databricks/spark-redshift/issues/155 for a
>> possibly-related discussion.
>>
>> On Thu, Apr 14, 2016 at 10:27 AM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Are you comparing strings in here or timestamp?
>>>
>>> Filter ((cast(registration#37 as string) >= 2015-05-28) &&
>>> (cast(registration#37 as string) <= 2015-05-29))
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 14 April 2016 at 18:04, Kiran Chitturi <kiran.chitt...@lucidworks.com
>>> > wrote:
>>>
>>>> Hi,
>>>>
>>>> Timestamp range filter queries in SQL are not getting pushed down to
>>>> the PrunedFilteredScan instances. The filtering is happening at the Spark
>>>> layer.
>>>>
>>>> The physical plan for timestamp range queries does not show the pushed
>>>> filters, whereas range queries on other types work fine and their
>>>> physical plan does show the pushed filters.
>>>>
>>>> Please see below for code and examples.
>>>>
>>>> *Example:*
>>>>
>>>> *1.* Range filter queries on Timestamp types
>>>>
>>>>    *code: *
>>>>
>>>>> sqlContext.sql("SELECT * from events WHERE `registration` >=
>>>>> '2015-05-28' AND `registration` <= '2015-05-29' ")
>>>>
>>>>    *Full example*:
>>>> https://github.com/lucidworks/spark-solr/blob/master/src/test/scala/com/lucidworks/spark/EventsimTestSuite.scala#L151
>>>> *    plan*:
>>>> https://gist.github.com/kiranchitturi/4a52688c9f0abe3d4b2bd8b938044421#file-time-range-sql
>>>>
>>>> *2. * Range filter queries on Long types
>>>>
>>>>     *code*:
>>>>
>>>>> sqlContext.sql("SELECT * from events WHERE `length` >= '700' and
>>>>> `length` <= '1000'")
>>>>
>>>>     *Full example*:
>>>> https://github.com/lucidworks/spark-solr/blob/master/src/test/scala/com/lucidworks/spark/EventsimTestSuite.scala#L151
>>>>     *plan*:
>>>> https://gist.github.com/kiranchitturi/4a52688c9f0abe3d4b2bd8b938044421#file-length-range-sql
>>>>
>>>> The SolrRelation class we use extends PrunedFilteredScan:
>>>> https://github.com/lucidworks/spark-solr/blob/master/src/main/scala/com/lucidworks/spark/SolrRelation.scala#L37
>>>>
>>>> Since Solr supports date ranges, I would like for the timestamp filters
>>>> to be pushed down to the Solr query.
>>>>
>>>> Are there limitations on the types of filters that are passed down for
>>>> Timestamp types?
>>>> Is there something I should do in my code to fix this?
>>>>
>>>> Thanks,
>>>> --
>>>> Kiran Chitturi
>>>>
>>>>
>>>
>
