[ 
https://issues.apache.org/jira/browse/SPARK-26165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698227#comment-16698227
 ] 

Sujith edited comment on SPARK-26165 at 11/25/18 7:16 PM:
----------------------------------------------------------

I think we should avoid casting to string when the string literal in the filter 
condition can be converted to a valid date/timestamp, as in the filter 
condition mentioned in the JIRA; otherwise we can fall back to the current 
logic of casting to string type.

This approach can also avoid the unnecessary overhead of casting the 
timestamp/date values of the left-hand filter column expression to string 
type, as mentioned in the JIRA.
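
To make the idea concrete, here is a rough sketch of the decision only, not the 
actual Spark analyzer rule. The names Coercion, CompareAsTimestamp, 
CompareAsString and chooseCoercion are hypothetical, and java.sql.Timestamp.valueOf 
is used as a simplified stand-in for Spark's own string-to-timestamp cast, which 
accepts more formats.

```scala
import java.sql.Timestamp
import scala.util.{Failure, Success, Try}

// Hypothetical model of the two ways to evaluate `tsColumn > 'literal'`.
sealed trait Coercion
// Proposed path: convert the literal once and compare as timestamps.
case class CompareAsTimestamp(literal: Timestamp) extends Coercion
// Current behaviour: cast the column value to string for every row.
case object CompareAsString extends Coercion

object CoercionSketch {
  // Timestamp.valueOf only accepts "yyyy-[m]m-[d]d hh:mm:ss[.f...]" and throws
  // IllegalArgumentException otherwise, so any other literal falls back.
  def chooseCoercion(literal: String): Coercion =
    Try(Timestamp.valueOf(literal.trim)) match {
      case Success(ts) => CompareAsTimestamp(ts)
      case Failure(_)  => CompareAsString
    }
}
```

With this sketch, chooseCoercion("2017-02-26 13:45:12") picks the timestamp 
comparison, while a literal such as "abc" keeps the existing string comparison. 
A date-only literal also falls back here, which is a limitation of the stand-in 
parser rather than part of the proposal.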

I will raise a PR to handle this issue. Please let me know if you have any suggestions.

 cc [~srowen]  [~cloud_fan] [~vinodkc] 


was (Author: s71955):
I think we should avoid casting to string when the string literal in the filter 
condition can be converted to a valid date or timestamp, as in the filter 
condition mentioned in the JIRA; otherwise we can fall back to the current 
logic of casting to string type.

This approach can also avoid the unnecessary overhead of casting the 
timestamp/date values of the left-hand filter column expression to string.

I will raise a PR to handle this issue. Please let me know if you have any suggestions.

 cc [~srowen]  [~cloud_fan] [~vinodkc] 

> Date and Timestamp column expression is getting converted to string in less 
> than/greater than filter query even though valid date/timestamp string 
> literal is used in the right side filter expression
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-26165
>                 URL: https://issues.apache.org/jira/browse/SPARK-26165
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer
>    Affects Versions: 2.3.2, 2.4.0
>            Reporter: Sujith
>            Priority: Major
>         Attachments: timestamp_filter_perf.PNG
>
>
> Date and Timestamp column is getting converted to string in less than/greater 
> than filter query even though date strings that contain a time, like 
> '2018-03-18 12:39:40' to date. Besides it's not possible to cast a string 
> like '2018-03-18 12:39:40' to a timestamp.
>  
> scala> spark.sql("""explain extended SELECT username FROM orders WHERE 
> order_creation_date > '2017-02-26 13:45:12'""").show(false);
> +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> |== Parsed Logical Plan ==
> 'Project ['username]
> +- 'Filter ('order_creation_date > 2017-02-26 13:45:12)
>  +- 'UnresolvedRelation `orders`
> == Analyzed Logical Plan ==
> username: string
> Project [username#59]
> +- Filter (cast(order_creation_date#60 as string) > 2017-02-26 13:45:12)
>  +- SubqueryAlias orders
>  +- HiveTableRelation `default`.`orders`, 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [username#59, 
> order_creation_date#60, amount#61]
> == Optimized Logical Plan ==
> Project [username#59]
> +- Filter (isnotnull(order_creation_date#60) && (cast(order_creation_date#60 
> as string) > 2017-02-26 13:45:12))
>  +- HiveTableRelation `default`.`orders`, 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [username#59, 
> order_creation_date#60, amount#61]
> == Physical Plan ==
> *(1) Project [username#59]
> +- *(1) Filter (isnotnull(order_creation_date#60) && 
> (cast(order_creation_date#60 as string) > 2017-02-26 13:45:12))
>  +- HiveTableScan [order_creation_date#60, username#59], HiveTableRelation 
> `default`.`orders`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> [username#59, order_creation
> +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
