[ https://issues.apache.org/jira/browse/SPARK-17592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jorge Machado updated SPARK-17592:
----------------------------------
    Comment: was deleted

(was: I'm hitting the same issue, I'm afraid, but in a slightly different way.

When I have a DataFrame (that comes from an Oracle DB) saved as Parquet, I can see in the logs that a field is being saved as integer:

{ "type" : "struct", "fields" : [ { "name" : "project_id", "type" : "integer", "nullable" : true, "metadata" : { } },...

On Hue (which reads from Hive) I see:

!image-2018-05-24-17-10-24-515.png!)

> SQL: CAST string as INT inconsistent with Hive
> ----------------------------------------------
>
>                 Key: SPARK-17592
>                 URL: https://issues.apache.org/jira/browse/SPARK-17592
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Furcy Pin
>            Priority: Major
>         Attachments: image-2018-05-24-17-10-24-515.png
>
>
> Hello,
> there seems to be an inconsistency between Spark and Hive when casting a
> string into an Int.
> With Hive:
> {code}
> select cast("0.4" as INT) ;
> > 0
> select cast("0.5" as INT) ;
> > 0
> select cast("0.6" as INT) ;
> > 0
> {code}
> With Spark-SQL:
> {code}
> select cast("0.4" as INT) ;
> > 0
> select cast("0.5" as INT) ;
> > 1
> select cast("0.6" as INT) ;
> > 1
> {code}
> Hive seems to perform a floor(string.toDouble), while Spark seems to perform
> a round(string.toDouble).
> I'm not sure there is any ISO standard for this. MySQL has the same behavior
> as Hive, while PostgreSQL performs a string.toInt and throws a
> NumberFormatException.
> Personally I think Hive is right, hence my posting this here.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
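
For illustration only, the three behaviors described in the report can be sketched in plain Python. This is a model of what the reporter observed, not the actual Hive, Spark or PostgreSQL implementation. The function names are hypothetical, and half-up rounding is an assumption made because "0.5" came back as 1 in the Spark-SQL output above.

```python
import math


def cast_floor(s: str) -> int:
    """Model of the behavior the reporter attributes to Hive:
    floor(string.toDouble)."""
    return math.floor(float(s))


def cast_round_half_up(s: str) -> int:
    """Model of the behavior the reporter attributes to Spark SQL 2.0.0:
    round(string.toDouble). Half-up rounding is an assumption here;
    Python's built-in round() rounds halves to even, so it is not used."""
    return math.floor(float(s) + 0.5)


def cast_strict(s: str) -> int:
    """Model of the strict behavior attributed to PostgreSQL: parse the
    string as an integer and fail otherwise. Python raises ValueError,
    standing in for the NumberFormatException mentioned in the report."""
    return int(s)


if __name__ == "__main__":
    for value in ("0.4", "0.5", "0.6"):
        print(value, cast_floor(value), cast_round_half_up(value))
```

With the inputs from the report, the first function yields 0 for all three strings and the second yields 0, 1, 1, which matches the Hive and Spark-SQL results quoted above. The strict variant rejects all three strings.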