[ 
https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568393#comment-16568393
 ] 

Thomas Graves commented on SPARK-24924:
---------------------------------------

| It wouldn't be very different for 2.4.0. It could be different, but I guess it 
should be an incremental improvement without behaviour changes.

I don't buy this argument. The code has been restructured a lot, and you could 
have introduced bugs, behavior changes, etc. If users have been using the 
databricks spark-avro version on other releases and it was working fine, and now 
we magically map it to a different version and they break, they are going to 
complain and say, "I didn't change anything, why did this break?"

Users could also have made their own modified version of the databricks 
spark-avro package (which we actually have, to support primitive types), and thus 
the implementation is not the same, yet you are assuming it is. Just to note: 
the fact that we use a different version isn't my issue; I'm happy to make that 
work. I'm worried about other users who didn't happen to see this jira. I also 
realize these are 3rd-party packages, but I think we are making an assumption 
here based on this being a databricks package, which in my opinion we 
shouldn't. What if this were a companyX package that we didn't know about? What 
would/should the expected behavior be?

How many users complained about the csv thing? Could we just improve the error 
message to state more simply: "Multiple sources found; perhaps you are 
including an external package that also supports avro. Spark started supporting 
it internally as of release X.Y; please remove the external package or rewrite 
to use a different function."
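
A rough sketch of the check I have in mind, written in Python for brevity 
rather than Spark's Scala. Every name here (`resolve_source`, 
`MultipleSourcesError`, the provider strings) is hypothetical and is not 
Spark's actual lookup code; `X.Y` stays a placeholder as in the message above.

```python
class MultipleSourcesError(Exception):
    """Raised when a short name is claimed by more than one provider."""


def resolve_source(short_name, providers, builtin_since="X.Y"):
    """Return the single provider registered for short_name.

    providers: dict mapping a provider identifier to the short name it claims.
    builtin_since: release string used only in the error text (placeholder).
    """
    # Collect every provider that claims this short name.
    matches = sorted(p for p, name in providers.items() if name == short_name)
    if not matches:
        raise LookupError(f"No data source found for '{short_name}'.")
    if len(matches) > 1:
        # Fail loudly instead of silently picking or remapping one of them.
        raise MultipleSourcesError(
            f"Multiple sources found for '{short_name}' ({', '.join(matches)}); "
            f"perhaps you are including an external package that also supports "
            f"{short_name}. Spark started supporting it internally as of release "
            f"{builtin_since}; please remove the external package or rewrite to "
            f"use a different function."
        )
    return matches[0]


# Hypothetical registry: a built-in source and an external package both claim "avro".
providers = {
    "builtin-avro": "avro",
    "com.databricks.spark.avro": "avro",
    "builtin-csv": "csv",
}
```

With this registry, `resolve_source("csv", providers)` returns the one csv 
provider, while `resolve_source("avro", providers)` raises 
`MultipleSourcesError` with the message above, so the user is told what 
changed instead of being silently remapped.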

> Add mapping for built-in Avro data source
> -----------------------------------------
>
>                 Key: SPARK-24924
>                 URL: https://issues.apache.org/jira/browse/SPARK-24924
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> This issue aims to do the following:
>  # Like the `com.databricks.spark.csv` mapping, we should map 
> `com.databricks.spark.avro` to the built-in Avro data source.
>  # Remove the incorrect error message, `Please find an Avro package at ...`.


