[GitHub] [spark] HyukjinKwon edited a comment on issue #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
HyukjinKwon edited a comment on issue #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
URL: https://github.com/apache/spark/pull/24405#issuecomment-540881610

@giamo, would you mind updating the PR? Sorry for my late response. It looks like we're getting closer to merging.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
HyukjinKwon edited a comment on issue #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
URL: https://github.com/apache/spark/pull/24405#issuecomment-529158817

> 2. It's possible to pass the writer schema inside the options map, if you guys think it would be more clear. Should I implement the change?

WDYT @mgaido91, @gengliangwang, and @cloud-fan?

I think we should stop adding parameters and merge this one into `options`. The options of `from/to_xxx` already differ from the DataSource options: some options, `multiline` for instance, are specific to the DataSource. I think it's fine to have the opposite case as well, with options specific to `from/to_xxx`.
HyukjinKwon edited a comment on issue #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
URL: https://github.com/apache/spark/pull/24405#issuecomment-526981210

Yes, please. What I am worried about is:

1. I would like to be very sure that this is worthwhile and something regular users want. From what I read in the comments and code here, it basically allows a wider schema, whereas currently the schema is strictly restricted to the Avro schema converted from Spark's. Am I correct? Can you show it with actual code examples (preferably runnable)? I recently updated the PR template (see https://github.com/apache/spark/commit/0ea8db9fd3d882140d8fa305dd69fc94db62cf8f). It would be great if it were followed.
2. Can we merge those additional parameters into `options: java.util.Map[String, String]`? I don't think it's a good idea to keep adding parameters to the function while we have `options`. If the Avro DataSource does not have equivalent options, we can add options specific to both functions.
3. Out of curiosity, how do we handle such a case in the Avro DataSource? Are we able to merge the options if there are any?
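
To make point 2 concrete, here is a minimal, Spark-independent sketch of reading an extra setting from an options map instead of adding a new function parameter. The class name, the method, and the `writerSchema` key are all hypothetical, chosen for illustration only. They are not taken from this PR or from Spark's API. The fallback to the reader schema when the key is absent is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class AvroOptionsSketch {
    // Hypothetical option key, for illustration only; not a confirmed Spark option name.
    static final String WRITER_SCHEMA_KEY = "writerSchema";

    // Returns the writer schema carried in the options map, or falls back to
    // the reader schema when the option is absent (assumed default behavior).
    static String resolveWriterSchema(String readerSchema, Map<String, String> options) {
        return options.getOrDefault(WRITER_SCHEMA_KEY, readerSchema);
    }

    public static void main(String[] args) {
        String readerSchema = "{\"type\":\"long\"}";
        Map<String, String> options = new HashMap<>();

        // No option set: a single schema serves as both reader and writer schema.
        System.out.println(resolveWriterSchema(readerSchema, options));

        // Option set: the data was written as int and is read as long,
        // a promotion that Avro schema resolution permits.
        options.put(WRITER_SCHEMA_KEY, "{\"type\":\"int\"}");
        System.out.println(resolveWriterSchema(readerSchema, options));
    }
}
```

With this shape the function signature stays unchanged when further settings are introduced, which appears to be the concern behind the request to stop adding parameters.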