Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22121#discussion_r210972278

    --- Diff: docs/avro-data-source-guide.md ---
    @@ -0,0 +1,267 @@
    +---
    +layout: global
    +title: Avro Data Source Guide
    +---
    +
    +Since the Spark 2.4 release, [Spark SQL](https://spark.apache.org/docs/latest/sql-programming-guide.html) provides support for reading and writing Avro data.
    +
    +## Deploying
    +The `spark-avro` module is external and not included in `spark-submit` or `spark-shell` by default.
    +
    +As with any Spark application, `spark-submit` is used to launch your application. `spark-avro_{{site.SCALA_BINARY_VERSION}}`
    +and its dependencies can be added directly to `spark-submit` using `--packages`, such as,
    --- End diff --

    Here I am following https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#deploying. Using `--packages` ensures that this library and its dependencies will be added to the classpath, which should be good enough for general users. Users who build their own jar are expected to know the general `--jars` option. I can add it if you insist.
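For context, a sketch of the two launch styles discussed in the comment (the Scala binary version `2.12` and Spark version `2.4.0` below are illustrative placeholders, not values taken from the diff; substitute the versions matching your Spark build):

```shell
# Resolve spark-avro and its transitive dependencies from Maven Central
# at launch time via --packages (the approach the docs recommend).
# Versions shown are placeholders; match them to your Spark installation.
./bin/spark-submit \
  --packages org.apache.spark:spark-avro_2.12:2.4.0 \
  your_application.py

# Alternative for users who manage their own jars: put a pre-downloaded
# spark-avro jar (plus any dependencies) on the classpath with --jars.
./bin/spark-submit \
  --jars spark-avro_2.12-2.4.0.jar \
  your_application.py
```

The practical difference is that `--packages` fetches and caches the artifacts automatically, while `--jars` requires the files to already exist locally or at a reachable path.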