https://issues.apache.org/jira/browse/SPARK-20597
I'm going to send a PR soon.

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Mon, May 1, 2017 at 8:26 PM, Cody Koeninger <[email protected]> wrote:
> Yeah, seems reasonable.
>
> On Mon, May 1, 2017 at 12:40 PM, Jacek Laskowski <[email protected]> wrote:
>> Hi,
>>
>> Thanks Cody and Michael! I didn't expect to get two answers so quickly and
>> from THE brains behind the Spark-Kafka integration. #impressed
>>
>> Yes, Michael has nailed it. Using save's path was so natural to me after
>> months with Spark that I was surprised not to see it supported in place of
>> the custom and surely not very obvious topic option.
>>
>> Imagine my day today when I discovered that I could use KafkaSource in
>> batch queries and then suddenly found out that save does not support path.
>> I'm not faint-hearted so I survived :-)
>>
>> I think that change would make KafkaSource even cooler. Please add support
>> if possible (and make it part of the upcoming 2.2.0, too!)
>>
>> Thanks.
>>
>> Jacek
>>
>> On 1 May 2017 7:26 p.m., "Michael Armbrust" <[email protected]> wrote:
>>>
>>> He's just suggesting that since the DataStreamWriter start() method can
>>> fill in an option named "path", we should make that a synonym for "topic".
>>> Then you could do something like:
>>>
>>> df.writeStream.format("kafka").start("topic")
>>>
>>> Seems reasonable if people don't think that is confusing.
>>>
>>> On Mon, May 1, 2017 at 8:43 AM, Cody Koeninger <[email protected]> wrote:
>>>>
>>>> I'm confused about what you're suggesting. Are you saying that a
>>>> Kafka sink should take a filesystem path as an option?
>>>>
>>>> On Mon, May 1, 2017 at 8:52 AM, Jacek Laskowski <[email protected]> wrote:
>>>> > Hi,
>>>> >
>>>> > I've just found out that KafkaSourceProvider supports a topic option
>>>> > that sets the Kafka topic to save a DataFrame to.
>>>> >
>>>> > You can also use a topic column to assign rows to topics.
>>>> >
>>>> > Given these features, I've been wondering why the "path" option is not
>>>> > supported (even with the lowest precedence), so that when neither a
>>>> > topic column nor a topic option is defined, the path passed to
>>>> > save(path: String) would be used as a last resort.
>>>> >
>>>> > WDYT?
>>>> >
>>>> > It looks pretty trivial to support --> see KafkaSourceProvider at
>>>> > lines [1] and [2] if I'm not mistaken.
>>>> >
>>>> > [1]
>>>> > https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L145
>>>> > [2]
>>>> > https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L163
>>>> >
>>>> > Regards,
>>>> > Jacek Laskowski

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]
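
For readers following the thread, the fallback being proposed can be sketched in plain Scala. This is only an illustration of the precedence discussed above, not the code in KafkaSourceProvider and not the eventual PR: the object TopicResolution and the function resolveTopic are hypothetical names, and placing the "topic" option ahead of the topic column is an assumption based on the Kafka integration guide's statement that the option overrides the column. The "path" fallback is the proposed behavior, which was not supported at the time of the thread.

```scala
// Hypothetical helper, not part of Spark: chooses the target Kafka topic for one row.
object TopicResolution {
  def resolveTopic(
      options: Map[String, String], // writer options, e.g. "topic" or "path"
      rowTopic: Option[String]      // value of the row's topic column, if any
  ): Option[String] = {
    // Treat missing or blank option values as undefined.
    def nonEmpty(key: String): Option[String] =
      options.get(key).map(_.trim).filter(_.nonEmpty)

    nonEmpty("topic")             // 1. explicit "topic" option (assumed to win)
      .orElse(rowTopic)           // 2. per-row topic column
      .orElse(nonEmpty("path"))   // 3. proposed last resort: "path" from save(path)/start(path)
  }
}
```

With such a fallback, the "path" filled in by start("topic") or save("topic") would only take effect when neither the option nor the column supplies a topic, so existing queries would keep their current behavior.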
