Hi,

Thanks Cody and Michael! I didn't expect to get two answers so quickly, and from THE brains behind the Spark-Kafka integration. #impressed

Yes, Michael has nailed it. Using save's path was so natural to me after months with Spark that I was surprised not to see it supported in place of the custom, and surely not very obvious, topic option.

Imagine my day today when I discovered that I could use KafkaSource in batch queries and then suddenly found out that save has no support for path. I'm not faint-hearted so I survived :-)

I think that change would make KafkaSource even cooler. Please add support if possible (and make it part of the upcoming 2.2.0, too!)

Thanks.

Jacek

On 1 May 2017 7:26 p.m., "Michael Armbrust" <[email protected]> wrote:

> He's just suggesting that since the DataStreamWriter start() method can
> fill in an option named "path", we should make that a synonym for "topic".
> Then you could do something like.
>
> df.writeStream.format("kafka").start("topic")
>
> Seems reasonable if people don't think that is confusing.
>
> On Mon, May 1, 2017 at 8:43 AM, Cody Koeninger <[email protected]> wrote:
>
>> I'm confused about what you're suggesting. Are you saying that a
>> Kafka sink should take a filesystem path as an option?
>>
>> On Mon, May 1, 2017 at 8:52 AM, Jacek Laskowski <[email protected]> wrote:
>> > Hi,
>> >
>> > I've just found out that KafkaSourceProvider supports topic option
>> > that sets the Kafka topic to save a DataFrame to.
>> >
>> > You can also use topic column to assign rows to topics.
>> >
>> > Given the features, I've been wondering why "path" option is not
>> > supported (even of least precedence) so when no topic column or option
>> > are defined, save(path: String) would be the least priority.
>> >
>> > WDYT?
>> >
>> > It looks pretty trivial to support --> see KafkaSourceProvider at
>> > lines [1] and [2] if I'm not mistaken.
>> >
>> > [1] https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L145
>> > [2] https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L163
>> >
>> > Regards,
>> > Jacek Laskowski
>> > ----
>> > https://medium.com/@jaceklaskowski/
>> > Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
>> > Follow me at https://twitter.com/jaceklaskowski
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe e-mail: [email protected]

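To make the fallback order discussed in this thread concrete, here is a minimal sketch in plain Scala with no Spark dependency. The names TopicResolution and resolveTopic are hypothetical and are not part of KafkaSourceProvider. The sketch assumes the "topic" option is consulted first and a per-row topic column value second; that relative order is an assumption made for illustration. The only part taken from the proposal is that "path" (the option that save(path) and start(path) fill in) comes last, used only when neither of the other two is present.

```scala
object TopicResolution {
  // Hypothetical helper, not Spark API: picks the Kafka topic for one row.
  // Assumed order: "topic" option, then the row's topic column value,
  // then "path" as the lowest-priority fallback (the idea proposed above).
  def resolveTopic(options: Map[String, String], rowTopic: Option[String]): Option[String] =
    options.get("topic")
      .orElse(rowTopic)
      .orElse(options.get("path"))

  def main(args: Array[String]): Unit = {
    // Only "path" is set, as start("events") or save("events") would do.
    println(resolveTopic(Map("path" -> "events"), None))
    // A topic column value is present, so "path" is ignored.
    println(resolveTopic(Map("path" -> "events"), Some("audit")))
    // Nothing is set: no topic can be resolved.
    println(resolveTopic(Map.empty, None))
  }
}
```

With a fallback like this, a call such as df.writeStream.format("kafka").start("events") would write to the "events" topic only when no topic option or topic column is defined, so existing queries would keep their current behaviour.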