Re: Structured Streaming with Kafka sources/sinks

2016-08-30 Thread Reynold Xin
In this case, simply not much progress has been made, because people might be busy with other stuff. Ofir, it looks like you have spent a non-trivial amount of time thinking about this topic and have even designed something to work -- can you chime in on the JIRA ticket with your thoughts and your

Re: Structured Streaming with Kafka sources/sinks

2016-08-30 Thread Nicholas Chammas
> I personally find it disappointing that a big chunk of Spark's design and development is happening behind closed curtains. I'm not too familiar with Streaming, but I see design docs and proposals for ML and SQL published here and on JIRA all the time, and they are discussed extensively. For

Re: Structured Streaming with Kafka sources/sinks

2016-08-30 Thread Cody Koeninger
Not that I wouldn't rather have more open communication around this issue...but what are people actually expecting to get out of structured streaming with regard to Kafka? There aren't any realistic pushdown-type optimizations available, and from what I could tell the last time I looked at

Re: Structured Streaming with Kafka sources/sinks

2016-08-30 Thread Ofir Manor
I personally find it disappointing that a big chunk of Spark's design and development is happening behind closed curtains. It makes it harder than necessary for me to work with Spark. In recent weeks we had to improvise a temporary solution for reading from Kafka (from Structured Streaming) to

Re: Structured Streaming with Kafka sources/sinks

2016-08-29 Thread Fred Reiss
I think that the community really needs some feedback on the progress of this very important task. Many existing Spark Streaming applications can't be ported to Structured Streaming without Kafka support. Is there a design document somewhere? Or can someone from the Databricks team break down

Re: Structured Streaming with Kafka sources/sinks

2016-08-27 Thread Koert Kuipers
That's great. Is this effort happening anywhere that is publicly visible? GitHub? On Tue, Aug 16, 2016 at 2:04 AM, Reynold Xin wrote: > We (the team at Databricks) are working on one currently. > > > On Mon, Aug 15, 2016 at 7:26 PM, Cody Koeninger >

Re: Structured Streaming with Kafka sources/sinks

2016-08-16 Thread Reynold Xin
We (the team at Databricks) are working on one currently. On Mon, Aug 15, 2016 at 7:26 PM, Cody Koeninger wrote: > https://issues.apache.org/jira/browse/SPARK-15406 > > I'm not working on it (yet?), never got an answer to the question of > who was planning to work on it. >

Re: Structured Streaming with Kafka sources/sinks

2016-08-15 Thread Cody Koeninger
https://issues.apache.org/jira/browse/SPARK-15406 I'm not working on it (yet?), never got an answer to the question of who was planning to work on it. On Mon, Aug 15, 2016 at 9:12 PM, Guo, Chenzhao wrote: > Hi all, > > > > I’m trying to write Structured Streaming test

Structured Streaming with Kafka sources/sinks

2016-08-15 Thread Guo, Chenzhao
Hi all, I'm trying to write Structured Streaming test code and will deal with a Kafka source. Currently Spark 2.0 doesn't support Kafka sources/sinks. I found some Databricks slides saying that Kafka sources/sinks will be implemented in Spark 2.0, so is there anybody working on this? And when