Re: Structured Streaming with Kafka sources/sinks

Cody Koeninger Tue, 30 Aug 2016 08:10:35 -0700

Not that I wouldn't rather have more open communication around this
issue...but what are people actually expecting to get out of
structured streaming with regard to Kafka?


There aren't any realistic pushdown-type optimizations available, and
from what I could tell the last time I looked at structured streaming,
resolving the event time vs processing time issue was still a ways
off.

On Tue, Aug 30, 2016 at 1:56 AM, Ofir Manor <ofir.ma...@equalum.io> wrote:
> I personally find it disappointing that a big chunk of Spark's design and
> development is happening behind closed curtains. It makes it harder than
> necessary for me to work with Spark. We had to improvise in the recent weeks
> a temporary solution for reading from Kafka (from Structured Streaming) to
> unblock our development, and I feel that if the design and development of
> that feature was done in the open, it would have saved us a lot of hassle
> (and would reduce the refactoring of our code base).
>
> It's hard not to compare it to other Apache projects - for example, I believe
> most of the Apache Kafka full-time contributors work at a single company,
> but they manage as a community to have a very transparent design and
> development process, which seems to work great.
>
> Ofir Manor
>
> Co-Founder & CTO | Equalum
>
> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>
>
> On Mon, Aug 29, 2016 at 10:39 PM, Fred Reiss <freiss....@gmail.com> wrote:
>>
>> I think that the community really needs some feedback on the progress of
>> this very important task. Many existing Spark Streaming applications can't
>> be ported to Structured Streaming without Kafka support.
>>
>> Is there a design document somewhere?  Or can someone from the Databricks
>> team break down the existing monolithic JIRA issue into smaller steps that
>> reflect the current development plan?
>>
>> Fred
>>
>>
>> On Sat, Aug 27, 2016 at 2:32 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>
>>> thats great
>>>
>>> is this effort happening anywhere that is publicly visible? github?
>>>
>>> On Tue, Aug 16, 2016 at 2:04 AM, Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>> We (the team at Databricks) are working on one currently.
>>>>
>>>>
>>>> On Mon, Aug 15, 2016 at 7:26 PM, Cody Koeninger <c...@koeninger.org>
>>>> wrote:
>>>>>
>>>>> https://issues.apache.org/jira/browse/SPARK-15406
>>>>>
>>>>> I'm not working on it (yet?), never got an answer to the question of
>>>>> who was planning to work on it.
>>>>>
>>>>> On Mon, Aug 15, 2016 at 9:12 PM, Guo, Chenzhao <chenzhao....@intel.com>
>>>>> wrote:
>>>>> > Hi all,
>>>>> >
>>>>> >
>>>>> >
>>>>> > I’m trying to write Structured Streaming test code and will deal with
>>>>> > Kafka source. Currently Spark 2.0 doesn’t support Kafka sources/sinks.
>>>>> >
>>>>> >
>>>>> >
>>>>> > I found some Databricks slides saying that Kafka sources/sinks will be
>>>>> > implemented in Spark 2.0, so is there anybody working on this? And when
>>>>> > will it be released?
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > Chenzhao Guo
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>
>>>>
>>>
>>
>

