Re: Does Data pipeline using kafka and structured streaming work?
One thing you should be aware of (that's a showstopper for my use cases, but may not be for yours) is that you can provide Kafka offsets to start from, but you can't really get access to offsets and metadata during the job on a per-batch or per-partition basis, just on a per-message basis.

On Tue, Nov 1, 2016 at 8:29 PM, Michael Armbrust wrote:
> Yeah, those are all requests for additional features / version support.
> I've been using Kafka with Structured Streaming to do both ETL into
> partitioned Parquet tables as well as streaming event-time windowed
> aggregation for several weeks now.
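To make the point about offsets concrete, here is a minimal PySpark sketch, not taken from the thread. It assumes the option names (`kafka.bootstrap.servers`, `subscribe`, `startingOffsets`) and the JSON offset format from the Kafka integration guide for Spark 2.1 and later; earlier releases may differ. The function names, broker address, and topic are hypothetical. The `topic`, `partition`, and `offset` columns are what the source exposes on each row, which is the per-message access described above.

```python
import json


def kafka_source_options(bootstrap_servers, topic, starting_offsets=None):
    """Build the option map for Spark's "kafka" streaming source.

    starting_offsets: optional {partition: offset} dict for `topic`.
    Assumption: option names follow the Spark 2.1+ Kafka integration guide.
    """
    options = {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
    }
    if starting_offsets is not None:
        # The source expects JSON shaped like {"topic": {"partition": offset}},
        # so partition numbers are converted to string keys.
        per_partition = {str(p): o for p, o in starting_offsets.items()}
        options["startingOffsets"] = json.dumps({topic: per_partition})
    return options


def read_with_offsets(spark, options):
    """Return a streaming DataFrame with each message's Kafka coordinates.

    `spark` is an existing SparkSession (hypothetical caller-provided object).
    Offsets are only visible here as ordinary columns on each row.
    """
    return (
        spark.readStream.format("kafka")
        .options(**options)
        .load()
        .selectExpr(
            "CAST(value AS STRING) AS value", "topic", "partition", "offset"
        )
    )
```

A caller would pass something like `kafka_source_options("localhost:9092", "events", {0: 23})` to `read_with_offsets` and then attach a sink. Running it against a real cluster also requires the Spark Kafka connector package on the classpath.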
Re: Does Data pipeline using kafka and structured streaming work?
Yeah, those are all requests for additional features / version support. I've been using Kafka with Structured Streaming to do both ETL into partitioned Parquet tables as well as streaming event-time windowed aggregation for several weeks now.

On Tue, Nov 1, 2016 at 6:18 PM, Cody Koeninger wrote:
> Look at the resolved subtasks attached to that ticket you linked.
> Some of them are unresolved, but basic functionality is there.
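As a rough illustration of the partitioned Parquet use case mentioned in this message, the sketch below shows one way such a sink might be wired up in PySpark. It is not Michael's actual job. The function name, the `date` partition column, and both paths are hypothetical, and `events` is assumed to be a streaming DataFrame that already contains the partition column.

```python
def start_parquet_etl(events, output_path, checkpoint_path,
                      partition_column="date"):
    """Start a streaming query that appends `events` to Parquet files.

    Files are laid out in one directory per value of `partition_column`.
    All names and paths are illustrative assumptions, not from the thread.
    """
    return (
        events.writeStream.format("parquet")
        .partitionBy(partition_column)
        .option("path", output_path)
        # The file sink needs a checkpoint directory to track progress.
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start()
    )
```

The returned object is the running streaming query. A caller would typically keep it and block on `awaitTermination()`.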
Re: Does Data pipeline using kafka and structured streaming work?
Look at the resolved subtasks attached to that ticket you linked. Some of them are unresolved, but basic functionality is there.

On Tue, Nov 1, 2016 at 7:37 PM, shyla deshpande wrote:
> Hi Michael,
>
> Thanks for the reply.
>
> The following link says there is an open, unresolved Jira for Structured
> Streaming support for consuming from Kafka.
>
> https://issues.apache.org/jira/browse/SPARK-15406
>
> Appreciate your help.
>
> -Shyla
Re: Does Data pipeline using kafka and structured streaming work?
Hi Michael,

Thanks for the reply.

The following link says there is an open, unresolved Jira for Structured Streaming support for consuming from Kafka.

https://issues.apache.org/jira/browse/SPARK-15406

Appreciate your help.

-Shyla

On Tue, Nov 1, 2016 at 5:19 PM, Michael Armbrust wrote:
> I'm not aware of any open issues against the Kafka source for Structured
> Streaming.
Re: Does Data pipeline using kafka and structured streaming work?
I'm not aware of any open issues against the Kafka source for Structured Streaming.

On Tue, Nov 1, 2016 at 4:45 PM, shyla deshpande wrote:
> I am building a data pipeline using Kafka, Spark Streaming and Cassandra.
> Wondering if the issues with the Kafka source are fixed in Spark 2.0.1. If not,
> please give me an update on when they may be fixed.
>
> Thanks
> -Shyla
Does Data pipeline using kafka and structured streaming work?
I am building a data pipeline using Kafka, Spark Streaming and Cassandra. Wondering if the issues with the Kafka source are fixed in Spark 2.0.1. If not, please give me an update on when they may be fixed.

Thanks
-Shyla