Kafka-based Spark Streaming and Vertex AI for Sentiment Analysis

2024-02-21 Thread Mich Talebzadeh
I am working on a pet project to implement a real-time sentiment analysis system for customer reviews. It leverages Kafka for data ingestion, Spark Structured Streaming (SSS) for real-time processing, and Vertex AI for sentiment analysis and potential action triggers. *Features* -
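A minimal sketch of the ingestion and scoring loop such a pipeline implies, assuming a Kafka topic named customer-reviews and a hypothetical scoreSentiment helper standing in for the Vertex AI prediction call (broker address, topic name, and helper are all placeholders, not details from the original post):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReviewSentimentStream {
  // Hypothetical stand-in for a Vertex AI prediction request.
  def scoreSentiment(text: String): Double = ???

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ReviewSentimentStream").getOrCreate()
    import spark.implicits._

    // Read the raw review stream from Kafka.
    val reviews = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "customer-reviews")
      .load()
      .selectExpr("CAST(value AS STRING) AS review")

    // Score each micro-batch and write the result out (console output for the sketch).
    val query = reviews.writeStream
      .foreachBatch { (batch: DataFrame, _: Long) =>
        batch.as[String]
          .map(r => (r, scoreSentiment(r)))
          .toDF("review", "sentiment")
          .show(truncate = false)
      }
      .start()

    query.awaitTermination()
  }
}
```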

Re: Kafka to spark streaming

2022-01-30 Thread Gourav Sengupta
Hi Amit, before answering your question, I am just trying to understand it. I am not exactly clear how the Akka application, Kafka, and Spark Streaming application sit together, and what exactly you are trying to achieve. Can you please elaborate? Regards, Gourav On Fri, Jan 28, 2022 at 10

Re: Kafka to spark streaming

2022-01-29 Thread Amit Sharma
Thanks Mich. The link you shared has only two options, Kafka and Socket. Thanks Amit On Sat, Jan 29, 2022 at 3:49 AM Mich Talebzadeh wrote: > So you have a classic architecture with spark receiving events through a > kafka topic via kafka-spark-connector, do something with it and send data >

Re: Kafka to spark streaming

2022-01-29 Thread Mich Talebzadeh
So you have a classic architecture with Spark receiving events through a Kafka topic via the kafka-spark connector, doing something with them, and sending data out to the consumer. Are you using Spark Structured Streaming here with batch streaming? check

Kafka to spark streaming

2022-01-28 Thread Amit Sharma
Hello everyone, we have a Spark Streaming application. We send requests to the stream through an Akka actor using a Kafka topic. We wait for the response as it is real time. Just want a suggestion: is there any better option, like Livy, where we can send and receive requests to Spark Streaming? Thanks Amit

Re: Kafka with Spark Streaming work on local but it doesn't work in Standalone mode

2020-07-24 Thread Gabor Somogyi
Hi Davide, Please see the doc: *Note: Kafka 0.8 support is deprecated as of Spark 2.3.0.* Have you tried the same with Structured Streaming and not with DStreams? If you want to stay with DStreams, you can use the spark-streaming-kafka-0-10 connector instead. BR, G On Fri, Jul 24, 2020 at 12:08 PM
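For reference, a minimal DStream setup against the spark-streaming-kafka-0-10 connector looks roughly like this (broker address, group id, and topic name are placeholders):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val conf = new SparkConf().setAppName("Kafka010Example")
val ssc = new StreamingContext(conf, Seconds(5))

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9092",        // placeholder broker
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

// One direct stream over the 0.10 consumer API.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Set("my-topic"), kafkaParams)
)

stream.map(record => (record.key, record.value)).print()

ssc.start()
ssc.awaitTermination()
```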

Kafka with Spark Streaming work on local but it doesn't work in Standalone mode

2020-07-24 Thread Davide Curcio
nd Zookeeper in the same machine in which I have the driver, it worked both locally and in the cluster. But obviously for the sake of scalability and modularity I'd like to use the current configuration. I'm using Spark 2.4.6, and the Kafka streaming API is "spark-streaming-kafka-0-8-ass

Re: Issue Storing offset in Kafka for Spark Streaming Application

2017-10-13 Thread Arpan Rajani
Hi Gerard, Excellent, indeed your inputs helped. Thank you for the quick reply. I modified the code based on your inputs. Now the application starts and reads from the topic. Then we stream around 50,000 messages on the Kafka topic. After a while we terminate the application using YARN kill and

Re: Issue Storing offset in Kafka for Spark Streaming Application

2017-10-13 Thread Gerard Maas
Hi Arpan, The error suggests that the streaming context has been started with streamingContext.start() and, after that statement, some other DStream operations have been attempted. A suggested pattern to manage the offsets is the following: var offsetRanges: Array[OffsetRange] = _ //create
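Filled out, the pattern described here looks roughly like this with the 0-10 direct stream, assuming `stream` is the InputDStream returned by KafkaUtils.createDirectStream and that all of these operations are declared before streamingContext.start() is called:

```scala
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges, OffsetRange}

// Declared once, before the streaming context is started.
var offsetRanges: Array[OffsetRange] = Array.empty

stream.transform { rdd =>
  // Capture each batch's offsets as the first operation on the stream.
  offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  rdd
}.foreachRDD { rdd =>
  // ... write the batch to its sink here ...

  // Commit the offsets back to Kafka only after the output has succeeded.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
```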

Issue Storing offset in Kafka for Spark Streaming Application

2017-10-13 Thread Arpan Rajani
Hi all, In our cluster we have Kafka 0.10.1 and Spark 2.1.0. We are trying to store the offsets in Kafka in order to achieve restartability of the streaming application. (I already implemented this using checkpoints, but we would need to change code in production, hence checkpointing won't work.) Checking

Re: Which streaming platform is best? Kafka or Spark Streaming?

2017-03-11 Thread Gaurav Pandya
ine.netdna-ssl.com/wp-content/uploads/2015/11/spark-streaming-datanami.png > > > Regards, > Vaquar khan > > > On Fri, Mar 10, 2017 at 6:17 AM, Sean Owen <so...@cloudera.com> wrote: > >> Kafka and Spark Streaming don't do the same thing. Kafka stores and >&

Re: Which streaming platform is best? Kafka or Spark Streaming?

2017-03-10 Thread vaquar khan
at 6:17 AM, Sean Owen <so...@cloudera.com> wrote: > Kafka and Spark Streaming don't do the same thing. Kafka stores and > transports data, Spark Streaming runs computations on a stream of data. > Neither is itself a streaming platform in its entirety. > > It's kind of like aski

Re: Which streaming platform is best? Kafka or Spark Streaming?

2017-03-10 Thread Sean Owen
Kafka and Spark Streaming don't do the same thing. Kafka stores and transports data, Spark Streaming runs computations on a stream of data. Neither is itself a streaming platform in its entirety. It's kind of like asking whether you should build a website using just MySQL, or nginx. > On 9

Re: Which streaming platform is best? Kafka or Spark Streaming?

2017-03-10 Thread Robin East
17, at 20:37, Gaurav1809 <gauravhpan...@gmail.com> wrote: >> >> Hi All, Would you please let me know which streaming platform is best. Be it >> server log processing, social media feeds ot any such streaming data. I want >> to know the comparison between Kafka &a

Re: Which streaming platform is best? Kafka or Spark Streaming?

2017-03-09 Thread Jörn Franke
ase let me know which streaming platform is best. Be it > server log processing, social media feeds ot any such streaming data. I want > to know the comparison between Kafka & Spark Streaming. > > > > -- > View this message in context: > http://apache-spark-user-list.100

Which streaming platform is best? Kafka or Spark Streaming?

2017-03-09 Thread Gaurav1809
Hi All, Would you please let me know which streaming platform is best? Be it server log processing, social media feeds or any such streaming data. I want to know the comparison between Kafka & Spark Streaming. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Jacek Laskowski
> > From: Mich Talebzadeh <mich.talebza...@gmail.com> > Date: Friday, December 2, 2016 at 12:26 PM > To: Gabriel Perez <gabr...@adtheorent.com> > Cc: Jacek Laskowski <ja...@japila.pl>, user <user@spark.apache.org> > > > Subject: Re: Kafka 0.10 & Spark

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Gabriel Perez
:26 PM To: Gabriel Perez <gabr...@adtheorent.com> Cc: Jacek Laskowski <ja...@japila.pl>, user <user@spark.apache.org> Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2 in this POC of yours are you running this app with spark in Local mode by any chance? Dr Mich Ta

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Mich Talebzadeh
rez <gabr...@adtheorent.com> > *Cc: *user <user@spark.apache.org> > *Subject: *Re: Kafka 0.10 & Spark Streaming 2.0.2 > > > > Hi, > > > > How many partitions does the topic have? How do you check how many > executors read from the topic? > > &

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Gabriel Perez
Laskowski <ja...@japila.pl> Date: Friday, December 2, 2016 at 12:21 PM To: Gabriel Perez <gabr...@adtheorent.com> Cc: user <user@spark.apache.org> Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2 Hi, Can you post the screenshot of the Executors and Streaming tabs? Jacek

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Jacek Laskowski
> *To: *Gabriel Perez <gabr...@adtheorent.com> > *Cc: *user <user@spark.apache.org> > *Subject: *Re: Kafka 0.10 & Spark Streaming 2.0.2 > > > > Hi, > > > > How many partitions does the topic have? How do you check how many > executors read from

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Gabriel Perez
Friday, December 2, 2016 at 11:47 AM To: Gabriel Perez <gabr...@adtheorent.com> Cc: user <user@spark.apache.org> Subject: Re: Kafka 0.10 & Spark Streaming 2.0.2 Hi, How many partitions does the topic have? How do you check how many executors read from the topic? Jacek

Re: Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread Jacek Laskowski
@Override public void call( JavaRDD<ConsumerRecord<String, String>> rdd ) { OffsetRange[] offsetRanges = ( (HasOffsetRanges) rdd.rdd() ).offsetRanges(); // some time later, after outputs have compl

Kafka 0.10 & Spark Streaming 2.0.2

2016-12-02 Thread gabrielperez2484
OffsetRanges) rdd.rdd() ).offsetRanges(); // some time later, after outputs have completed ( (CanCommitOffsets) stream.inputDStream() ).commitAsync( offsetRanges ); } } ); -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-0-10-Spark-Streaming-2-0-2-tp28153.html

Re: Number of consumers in Kafka with Spark Streaming

2016-06-21 Thread Cody Koeninger
The direct stream doesn't use consumer groups in the same way the Kafka high-level consumer does, but you should be able to pass a group id in the Kafka parameters. On Tue, Jun 21, 2016 at 9:56 AM, Guillermo Ortiz <konstt2...@gmail.com> wrote: > I use Spark Streaming with Kafka and I'd lik
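A sketch of passing an explicit group id through the Kafka parameters with the direct stream of that era (0.8 integration); broker, group, and topic names are placeholders and `ssc` is assumed to exist. Note the direct stream creates one RDD partition per Kafka partition, which is effectively the answer to "how many consumers", and it tracks offsets itself rather than relying on the group for load balancing:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

val kafkaParams = Map[String, String](
  "metadata.broker.list" -> "broker1:9092",
  "group.id" -> "my-streaming-app"   // explicit group id passed through to Kafka
)

// One RDD partition is created per Kafka partition of the subscribed topics.
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("my-topic")
)
```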

Number of consumers in Kafka with Spark Streaming

2016-06-21 Thread Guillermo Ortiz
I use Spark Streaming with Kafka and I'd like to know how many consumers are generated. I guess there are as many as there are partitions in Kafka, but I'm not sure. Is there a way to know the name of the groupId generated by Spark for Kafka?

RE: Handle empty kafka in Spark Streaming

2016-06-15 Thread David Newberger
Newberger -Original Message- From: Yogesh Vyas [mailto:informy...@gmail.com] Sent: Wednesday, June 15, 2016 8:30 AM To: David Newberger Subject: Re: Handle empty kafka in Spark Streaming I am looking for something which checks the JavaPairReceiverInputDStream before going further for any
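One common guard for this case, sketched for a DStream (assuming `stream` is the Kafka input DStream):

```scala
stream.foreachRDD { rdd =>
  // Skip micro-batches that received no Kafka messages.
  if (!rdd.isEmpty()) {
    // ... normal processing of the non-empty batch goes here ...
    println(s"processing ${rdd.count()} records")
  }
}
```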

RE: Handle empty kafka in Spark Streaming

2016-06-15 Thread David Newberger
day, June 15, 2016 6:31 AM To: user Subject: Handle empty kafka in Spark Streaming Hi, Does anyone know how to handle empty Kafka while a Spark Streaming job is running? Regards, Yogesh

Handle empty kafka in Spark Streaming

2016-06-15 Thread Yogesh Vyas
Hi, Does anyone know how to handle empty Kafka while a Spark Streaming job is running? Regards, Yogesh

Re: Access fields by name/index from Avro data read from Kafka through Spark Streaming

2016-02-25 Thread Harsh J
cks.com> wrote: > >> You can use `DStream.map` to transform objects to anything you want. >> >> On Thu, Feb 25, 2016 at 11:06 AM, Mohammad Tariq <donta...@gmail.com> >> wrote: >> >>> Hi group, >>> >>> I have just started working

Re: Access fields by name/index from Avro data read from Kafka through Spark Streaming

2016-02-25 Thread Mohammad Tariq
anything you want. > > On Thu, Feb 25, 2016 at 11:06 AM, Mohammad Tariq <donta...@gmail.com> > wrote: > >> Hi group, >> >> I have just started working with confluent platform and spark streaming, >> and was wondering if it is possible to access individu

Re: Access fields by name/index from Avro data read from Kafka through Spark Streaming

2016-02-25 Thread Shixiong(Ryan) Zhu
access individual fields from an > Avro object read from a kafka topic through spark streaming. As per its > default behaviour *KafkaUtils.createDirectStream[Object, Object, > KafkaAvroDecoder, KafkaAvroDecoder](ssc, kafkaParams, topicsSet)* return > a *DStream[Object, Object]*, and do

Access fields by name/index from Avro data read from Kafka through Spark Streaming

2016-02-25 Thread Mohammad Tariq
Hi group, I have just started working with confluent platform and spark streaming, and was wondering if it is possible to access individual fields from an Avro object read from a kafka topic through spark streaming. As per its default behaviour *KafkaUtils.createDirectStream[Object, Object
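Following the DStream.map suggestion from the replies, a rough sketch of pulling named fields out of the decoded values — this assumes the KafkaAvroDecoder hands back Avro GenericRecord instances (the usual Confluent behaviour) and that the schema has fields named user_id and text, which are only examples:

```scala
import org.apache.avro.generic.GenericRecord

// stream is the DStream[(Object, Object)] returned by createDirectStream
val fields = stream.map { case (_, value) =>
  val record = value.asInstanceOf[GenericRecord]
  // Access individual Avro fields by name.
  (record.get("user_id").toString, record.get("text").toString)
}
fields.print()
```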

Re: Optimize the performance of inserting data to Cassandra with Kafka and Spark Streaming

2016-02-17 Thread radoburansky
> > Thank you very much for your reading and suggestions in advance. > > Jerry Wong > > ------ > If you reply to this email, your message will be added to the discussion > below: > > http://apache-spark-user-list.1001560.n3.nabble.com/Optimize-t

Re: Optimize the performance of inserting data to Cassandra with Kafka and Spark Streaming

2016-02-17 Thread Jerry
and send messages to brokers (1,000 messages per >> time) >> >> But only about 100 messages can be inserted into Cassandra in each round >> of the test. Can anybody give me advice on why the other messages (about 900 messages) >> can't be consumed? How do I c

Optimize the performance of inserting data to Cassandra with Kafka and Spark Streaming

2016-02-16 Thread Jerry
r your reading and suggestions in advance. Jerry Wong -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Optimize-the-performance-of-inserting-data-to-Cassandra-with-Kafka-and-Spark-Streaming-tp262

architecture though experiment: what is the advantage of using kafka with spark streaming?

2015-12-10 Thread Andy Davidson
I noticed that many people are using Kafka and Spark Streaming. Can someone provide a couple of use cases? I imagine some possible use cases might be: Is the purpose of using Kafka to 1. provide some buffering? 2. implement some sort of load balancing for the overall system? 3. provide filtering

Re: architecture though experiment: what is the advantage of using kafka with spark streaming?

2015-12-10 Thread Cody Koeninger
language for processing data than what you'd get with writing kafka consumers yourself. On Thu, Dec 10, 2015 at 8:00 PM, Andy Davidson < a...@santacruzintegration.com> wrote: > I noticed that many people are using Kafka and spark streaming. Can some > one provide a couple of use case

Re: SSL between Kafka and Spark Streaming API

2015-08-28 Thread Cassa L
You can configure a PLAINTEXT listener as well with the broker and use that port for Spark. -- Harsha On August 28, 2015 at 12:24:45 PM, Sourabh Chandak (sourabh3...@gmail.com) wrote: Can we use the existing kafka spark streaming jar to connect to a kafka server running in SSL mode

SSL between Kafka and Spark Streaming API

2015-08-28 Thread Cassa L
Hi, I was going through the SSL setup of Kafka: https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka However, I am also using Spark-Kafka streaming to read data from Kafka. Is there a way to activate SSL for the Spark Streaming API, or is it not possible at all? Thanks, LCassa
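For what it's worth, the newer spark-streaming-kafka-0-10 integration simply forwards Kafka's own client settings, so SSL can be switched on through kafkaParams; a sketch with placeholder paths and passwords (not applicable to the old 0.8-based connector discussed in this thread):

```scala
import org.apache.kafka.common.serialization.StringDeserializer

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9093",
  // Standard Kafka consumer SSL settings, passed straight through to the client.
  "security.protocol" -> "SSL",
  "ssl.truststore.location" -> "/path/to/client.truststore.jks",
  "ssl.truststore.password" -> "changeit",
  "ssl.keystore.location" -> "/path/to/client.keystore.jks",
  "ssl.keystore.password" -> "changeit",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "ssl-example"
)
```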

Re: SSL between Kafka and Spark Streaming API

2015-08-28 Thread Cassa L
using Spark-Kafka streaming to read data from Kafka. Is there a way to activate SSL for spark streaming API or not possible at all? Thanks, LCassa

Re: SSL between Kafka and Spark Streaming API

2015-08-28 Thread Cody Koeninger
was going through SSL setup of Kafka. https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka However, I am also using Spark-Kafka streaming to read data from Kafka. Is there a way to activate SSL for spark streaming API or not possible at all? Thanks, LCassa

Re: SSL between Kafka and Spark Streaming API

2015-08-28 Thread Sourabh Chandak
Can we use the existing kafka spark streaming jar to connect to a kafka server running in SSL mode? We are fine with a non-SSL consumer as our kafka cluster and spark cluster are in the same network. Thanks, Sourabh On Fri, Aug 28, 2015 at 12:03 PM, Gwen Shapira g...@confluent.io wrote: I can't

Re: No Twitter Input from Kafka to Spark Streaming

2015-08-06 Thread Akhil Das
://apache-spark-user-list.1001560.n3.nabble.com/No-Twitter-Input-from-Kafka-to-Spark-Streaming-tp24131p24142.html

Re: No Twitter Input from Kafka to Spark Streaming

2015-08-05 Thread narendra
Thanks Akash for the answer. I added an endpoint to the listener and now it is working. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/No-Twitter-Input-from-Kafka-to-Spark-Streaming-tp24131p24142.html

No Twitter Input from Kafka to Spark Streaming

2015-08-04 Thread narendra
My application takes Twitter4j tweets and publishes them to a topic in Kafka. Spark Streaming subscribes to that topic for processing. But in practice, Spark Streaming is not able to receive tweet data from Kafka, so Spark Streaming is running empty batch jobs without input and I am not able to see

Re: No Twitter Input from Kafka to Spark Streaming

2015-08-04 Thread Cody Koeninger
Have you tried using the console consumer to see if anything is actually getting published to that topic? On Tue, Aug 4, 2015 at 11:45 AM, narendra narencs...@gmail.com wrote: My application takes Twitter4j tweets and publishes those to a topic in Kafka. Spark Streaming subscribes

writing to kafka using spark streaming

2015-07-06 Thread Shushant Arora
I have a requirement to write to a Kafka queue from a Spark Streaming application. I am using Spark 1.2 streaming. Since different executors in Spark are allocated at each run, instantiating a new Kafka producer at each run seems a costly operation. Is there a way to reuse objects in processing

Re: writing to kafka using spark streaming

2015-07-06 Thread Cody Koeninger
Use foreachPartition, and allocate whatever the costly resource is once per partition. On Mon, Jul 6, 2015 at 6:11 AM, Shushant Arora shushantaror...@gmail.com wrote: I have a requirement to write in kafka queue from a spark streaming application. I am using spark 1.2 streaming. Since
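A sketch of that pattern for a DStream of strings (assuming `dstream` is a DStream[String]; broker address and output topic are placeholders):

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

dstream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // One producer per partition per batch instead of one per record.
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)

    records.foreach { msg =>
      producer.send(new ProducerRecord[String, String]("output-topic", msg))
    }
    producer.close()
  }
}
```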

Re: writing to kafka using spark streaming

2015-07-06 Thread Tathagata Das
Yeah, creating a new producer at the granularity of partitions may not be that costly. On Mon, Jul 6, 2015 at 6:40 AM, Cody Koeninger c...@koeninger.org wrote: Use foreachPartition, and allocate whatever the costly resource is once per partition. On Mon, Jul 6, 2015 at 6:11 AM, Shushant

Re: writing to kafka using spark streaming

2015-07-06 Thread Shushant Arora
What's the difference between foreachPartition and mapPartitions for a DStream? Both work at partition granularity. One is an operation and the other is an action, but if I call an operation afterwards on mapPartitions as well, which one is more efficient and recommended? On Tue, Jul 7, 2015 at 12:21 AM,

Re: writing to kafka using spark streaming

2015-07-06 Thread Tathagata Das
Both have the same efficiency. The primary difference is that one is a transformation (hence it is lazy and requires another action to actually execute), and the other is an action. But it may be a slightly better design in general to have transformations be purely functional (that is, no external side

Re: writing to kafka using spark streaming

2015-07-06 Thread Shushant Arora
When using foreachPartition, the jobs that get created are not displayed on the driver console but are visible on the web UI. On the driver it creates some stage statistics of the form [Stage 2: (0 + 2) / 5] which then disappear. I am using foreachPartition as:

Re: Help with publishing to Kafka from Spark Streaming?

2015-05-02 Thread Saisai Shao
Here is the pull request, you may refer to this: https://github.com/apache/spark/pull/2994 Thanks Jerry 2015-05-01 14:38 GMT+08:00 Pavan Sudheendra pavan0...@gmail.com: Link to the question: http://stackoverflow.com/questions/29974017/spark-kafka-producer-not-serializable-exception

Help with publishing to Kafka from Spark Streaming?

2015-05-01 Thread Pavan Sudheendra
Link to the question: http://stackoverflow.com/questions/29974017/spark-kafka-producer-not-serializable-exception Thanks for any pointers.
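For the archives, one widely used way around the producer-not-serializable problem (a sketch of the general idea, not the code from the linked question or pull request): ship only the serializable configuration with the closure and create the real KafkaProducer lazily on each executor, typically wrapped in a broadcast variable.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Only the factory function (closing over serializable Properties) travels with
// the task; the producer itself is created lazily on each executor.
class KafkaSink(createProducer: () => KafkaProducer[String, String]) extends Serializable {
  lazy val producer: KafkaProducer[String, String] = createProducer()
  def send(topic: String, value: String): Unit =
    producer.send(new ProducerRecord[String, String](topic, value))
}

object KafkaSink {
  def apply(props: Properties): KafkaSink =
    new KafkaSink(() => new KafkaProducer[String, String](props))
}

// Typical use: broadcast the sink once and call it from inside foreachRDD/foreachPartition.
// val sink = sparkContext.broadcast(KafkaSink(props))
// dstream.foreachRDD(_.foreach(msg => sink.value.send("output-topic", msg)))
```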

Re: How to replay consuming messages from kafka using spark streaming?

2015-01-23 Thread mykidong
Hi, I have written a Spark Streaming Kafka receiver using the Kafka simple consumer API: https://github.com/mykidong/spark-kafka-simple-consumer-receiver This Kafka receiver can be used as an alternative to the current Spark Streaming Kafka receiver, which is written with the high-level Kafka consumer API

Re: How to replay consuming messages from kafka using spark streaming?

2015-01-14 Thread Cody Koeninger
. my streaming job is retrieving messages from kafka and saving them as avro files onto hdfs. My question is, if a worker sometimes fails to write avro to hdfs, I want to replay consuming messages from the last succeeded kafka offset again. I think the Spark Streaming Kafka Receiver is written using
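With the direct stream (spark-streaming-kafka, available since Spark 1.3) you can do exactly this by tracking offsets yourself and restarting from them; a rough sketch with placeholder topic, partitions, and offsets, assuming `ssc` and `kafkaParams` already exist:

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

// Offsets recovered from your own store (e.g. written alongside the Avro output on HDFS).
val fromOffsets: Map[TopicAndPartition, Long] = Map(
  TopicAndPartition("my-topic", 0) -> 12345L,  // placeholder offsets
  TopicAndPartition("my-topic", 1) -> 67890L
)

// Restart the stream exactly where the last successful write left off.
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder, (String, String)](
  ssc,
  kafkaParams,
  fromOffsets,
  (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message)
)
```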

RE: How to replay consuming messages from kafka using spark streaming?

2015-01-14 Thread Shao, Saisai
, January 15, 2015 11:59 AM To: user@spark.apache.org Subject: How to replay consuming messages from kafka using spark streaming? Hi, My Spark Streaming Job is doing like kafka etl to HDFS. For instance, every 10 min. my streaming job is retrieving messages from kafka, and save them as avro files

How to replay consuming messages from kafka using spark streaming?

2015-01-14 Thread mykidong
succeeded kafka offset again. I think the Spark Streaming Kafka Receiver is written using the Kafka High Level Consumer API, not the Simple Consumer API. Any idea how to replay Kafka consumption in Spark Streaming? - Kidong. -- View this message in context: http://apache-spark-user-list.1001560.n3