Yes, as long as there are 3 cores available on your local machine. (Note that with master("local[*]") there are no separate worker nodes; the 3 consumer tasks run as threads inside the single local JVM.)

On Fri, Apr 20, 2018 at 10:56 AM karthikjay <aswin8...@gmail.com> wrote:
> I have the following code to read data from a Kafka topic using
> Structured Streaming. The topic has 3 partitions:
>
>     val spark = SparkSession
>       .builder
>       .appName("TestPartition")
>       .master("local[*]")
>       .getOrCreate()
>
>     import spark.implicits._
>
>     val dataFrame = spark
>       .readStream
>       .format("kafka")
>       .option("kafka.bootstrap.servers",
>         "1.2.3.184:9092,1.2.3.185:9092,1.2.3.186:9092")
>       .option("subscribe", "partition_test")
>       .option("failOnDataLoss", "false")
>       .load()
>       .selectExpr("CAST(value AS STRING)")
>
> My understanding is that Spark will launch 3 Kafka consumers (one for
> each of the 3 partitions) and that these 3 consumers will run on the
> worker nodes. Is my understanding right?
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
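For what it's worth, a minimal way to run that snippet end to end and see the per-partition parallelism is to attach a sink and start the query. The broker addresses and topic name below come from the question; the console sink and the 10-second trigger interval are assumptions added just for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object TestPartition {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("TestPartition")
      .master("local[*]")
      .getOrCreate()

    val dataFrame = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers",
        "1.2.3.184:9092,1.2.3.185:9092,1.2.3.186:9092")
      .option("subscribe", "partition_test")
      .option("failOnDataLoss", "false")
      .load()
      .selectExpr("CAST(value AS STRING)")

    // With 3 topic partitions, each micro-batch stage should contain
    // 3 tasks (visible under Stages in the Spark UI), one per partition.
    // In local mode those tasks run as threads in this JVM, so at least
    // 3 cores are needed for them to run concurrently.
    val query = dataFrame.writeStream
      .format("console")
      .trigger(Trigger.ProcessingTime("10 seconds")) // assumed interval
      .start()

    query.awaitTermination()
  }
}
```

Running with master("local[2]") instead would still consume all 3 partitions, but the 3 tasks per batch would be scheduled over only 2 threads rather than running fully in parallel.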