Hi all,
We currently have the following issue with a Spark Structured Streaming
(SS) application. The application reads messages from thousands of source
systems, stores them in Kafka, and aggregates them in Spark using SS with
a 15-minute watermark.
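For readers unfamiliar with the setup described, here is a minimal sketch of that kind of watermarked aggregation. Topic name, broker address, window size, and column names are hypothetical; only the 15-minute watermark comes from the description above. It requires a Spark runtime with the spark-sql-kafka connector on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("ss-aggregation").getOrCreate()
import spark.implicits._

// Read raw messages from Kafka (hypothetical topic/broker).
val messages = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "source-messages")
  .load()
  .selectExpr("CAST(value AS STRING) AS body", "timestamp")

// Aggregate in 10-minute event-time windows; the 15-minute watermark
// bounds how late data may arrive before its window state is dropped.
val counts = messages
  .withWatermark("timestamp", "15 minutes")
  .groupBy(window($"timestamp", "10 minutes"), $"body")
  .count()

counts.writeStream
  .outputMode("append")
  .format("console")
  .start()
```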
The root problem is that a few of the source …

… wrote:
> This makes sense to me, and I was going to propose something similar in
> order to be able to use the Kafka ACLs more effectively as well. Can you
> file a JIRA for it?
>
> Tom
>
> On Friday, November 9, 2018, 2:26:12 AM CST, Anastasios Zouzias <
> zouz...@gmail.com> wrote:

… of the form (spark-kafka-source-*).
Best regards,
Anastasios Zouzias
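For context on the fragment above: Spark's Kafka source generates a random consumer group id prefixed with `spark-kafka-source-`, which is why a prefix-matched ACL is the natural fit for the proposal being discussed. A sketch of such an ACL (principal name and broker address are hypothetical; requires Kafka 2.0+ for `--resource-pattern-type prefixed`):

```shell
# Grant Read on every consumer group whose name starts with the
# prefix Spark generates, instead of enumerating random group ids.
kafka-acls --bootstrap-server broker:9092 \
  --add --allow-principal User:spark-app \
  --operation Read \
  --group spark-kafka-source- \
  --resource-pattern-type prefixed
```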
… way all the others.
> >>
> >> My question is: which file should I modify in order to isolate one
> >> partition of the RDD? Where is the actual partitioning done?
> >>
> >> I hope it is clear!
> >>
> >> Thank you very much,
> >> Thodoris
> >>
> >>
> >> -
> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>
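Regarding the question quoted above: one way to operate on a single RDD partition, without modifying any Spark source file, is `mapPartitionsWithIndex`. A minimal sketch (the data and partition count are hypothetical; requires a local Spark runtime):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[4]")
  .appName("one-partition")
  .getOrCreate()

val rdd = spark.sparkContext.parallelize(1 to 100, numSlices = 4)

// Keep only the data of partition 0; all other partitions yield nothing.
val onlyFirst = rdd.mapPartitionsWithIndex { (idx, iter) =>
  if (idx == 0) iter else Iterator.empty
}

println(onlyFirst.count()) // count of elements in partition 0 only
spark.stop()
```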
--
-- Anastasios Zouzias
<a...@zurich.ibm.com>
> If I just increase numPartitions to be twice as large, how does
> coalesce(numPartitions: Int, shuffle: Boolean = false) keep the data
> locality? Do I need to define my own Partitioner?
>
> Thanks,
> Fei
>
> On Sun, Jan 15, 2017 at 3:58 AM, Anastasios Zouzias <zouz...@gmail.com>
> wrote:
> Is there anyone who knows how to implement it, or any hints for it?
>
> Thanks in advance,
> Fei
>
>
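On the coalesce question quoted above, two facts are worth stating plainly: with `shuffle = false`, `coalesce` is a narrow transformation that merges existing partitions (preferring co-located ones, so locality is preserved and no data moves over the network), but it can only *reduce* the partition count. To double the number of partitions you must pass `shuffle = true` (equivalent to `repartition`), which triggers a full shuffle and gives up the locality benefit. A small sketch (the data is hypothetical; requires a local Spark runtime):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[4]")
  .appName("coalesce-demo")
  .getOrCreate()

val rdd = spark.sparkContext.parallelize(1 to 1000, numSlices = 8)

// Narrow dependency: merges 8 partitions down to 4, no shuffle.
val narrowed = rdd.coalesce(4)

// shuffle = true is required to go UP from 8 to 16 partitions.
val widened = rdd.coalesce(16, shuffle = true)

println(narrowed.getNumPartitions) // 4
println(widened.getNumPartitions)  // 16
spark.stop()
```

No custom Partitioner is needed for either case; a Partitioner only matters when you need keys routed to specific partitions, which `coalesce` does not do.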
--
-- Anastasios Zouzias
<a...@zurich.ibm.com>