Hi,
I know this is a basic question, but someone enquired about it and I just
wanted to fill my knowledge gap, so to speak.
Within the context of Spark Streaming, an RDD is created from the incoming
topic, the RDD is partitioned, and each Spark node operates on one
partition at a time. Is that correct?
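For concreteness, a minimal sketch of that model, assuming the Kafka 0-10
direct stream API; the broker address, topic name, and group id below are
hypothetical. Each batch's RDD gets one partition per Kafka topic
partition, and each executor core works on one partition at a time:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    val conf = new SparkConf().setAppName("partition-demo")
    val ssc  = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",           // assumed broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "partition-demo"            // assumed group id
    )

    // One RDD per batch interval; by default one RDD partition per Kafka
    // topic partition, so executors process partitions in parallel.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    stream.foreachRDD { rdd =>
      println(s"batch has ${rdd.getNumPartitions} partitions")
    }

    ssc.start()
    ssc.awaitTermination()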
We partitioned the data logically for the two different jobs... in our use
case, based on geography...
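A minimal sketch of what I mean, with a hypothetical "region" partition
column and hypothetical paths. Each job writes directly into its own
partition directory, so each job gets its own _temporary staging area and
the two jobs never touch the same files:

    import org.apache.spark.sql.functions.col
    // Assumes spark-shell, where `spark` (SparkSession) is predefined.

    // Job 1 (its own application / SparkSession) handles region=EU only.
    val eu = spark.read.parquet("/data/incoming")    // assumed input path
      .filter(col("region") === "EU")
      .drop("region")                                // path already encodes it

    // Writing straight into the leaf directory keeps this job's _temporary
    // staging dir separate from the other job's, avoiding committer clashes.
    eu.write.mode("overwrite").parquet("/data/out/region=EU")

    // Job 2, running elsewhere, does the same for region=US:
    //   ...parquet("/data/out/region=US")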
On Thu, 12 Dec 2019 at 3:39 pm, Chetan Khatri wrote:
> Thanks. If you can share the alternative design change, I would love to
> hear about it from you.
>
> On Wed, Dec 11, 2019 at 9:34 PM ayan guha wrote:
>
>> No, we faced problems with that setup.
Thanks. If you can share the alternative design change, I would love to
hear about it from you.
On Wed, Dec 11, 2019 at 9:34 PM ayan guha wrote:
> No, we faced problems with that setup.
>
> On Thu, 12 Dec 2019 at 11:14 am, Chetan Khatri
> <chetan.opensou...@gmail.com> wrote:
>
>> Hi Spark Users,
>> Would it be possible to write to the same partition of a parquet file
>> through two concurrent Spark jobs with different Spark sessions?
No, we faced problems with that setup.
On Thu, 12 Dec 2019 at 11:14 am, Chetan Khatri wrote:
> Hi Spark Users,
> Would it be possible to write to the same partition of a parquet file
> through two concurrent Spark jobs with different Spark sessions?
>
> thanks
>
--
Best Regards,
Ayan Guha
Hi Spark Users,
Would it be possible to write to the same partition of a parquet file
through two concurrent Spark jobs with different Spark sessions?
thanks
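To make the question concrete, a minimal sketch of the setup I mean; the
paths and the partition column are hypothetical. My understanding is that,
with the default Hadoop file output committer, both jobs would stage their
pending files under the same /data/out/_temporary directory, which is what
worries me:

    import org.apache.spark.sql.SparkSession

    // Application 1; application 2 is identical but reads its own input.
    val spark = SparkSession.builder.appName("writer-1").getOrCreate()

    val df = spark.read.parquet("/data/staging/job1") // assumed input path

    df.write
      .mode("append")
      .partitionBy("region")                 // assumed partition column
      .parquet("/data/out") // same base path as the other job:
                            // both stage under /data/out/_temporary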
I have found a source on how Spark compiles code and dynamically loads it
into distributed executors in the Spark REPL:
https://ardoris.wordpress.com/2014/03/30/how-spark-does-class-loading/
If you run the Spark REPL, you can find the Spark configuration like this:
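For example, a minimal sketch from inside spark-shell, where `sc`
(the SparkContext) is predefined; I believe the spark.repl.* entries are
the ones relevant to the class loading described in that post:

    // Print every configuration entry the shell was started with.
    sc.getConf.getAll.sorted.foreach { case (k, v) => println(s"$k = $v") }

    // Narrow it to the REPL class-loading settings, e.g. where compiled
    // REPL classes are written so that executors can fetch and load them.
    sc.getConf.getAll.filter(_._1.startsWith("spark.repl")).foreach(println)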