Re: How to update structured streaming apps gracefully

2018-12-18 Thread Priya Matpadi
The changes to a streaming query that allow or disallow recovery from a
checkpoint are clearly documented at
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovery-semantics-after-changes-in-a-streaming-query.
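For a concrete picture, here is a minimal sketch of the restart-from-checkpoint
flow that guide describes. This is an illustration only, not code from this
thread: the Kafka source, broker address, topic, and checkpoint path are all
assumed names.

    import org.apache.spark.sql.SparkSession

    object UpgradeableStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("upgradeable-stream")
          .getOrCreate()

        // The checkpoint location is the upgrade contract: if the new jar
        // keeps the same checkpointLocation and only makes changes the guide
        // lists as allowed, offsets and state are recovered on restart.
        val query = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092") // assumed
          .option("subscribe", "events")                    // assumed
          .load()
          .writeStream
          .format("console")
          .option("checkpointLocation", "hdfs:///checkpoints/events") // keep stable across versions
          .start()

        query.awaitTermination()
      }
    }

To upgrade, stop the running application, replace the jar, and start the new
version pointing at the same checkpointLocation.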

On Tue, Dec 18, 2018 at 9:45 AM vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Checkpointing is only used for failure recovery, not for app upgrades. You
> need to manually code the unload/load of any state and save it to a
> persistent store.
>
> On Tue, Dec 18, 2018 at 5:29 PM, Priya Matpadi  wrote:
>
>> Using checkpointing for graceful updates is my understanding as well,
>> based on the writeup in
>> https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovering-from-failures-with-checkpointing,
>> and some prototyping. Have you faced any missed events?
>>
>> On Mon, Dec 17, 2018 at 6:56 PM Yuta Morisawa <
>> yu-moris...@kddi-research.jp> wrote:
>>
>>> Hi
>>>
>>> I'm trying to update my structured streaming application, but I am not
>>> sure how to update it gracefully.
>>>
>>> Should I stop it, replace the jar file, and then restart it?
>>> My understanding is that, in that case, all the state will be recovered
>>> if I use checkpoints.
>>> Is this correct?
>>>
>>> Thank you,


Re: How to track batch jobs in spark ?

2018-12-05 Thread Priya Matpadi
If you are deploying your spark application on a YARN cluster:
1. ssh into the master node.
2. List the currently running applications and retrieve the application_id:
   yarn application -list
3. Kill the application, using the application_id of the form
   application_<cluster-timestamp>_<sequence-number> from the output of the
   list command:
   yarn application -kill <application_id>
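If you want the same control from inside the application rather than through
the YARN CLI, Spark's job-group API is one option. A minimal sketch, with a
made-up group name and throwaway job; note that setJobGroup is per-thread, so
it must be called on the thread that submits the work:

    import org.apache.spark.sql.SparkSession
    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    val spark = SparkSession.builder().appName("tracked-batch").getOrCreate()
    val sc = spark.sparkContext

    // Submit the batch work asynchronously, tagged with a group id we chose.
    val work = Future {
      sc.setJobGroup("nightly-report", "nightly aggregation run", interruptOnCancel = true)
      sc.range(0, 1000000000L).count()
    }

    // From another thread, track which Spark job ids belong to the group...
    Thread.sleep(2000) // crude, for illustration: give the job time to start
    val jobIds = sc.statusTracker.getJobIdsForGroup("nightly-report")
    println(s"jobs in group: ${jobIds.mkString(", ")}")

    // ...and cancel every job in the group if needed.
    sc.cancelJobGroup("nightly-report")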

On Wed, Dec 5, 2018 at 1:42 PM kant kodali  wrote:

> Hi All,
>
> How do I track batch jobs in Spark? For example, is there some id or token I
> can get after I spawn a batch job and use it to track the progress or to
> kill the batch job itself?
>
> For Streaming, we have StreamingQuery.id()
>
> Thanks!
>
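On the StreamingQuery.id() point, note that a query actually exposes two
identifiers. A small sketch; the rate source and checkpoint path here are
illustrative:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("query-ids").getOrCreate()
    val stream = spark.readStream.format("rate").load() // built-in test source

    val query = stream.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/ckpt-rate") // illustrative
      .start()

    // id is stable across restarts from the same checkpoint;
    // runId is unique to each start of the query.
    println(s"id=${query.id} runId=${query.runId}")

    println(query.lastProgress) // latest batch metrics; null before the first batch
    query.stop()                // stops this run; offsets/state remain in the checkpoint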