Re: Unsubscribe

2023-02-18 Thread winnie hw
Please send an email to user-unsubscr...@spark.apache.org rather than this
one.


On Sun, Feb 19, 2023 at 12:06 PM Sendil Chidambaram 
wrote:

> Unsubscribe
>


Unsubscribe

2023-02-18 Thread Sendil Chidambaram
Unsubscribe


Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Holden Karau
Is there someone focused on streaming work these days who would want to
shepherd this?

On Sat, Feb 18, 2023 at 5:02 PM Dongjoon Hyun 
wrote:

> Thank you for considering me, but may I ask what makes you think to put me
> there, Mich? I'm curious about your reason.
>
> > I have put dongjoon.hyun as a shepherd.
>
> BTW, unfortunately, I cannot help you with that due to my on-going
> personal stuff. I'll adjust the JIRA first.
>
> Thanks,
> Dongjoon.
>
>
> On Sat, Feb 18, 2023 at 10:51 AM Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> https://issues.apache.org/jira/browse/SPARK-42485
>>
>>
> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Dongjoon Hyun
Thank you for considering me, but may I ask what makes you think to put me
there, Mich? I'm curious about your reason.

> I have put dongjoon.hyun as a shepherd.

BTW, unfortunately, I cannot help you with that due to my on-going personal
stuff. I'll adjust the JIRA first.

Thanks,
Dongjoon.


On Sat, Feb 18, 2023 at 10:51 AM Mich Talebzadeh 
wrote:

> https://issues.apache.org/jira/browse/SPARK-42485


SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Mich Talebzadeh
https://issues.apache.org/jira/browse/SPARK-42485


Spark Structured Streaming is a very useful tool for event-driven
architectures. In an event-driven architecture, there is generally a
main loop that listens for events and triggers a callback function when
one of those events is detected. In a streaming application, the
application waits for source messages, either at a set interval or
whenever they arrive, and reacts accordingly.

There are occasions when you may want to stop the Spark program
gracefully; gracefully meaning that the Spark application handles the
last streaming message completely and then terminates the application.
This is different from invoking an interrupt such as CTRL-C.

Of course, one can terminate the process with one of the following:

   1. query.awaitTermination() # Waits for the termination of this query,
   either by stop() or by an error

   2. query.awaitTermination(timeoutMs) # Returns true if this query
   terminates within the timeout in milliseconds.

So the first call blocks until the query terminates, whether through
stop() or an error. The second waits at most the given timeout in
milliseconds and returns whether the query terminated within it.
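The difference between the two calls can be sketched with a toy model in
plain Python (this is not the real PySpark API; ToyQuery and its method
names are illustrative stand-ins for the semantics described above):

```python
import threading

# Toy model of the two termination calls: await_termination() with no
# timeout blocks until stop() is called; with a timeout it returns True
# only if the query terminated within that window.

class ToyQuery:
    def __init__(self):
        self._done = threading.Event()

    def stop(self):
        self._done.set()  # marks the query as terminated

    def await_termination(self, timeout_ms=None):
        if timeout_ms is None:
            self._done.wait()      # blocks indefinitely, like option 1
            return None
        # returns True iff the query terminated in time, like option 2
        return self._done.wait(timeout_ms / 1000.0)

q = ToyQuery()
# Nothing has stopped the query yet, so a short timeout elapses:
print(q.await_termination(timeout_ms=100))   # False: still running
threading.Timer(0.05, q.stop).start()        # stop() arrives shortly
print(q.await_termination(timeout_ms=1000))  # True: terminated in time
```

The point of the sketch is that neither call lets the application decide,
from inside the stream, that the last message has been handled; that is
the gap the proposal below addresses.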

The issue is that one needs to predict how long the streaming job needs
to run. Any interrupt at the terminal or OS level (killing the process)
may cut the processing short without a proper completion of the
streaming work.

I have devised a method that allows one to terminate the Spark
application internally after processing the last received message.
Within, say, 2 seconds of the confirmation of shutdown, the process
invokes a graceful shutdown.

This new feature proposes a solution that handles the message currently
being processed gracefully, waits for it to complete, and shuts down the
streaming process for a given topic without loss of data or orphaned
transactions.
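Reduced to its essentials, that pattern can be sketched in plain Python
(FakeQuery, run_until_marker, and the marker-file name are hypothetical
stand-ins, not the actual implementation): between micro-batches the
driver checks an external shutdown marker, lets the in-flight batch
complete, and only then stops the query.

```python
import os
import tempfile

class FakeQuery:
    """Stand-in for a StreamingQuery so the loop runs without Spark."""
    def __init__(self):
        self.active = True
        self.processed = []

    def process_next_batch(self, batch):
        self.processed.append(batch)  # the batch completes fully

    def stop(self):
        self.active = False  # graceful stop at a batch boundary

def run_until_marker(query, batches, marker_path):
    for batch in batches:
        if not query.active:
            break
        query.process_next_batch(batch)  # finish the current batch first
        if os.path.exists(marker_path):  # ...then honour the marker
            query.stop()
    return query.processed

marker = os.path.join(tempfile.gettempdir(), "spark_shutdown_marker")
open(marker, "w").close()            # shutdown requested up front
q = FakeQuery()
done = run_until_marker(q, ["m1", "m2", "m3"], marker)
os.remove(marker)
print(done)        # ['m1']: the in-flight batch completed before the stop
print(q.active)    # False
```

Because the stop is only ever taken at a batch boundary, the message
being processed when shutdown is requested is never abandoned mid-flight.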


I have put dongjoon.hyun as a shepherd. Kindly advise me if that is the
correct approach.

JIRA ticket https://issues.apache.org/jira/browse/SPARK-42485

SPIP doc: TBC

Discussion thread:

https://lists.apache.org/list.html?d...@spark.apache.org


Thanks.


   view my LinkedIn profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.


Vote SPIP

2023-02-18 Thread Faisal Waris
I need to vote for an SPIP but don't know how to get an ASF account for
Spark. The Apache Software Foundation instructions are not very clear on
this. Please advise.
Thanks,
Faisal