Re: Design review of SPARK-28594

2019-09-01 Thread Jungtaek Lim
Great, thanks for reviewing, Felix!


-- 
Name : Jungtaek Lim
Blog : http://medium.com/@heartsavior
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior


Re: Design review of SPARK-28594

2019-09-01 Thread Felix Cheung
I did review it and solving this problem makes sense. I will comment in the 
JIRA.





Design review of SPARK-28594

2019-08-25 Thread Jungtaek Lim
Hi devs,

I have been working on the design of SPARK-28594 [1] (though I originally
came to it through different requests), and the design doc is now
available [2].

Let me describe SPARK-28594 briefly. A single, ever-growing event log file
per application has been a major issue for streaming applications: the
event log keeps growing for as long as the application is running, and many
issues follow from that. The only viable workaround has been disabling the
event log, which is not easily acceptable. Stopping the application and
rerunning it would be another approach, but it sounds really odd to stop
the application because of the event log. SPARK-28594 makes it possible to
roll the event log files and compact old event log files without losing
the ability to replay the whole log.

While I'll break the issue down into subtasks and start with the easier
ones, in parallel I'd like to ask for reviews of the design, to get better
ideas and find possible defects in it.

Please note that the doc is intended to describe the detailed changes
(closer to the implementation details) and is not a kind of SPIP, because I
don't feel this improvement needs to go through the SPIP process - the
change would not be huge and the proposal is orthogonal to the current
feature. Please let me know if that's not the case and the SPIP process is
necessary.

Thanks,
Jungtaek Lim (HeartSaVioR)

1. https://issues.apache.org/jira/browse/SPARK-28594
2. https://docs.google.com/document/d/12bdCC4nA58uveRxpeo8k7kGOI2NRTXmXyBOweSi4YcY/edit?usp=sharing