I think this JIRA issue could be of some help to you:

https://issues.apache.org/jira/browse/SPARK-3660
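
For the pre-population idea in your message, here is a rough sketch in plain Python, not the Spark API. Every name in it (SlidingWindow, restore, snapshot) is made up for illustration. It assumes each record is dumped together with its original timestamp. Under that assumption the restored records expire on their original schedule, so the window should never hold double the data and needs no warm-up.

```python
from collections import deque


class SlidingWindow:
    """Hypothetical time-based window keyed on each record's own timestamp."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.records = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        # Assumes records arrive in non-decreasing timestamp order.
        self.records.append((timestamp, value))

    def evict(self, now):
        # Drop everything at or before the cutoff, restored and live alike.
        cutoff = now - self.window_seconds
        while self.records and self.records[0][0] <= cutoff:
            self.records.popleft()

    def snapshot(self):
        # What would be dumped to external storage before an upgrade.
        return list(self.records)

    @classmethod
    def restore(cls, window_seconds, saved, now):
        # Rebuild the window from a dump, keeping the original timestamps.
        window = cls(window_seconds)
        for timestamp, value in sorted(saved, key=lambda r: r[0]):
            window.add(timestamp, value)
        window.evict(now)  # discard whatever expired while the app was down
        return window


# Before the upgrade: fill a 100-second window and dump it.
old = SlidingWindow(window_seconds=100)
for t in (10, 50, 90):
    old.add(t, "event@%d" % t)
saved = old.snapshot()

# After the upgrade: restore at t=120, then continue with live data.
new = SlidingWindow.restore(100, saved, now=120)
new.add(120, "event@120")
```

The record from t=10 is dropped at restore time because it is already outside the window, and the remaining restored records age out exactly as they would have without the restart.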



On Tue, Feb 24, 2015 at 2:18 AM, Matus Faro <matus.f...@kik.com> wrote:

> Hi,
>
> Our application is designed to operate continuously on a large
> sliding window (a day or more) of data. The operations performed on
> the window will change fairly frequently, so I need a way to save and
> restore the sliding window across an app upgrade without having to
> wait the full duration of the window for it to "warm up". Because this
> is an app upgrade, checkpointing unfortunately will not work.
>
> I can potentially dump the window to external storage periodically
> or on app shutdown, but I don't have an ideal way of restoring it.
>
> I thought about two non-ideal solutions:
> 1. Load the previous data all at once into the sliding window on app
> startup. The problem is that at some point I will have double the data
> in the sliding window, until the initial batch of data goes out of
> scope.
> 2. Broadcast the previous state of the window separately from the
> window. Perform the operations on both sets of data until the previous
> state goes out of scope. The problem is that the data will not fit
> into memory.
>
> Features that would solve my problem:
> 1. The ability to pre-populate the sliding window.
> 2. Control over batch slicing. It would be nice for a Receiver to
> dictate the current batch timestamp in order to slow down or
> fast-forward time.
>
> Any feedback would be greatly appreciated!
>
> Thank you,
> Matus
>


-- 

*Arush Kharbanda* || Technical Teamlead

ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
