Quick question then.

Just to clarify my understanding: does bufferserver dump all of the data
when full, or does it start evicting in LRU fashion on demand?
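For concreteness, the behavior being asked about (paging out least-recently-used blocks on demand and faulting them back in when read, rather than dumping everything at once) could be sketched roughly as follows. This is illustrative Python with hypothetical names, not actual bufferserver code:

```python
from collections import OrderedDict

class SpillingBuffer:
    """Toy model of a bounded buffer that evicts least-recently-used
    blocks to a spill store on demand, instead of dumping all pages
    at once when full. Illustrative only, not bufferserver code."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = OrderedDict()   # block_id -> data, in LRU order
        self.spilled = {}             # stand-in for the disk

    def put(self, block_id, data):
        if len(self.memory) >= self.capacity:
            # Evict only the single least-recently-used block.
            victim, victim_data = self.memory.popitem(last=False)
            self.spilled[victim] = victim_data
        self.memory[block_id] = data

    def get(self, block_id):
        if block_id in self.memory:
            self.memory.move_to_end(block_id)  # mark as recently used
            return self.memory[block_id]
        # Fault the block back in from disk: one round-trip per miss.
        data = self.spilled.pop(block_id)
        self.put(block_id, data)
        return data
```

Under this model, a read of a spilled block costs exactly one disk round-trip, which is the latency cost discussed below.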
On 26 Aug 2015 03:53, "Chetan Narsude" <[email protected]> wrote:

> I have a hunch that there may be a problem in terms of added latency.
> But ultimately we will use a benchmark to rule out the hunches if you
> strongly believe in it.
>
> Here is what happens today: bufferserver tries to hold the data in memory
> for as long as possible, but no longer than needed. If you do not persist
> the data to disk, you do not have to load it back, since it is already in
> memory. This greatly reduces the disk-related latency. Even when we have to
> persist the data, we pick the block (correct implementation pending) that
> we will not need back in memory immediately.
>
> The converse is presumably true as well. If you start persisting the
> data in anticipation of the buffer being full, you will also need to load
> this data back when needed. This will result in frequent round-trips to
> disk, adding to the latency.
>
> --
> Chetan
>
>
>
>
> On Tue, Aug 25, 2015 at 12:53 PM, Atri Sharma <[email protected]> wrote:
>
> > What are the problems you see around loading? I think that it might
> > actually help, since we might end up using locality of reference for
> > similar data in a single window.
> > On 25 Aug 2015 22:14, "Chetan Narsude" <[email protected]> wrote:
> >
> > > This looks at the store side of the equation; what's the impact on the
> > > load side when the time comes to use this data?
> > >
> > > --
> > > Chetan
> > >
> > > On Tue, Aug 25, 2015 at 8:41 AM, Atri Sharma <[email protected]>
> > > wrote:
> > >
> > > > On 25 Aug 2015 10:34, "Vlad Rozov" <[email protected]> wrote:
> > > > >
> > > > > I think that the bufferserver should be allowed to use no more
> > > > > than an application-specified amount of memory, and behavior like
> > > > > the Linux file cache will make it difficult to allocate the
> > > > > operator/container cache without reserving too much memory for
> > > > > spikes.
> > > >
> > > > Sure, agreed.
> > > >
> > > > My idea is to use *less* memory than what is allocated by the
> > > > application, since I am suggesting some level of control over group
> > > > commits. So I am thinking of taking the patch you wrote and having
> > > > it trigger each time the bufferserver fills by n units, n being the
> > > > window size.
> > > >
> > > > If n exceeds the allocated memory, we can error out.
> > > >
> > > > Thoughts?
> > > >
> > > > But I may be wrong, and it will be good to have the suggested
> > > > behavior implemented in a prototype and benchmark the prototype's
> > > > performance.
> > > > >
> > > > > Vlad
> > > > >
> > > > >
> > > > > On 8/24/15 18:24, Atri Sharma wrote:
> > > > >>
> > > > >> The idea is that if bufferserver dumps *all* pages once it runs
> > > > >> out of memory, then it's a huge I/O spike. If it starts paging
> > > > >> out once it runs out of memory, then it behaves like a normal
> > > > >> cache, and a further level of paging control can be applied.
> > > > >>
> > > > >> My idea is that there should be functionality to control the
> > > > >> amount of data that is committed together. This also allows me to
> > > > >> 1) define the optimal way writes work on my disk and 2) allow my
> > > > >> application to define locality of data. For example, I might be
> > > > >> performing graph analysis in which a time window's data consists
> > > > >> of a subgraph.
> > > > >> On 25 Aug 2015 02:46, "Chetan Narsude" <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >>> The bufferserver writes pages to disk *only when* it runs out
> > > > >>> of memory to hold them.
> > > > >>>
> > > > >>> Can you elaborate where you see I/O spikes?
> > > > >>>
> > > > >>> --
> > > > >>> Chetan
> > > > >>>
> > > > >>> On Mon, Aug 24, 2015 at 12:39 PM, Atri Sharma <[email protected]>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Folks,
> > > > >>>>
> > > > >>>> I was wondering if it makes sense to have functionality in
> > > > >>>> which bufferserver writes out data pages to disk in batches
> > > > >>>> defined by a timeslice/application window.
> > > > >>>>
> > > > >>>> This will allow flexible workloads and reduce I/O spikes (I
> > > > >>>> understand that we have non-blocking I/O, but it still would
> > > > >>>> incur disk head costs).
> > > > >>>>
> > > > >>>> Thoughts?
> > > > >>>>
> > > > >>>> --
> > > > >>>> Regards,
> > > > >>>>
> > > > >>>> Atri
> > > > >>>> *l'apprenant*
> > > > >>>>
> > > > >
> > > >
> > >
> >
>
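For reference, the window-sized group-commit trigger proposed in this thread (flush a whole window's worth of data in one batch, and error out up front if one window cannot fit in the allotted memory) could look something like the sketch below. The class and its API are hypothetical, not part of bufferserver:

```python
class WindowBatchedWriter:
    """Toy model of the proposal: flush buffered tuples to disk in
    batches of n units, n being the window size, erroring out if a
    single window exceeds the allocated memory. Hypothetical API."""

    def __init__(self, window_size, memory_limit):
        # "If n exceeds the allocated memory, we can error out."
        if window_size > memory_limit:
            raise ValueError("window size exceeds allocated memory")
        self.window_size = window_size
        self.pending = []          # tuples buffered in memory
        self.flushed_batches = []  # stand-in for batches written to disk

    def write(self, item):
        self.pending.append(item)
        if len(self.pending) == self.window_size:
            # Group-commit one whole window's worth of data in one I/O.
            self.flushed_batches.append(self.pending)
            self.pending = []
```

This trades the single large I/O spike of a full dump for one bounded write per window, at the cost of the extra load-side round-trips discussed above.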
