I am not sure I understand why the state is tied to an instance?

cheers
/karthik

On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper <tom.n.coo...@gmail.com>
wrote:

> Yeah, state recovery is a bit more difficult with Heron's architecture. In
> Storm, the task IDs are not just values used for routing they actually
> equate to a task instance within the executor. An executor which currently
> processes the keys 4-8 actually contains 5 task instances of the same
> component. So for each task, they just save its state attached to the
> single task ID and reassemble executors with the new task instances.
>
> We don't want or have to do that with Heron instances but we would need to
> have some way to have a state change tied to the task (or routing key if we
> go to the key range idea). For something like a word count you might save
> counts using a nested map like: { routing key : {word : count }}. The
> routing key could be included in the Tuple instance. However, whether this
> pattern would work for more generic state cases I don't know?
>
> Tom Cooper
> W: www.tomcooper.org.uk  | Twitter: @tomncooper
> <https://twitter.com/tomncooper>
>
>
> On Fri, 4 May 2018 at 15:54, Neng Lu <freen...@gmail.com> wrote:
>
> > +1 for this idea. As long as the predefined key space is large enough, it
> > should work for most of the cases.
> >
> > Based on my experience with topologies, I never saw one component has
> more
> > than 1000 instances in a topology.
> >
> > For recovering states from an update, there will be some problems though.
> > Since the states stored in heron are strongly connected with each
> instance,
> > we either need to have
> > some resolver does the state repartitioning or stores states with the key
> > instead of with each instance.
> >
> >
> >
> > On Fri, May 4, 2018 at 3:01 PM, Karthik Ramasamy <kramas...@gmail.com>
> > wrote:
> >
> > > Thanks for sharing. I like the Storm approach
> > >
> > > - keeps the implementation simpler
> > > - state is deterministic across restarts
> > > - makes it easy to reason and debug
> > >
> > > The hard limit is not a problem at all since most of the topologies
> will
> > > be never that big.
> > > If you can handle Twitter topologies cleanly, it is more that
> sufficient
> > I
> > > believe.
> > >
> > > cheers
> > > /karthik
> > >
> > > > On May 4, 2018, at 2:31 PM, Thomas Cooper <tom.n.coo...@gmail.com>
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > A while ago I emailed about the issue of how fields (key) grouped
> > routing
> > > > in Heron was not consistent across an update and how this makes
> > > preserving
> > > > state across an update very difficult and also makes it
> > > > difficult/impossible to analyse or predict tuple flows through a
> > > > current/proposed topology physical plan.
> > > >
> > > > I suggested adopting Storms approach of pre-defining a routing key
> > > > space for each component (eg 0-999), so that instead of an instance
> > > having
> > > > a single task id that gets reset at every update (eg 10) it has a
> range
> > > of
> > > > id's (eg 10-16) that changes depending on the parallelism of the
> > > component.
> > > > This has the advantage that a key will always hash to the same task
> ID
> > > for
> > > > the lifetime of the topology. Meaning recovering state for an
> instance
> > > > after a crash or update is just a case of pulling the state linked to
> > the
> > > > keys in its task ID range.
> > > >
> > > > I know the above proposal has issues, not least of all placing a hard
> > > upper
> > > > limit on the scale out of a component, and that some alternative
> ideas
> > > are
> > > > being floated for solving the stateful update issue. However, I just
> > > wanted
> > > > to throw some more weight behind the Storm approach. There was a
> recent
> > > > paper about high-performance network load balancing
> > > > <https://blog.acolyer.org/2018/05/03/stateless-
> > > datacenter-load-balancing-with-beamer/>that
> > > > describes an approach using a fixed key space similar to Storm's (see
> > the
> > > > section called Stable Hashing - they assign a range 100x the expected
> > > > connection pool size - which we could do with heron to prevent ever
> > > hitting
> > > > the upper scaling limit). Also, this new load balancer, Beamer,
> claims
> > to
> > > > be twice as fast as Google's Maglev
> > > > <https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-
> > > reliable-software-network-load-balancer/>
> > > > which again uses a pre-defined keyspace and ID ranges to create
> look-up
> > > > tables deterministically.
> > > >
> > > > I know a load balancer is a different beast to a stream grouping but
> > > there
> > > > are some interesting ideas in those papers (The links point to
> summary
> > > blog
> > > > posts so you don't have to read the whole paper).
> > > >
> > > > Anyway, I just thought I would those papers out there and see what
> > people
> > > > think.
> > > >
> > > > Tom Cooper
> > > > W: www.tomcooper.org.uk  | Twitter: @tomncooper
> > > > <https://twitter.com/tomncooper>
> > >
> > >
> >
>

Reply via email to