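My apologies if I wasn't clear. partitionPersist is a Trident stream operation that persists each batch of Trident tuples to a stateful destination, in this case Elasticsearch. updateState is a method of the BaseStateUpdater subclass that is passed to partitionPersist, and Trident should call it once for every batch of tuples that arrives. So they are not two consecutive bolts: updateState is the callback that partitionPersist invokes.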
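To make the relationship concrete, here is a minimal, self-contained sketch of the per-batch call sequence as I understand it. It is only a model: the State and StateUpdater interfaces below are simplified stand-ins, not the actual Storm classes, and InMemoryIndexState, JsonIndexUpdater and persistBatch are hypothetical names made up for illustration. A real state would bulk-index into Elasticsearch instead of appending to a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionPersistSketch {

    // Simplified stand-in for Trident's State (assumption: not the real interface).
    interface State {
        void beginCommit(long txid);
        void commit(long txid);
    }

    // Simplified stand-in for a state updater: called once per batch, not once per tuple.
    interface StateUpdater<S extends State> {
        void updateState(S state, List<String> tuples);
    }

    // Hypothetical state that keeps documents in memory instead of indexing them.
    static class InMemoryIndexState implements State {
        final List<String> indexed = new ArrayList<>();
        final List<Long> committedTxids = new ArrayList<>();

        public void beginCommit(long txid) { /* nothing to prepare in this model */ }

        public void commit(long txid) { committedTxids.add(txid); }

        void bulkIndex(List<String> docs) { indexed.addAll(docs); }
    }

    // Hypothetical updater: turns each tuple into a JSON string and hands the batch to the state.
    static class JsonIndexUpdater implements StateUpdater<InMemoryIndexState> {
        int calls = 0;

        public void updateState(InMemoryIndexState state, List<String> tuples) {
            calls++;
            List<String> docs = new ArrayList<>();
            for (String tuple : tuples) {
                docs.add("{\"value\":\"" + tuple + "\"}");
            }
            state.bulkIndex(docs);
        }
    }

    // Models what a partitionPersist-style operation does with one batch on one partition.
    static <S extends State> void persistBatch(long txid, S state,
                                               StateUpdater<S> updater, List<String> batch) {
        state.beginCommit(txid);
        updater.updateState(state, batch);
        state.commit(txid);
    }

    public static void main(String[] args) {
        InMemoryIndexState state = new InMemoryIndexState();
        JsonIndexUpdater updater = new JsonIndexUpdater();

        persistBatch(1L, state, updater, Arrays.asList("a", "b"));
        persistBatch(2L, state, updater, Arrays.asList("c"));

        System.out.println("updateState calls: " + updater.calls);
        System.out.println("documents indexed: " + state.indexed.size());
    }
}
```

The point of the model is that updateState is driven by batch boundaries, so tuple counters upstream can keep increasing while updateState stays silent if batches are not being completed.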

On Tue, Nov 18, 2014 at 1:26 PM, Itai Frenkel <[email protected]> wrote:

> Could you please elaborate what is the relation between "updateState"
> and "partitionPersist"? Are those two consecutive topology bolts?
>
> ------------------------------
> *From:* Elliott Bradshaw <[email protected]>
> *Sent:* Tuesday, November 18, 2014 5:25 PM
> *To:* [email protected]
> *Subject:* Fwd: Issues with State updates in Kafka-Trident-Elasticsearch
> topology
>
> Hi All,
>
> I'm currently attempting to get a topology running for data into
> Elasticsearch. Tuples go through some minimal marshalling and
> preprocessing before being sent to partitionPersist, where they are
> transformed into JSON and indexed in Elasticsearch.
>
> The cluster appears to work properly in local mode, but when deployed to
> my 4 node cluster, state updates do not seem to fire correctly (sometimes
> they don't fire at all). Tuple counter filters show data flowing through
> the topology at a healthy rate (approx 80,000 rec/second), however, the
> updateState function only rarely appears to be called. After a brief
> period of time, no further calls to updateState are seen.
>
> As a test, I wrote a filter that queues up tuples and batch sends them to
> Elasticsearch once a certain threshold is reached. This works perfectly
> fine and is capable of managing the processing load.
>
> I've seen discussion of this behavior before, but have not managed to find
> an explanation or solution. Has anybody else had similar issues or have a
> solution?
