[
https://issues.apache.org/jira/browse/SAMZA-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200361#comment-14200361
]
Chris Riccomini commented on SAMZA-459:
---------------------------------------
One other note: we should consider what the semantics of flush() should be.
Should flush(systemStream) flush messages from *all* sources, or just the
StreamTask that's calling flush? It seems like there are use cases for both,
but generally, I think people will just want to flush outgoing messages from
the StreamTask that's calling flush(). Perhaps we can have an optional second
argument (flushAllPartitions) or something.
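
To make the two possible semantics concrete, here is a minimal sketch in Java. It is not Samza code: BufferingProducer, pendingCount and the flushAllSources flag are hypothetical names made up for illustration (the flag stands in for the optional second argument suggested above), and streams and sources are plain strings rather than real Samza types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are Samza's real classes.
final class BufferingProducer {
    // Pending messages: stream name -> source (task) name -> buffered messages.
    private final Map<String, Map<String, List<String>>> pending = new LinkedHashMap<>();
    // Messages that have been "written out", in the order they were flushed.
    final List<String> flushed = new ArrayList<>();

    void send(String source, String stream, String message) {
        pending.computeIfAbsent(stream, s -> new LinkedHashMap<>())
               .computeIfAbsent(source, s -> new ArrayList<>())
               .add(message);
    }

    // Default: flush only the calling source's messages for this stream.
    void flush(String source, String stream) {
        flush(source, stream, false);
    }

    // With flushAllSources = true, flush every source's messages for this stream.
    void flush(String source, String stream, boolean flushAllSources) {
        Map<String, List<String>> bySource = pending.get(stream);
        if (bySource == null) {
            return; // nothing buffered for this stream
        }
        if (flushAllSources) {
            for (List<String> messages : bySource.values()) {
                flushed.addAll(messages);
            }
            bySource.clear();
        } else {
            List<String> messages = bySource.remove(source);
            if (messages != null) {
                flushed.addAll(messages);
            }
        }
    }

    // Number of messages still buffered for a stream, across all sources.
    int pendingCount(String stream) {
        Map<String, List<String>> bySource = pending.get(stream);
        if (bySource == null) {
            return 0;
        }
        int count = 0;
        for (List<String> messages : bySource.values()) {
            count += messages.size();
        }
        return count;
    }
}
```

With the default overload, a task that calls flush leaves messages buffered by other tasks untouched; passing true drains the whole stream regardless of who sent the messages.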
> Explicit flush for individual output streams
> --------------------------------------------
>
> Key: SAMZA-459
> URL: https://issues.apache.org/jira/browse/SAMZA-459
> Project: Samza
> Issue Type: Improvement
> Components: container
> Affects Versions: 0.9.0
> Reporter: Ben Kirwin
> Priority: Minor
>
> From the mailing list:
> http://mail-archives.apache.org/mod_mbox/incubator-samza-dev/201411.mbox/%3CCACuX-D8-CS7867ob47fqytCAdvGURc4owv82Rhg2oEJYmr8hpg%40mail.gmail.com%3E
> At the moment, the only way to trigger a flush of the output streams is to
> call TaskCoordinator.commit, which also flushes the state and saves the
> checkpoints. There are a few cases where more granularity would be useful:
> writing out a single stream can be much faster than doing a full commit, and
> if a user cares about the order in which messages are published, they can
> disable the autocommit and trigger flushes manually.
> I'd anticipate this to look something like
> TaskCoordinator.flush(systemStream). It looks like the TaskCoordinator
> normally only queues up work, instead of doing it synchronously -- if that's
> the case, it should be enough to buffer up all the requested flushes, then
> perform them in order when the moment comes.
> Note: you could get *almost* the same effect by switching to a synchronous
> system and letting the user send a batch of messages all at once, much as the
> underlying Kafka client does. This wouldn't let you flush a changelog stream,
> though.
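
The "buffer up the requested flushes, then perform them in order" idea from the description could look roughly like the following Java sketch. It is only an illustration under stated assumptions: QueueingCoordinator and drain are hypothetical names, not part of Samza's TaskCoordinator, and the stream is identified by a plain string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; this is not Samza's TaskCoordinator.
final class QueueingCoordinator {
    // Streams for which a flush has been requested, in request order.
    private final List<String> requestedFlushes = new ArrayList<>();

    // Called by the task: only records the request, performs no I/O.
    void flush(String systemStream) {
        requestedFlushes.add(systemStream);
    }

    // Called by the container "when the moment comes": runs the buffered
    // flushes in the order they were requested, then forgets them.
    void drain(Consumer<String> flusher) {
        List<String> toRun = new ArrayList<>(requestedFlushes);
        requestedFlushes.clear();
        for (String stream : toRun) {
            flusher.accept(stream);
        }
    }
}
```

Because requests are kept in a list rather than a set, a task that asks for the changelog to be flushed before its output stream gets exactly that order when the container drains the queue.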