Ewen Cheslack-Postava created KAFKA-2481:
--------------------------------------------
Summary: Allow copycat sinks to request periodic invocation of put
even if no new data is available
Key: KAFKA-2481
URL: https://issues.apache.org/jira/browse/KAFKA-2481
Project: Kafka
Issue Type: Sub-task
Components: copycat
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Some connectors will need to perform actions periodically (or more generally,
schedule actions in the future). For example, in an HDFS connector, if you want
to roll files every n minutes, the sink connector needs to make sure it gets
control every n minutes, regardless of available data. However, if data isn't
flowing into the consumer, we might never invoke {{put(records)}}. A variant of
this applies to connectors whose client API, like the new consumer's, requires
a method such as {{poll()}} to be invoked regularly.
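To make the file-rolling case concrete, here is a minimal sketch. The class
{{RollingFileSink}}, its fields, and the explicit {{nowMs}} argument are all
hypothetical and are not part of the copycat API. The point is only that the
roll check lives inside {{put()}}, so it cannot fire unless something calls
{{put()}}, even with an empty batch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sink that closes ("rolls") its output file on a time interval.
// Strings stand in for sink records; lists stand in for files.
class RollingFileSink {
    private final long rollIntervalMs;
    private long lastRollMs;
    private final List<String> openFile = new ArrayList<>();
    private final List<List<String>> closedFiles = new ArrayList<>();

    RollingFileSink(long rollIntervalMs, long startMs) {
        this.rollIntervalMs = rollIntervalMs;
        this.lastRollMs = startMs;
    }

    // The time is passed in explicitly so the behavior is easy to follow.
    // The roll check only runs when the framework invokes put().
    void put(Collection<String> records, long nowMs) {
        openFile.addAll(records);
        boolean intervalElapsed = nowMs - lastRollMs >= rollIntervalMs;
        if (intervalElapsed && !openFile.isEmpty()) {
            closedFiles.add(new ArrayList<>(openFile));
            openFile.clear();
            lastRollMs = nowMs;
        }
    }

    int closedFileCount() {
        return closedFiles.size();
    }
}
```

If the framework never calls {{put()}} after the last record arrives, the open
file in this sketch stays open indefinitely. A later call with an empty
collection is enough to trigger the roll.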
In terms of design, I think there are at least two options:
1. Handle this via the context, so it is purely opt-in: a task asks to be
scheduled for a {{put()}} and specifies exactly the timeout it wants.
2. Alternatively, have {{put()}} return the timeout, since its return type is
currently void. We aren't using a return value right now, but this would force
every connector to return something, and it is unclear that a timeout will
always be the only information a connector wants to return.
I think 1 is cleaner and doesn't require connector developers who don't care
about the feature to even know about it.
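A rough sketch of what option 1 might look like follows. Every name here
({{SinkContextSketch}}, {{requestPutWithin}}, {{SinkTaskSketch}},
{{WorkerSketch}}) is made up for illustration and is not the actual copycat
API. It also assumes the request is one-shot, so a task that wants periodic
invocation asks again from inside {{put()}}.

```java
import java.util.Collection;

// Hypothetical context interface: a task opts in by requesting a timeout.
interface SinkContextSketch {
    void requestPutWithin(long timeoutMs);
}

// Hypothetical task interface; put() keeps its void return type.
interface SinkTaskSketch {
    void put(Collection<String> records);
}

// Minimal stand-in for the framework's delivery loop.
class WorkerSketch implements SinkContextSketch {
    private long requestedTimeoutMs = -1L; // -1 means no pending request
    private long lastPutMs;
    int emptyPuts = 0;

    WorkerSketch(long startMs) {
        this.lastPutMs = startMs;
    }

    @Override
    public void requestPutWithin(long timeoutMs) {
        requestedTimeoutMs = timeoutMs;
    }

    // One loop iteration: deliver if there is data, or if a requested
    // timeout has elapsed since the last put().
    void iterate(SinkTaskSketch task, Collection<String> polled, long nowMs) {
        boolean timeoutElapsed =
                requestedTimeoutMs >= 0 && nowMs - lastPutMs >= requestedTimeoutMs;
        if (!polled.isEmpty() || timeoutElapsed) {
            if (polled.isEmpty()) {
                emptyPuts++;
            }
            // Clear the request before calling put() so that a new request
            // made inside put() is not lost.
            requestedTimeoutMs = -1L;
            lastPutMs = nowMs;
            task.put(polled);
        }
    }
}
```

A task that never calls {{requestPutWithin}} sees no change in behavior in this
sketch, which is the opt-in property that makes option 1 attractive.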
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)