Hans,
Which class's javadocs should i look at? From my initial look at the
javadocs and discussion with Michael, it doesn't seem possible.
On Tue, Mar 21, 2017 at 10:44 PM, Hans Jespersen wrote:
> Yes, and yes!
>
> -hans
>
>
>
> > On Mar 21, 2017, at 7:45 AM, Ali Akhtar wrote:
> >
> > That wou
Yes, and yes!
-hans
> On Mar 21, 2017, at 7:45 AM, Ali Akhtar wrote:
>
> That would require
>
> - Knowing the current window's id (or some other identifier) to
> differentiate it from other windows
>
> - Being able to process individual messages in a window
>
> Are those 2 things possible
Hi Ali,
(My use case is, i receive a stream of messages. Messages need to be stored
> and sorted into 'buckets', to indicate 'sessions'. Each time there's a gap
> of 30 mins or more since the last message (under a key), a new 'session'
> (bucket) should be started, and future messages should belon
That would require
- Knowing the current window's id (or some other identifier) to
differentiate it from other windows
- Being able to process individual messages in a window
Are those 2 things possible w/ kafka streams? (java)
On Tue, Mar 21, 2017 at 7:43 PM, Hans Jespersen wrote:
> While it
While it's not exactly the same as the window start/stop time you can store (in
the state store) the earliest and latest timestamps of any messages in each
window and use that as a good approximation for the window boundary times.
-hans
> On Mar 20, 2017, at 1:00 PM, Ali Akhtar wrote:
>
> Y
Yeah, windowing seems perfect, if only I could find out the current
window's start time (so I can log the current bucket's start & end times)
and process window messages individually rather than as aggregates.
It doesn't seem like i can get this metadata from ProcessorContext though,
from looking
Ali,
what you describe is (roughly!) how Kafka Streams implements the internal
state stores to support windowing.
Some users have been following a similar approach as you outlined, using
the Processor API.
On Mon, Mar 20, 2017 at 7:54 PM, Ali Akhtar wrote:
> It would be helpful to know the '
It would be helpful to know the 'start' and 'end' of the current metadata,
so if an out of order message arrives late, and is being processed in
foreach(), you'd know which window / bucket it belongs to, and can handle
it accordingly.
I'm guessing that's not possible at the moment.
(My use case i
> Can windows only be used for aggregations, or can they also be used for
> foreach(),
and such?
As of today, you can use windows only in aggregations.
> And is it possible to get metadata on the message, such as whether or not its
late, its index/position within the other messages, etc?
If you
Can windows only be used for aggregations, or can they also be used for
foreach(), and such?
And is it possible to get metadata on the message, such as whether or not
its late, its index/position within the other messages, etc?
On Mon, Mar 20, 2017 at 9:44 PM, Michael Noll wrote:
> And since yo
Late-arriving and out-of-order data is only treated specially for windowed
aggregations.
For stateless operations such as `KStream#foreach()` or `KStream#map()`,
records are processed in the order they arrive (per partition).
-Michael
On Sat, Mar 18, 2017 at 10:47 PM, Ali Akhtar wrote:
> >
And since you asked for a pointer, Ali:
http://docs.confluent.io/current/streams/concepts.html#windowing
On Mon, Mar 20, 2017 at 5:43 PM, Michael Noll wrote:
> Late-arriving and out-of-order data is only treated specially for windowed
> aggregations.
>
> For stateless operations such as `KStrea
> later when message A arrives it will put that message back into
> the right temporal context and publish an amended result for the proper
> time/session window as if message B were consumed in the timestamp order
> before message A.
Does this apply to the aggregation Kafka stream methods then, a
Yes stream processing and CEP are subtlety different things.
Kafka Streams helps you write stateful apps and allows that state to be
preserved on disk (a local State store) as well as distributed for HA or for
parallel partitioned processing (via Kafka topic partitions and consumer
groups) as
Hans
What you state would work for aggregations, but not for state machines and
CEP.
Regards
Sab
On 19 Mar 2017 12:01 a.m., "Hans Jespersen" wrote:
> The only way to make sure A is consumed first would be to delay the
> consumption of message B for at least 15 minutes which would fly in the
>
The only way to make sure A is consumed first would be to delay the
consumption of message B for at least 15 minutes which would fly in the
face of the principals of a true streaming platform so the short answer to
your question is "no" because that would be batch processing not stream
processing.
> However, Kafka Streams does handle late arriving data. So if you had some
> analytics that computes results on a time window or a session window then
> Kafka streams will compute on the stream in real time (processing message
> B) and then later when message A arrives it will put that message bac
sorry I mixed up Message A and B wrt the to question but the answer is
still valid.
-hans
/**
* Hans Jespersen, Principal Systems Engineer, Confluent Inc.
* h...@confluent.io (650)924-2670
*/
On Sat, Mar 18, 2017 at 11:07 AM, Hans Jespersen wrote:
> The only way to make sure A is consumed f
Is it possible to have Kafka Streams order messages correctly by their
timestamps, even if they arrived out of order?
E.g, say Message A with a timestamp of 5:00 PM and Message B with a
timestamp of 5:15 PM, are sent.
Message B arrives sooner than Message A, due to network issues.
Is it possible
19 matches
Mail list logo