Re: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

2019-05-08 Thread Jose Lopez
Hi John,

Thank you for your reply. Indeed, your explanation makes sense to me. Thank
you again for taking the time to reply.

Regards,
Jose

On Tue, 30 Apr 2019 at 22:22, John Roesler  wrote:

> Hi Ashok,
>
> I think some people may be able to give you advice, but please start a new
> thread instead of replying to an existing message. This just helps keep all
> the messages organized.
>
> Thanks!
> -John
>
> On Thu, Apr 25, 2019 at 6:12 AM ASHOK MACHERLA 
> wrote:
>
> > Hi,
> >
> > My question is about Kafka partitions.
> >
> > We receive 200 GB+ of data per day from our sources into Kafka.
> >
> > I need to know how many partitions are required to pull this data from
> > the sources without a backlog building up.
> >
> > Please suggest how we can address this.
> >
> > Is there a mathematical rule for choosing the number of partitions for a
> > topic?
> >
> > Please help.
> >
> > Sent from Outlook
> > 
> > From: Jose Lopez 
> > Sent: 25 April 2019 16:34
> > To: users@kafka.apache.org
> > Subject: [Streams] TimeWindows ignores gracePeriodMs in
> > windowsFor(timestamp)
> >
> > Hi all,
> >
> > Given that gracePeriodMs is "the time to admit late-arriving events after
> > the end of the window", I'd expect it to be taken into account in
> > windowsFor(timestamp). E.g.:
> >
> > sizeMs = 5
> > gracePeriodMs = 2
> > advanceMs = 3
> > timestamp = 6
> >
> > | window | windowStart | windowEnd | windowEnd + gracePeriod |
> > | 1      | 0           | 5         | 7                       |
> > | 2      | 5           | 10        | 12                      |
> > ...
> >
> > Current output:
> > windowsFor(timestamp) returns window 2 only.
> >
> > Expected output:
> > windowsFor(timestamp) returns both window 1 and window 2
> >
> > Do you agree with the expected output? Am I missing something?
> >
> > Regards,
> > Jose
> >
>


Re: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

2019-04-30 Thread John Roesler
Hi Ashok,

I think some people may be able to give you advice, but please start a new
thread instead of replying to an existing message. This just helps keep all
the messages organized.

Thanks!
-John

On Thu, Apr 25, 2019 at 6:12 AM ASHOK MACHERLA  wrote:

> Hi,
>
> My question is about Kafka partitions.
>
> We receive 200 GB+ of data per day from our sources into Kafka.
>
> I need to know how many partitions are required to pull this data from
> the sources without a backlog building up.
>
> Please suggest how we can address this.
>
> Is there a mathematical rule for choosing the number of partitions for a
> topic?
>
> Please help.
>
> Sent from Outlook
> 
> From: Jose Lopez 
> Sent: 25 April 2019 16:34
> To: users@kafka.apache.org
> Subject: [Streams] TimeWindows ignores gracePeriodMs in
> windowsFor(timestamp)
>
> Hi all,
>
> Given that gracePeriodMs is "the time to admit late-arriving events after
> the end of the window", I'd expect it to be taken into account in
> windowsFor(timestamp). E.g.:
>
> sizeMs = 5
> gracePeriodMs = 2
> advanceMs = 3
> timestamp = 6
>
> | window | windowStart | windowEnd | windowEnd + gracePeriod |
> | 1      | 0           | 5         | 7                       |
> | 2      | 5           | 10        | 12                      |
> ...
>
> Current output:
> windowsFor(timestamp) returns window 2 only.
>
> Expected output:
> windowsFor(timestamp) returns both window 1 and window 2
>
> Do you agree with the expected output? Am I missing something?
>
> Regards,
> Jose
>


Re: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

2019-04-30 Thread John Roesler
Hey, Jose,

This is an interesting thought that I hadn't considered before. I think
(tentatively) that windowsFor should *not* take the grace period into
account.

What I'm thinking is that the method is supposed to return "all windows
that contain the provided timestamp". When we keep window 1 open until
stream time 7, it's because we're waiting to see if some record with a
timestamp in the range [0,5) arrives before the overall stream time ticks
past 7. But if/when we get that event, its own timestamp is still in the
range [0,5). For example, its timestamp is *not* 6 (because then it would
belong in window 2, not window 1). Thus, window 1 does not "contain" the
timestamp 6, and therefore windowsFor(6) is not required to return window 1.
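
To make the distinction concrete, here is a minimal sketch in plain Java. It
is not the Kafka Streams source: windowStartsFor and stillAcceptsRecords are
hypothetical helpers, and it assumes the advance equals the size (5), so the
windows are [0,5) and [5,10) as in the table. The exact boundary at which a
window closes is also a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch; NOT the Kafka Streams implementation.
class WindowsForSketch {

    // Hypothetical stand-in for windowsFor: returns the start of every window
    // [start, start + sizeMs) that contains the timestamp. Grace is not an input.
    static List<Long> windowStartsFor(long timestamp, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Earliest window start (a multiple of advanceMs) whose window still covers the timestamp.
        long start = (Math.max(0, timestamp - sizeMs + advanceMs) / advanceMs) * advanceMs;
        while (start <= timestamp) {
            starts.add(start);
            start += advanceMs;
        }
        return starts;
    }

    // Hypothetical helper: grace only decides how long a window keeps accepting
    // records, measured against stream time (assumed boundary: end + grace).
    static boolean stillAcceptsRecords(long windowStart, long sizeMs, long graceMs, long streamTime) {
        return streamTime < windowStart + sizeMs + graceMs;
    }

    public static void main(String[] args) {
        long sizeMs = 5, advanceMs = 5, graceMs = 2;

        // A record with timestamp 6 belongs only to the window starting at 5.
        System.out.println(windowStartsFor(6, sizeMs, advanceMs)); // [5]

        // A late record with timestamp 3 still maps to the window starting at 0 ...
        System.out.println(windowStartsFor(3, sizeMs, advanceMs)); // [0]

        // ... and whether it is admitted depends on stream time, not on windowsFor.
        System.out.println(stillAcceptsRecords(0, sizeMs, graceMs, 6)); // true
        System.out.println(stillAcceptsRecords(0, sizeMs, graceMs, 8)); // false
    }
}
```

The window lookup depends only on the record's own timestamp, while the grace
period is compared against stream time in a separate check.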

Does that seem right to you?
-John

On Thu, Apr 25, 2019 at 6:04 AM Jose Lopez  wrote:

> Hi all,
>
> Given that gracePeriodMs is "the time to admit late-arriving events after
> the end of the window", I'd expect it to be taken into account in
> windowsFor(timestamp). E.g.:
>
> sizeMs = 5
> gracePeriodMs = 2
> advanceMs = 3
> timestamp = 6
>
> | window | windowStart | windowEnd | windowEnd + gracePeriod |
> | 1      | 0           | 5         | 7                       |
> | 2      | 5           | 10        | 12                      |
> ...
>
> Current output:
> windowsFor(timestamp) returns window 2 only.
>
> Expected output:
> windowsFor(timestamp) returns both window 1 and window 2
>
> Do you agree with the expected output? Am I missing something?
>
> Regards,
> Jose
>


Re: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

2019-04-25 Thread ASHOK MACHERLA
Hi,

My question is about Kafka partitions.

We receive 200 GB+ of data per day from our sources into Kafka.

I need to know how many partitions are required to pull this data from
the sources without a backlog building up.

Please suggest how we can address this.

Is there a mathematical rule for choosing the number of partitions for a
topic?

Please help.

Sent from Outlook

From: Jose Lopez 
Sent: 25 April 2019 16:34
To: users@kafka.apache.org
Subject: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

Hi all,

Given that gracePeriodMs is "the time to admit late-arriving events after
the end of the window", I'd expect it to be taken into account in
windowsFor(timestamp). E.g.:

sizeMs = 5
gracePeriodMs = 2
advanceMs = 3
timestamp = 6

| window | windowStart | windowEnd | windowEnd + gracePeriod |
| 1      | 0           | 5         | 7                       |
| 2      | 5           | 10        | 12                      |
...

Current output:
windowsFor(timestamp) returns window 2 only.

Expected output:
windowsFor(timestamp) returns both window 1 and window 2

Do you agree with the expected output? Am I missing something?

Regards,
Jose