[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-18 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365846#comment-17365846
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

For flaky tests that fail more recently, we create Jira tickets. But in 
general, you don't need to worry about it and the reviewer will guide you. – As 
long as it's not a compile or checkstyle error, or a test that fails 
reproducible in your local env, it's most likely a flaky test. In the end, the 
committer who merges the PR is responsible to ensure we don't merge bad PRs.

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-18 Thread Juan C. Gonzalez-Zurita (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365836#comment-17365836
 ] 

Juan C. Gonzalez-Zurita commented on KAFKA-12718:
-

I see. But yes that was what I meant. Thanks for looking at the issue and 
explaining what you're concerned about as it relates to builds. I'd heard about 
flake tests from the contribution instructions but didn't know which were 
flakes and which weren't. Does the project have a compiled list of them or is 
the stages and test build that division itself? 

I'll fix the nits rn

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-17 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365248#comment-17365248
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

Not sure what you mean. Just looked at your PR (and left a few nit comments), 
and it seems there are no conflict and that Jenkins passed? (One build failed 
with a know flaky test, so we can ignore it) – We are only interested in the 
"test" builds, not the "stags" builds.

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-16 Thread Juan C. Gonzalez-Zurita (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364613#comment-17364613
 ] 

Juan C. Gonzalez-Zurita commented on KAFKA-12718:
-

Attempting to fix the conflict at the moment. I find it odd that the compile 
step fails here as my local branch is having no issues whatsoever. The test 
cases for streams all succeeded. Since I don't know how to access the stack 
trace from the crashes in Jenkins I'm not sure what to do. Any advice? [~mjsax]

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-02 Thread Juan C. Gonzalez-Zurita (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356044#comment-17356044
 ] 

Juan C. Gonzalez-Zurita commented on KAFKA-12718:
-

I will try to get this PR tonight or tomorrow night depending on circumstances. 
This test case is the last I need to update in order to agree with the gap 
changes within the streams test. As to the new ticket itself I would love to 
pick it up but I've not investigated it with enough depth to be certain whether 
or not I could beat the 14th deadline for it. I would be happy to try, though 
:) [~mjsax].

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-02 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356005#comment-17356005
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

Thanks [~guozhang] – given the last message from [~gonzur], I expect that we 
get a PR for it soon.

Btw: I also think that getting 
https://issues.apache.org/jira/browse/KAFKA-12317 into 3.0 might be worth it? 
Thoughts [~guozhang]? [~gonzur] would be be interested to pick this ticket up, 
too?

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-02 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356000#comment-17356000
 ] 

Guozhang Wang commented on KAFKA-12718:
---

Hi folks, I'm bumping up the priority of this ticket for 3.0 for now (note that 
we still have plenty of time towards the code freeze: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=177046466 but 
still would be good to get this easy fix asap).

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Critical
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-06-02 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355521#comment-17355521
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

`key=[k1@0/5]` is the key of a session, with data key `k1` and session start 
time of 0 and session end time of 5. The format is 
`[dataKey@windowStart/windowEnd]`.

Given the input data we observe and expected the following: The first record 
creates a new session `k1@0/0` – the second record extend the existing session 
(gap is set to 5) – for this case, we get a tombstone for the existing sessions 
and a second record for the new sessions. Thus after processing the first two 
input records, we have 3 output records.

Seems the first 6 output records are actually the same as in the expected 
result, but output records 7 and 8 are not expected in the result. Given that 
grace-period is zero, the fourth input record `k2` with ts=6 actually closes 
the session `k1@0/5` and thus the 5th input record should not result in any 
output. Thus, the expected result seems to be correct, while the observed 
output record 7 and 8 are incorrect: seems this is an issue introduced with 
your code change?

Does this help?

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-05-27 Thread Juan C. Gonzalez-Zurita (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352924#comment-17352924
 ] 

Juan C. Gonzalez-Zurita commented on KAFKA-12718:
-

Hello [~mjsax]. Sorry for this late follow up but I've mostly fixed the issue. 
I've corrected the two test cases that were giving issues due to being tuned to 
accept erroneous results wherein the gap time was previously absent. Them being 
located in 
org.apache.kafka.streams.kstream.internals.KStreamSessionWindowAggregateProcessorTest.
 But now I have an issue with shouldSupportFinalResultsForSessionWindows in 
org.apache.kafka.streams.kstream.internals.SuppressScenarioTest that I am 
unsure of how to fix. I do not understand why the test case has been modeled 
that way. The error result given is as follows:


{code:java}
java.lang.AssertionError: [
 TestRecord[key=[k1@0/0], value=1, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00Z]
 TestRecord[key=[k1@0/0], value=null, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00Z]
 TestRecord[key=[k1@0/5], value=2, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.005Z]
 TestRecord[key=[k1@0/5], value=null, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.005Z]
 TestRecord[key=[k1@0/5], value=3, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.005Z]
 TestRecord[key=[k2@6/6], value=1, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.006Z]
 TestRecord[key=[k1@0/5], value=null, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.005Z]
 TestRecord[key=[k1@0/5], value=4, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.005Z]
 TestRecord[key=[k1@30/30], value=1, headers=RecordHeaders(headers = [], 
isReadOnly = false), recordTime=1970-01-01T00:00:00.030Z]
 ] != [KeyValueTimestamp{key=[k1@0/0], value=1, timestamp=0}, 
KeyValueTimestamp{key=[k1@0/0], value=null, timestamp=0}, 
KeyValueTimestamp{key=[k1@0/5], value=2, timestamp=5}, 
KeyValueTimestamp{key=[k1@0/5], value=null, timestamp=5}, 
KeyValueTimestamp{key=[k1@0/5], value=3, timestamp=5}, 
KeyValueTimestamp{key=[k2@6/6], value=1, timestamp=6}, 
KeyValueTimestamp{key=[k1@30/30], value=1, timestamp=30}]
{code}
I can't seem to make sense of what k1@0/5 means or where value is derived from 
as the original only shows "v1" in its value argument to pipeInput or why the 
original has 6 values but output-raw has 7. If you could offer some assistance 
it would be greatly appreciated. I will be back tomorrow sometime in the 
afternoon to check back on this again. In the meantime, if you need me to 
upload my work so far or require anything else to make it easier on you to help 
simply suggest it.

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-05-11 Thread Juan C. Gonzalez-Zurita (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343001#comment-17343001
 ] 

Juan C. Gonzalez-Zurita commented on KAFKA-12718:
-

Thank you for assigning me. I will begin work on it tomorrow morning. :D

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Juan C. Gonzalez-Zurita
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-05-10 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342293#comment-17342293
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

[~gonzur] it seems [~byusti] lost interest to work on this ticket. Feel free to 
pick it up.

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-05-08 Thread Juan C. Gonzalez-Zurita (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341419#comment-17341419
 ] 

Juan C. Gonzalez-Zurita commented on KAFKA-12718:
-

Hello Mr. Sax, I see someone else already attempted this issue but left prior 
to finishing it. Would it be alright if I tried looking over your comments and 
attempted a fix myself?

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-04-30 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337615#comment-17337615
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

I just realized that the issue is actually not within `suppress()` but within 
`KStreamSessionWindowAggregateProcessor` – note that the test is question test 
the processor, not suppress().

It seem we would prefer to not to enforce `grace > gap` and thus, we should fix 
how we compute `closeTime` inside the processor and add the currently missing 
`gap` parameter.

[~byusti] feel free to start working on a PR. Let us know if you have any 
further questions.

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-04-29 Thread John Roesler (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337042#comment-17337042
 ] 

John Roesler commented on KAFKA-12718:
--

Thanks for reporting this, [~mjsax] !

What [~ableegoldman] said is correct. The grace period is defined at the 
windowing operation and inherited downstream by the suppression operator. Which 
means that when you add a suppression to the topology it has to search up the 
graph to learn the grace period it should use. This logic is defined in 
GraphGraceSearchUtil.

In addition to traversing the graph, this logic actually does inspect the type 
of window definition. IIRC, this was required at the time because of the 
awkward definition of Windows, but that has just recently been cleaned up, so 
maybe this logic can now be cleaned up as well.

The relevant snippet is:
{code:java}
private static Long extractGracePeriod(final GraphNode node) {
if (node instanceof StatefulProcessorNode) {
final ProcessorSupplier processorSupplier = ((StatefulProcessorNode) 
node).processorParameters().oldProcessorSupplier();
if (processorSupplier instanceof KStreamWindowAggregate) {
final KStreamWindowAggregate kStreamWindowAggregate = 
(KStreamWindowAggregate) processorSupplier;
final Windows windows = kStreamWindowAggregate.windows();
return windows.gracePeriodMs();
} else if (processorSupplier instanceof KStreamSessionWindowAggregate) {
final KStreamSessionWindowAggregate kStreamSessionWindowAggregate = 
(KStreamSessionWindowAggregate) processorSupplier;
final SessionWindows windows = 
kStreamSessionWindowAggregate.windows();
return windows.gracePeriodMs() + windows.inactivityGap();
} else if (processorSupplier instanceof KStreamSlidingWindowAggregate) {
final KStreamSlidingWindowAggregate kStreamSlidingWindowAggregate = 
(KStreamSlidingWindowAggregate) processorSupplier;
final SlidingWindows windows = 
kStreamSlidingWindowAggregate.windows();
return windows.gracePeriodMs();
} else {
return null;
}
} else {
return null;
}
} {code}
Actually, it seems like your concern is actually resolved by this code block, 
since the effective grace period for sliding windows is actually "gap + 
declared grace".

However, I'm not happy with how difficult it was to track this down. It should 
have been easy for any reader to verify the correctness of the implementation. 
It might be good to consider ways to generalize the grace period logic so that 
it's defined explicitly in the window definition.

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-04-29 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337019#comment-17337019
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

Well, `gracePeriod` is a generic concept that applies to all windows. However, 
windows may have quite different semantics, and thus I don't think we should 
leak actual semantics into `suppress()`? It does not seem desirable to let 
`suppress()` inspect the actual window type, and do different things based on 
it? Especially, because it is possible that users implement custom windows.

Thus, my take it that we should stick with `closeTime = windowEnd + grace`. If 
we would accept to "leak" the semantics into `suppress()` we would need to 
compute `closeTime = windowEnd + gap + grace` (or add `gap` to the user 
specified `grace` to avoid branching base on window type in suppress() – or 
maybe set `gap=0` for non-session windows). I guess I could be convinced _if_ 
there is a good argument that we don't open Pandoras box this way?

If we stick with `closeTime = windowEnd + grace` and keep window semantics out 
of suppress and session windows semantics stay encapsulated with 
`SessionWindows` – and `SessionWindows` understand that it should enforce 
`grace > gap` to work properly.

Thoughts?

[~byusti] Thanks for your interest. Happy to guide you. I guess, we will first 
need to agree how we want to address this issue though :)

Couple of pointer so you can follow the discussion:
 * suppress() implementation: 
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/suppress/KTableSuppressProcessorSupplier.java]
 
 * test that "reveals" the issue: 
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamSessionWindowAggregateProcessorTest.java#L446-L455]
 
 * session-windows: 
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/SessionWindows.java]
 

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-04-29 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336997#comment-17336997
 ] 

A. Sophie Blee-Goldman commented on KAFKA-12718:


{quote}I was just looking into `suppress()` implementation, that obviously does 
not know anything about semantics of upstream window definitions{quote}
[~mjsax] can you elaborate? Suppression does in fact search the processor graph 
to find the grace period of the windowed operator which is upstream of the 
suppression. It's called GraphGraceSearchUtil or something

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-04-28 Thread Bernardo Yusti (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335118#comment-17335118
 ] 

Bernardo Yusti commented on KAFKA-12718:


Hi [~mjsax] ,

I would like to help out with this issue.

This would be my first issue so any pointers on where to start would be greatly 
appreciated!

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12718) SessionWindows are closed too early

2021-04-28 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334964#comment-17334964
 ] 

Matthias J. Sax commented on KAFKA-12718:
-

I was just looking into `suppress()` implementation, that obviously does not 
know anything about semantics of upstream window definitions. It computes the 
"expiry-time" based on window-end time. Thus, I believe that the right fix for 
session windows would be to enforce that gracePeriod > gap. \cc [~ableegoldman] 
as you proposed to address this inside the session window processor (I think 
this approach won't work).

> SessionWindows are closed too early
> ---
>
> Key: KAFKA-12718
> URL: https://issues.apache.org/jira/browse/KAFKA-12718
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: beginner, easy-fix, newbie
> Fix For: 3.0.0
>
>
> SessionWindows are defined based on a {{gap}} parameter, and also support an 
> additional {{grace-period}} configuration to handle out-of-order data.
> To incorporate the session-gap a session window should only be closed at 
> {{window-end + gap}} and to incorporate grace-period, the close time should 
> be pushed out further to {{window-end + gap + grace}}.
> However, atm we compute the window close time as {{window-end + grace}} 
> omitting the {{gap}} parameter.
> Because default grace-period is 24h most users might not notice this issues. 
> Even if they set a grace period explicitly (eg, when using suppress()), they 
> would most likely set a grace-period larger than gap-time not hitting the 
> issue (or maybe only realize it when inspecting the behavior closely).
> However, if a user wants to disable the grace-period and sets it to zero (on 
> any other value smaller than gap-time), sessions might be close too early and 
> user might notice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)