[jira] [Comment Edited] (FLINK-16349) Use LinkedHashSet in TimeWindow.java

2020-03-01 Thread cpugputpu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048563#comment-17048563
 ] 

cpugputpu edited comment on FLINK-16349 at 3/1/20 1:25 PM:
---

Thanks for your comments! My understanding of this failure is that the failure 
itself does not demonstrate an order issue at all. However, the root cause of 
this issue is the iteration order of a HashSet. The stack trace is here for 
your reference:

_java.util.HashSet.iterator(HashSet.java:173)_
 
_org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSet.addWindow(MergingWindowSet.java:192)_
 
_org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSetTest.testMergeLargeWindowCoveringMultipleWindows(MergingWindowSetTest.java:356)_

Also, the initialization of the HashSet is here:

_org.apache.flink.streaming.api.windowing.windows.TimeWindow.mergeWindows(TimeWindow.java:236)_

After I use LinkedHashSet instead of HashSet, this non-deterministic 
behaviour is completely removed and the test no longer fails. Because I am 
not a developer of the project and do not know the algorithm, it is hard for 
me to analyze why the "TimeWindow start is 1 in the actual value". I am 
sorry for this. If you have a better understanding of the test and a better 
idea of how to fix it, please tell me and I will give it a try.

 


was (Author: cpugputpu):
Thanks for your comments! My understanding of this failure is that the failure 
itself does not demonstrate an order issue at all. However, the root cause of 
this issue is the iteration order of a HashSet. The stack trace is here for 
your reference:

_java.util.HashSet.iterator(HashSet.java:173)_
_org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSet.addWindow(MergingWindowSet.java:192)_
_org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSetTest.testMergeLargeWindowCoveringMultipleWindows(MergingWindowSetTest.java:356)_

After I use LinkedHashSet instead of HashSet, this non-deterministic 
behaviour is completely removed and the test no longer fails. Because I am 
not a developer of the project and do not know the algorithm, it is hard for 
me to analyze why the "TimeWindow start is 1 in the actual value". I am 
sorry for this. If you have a better understanding of the test and a better 
idea of how to fix it, please tell me and I will give it a try.

 

> Use LinkedHashSet in TimeWindow.java
> 
>
> Key: FLINK-16349
> URL: https://issues.apache.org/jira/browse/FLINK-16349
> Project: Flink
>  Issue Type: Bug
>Reporter: cpugputpu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The test in 
> _apache.flink,org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSetTest#testMergeLargeWindowCoveringMultipleWindows_
>  can fail due to a different iteration order of HashSet.
> The failure is presented as follows.
> java.lang.AssertionError: java.lang.AssertionError: 
> *Expected*: (iterable over , 
>  in any order or iterable over 
> ,  in any order 
> or iterable over ,  end=13}> in any order)
> *but was*:  at 
> org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSetTest.testMergeLargeWindowCoveringMultipleWindows(MergingWindowSetTest.java:358)
>  
> The root cause lies in TimeWindow.java, where _currentMerge.f1 = new 
> HashSet<>();_ is executed. When calling _W mergedStateWindow = 
> this.mapping.get(mergedWindows.iterator().next());_ 
> (flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/windowing/MergingWindowSet.java
>  , line 192), the _iterator()_ of HashSet makes no guarantee about the 
> order.
> The specification about HashSet says that "it makes no guarantees as to the 
> iteration order of the set; in particular, it does not guarantee that the 
> order will remain constant over time". The documentation is here for your 
> reference: [https://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html]
>  
> The fix is to use LinkedHashSet instead of HashSet so that the 
> non-deterministic behaviour is eliminated. The code will be more stable.
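
The pattern quoted in the description, a map lookup keyed by whatever a set's iterator returns first, can be sketched as follows. This is a minimal illustration, not Flink code: the class name FirstElementSketch, the helper lookupFirst, and the window and state names are all hypothetical. It only shows why the looked-up value is stable with a LinkedHashSet and unspecified with a HashSet.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class FirstElementSketch {

    /** Mirrors the quoted pattern mapping.get(keys.iterator().next()). */
    public static <K, V> V lookupFirst(Map<K, V> mapping, Set<K> keys) {
        return mapping.get(keys.iterator().next());
    }

    public static void main(String[] args) {
        // Hypothetical window names mapped to hypothetical state names.
        Map<String, String> mapping = new HashMap<>();
        mapping.put("windowA", "stateA");
        mapping.put("windowB", "stateB");

        // LinkedHashSet iterates in insertion order, so the first key is
        // always "windowB" and the lookup always yields "stateB".
        Set<String> linked = new LinkedHashSet<>();
        linked.add("windowB");
        linked.add("windowA");
        System.out.println(lookupFirst(mapping, linked));

        // HashSet gives no ordering guarantee, so the lookup may yield
        // either state; the result must not be relied upon.
        Set<String> hashed = new HashSet<>(linked);
        System.out.println(lookupFirst(mapping, hashed));
    }
}
```

Because the chosen key decides which value is returned, a change in iteration order can surface as a different value rather than as a reordered collection.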



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-16349) Use LinkedHashSet in TimeWindow.java

2020-02-29 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048366#comment-17048366
 ] 

Chesnay Schepler edited comment on FLINK-16349 at 2/29/20 6:21 PM:
---

What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order ...
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 
none of the expected windows have the start value.


was (Author: zentol):
What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 
none of the expected windows have the start value.



[jira] [Comment Edited] (FLINK-16349) Use LinkedHashSet in TimeWindow.java

2020-02-29 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048366#comment-17048366
 ] 

Chesnay Schepler edited comment on FLINK-16349 at 2/29/20 6:21 PM:
---

What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order ...
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 
none of the expected windows have that start value.
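
The point above, that an order-insensitive assertion still fails when an element's value differs, can be sketched as follows. This is a minimal illustration under stated assumptions: the Window class is a hypothetical stand-in (not Flink's TimeWindow), the start and end values are made up, and the set-based comparison is a simplification of the matcher used by the test (it ignores duplicates).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

public class AnyOrderSketch {

    /** Hypothetical stand-in for a time window; not Flink's TimeWindow. */
    public static final class Window {
        final long start;
        final long end;

        public Window(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Window)) {
                return false;
            }
            Window w = (Window) o;
            return start == w.start && end == w.end;
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    /** True if both lists hold the same distinct elements, ignoring order. */
    public static boolean sameInAnyOrder(List<Window> expected, List<Window> actual) {
        return new HashSet<>(expected).equals(new HashSet<>(actual));
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        List<Window> expected = Arrays.asList(new Window(0, 3), new Window(0, 13));
        List<Window> reordered = Arrays.asList(new Window(0, 13), new Window(0, 3));
        List<Window> wrongStart = Arrays.asList(new Window(1, 3), new Window(0, 13));

        System.out.println(sameInAnyOrder(expected, reordered));  // prints "true"
        System.out.println(sameInAnyOrder(expected, wrongStart)); // prints "false"
    }
}
```

Reordering alone does not break the comparison; only the changed start value does.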


was (Author: zentol):
What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order ...
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 
none of the expected windows have the start value.



[jira] [Comment Edited] (FLINK-16349) Use LinkedHashSet in TimeWindow.java

2020-02-29 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048366#comment-17048366
 ] 

Chesnay Schepler edited comment on FLINK-16349 at 2/29/20 6:20 PM:
---

What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 
none of the expected windows have the start value.


was (Author: zentol):
What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 0 
in all expected cases.

> Use LinkedHashSet in TimeWindow.java
> 
>
> Key: FLINK-16349
> URL: https://issues.apache.org/jira/browse/FLINK-16349
> Project: Flink
>  Issue Type: Bug
>Reporter: cpugputpu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>


[jira] [Comment Edited] (FLINK-16349) Use LinkedHashSet in TimeWindow.java

2020-02-29 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048366#comment-17048366
 ] 

Chesnay Schepler edited comment on FLINK-16349 at 2/29/20 6:19 PM:
---

What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order
{code}


The actual problem is that the TimeWindow start is 1 in the actual value, but 0 
in all expected cases.


was (Author: zentol):
What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order
{code}


The actual problem is that the TimeWindow start is one in the actual value, but 
0 in all expected cases.



[jira] [Comment Edited] (FLINK-16349) Use LinkedHashSet in TimeWindow.java

2020-02-29 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048366#comment-17048366
 ] 

Chesnay Schepler edited comment on FLINK-16349 at 2/29/20 6:19 PM:
---

What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:

{code:java}
iterable over ,  in any 
order
{code}


The actual problem is that the TimeWindow start is one in the actual value, but 
0 in all expected cases.


was (Author: zentol):
What makes you think that the element is the problem? The test assertions 
clearly don't care about the order:
{{code}
iterable over ,  in any 
order
{{code}}

The actual problem is that the TimeWindow start is one in the actual value, but 
0 in all expected cases.
