[ https://issues.apache.org/jira/browse/FLINK-30069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Juntao Hu updated FLINK-30069: ------------------------------ Description: When a pattern produces several matches with same priority, is it the expected behavior to keep only the first dequeued one? E.G. pattern: A.followedByAny(B).followedBy(C) seq: a1, b1, b2, b3, c1 aftermatch strategy: skip_to_next Potential matches (a1, b1, c1) (a1, b2, c1) (a1, b3, c1) reach final state at the same time and have the same priority, but (a1, b1, c1) is the first one in the queue, thus (a1, b2, c1) and (a1, b3, c1) will be dropped by skip_to_next strategy. If it's by-design, I guess it relies on two things: * take must be the first transition for every state, if exists * priority queue is first-in-first-out for items with same priority *UPDATE 17/Feb/23* Example case is testNotFollowedByWithinAtEndAfterMatch() in [this commit|https://github.com/apache/flink/pull/21920/commits/1b43eab231ea62df1be2a5f3752664eca5b2e15a]. If we change skip strategy to SKIP_TO_NEXT, the expected output is (a1, a2, a3, c1), (a2, a3, c1) and (a3, c1). If we run in JDK11, it works as expected, but after switching to JDK8 this case will fail with (a2, a3, c1) becoming (a2, c1). PriorityQueue.removeAll() is quite suspicous for breaking the order. *UPDATE 24/Feb/23* [JDK11|https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/19fb8f93c59dfd791f62d41f332db9e306bc1422/src/java.base/share/classes/java/util/PriorityQueue.java#L915] PriorityQueue implements bulk remove, while [JDK8|https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/2544d2a351eca1a3d62276f969dd2d95e4a4d2b6/jdk/src/share/classes/java/util/AbstractCollection.java#L370] removes one by one, which result different orders on elements with same priority. was: When a pattern produces several matches with same priority, is it the expected behavior to keep only the first dequeued one? E.G. pattern: A.followedByAny(B).followedBy(C) seq: a1, b1, b2, b3, c1 aftermatch strategy: skip_to_next Potential matches (a1, b1, c1) (a1, b2, c1) (a1, b3, c1) reach final state at the same time and have the same priority, but (a1, b1, c1) is the first one in the queue, thus (a1, b2, c1) and (a1, b3, c1) will be dropped by skip_to_next strategy. If it's by-design, I guess it relies on two things: * take must be the first transition for every state, if exists * priority queue is first-in-first-out for items with same priority *UPDATE 17/Feb/23* Example case is testNotFollowedByWithinAtEndAfterMatch() in [this commit|https://github.com/apache/flink/pull/21920/commits/1b43eab231ea62df1be2a5f3752664eca5b2e15a]. If we change skip strategy to SKIP_TO_NEXT, the expected output is (a1, a2, a3, c1), (a2, a3, c1) and (a3, c1). If we run in JDK11, it works as expected, but after switching to JDK8 this case will fail with (a2, a3, c1) becoming (a2, c1). PriorityQueue.removeAll() is quite suspicous for breaking the order. *UPDATE 24/Feb/23* [JDK11|https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/19fb8f93c59dfd791f62d41f332db9e306bc1422/src/java.base/share/classes/java/util/PriorityQueue.java#L915] PriorityQueue implements bulk remove, when [JDK8|https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/2544d2a351eca1a3d62276f969dd2d95e4a4d2b6/jdk/src/share/classes/java/util/AbstractCollection.java#L370] removes one by one, which result different orders on elements with same priority. > Expected prune behavior for matches with same priority > ------------------------------------------------------ > > Key: FLINK-30069 > URL: https://issues.apache.org/jira/browse/FLINK-30069 > Project: Flink > Issue Type: Bug > Components: Library / CEP > Affects Versions: 1.15.3, 1.16.1 > Reporter: Juntao Hu > Priority: Major > > When a pattern produces several matches with same priority, is it the > expected behavior to keep only the first dequeued one? > E.G. > pattern: A.followedByAny(B).followedBy(C) > seq: a1, b1, b2, b3, c1 > aftermatch strategy: skip_to_next > Potential matches (a1, b1, c1) (a1, b2, c1) (a1, b3, c1) reach final state at > the same time and have the same priority, but (a1, b1, c1) is the first one > in the queue, thus (a1, b2, c1) and (a1, b3, c1) will be dropped by > skip_to_next strategy. > If it's by-design, I guess it relies on two things: > * take must be the first transition for every state, if exists > * priority queue is first-in-first-out for items with same priority > *UPDATE 17/Feb/23* > Example case is testNotFollowedByWithinAtEndAfterMatch() in [this > commit|https://github.com/apache/flink/pull/21920/commits/1b43eab231ea62df1be2a5f3752664eca5b2e15a]. > If we change skip strategy to SKIP_TO_NEXT, the expected output is (a1, a2, > a3, c1), (a2, a3, c1) and (a3, c1). If we run in JDK11, it works as expected, > but after switching to JDK8 this case will fail with (a2, a3, c1) becoming > (a2, c1). PriorityQueue.removeAll() is quite suspicous for breaking the order. > *UPDATE 24/Feb/23* > [JDK11|https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/19fb8f93c59dfd791f62d41f332db9e306bc1422/src/java.base/share/classes/java/util/PriorityQueue.java#L915] > PriorityQueue implements bulk remove, while > [JDK8|https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/2544d2a351eca1a3d62276f969dd2d95e4a4d2b6/jdk/src/share/classes/java/util/AbstractCollection.java#L370] > removes one by one, which result different orders on elements with same > priority. -- This message was sent by Atlassian Jira (v8.20.10#820010)