[ 
https://issues.apache.org/jira/browse/CALCITE-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18091114#comment-18091114
 ] 

Ruiqi Dong commented on CALCITE-7557:
-------------------------------------

Good question. I think the right behavior here is to match the existing 
EnumerableDefaults/LINQ behavior: take(count <= 0) returns an empty sequence, 
and skip(count <= 0) returns the original sequence.

The main reason is consistency with linq4j's model and with Calcite's existing 
generic implementation:

1. linq4j's Enumerable is documented as analogous to LINQ's IEnumerable. LINQ 
defines this behavior explicitly: Take with count <= 0 returns an empty 
IEnumerable without enumerating the source, and Skip with count <= 0 yields all 
source elements.

2. Calcite already has this behavior in the generic EnumerableDefaults path. 
For negative counts, EnumerableDefaults.take returns empty, and 
EnumerableDefaults.skip skips nothing and returns the original sequence. There 
are also existing tests for negative Take in EnumerableDefaults and 
QueryableDefaults.

3. ListEnumerable is selected by Linq4j.asEnumerable as a more efficient 
implementation for List inputs. I read that as an optimization path, so it 
should be semantically equivalent to the generic EnumerableDefaults behavior.

For completeness, Java Stream goes the other way: Stream.limit/skip reject 
negative values with IllegalArgumentException. But linq4j is modeled after 
LINQ, and Calcite's generic Enumerable implementation already follows the 
LINQ-like behavior. So for this ticket, I think the narrowest and most 
consistent fix is to make ListEnumerable match the existing generic Enumerable 
behavior.

If the project prefers a fail-fast contract instead, that should probably be a 
broader behavior change across EnumerableDefaults/QueryableDefaults as well, 
rather than only the ListEnumerable optimized path.

If this direction sounds right to you, I can submit a PR that makes 
ListEnumerable match the existing EnumerableDefaults behavior and adds 
regression tests for negative take/skip counts.

> Linq4j.ListEnumerable.take(int) / skip(int) diverge from EnumerableDefaults 
> on negative counts
> ----------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-7557
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7557
>             Project: Calcite
>          Issue Type: Bug
>          Components: linq4j
>    Affects Versions: 1.41.0
>            Reporter: Ruiqi Dong
>            Priority: Major
>
> *Summary*
> Linq4j.asEnumerable(list) exposes list-specialized take(int) and skip(int) 
> implementations that are not behaviorally equivalent to the generic 
> EnumerableDefaults versions for negative counts. For the same input 
> enumerable and the same negative count:
>  * EnumerableDefaults.take(source, -1) returns an empty enumerable, but 
> ListEnumerable.take(-1) throws
>  * EnumerableDefaults.skip(source, -1) returns the original sequence, but 
> ListEnumerable.skip(-1) throws
> This is a clean semantic divergence between the generic and optimized paths 
> of the same API.
>  
> *Affected code*
> File: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
> {code:java}
> public static <TSource> Enumerable<TSource> skip(Enumerable<TSource> source,
>     final int count) {
>   return skipWhile(source, (v1, v2) -> {
>     return v2 < count;
>   });
> }
> public static <TSource> Enumerable<TSource> take(Enumerable<TSource> source,
>     final int count) {
>   return takeWhile(
>       source, (v1, v2) -> {
>         return v2 < count;
>       });
> } {code}
> File: linq4j/src/main/java/org/apache/calcite/linq4j/Linq4j.java
> {code:java}
> @Override public Enumerable<T> skip(int count) {
>   final List<T> list = toList();
>   if (count >= list.size()) {
>     return Linq4j.emptyEnumerable();
>   }
>   return new ListEnumerable<>(list.subList(count, list.size()));
> }
> @Override public Enumerable<T> take(int count) {
>   final List<T> list = toList();
>   if (count >= list.size()) {
>     return this;
>   }
>   return new ListEnumerable<>(list.subList(0, count));
> }{code}
>  
> *Reproducer*
> Add the following test to 
> linq4j/src/test/java/org/apache/calcite/linq4j/test/Linq4jTest.java
> {code:java}
>  @Test void testTakeListEnumerableNegativeSize() {
>   final List<Integer> values = Arrays.asList(1, 2, 3);
>   assertThat(EnumerableDefaults.take(Linq4j.asEnumerable(values), 
> -1).toList(),
>       is(empty()));
>   assertThat(Linq4j.asEnumerable(values).take(-1).toList(), is(empty()));
> }
> @Test void testSkipListEnumerableNegativeSize() {
>   final List<Integer> values = Arrays.asList(1, 2, 3);
>   assertThat(EnumerableDefaults.skip(Linq4j.asEnumerable(values), 
> -1).toList(),
>       is(values));
>   assertThat(Linq4j.asEnumerable(values).skip(-1).toList(), is(values));
> }{code}
> Run:
> {code:java}
> ./gradlew :linq4j:test \
>   --tests 
> org.apache.calcite.linq4j.test.Linq4jTest.testTakeListEnumerableNegativeSize \
>   --tests 
> org.apache.calcite.linq4j.test.Linq4jTest.testSkipListEnumerableNegativeSize 
> {code}
> Observed behavior:
> The optimized list path throws in both methods instead of matching the 
> generic enumerable semantics
> {code:java}
> java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
>     at 
> java.base/java.util.AbstractList.subListRangeCheck(AbstractList.java:509)
>     at java.base/java.util.AbstractList.subList(AbstractList.java:497)
>     at org.apache.calcite.linq4j.Linq4j$ListEnumerable.take(Linq4j.java:599)
> java.lang.IndexOutOfBoundsException: fromIndex = -1
>     at 
> java.base/java.util.AbstractList.subListRangeCheck(AbstractList.java:505)
>     at java.base/java.util.AbstractList.subList(AbstractList.java:497)
>     at org.apache.calcite.linq4j.Linq4j$ListEnumerable.skip(Linq4j.java:591) 
> {code}
> Expected behavior:
> Negative counts should behave the same regardless of whether the source 
> enumerable happens to be backed by a List.
>  * Negative `take` should return an empty enumerable, matching 
> EnumerableDefaults.take
>  * Negative `skip` should return the original sequence, matching 
> EnumerableDefaults.skip
>  
> This is an optimization-path semantic regression. The same logical operation 
> has two implementations in Calcite, but the list-specialized path is not 
> behaviorally equivalent to the generic enumerable contract for negative 
> counts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to