paulk-asert commented on PR #2434:
URL: https://github.com/apache/groovy/pull/2434#issuecomment-4223545932
I tested using this test:
```
@Grab('com.google.guava:guava:33.5.0-jre')
import com.google.common.collect.Iterables
def one = [null]
50000.times {
one << it as String
}
def two = []
1000.times {
two << (it * 2) as String
}
println "${one.size()} ${two.size()}"
def copy = new ArrayList(one)
def startTime = System.currentTimeMillis()
def result = Iterables.removeAll(copy, two)
println "Guava took ${System.currentTimeMillis() - startTime}"
println "${one.size()} ${two.size()} ${copy.size()}"
copy = new ArrayList(two)
startTime = System.currentTimeMillis()
result = Iterables.removeAll(copy, one)
println "Guava rev took ${System.currentTimeMillis() - startTime}"
println "${one.size()} ${two.size()} ${copy.size()}"
copy = new ArrayList(one)
startTime = System.currentTimeMillis()
result = copy.removeAll(two)
println "JDK took ${System.currentTimeMillis() - startTime}"
println "${one.size()} ${two.size()} ${copy.size()}"
copy = new ArrayList(two)
startTime = System.currentTimeMillis()
result = copy.removeAll(one)
println "JDK rev took ${System.currentTimeMillis() - startTime}"
println "${one.size()} ${two.size()} ${copy.size()}"
copy = new ArrayList(one)
startTime = System.currentTimeMillis()
result = copy - two
println "Groovy took ${System.currentTimeMillis() - startTime}"
println "${one.size()} ${two.size()} ${copy.size()}"
copy = new ArrayList(two)
startTime = System.currentTimeMillis()
result = copy - one
println "Groovy rev took ${System.currentTimeMillis() - startTime}"
println "${one.size()} ${two.size()} ${copy.size()}"
```
Which had this output:
> // BEFORE FIX
> 50001 1000
> Guava took 65
> 50001 1000 49001
> Guava rev took 1
> 50001 1000 0
> JDK took 53
> 50001 1000 49001
> JDK rev took 1
> 50001 1000 0
> Groovy took 242
> 50001 1000 49001
> Groovy rev took 5
> 50001 1000 0
> // AFTER FIX
> 50001 1000
> Guava took 102
> 50001 1000 49001
> Guava rev took 2
> 50001 1000 0
> JDK took 69
> 50001 1000 49001
> JDK rev took 1
> 50001 1000 0
> Groovy took 129
> 50001 1000 50001
> Groovy rev took 3
> 50001 1000 1000
If I removed the `null` to seed the list (it can be in either one), then the
results are:
> // BEFORE FIX
> 50000 1000
> Guava took 63
> 50000 1000 49000
> Guava rev took 1
> 50000 1000 0
> JDK took 59
> 50000 1000 49000
> JDK rev took 1
> 50000 1000 0
> Groovy took 3
> 50000 1000 50000
> Groovy rev took 6
> 50000 1000 1000
> // AFTER FIX
> 50000 1000
> Guava took 63
> 50000 1000 49000
> Guava rev took 1
> 50000 1000 0
> JDK took 60
> 50000 1000 49000
> JDK rev took 1
> 50000 1000 0
> Groovy took 17
> 50000 1000 50000
> Groovy rev took 16
> 50000 1000 1000
These are really crude in terms of a benchmark but it shows:
* the clobbering edge case behavior when the first element in either list is
not comparable for the BEFORE case with a null
* the optimisation is still in place when one of the lists is much bigger
than the other (whereas Guava and the JDK have asymmetry here)
* we did lose some speed (I think - given the crudeness of this measure)
with the FIX vs the previous one but correctness trumps speed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]