RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Hi, I don't understand the whole discussion here, so please compare these two implementations and tell me which one is faster. Please don't hurt me, if you don't want to see src.jar code from OpenJDK Java6 - just delete this mail if you don’t want to (the code here is licensed under GPL): Arrays.java: /** * Copies the specified array, truncating or padding with zeros (if necessary) * so the copy has the specified length. For all indices that are * valid in both the original array and the copy, the two arrays will * contain identical values. For any indices that are valid in the * copy but not the original, the copy will contain tt(byte)0/tt. * Such indices will exist if and only if the specified length * is greater than that of the original array. * * @param original the array to be copied * @param newLength the length of the copy to be returned * @return a copy of the original array, truncated or padded with zeros * to obtain the specified length * @throws NegativeArraySizeException if ttnewLength/tt is negative * @throws NullPointerException if ttoriginal/tt is null * @since 1.6 */ public static byte[] copyOf(byte[] original, int newLength) { byte[] copy = new byte[newLength]; System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength)); return copy; } This is our implementation, simon replaced and Robert reverted (UnsafeByteArrayOutputStream): private void grow(int newLength) { // It actually should be: (Java 1.7, when its intrinsic on all machines) // buffer = Arrays.copyOf(buffer, newLength); byte[] newBuffer = new byte[newLength]; System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); buffer = newBuffer; } So please look at the code, where is a difference that could slow down, except the Math.min() call which is an intrinsic in almost every JDK on earth? The problem we are talking about here is only about the generic Object[] copyOf method and also affects e.g. *all* Collection.toArray() methods - they all use this code, so whenever you use ArrayList.toArray() or similar, the slow code is executed. This is why we replaced Collections.sort() by CollectionUtil.sort, that does no array copy. Simon me were not willing to replace the reallocations in FST code (Mike you remember, we reverted that on your GIT repo when we did perf tests) and other parts in Lucene (there are only few of them). The idea was only to replace primitive type code to make it easier readable. And with later JDK code it could even get faster (not slower), if Oracle starts to add intrinsics for those new methods (and that’s Dawid and mine reason to change to copyTo for primitive types). In general, if you use Java SDK methods, that are as fast as ours, they always have a chance to get faster in later JDKs. So we should always prefer Java SDK methods, unless they are slower because their default impl is too generic or has too much safety checks or uses reflection. To come back to UnsafeByteArrayOutputStream: I would change the whole code, as I don’t like the allocation strategy in it (it's exponential, on every grow it doubles its size). We should change that to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation strategy like in trunk. Then we can discuss about this problem again when Simon me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just joking, I will never ask again, because this discussion here is endless and does not bring us forward]. The other thing I don’t like in the new faceting module is duplication of vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be much more streamlined with all the code we currently have. Then we can encode the payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? Or maybe add a ByteArrayDataOutput class. Uwe (getting crazy) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Simon Willnauer [mailto:simon.willna...@googlemail.com] Sent: Friday, July 01, 2011 7:47 AM To: Michael McCandless Cc: dev@lucene.apache.org; Dawid Weiss Subject: Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java On Fri, Jul 1, 2011 at 12:19 AM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't seen any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Fri, Jul 1, 2011 at 2:33 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, I don't understand the whole discussion here, so please compare these two implementations and tell me which one is faster. Please don't hurt me, if you don't want to see src.jar code from OpenJDK Java6 - just delete this mail if you don’t want to (the code here is licensed under GPL): This is the source code for a specific version of one specific Java impl. If we knew all Java impls simply implemented the primitive case using System.arraycopy (admittedly it's hard to imagine that they wouldn't!) then we are fine. This is our implementation, simon replaced and Robert reverted (UnsafeByteArrayOutputStream): private void grow(int newLength) { // It actually should be: (Java 1.7, when its intrinsic on all machines) // buffer = Arrays.copyOf(buffer, newLength); byte[] newBuffer = new byte[newLength]; System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); buffer = newBuffer; } So please look at the code, where is a difference that could slow down, except the Math.min() call which is an intrinsic in almost every JDK on earth? Right, in this case (if you used OpenJDK 6) we are obviously OK. Not sure about other cases... The problem we are talking about here is only about the generic Object[] copyOf method and also affects e.g. *all* Collection.toArray() methods - they all use this code, so whenever you use ArrayList.toArray() or similar, the slow code is executed. This is why we replaced Collections.sort() by CollectionUtil.sort, that does no array copy. Simon me were not willing to replace the reallocations in FST code (Mike you remember, we reverted that on your GIT repo when we did perf tests) and other parts in Lucene (there are only few of them). The idea was only to replace primitive type code to make it easier readable. And with later JDK code it could even get faster (not slower), if Oracle starts to add intrinsics for those new methods (and that’s Dawid and mine reason to change to copyTo for primitive types). In general, if you use Java SDK methods, that are as fast as ours, they always have a chance to get faster in later JDKs. So we should always prefer Java SDK methods, unless they are slower because their default impl is too generic or has too much safety checks or uses reflection. OK I'm convinced (I think!) that for primitive types only, let's use Arrays.copyOf! To come back to UnsafeByteArrayOutputStream: I would change the whole code, as I don’t like the allocation strategy in it (it's exponential, on every grow it doubles its size). We should change that to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation strategy like in trunk. Then we can discuss about this problem again when Simon me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just joking, I will never ask again, because this discussion here is endless and does not bring us forward]. Well, it sounds like for primitive types, we can cutover ArrayUtils.grow methods. Then we can look @ the nightly bench the next day ;) But I agree we should fix UnsafeByteArrayOutputStream... or, isn't it (almost) a dup of ByteArrayDataOutput? The other thing I don’t like in the new faceting module is duplication of vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be much more streamlined with all the code we currently have. Then we can encode the payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? Or maybe add a ByteArrayDataOutput class. That sounds good! Uwe can you commit TODOs to the code w/ these ideas? Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
About the encoders package - there are several encoders there besides VInt, so I wouldn't dispose of it so quickly. That said, I think we should definitely explore consolidating VInt with the core classes, and maybe write an encoder which delegate to them. Or, come up w/ a different approach for allowing to plug in different Encoders. I don't rule out anything, as long as we preserve functionality and capabilities. Shai On Friday, July 1, 2011, Michael McCandless luc...@mikemccandless.com wrote: On Fri, Jul 1, 2011 at 2:33 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, I don't understand the whole discussion here, so please compare these two implementations and tell me which one is faster. Please don't hurt me, if you don't want to see src.jar code from OpenJDK Java6 - just delete this mail if you don’t want to (the code here is licensed under GPL): This is the source code for a specific version of one specific Java impl. If we knew all Java impls simply implemented the primitive case using System.arraycopy (admittedly it's hard to imagine that they wouldn't!) then we are fine. This is our implementation, simon replaced and Robert reverted (UnsafeByteArrayOutputStream): private void grow(int newLength) { // It actually should be: (Java 1.7, when its intrinsic on all machines) // buffer = Arrays.copyOf(buffer, newLength); byte[] newBuffer = new byte[newLength]; System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); buffer = newBuffer; } So please look at the code, where is a difference that could slow down, except the Math.min() call which is an intrinsic in almost every JDK on earth? Right, in this case (if you used OpenJDK 6) we are obviously OK. Not sure about other cases... The problem we are talking about here is only about the generic Object[] copyOf method and also affects e.g. *all* Collection.toArray() methods - they all use this code, so whenever you use ArrayList.toArray() or similar, the slow code is executed. This is why we replaced Collections.sort() by CollectionUtil.sort, that does no array copy. Simon me were not willing to replace the reallocations in FST code (Mike you remember, we reverted that on your GIT repo when we did perf tests) and other parts in Lucene (there are only few of them). The idea was only to replace primitive type code to make it easier readable. And with later JDK code it could even get faster (not slower), if Oracle starts to add intrinsics for those new methods (and that’s Dawid and mine reason to change to copyTo for primitive types). In general, if you use Java SDK methods, that are as fast as ours, they always have a chance to get faster in later JDKs. So we should always prefer Java SDK methods, unless they are slower because their default impl is too generic or has too much safety checks or uses reflection. OK I'm convinced (I think!) that for primitive types only, let's use Arrays.copyOf! To come back to UnsafeByteArrayOutputStream: I would change the whole code, as I don’t like the allocation strategy in it (it's exponential, on every grow it doubles its size). We should change that to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation strategy like in trunk. Then we can discuss about this problem again when Simon me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just joking, I will never ask again, because this discussion here is endless and does not bring us forward]. Well, it sounds like for primitive types, we can cutover ArrayUtils.grow methods. Then we can look @ the nightly bench the next day ;) But I agree we should fix UnsafeByteArrayOutputStream... or, isn't it (almost) a dup of ByteArrayDataOutput? The other thing I don’t like in the new faceting module is duplication of vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be much more streamlined with all the code we currently have. Then we can encode the payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? Or maybe add a ByteArrayDataOutput class. That sounds good! Uwe can you commit TODOs to the code w/ these ideas? Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Hi Robert, you reverted a use of Arrays.copyOf() on native types which is *exactly* implemented like this in Arrays.java code! The slow ones are T T[] copyOf(T[] array, int newlen) (because they use reflection). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: rm...@apache.org [mailto:rm...@apache.org] Sent: Thursday, June 30, 2011 2:42 PM To: comm...@lucene.apache.org Subject: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB yteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB yteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/or g/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1 =1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB yteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf +++ eByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { -buffer = Arrays.copyOf(buffer, newLength); +// It actually should be: (Java 1.7, when its intrinsic on all machines) +// buffer = Arrays.copyOf(buffer, newLength); +byte[] newBuffer = new byte[newLength]; +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); +buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Robert I agree but doesn't that apply to Arrays.copyOf(Object[],int) only? here we use a specialized primitive version? simon On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Robert, as noted in my other eMail, ist only slow for the generic Object[] method (as it uses j.l.reflect.Array.newInstance(Class componentType)). We are talking here about byte[], and the Arrays method is implemented with the same 3 lines of code, Simon replaced. The only difference is a Math.min() which is intrinsic (it is used, as Arrays.copyOf supports shrinking size, so the System.arrayCopy() needs upper limit to not AIOOBE). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Thursday, June 30, 2011 3:05 PM To: dev@lucene.apache.org; simon.willna...@gmail.com Cc: comm...@lucene.apache.org Subject: Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/ org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r 1=1141509r2=1141510view=diff == === = --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Un +++ safeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { -buffer = Arrays.copyOf(buffer, newLength); +// It actually should be: (Java 1.7, when its intrinsic on all + machines) +// buffer = Arrays.copyOf(buffer, newLength); +byte[] newBuffer = new byte[newLength]; +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); +buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
We had an issue about this with FST's array growing in Mike's code, in facts ist *much* slower for generic Arrays' T[] copyOf(T[]...), with T extends Object (uses slow reflection). For primitives it can only get faster in later JVMs, this is why we want to change all ArrayUtils.grow() to use this (and we don’t have a generic one there for above reason). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss Sent: Thursday, June 30, 2011 3:11 PM To: dev@lucene.apache.org Subject: Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf eByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf eByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java /org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510 r1=1141509r2=1141510view=diff == == == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf eByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/U +++ nsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { -buffer = Arrays.copyOf(buffer, newLength); +// It actually should be: (Java 1.7, when its intrinsic on all + machines) +// buffer = Arrays.copyOf(buffer, newLength); +byte[] newBuffer = new byte[newLength]; +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); +buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 3:26 PM, Uwe Schindler u...@thetaphi.de wrote: We had an issue about this with FST's array growing in Mike's code, in facts ist *much* slower for generic Arrays' T[] copyOf(T[]...), with T extends Object (uses slow reflection). For primitives it can only get faster in later JVMs, this is why we want to change all ArrayUtils.grow() to use this (and we don’t have a generic one there for above reason). +1 - I don't see why this would be any slower... if we can get improvements we should go for it. The issues and bugreports are all for non-primitive copyOf methods so I don't see how this should affect us simon - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss Sent: Thursday, June 30, 2011 3:11 PM To: dev@lucene.apache.org Subject: Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf eByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf eByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java /org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510 r1=1141509r2=1141510view=diff == == == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf eByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/U +++ nsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all + machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I think Dawid is correct here... so we should change this back? still personally when I see array clone() or copyOf() it makes me concerned, I know these are as fast as arraycopy in recent versions, but depending on which variant is used, and whether you use -server, can be slower... in general I just don't want us to have performance regressions on say windows 32bit over this stuff, personally I think arraycopy is a sure fire bet every time, but Ill concede the point that copyOf might not be slower for the primitive versions... I think in jdk7 we will not have this issue as -client sorta goes away in favor of the tiered thing? anyway, I think we should proceed with caution here as far as moving things over to copyOf, I don't see any evidence that its ever faster, but its definitely sometimes slower. On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { -buffer = Arrays.copyOf(buffer, newLength); +// It actually should be: (Java 1.7, when its intrinsic on all machines) +// buffer = Arrays.copyOf(buffer, newLength); +byte[] newBuffer = new byte[newLength]; +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); +buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 4:44 PM, Robert Muir rcm...@gmail.com wrote: I think Dawid is correct here... so we should change this back? still personally when I see array clone() or copyOf() it makes me concerned, I know these are as fast as arraycopy in recent versions, but depending on which variant is used, and whether you use -server, can be slower... in general I just don't want us to have performance regressions on say windows 32bit over this stuff, personally I think arraycopy is a sure fire bet every time, but Ill concede the point that copyOf might not be slower for the primitive versions... I think in jdk7 we will not have this issue as -client sorta goes away in favor of the tiered thing? anyway, I think we should proceed with caution here as far as moving things over to copyOf, I don't see any evidence that its ever faster, but its definitely sometimes slower. I don't seen any evidence that this is any slower though. simon On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't how that relates to intrinsic and java 1.7 I don't have strong feelings here just checking if you mix something up in the comment you put there... I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I don't seen any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I think it's important Lucene keeps good performance on ordinary machines/envs. It's really quite dangerous that the active Lucene devs all use beasts for development/testing. We draw false conclusions. So we really should be testing with -client and if indeed generified Arrays.copyOf (and anything else) is risky in such envs we should not use it when System.arraycopy works more consistently. Mike McCandless http://blog.mikemccandless.com On Thu, Jun 30, 2011 at 2:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't seen any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I think it's important Lucene keeps good performance on ordinary machines/envs. Not that this voice will help in anything, but I think the above is virtually impossible to achieve unless you have a bunch of machines, OSs and VMs to continually test on and a consistent set of benchmarks plotted over time... and of course check every single commit for regression over all these combinations. And even then you'd always find a case of something being faster or slower on some combination of hardware/ software; optimizing for these differences makes little sense to me (people struggling with performance on some weird software/hardware combination can always change the VM vendor or a VM switch). Sorry for being so pessimistically unconstructive... :( Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Fair enough, and I agree. Though the least we could do is rotate in a Windows env, where Java runs with -client, to our Jenkins. But simple-to-follow rules like Don't use Arrays.copyOf; use System.arraycopy instead (if indeed System.arraycopy seems to generally not be slower) seem like a no-brainer. Why risk Arrays.copyOf, anytime? Shouldn't we never use it...? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 30, 2011 at 3:09 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I think it's important Lucene keeps good performance on ordinary machines/envs. Not that this voice will help in anything, but I think the above is virtually impossible to achieve unless you have a bunch of machines, OSs and VMs to continually test on and a consistent set of benchmarks plotted over time... and of course check every single commit for regression over all these combinations. And even then you'd always find a case of something being faster or slower on some combination of hardware/ software; optimizing for these differences makes little sense to me (people struggling with performance on some weird software/hardware combination can always change the VM vendor or a VM switch). Sorry for being so pessimistically unconstructive... :( Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't seen any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. this is what I want to see. I don't want to discuss based on some bug reported for a non-primitive version of copyOf thats all. its pointless to discuss if there is no evidence which I don't see. I am happy with arraycopy I would just have appreciated a discussion before backing the change out. simon Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't seen any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. this is what I want to see. I don't want to discuss based on some bug reported for a non-primitive version of copyOf thats all. its pointless to discuss if there is no evidence which I don't see. I am happy with arraycopy I would just have appreciated a discussion before backing the change out. I think the burden of proof here is on Arrays.copyOf. Ie, until we can prove (through benchmarking in different envs) that it can be trusted, we should just stick with System.arraycopy to reduce the risk. Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Fri, Jul 1, 2011 at 12:19 AM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't seen any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. this is what I want to see. I don't want to discuss based on some bug reported for a non-primitive version of copyOf thats all. its pointless to discuss if there is no evidence which I don't see. I am happy with arraycopy I would just have appreciated a discussion before backing the change out. I think the burden of proof here is on Arrays.copyOf. Ie, until we can prove (through benchmarking in different envs) that it can be trusted, we should just stick with System.arraycopy to reduce the risk. Mike I think we should do that but the real issue here is what if somebody comes up with any arbitrary method in the future claiming its slow we back out and use the we think safe way what if it is actually the other way around and copyOf is optimized by new VMs and the copyarray is slightly slower. If robert would not have reverted this commit we would have not discussed that her at all. I just want to prevent the we should not do this it might be a problem in the big picture while the microbenchmark doesn't show a difference. At some point we have to rely on the JVM. Even if we benchmark on all OS with various JVMs we can't prevent jvm bug based perf hits. While there was no bug reported for primitives here we don't have to be afraid of it either. I don't think its saver to use arraycopy at all its just a level of indirection safer but makes development more of a pain IMO. Just my $0.05 Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org