RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-07-01 Thread Uwe Schindler
Hi,

I don't understand the whole discussion here, so please compare these two 
implementations and tell me which one is faster. Please don't hurt me, if you 
don't want to see src.jar code from OpenJDK Java6 - just delete this mail if 
you don’t want to (the code here is licensed under GPL):

Arrays.java:
/**
 * Copies the specified array, truncating or padding with zeros (if 
necessary)
 * so the copy has the specified length.  For all indices that are
 * valid in both the original array and the copy, the two arrays will
 * contain identical values.  For any indices that are valid in the
 * copy but not the original, the copy will contain tt(byte)0/tt.
 * Such indices will exist if and only if the specified length
 * is greater than that of the original array.
 *
 * @param original the array to be copied
 * @param newLength the length of the copy to be returned
 * @return a copy of the original array, truncated or padded with zeros
 * to obtain the specified length
 * @throws NegativeArraySizeException if ttnewLength/tt is negative
 * @throws NullPointerException if ttoriginal/tt is null
 * @since 1.6
 */
public static byte[] copyOf(byte[] original, int newLength) {
byte[] copy = new byte[newLength];
System.arraycopy(original, 0, copy, 0,
 Math.min(original.length, newLength));
return copy;
}


This is our implementation, simon replaced and Robert reverted 
(UnsafeByteArrayOutputStream):

  private void grow(int newLength) {
// It actually should be: (Java 1.7, when its intrinsic on all machines)
// buffer = Arrays.copyOf(buffer, newLength);
byte[] newBuffer = new byte[newLength];
System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
buffer = newBuffer;
  }

So please look at the code, where is a difference that could slow down, except 
the Math.min() call which is an intrinsic in almost every JDK on earth?

The problem we are talking about here is only about the generic Object[] copyOf 
method and also affects e.g. *all* Collection.toArray() methods - they all use 
this code, so whenever you use ArrayList.toArray() or similar, the slow code is 
executed. This is why we replaced Collections.sort() by CollectionUtil.sort, 
that does no array copy. Simon  me were not willing to replace the 
reallocations in FST code (Mike you remember, we reverted that on your GIT repo 
when we did perf tests) and other parts in Lucene (there are only few of them). 
The idea was only to replace primitive type code to make it easier readable. 
And with later JDK code it could even get faster (not slower), if Oracle starts 
to add intrinsics for those new methods (and that’s Dawid and mine reason to 
change to copyTo for primitive types). In general, if you use Java SDK methods, 
that are as fast as ours, they always have a chance to get faster in later 
JDKs. So we should always prefer Java SDK methods, unless they are slower 
because their default impl is too generic or has too much safety checks or uses 
reflection.


To come back to UnsafeByteArrayOutputStream:

I would change the whole code, as I don’t like the allocation strategy in it 
(it's exponential, on every grow it doubles its size). We should change that to 
use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation 
strategy like in trunk. Then we can discuss about this problem again when Simon 
 me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just 
joking, I will never ask again, because this discussion here is endless and 
does not bring us forward].

The other thing I don’t like in the new faceting module is duplication of vint 
code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new 
In/OutStream wrapper for DataOutput everywhere. This would be much more 
streamlined with all the code we currently have. Then we can encode the 
payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, 
wrapped with a OutputStreamDataOutput wrapper? Or maybe add a 
ByteArrayDataOutput class.

Uwe
(getting crazy)

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Friday, July 01, 2011 7:47 AM
 To: Michael McCandless
 Cc: dev@lucene.apache.org; Dawid Weiss
 Subject: Re: svn commit: r1141510 -
 /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
 ByteArrayOutputStream.java
 
 On Fri, Jul 1, 2011 at 12:19 AM, Michael McCandless
 luc...@mikemccandless.com wrote:
  On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
  simon.willna...@googlemail.com wrote:
  On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
  dawid.we...@cs.put.poznan.pl wrote:
  I don't seen any evidence that this is any slower though.
 
  You need to run with -client (if the machine is a beast this is
  tricky because x64 

Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-07-01 Thread Michael McCandless
On Fri, Jul 1, 2011 at 2:33 AM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 I don't understand the whole discussion here, so please compare these two 
 implementations and tell me which one is faster. Please don't hurt me, if you 
 don't want to see src.jar code from OpenJDK Java6 - just delete this mail if 
 you don’t want to (the code here is licensed under GPL):

This is the source code for a specific version of one specific Java
impl.  If we knew all Java impls simply implemented the primitive case
using System.arraycopy (admittedly it's hard to imagine that they
wouldn't!) then we are fine.

 This is our implementation, simon replaced and Robert reverted 
 (UnsafeByteArrayOutputStream):

  private void grow(int newLength) {
    // It actually should be: (Java 1.7, when its intrinsic on all machines)
    // buffer = Arrays.copyOf(buffer, newLength);
    byte[] newBuffer = new byte[newLength];
    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
    buffer = newBuffer;
  }

 So please look at the code, where is a difference that could slow down, 
 except the Math.min() call which is an intrinsic in almost every JDK on earth?

Right, in this case (if you used OpenJDK 6) we are obviously OK.  Not
sure about other cases...

 The problem we are talking about here is only about the generic Object[] 
 copyOf method and also affects e.g. *all* Collection.toArray() methods - they 
 all use this code, so whenever you use ArrayList.toArray() or similar, the 
 slow code is executed. This is why we replaced Collections.sort() by 
 CollectionUtil.sort, that does no array copy. Simon  me were not willing to 
 replace the reallocations in FST code (Mike you remember, we reverted that on 
 your GIT repo when we did perf tests) and other parts in Lucene (there are 
 only few of them). The idea was only to replace primitive type code to make 
 it easier readable. And with later JDK code it could even get faster (not 
 slower), if Oracle starts to add intrinsics for those new methods (and that’s 
 Dawid and mine reason to change to copyTo for primitive types). In general, 
 if you use Java SDK methods, that are as fast as ours, they always have a 
 chance to get faster in later JDKs. So we should always prefer Java SDK 
 methods, unless they are slower because their default impl is too generic or 
 has too much safety checks or uses reflection.

OK I'm convinced (I think!) that for primitive types only, let's use
Arrays.copyOf!

 To come back to UnsafeByteArrayOutputStream:

 I would change the whole code, as I don’t like the allocation strategy in it 
 (it's exponential, on every grow it doubles its size). We should change that 
 to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar 
 allocation strategy like in trunk. Then we can discuss about this problem 
 again when Simon  me wants to change ArrayUtils.grow methods to use 
 Arrays.copyOf... *g* [just joking, I will never ask again, because this 
 discussion here is endless and does not bring us forward].

Well, it sounds like for primitive types, we can cutover
ArrayUtils.grow methods.  Then we can look @ the nightly bench the
next day ;)

But I agree we should fix UnsafeByteArrayOutputStream... or, isn't it
(almost) a dup of ByteArrayDataOutput?

 The other thing I don’t like in the new faceting module is duplication of 
 vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's 
 new In/OutStream wrapper for DataOutput everywhere. This would be much more 
 streamlined with all the code we currently have. Then we can encode the 
 payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, 
 wrapped with a OutputStreamDataOutput wrapper? Or maybe add a 
 ByteArrayDataOutput class.

That sounds good!

Uwe can you commit TODOs to the code w/ these ideas?

Mike

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-07-01 Thread Shai Erera
About the encoders package - there are several encoders there besides
VInt, so I wouldn't dispose of it so quickly. That said, I think we
should definitely explore consolidating VInt with the core classes,
and maybe write an encoder which delegate to them.

Or, come up w/ a different approach for allowing to plug in different
Encoders. I don't rule out anything, as long as we preserve
functionality and capabilities.

Shai

On Friday, July 1, 2011, Michael McCandless luc...@mikemccandless.com wrote:
 On Fri, Jul 1, 2011 at 2:33 AM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 I don't understand the whole discussion here, so please compare these two 
 implementations and tell me which one is faster. Please don't hurt me, if 
 you don't want to see src.jar code from OpenJDK Java6 - just delete this 
 mail if you don’t want to (the code here is licensed under GPL):

 This is the source code for a specific version of one specific Java
 impl.  If we knew all Java impls simply implemented the primitive case
 using System.arraycopy (admittedly it's hard to imagine that they
 wouldn't!) then we are fine.

 This is our implementation, simon replaced and Robert reverted 
 (UnsafeByteArrayOutputStream):

  private void grow(int newLength) {
    // It actually should be: (Java 1.7, when its intrinsic on all machines)
    // buffer = Arrays.copyOf(buffer, newLength);
    byte[] newBuffer = new byte[newLength];
    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
    buffer = newBuffer;
  }

 So please look at the code, where is a difference that could slow down, 
 except the Math.min() call which is an intrinsic in almost every JDK on 
 earth?

 Right, in this case (if you used OpenJDK 6) we are obviously OK.  Not
 sure about other cases...

 The problem we are talking about here is only about the generic Object[] 
 copyOf method and also affects e.g. *all* Collection.toArray() methods - 
 they all use this code, so whenever you use ArrayList.toArray() or similar, 
 the slow code is executed. This is why we replaced Collections.sort() by 
 CollectionUtil.sort, that does no array copy. Simon  me were not willing to 
 replace the reallocations in FST code (Mike you remember, we reverted that 
 on your GIT repo when we did perf tests) and other parts in Lucene (there 
 are only few of them). The idea was only to replace primitive type code to 
 make it easier readable. And with later JDK code it could even get faster 
 (not slower), if Oracle starts to add intrinsics for those new methods (and 
 that’s Dawid and mine reason to change to copyTo for primitive types). In 
 general, if you use Java SDK methods, that are as fast as ours, they always 
 have a chance to get faster in later JDKs. So we should always prefer Java 
 SDK methods, unless they are slower because their default impl is too 
 generic or has too much safety checks or uses reflection.

 OK I'm convinced (I think!) that for primitive types only, let's use
 Arrays.copyOf!

 To come back to UnsafeByteArrayOutputStream:

 I would change the whole code, as I don’t like the allocation strategy in it 
 (it's exponential, on every grow it doubles its size). We should change that 
 to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar 
 allocation strategy like in trunk. Then we can discuss about this problem 
 again when Simon  me wants to change ArrayUtils.grow methods to use 
 Arrays.copyOf... *g* [just joking, I will never ask again, because this 
 discussion here is endless and does not bring us forward].

 Well, it sounds like for primitive types, we can cutover
 ArrayUtils.grow methods.  Then we can look @ the nightly bench the
 next day ;)

 But I agree we should fix UnsafeByteArrayOutputStream... or, isn't it
 (almost) a dup of ByteArrayDataOutput?

 The other thing I don’t like in the new faceting module is duplication of 
 vint code. Why don’t we change it to use DataInput/DataOutput and use 
 Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be 
 much more streamlined with all the code we currently have. Then we can 
 encode the payloads (or later docvalues) using the new 
 UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? 
 Or maybe add a ByteArrayDataOutput class.

 That sounds good!

 Uwe can you commit TODOs to the code w/ these ideas?

 Mike

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
hmm are you concerned about the extra Math.min that happens in the
copyOf method?
I don't how that relates to intrinsic and java 1.7

I don't have strong feelings here just checking if you mix something
up in the comment you put there... I am happy to keep the old and now
current code

simon

On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:
    
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified: 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff
 ==
 --- 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  (original)
 +++ 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Uwe Schindler
Hi Robert,

you reverted a use of Arrays.copyOf() on native types which is *exactly* 
implemented like this in Arrays.java code!

The slow ones are T T[] copyOf(T[] array, int newlen)

(because they use reflection).

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: rm...@apache.org [mailto:rm...@apache.org]
 Sent: Thursday, June 30, 2011 2:42 PM
 To: comm...@lucene.apache.org
 Subject: svn commit: r1141510 -
 /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
 ByteArrayOutputStream.java
 
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510
 
 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf
 
 Modified:
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB
 yteArrayOutputStream.java
 
 Modified:
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB
 yteArrayOutputStream.java
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/or
 g/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1
 =1141509r2=1141510view=diff
 ==
 
 ---
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB
 yteArrayOutputStream.java (original)
 +++
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
 +++ eByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;
 
  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;
 
  /**
   * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
}
 
private void grow(int newLength) {
 -buffer = Arrays.copyOf(buffer, newLength);
 +// It actually should be: (Java 1.7, when its intrinsic on all machines)
 +// buffer = Arrays.copyOf(buffer, newLength);
 +byte[] newBuffer = new byte[newLength];
 +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +buffer = newBuffer;
}
 
/**
 



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Robert Muir
because on windows 32bit at least, -client is still the default on
most jres out there.

i realize people don't care about -client, but i will police
foo[].clone() / arrays.copyOf etc to prevent problems.

There are comments about this stuff on the relevant bug reports
(oracle's site is down, sorry) linked to this issue.
https://issues.apache.org/jira/browse/LUCENE-2674

Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
think we should always use arraycopy.

On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
simon.willna...@googlemail.com wrote:
 hmm are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't how that relates to intrinsic and java 1.7

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:
    
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified: 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff
 ==
 --- 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  (original)
 +++ 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
Robert I agree but doesn't that apply to Arrays.copyOf(Object[],int)
only? here we use a specialized primitive version?

simon

On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
 because on windows 32bit at least, -client is still the default on
 most jres out there.

 i realize people don't care about -client, but i will police
 foo[].clone() / arrays.copyOf etc to prevent problems.

 There are comments about this stuff on the relevant bug reports
 (oracle's site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674

 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
 think we should always use arraycopy.

 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 hmm are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't how that relates to intrinsic and java 1.7

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:
    
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified: 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff
 ==
 --- 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  (original)
 +++ 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all 
 machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Dawid Weiss
Arrays.copyOf(primitive) is actually System.arraycopy by default. If
intrinsics are used it can only get faster. For object types it will
probably be a bit slower for -client because of a runtime check for
the component type.

Dawid

On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
 because on windows 32bit at least, -client is still the default on
 most jres out there.

 i realize people don't care about -client, but i will police
 foo[].clone() / arrays.copyOf etc to prevent problems.

 There are comments about this stuff on the relevant bug reports
 (oracle's site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674

 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
 think we should always use arraycopy.

 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 hmm are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't how that relates to intrinsic and java 1.7

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:
    
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified: 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff
 ==
 --- 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  (original)
 +++ 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all 
 machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Uwe Schindler
Robert, 

as noted in my other eMail, ist only slow for the generic Object[] method (as 
it uses j.l.reflect.Array.newInstance(Class componentType)). We are talking 
here about byte[], and the Arrays method is implemented with the same 3 lines 
of code, Simon replaced. The only difference is a Math.min() which is intrinsic 
(it is used, as Arrays.copyOf supports shrinking size, so the 
System.arrayCopy() needs upper limit to not AIOOBE).

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Robert Muir [mailto:rcm...@gmail.com]
 Sent: Thursday, June 30, 2011 3:05 PM
 To: dev@lucene.apache.org; simon.willna...@gmail.com
 Cc: comm...@lucene.apache.org
 Subject: Re: svn commit: r1141510 -
 /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
 ByteArrayOutputStream.java
 
 because on windows 32bit at least, -client is still the default on most jres 
 out
 there.
 
 i realize people don't care about -client, but i will police
 foo[].clone() / arrays.copyOf etc to prevent problems.
 
 There are comments about this stuff on the relevant bug reports (oracle's
 site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674
 
 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we
 should always use arraycopy.
 
 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
  hmm are you concerned about the extra Math.min that happens in the
  copyOf method?
  I don't how that relates to intrinsic and java 1.7
 
  I don't have strong feelings here just checking if you mix something
  up in the comment you put there... I am happy to keep the old and now
  current code
 
  simon
 
  On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
  Author: rmuir
  Date: Thu Jun 30 12:42:17 2011
  New Revision: 1141510
 
  URL: http://svn.apache.org/viewvc?rev=1141510view=rev
  Log:
  LUCENE-3239: remove use of slow Arrays.copyOf
 
  Modified:
 
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
  ByteArrayOutputStream.java
 
  Modified:
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
  ByteArrayOutputStream.java
  URL:
 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/
 
 org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r
  1=1141509r2=1141510view=diff
 
 ==
 ===
  =
  ---
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
  ByteArrayOutputStream.java (original)
  +++
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Un
  +++ safeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
  @@ -2,7 +2,6 @@ package org.apache.lucene.util;
 
   import java.io.IOException;
   import java.io.OutputStream;
  -import java.util.Arrays;
 
   /**
   * Licensed to the Apache Software Foundation (ASF) under one or more
  @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
}
 
private void grow(int newLength) {
  -buffer = Arrays.copyOf(buffer, newLength);
  +// It actually should be: (Java 1.7, when its intrinsic on all
  + machines)
  +// buffer = Arrays.copyOf(buffer, newLength);
  +byte[] newBuffer = new byte[newLength];
  +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
  +buffer = newBuffer;
}
 
/**
 
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Uwe Schindler
We had an issue about this with FST's array growing in Mike's code, in facts 
ist *much* slower for generic Arrays' T[] copyOf(T[]...), with T extends Object 
(uses slow reflection).

For primitives it can only get faster in later JVMs, this is why we want to 
change all ArrayUtils.grow() to use this (and we don’t have a generic one there 
for above reason).

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf
 Of Dawid Weiss
 Sent: Thursday, June 30, 2011 3:11 PM
 To: dev@lucene.apache.org
 Subject: Re: svn commit: r1141510 -
 /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
 ByteArrayOutputStream.java
 
 Arrays.copyOf(primitive) is actually System.arraycopy by default. If 
 intrinsics
 are used it can only get faster. For object types it will probably be a bit 
 slower
 for -client because of a runtime check for the component type.
 
 Dawid
 
 On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
  because on windows 32bit at least, -client is still the default on
  most jres out there.
 
  i realize people don't care about -client, but i will police
  foo[].clone() / arrays.copyOf etc to prevent problems.
 
  There are comments about this stuff on the relevant bug reports
  (oracle's site is down, sorry) linked to this issue.
  https://issues.apache.org/jira/browse/LUCENE-2674
 
  Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
  think we should always use arraycopy.
 
  On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
  simon.willna...@googlemail.com wrote:
  hmm are you concerned about the extra Math.min that happens in the
  copyOf method?
  I don't how that relates to intrinsic and java 1.7
 
  I don't have strong feelings here just checking if you mix something
  up in the comment you put there... I am happy to keep the old and now
  current code
 
  simon
 
  On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
  Author: rmuir
  Date: Thu Jun 30 12:42:17 2011
  New Revision: 1141510
 
  URL: http://svn.apache.org/viewvc?rev=1141510view=rev
  Log:
  LUCENE-3239: remove use of slow Arrays.copyOf
 
  Modified:
 
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
  eByteArrayOutputStream.java
 
  Modified:
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
  eByteArrayOutputStream.java
  URL:
 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java
 
 /org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510
  r1=1141509r2=1141510view=diff
 
 ==
 ==
  ==
  ---
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
  eByteArrayOutputStream.java (original)
  +++
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/U
  +++ nsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
  @@ -2,7 +2,6 @@ package org.apache.lucene.util;
 
   import java.io.IOException;
   import java.io.OutputStream;
  -import java.util.Arrays;
 
   /**
   * Licensed to the Apache Software Foundation (ASF) under one or
  more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
}
 
private void grow(int newLength) {
  -buffer = Arrays.copyOf(buffer, newLength);
  +// It actually should be: (Java 1.7, when its intrinsic on all
  + machines)
  +// buffer = Arrays.copyOf(buffer, newLength);
  +byte[] newBuffer = new byte[newLength];
  +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
  +buffer = newBuffer;
}
 
/**
 
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Thu, Jun 30, 2011 at 3:26 PM, Uwe Schindler u...@thetaphi.de wrote:
 We had an issue about this with FST's array growing in Mike's code, in facts 
 ist *much* slower for generic Arrays' T[] copyOf(T[]...), with T extends 
 Object (uses slow reflection).

 For primitives it can only get faster in later JVMs, this is why we want to 
 change all ArrayUtils.grow() to use this (and we don’t have a generic one 
 there for above reason).

+1 - I don't see why this would be any slower... if we can get
improvements we should go for it. The issues and bugreports are all
for non-primitive copyOf methods so I don't see how this should affect
us

simon

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de


 -Original Message-
 From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf
 Of Dawid Weiss
 Sent: Thursday, June 30, 2011 3:11 PM
 To: dev@lucene.apache.org
 Subject: Re: svn commit: r1141510 -
 /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
 ByteArrayOutputStream.java

 Arrays.copyOf(primitive) is actually System.arraycopy by default. If 
 intrinsics
 are used it can only get faster. For object types it will probably be a bit 
 slower
 for -client because of a runtime check for the component type.

 Dawid

 On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
  because on windows 32bit at least, -client is still the default on
  most jres out there.
 
  i realize people don't care about -client, but i will police
  foo[].clone() / arrays.copyOf etc to prevent problems.
 
  There are comments about this stuff on the relevant bug reports
  (oracle's site is down, sorry) linked to this issue.
  https://issues.apache.org/jira/browse/LUCENE-2674
 
  Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
  think we should always use arraycopy.
 
  On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
  simon.willna...@googlemail.com wrote:
  hmm are you concerned about the extra Math.min that happens in the
  copyOf method?
  I don't how that relates to intrinsic and java 1.7
 
  I don't have strong feelings here just checking if you mix something
  up in the comment you put there... I am happy to keep the old and now
  current code
 
  simon
 
  On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
  Author: rmuir
  Date: Thu Jun 30 12:42:17 2011
  New Revision: 1141510
 
  URL: http://svn.apache.org/viewvc?rev=1141510view=rev
  Log:
  LUCENE-3239: remove use of slow Arrays.copyOf
 
  Modified:
 
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
  eByteArrayOutputStream.java
 
  Modified:
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
  eByteArrayOutputStream.java
  URL:
 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java
 
 /org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510
  r1=1141509r2=1141510view=diff
 
 ==
 ==
  ==
  ---
 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
  eByteArrayOutputStream.java (original)
  +++
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/U
  +++ nsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
  @@ -2,7 +2,6 @@ package org.apache.lucene.util;
 
   import java.io.IOException;
   import java.io.OutputStream;
  -import java.util.Arrays;
 
   /**
   * Licensed to the Apache Software Foundation (ASF) under one or
  more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
    }
 
    private void grow(int newLength) {
  -    buffer = Arrays.copyOf(buffer, newLength);
  +    // It actually should be: (Java 1.7, when its intrinsic on all
  + machines)
  +    // buffer = Arrays.copyOf(buffer, newLength);
  +    byte[] newBuffer = new byte[newLength];
  +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
  +    buffer = newBuffer;
    }
 
    /**
 
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Robert Muir
I think Dawid is correct here... so we should change this back? still
personally when I see array clone() or copyOf() it makes me concerned, I
know these are as fast as arraycopy in recent versions, but depending on
which variant is used, and whether you use -server, can be slower... in
general I just don't want us to have performance regressions on say windows
32bit over this stuff, personally I think arraycopy is a sure fire bet every
time, but Ill concede the point that copyOf might not be slower for the
primitive versions... I think in jdk7 we will not have this issue as -client
sorta goes away in favor of the tiered thing? anyway, I think we should
proceed with caution here as far as moving things over to copyOf, I don't
see any evidence that its ever faster, but its definitely sometimes slower.
On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
 Arrays.copyOf(primitive) is actually System.arraycopy by default. If
 intrinsics are used it can only get faster. For object types it will
 probably be a bit slower for -client because of a runtime check for
 the component type.

 Dawid

 On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
 because on windows 32bit at least, -client is still the default on
 most jres out there.

 i realize people don't care about -client, but i will police
 foo[].clone() / arrays.copyOf etc to prevent problems.

 There are comments about this stuff on the relevant bug reports
 (oracle's site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674

 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
 think we should always use arraycopy.

 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 hmm are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't how that relates to intrinsic and java 1.7

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:

 
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified:
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL:
http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff

==
 ---
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
(original)
 +++
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -buffer = Arrays.copyOf(buffer, newLength);
 +// It actually should be: (Java 1.7, when its intrinsic on all
machines)
 +// buffer = Arrays.copyOf(buffer, newLength);
 +byte[] newBuffer = new byte[newLength];
 +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +buffer = newBuffer;
   }

   /**




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Thu, Jun 30, 2011 at 4:44 PM, Robert Muir rcm...@gmail.com wrote:
 I think Dawid is correct here... so we should change this back? still
 personally when I see array clone() or copyOf() it makes me concerned, I
 know these are as fast as arraycopy in recent versions, but depending on
 which variant is used, and whether you use -server, can be slower... in
 general I just don't want us to have performance regressions on say windows
 32bit over this stuff, personally I think arraycopy is a sure fire bet every
 time, but Ill concede the point that copyOf might not be slower for the
 primitive versions... I think in jdk7 we will not have this issue as -client
 sorta goes away in favor of the tiered thing? anyway, I think we should
 proceed with caution here as far as moving things over to copyOf, I don't
 see any evidence that its ever faster, but its definitely sometimes slower.

I don't seen any evidence that this is any slower though.

simon

 On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
 Arrays.copyOf(primitive) is actually System.arraycopy by default. If
 intrinsics are used it can only get faster. For object types it will
 probably be a bit slower for -client because of a runtime check for
 the component type.

 Dawid

 On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
 because on windows 32bit at least, -client is still the default on
 most jres out there.

 i realize people don't care about -client, but i will police
 foo[].clone() / arrays.copyOf etc to prevent problems.

 There are comments about this stuff on the relevant bug reports
 (oracle's site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674

 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
 think we should always use arraycopy.

 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 hmm are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't how that relates to intrinsic and java 1.7

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:

  lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified:
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1=1141509r2=1141510view=diff

 ==
 ---
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 (original)
 +++
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all
 machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Dawid Weiss
 I don't seen any evidence that this is any slower though.

You need to run with -client (if the machine is a beast this is tricky
because x64 will pick -server regardless of the command-line setting)
and you need to be copying generic arrays. I think this can be shown
-- a caliper benchmark would be perfect to demonstrate this in
isolation; I may write one if I find a spare moment.

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Michael McCandless
I think it's important Lucene keeps good performance on ordinary
machines/envs.

It's really quite dangerous that the active Lucene devs all use beasts
for development/testing.  We draw false conclusions.

So we really should be testing with -client and if indeed generified
Arrays.copyOf (and anything else) is risky in such envs we should not
use it when System.arraycopy works more consistently.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jun 30, 2011 at 2:50 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 I don't seen any evidence that this is any slower though.

 You need to run with -client (if the machine is a beast this is tricky
 because x64 will pick -server regardless of the command-line setting)
 and you need to be copying generic arrays. I think this can be shown
 -- a caliper benchmark would be perfect to demonstrate this in
 isolation; I may write one if I find a spare moment.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Dawid Weiss
 I think it's important Lucene keeps good performance on ordinary
 machines/envs.

Not that this voice will help in anything, but I think the above is
virtually impossible to achieve unless you have a bunch of machines,
OSs and VMs to continually test on and a consistent set of benchmarks
plotted over time... and of course check every single commit for
regression over all these combinations. And even then you'd always
find a case of something being faster or slower on some combination of
hardware/ software; optimizing for these differences makes little
sense to me (people struggling with performance on some weird
software/hardware combination can always change the VM vendor or a VM
switch).

Sorry for being so pessimistically unconstructive... :(

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Michael McCandless
Fair enough, and I agree.

Though the least we could do is rotate in a Windows env, where Java
runs with -client, to our Jenkins.

But simple-to-follow rules like Don't use Arrays.copyOf; use
System.arraycopy instead (if indeed System.arraycopy seems to
generally not be slower) seem like a no-brainer.

Why risk Arrays.copyOf, anytime?  Shouldn't we never use it...?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jun 30, 2011 at 3:09 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 I think it's important Lucene keeps good performance on ordinary
 machines/envs.

 Not that this voice will help in anything, but I think the above is
 virtually impossible to achieve unless you have a bunch of machines,
 OSs and VMs to continually test on and a consistent set of benchmarks
 plotted over time... and of course check every single commit for
 regression over all these combinations. And even then you'd always
 find a case of something being faster or slower on some combination of
 hardware/ software; optimizing for these differences makes little
 sense to me (people struggling with performance on some weird
 software/hardware combination can always change the VM vendor or a VM
 switch).

 Sorry for being so pessimistically unconstructive... :(

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 I don't seen any evidence that this is any slower though.

 You need to run with -client (if the machine is a beast this is tricky
 because x64 will pick -server regardless of the command-line setting)
 and you need to be copying generic arrays. I think this can be shown
 -- a caliper benchmark would be perfect to demonstrate this in
 isolation; I may write one if I find a spare moment.

this is what I want to see. I don't want to discuss based on some bug
reported for a non-primitive version of copyOf thats all.
its pointless to discuss if there is no evidence which I don't see. I
am happy with arraycopy I would just have appreciated a discussion
before backing the change out.

simon

 Dawid


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Michael McCandless
On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
 On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
 dawid.we...@cs.put.poznan.pl wrote:
 I don't seen any evidence that this is any slower though.

 You need to run with -client (if the machine is a beast this is tricky
 because x64 will pick -server regardless of the command-line setting)
 and you need to be copying generic arrays. I think this can be shown
 -- a caliper benchmark would be perfect to demonstrate this in
 isolation; I may write one if I find a spare moment.

 this is what I want to see. I don't want to discuss based on some bug
 reported for a non-primitive version of copyOf thats all.
 its pointless to discuss if there is no evidence which I don't see. I
 am happy with arraycopy I would just have appreciated a discussion
 before backing the change out.

I think the burden of proof here is on Arrays.copyOf.

Ie, until we can prove (through benchmarking in different envs) that
it can be trusted, we should just stick with System.arraycopy to
reduce the risk.

Mike

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Fri, Jul 1, 2011 at 12:19 AM, Michael McCandless
luc...@mikemccandless.com wrote:
 On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
 dawid.we...@cs.put.poznan.pl wrote:
 I don't seen any evidence that this is any slower though.

 You need to run with -client (if the machine is a beast this is tricky
 because x64 will pick -server regardless of the command-line setting)
 and you need to be copying generic arrays. I think this can be shown
 -- a caliper benchmark would be perfect to demonstrate this in
 isolation; I may write one if I find a spare moment.

 this is what I want to see. I don't want to discuss based on some bug
 reported for a non-primitive version of copyOf thats all.
 its pointless to discuss if there is no evidence which I don't see. I
 am happy with arraycopy I would just have appreciated a discussion
 before backing the change out.

 I think the burden of proof here is on Arrays.copyOf.

 Ie, until we can prove (through benchmarking in different envs) that
 it can be trusted, we should just stick with System.arraycopy to
 reduce the risk.

Mike I think we should do that but the real issue here is what if
somebody comes up with any arbitrary method in the future claiming its
slow we back out and use the we think safe way what if it is
actually the other way around and copyOf is optimized by new VMs and
the copyarray is slightly slower. If robert would not have reverted
this commit we would have not discussed that her at all. I just want
to prevent the we should not do this it might be a problem in the
big picture while the microbenchmark doesn't show a difference. At
some point we have to rely on the JVM.
Even if we benchmark on all OS with various JVMs we can't prevent jvm
bug based perf hits. While there was no bug reported for primitives
here we don't have to be afraid of it either. I don't think its saver
to use arraycopy at all its just a level of indirection safer but
makes development more of a pain IMO.

Just my $0.05

 Mike


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org