Re: An interesting change for shared char[] in String/StringBuffer

2004-04-08 Thread Thomas Zander

On Thursday 08 April 2004 15:24, Eric Blake wrote:
> Tom Tromey wrote:

> I also wonder if the following implementation would be more efficient. 
> In the common case, StringBuffer/StringBuilder is used for appends, and
> then converted to a String just before being discarded.  Currently, for
> every append, we adjust the underlying char[] and copy the appended
> String into that array.  Would it be better ...

An alternative approach:
some time ago I noticed that ByteArrayOutputStream copied all the data
already present on each add, which is when we came up with the following
approach, which would be useful here as well:

Use a list of byte[]s (or char[]s) and optimistically allocate space in
the first array.
On each append, copy as many bytes as will fit into the already allocated
byte[], and allocate a second one for the rest.  Use some algorithm that
sizes each new array based on the current size (say, 20% of the current
size), so you get increasingly bigger arrays to copy into.

At the end you will have a list of arrays which you can copy into one big 
array on toString() or similar.

We found this to be the most efficient way of accumulating data in one
class, with minimal garbage collection and allocation.
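
A minimal sketch of the chunked approach described above, written in
present-day Java. The class name ChunkedCharBuffer, the 16-char minimum
chunk size and the exact growth rule (20% of the total length so far) are
assumptions made for illustration; this is not code from Classpath or from
the ByteArrayOutputStream replacement mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a chunked append buffer; not Classpath code. */
final class ChunkedCharBuffer {
    private static final int MIN_CHUNK = 16;              // assumed minimum chunk size
    private final List<char[]> fullChunks = new ArrayList<>();
    private char[] current;                                // chunk currently being filled
    private int used;                                      // chars used in current
    private int length;                                    // total chars stored so far

    ChunkedCharBuffer(int initialCapacity) {
        current = new char[Math.max(MIN_CHUNK, initialCapacity)];
    }

    ChunkedCharBuffer append(String s) {
        String str = String.valueOf(s);                    // null becomes "null"
        int offset = 0;
        int remaining = str.length();
        while (remaining > 0) {
            if (used == current.length) {
                // Current chunk is full: retire it and allocate a new one
                // sized at 20% of everything stored so far. Old data is
                // never copied at this point.
                fullChunks.add(current);
                current = new char[Math.max(MIN_CHUNK, length / 5)];
                used = 0;
            }
            int n = Math.min(remaining, current.length - used);
            str.getChars(offset, offset + n, current, used);
            used += n;
            offset += n;
            remaining -= n;
            length += n;
        }
        return this;
    }

    int length() {
        return length;
    }

    /** The only full copy: join all chunks into one array. */
    @Override
    public String toString() {
        char[] all = new char[length];
        int pos = 0;
        for (char[] chunk : fullChunks) {                  // retired chunks are always full
            System.arraycopy(chunk, 0, all, pos, chunk.length);
            pos += chunk.length;
        }
        System.arraycopy(current, 0, all, pos, used);
        return new String(all);
    }
}
```

Each appended char is copied once into a chunk and once more in
toString(); a single growing array instead re-copies the existing contents
every time it is resized. Whether that wins in practice would still need
benchmarking.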

Hope that's clear.
--
Thomas


___
Classpath mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/classpath


Re: An interesting change for shared char[] in String/StringBuffer

2004-04-08 Thread Eric Blake
Tom Tromey wrote:
> Chr Ullenboom <[EMAIL PROTECTED]> writes:
>
>> From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:
>> "In this release, the sharing between String and StringBuffer has been
>> eliminated."
>> Does this mean we should change the implementation in Classpath too? I
>> like this optimization...
>
> I don't think we necessarily have to change this.  IMO it would depend
> on whether the change is observable by user code.  Our implementation
> doesn't always share, anyway.  It only shares if the buffer is mostly
> in use.

Part of Sun's rationale for their change is that in 1.5, they introduced 
java.lang.StringBuilder, a non-synchronized copy of StringBuffer with 
otherwise identical semantics.  Similar to gnu.java.lang.StringBuffer used in 
gcj, if StringBuilder exists, it allows the compiler to emit more efficient 
string concatenation (and jikes already knows how to use it).  My 
understanding of Sun's implementation of StringBuilder is that it is rather 
simplistic (based on a bug report to Sun's site complaining that the javadoc 
was misleading because of the non-public superclass) - they renamed the old 
StringBuffer into a new package-private class java.lang.AbstractStringBuffer 
(or some such name) for all the implementation, then created StringBuffer and 
StringBuilder as public extension classes with no further implementation 
(other than the fact that all the StringBuffer methods add synchronization). 
So perhaps they made the change on sharing the char[] because their 
class-shuffling broke something.

I agree with Tom that we would have to benchmark it to see if sharing or not 
sharing is more efficient, before blindly choosing one way over the other just 
to match Sun.  If I understand correctly, back in JDK 1.0, Sun did NOT use 
char[] sharing - it was added later as an optimization before JIT compilers 
were as good as they are now (and now Sun claims to be deleting it as an 
optimization).  Also, we will have to be careful that we handle serialization 
of StringBuffer correctly, whichever way we choose.

I also wonder if the following implementation would be more efficient.  In the 
common case, StringBuffer/StringBuilder is used for appends, and then 
converted to a String just before being discarded.  Currently, for every 
append, we adjust the underlying char[] and copy the appended String into that 
array.  Would it be better to just build a String[] that caches all the 
appended Strings, and then create a single char[] at the time toString() is 
called, rather than updating the char[] for every append()?  Of course, we 
would have to create the char[] for any non-append() method.  And one of the 
disadvantages of this method is that we end up creating intermediate Strings 
when we append primitive types, whereas the current implementation can update 
the char[] without creating any intermediate objects.  I haven't coded this up 
to experiment on the difference, but it would be an interesting experiment.
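
A minimal sketch of the deferred-concatenation idea from the paragraph
above, in present-day Java. The class name DeferredStringBuffer is
hypothetical, and a List<String> stands in for the String[] mentioned
above; only append() and toString() are modeled, so the rule that any
non-append() method must build the char[] first is not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of deferred concatenation; not Classpath code. */
final class DeferredStringBuffer {
    private final List<String> parts = new ArrayList<>(); // cached appended Strings
    private int length;                                   // total chars across all parts

    DeferredStringBuffer append(String s) {
        String str = String.valueOf(s);                   // null becomes "null"
        parts.add(str);                                   // no char copying here
        length += str.length();
        return this;
    }

    DeferredStringBuffer append(int i) {
        // The drawback noted above: a primitive has to become an
        // intermediate String before it can be cached.
        return append(Integer.toString(i));
    }

    int length() {
        return length;
    }

    /** Builds the single char[] only when the result is requested. */
    @Override
    public String toString() {
        char[] all = new char[length];
        int pos = 0;
        for (String part : parts) {
            part.getChars(0, part.length(), all, pos);
            pos += part.length();
        }
        return new String(all);
    }
}
```

For example, new DeferredStringBuffer().append("a").append(42).append("bc")
allocates no char[] until toString() is called, at which point one
five-char array is filled in a single pass.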

--
Someday, I might put a cute statement here.
Eric Blake [EMAIL PROTECTED]





Re: An interesting change for shared char[] in String/StringBuffer

2004-04-08 Thread Thomas Zander

On Thursday 08 April 2004 01:24, Artur Biesiadowski wrote:
> Tom Tromey wrote:
> >>>From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:
> >>>"In this release, the sharing between String and StringBuffer has
> >>> been eliminated."
> >>>Does this mean we should change the implementation in Classpath
> >>> too? I like this optimization...
> >
> > I don't think we necessarily have to change this.  IMO it would depend
> > on whether the change is observable by user code.  Our implementation
> > doesn't always share, anyway.  It only shares if the buffer is mostly
> > in use.
>
> 'Observable by user code' is a very broad term. It is possible to write a
> short program which will throw OutOfMemoryError with sharing on and
> work without problems when data is always copied - does that count as
> an observation from user code?

No, I don't think so.
If the only difference (as I understand from Tom's email) is a faster,
more efficient implementation, then copying the behavior of Sun's JVM is
not needed in the first place.
Waiting for the 1.5 release and seeing whether this proves too unstable for
the Sun JVM will give a good indication of whether this feature would be a
worthwhile improvement for Classpath.

Just my 2 cents.
--
Thomas




Re: An interesting change for shared char[] in String/StringBuffer

2004-04-07 Thread Artur Biesiadowski
Tom Tromey wrote:
>> From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:
>> "In this release, the sharing between String and StringBuffer has been
>> eliminated."
>> Does this mean we should change the implementation in Classpath too? I
>> like this optimization...
>
> I don't think we necessarily have to change this.  IMO it would depend
> on whether the change is observable by user code.  Our implementation
> doesn't always share, anyway.  It only shares if the buffer is mostly
> in use.

'Observable by user code' is a very broad term. It is possible to write a
short program which will throw OutOfMemoryError with sharing on and
work without problems when data is always copied - does that count as
an observation from user code?
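
To make the memory argument above concrete, here is a small
back-of-the-envelope model in Java. It only does arithmetic: the class
name SharingCostModel, the 1 << 20 char capacity and the 10,000 kept
Strings are all assumptions chosen for illustration, and it models
unconditional sharing. Per Tom's message, Classpath shares only when the
buffer is mostly in use, which would limit this effect. Array object
headers are ignored.

```java
/** Hypothetical model of the memory cost of sharing; it only does arithmetic. */
final class SharingCostModel {
    static final int BYTES_PER_CHAR = 2; // a Java char is 16 bits

    /**
     * Bytes of char data kept alive by `count` Strings, each created from a
     * buffer of `capacity` chars that holds only `length` chars.
     */
    static long retainedBytes(long count, int capacity, int length, boolean shared) {
        // Shared: each String pins the buffer's entire backing array.
        // Copied: each String owns an array of exactly `length` chars.
        long charsPerString = shared ? capacity : length;
        return count * charsPerString * BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        // The usage pattern being modeled (assumed, not taken from the thread):
        //   StringBuffer sb = new StringBuffer(1 << 20);
        //   sb.append('x');
        //   kept.add(sb.toString());   // repeated 10,000 times
        System.out.println("shared: " + retainedBytes(10_000, 1 << 20, 1, true) + " bytes");
        System.out.println("copied: " + retainedBytes(10_000, 1 << 20, 1, false) + " bytes");
    }
}
```

Under these assumed numbers the shared case retains roughly 21 GB of char
data against 20 KB for the copied case, which is the kind of gap that
turns into an OutOfMemoryError on one implementation and not the other.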

Artur



Re: An interesting change for shared char[] in String/StringBuffer

2004-04-07 Thread Tom Tromey
Chr Ullenboom <[EMAIL PROTECTED]> writes:

>> From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:
>> "In this release, the sharing between String and StringBuffer has been
>> eliminated."
>> Does this mean we should change the implementation in Classpath too? I
>> like this optimization...

I don't think we necessarily have to change this.  IMO it would depend
on whether the change is observable by user code.  Our implementation
doesn't always share, anyway.  It only shares if the buffer is mostly
in use.

Tom




Re: An interesting change for shared char[] in String/StringBuffer

2004-04-07 Thread Michael Koch
On Wednesday, 7 April 2004 19:48, Chr. Ullenboom wrote:
> From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:
>
> "In this release, the sharing between String and StringBuffer has
> been eliminated."
>
> Does this mean we should change the implementation in Classpath
> too? I like this optimization...

Let us wait until 1.5.0 final is out. I wouldn't like to change it now
and then revert it later because Sun has undone the change ...


Michael





An interesting change for shared char[] in String/StringBuffer

2004-04-07 Thread Chr. Ullenboom
From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:

"In this release, the sharing between String and StringBuffer has been
eliminated."

Does this mean we should change the implementation in Classpath too? I
like this optimization...

Bye,

Chr. Ullenboom


