Re: Problem with StringBuffer
Godmar Back wrote:
...
> Tatu, how did you measure the memory usage?
> I don't doubt that it's worse than jdk1.1.8 (which: IBM's? Blackdown's?) -
> but almost twice it shouldn't (have to) be.

It was Blackdown's, and so it doesn't have a JIT (i.e. it won't use memory
for compiled code... probably not a very significant difference, but still).
I did the 'measurements' simply by running 'top' in another window and
watching how the SIZE changed. :-)

> Did you try the -mx switch? Note that kaffe tries to use up
> to 64 MB by default. (The -mx switch may not work properly in kaffe,
> however.)

I didn't use -mx, but top showed Kaffe using only about 8 megs of memory.
When I have more time to continue developing the application in question,
I'll try to get more information. Normally the difference seems to be
smaller; for another application (Fractlet, a fractal drawing application)
it's 8M (Blackdown) vs. 9M (Kaffe), which sounds quite normal.

-+ Tatu +-
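For a rough in-process cross-check of the 'top' figures, the heap in use can
also be read from inside the program. The sketch below is only an
illustration: the class name HeapUsage and the allocation sizes are made up
for the example, and the number it prints is the Java heap in use, which is
not the same quantity as the process SIZE that top reports.

```java
class HeapUsage {
    // Bytes of Java heap currently in use (total heap minus free heap).
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Hypothetical workload: hold on to 256 arrays of 8192 chars each.
        char[][] hold = new char[256][];
        for (int i = 0; i < hold.length; i++) {
            hold[i] = new char[8192];
        }

        long after = usedBytes();
        System.out.println("heap in use before: " + before + " bytes");
        System.out.println("heap in use after:  " + after + " bytes");
        System.out.println("arrays still held:  " + hold.length);
    }
}
```

The values depend on the VM and on when the garbage collector last ran, so
they are best compared between runs of the same program rather than read as
absolute numbers.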
Re: Problem with StringBuffer
> I updated to latest CVS-sources and yes, the problem went away. The
> memory footprint is now almost identical to the case where I just
> create a new StringBuffer every time. The memory usage is still a bit
> high (kaffe seems to use almost twice as much memory as jdk1.1.8, but
> the application is a bit memory hungry in any case), but at least it's
> not exponentially growing. Besides, it's nice that kaffe is
> significantly faster than vanilla Blackdown JDK in this case.

Tatu, how did you measure the memory usage?
I don't doubt that it's worse than jdk1.1.8 (which: IBM's? Blackdown's?) -
but almost twice it shouldn't (have to) be.

Did you try the -mx switch? Note that kaffe tries to use up
to 64 MB by default. (The -mx switch may not work properly in kaffe,
however.)

 - Godmar
Re: Problem with StringBuffer
Archie Cobbs wrote:
>
> Tatu Saloranta writes:
> >
> > Then again as there already is a patch that should save space (even if
> > StringBuffer is used just once), I'm a happy camper. :-)
>
> Tatu-
> Aside from the theoretical debate, I'd be interested to hear if
> the recent check-ins alleviate the problem you're seeing. Patches
> are included below in case you don't have the latest CVS.
>
> Thanks,
> -Archie

I updated to the latest CVS sources and yes, the problem went away. The
memory footprint is now almost identical to the case where I just
create a new StringBuffer every time. The memory usage is still a bit
high (kaffe seems to use almost twice as much memory as jdk1.1.8, but
the application is a bit memory hungry in any case), but at least it's
not exponentially growing. Besides, it's nice that kaffe is
significantly faster than vanilla Blackdown JDK in this case.

-+ Tatu +-
Re: Problem with StringBuffer
Artur Biesiadowski writes:
> >      // Note: value, offset, and count are not private, because
> >      // StringBuffer uses them for faster access
> > -    char[] value;
> > -    int offset;
> > -    int count;
> > +    char[] value;           // really "final"
> > +    int offset;             // really "final"
> > +    int count;              // really "final"
>
> And why are these fields not in fact final? They are assigned to only
> in the constructor, so they could be safely marked as final. I don't
> suppose it will make any difference in speed with the current jit, but
> it would be self-documenting code - plus the possibility of catching
> some bug early during String changes/optimizations.

Does this answer your question? :-)

$ jikes String.java

   332.         value = str.value;
                <--->
*** Error: Possible attempt to reassign a value to the final variable "value".

   334.         count = str.count;
                <--->
*** Error: Possible attempt to reassign a value to the final variable "count".

-Archie

___
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: Problem with StringBuffer
Archie Cobbs wrote:
>      // Note: value, offset, and count are not private, because
>      // StringBuffer uses them for faster access
> -    char[] value;
> -    int offset;
> -    int count;
> +    char[] value;           // really "final"
> +    int offset;             // really "final"
> +    int count;              // really "final"

And why are these fields not in fact final? They are assigned to only
in the constructor, so they could be safely marked as final. I don't
suppose it will make any difference in speed with the current jit, but
it would be self-documenting code - plus the possibility of catching
some bug early during String changes/optimizations.

Artur
Re: Problem with StringBuffer
Tatu Saloranta writes:
> >
> > - most JAVA programmers try to code a program that behaves well
> >   under all JVMs available.
> > - The default StringBuffer implementation from SUN has problems
> >   with reuse of large StringBuffers.
> >
> > So, IMO all you can do is to code around this problem.
>
> I perhaps misunderstood what you were saying at first, and you are of
> course right in that, given that current implementations (I haven't
> tested IBM's jdk to see if it also has the problem, though) have
> problems, it is best to code around the problem (that's what I did once
> I found the problem). What I tried to say was simply that it would be
> nice to fix (or at least alleviate) the potential problem in Kaffe, as
> it seems relatively easy to do (and as the problem has been fixed in
> JDK it seems like it was considered a real implementation deficiency).
>
> Then again as there already is a patch that should save space (even if
> StringBuffer is used just once), I'm a happy camper. :-)

Tatu-
Aside from the theoretical debate, I'd be interested to hear if
the recent check-ins alleviate the problem you're seeing. Patches
are included below in case you don't have the latest CVS.

Thanks,
-Archie

___
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com

--- String.java.orig	Fri Mar 31 15:20:58 2000
+++ String.java	Fri Mar 31 15:24:01 2000
@@ -20,11 +20,17 @@
 final public class String
   implements Serializable, Comparable
 {
+    /**
+     * Maximum slop (extra unused chars in the char[]) that
+     * will be accepted in a StringBuffer -> String conversion.
+     */
+    private static final int STRINGBUFFER_SLOP = 32;
+
     // Note: value, offset, and count are not private, because
     // StringBuffer uses them for faster access
-    char[] value;
-    int offset;
-    int count;
+    char[] value;           // really "final"
+    int offset;             // really "final"
+    int count;              // really "final"
     int hash;
     boolean interned;
 
@@ -51,12 +57,18 @@
     }
 
     public String (StringBuffer sb) {
-
-        // mark this StringBuffer so that it knows we are using it
-        sb.isStringized = true;
-
-        count = sb.used;
-        value = sb.buffer;
+        synchronized (sb) {
+            if (sb.buffer.length > sb.used + STRINGBUFFER_SLOP) {
+                value = new char[sb.used];
+                count = sb.used;
+                System.arraycopy(sb.buffer, 0, value, 0, count);
+            }
+            else {
+                value = sb.buffer;
+                count = sb.used;
+                sb.isStringized = true;
+            }
+        }
     }
 
     public String( byte[] bytes) {
--- StringBuffer.java.orig	Fri Mar 31 15:20:58 2000
+++ StringBuffer.java	Sat Apr  1 10:03:16 2000
@@ -1,6 +1,3 @@
-package java.lang;
-
-
 /*
  * Java core library component.
  *
@@ -10,13 +7,16 @@
  * See the file "license.terms" for information on usage and redistribution
  * of this file.
  */
-final public class StringBuffer
-  implements java.io.Serializable
-{
-    char[] buffer;
-    int used;
-    boolean isStringized;
-    final private int SPARECAPACITY = 16;
+
+package java.lang;
+
+public final class StringBuffer implements java.io.Serializable {
+    private final int SPARECAPACITY = 16;
+
+    char[] buffer;                  // character buffer
+    int used;                       // # chars used in buffer
+    boolean isStringized;           // buffer also part of String
+                                    // and therefore unmodifiable
 
     // This is what Sun's JDK1.1 "serialver java.lang.StringBuffer" says
     private static final long serialVersionUID = 3388685877147921107L;
@@ -26,16 +26,13 @@
     }
 
     public StringBuffer(String str) {
-        if ( str == null)
-            str = String.valueOf( str);
-        used = str.length();
-        buffer = new char[used+SPARECAPACITY];
-        System.arraycopy(str.toCharArray(), 0, buffer, 0, used);
+        used = str.count;
+        buffer = new char[used + SPARECAPACITY];
+        System.arraycopy(str.value, str.offset, buffer, 0, used);
     }
 
     public StringBuffer(int length) {
-        if (length<0) throw new NegativeArraySizeException();
-        buffer=new char[length];
+        buffer = new char[length];
     }
 
     public StringBuffer append(Object obj) {
@@ -44,9 +41,9 @@
 
     public StringBuffer append ( String str ) {
         if (str == null) {
-            str = String.valueOf( str);
+            str = "null";
         }
-        return (append( str.value, str.offset, str.count));
+        return append(str.value, str.offset, str.count);
     }
 
     public StringBuffer append(boolean b) {
@@ -54,28 +51,22 @@
     }
 
     public synchronized StringBuffer append(char c)
Re: Problem with StringBuffer
Mo DeJong writes:
> This brings up an interesting question. Should kaffe always
> maintain "compatibility" with a Sun JDK implementation
> (1.1, 1.2, or 1.3) even when a Sun implementation
> is clearly wrong or inefficient?

People seem to complain more about incompatible behavior than
they do about compatible behavior :-)

So I think we should maintain compatibility when we are forced to
(e.g., same APIs), but not when it's possible to do a better job
without breaking apps that run successfully on the JDK.

-Archie

___
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: Problem with StringBuffer
Wolfgang Muees wrote:
> Tatu, I think you miss an important point here:
>
> - most JAVA programmers try to code a program that behaves well
>   under all JVMs available.
> - The default StringBuffer implementation from SUN has problems
>   with reuse of large StringBuffers.
>
> So, IMO all you can do is to code around this problem.

I perhaps misunderstood what you were saying at first, and you are of
course right in that, given that current implementations (I haven't
tested IBM's jdk to see if it also has the problem, though) have
problems, it is best to code around the problem (that's what I did once
I found the problem). What I tried to say was simply that it would be
nice to fix (or at least alleviate) the potential problem in Kaffe, as
it seems relatively easy to do (and as the problem has been fixed in
JDK it seems like it was considered a real implementation deficiency).

Then again as there already is a patch that should save space (even if
StringBuffer is used just once), I'm a happy camper. :-)

-+ Tatu +-
Re: Problem with StringBuffer
On Sun, 2 Apr 2000, Wolfgang Muees wrote:

> On Sat, 01 Apr 2000, Tatu Saloranta wrote:
> > Wolfgang Muees wrote:
> > >
> > > The right solution for this problem is IMO: don't reuse StringBuffer.
> > > It is designed primarily as an input buffer for a single string.
> >
> > I think I disagree here; at least if there's no other class for similar
> > purpose. I am interested in optimizing Java-programs, and in general,
> > one of the most efficient optimizations is to recycle objects. Object
> > creation is not a cheap operation. Especially in this case, where
> > StringBuffer does allocate a character array, it means there are at
> > least 2 memory allocations and other initialization code. If the
> > array truncation can be done when the array is being copied (during
> > destringify() or whatever the method was), it won't add a new array
> > allocation.
>
> Tatu, I think you miss an important point here:
>
> - most JAVA programmers try to code a program that behaves well
>   under all JVMs available.
> - The default StringBuffer implementation from SUN has problems
>   with reuse of large StringBuffers.
>
> So, IMO all you can do is to code around this problem.
>
> best regards
> Wolfgang

This brings up an interesting question. Should kaffe always
maintain "compatibility" with a Sun JDK implementation
(1.1, 1.2, or 1.3) even when a Sun implementation
is clearly wrong or inefficient?

I have to wonder when we are going to give up on waiting for
Sun to fix bugs in the core libraries and just make sure
Kaffe is reasonable.

My "pet peeve" API is in the java.util.zip package. You can check
out the jar implementation I wrote for Kaffe to see an example
of the problem with the ZipEntry implementation. In short,
the Sun zip impl forces the user to generate a CRC checksum
on the data written to the zip file even though the zip library
already does this for you.

Here is a quick example of what is currently required
for the Sun impl (and mirrored in the Kaffe impl).
Also note that this only applies to uncompressed zip
entries (it is yet another one of the mysteries of the Sun impl).

    InputStream in = new FileInputStream(entryfile);
    ZipEntry ze = new ZipEntry(entryname);
    ze.setMethod(ZipEntry.STORED);
    ze.setCrc( 0 );
    crc = new CRC32();
    in = new CheckedInputStream(in,crc);
    readwriteStreams(in, zos); // this just copies all the data from in to zos
    ze.setCrc(crc.getValue());
    zos.closeEntry(); // this closes the current entry on the ZipOutputStream

Why on earth should I need to do this? Code like the following should
run faster because a second CRC calculation over the entire stream
would not be needed. This code would not run on the Sun JDK because
it raises an exception in the closeEntry() method, but we could fix
the problem in Kaffe.

    InputStream in = new FileInputStream(entryfile);
    ZipEntry ze = new ZipEntry(entryname);
    ze.setMethod(ZipEntry.STORED);
    readwriteStreams(in, zos); // this just copies all the data from in to zos
    zos.closeEntry(); // this closes the current entry on the ZipOutputStream

Any comments?

Mo DeJong
Red Hat Inc.
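The double checksum described above can be reproduced on a current JDK with
a self-contained sketch like the one below. It is an illustration only, not
the Kaffe jar code: the class name StoredEntryDemo and both helper methods
are made up for the example, and it works on in-memory byte arrays instead
of files. As far as I know, java.util.zip.ZipOutputStream rejects a STORED
entry in putNextEntry() unless its size and CRC-32 have already been set, so
the caller ends up checksumming the data before the stream does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class StoredEntryDemo {
    // Writes data as one uncompressed (STORED) entry and returns the archive bytes.
    static byte[] writeStored(String name, byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);      // pass 1: checksum computed by the caller

        ZipEntry ze = new ZipEntry(name);
        ze.setMethod(ZipEntry.STORED);
        ze.setSize(data.length);               // must be known before putNextEntry()
        ze.setCrc(crc.getValue());             // must be known before putNextEntry()

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(ze);
            zos.write(data, 0, data.length);   // pass 2: the stream checksums the data again
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads back the contents of the first entry of an in-memory archive.
    static byte[] readFirstEntry(byte[] zip) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            zis.getNextEntry();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[512];
            int n;
            while ((n = zis.read(chunk, 0, chunk.length)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling readFirstEntry(writeStored("a.txt", bytes)) should hand back the
original bytes. The point of the sketch is the ordering constraint: the CRC
has to exist before any data is written, which is the redundancy the message
above objects to.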
Re: Problem with StringBuffer
On Sat, 01 Apr 2000, Tatu Saloranta wrote:
> Wolfgang Muees wrote:
> >
> > The right solution for this problem is IMO: don't reuse StringBuffer.
> > It is designed primarily as an input buffer for a single string.
>
> I think I disagree here; at least if there's no other class for similar
> purpose. I am interested in optimizing Java-programs, and in general,
> one of the most efficient optimizations is to recycle objects. Object
> creation is not a cheap operation. Especially in this case, where
> StringBuffer does allocate a character array, it means there are at
> least 2 memory allocations and other initialization code. If the
> array truncation can be done when the array is being copied (during
> destringify() or whatever the method was), it won't add a new array
> allocation.

Tatu, I think you miss an important point here:

- most JAVA programmers try to code a program that behaves well
  under all JVMs available.
- The default StringBuffer implementation from SUN has problems
  with reuse of large StringBuffers.

So, IMO all you can do is to code around this problem.

best regards
Wolfgang

--
No Microsoft programs were used in the creation or distribution of this
message. If you are using a Microsoft program to view this message, be
forewarned that I am not responsible for any harm you may encounter as
a result.
Re: Problem with StringBuffer
Wolfgang Muees wrote:
>
> On Wed, 29 Mar 2000, Tatu Saloranta wrote:
> > I finally found out the reason for 'memory leak' on my Java program,
> > and the culprit in this case was StringBuffer - implementation.
> > ...
> > (or length of the buffer via setLength()). The problem here is that
> > as soon as I encounter a long token (~4000 chars in this case),
> > _all_ tokens after that will use up the same 4k memory
>
> The right solution for this problem is IMO: don't reuse StringBuffer.
> It is designed primarily as an input buffer for a single string.

I think I disagree here; at least if there's no other class for similar
purpose. I am interested in optimizing Java-programs, and in general,
one of the most efficient optimizations is to recycle objects. Object
creation is not a cheap operation. Especially in this case, where
StringBuffer does allocate a character array, it means there are at
least 2 memory allocations and other initialization code. If the
array truncation can be done when the array is being copied (during
destringify() or whatever the method was), it won't add a new array
allocation.

An alternative would then be a replacement for StringBuffer that
would be designed to be reusable. Not a huge task, and perhaps it might
be worth the effort, at least in my case.

In any case, I guess it all depends on how difficult it is to alleviate
the problem. The fundamental problem, I guess, is that although it is
the String() instance that should really do the truncation (so as not
to use a huge char array for storing just a few characters), it is
StringBuffer that is responsible for doing that; String just takes the
given array and length, and after that passively just uses what it got.
The JDK fix notes that, and alleviates the problem so that none of the
resulting strings have higher than 100% overhead. :-)

In any case I think it's an oversight by the developers of the Java base
classes; if they had known about the problem, there would be something
in the description of the StringBuffer class to indicate it's not
supposed to be reused.

-+ Tatu +-
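A replacement buffer designed for reuse, as suggested above, could look
roughly like the sketch below. It is an assumption about what such a class
might be, not code from Kaffe or the JDK: the name ReusableCharBuffer and its
three methods are invented for the example. The idea is that the String
never shares the buffer's array; it always gets a copy of exactly the chars
in use, so one long token cannot inflate the strings built after it.

```java
import java.util.Arrays;

class ReusableCharBuffer {
    private char[] buf = new char[16];
    private int used;

    // Appends one char, doubling the backing array when it is full.
    void append(char c) {
        if (used == buf.length) {
            buf = Arrays.copyOf(buf, buf.length * 2);
        }
        buf[used++] = c;
    }

    // Returns the current contents as a String and resets the buffer for reuse.
    // new String(char[], int, int) copies only the `used` chars, so the
    // returned String does not hold on to the (possibly large) backing array.
    String takeString() {
        String s = new String(buf, 0, used);
        used = 0;
        return s;
    }

    // Size of the backing array, for inspection only.
    int capacity() {
        return buf.length;
    }
}
```

A lexer would call append() per character and takeString() at the end of
each token. The cost is one array copy per token, which is the trade-off
discussed in this thread: a little extra copying in exchange for strings
with no wasted tail.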
Re: Problem with StringBuffer
Wolfgang Muees writes:
> > I finally found out the reason for 'memory leak' on my Java program,
> > and the culprit in this case was StringBuffer - implementation.
> >
> > I have a lexer (made with JFLex), that uses StringBuffer for
> > constructing the strings for certain tokens. The StringBuffer is
> > reused so that each time a new such token is getting created,
> > StringBuffer.setLength(0) is called. However, in Kaffe's
> > implementation at least, the actual length of the buffer is not
> > changed. This wouldn't be a big problem in itself, as there's just
> > one StringBuffer instance... But, alas, as Strings & StringBuffers
> > are optimized so that when new String(StringBuffer) is called, the
> > character array is actually shared between the string and string
> > buffer, until the StringBuffer needs to change the data (or length
> > of the buffer via setLength()). The problem here is that
> > as soon as I encounter a long token (~4000 chars in this case),
> > _all_ tokens after that will use up the same 4k memory
>
> The right solution for this problem is IMO: don't reuse StringBuffer.
> It is designed primarily as an input buffer for a single string.

Yesterday I checked in some changes which should alleviate this
problem nonetheless. Please give them a try and let us know if
they help.

-Archie

___
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: Problem with StringBuffer
On Wed, 29 Mar 2000, Tatu Saloranta wrote:
> I finally found out the reason for 'memory leak' on my Java program,
> and the culprit in this case was StringBuffer - implementation.
>
> I have a lexer (made with JFLex), that uses StringBuffer for
> constructing the strings for certain tokens. The StringBuffer is
> reused so that each time a new such token is getting created,
> StringBuffer.setLength(0) is called. However, in Kaffe's
> implementation at least, the actual length of the buffer is not
> changed. This wouldn't be a big problem in itself, as there's just
> one StringBuffer instance... But, alas, as Strings & StringBuffers
> are optimized so that when new String(StringBuffer) is called, the
> character array is actually shared between the string and string
> buffer, until the StringBuffer needs to change the data (or length
> of the buffer via setLength()). The problem here is that
> as soon as I encounter a long token (~4000 chars in this case),
> _all_ tokens after that will use up the same 4k memory

The right solution for this problem is IMO: don't reuse StringBuffer.
It is designed primarily as an input buffer for a single string.

regards
Wolfgang

--
No Microsoft programs were used in the creation or distribution of this
message. If you are using a Microsoft program to view this message, be
forewarned that I am not responsible for any harm you may encounter as
a result.
Re: Problem with StringBuffer
Tatu Saloranta writes:
> The same problem occurs in Sun's JDK as well, although by printing
> StringBuffer.capacity() regularly, I noticed that the behaviour is
> not 100% identical. In both cases, though, I end up getting an
> OutOfMemory exception... :-)

Here's a related Sun bug:

  http://developer.java.sun.com/developer/bugParade/bugs/4224987.html

-Archie

___
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: Problem with StringBuffer
> I think the actual problem is not in the StringBuffer size, but in the
> copy done when the buffer is un-shared; after all, unless the StringBuffer
> jumps down from 1Mb to 1 byte, a single instance would not be a big
> problem; but having many tokens with a lot of wasted memory would be ...

I whipped up a quick patch for this... I tested it so I know that it
doesn't break anything new. However, I don't know if it will fix your
problem. :)

Basically, in the String(StringBuffer) constructor, I copy the
stringbuffer's contents if used + SLOP < buffer.length.

For systems that create a string buffer, stringize it, and then throw
away the string buffer, this may introduce an extra copy while only
saving a small amount of memory.

-Pat

- - --- --- -- -- - - -
Pat Tullmann [EMAIL PROTECTED]
Don't hate yourself in the morning -- sleep until noon!

Index: String.java
===================================================================
RCS file: /cvs/kaffe/kaffe/libraries/javalib/java/lang/String.java,v
retrieving revision 1.27
diff -u -u -r1.27 String.java
--- String.java	1999/10/12 02:29:47	1.27
+++ String.java	1999/12/18 23:46:43
@@ -20,6 +20,12 @@
 final public class String
   implements Serializable, Comparable
 {
+    /**
+     * Maximum slop (extra unused chars in the char[]) that
+     * will be accepted in a StringBuffer -> String conversion.
+     */
+    private static final int STRINGBUFFER_SLOP = 32;
+
     // Note: value, offset, and count are not private, because
     // StringBuffer uses them for faster access
     char[] value;
@@ -52,11 +58,23 @@
 
     public String (StringBuffer sb) {
 
-        // mark this StringBuffer so that it knows we are using it
+        // mark the StringBuffer so that it knows we are using it
         sb.isStringized = true;
+
+        if ((sb.used + STRINGBUFFER_SLOP) < sb.buffer.length) {
+            value = new char[sb.used];
+            System.arraycopy(sb.buffer, 0,
+                             value, 0, sb.used);
+            count = sb.used;
 
-        count = sb.used;
-        value = sb.buffer;
+            // StringBuffer is free to reuse its buffer again
+            sb.isStringized = false;
+        }
+        else {
+            // Just point directly to the StringBuffer's char[]
+            count = sb.used;
+            value = sb.buffer;
+        }
     }
 
     public String( byte[] bytes) {
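The copy-or-share decision described in this message can be tried outside
java.lang with a small stand-alone sketch. The class SlopPolicy and the
helper charsForString() are hypothetical names introduced here; only the
constant value 32 and the comparison come from the patch text. The helper
returns the array a new String would keep: an exact-size copy when the
buffer wastes more than the slop, otherwise the buffer itself.

```java
class SlopPolicy {
    // Same value as STRINGBUFFER_SLOP in the patch.
    static final int STRINGBUFFER_SLOP = 32;

    // Hypothetical helper: decides which char[] a String built from
    // (buffer, used) should hold on to.
    static char[] charsForString(char[] buffer, int used) {
        if (used + STRINGBUFFER_SLOP < buffer.length) {
            // More than SLOP unused chars: take a private, exact-size copy.
            char[] exact = new char[used];
            System.arraycopy(buffer, 0, exact, 0, used);
            return exact;
        }
        // Little waste: share the buffer and avoid the copy.
        return buffer;
    }
}
```

With an 8192-char buffer holding a 5-char token the helper returns a 5-char
array; with a 20-char buffer it returns the buffer unchanged. A larger slop
means fewer copies but more wasted chars per string, so 32 is a tuning
choice rather than a fixed rule.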
Re: Problem with StringBuffer
Tatu Saloranta <[EMAIL PROTECTED]> writes:
> I have a lexer (made with JFLex), that uses StringBuffer for
> constructing the strings for certain tokens.
[ ... ]

I had the same problem some time ago, with an old version of the Jacl
interpreter on the Sun JDK 1.1.something; in that case it was really
dramatic, because the 'big' token there was around 64K; this also
simplified identifying the problem ;->

In my case, the workaround was to copy the string (with an explicit
call to the String(String) constructor) ...

> Any ideas?

I think the actual problem is not in the StringBuffer size, but in the
copy done when the buffer is un-shared; after all, unless the StringBuffer
jumps down from 1Mb to 1 byte, a single instance would not be a big
problem; but having many tokens with a lot of wasted memory would be ...

Maurizio

--
Maurizio De Cecco
MandrakeSoft    http://www.mandrakesoft.com/
Problem with StringBuffer
I finally found out the reason for 'memory leak' on my Java program,
and the culprit in this case was StringBuffer - implementation.

I have a lexer (made with JFLex), that uses StringBuffer for
constructing the strings for certain tokens. The StringBuffer is reused
so that each time a new such token is getting created,
StringBuffer.setLength(0) is called. However, in Kaffe's implementation
at least, the actual length of the buffer is not changed. This wouldn't
be a big problem in itself, as there's just one StringBuffer instance...
But, alas, as Strings & StringBuffers are optimized so that when
new String(StringBuffer) is called, the character array is actually
shared between the string and string buffer, until the StringBuffer
needs to change the data (or length of the buffer via setLength()).
The problem here is that as soon as I encounter a long token
(~4000 chars in this case), _all_ tokens after that will use up the
same 4k memory (actually, with kaffe it's 8192 bytes as the array keeps
on doubling in size) regardless of their actual length! Even when
sharing is removed (as a result of setLength(), for example), the
String - instance simply copies the huge array, not checking the length
of its contents.

The same problem occurs in Sun's JDK as well, although by printing
StringBuffer.capacity() regularly, I noticed that the behaviour is
not 100% identical. In both cases, though, I end up getting an
OutOfMemory exception... :-)

I'm not sure what should be done about this; I can 'fix' the problem in
my program by instantiating new StringBuffers (I do that now if
StringBuffer.capacity() exceeds 80 chars). Still, this is a somewhat
subtle but fatal problem, and probably other people have encountered
the same problem at some other point. Or perhaps they just thought it's
because Java is such a memory hog... :-)

Either the char array of StringBuffer could be deflated (not just
inflated) on setLength() (perhaps if the new length < old length / 2 or
such), or on destringize(), depending on how much wasted space the new
array would have?

Any ideas?

-+ Tatu +-
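The application-side workaround mentioned above (start a fresh StringBuffer
once capacity() exceeds 80 chars) might be sketched as follows. This is not
the lexer's actual code: the class TokenBuffer, its methods and the constant
name are assumptions made for the example, and only the threshold of 80
comes from the message.

```java
class TokenBuffer {
    // Threshold from the message: above this capacity the buffer is replaced.
    static final int MAX_REUSED_CAPACITY = 80;

    private StringBuffer sb = new StringBuffer();

    void append(char c) {
        sb.append(c);
    }

    // Returns the finished token and prepares the buffer for the next one.
    String finishToken() {
        String token = sb.toString();
        if (sb.capacity() > MAX_REUSED_CAPACITY) {
            // A long token inflated the buffer: drop it rather than reuse it.
            sb = new StringBuffer();
        } else {
            // Small buffer: reuse it to avoid allocating a new one.
            sb.setLength(0);
        }
        return token;
    }

    // Capacity of the buffer currently held, for inspection only.
    int capacity() {
        return sb.capacity();
    }
}
```

After a 4000-char token the buffer is swapped for a fresh one, so the next
short token starts from the default capacity again. This keeps object
recycling for the common case of short tokens and pays for a new allocation
only after an unusually long one.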