I was writing a parser for dpkg control files. My first version was very
naive: it read one char at a time, in a loop.
Under Sun's JVM, I got acceptable speeds (3s to parse all of /var/lib/dpkg/status).
Kaffe, however, took about 1.5 minutes.
So, after a bit of digging, I found the problem. KaffeDecoder and its
brother, KaffeEncoder, are very inefficient. When doing a single-char read,
KaffeDecoder calls the large read(buf, off, len) routine with an array of length one.
This causes a separate conversion call for each char, and this is *very* slow.
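To make the cost concrete, here is a minimal sketch of that pattern. It is not the Kaffe code: CountingReader and SingleCharReadDemo are hypothetical names, and the sketch assumes the single-char path behaves like the default java.io.Reader.read(), which forwards to read(buf, off, len) with a one-element array.

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical demo class: counts how often the bulk read routine is entered. */
class CountingReader extends Reader {
    private final String data;
    private int pos = 0;
    int bulkCalls = 0;

    CountingReader(String data) {
        this.data = data;
    }

    // The inherited Reader.read() calls this with a one-element array,
    // so every single-char read pays for one full bulk call.
    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        bulkCalls++;
        if (pos >= data.length()) return -1;
        int n = Math.min(len, data.length() - pos);
        data.getChars(pos, pos + n, cbuf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() {
        // nothing to release in this in-memory stand-in
    }
}

class SingleCharReadDemo {
    /** Drains the text one char at a time and reports the bulk calls made. */
    static int bulkCallsForSingleCharLoop(String text) {
        CountingReader in = new CountingReader(text);
        try {
            while (in.read() != -1) {
                // discard the char; only the call count matters here
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return in.bulkCalls;
    }

    public static void main(String[] args) {
        System.out.println(bulkCallsForSingleCharLoop("Package: kaffe\n"));
    }
}
```

For an N-char input this reports N + 1 bulk calls: one per char, plus the call that hits end of stream. In the real decoder each of those calls is also a conversion call.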
I then noticed in the javadoc for InputStreamReader and OutputStreamWriter
that the implementation is allowed to do internal buffering to make
conversion more efficient. So that's what I did: the decoder and the encoder
now each keep a 4096-char internal buffer.
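As a standalone illustration of the buffering idea (a sketch, not the patch itself), the class below serves single-char reads from an internal char array and only goes back to the source when that array is empty. BufferingReaderSketch, its refills counter, and the drain helper are hypothetical names. A StringReader stands in for the slow converting read, and the locking the real classes need is left out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical sketch: a Reader that refills an internal buffer in bulk. */
class BufferingReaderSketch extends Reader {
    private final Reader source;   // stands in for the slow converting read
    private final char[] buffer;
    private int ptr = 0;           // index of the next char to hand out
    private int end = 0;           // number of valid chars in buffer
    int refills = 0;               // how many bulk reads hit the source

    BufferingReaderSketch(Reader source, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.source = source;
        this.buffer = new char[bufferSize];
    }

    /** Refills the buffer with bulk reads; returns false at end of stream. */
    private boolean fill() throws IOException {
        int r;
        do {
            refills++;
            r = source.read(buffer, 0, buffer.length);
        } while (r == 0);          // tolerate sources that return 0 chars
        ptr = 0;
        end = Math.max(r, 0);
        return r > 0;
    }

    @Override
    public int read() throws IOException {
        if (ptr >= end && !fill()) return -1;
        return buffer[ptr++];
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (ptr >= end && !fill()) return -1;
        int n = Math.min(len, end - ptr);
        System.arraycopy(buffer, ptr, cbuf, off, n);
        ptr += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        source.close();
    }

    /** Drains text one char at a time; returns {chars read, bulk refills}. */
    static int[] drain(String text, int bufferSize) {
        BufferingReaderSketch in =
            new BufferingReaderSketch(new StringReader(text), bufferSize);
        int chars = 0;
        try {
            while (in.read() != -1) chars++;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { chars, in.refills };
    }

    public static void main(String[] args) {
        int[] result = drain("Package: kaffe\n", 4096);
        System.out.println(result[0] + " chars, " + result[1] + " bulk reads");
    }
}
```

With a buffer of 4096 chars, the number of bulk reads drops from one per char to roughly one per 4096 chars, plus the final read that sees end of stream.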
Before doing this, however, I wrote a small test framework. It tests
input/output, buffered/unbuffered, with/without encoding: 8 tests in all. I ran
it under sun14, kaffe, kaffe-fix, and gcj. I won't include the numbers here
unless someone asks.
However, I will report on the speed increases I saw.
With my fix in place, and a dumb loop reading (or writing) one char at a time, I
saw reads get 200 times faster and writes 90 times faster.
The naive version of my parser got 25 times faster.
Anyway, attached you'll find the PerfTest program I wrote, and the patch
itself.
PS: I do have commit access, but this is a very low-level change, and I wanted
others to see it first. I haven't run any test cases other than my parsing
program.
? Makefile.in.es
Index: gnu/java/io/decode/KaffeDecoder.java
===================================================================
RCS file: /cvs/kaffe/kaffe/libraries/javalib/gnu/java/io/decode/KaffeDecoder.java,v
retrieving revision 1.4
diff -u -r1.4 KaffeDecoder.java
--- gnu/java/io/decode/KaffeDecoder.java 18 May 2004 16:13:28 -0000 1.4
+++ gnu/java/io/decode/KaffeDecoder.java 7 Dec 2004 07:32:28 -0000
@@ -56,6 +56,14 @@
ByteToCharConverter converter;
+/* These three vars are used for the general buffer management */
+private int ptr = 0;
+private int end = 0;
+private char[] buffer = new char[4096];
+
+/* This array is a temporary used during the conversion process. */
+private byte[] inbuf = new byte[4096];
+
/*************************************************************************/
/*
@@ -103,15 +111,83 @@
return(cbuf);
}
-/**
- * Read the requested number of chars from the underlying stream.
- * Some byte fragments may remain in the converter and they are
- * used by the following read. So read and convertToChars must
- * not be used for the same converter instance.
- */
-// copied from kaffe's java/io/InputStreamReader.java
+
+public int
+read() throws IOException
+{
+ synchronized (lock) {
+ if (ptr < end) return buffer[ptr++];
+ int r = _read(buffer, 0, buffer.length);
+ if (r == -1) return -1;
+ ptr = 1;
+ end = r;
+ return buffer[0];
+ }
+}
+
public int
-read ( char cbuf[], int off, int len ) throws IOException
+read(char cbuf[], int off, int len) throws IOException
+{
+ synchronized (lock) {
+ int bytesRead = 0;
+ if (len < end - ptr) {
+ System.arraycopy(buffer, ptr, cbuf, off, len);
+ ptr += len;
+ return len;
+ }
+
+ int preCopy = end - ptr;
+ if (preCopy > 0) {
+ System.arraycopy(buffer, ptr, cbuf, off, preCopy);
+ off += preCopy;
+ len -= preCopy;
+ bytesRead += preCopy;
+ }
+ ptr = 0;
+ end = 0;
+
+ int remainder = len % buffer.length;
+ int bulkCopy = len - remainder;
+ if (bulkCopy > 0) {
+ int r = _read(cbuf, off, bulkCopy);
+ if (r == -1) {
+ return bytesRead == 0 ? -1 : bytesRead;
+ }
+ off += r;
+ len -= r;
+ bytesRead += r;
+ }
+
+ if (remainder > 0) {
+ int r = _read(buffer, 0, buffer.length);
+ if (r == -1) {
+ return bytesRead == 0 ? -1 : bytesRead;
+ }
+ end = r;
+ int remainderCopy = r < remainder ? r : remainder;
+ System.arraycopy(buffer, 0, cbuf, off, remainderCopy);
+ off += remainderCopy;
+ len -= remainderCopy;
+ ptr = remainderCopy;
+ bytesRead += remainderCopy;
+ }
+
+ return bytesRead;
+ }
+}
+
+/*
+ * Read the requested number of chars from the underlying stream.
+ * Some byte fragments may remain in the converter and they are
+ * used by the following read. So read and convertToChars must
+ * not be used for the same converter instance.
+ *
+ * This method *must* be called with lock held, as it uses the
+ * instance variable inbuf.
+ */
+// copied from kaffe's java/io/InputStreamReader.java
+private int
+_read ( char cbuf[], int off, int len ) throws IOException
{
if (len < 0 || off < 0 || off + len > cbuf.length) {
throw new IndexOutOfBoundsException();
@@ -119,8 +195,6 @@
int outlen = 0;
boolean seenEOF = false;
-
- byte[] inbuf = new byte[2048];
while (len > outlen) {
// First we retreive anything left in the converter
Index: gnu/java/io/encode/KaffeEncoder.java
===================================================================
RCS file: /cvs/kaffe/kaffe/libraries/javalib/gnu/java/io/encode/KaffeEncoder.java,v
retrieving revision 1.4
diff -u -r1.4 KaffeEncoder.java
--- gnu/java/io/encode/KaffeEncoder.java 6 Dec 2004 21:20:40 -0000 1.4
+++ gnu/java/io/encode/KaffeEncoder.java 7 Dec 2004 07:32:28 -0000
@@ -65,6 +65,15 @@
CharToByteConverter converter;
+/* These 2 variables are used in the general buffer management */
+private int ptr = 0;
+private char[] buffer = new char[4096];
+
+/* This buffer is used during the conversion process. It gets expanded
+ * automatically when it overflows.
+ */
+private byte[] bbuf = new byte[buffer.length * 3];
+
/*************************************************************************/
/*
@@ -127,9 +136,74 @@
* Write the requested number of chars to the underlying stream
*/
public void
+write(int c) throws IOException
+{
+ synchronized (lock) {
+ buffer[ptr++] = (char) c;
+ if (ptr == buffer.length) localFlush();
+ }
+}
+
+/**
+ * Write the requested number of chars to the underlying stream
+ */
+public void
write(char[] buf, int offset, int len) throws IOException
{
- out.write(convertToBytes(buf, offset, len));
+ synchronized (lock) {
+ if (len > buffer.length - ptr) {
+ localFlush();
+ _write(buf, offset, len);
+ } else if (len == 1) {
+ buffer[ptr++] = buf[offset];
+ } else {
+ System.arraycopy(buf, offset, buffer, ptr, len);
+ ptr += len;
+ }
+ }
+}
+
+/* This *must* be called with the lock held. */
+private void
+localFlush() throws IOException
+{
+ if (ptr > 0) {
+ // Reset ptr to 0 before the _write call. Otherwise, a
+ // very nasty loop could occur. Please don't ask.
+ int length = ptr;
+ ptr = 0;
+ _write(buffer, 0, length);
+ }
+}
+
+public void
+flush() throws IOException
+{
+ synchronized (lock) {
+ localFlush();
+ out.flush();
+ }
+}
+/*
+ * Write the requested number of chars to the underlying stream
+ *
+ * This method *must* be called with the lock held, as it accesses
+ * the variable bbuf.
+ */
+private void
+_write(char[] buf, int offset, int len) throws IOException
+{
+ int bbuflen = converter.convert(buf, offset, len, bbuf, 0, bbuf.length);
+ int bufferNeeded = 0;
+ while (bbuflen > 0) {
+ out.write(bbuf, 0, bbuflen);
+ bbuflen = converter.flush(bbuf, 0, bbuf.length);
+ bufferNeeded += bbuflen;
+ }
+ if (bufferNeeded > bbuf.length) {
+ // increase size of array
+ bbuf = new byte[bufferNeeded];
+ }
}
} // class KaffEncoder
import java.io.*;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
public class PerfTest {
protected File dir;
protected int size;
protected String encoding;
protected String data;
protected List contexts = new ArrayList();
protected boolean verbose;
protected String label;
public PerfTest() {
this("<no-label>", new File("."), 50, "UTF-8");
}
public PerfTest(String label, File dir, int size, String encoding) {
this.label = label;
this.dir = dir;
this.size = size;
this.encoding = encoding;
}
public String label() {
return label;
}
public boolean verbose() {
return verbose;
}
public void setVerbose(boolean verbose) {
this.verbose = verbose;
}
public int size() {
return size;
}
public String encoding() {
return encoding;
}
public File dir() {
return dir;
}
public void writeTestData(File file) throws IOException {
}
public void add(Context ctx) {
contexts.add(ctx);
}
public void run() throws IOException {
Iterator it = contexts.iterator();
System.out.println(label() + ".size=" + size() + " megabyte" + (size() == 1 ? "" : "s"));
System.out.println(label() + ".encoding=" + encoding());
while (it.hasNext()) {
Context context = (Context) it.next();
long startInit = System.currentTimeMillis();
context.init(this);
long endInit = System.currentTimeMillis();
System.out.println(label() + '.' + context.type() + ".init=" + (endInit - startInit));
context.run(this);
}
}
public static void main(String[] args) {
String label = "<nolabel>";
File tmpDir = new File(".");
int size = 1;
String encoding = "UTF-8";
boolean verbose = false;
for (int i = 0; i < args.length; i++) {
if ("-label".equals(args[i])) {
label = args[++i];
} else if ("-tmp-dir".equals(args[i])) {
tmpDir = new File(args[++i]);
} else if ("-size".equals(args[i])) {
size = Integer.parseInt(args[++i]);
} else if ("-encoding".equals(args[i])) {
encoding = args[++i];
} else if ("-verbose".equals(args[i])) {
verbose = true;
} else if ("-no-verbose".equals(args[i])) {
verbose = false;
}
}
PerfTest perfTest = new PerfTest(label, tmpDir, size, encoding);
perfTest.setVerbose(verbose);
ReadContext readContext = new ReadContext();
// readContext.add(new RawInputStreamTest());
// readContext.add(new BufferedInputStreamTest());
readContext.add(new BufferedReaderTest());
readContext.add(new RawReaderTest());
WriteContext writeContext = new WriteContext();
// writeContext.add(new RawOutputStreamTest());
// writeContext.add(new BufferedOutputStreamTest());
writeContext.add(new BufferedWriterTest());
writeContext.add(new RawWriterTest());
perfTest.add(writeContext);
perfTest.add(readContext);
try {
perfTest.run();
} catch (IOException e) {
e.printStackTrace();
}
}
public interface Test {
String name();
}
public interface Context {
void init(PerfTest perfTest);
void run(PerfTest perfTest);
void add(Test test);
String type();
}
protected static abstract class ContextBase implements Context {
protected List tests = new ArrayList();
public void run(PerfTest perfTest) {
Iterator it = tests.iterator();
String label = perfTest.label() + '.' + type();
boolean verbose = perfTest.verbose();
while (it.hasNext()) {
Test test = (Test) it.next();
String name = test.name();
if (verbose) System.err.print(label + " running " + name + ':');
try {
long startSetup = System.currentTimeMillis();
if (verbose) System.err.print(" [setup]");
Object data = setup(perfTest, test);
long startTest = System.currentTimeMillis();
if (verbose) System.err.print(" [performing test]");
runTest(perfTest, test, data);
long endTest = System.currentTimeMillis();
if (verbose) System.err.print(" [tearDown]");
tearDown(data);
if (verbose) System.err.println(" done");
long endTearDown = System.currentTimeMillis();
System.out.println(label + '.' + name + ".setup=" + (startTest - startSetup));
System.out.println(label + '.' + name + ".runTest=" + (endTest - startTest));
System.out.println(label + '.' + name + ".tearDown=" + (endTearDown - endTest));
} catch (IOException e) {
System.out.println(" error!");
e.printStackTrace();
}
System.out.flush();
}
}
protected abstract void runTest(PerfTest perfTest, Test test, Object data) throws IOException;
protected abstract Object setup(PerfTest perfTest, Test test) throws IOException;
protected abstract void tearDown(Object data);
}
protected static class ReadContext extends ContextBase {
protected String data;
public String type() {
return "read";
}
public void init(PerfTest perfTest) {
StringBuffer testData = new StringBuffer();
testData.setLength(0);
testData.ensureCapacity(4096*256); // 1 meg
for (int i = 0; i < 4096; i++) {
testData.append((char) i);
}
for (int i = 0; i < 8; i++) {
testData.append(testData);
}
data = testData.toString();
}
protected Object setup(PerfTest perfTest, Test test) throws IOException {
return setup(perfTest, (ReadTest) test);
}
protected File setup(PerfTest perfTest, ReadTest test) throws IOException {
File tmpFile = File.createTempFile(perfTest.label() + "-" + test.name(), ".tmp", perfTest.dir());
Writer writer = new OutputStreamWriter(new FileOutputStream(tmpFile), perfTest.encoding());
for (int i = 0; i < perfTest.size(); i++) {
writer.write(data);
}
writer.close();
return tmpFile;
}
protected void tearDown(Object data) {
tearDown((File) data);
}
protected void tearDown(File file) {
file.delete();
}
protected void runTest(PerfTest perfTest, Test test, Object data) throws IOException {
runTest(perfTest, (ReadTest) test, (File) data);
}
protected void runTest(PerfTest perfTest, ReadTest readTest, File file) throws IOException {
readTest.readTest(perfTest, file);
}
public void add(Test test) {
add((ReadTest) test);
}
public void add(ReadTest test) {
tests.add(test);
}
}
public interface ReadTest extends Test {
void readTest(PerfTest perfTest, File file) throws IOException;
}
public static abstract class InputStreamTest implements ReadTest {
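public void readTest(PerfTest perfTest, File file) throws IOException {
// Close the stream when done so the temp file's descriptor is not leaked.
InputStream in = new FileInputStream(file);
try {
readTest(in);
} finally {
in.close();
}
}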
protected abstract void readTest(InputStream in) throws IOException;
}
public static class RawInputStreamTest extends InputStreamTest {
protected void readTest(InputStream in) throws IOException {
while (in.read() != -1);
}
public String name() {
return "RawInputStream";
}
}
public static class BufferedInputStreamTest extends InputStreamTest {
protected void readTest(InputStream in) throws IOException {
if (!(in instanceof BufferedInputStream)) in = new BufferedInputStream(in);
while (in.read() != -1);
}
public String name() {
return "BufferedInputStream";
}
}
public static abstract class ReaderTest implements ReadTest {
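public void readTest(PerfTest perfTest, File file) throws IOException {
// Close the reader when done so the temp file's descriptor is not leaked.
Reader in = new InputStreamReader(new FileInputStream(file), perfTest.encoding());
try {
readTest(in);
} finally {
in.close();
}
}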
protected abstract void readTest(Reader in) throws IOException;
}
public static class RawReaderTest extends ReaderTest {
protected void readTest(Reader in) throws IOException {
while (in.read() != -1);
}
public String name() {
return "RawReader";
}
}
public static class BufferedReaderTest extends ReaderTest {
protected void readTest(Reader in) throws IOException {
if (!(in instanceof BufferedReader)) in = new BufferedReader(in);
while (in.read() != -1);
}
public String name() {
return "BufferedReader";
}
}
protected static class WriteContext extends ContextBase {
protected char[] buffer;
public String type() {
return "write";
}
public void init(PerfTest perfTest) {
StringBuffer testData = new StringBuffer();
testData.setLength(0);
testData.ensureCapacity(4096*256); // 1 meg
for (int i = 0; i < 4096; i++) {
testData.append((char) i);
}
for (int i = 0; i < 8; i++) {
testData.append(testData);
}
buffer = new char[testData.length()];
testData.getChars(0, buffer.length, buffer, 0);
}
protected Object setup(PerfTest perfTest, Test test) throws IOException {
return setup(perfTest, (WriteTest) test);
}
protected File setup(PerfTest perfTest, WriteTest test) throws IOException {
return File.createTempFile(perfTest.label() + "-" + test.name(), ".tmp", perfTest.dir());
}
protected void tearDown(Object data) {
tearDown((File) data);
}
protected void tearDown(File file) {
file.delete();
}
protected void runTest(PerfTest perfTest, Test test, Object data) throws IOException {
runTest(perfTest, (WriteTest) test, (File) data, buffer);
}
protected void runTest(PerfTest perfTest, WriteTest writeTest, File file, char[] buffer) throws IOException {
writeTest.writeTest(perfTest, file, buffer);
}
public void add(Test test) {
add((WriteTest) test);
}
public void add(WriteTest test) {
tests.add(test);
}
}
public interface WriteTest extends Test {
void writeTest(PerfTest perfTest, File file, char[] buffer) throws IOException;
}
public static abstract class OutputStreamTest implements WriteTest {
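public void writeTest(PerfTest perfTest, File file, char[] buffer) throws IOException {
// Close the stream when done so the temp file's descriptor is not leaked.
OutputStream out = new FileOutputStream(file);
try {
writeTest(out, buffer);
} finally {
out.close();
}
}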
protected abstract void writeTest(OutputStream out, char[] buffer) throws IOException;
}
public static class RawOutputStreamTest extends OutputStreamTest {
protected void writeTest(OutputStream out, char[] buffer) throws IOException {
for (int i = 0; i < buffer.length; i++) {
out.write(buffer[i]);
}
out.flush();
}
public String name() {
return "RawOutputStream";
}
}
public static class BufferedOutputStreamTest extends OutputStreamTest {
protected void writeTest(OutputStream out, char[] buffer) throws IOException {
if (!(out instanceof BufferedOutputStream)) out = new BufferedOutputStream(out, 16384);
for (int i = 0; i < buffer.length; i++) {
out.write(buffer[i]);
}
out.flush();
}
public String name() {
return "BufferedOutputStream";
}
}
public static abstract class WriterTest implements WriteTest {
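public void writeTest(PerfTest perfTest, File file, char[] buffer) throws IOException {
// Close the writer when done so the temp file's descriptor is not leaked.
Writer out = new OutputStreamWriter(new FileOutputStream(file), perfTest.encoding());
try {
writeTest(out, buffer);
} finally {
out.close();
}
}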
protected abstract void writeTest(Writer out, char[] buffer) throws IOException;
}
public static class RawWriterTest extends WriterTest {
protected void writeTest(Writer out, char[] buffer) throws IOException {
for (int i = 0; i < buffer.length; i++) {
out.write(buffer[i]);
}
out.flush();
}
public String name() {
return "RawWriter";
}
}
public static class BufferedWriterTest extends WriterTest {
protected void writeTest(Writer out, char[] buffer) throws IOException {
if (!(out instanceof BufferedWriter)) out = new BufferedWriter(out, 16384);
for (int i = 0; i < buffer.length; i++) {
out.write(buffer[i]);
}
out.flush();
}
public String name() {
return "BufferedWriter";
}
}
}