Re: RFR: 6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available [v3]

2022-05-12 Thread John Hendrikx
On Thu, 5 May 2022 23:39:30 GMT, Brian Burkhalter  wrote:

>> Modify native multi-byte read-write code used by the `java.io` classes to 
>> limit the size of the allocated native buffer thereby decreasing off-heap 
>> memory footprint and increasing throughput.
>
> Brian Burkhalter has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - 6478546: Clean up io_util.c
>  - Merge
>  - 6478546: Decrease malloc'ed buffer maximum size to 64 kB
>  - 6478546: FileInputStream.read() throws OutOfMemoryError when there is 
> plenty available

src/java.base/share/native/libjava/io_util.c line 79:

> 77:   jint off, jint len, jfieldID fid)
> 78: {
> 79: char stackBuf[STACK_BUF_SIZE];

This was in the original code already, but doesn't this always allocate 8k on 
the stack, regardless of whether this buffer will be used (if len > 8k or len 
== 0)?  Wouldn't it be better to allocate this only in the `else` case?

Would apply to the write code as well.

src/java.base/share/native/libjava/io_util.c line 81:

> 79: char stackBuf[STACK_BUF_SIZE];
> 80: char *buf = NULL;
> 81: jint buf_size, read_size;;

minor: double semi colon

src/java.base/share/native/libjava/io_util.c line 129:

> 127: off += n;
> 128: } else if (n == -1) {
> 129: JNU_ThrowIOExceptionWithLastError(env, "Read error");

The original code would have `nread` set to `-1` here, and thus exit with `-1`. 
 The new code would exit with the last value for `nread` which could be 
anything.

src/java.base/share/native/libjava/io_util.c line 201:

> 199: write_size = len < buf_size ? len : buf_size;
> 200: (*env)->GetByteArrayRegion(env, bytes, off, write_size, 
> (jbyte*)buf);
> 201: if (!(*env)->ExceptionOccurred(env)) {

Wouldn't this result in an infinite loop if `GetByteArrayRegion` triggered an 
exception and `len > 0` ?  It would never enter the `if` and `len` is never 
changed then...

-

PR: https://git.openjdk.java.net/jdk/pull/8235


Re: [Very fast List/Deque to java.util?]

2022-06-13 Thread John Hendrikx

I took a look.

I found a few results odd:

|com.github.coderodde.util.IndexedLinkedList.addLast in (ms): 8 
java.util.LinkedList.addLast in (ms): 2 java.util.ArrayList.addLast in 
(ms): 157 org.apache.commons.collections4.list.TreeList.addLast in (ms): 38|


Basically, ArrayList's performance (which should be O(1) in this case) 
is surprising. Looking at the benchmark, it is calling 
ArrayList#add(int, E) -- this method is not special cased for adding at 
the end of the list (it will do a range check and call System#arrayCopy 
in all cases).


|com.github.coderodde.util.IndexedLinkedList.stream() in (ms): 1 
java.util.LinkedList.stream() in (ms): 20 java.util.ArrayList.stream() 
in (ms): 16 org.apache.commons.collections4.list.TreeList.stream() in 
(ms): 15|


This test isn't only measuring streaming (iteration?) but also 
re-inserting all elements back into a List 
(collect(Collectors.toList()). What I find odd is that the 
IndexedLinkedList would perform much better here than ArrayList, given 
that the time is probably dominated by the final collect, and not the 
iteration itself.  Is ArrayList#stream poorly optimized or is something 
else going on?


A pure iteration test would be interesting to see.

I also ran the benchmark on my machine. The benchmark on Github doesn't 
mention with what parameters it is run, and so when I run it I get quite 
different results. The committed version of the benchmark seems to use 
collections with a max size of 20 elements. The total time when I run 
the benchmark favors TreeList more than any other:


--- Total time elapsed ---
com.github.coderodde.util.IndexedLinkedList in (ms): 207
java.util.LinkedList in (ms): 4248
java.util.ArrayList in (ms): 1799
org.apache.commons.collections4.list.TreeList in (ms): 159

That said, I think there might be a place for a new list implementation 
in the JDK.  One use case I've had in the past was for a list that is 
always sorted but also allows index based access when displaying a 
window into the list. A TreeMap satisfies the sorting criteria but 
doesn't offer index access, while a plain ArrayList doesn't lend itself 
well for doing sorted inserts/removals (locating the correct location is 
trivial enough, but the actual insert or removal results in a 
potentially large copy).  Apache's TreeList is fairly good at this use case.


--John


On 13/06/2022 09:56, Rodion Efremov wrote:

Hello,

I have this List/Deque implementation [1] that (in a versatile benchmark)
runs much faster than ArrayList/LinkedList. Under mild assumptions, it
accesses an element in O(sqrt(N)) time.

Now, if all we want to do is to add at the tail and read via get(int
index), you are better of using java.util.ArrayList. If you wish to iterate
a list removing some elements, the way to go is to use java.util.LinkedList.

  However,, if you deal with larger lists via many different operations
(addFirst/addLast/add(int, E)/get(int)/remove(int)/ettc. my
IndexedLinkedLiist will outperform both of them gracefully.

So, what do you think? Does it deserve to become a class candidate for
java.util?

[1]https://github.com/coderodde/IndexedLinkedList