[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

ASF GitHub Bot (JIRA) Sun, 23 Apr 2017 09:25:14 -0700

    [ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980454#comment-15980454
 ]


ASF GitHub Bot commented on COMPRESS-388:
-----------------------------------------

Github user bodewig commented on a diff in the pull request:

    https://github.com/apache/commons-compress/pull/21#discussion_r112838584
  
    --- Diff: 
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java ---
    @@ -1111,14 +1122,11 @@ public int read() throws IOException {
                     }
                     return -1;
                 }
    -            synchronized (archive) {
    -                archive.position(loc++);
    -                int read = read(1);
    -                if (read < 0) {
    -                    return read;
    -                }
    -                return buffer.get() & 0xff;
    +            int read = read(loc++, 1);
    +            if (read < 0) {
    --- End diff --
    
    I think it depends on what we think `synchronized` is supposed to protect 
against.
    
    As it stands the `synchronized` block also protected concurrent reads from 
the same `BoundedInputStream` from overwriting their results (protecting `loc` 
and `buffer` not just the file position).
    
    If we want to keep that level of protection the changes are not going to 
work. If we say each input stream returned by `getInputStream` can only be used 
by a single thread, then I think it is fine to keep the increment outside of 
the block and only synchronise the code that protects the interaction between 
different streams.
    



> Improve concurrent reads from ZipFile
> -------------------------------------
>
>                 Key: COMPRESS-388
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-388
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>    Affects Versions: 1.13
>         Environment: Any
>            Reporter: Zbynek Vyskovsky
>              Labels: patch, performance
>             Fix For: 1.14
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Concurrent reads on the ZipFile archive is terribly slow on multiprocessor 
> systems. On my 4 CPU laptop it shows 26 reads/s vs 2 reads/s on 100MB samples 
> for example.
> The cause is the use of synchronized blocks to access the underlying file 
> channel. This may be required for generic SeekableByteChannel but most 
> commonly there is FileChannel implementation which supports lock-free reading 
> from any position (i.e. using pread/pwrite system calls or their equivalent).
> With the fix the performance is about 10 times faster (on 4 CPU system, with 
> more processor the difference should grow significantly).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

Reply via email to