[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980711#comment-15980711
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user kvr000 commented on the issue:

https://github.com/apache/commons-compress/pull/21
  
Additionally, I was thinking about exposing each entry's raw stream starting 
offset and length via the public API, so that when needed one can either map 
the data into memory or access the raw bytes directly. This is especially 
useful when a zip is used as simple flat storage, which is quite popular in 
games, though not only there. For me it would help to implement off-heap 
read-only storage using a standard file format that is widely supported by 
many tools.

It's quite zip-specific (although it can be applied to similar containers too), 
but the API already has a lot of zip-specific stuff anyway. That piece of 
information would only have to be moved from ZipFile.Entry to ZipFileEntry. 
What do you think about it?
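
To illustrate the idea: a minimal sketch, assuming the entry is stored 
uncompressed and that its raw data offset and length are already known. 
RawEntryMapping, mapRawEntry, dataOffset and length are hypothetical names used 
only for illustration; they are not part of the Commons Compress API. The only 
real call relied on is the JDK's FileChannel.map.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class RawEntryMapping {

    /** Maps length bytes starting at dataOffset read-only; the mapped memory is off-heap. */
    static MappedByteBuffer mapRawEntry(Path archive, long dataOffset, long length) throws IOException {
        try (FileChannel channel = FileChannel.open(archive, StandardOpenOption.READ)) {
            // A mapping, once established, stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, dataOffset, length);
        }
    }

    /** Self-check on a temporary file standing in for an archive with one stored entry. */
    static String demo() {
        try {
            Path file = Files.createTempFile("raw-entry", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, "HEADERpayloadTRAILER".getBytes(StandardCharsets.US_ASCII));
            ByteBuffer mapped = mapRawEntry(file, 6, 7); // hypothetical offset and length
            byte[] out = new byte[mapped.remaining()];
            mapped.get(out);
            return new String(out, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the offset and length exposed per entry, a caller could build such a 
mapping once and share the read-only buffer between threads via duplicate().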


> Improve concurrent reads from ZipFile
> -
>
> Key: COMPRESS-388
> URL: https://issues.apache.org/jira/browse/COMPRESS-388
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Affects Versions: 1.13
> Environment: Any
>Reporter: Zbynek Vyskovsky
>  Labels: patch, performance
> Fix For: 1.14
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Concurrent reads on a ZipFile archive are terribly slow on multiprocessor 
> systems. On my 4 CPU laptop it shows 26 reads/s vs 2 reads/s on 100MB samples, 
> for example.
> The cause is the use of synchronized blocks to access the underlying file 
> channel. This may be required for a generic SeekableByteChannel, but most 
> commonly the channel is a FileChannel implementation, which supports lock-free 
> reading from any position (i.e. using pread/pwrite system calls or their 
> equivalent).
> With the fix, reads are about 10 times faster (on a 4 CPU system; with 
> more processors the difference should grow significantly).
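
The lock-free approach described in the issue can be sketched as follows. This 
is a minimal illustration, not the code from the pull request: PositionalReads 
and readFully are hypothetical names, and the only API assumed is the JDK's 
FileChannel.read(ByteBuffer, long). That method reads at an absolute position 
without modifying the channel's own position, so concurrent readers do not need 
a shared lock.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PositionalReads {

    /** Reads exactly len bytes at absolute position pos; the channel position is not moved. */
    static byte[] readFully(FileChannel channel, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos + buf.position());
            if (n < 0) {
                throw new IOException("Unexpected end of file at " + (pos + buf.position()));
            }
        }
        return buf.array();
    }

    /** Two threads read different regions of one shared channel without synchronization. */
    static String demo() {
        try {
            Path file = Files.createTempFile("positional", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, "aaaabbbb".getBytes(StandardCharsets.US_ASCII));
            String[] results = new String[2];
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                Thread[] readers = new Thread[2];
                for (int i = 0; i < readers.length; i++) {
                    final int index = i;
                    readers[i] = new Thread(() -> {
                        try {
                            byte[] data = readFully(channel, index * 4L, 4);
                            results[index] = new String(data, StandardCharsets.US_ASCII);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    });
                    readers[i].start();
                }
                for (Thread reader : readers) {
                    reader.join(); // join makes the results visible to this thread
                }
            }
            return results[0] + "," + results[1];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```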



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980703#comment-15980703
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user kvr000 commented on the issue:

https://github.com/apache/commons-compress/pull/21
  
Update, improving a few things:
- made the fields private again
- simplified to a single read(long pos, ByteBuffer buf) method
- allocating the instance buffer only for single-byte reads, which are rather 
rare (so far it seems that only Bzip2 uses them, for reading the header); all 
other decompressors and even the standard Java core readers use an internal 
cache
- wrapping the byte array passed as a parameter instead of creating a 
temporary ByteBuffer and copying the bytes
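
The last point can be illustrated with a short sketch. It is not the pull 
request's code: WrapInsteadOfCopy and fill are hypothetical names, and the 
channel read is simulated by a plain put. The point is only that 
ByteBuffer.wrap(b, off, len) creates a view over the caller's array, so bytes 
written into the buffer land directly in b without a temporary buffer or an 
extra copy.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class WrapInsteadOfCopy {

    /** Fills b[off .. off+len) through a ByteBuffer view; returns the number of bytes written. */
    static int fill(byte[] b, int off, int len, byte[] source) {
        ByteBuffer view = ByteBuffer.wrap(b, off, len); // shares b as its backing array
        int n = Math.min(view.remaining(), source.length);
        view.put(source, 0, n); // stands in for a positional channel read into the view
        return n;
    }

    static String demo() {
        byte[] target = "........".getBytes(StandardCharsets.US_ASCII);
        int n = fill(target, 2, 4, "zipdata".getBytes(StandardCharsets.US_ASCII));
        return n + ":" + new String(target, StandardCharsets.US_ASCII);
    }
}
```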

Performance improved in comparison to previous commit:
- 15%-20% for stored stream
- 1%-3% for deflated stream





[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980697#comment-15980697
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user coveralls commented on the issue:

https://github.com/apache/commons-compress/pull/21
  

[![Coverage 
Status](https://coveralls.io/builds/11205671/badge)](https://coveralls.io/builds/11205671)

Coverage increased (+0.04%) to 84.277% when pulling 
**283c5910712623351139d2bf3ea7a39065a00b13 on 
kvr000:feature/COMPRESS-388-concurrent-reads-performance-fix** into 
**13a039029ca7d7fca9862cfb792f7148c555f05f on apache:master**.





[jira] [Comment Edited] (COMPRESS-389) Inconsistent increment of 'loc' in ZipFile.BoundedInputStream

2017-04-23 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980537#comment-15980537
 ] 

Sebb edited comment on COMPRESS-389 at 4/23/17 10:04 PM:
-

It looks like the read() method is never called at logical EOF, so it always 
reads one byte.


was (Author: s...@apache.org):
It looks like the read() method is never called at logical EOF, so it always 
reads at least one byte.

> Inconsistent increment of 'loc' in ZipFile.BoundedInputStream
> -
>
> Key: COMPRESS-389
> URL: https://issues.apache.org/jira/browse/COMPRESS-389
> Project: Commons Compress
>  Issue Type: Bug
>Reporter: Sebb
>
> ZipFile.BoundedInputStream.read() always increments loc, even if no bytes are 
> read.
> However ZipFile.BoundedInputStream.read(byte[] b, int off, int len) only 
> increments the field if some bytes were read.
> This looks wrong.
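
A consistent variant could look like the sketch below, where both read 
overloads advance the position only when bytes were actually read. This is a 
hypothetical stand-in written against java.io.InputStream, not the actual 
ZipFile.BoundedInputStream; the class name and the loc() accessor are 
assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in, not the actual ZipFile.BoundedInputStream. */
final class BoundedStreamSketch extends InputStream {
    private final InputStream in;
    private long remaining;
    private long loc; // number of bytes consumed so far

    BoundedStreamSketch(InputStream in, long limit) {
        this.in = in;
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // logical EOF: loc is left untouched
        }
        int b = in.read();
        if (b >= 0) { // advance only when a byte was really read
            loc++;
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        int n = in.read(b, off, (int) Math.min(len, remaining));
        if (n > 0) { // same rule as the single-byte overload
            loc += n;
            remaining -= n;
        }
        return n;
    }

    long loc() {
        return loc;
    }

    /** Reads past the 3-byte limit with both overloads and returns the final position. */
    static long demo() {
        try (BoundedStreamSketch s = new BoundedStreamSketch(new ByteArrayInputStream(new byte[10]), 3)) {
            s.read();
            s.read(new byte[8], 0, 8);
            s.read();                  // at logical EOF
            s.read(new byte[8], 0, 8); // at logical EOF
            return s.loc();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```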





[jira] [Commented] (COMPRESS-389) Inconsistent increment of 'loc' in ZipFile.BoundedInputStream

2017-04-23 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980537#comment-15980537
 ] 

Sebb commented on COMPRESS-389:
---

It looks like the read() method is never called at logical EOF, so it always 
reads at least one byte.



[jira] [Commented] (NUMBERS-22) Method reciprocal() in Complex for complex numbers with parts very close to 0.0

2017-04-23 Thread Raymond DeCampo (JIRA)

[ 
https://issues.apache.org/jira/browse/NUMBERS-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980515#comment-15980515
 ] 

Raymond DeCampo commented on NUMBERS-22:


The argument for using {{Complex.INF}} is that you then end up with the Riemann 
sphere, aka stereographic projection of the sphere onto the complex plane.

The alternative is to have eight infinities: (±Inf, 0), (0, ±Inf) and (±Inf, 
±Inf).  (In case whatever you are reading this comment with is mangling it, 
that symbol is plus/minus.)  There are probably other alternatives that escape 
my imagination.

Eric Barnhill recently undertook an effort to make the Complex implementation 
conform to IEEE and/or C99 standards; perhaps we should wait for his input.

> Method reciprocal() in Complex for complex numbers with parts very close to 
> 0.0
> ---
>
> Key: NUMBERS-22
> URL: https://issues.apache.org/jira/browse/NUMBERS-22
> Project: Commons Numbers
>  Issue Type: Improvement
>Reporter: Gunel Jahangirova
>Priority: Minor
>
> I have been redirected here from the issue repository of Apache Commons Math, 
> as the Complex class will likely be deprecated in favour of its equivalent in 
> "Commons Numbers".
> In class Complex, the method reciprocal() returns INF only if the real and 
> imaginary parts are exactly equal to 0.0. When the real and imaginary parts 
> are doubles very close to 0.0, INF is not returned. For example, if we run 
> this code
> {code}
> Complex complex0 = new Complex((-2.44242319E-315));
> Complex complex1 = complex0.reciprocal();
> {code}
> the value of complex1.getReal() will be -Infinity and the value of 
> complex1.getImaginary() will be NaN, instead of complex1 being equal to INF.
> The solutions suggested in the discussion are either to check equality with 
> ZERO using some tolerance, or to detect when the real or imaginary part of 
> the result is going to be infinite or NaN and then return the proper result.
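
The arithmetic behind that result can be reproduced without the library. The 
sketch below assumes the reciprocal is computed with a scaled quotient (a 
common way to implement complex division); it is not a copy of the Complex 
source, and ReciprocalSketch is a hypothetical name. For a real part of 
-2.44242319E-315 and a zero imaginary part, 1/real overflows to -Infinity, and 
multiplying an infinity by the signed-zero quotient yields NaN.

```java
final class ReciprocalSketch {

    /** Returns {real, imaginary} of 1 / (re + im*i) using a scaled quotient. */
    static double[] reciprocal(double re, double im) {
        if (re == 0.0 && im == 0.0) {
            // Exact-zero check as described in the issue; stand-in for Complex.INF.
            return new double[] {Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY};
        }
        if (Math.abs(re) < Math.abs(im)) {
            double q = re / im;
            double scale = 1.0 / (re * q + im);
            return new double[] {scale * q, -scale};
        }
        double q = im / re;                      // 0.0 / negative subnormal = -0.0
        double scale = 1.0 / (im * q + re);      // 1 / subnormal overflows to -Infinity
        return new double[] {scale, -scale * q}; // Infinity * -0.0 = NaN
    }
}
```

Calling reciprocal(-2.44242319E-315, 0.0) takes the second branch, which is 
why a check for exact zero alone does not catch this input.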





[jira] [Commented] (IO-367) Add convenience methods for copyToDirectory

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980504#comment-15980504
 ] 

ASF GitHub Bot commented on IO-367:
---

GitHub user PascalSchumacher opened a pull request:

https://github.com/apache/commons-io/pull/34

IO-367: Add convenience methods for copyToDirectory (closes #18)

patch supplied by James Sawle

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PascalSchumacher/commons-io copyToDirectory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/commons-io/pull/34.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #34


commit 861a4e87e19ec717bef84ec5e37b0b745a611300
Author: Pascal Schumacher 
Date:   2017-04-23T19:02:29Z

IO-367: Add convenience methods for copyToDirectory (closes #18)

patch supplied by James Sawle




> Add convenience methods for copyToDirectory
> ---
>
> Key: IO-367
> URL: https://issues.apache.org/jira/browse/IO-367
> Project: Commons IO
>  Issue Type: New Feature
>  Components: Utilities
>Affects Versions: 2.5
>Reporter: Cornelius Lilge
>Priority: Minor
>  Labels: features
> Attachments: IO-367.patch
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> I suggest adding the following convenience methods:
> First:
> {{void copyToDirectory(final File src, final File destDir)}} which will 
> simply select either
> {{copyFileToDirectory}}
> or
> {{copyDirectoryToDirectory}}.
> Second:
> {{void copyToDirectory(final Collection srcs, final File destDir)}} 
> which will simply use {{copyToDirectory}} for each file object.
> Implementation of these methods should be straightforward, as they would only 
> recombine methods that already exist and are tested.
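
The proposed dispatch can be sketched as follows. To keep the example runnable 
without Commons IO, copyFileToDirectory and copyDirectoryToDirectory are 
simplified stand-ins written with java.nio.file; an actual patch would 
presumably delegate to the existing FileUtils methods of the same names. 
CopyToDirectorySketch and demo are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.Collection;
import java.util.stream.Stream;

final class CopyToDirectorySketch {

    /** Dispatches to the file or directory variant depending on what src is. */
    static void copyToDirectory(File src, File destDir) throws IOException {
        if (src.isFile()) {
            copyFileToDirectory(src, destDir);
        } else if (src.isDirectory()) {
            copyDirectoryToDirectory(src, destDir);
        } else {
            throw new IOException("Source '" + src + "' does not exist");
        }
    }

    /** Applies the single-source variant to every element. */
    static void copyToDirectory(Collection<File> srcs, File destDir) throws IOException {
        for (File src : srcs) {
            copyToDirectory(src, destDir);
        }
    }

    // Stand-in for FileUtils.copyFileToDirectory.
    static void copyFileToDirectory(File srcFile, File destDir) throws IOException {
        Files.createDirectories(destDir.toPath());
        Files.copy(srcFile.toPath(), destDir.toPath().resolve(srcFile.getName()),
                StandardCopyOption.REPLACE_EXISTING);
    }

    // Stand-in for FileUtils.copyDirectoryToDirectory: copies srcDir as a child of destDir.
    static void copyDirectoryToDirectory(File srcDir, File destDir) throws IOException {
        Path source = srcDir.toPath();
        Path target = destDir.toPath().resolve(srcDir.getName());
        try (Stream<Path> paths = Files.walk(source)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    /** Copies one file and one directory into a fresh destination and checks the result. */
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("copy-sketch");
            Path file = Files.write(root.resolve("a.txt"), new byte[] {1});
            Path dir = Files.createDirectories(root.resolve("sub"));
            Files.write(dir.resolve("b.txt"), new byte[] {2});
            File dest = root.resolve("dest").toFile();
            copyToDirectory(Arrays.asList(file.toFile(), dir.toFile()), dest);
            return new File(dest, "a.txt").isFile() && new File(dest, "sub/b.txt").isFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```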





[jira] [Closed] (IO-472) FileUtils.openOutputStream doesn't create file if it doesn't exist

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-472.

Resolution: Incomplete

> FileUtils.openOutputStream doesn't create file if it doesn't exist
> --
>
> Key: IO-472
> URL: https://issues.apache.org/jira/browse/IO-472
> Project: Commons IO
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.4
>Reporter: David M. Karr
>
> The javadoc for this method has a pretty unambiguous statement: "The file 
> will be created if it does not exist."  However, this isn't happening.  The 
> code is pretty clear on this:
> {code:java}
> public static FileOutputStream openOutputStream(File file, boolean append)
>         throws IOException {
>     if (file.exists()) {
>         if (file.isDirectory()) {
>             throw new IOException("File '" + file + "' exists but is a directory");
>         }
>         if (file.canWrite() == false) {
>             throw new IOException("File '" + file + "' cannot be written to");
>         }
>     } else {
>         File parent = file.getParentFile();
>         if (parent != null) {
>             if (!parent.mkdirs() && !parent.isDirectory()) {
>                 throw new IOException("Directory '" + parent + "' could not be created");
>             }
>         }
>     }
>     return new FileOutputStream(file, append);
> }
> {code}
> If it doesn't exist, it will just try to create a FileOutputStream, which 
> throws a FileNotFoundException.
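
For comparison, the JDK behaviour can be checked in isolation. A minimal 
sketch, assuming a writable temporary directory; OpenMissingFile is a 
hypothetical name. According to the java.io.FileOutputStream documentation, the 
constructor throws FileNotFoundException only if the path is a directory, 
cannot be created, or cannot be opened for some other reason, so the behaviour 
in the report would depend on one of those conditions.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

final class OpenMissingFile {

    /** Opens a stream on a file that does not exist yet and reports whether it exists afterwards. */
    static boolean createsMissingFile() {
        try {
            File dir = Files.createTempDirectory("open-missing").toFile();
            File file = new File(dir, "new-file.txt");
            boolean existedBefore = file.exists();
            try (FileOutputStream out = new FileOutputStream(file, false)) {
                out.write('x');
            }
            return !existedBefore && file.isFile() && file.length() == 1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```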





[jira] [Resolved] (IO-442) Javadoc contradictory for FileFilterUtils.ageFileFilter(cutoff) and the filter it constructs: AgeFileFilter(cutoff)

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher resolved IO-442.
--
   Resolution: Fixed
 Assignee: Pascal Schumacher
Fix Version/s: 2.6

The javadoc of AgeFileFilter is correct. I have just fixed the FileFilterUtils 
javadoc. Thanks for reporting!

> Javadoc contradictory for FileFilterUtils.ageFileFilter(cutoff) and the 
> filter it constructs: AgeFileFilter(cutoff)
> ---
>
> Key: IO-442
> URL: https://issues.apache.org/jira/browse/IO-442
> Project: Commons IO
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 2.4
>Reporter: Simon Robinson
>Assignee: Pascal Schumacher
>Priority: Trivial
> Fix For: 2.6
>
>
> The documentation states that it returns true if the file is *after* the 
> cutoff... but the code does the opposite!
> {code}
> /**
>  * Returns a filter that returns true if the file was last modified after
>  * the specified cutoff time.
>  */
> {code}
> BUT..the code constructs the following:
> {code}
> public static IOFileFilter ageFileFilter(long cutoff) {
> return new AgeFileFilter(cutoff);
> }
> {code}
> And the Javadoc for this AgeFileFilter says...OLDER i.e. before
> {code}
> /**
>  * Constructs a new age file filter for files equal to or older than
>  * a certain cutoff
>  *
>  * @param cutoff  the threshold age of the files
>  */
> {code}
> Which is it?!





[jira] [Closed] (IO-430) IOUtils.copy(IS, Writer) implementation and javadoc disagrees

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-430.

Resolution: Invalid

> IOUtils.copy(IS, Writer) implementation and javadoc disagrees
> -
>
> Key: IO-430
> URL: https://issues.apache.org/jira/browse/IO-430
> Project: Commons IO
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.5
>Reporter: Bernd Eckenfels
>Priority: Minor
>  Labels: javadoc
>
> The Javadoc of the (deprecated) IOUtils.copy(InputStream,Writer) states:
> {code}
>   * This method uses {@link InputStreamReader}.
> {code}
> but the actual code does not:
> {code}
> @Deprecated
> public static void copy(final InputStream input, final Writer output)
> throws IOException {
> copy(input, output, Charset.defaultCharset());
> }
> {code}
> My suggestion would be to change the javadoc to state "{@link 
> copy(InputStream, Writer, Charset)} with {@code Charset.defaultCharset()}"
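
What such a delegation typically looks like can be sketched with JDK classes 
only. This is an illustrative stand-in, not the Commons IO source: CopySketch 
is a hypothetical name, and it is an assumption here that the charset-taking 
overload is the one that wraps the stream in an InputStreamReader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CopySketch {

    /** Convenience overload: only picks the platform default charset and delegates. */
    static void copy(InputStream input, Writer output) throws IOException {
        copy(input, output, Charset.defaultCharset());
    }

    /** Decodes the bytes with an InputStreamReader and writes the characters out. */
    static void copy(InputStream input, Writer output, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(input, charset);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            output.write(buffer, 0, n);
        }
    }

    static String demo() {
        try {
            StringWriter out = new StringWriter();
            byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
            copy(new ByteArrayInputStream(utf8), out, StandardCharsets.UTF_8);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```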





[jira] [Closed] (IO-411) moveFile throws Exception prematurely?

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-411.

Resolution: Invalid

Hi Nick,

thanks for reporting.

The implementation is correct. FileUtils.deleteQuietly is called with the 
destination file (not the source file).

> moveFile throws Exception prematurely?
> --
>
> Key: IO-411
> URL: https://issues.apache.org/jira/browse/IO-411
> Project: Commons IO
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.4
>Reporter: Nick
>Priority: Minor
>
> This may not be an issue, but I noticed that the moveFile command throws an 
> exception after trying and ignoring the return value of deleteQuietly. Look 
> at line 2969 below.
> Taken from SVN head:
> {code}
> 2965 final boolean rename = srcFile.renameTo(destFile);
> 2966 if (!rename) {
> 2967     copyFile( srcFile, destFile );
> 2968     if (!srcFile.delete()) {
> 2969         FileUtils.deleteQuietly(destFile);
> 2970         throw new IOException("Failed to delete original file '" + srcFile +
> 2971                 "' after copy to '" + destFile + "'");
> 2972     }
> 2973 }
> {code}
> deleteQuietly will just end up trying File.delete() again which will likely 
> fail at that point, but still, shouldn't there be another if statement there?
> Note: Haven't actually had issues with this.





[jira] [Updated] (IO-411) moveFile throws Exception prematurely?

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher updated IO-411:
-
Summary: moveFile throws Exception prematurely?  (was: moveFile throws 
Exception prematurely)



[jira] [Updated] (IO-411) moveFile throws Exception prematurely

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher updated IO-411:
-
Summary: moveFile throws Exception prematurely  (was: moveFile throws 
Exception prematurely?)



[jira] [Closed] (IO-253) Test Failures using IBM JDK

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-253.

   Resolution: Fixed
Fix Version/s: 2.5

Should have been fixed by 
https://github.com/apache/commons-io/commit/e23402c1dc133842c1acef0a2d7cd1f386647de7

> Test Failures using IBM JDK
> ---
>
> Key: IO-253
> URL: https://issues.apache.org/jira/browse/IO-253
> Project: Commons IO
>  Issue Type: Bug
>Affects Versions: 2.0
>Reporter: Niall Pemberton
>Priority: Minor
> Fix For: 2.5
>
>
> Jörg Schaible reports the following test failures in RC5 of Commons IO 2.0
> * http://markmail.org/message/eoo5bk6n3pfsvfwk
> "IBM JDK 1.5 does not like the OOME handling in the FileCleanerTestCase and 
> FileCleanerTrackingTestCase, while IBM JDK 1.6 has problems with the 
> WriterOutputStream handling UTF16."





[jira] [Closed] (IO-200) CSV component

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-200.

Resolution: Won't Fix

I'm closing this because https://commons.apache.org/proper/commons-csv/ already 
exists.


[jira] [Updated] (IO-200) CSV component

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher updated IO-200:
-
Fix Version/s: (was: 3.x)

> CSV component
> -
>
> Key: IO-200
> URL: https://issues.apache.org/jira/browse/IO-200
> Project: Commons IO
>  Issue Type: New Feature
>  Components: Utilities
>Reporter: haruhiko nishi
>Priority: Trivial
>
> TableBuilder is a 'Builder' that maps the CSV to a matrix and provides an 
> interface that allows the user to manipulate the matrix after it is built by 
> passing a CSV file to its parse() method. (There is only one method 
> implemented, and it is for copying a column's values to another position, as 
> I could not think of another operation that may be useful.)
> Within the TableBuilder, each column of the CSV is represented as byte[] and 
> each becomes a target to be validated against a Rule, represented by the 
> interface that you find in the example code below. As TableBuilder builds 
> the table when the parse() method is invoked, a byte[] representation of the 
> value of each CSV cell is passed to the isValid() method of the Rule 
> implementations, which you apply to the TableBuilder instance through the 
> addRule() method. (You can add as many Rules as you need.)
> Rules get executed until the validation fails or succeeds. If any of the 
> Rules fails, then its replace() is called and the column value being 
> processed gets replaced by the return value of this method.
> Another goodie is that it is possible to refer to the values of the 
> preceding cells of the row within a Rule.
> This is useful if you need to see the entries of the preceding cells when 
> validating the value in a Rule. An example would be:
> Given a CSV,
> A,B,C
> 1,2,3
> in order for the value 3 of column C to be validated as true, the value 
> of A needs to be less than the value of C.
> TableBuilder is RFC 4180 compliant and therefore distinguishes between an NL 
> that exists by itself and an NL found in double quotes.
> So you can add a Rule that practically removes all NL chars found in a value 
> enclosed within double quotes. 
> (Useful when you need to remove CRLF in double quotes from a CSV exported 
> from Excel.)
> Currently, TableBuilder implements a method called copyColumn with the method 
> signature
> copyColumn(Rule rule,int from,int to, boolean override), which allows the 
> user to manipulate the parsed CSV.
> What it does is literally copy the column specified at 'from' to the 'to' 
> position of the matrix.
> If override is true, the column at the 'to' position is overridden; otherwise 
> the column is right-shifted and inserted at the specified position.
> You can specify some kind of Rule here to rewrite the value being copied from 
> the origin.
> An example would be to copy column values that all end with .jpg or .gif to 
> the specified position, prefixing the column value with 
> "http://some.server.com/imanges." after checking that the named file exists 
> at some location, also by an implementation of another Rule.
> TableBuilder is just a "rough sketch" idea of CSV parsing. (The code below 
> works fine, though.) It still needs a lot of refactoring and so on.
> I appreciate any comment on this idea. What do you think? My code style sucks, 
> I know! 
> Here is a simple example of using TableBuilder.
> {code:title=TableBuilder|borderStyle=solid}
> public static void main(String[] args) throws Exception {
>     TableBuilder tableBuilder = new TableBuilder("UTF-8",
>         new MessageHandler() {
>             public void handleMessage(String message) {
>                 System.err.println(message);
>             }
>         }, 0, true);
>     // removing NL characters found in value.
>     tableBuilder.addRule(3, new RemoveNLChars());
>     tableBuilder.parse(new FileInputStream("test.txt"), TableBuilder.CSV);
>     List<Record> list = tableBuilder.getRowAsListOf(Record.class);
>     for (Record record : list)
>         System.out.println(record.getA()); // TODO not implemented yet!
>     tableBuilder.writeTo(new FileOutputStream("test_mod.txt"), TableBuilder.CSV);
> }
> public class RemoveNLChars extends StringValueRuleAdapter {
>     protected boolean isValid(String columnValue) {
>         return !columnValue.contains(System.getProperty("line.separator"));
>     }
>     protected String replace(String columnValue) {
>         return columnValue.replaceAll(System.getProperty("line.separator"), "");
>     }
>     public String getMessage() {
>         return "";
>     }
> }
> public interface Rule {
>     public void setRowReference(List rowReference);
>     public void setCharsetName(String charsetName);
>     boolean isValid(final byte[] columnValue);
>     byte[] replace(final byte[] columnValue);
>     String getMessage();
> }
> //StringValueruleAdapter is an

[jira] [Updated] (IO-533) Introduce Tailable interface

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher updated IO-533:
-
Summary: Introduce Tailable interface  (was: Introduce Tailable interface 
to allow tailing of files accessed using jCIFS)

> Introduce Tailable interface
> 
>
> Key: IO-533
> URL: https://issues.apache.org/jira/browse/IO-533
> Project: Commons IO
>  Issue Type: New Feature
>Reporter: Pascal Schumacher
>
> suggested in https://github.com/apache/commons-io/pull/32



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (NUMBERS-27) In Complex, replace hand-coded hypot with Java.lang.Math.hypot

2017-04-23 Thread Eric Barnhill (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUMBERS-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Barnhill updated NUMBERS-27:
-
Summary: In Complex, replace hand-coded hypot with Java.lang.Math.hypot  
(was: In Complex, replace homegrown hypot with Java.Math.hypot)

> In Complex, replace hand-coded hypot with Java.lang.Math.hypot
> --
>
> Key: NUMBERS-27
> URL: https://issues.apache.org/jira/browse/NUMBERS-27
> Project: Commons Numbers
>  Issue Type: Improvement
>Reporter: Eric Barnhill
>Priority: Trivial
>
> The ISO C standard for complex numbers states that abs() must be computed 
> with a hypot() function to avoid overflow and underflow. The function is 
> correctly hand-coded in the current iteration of Complex, but I see no need 
> for this. I propose replacing it with java.lang.Math.hypot to make the code 
> and its conformance to ISO C easier to understand.





[jira] [Created] (NUMBERS-27) In Complex, replace homegrown hypot with Java.Math.hypot

2017-04-23 Thread Eric Barnhill (JIRA)
Eric Barnhill created NUMBERS-27:


 Summary: In Complex, replace homegrown hypot with Java.Math.hypot
 Key: NUMBERS-27
 URL: https://issues.apache.org/jira/browse/NUMBERS-27
 Project: Commons Numbers
  Issue Type: Improvement
Reporter: Eric Barnhill
Priority: Trivial


The ISO C standard for complex numbers states that abs() must be computed with 
a hypot() function to avoid overflow and underflow. The function is correctly 
hand-coded in the current iteration of Complex, but I see no need for this. 
I propose replacing it with java.lang.Math.hypot to make the code and its 
conformance to ISO C easier to understand.
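
To illustrate the overflow and underflow concern, here is a minimal sketch comparing a naive modulus with java.lang.Math.hypot. The class and method names (HypotDemo, naiveAbs, hypotAbs) are made up for this example and are not taken from Complex; the naive version is deliberately the unsafe formula, not the existing hand-coded routine.

```java
final class HypotDemo {
    // Naive modulus: the intermediate squares can overflow to infinity
    // or underflow to zero even when the true result is representable.
    static double naiveAbs(double re, double im) {
        return Math.sqrt(re * re + im * im);
    }

    // Modulus via the JDK routine, which is specified to avoid
    // intermediate overflow or underflow.
    static double hypotAbs(double re, double im) {
        return Math.hypot(re, im);
    }

    public static void main(String[] args) {
        // Large components: the squares exceed Double.MAX_VALUE.
        System.out.println(naiveAbs(3e200, 4e200));
        System.out.println(hypotAbs(3e200, 4e200));
        // Tiny components: the squares underflow to zero.
        System.out.println(naiveAbs(3e-200, 4e-200));
        System.out.println(hypotAbs(3e-200, 4e-200));
    }
}
```

With these inputs the naive formula loses the result entirely (infinity in the first case, zero in the second), while Math.hypot stays close to 5e200 and 5e-200.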





[jira] [Created] (COMPRESS-389) Inconsistent increment of 'loc' in ZipFile.BoundedInputStream

2017-04-23 Thread Sebb (JIRA)
Sebb created COMPRESS-389:
-

 Summary: Inconsistent increment of 'loc' in 
ZipFile.BoundedInputStream
 Key: COMPRESS-389
 URL: https://issues.apache.org/jira/browse/COMPRESS-389
 Project: Commons Compress
  Issue Type: Bug
Reporter: Sebb


ZipFile.BoundedInputStream.read() always increments loc, even if no bytes are 
read.

However ZipFile.BoundedInputStream.read(byte[] b, int off, int len) only 
increments the field if some bytes were read.

This looks wrong.
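
One consistent behaviour would be for both read variants to advance loc only by the number of bytes actually read. The following is a rough sketch of that idea, not the ZipFile code: BoundedStreamSketch, PositionalSource and fromArray are hypothetical names, and the source is a stand-in for the archive channel. For simplicity the sketch treats a read of zero bytes like end of data.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

final class BoundedStreamSketch extends InputStream {
    /** Hypothetical stand-in for the archive channel: reads at an absolute position. */
    interface PositionalSource {
        int readAt(long position, ByteBuffer dst); // bytes read, or -1 at end of data
    }

    private final PositionalSource source;
    private final ByteBuffer single = ByteBuffer.allocate(1);
    private long remaining;
    long loc; // package-private so the example can inspect it

    BoundedStreamSketch(PositionalSource source, long start, long remaining) {
        this.source = source;
        this.loc = start;
        this.remaining = remaining;
    }

    @Override
    public int read() {
        if (remaining <= 0) {
            return -1;
        }
        single.clear();
        int n = source.readAt(loc, single);
        if (n <= 0) {
            return -1; // nothing read: loc and remaining stay untouched
        }
        loc += n;
        remaining -= n;
        return single.get(0) & 0xff;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        int want = (int) Math.min(len, remaining);
        int n = source.readAt(loc, ByteBuffer.wrap(b, off, want));
        if (n <= 0) {
            return -1; // same rule as read(): no bytes, no increment
        }
        loc += n;
        remaining -= n;
        return n;
    }

    /** Test helper: a source backed by a byte array. */
    static PositionalSource fromArray(byte[] data) {
        return (position, dst) -> {
            if (position >= data.length) {
                return -1;
            }
            int n = (int) Math.min(dst.remaining(), data.length - position);
            dst.put(data, (int) position, n);
            return n;
        };
    }
}
```

In this sketch a failed read leaves loc where it was, so a single-byte read and a bulk read that both hit the end of the data report the same position afterwards.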





[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980459#comment-15980459
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user sebbASF commented on a diff in the pull request:

https://github.com/apache/commons-compress/pull/21#discussion_r112838915
  
--- Diff: 
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java ---
@@ -,14 +1122,11 @@ public int read() throws IOException {
 }
 return -1;
 }
-synchronized (archive) {
-archive.position(loc++);
-int read = read(1);
-if (read < 0) {
-return read;
-}
-return buffer.get() & 0xff;
+int read = read(loc++, 1);
+if (read < 0) {
--- End diff --

OK. This needs to be clarified in the Javadoc.


> Improve concurrent reads from ZipFile
> -
>
> Key: COMPRESS-388
> URL: https://issues.apache.org/jira/browse/COMPRESS-388
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Affects Versions: 1.13
> Environment: Any
>Reporter: Zbynek Vyskovsky
>  Labels: patch, performance
> Fix For: 1.14
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Concurrent reads on the ZipFile archive are terribly slow on multiprocessor 
> systems. On my 4 CPU laptop it shows 26 reads/s vs 2 reads/s on 100MB samples, 
> for example.
> The cause is the use of synchronized blocks to access the underlying file 
> channel. This may be required for a generic SeekableByteChannel, but most 
> commonly the implementation is FileChannel, which supports lock-free reading 
> from any position (i.e. using pread/pwrite system calls or their equivalent).
> With the fix, performance is about 10 times better (on a 4 CPU system; with 
> more processors the difference should grow significantly).
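
The lock-free read mentioned in the description can be sketched with FileChannel.read(ByteBuffer, long), which reads at an absolute offset and does not modify the channel's own position. This is only an illustration under assumed names (PositionalReadSketch, readAt, demo), not the code from the pull request.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class PositionalReadSketch {
    /** Reads up to len bytes at an absolute offset, leaving the channel position alone. */
    static byte[] readAt(FileChannel channel, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos); // positional read, no shared cursor involved
            if (n <= 0) {
                break; // end of file (or nothing more available)
            }
            pos += n;
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    /** Writes a small temp file and reads four bytes starting at offset 3. */
    static byte[] demo() {
        try {
            Path tmp = Files.createTempFile("positional-read", ".bin");
            try {
                Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7});
                try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    byte[] out = readAt(ch, 3, 4);
                    if (ch.position() != 0) {
                        throw new IllegalStateException("channel position moved");
                    }
                    return out;
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

Because each caller passes its own offset, several threads can call readAt on the same FileChannel without a synchronized block around a position()/read() pair.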





[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980457#comment-15980457
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user kvr000 commented on a diff in the pull request:

https://github.com/apache/commons-compress/pull/21#discussion_r112838711
  
--- Diff: 
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java ---
@@ -1081,16 +1082,26 @@ private boolean startsWithLocalFileHeader() throws 
IOException {
 }
 
 /**
+ * Creates new BoundedInputStream, according to implementation of
+ * underlying archive channel.
+ */
+private BoundedInputStream createBoundedInputStream(long start, long 
remaining) {
+return archive instanceof FileChannel ?
+new BoundedFileChannelInputStream(start, remaining) :
+new BoundedInputStream(start, remaining);
+}
+
+/**
  * InputStream that delegates requests to the underlying
  * SeekableByteChannel, making sure that only bytes from a certain
  * range can be read.
  */
 private class BoundedInputStream extends InputStream {
-private static final int MAX_BUF_LEN = 8192;
-private final ByteBuffer buffer;
-private long remaining;
-private long loc;
-private boolean addDummyByte = false;
+protected static final int MAX_BUF_LEN = 8192;
+protected final ByteBuffer buffer;
+protected long remaining;
+protected long loc;
+protected boolean addDummyByte = false;
 
--- End diff --

They can be package-private too; it's a private class anyway. The fields 
are accessed by the FileChannel specialization, so private won't suffice.





[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980454#comment-15980454
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user bodewig commented on a diff in the pull request:

https://github.com/apache/commons-compress/pull/21#discussion_r112838584
  
--- Diff: 
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java ---
@@ -,14 +1122,11 @@ public int read() throws IOException {
 }
 return -1;
 }
-synchronized (archive) {
-archive.position(loc++);
-int read = read(1);
-if (read < 0) {
-return read;
-}
-return buffer.get() & 0xff;
+int read = read(loc++, 1);
+if (read < 0) {
--- End diff --

I think it depends on what we think `synchronized` is supposed to protect 
against.

As it stands the `synchronized` block also protected concurrent reads from 
the same `BoundedInputStream` from overwriting their results (protecting `loc` 
and `buffer` not just the file position).

If we want to keep that level of protection the changes are not going to 
work. If we say each input stream returned by `getInputStream` can only be used 
by a single thread, then I think it is fine to keep the increment outside of 
the block and only synchronise the code that protects the interaction between 
different streams.
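
The second option can be sketched as follows, assuming each stream is used by a single thread: only the shared channel position is guarded, and the per-stream loc is updated outside the lock. NarrowLockSketch and Cursor are hypothetical names for this example, not classes from ZipFile.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

final class NarrowLockSketch {
    /** Per-stream cursor; assumed to be used by one thread only. */
    static final class Cursor {
        private final SeekableByteChannel archive;
        private long loc;

        Cursor(SeekableByteChannel archive, long start) {
            this.archive = archive;
            this.loc = start;
        }

        int readByte() throws IOException {
            ByteBuffer one = ByteBuffer.allocate(1);
            int n;
            synchronized (archive) { // guards only the channel position shared by all cursors
                archive.position(loc);
                n = archive.read(one);
            }
            if (n <= 0) {
                return -1;
            }
            loc++; // stream-local state, updated outside the lock
            return one.get(0) & 0xff;
        }
    }

    /** Interleaves two cursors over the same channel. */
    static int[] demo() {
        try {
            Path tmp = Files.createTempFile("narrow-lock", ".bin");
            try {
                Files.write(tmp, new byte[] {1, 2, 3, 4});
                try (SeekableByteChannel ch = Files.newByteChannel(tmp)) {
                    Cursor a = new Cursor(ch, 0);
                    Cursor b = new Cursor(ch, 2);
                    return new int[] {a.readByte(), b.readByte(), a.readByte(), b.readByte()};
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

If two threads shared one Cursor, the unsynchronized loc++ would no longer be safe, which is exactly the level of protection the comment says would be given up.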






[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980453#comment-15980453
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user sebbASF commented on a diff in the pull request:

https://github.com/apache/commons-compress/pull/21#discussion_r112838372
  
--- Diff: 
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java ---
@@ -,14 +1122,11 @@ public int read() throws IOException {
 }
 return -1;
 }
-synchronized (archive) {
-archive.position(loc++);
-int read = read(1);
-if (read < 0) {
-return read;
-}
-return buffer.get() & 0xff;
+int read = read(loc++, 1);
+if (read < 0) {
--- End diff --

Surely the increment of loc needs to be synchronised?




[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980452#comment-15980452
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user sebbASF commented on a diff in the pull request:

https://github.com/apache/commons-compress/pull/21#discussion_r112838358
  
--- Diff: 
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java ---
@@ -1081,16 +1082,26 @@ private boolean startsWithLocalFileHeader() throws 
IOException {
 }
 
 /**
+ * Creates new BoundedInputStream, according to implementation of
+ * underlying archive channel.
+ */
+private BoundedInputStream createBoundedInputStream(long start, long 
remaining) {
+return archive instanceof FileChannel ?
+new BoundedFileChannelInputStream(start, remaining) :
+new BoundedInputStream(start, remaining);
+}
+
+/**
  * InputStream that delegates requests to the underlying
  * SeekableByteChannel, making sure that only bytes from a certain
  * range can be read.
  */
 private class BoundedInputStream extends InputStream {
-private static final int MAX_BUF_LEN = 8192;
-private final ByteBuffer buffer;
-private long remaining;
-private long loc;
-private boolean addDummyByte = false;
+protected static final int MAX_BUF_LEN = 8192;
+protected final ByteBuffer buffer;
+protected long remaining;
+protected long loc;
+protected boolean addDummyByte = false;
 
--- End diff --

Do these really need to be protected rather than private or 
package-protected?




[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980451#comment-15980451
 ] 

Stefan Bodewig commented on COMPRESS-388:
-

Yes, I can see the PR again, thanks.



[jira] [Resolved] (IO-534) FileUtilTestCase.testForceDeleteDir() should not delete testDirectory parent

2017-04-23 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved IO-534.
-
   Resolution: Fixed
Fix Version/s: 2.6

IO-534 FileUtilTestCase.testForceDeleteDir() should not delete
testDirectory parent

Project: http://git-wip-us.apache.org/repos/asf/commons-io/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-io/commit/31e14101
Tree: http://git-wip-us.apache.org/repos/asf/commons-io/tree/31e14101
Diff: http://git-wip-us.apache.org/repos/asf/commons-io/diff/31e14101


> FileUtilTestCase.testForceDeleteDir() should not delete testDirectory parent
> 
>
> Key: IO-534
> URL: https://issues.apache.org/jira/browse/IO-534
> Project: Commons IO
>  Issue Type: Bug
>  Components: Utilities
>Reporter: Sebb
> Fix For: 2.6
>
>
> The test case FileUtilTestCase.testForceDeleteDir() has always attempted to 
> delete the testDirectory parent.
> This is wrong; it should not assume that the testDirectory has a parent that 
> can safely be deleted.
> This is why the testDirectory is currently defined as "test/io" when it would 
> make more sense to use a temporary directory under target. It also explains 
> why the "test" directory is left behind when tests complete.





[jira] [Comment Edited] (TEXT-45) WordUtils delimiters should be strings, not char varargs

2017-04-23 Thread Rob Tompkins (JIRA)

[ 
https://issues.apache.org/jira/browse/TEXT-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980433#comment-15980433
 ] 

Rob Tompkins edited comment on TEXT-45 at 4/23/17 3:31 PM:
---

I now disagree with my comment from Feb 7. The varargs delimiters make 
substantially more sense; consider 
{{WordUtils.capitalize("i am.fine", [' ','.'])="I Am.Fine";}}. This reads 
considerably more easily than {{WordUtils.capitalize("i am.fine"," .")}}. The 
latter feels vague because I wonder whether to use the string as a whole as the 
delimiter or the character array underlying the string as individual delimiters.


was (Author: chtompki):
I now disagree with my comment from Feb 7. The varargs delimiters make 
substantially more sense; consider 
{{WordUtils.capitalize("i am.fine", [' ','.'])="I Am.Fine";}}. This reads 
considerably more easily than {{WordUtils.capitalize("i am.fine"," .")}}. The 
latter feels vague because I wonder whether to use the string as a whole as the 
delimiter or the character array underlying the string as the delimiter.

> WordUtils delimiters should be strings, not char varargs
> 
>
> Key: TEXT-45
> URL: https://issues.apache.org/jira/browse/TEXT-45
> Project: Commons Text
>  Issue Type: Improvement
>Reporter: Andrew Pennebaker
>Priority: Minor
>  Labels: api,, interface,ease,of,use,, robustness,
> Fix For: 1.1
>
>
> Strings behave like char varargs of arbitrary length, but are much easier to 
> use.





[jira] [Commented] (TEXT-45) WordUtils delimiters should be strings, not char varargs

2017-04-23 Thread Rob Tompkins (JIRA)

[ 
https://issues.apache.org/jira/browse/TEXT-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980433#comment-15980433
 ] 

Rob Tompkins commented on TEXT-45:
--

I now disagree with my comment from Feb 7. The varargs delimiters make 
substantially more sense; consider 
{{WordUtils.capitalize("i am.fine", [' ','.'])="I Am.Fine";}}. This reads 
considerably more easily than {{WordUtils.capitalize("i am.fine"," .")}}. The 
latter feels vague because I wonder whether to use the string as a whole as the 
delimiter or the character array underlying the string as the delimiter.
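
A small self-contained sketch of the two call styles being compared. CapitalizeSketch is a simplified stand-in written for this example, not the Commons Text implementation; it only shows how a char varargs signature makes each delimiter explicit at the call site.

```java
final class CapitalizeSketch {
    /** Capitalizes the first letter after each delimiter (and the first letter overall). */
    static String capitalize(String str, char... delimiters) {
        StringBuilder out = new StringBuilder(str.length());
        boolean capitalizeNext = true;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (isDelimiter(c, delimiters)) {
                out.append(c);
                capitalizeNext = true;
            } else if (capitalizeNext) {
                out.append(Character.toTitleCase(c));
                capitalizeNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    private static boolean isDelimiter(char c, char[] delimiters) {
        for (char d : delimiters) {
            if (c == d) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Each delimiter is visibly a separate character.
        System.out.println(capitalize("i am.fine", ' ', '.'));
        // A String argument needs a rule for how it is interpreted; here it is
        // explicitly split into individual characters.
        System.out.println(capitalize("i am.fine", " .".toCharArray()));
        // Only the space is a delimiter, so the text after '.' is left alone.
        System.out.println(capitalize("i am.fine", ' '));
    }
}
```

The first two calls give the same result in this sketch only because the string is converted with toCharArray(); a String parameter could equally be read as one multi-character delimiter, which is the ambiguity described above.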



[jira] [Created] (IO-534) FileUtilTestCase.testForceDeleteDir() should not delete testDirectory parent

2017-04-23 Thread Sebb (JIRA)
Sebb created IO-534:
---

 Summary: FileUtilTestCase.testForceDeleteDir() should not delete 
testDirectory parent
 Key: IO-534
 URL: https://issues.apache.org/jira/browse/IO-534
 Project: Commons IO
  Issue Type: Bug
  Components: Utilities
Reporter: Sebb


The test case FileUtilTestCase.testForceDeleteDir() has always attempted to 
delete the testDirectory parent.

This is wrong; it should not assume that the testDirectory has a parent that 
can safely be deleted.

This is why the testDirectory is currently defined as "test/io" when it would 
make more sense to use a temporary directory under target. It also explains why 
the "test" directory is left behind when tests complete.





[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread Zbynek Vyskovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980431#comment-15980431
 ] 

Zbynek Vyskovsky commented on COMPRESS-388:
---

Stefan: Sorry, GitHub probably detected my commit as spam and blocked (hid) 
my account. It should be visible again now.

In the meantime I improved the coverage by creating 
src/test/resources/mixed.zip, containing two relatively big files, one 
deflated, one stored. The tests read the file in various ways in order to 
properly test cached reads, big reads and their combination (to check that 
already cached bytes are not discarded, etc.). Not surprisingly, this 
uncovered a bug.




[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980429#comment-15980429
 ] 

ASF GitHub Bot commented on COMPRESS-388:
-

Github user coveralls commented on the issue:

https://github.com/apache/commons-compress/pull/21
  

[![Coverage 
Status](https://coveralls.io/builds/11201739/badge)](https://coveralls.io/builds/11201739)

Coverage increased (+0.07%) to 84.303% when pulling 
**029a4974f81f423c1b8805f72cec9acad3069335 on 
kvr000:feature/COMPRESS-388-concurrent-reads-performance-fix** into 
**13a039029ca7d7fca9862cfb792f7148c555f05f on apache:master**.





[jira] [Commented] (IO-447) Possible NPE in FileSystemUtils.freeSpaceWindows; FilenameUtils.normalize can return null

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980427#comment-15980427
 ] 

ASF GitHub Bot commented on IO-447:
---

Github user asfgit closed the pull request at:

https://github.com/apache/commons-io/pull/21


> Possible NPE in FileSystemUtils.freeSpaceWindows; FilenameUtils.normalize can 
> return null
> -
>
> Key: IO-447
> URL: https://issues.apache.org/jira/browse/IO-447
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Sebb
>
> There is a possible NPE in FileSystemUtils.freeSpaceWindows.
> FilenameUtils.normalize can return null, so path.length() will throw an NPE.
> For example, ".." returns null.
> I'm not entirely sure why the path needs to be normalised, apart from 
> converting / to \. Even that seems a bit dubious - why should the user want 
> to return the freespace for a Unix-style path on a Windows system?
> And if it does need to be normalised, why not use the File class, which 
> handles / to \ conversion transparently?
> A short-term fix would be to throw IAE for paths that normalise to null.
> However that would not allow the use of paths such as ".." - though at least 
> that would not cause NPE.





[jira] [Commented] (IO-499) FilenameUtils.directoryContains(String, String) gives false positive when two directories exist with equal prefixes

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980428#comment-15980428
 ] 

ASF GitHub Bot commented on IO-499:
---

Github user asfgit closed the pull request at:

https://github.com/apache/commons-io/pull/20


> FilenameUtils.directoryContains(String, String) gives false positive when two 
> directories exist with equal prefixes
> ---
>
> Key: IO-499
> URL: https://issues.apache.org/jira/browse/IO-499
> Project: Commons IO
>  Issue Type: Bug
>Affects Versions: 2.4
>Reporter: Federico Bonelli
>Priority: Minor
>
> In a folder layout as such:
> {code}
> /foo/a.txt
> /foo2/b.txt
> {code}
> The result of invoking directoryContains is wrong:
> {code}
> FilenameUtils.directoryContains("/foo", "/foo2/b.txt"); // returns true
> {code}
> Even though "/foo" and "/foo2/b.txt" are the canonical paths, they start with 
> the same characters, so the current implementation of the method fails.
> As a workaround we are currently appending a path separator '/' to the first 
> argument.
> It is noteworthy that the current implementation of 
> FileUtils.directoryContains() reveals this issue because it uses 
> File.getCanonicalPath() to obtain the String paths of "/foo" and 
> "/foo2/b.txt".
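
The false positive and the separator workaround can be reproduced with plain string operations. This is a sketch of the kind of prefix-only check the report describes, not the actual FilenameUtils source; the class and method names are hypothetical and '/' is assumed as the only path separator.

```java
final class DirectoryContainsSketch {
    /** Prefix-only check: treats any path sharing the leading characters as contained. */
    static boolean naiveContains(String parent, String child) {
        return child.startsWith(parent);
    }

    /** Same check, but the parent is forced to end at a path separator first. */
    static boolean separatorAwareContains(String parent, String child) {
        String prefix = parent.endsWith("/") ? parent : parent + "/";
        return child.startsWith(prefix);
    }

    public static void main(String[] args) {
        System.out.println(naiveContains("/foo", "/foo2/b.txt"));          // false positive
        System.out.println(separatorAwareContains("/foo", "/foo2/b.txt")); // rejected
        System.out.println(separatorAwareContains("/foo", "/foo/a.txt"));  // still accepted
    }
}
```

Appending the separator before comparing is the same workaround the reporter applies by hand to the first argument.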





[jira] [Resolved] (NUMBERS-20) Copy prime related code from [math]

2017-04-23 Thread Raymond DeCampo (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUMBERS-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond DeCampo resolved NUMBERS-20.

Resolution: Fixed

Issue resolved with merge of feature-NUMBERS-20 branch into master.

> Copy prime related code from [math]
> ---
>
> Key: NUMBERS-20
> URL: https://issues.apache.org/jira/browse/NUMBERS-20
> Project: Commons Numbers
>  Issue Type: Task
>Reporter: Raymond DeCampo
>Priority: Minor
> Fix For: 1.0
>
>
> Port the class implementations currently in the master branch of Commons Math 
> from package {{org.apache.commons.math4.primes}}.





[jira] [Comment Edited] (NUMBERS-20) Copy prime related code from [math]

2017-04-23 Thread Raymond DeCampo (JIRA)

[ 
https://issues.apache.org/jira/browse/NUMBERS-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980422#comment-15980422
 ] 

Raymond DeCampo edited comment on NUMBERS-20 at 4/23/17 3:01 PM:
-

Issue resolved with merge of feature-NUMBERS-20 branch into master.


was (Author: raydecampo):
Issue resolve with merge of feature-NUMBERS-20 branch into master.



[jira] [Created] (IO-533) Introduce Tailable interface to allow tailing of files accessed using jCIFS

2017-04-23 Thread Pascal Schumacher (JIRA)
Pascal Schumacher created IO-533:


 Summary: Introduce Tailable interface to allow tailing of files 
accessed using jCIFS
 Key: IO-533
 URL: https://issues.apache.org/jira/browse/IO-533
 Project: Commons IO
  Issue Type: New Feature
Reporter: Pascal Schumacher


suggested in https://github.com/apache/commons-io/pull/32





[jira] [Commented] (IO-532) DirectoryUtils - isEqual to compare directories

2017-04-23 Thread Pascal Schumacher (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980356#comment-15980356
 ] 

Pascal Schumacher commented on IO-532:
--

suggested in https://github.com/apache/commons-io/pull/31

> DirectoryUtils - isEqual to compare directories
> ---
>
> Key: IO-532
> URL: https://issues.apache.org/jira/browse/IO-532
> Project: Commons IO
>  Issue Type: New Feature
>Reporter: Pascal Schumacher
>






[jira] [Created] (IO-532) DirectoryUtils - isEqual to compare directories

2017-04-23 Thread Pascal Schumacher (JIRA)
Pascal Schumacher created IO-532:


 Summary: DirectoryUtils - isEqual to compare directories
 Key: IO-532
 URL: https://issues.apache.org/jira/browse/IO-532
 Project: Commons IO
  Issue Type: New Feature
Reporter: Pascal Schumacher








[jira] [Closed] (IO-424) Javadoc fixes, mostly to appease 1.8.0

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-424.

   Resolution: Fixed
Fix Version/s: 2.5

> Javadoc fixes, mostly to appease 1.8.0
> --
>
> Key: IO-424
> URL: https://issues.apache.org/jira/browse/IO-424
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Ville Skyttä
>Priority: Minor
>  Labels: patch
> Fix For: 2.5
>
> Attachments: 0001-Javadoc-fixes-mostly-to-appease-1.8.0.patch
>
>






[jira] [Updated] (IO-134) Document thread safety of classes

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher updated IO-134:
-
Fix Version/s: (was: 3.x)

> Document thread safety of classes
> -
>
> Key: IO-134
> URL: https://issues.apache.org/jira/browse/IO-134
> Project: Commons IO
>  Issue Type: Wish
>Reporter: Sebb
>
> It would be useful to document the thread-safety of all the classes:
> - Fully thread-safe (e.g. immutable)
> - Thread-safe if particular methods are synchronised
> - Thread-safe if specified combinations of methods are synchronised
> - Thread-hostile - cannot be used safely from multiple threads





[jira] [Closed] (IO-530) Tailer pegs CPU if file disappears and doesn't come back

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-530.

Resolution: Duplicate

duplicates fixed issue [IO-528]

> Tailer pegs CPU if file disappears and doesn't come back
> 
>
> Key: IO-530
> URL: https://issues.apache.org/jira/browse/IO-530
> Project: Commons IO
>  Issue Type: Bug
>Affects Versions: 2.5
>Reporter: Randall Theobald
>Priority: Critical
> Attachments: io-530.patch
>
>
> I ran into a situation where a bug in my log rotation leads to the tailed 
> file being renamed, but the original file name does not re-appear (new log 
> entries still go to the renamed log file). This uncovered a bug in the Tailer 
> class. In this case, tailer enters a tight loop trying to re-open the file:
> {code}
> while (getRun()) {
>     final boolean newer = FileUtils.isFileNewer(file, last); // IO-279, must be done first
>     // Check the file length to see if it was rotated
>     final long length = file.length();
>     if (length < position) {
>         // File was rotated
>         listener.fileRotated();
>         // Reopen the reader after rotation ensuring that the old file is closed iff we re-open it
>         // successfully
>         try (RandomAccessFile save = reader) {
>             reader = new RandomAccessFile(file, RAF_MODE);
>             // At this point, we're sure that the old file is rotated
>             // Finish scanning the old file and then we'll start with the new one
>             try {
>                 readLines(save);
>             } catch (IOException ioe) {
>                 listener.handle(ioe);
>             }
>             position = 0;
>         } catch (final FileNotFoundException e) {
>             // in this case we continue to use the previous reader and position values
>             listener.fileNotFound();
>         }
>         continue;
> {code}
> Since a non-existent file returns a length of zero, we keep entering this top 
> loop, trying to open the missing file, getting a FileNotFoundException and 
> starting over.
> There should be some delay here.
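
The key observation in the report is that a missing file reports a length of zero, so the rotation check stays true on every pass. The following is a minimal sketch of just that check, not Tailer code; the class name, method name and path are hypothetical and only illustrate the documented behaviour of java.io.File.length().

```java
import java.io.File;

final class MissingFileLength {
    /** Same comparison as the quoted loop: a file shorter than the read position looks rotated. */
    static boolean looksRotated(File file, long position) {
        return file.length() < position;
    }

    public static void main(String[] args) {
        // Hypothetical path that is assumed not to exist on the machine running this.
        File missing = new File("no-such-directory-io-530", "renamed.log");
        System.out.println("exists:  " + missing.exists());
        System.out.println("length:  " + missing.length());
        System.out.println("rotated: " + looksRotated(missing, 1024L));
    }
}
```

As long as the reader has consumed at least one byte (position > 0), a vanished file therefore looks like a rotation on every iteration.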





[jira] [Commented] (IO-398) listener.fileRotated() will be invoked more than one time in a real rotate activity

2017-04-23 Thread Pascal Schumacher (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980345#comment-15980345
 ] 

Pascal Schumacher commented on IO-398:
--

duplicates fixed issue [IO-528]

> listener.fileRotated() will be invoked more than one time in a real rotate 
> activity
> ---
>
> Key: IO-398
> URL: https://issues.apache.org/jira/browse/IO-398
> Project: Commons IO
>  Issue Type: Bug
>Affects Versions: 2.4
>Reporter: Lantao Jin
> Attachments: IO-398.patch, IO398_with_ut.patch
>
>
> When Tailer decides that a file rotation has occurred, 
> listener.fileRotated() is executed and the file is re-opened by "reader = 
> new RandomAccessFile(file, RAF_MODE);". However, the new file may not have been 
> created yet, so a FileNotFoundException is caught and the while loop is 
> executed again and again until the new file is actually created, which causes 
> listener.fileRotated() to be triggered repeatedly. 
> This is the piece of code causing the problem:
> {noformat} 
> while (getRun()) {
>     final boolean newer = isFileNewer(file, last); // IO-279, must be done first
>     // Check the file length to see if it was rotated
>     final long length = file.length();
>     if (length < position) {
>         // File was rotated
>         listener.fileRotated();
>         // Reopen the reader after rotation
>         try {
>             // Ensure that the old file is closed iff we re-open it successfully
>             final RandomAccessFile save = reader;
>             reader = new RandomAccessFile(file, RAF_MODE);
>             /* some code */
>         } catch (final FileNotFoundException e) {
>             // in this case we continue to use the previous reader and position values
>             listener.fileNotFound();
>         }
>         continue;
> {noformat}
> While condition checks could be added in listener.fileRotated() to 
> correct the semantics of fileRotated(), it would be better to prevent multiple 
> invocations of listener.fileRotated() in this situation.
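
One way to read the request above is to report a rotation once and then stay quiet until the re-open succeeds. The sketch below shows that idea in isolation. It is not the change made in Commons IO (this issue was closed as a duplicate of IO-528); RotationGuard, its Listener interface and onRotationDetected are hypothetical names introduced only for illustration.

```java
final class RotationGuard {
    /** Hypothetical two-method stand-in for the callbacks named in the report. */
    interface Listener {
        void fileRotated();
        void fileNotFound();
    }

    private final Listener listener;
    private boolean rotationReported;

    RotationGuard(Listener listener) {
        this.listener = listener;
    }

    /**
     * Call once per loop pass in which the file looks rotated.
     * The reopened flag says whether re-opening the file succeeded on this pass.
     */
    void onRotationDetected(boolean reopened) {
        if (!rotationReported) {
            listener.fileRotated(); // fired only for the first pass of this rotation
            rotationReported = true;
        }
        if (reopened) {
            rotationReported = false; // the next shrink counts as a new rotation
        } else {
            listener.fileNotFound();
        }
    }
}
```

With this guard, three failed passes followed by a successful one produce a single fileRotated() call and three fileNotFound() calls.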





[jira] [Resolved] (IO-528) Tailer.run race condition runaway logging

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher resolved IO-528.
--
   Resolution: Fixed
 Assignee: Pascal Schumacher
Fix Version/s: 2.6

> Tailer.run race condition runaway logging
> -
>
> Key: IO-528
> URL: https://issues.apache.org/jira/browse/IO-528
> Project: Commons IO
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.5
>Reporter: Dave Moten
>Assignee: Pascal Schumacher
> Fix For: 2.6
>
>
> `Tailer.run` has a race condition that can have serious effects. 
> The `run()` method has two while loops. The first waits till the file exists 
> and the second loop reads lines from the file doing some file rotation 
> checking on the way.  If the file is deleted while the second loop is in 
> progress then the loop goes crazy logging warnings that look like this:
> `
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileRotated
> INFO: file rotated
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileNotFound
> WARNING: file not found
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileRotated
> INFO: file rotated
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileNotFound
> WARNING: file not found
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileRotated
> INFO: file rotated
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileNotFound
> WARNING: file not found
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileRotated
> INFO: file rotated
> Dec 06, 2016 1:02:18 AM com.github.davidmoten.logan.LogFile$1 fileNotFound
> WARNING: file not found
> `
> In our case this had serious effects. The file being tailed was deleted by 
> another process and all available disk space was rapidly used up by the 
> logging. This crashed a system.
> The fix is to put a sleep after the call to `fileNotFound()`.
> This problem was raised in IO-398 three years ago but no change was made to 
> the code base.
> PR submitted via github repo.
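
The proposed fix (sleep after the file is not found) can be sketched as a standalone retry loop. This is an illustration under assumptions, not the patch from the pull request: ReopenWithDelay, reopen, maxAttempts and delayMillis are hypothetical names, and a real tailer would keep the reader open and call listener.fileNotFound() where the comment indicates.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

final class ReopenWithDelay {
    /**
     * Tries to open the file up to maxAttempts times and sleeps delayMillis
     * after every miss. Returns the number of misses before success or give-up.
     */
    static int reopen(File file, int maxAttempts, long delayMillis) {
        int misses = 0;
        while (misses < maxAttempts) {
            try (RandomAccessFile reader = new RandomAccessFile(file, "r")) {
                return misses; // opened successfully; a real tailer would keep the reader
            } catch (FileNotFoundException e) {
                misses++; // a real tailer would call listener.fileNotFound() here
                try {
                    Thread.sleep(delayMillis); // the pause that stops the tight loop
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return misses;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e); // only close() can end up here
            }
        }
        return misses;
    }
}
```

The delay bounds how often the not-found callback can fire, which in turn bounds the log volume described in the report.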





[jira] [Closed] (IO-398) listener.fileRotated() will be invoked more than one time in a real rotate activity

2017-04-23 Thread Pascal Schumacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pascal Schumacher closed IO-398.

Resolution: Duplicate



[jira] [Commented] (IO-528) Tailer.run race condition runaway logging

2017-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980343#comment-15980343
 ] 

ASF GitHub Bot commented on IO-528:
---

Github user asfgit closed the pull request at:

https://github.com/apache/commons-io/pull/29




[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980335#comment-15980335
 ] 

Stefan Bodewig commented on COMPRESS-388:
-

GitHub says there is no PR 21?

> Improve concurrent reads from ZipFile
> -
>
> Key: COMPRESS-388
> URL: https://issues.apache.org/jira/browse/COMPRESS-388
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Affects Versions: 1.13
> Environment: Any
>Reporter: Zbynek Vyskovsky
>  Labels: patch, performance
> Fix For: 1.14
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Concurrent reads on a ZipFile archive are terribly slow on multiprocessor 
> systems. On my 4 CPU laptop it shows 26 reads/s vs 2 reads/s on 100MB samples 
> for example.
> The cause is the use of synchronized blocks to access the underlying file 
> channel. This may be required for a generic SeekableByteChannel, but most 
> commonly there is a FileChannel implementation which supports lock-free reading 
> from any position (i.e. using pread/pwrite system calls or their equivalent).
> With the fix the performance is about 10 times faster (on a 4 CPU system; with 
> more processors the difference should grow significantly).
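
The lock-free read the description refers to is the positional FileChannel.read(ByteBuffer, long) overload, which does not use or change the channel's own position. A minimal sketch follows; PositionalRead, readAt and demo are hypothetical names and this is not the code from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PositionalRead {
    /** Reads up to length bytes starting at offset without moving the channel position. */
    static byte[] readAt(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos); // positional read, no shared cursor involved
            if (n < 0) {
                break; // end of file reached before the buffer was full
            }
            pos += n;
        }
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    /** Writes ten ASCII digits to a temp file and reads four of them from offset 3. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("positional-read", ".bin");
            try {
                Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    byte[] slice = readAt(channel, 3, 4);
                    return new String(slice, StandardCharsets.US_ASCII) + ":" + channel.position();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because no shared position is touched, several threads can call readAt on the same channel without a synchronized block around seek-then-read.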





[jira] [Commented] (COMPRESS-384) Tar File EOF not being detected

2017-04-23 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980334#comment-15980334
 ] 

Stefan Bodewig commented on COMPRESS-384:
-

Oh my, tar dialects. Here is the full story.

A tar archive consists of records (512 bytes each) grouped into blocks (usually 20 
records, i.e. 10kB). The end of the tar archive is signalled by two consecutive 
records of zeros; your archive has such a trailer and {{TarArchiveInputStream}} 
detects it. But your archive ends right after those two records, while 
{{TarArchiveInputStream}} tries to finish reading the current block (there are 
still > 3500 bytes missing for the first block). So 7z uses one of the dialects 
that don't pad out the last block, but {{TarArchiveInputStream}} has to try to 
consume it and thus still tries to read when your archive is done.

I'm afraid there isn't anything we can do about it. See my suggestion from 
about half an hour ago for a workaround.
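
The record and block arithmetic can be made concrete with a small calculation. This is only a sketch: TarBlockMath and missingPadding are hypothetical names, and the 6656-byte length used in the note after the code is an assumed example, since the size of the attached file.tar is not stated in this thread.

```java
final class TarBlockMath {
    static final int RECORD_SIZE = 512;
    static final int RECORDS_PER_BLOCK = 20; // the usual blocking factor
    static final int BLOCK_SIZE = RECORD_SIZE * RECORDS_PER_BLOCK; // 10240 bytes

    /** Bytes a reader would still expect in order to complete the last block. */
    static long missingPadding(long archiveLength) {
        long remainder = archiveLength % BLOCK_SIZE;
        return remainder == 0 ? 0 : BLOCK_SIZE - remainder;
    }

    public static void main(String[] args) {
        // Assumed example: 13 records, the last two being the all-zero trailer.
        System.out.println(missingPadding(13L * RECORD_SIZE));
    }
}
```

An unpadded archive of 13 records (6656 bytes) leaves 3584 bytes of the first block unwritten, which is the kind of gap ("> 3500 bytes") a block-oriented reader keeps waiting for.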

> Tar File EOF not being detected
> ---
>
> Key: COMPRESS-384
> URL: https://issues.apache.org/jira/browse/COMPRESS-384
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 1.13
> Environment: Windows 10, JDK 1.8
>Reporter: Jason Shattu
> Attachments: file.tar
>
>
> I've created both a zip and a tar file with the same contents, using the latest 
> version of 7zip. When I read both archives using code of the form:
> ArchiveStreamFactory().createArchiveInputStream(format, inputStream);
> I notice that both formats correctly list their contents. However, the tar 
> input stream doesn't return a "null" entry from archiveStream.getNextEntry() 
> when it hits EOF. This makes it hard to distinguish between a genuine EOF and 
> a file which is still being written to. 





[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980328#comment-15980328
 ] 

Stefan Bodewig commented on COMPRESS-388:
-

Thanks Zbynek. Don't worry about the coverage change, it goes up and down all 
the time, depending only on which of the JDKs we use to build in Travis finishes 
first (for some reason coverage seems to depend on the JDK, maybe the compiler 
is performing some optimizations).



[jira] [Commented] (COMPRESS-384) Tar File EOF not being detected

2017-04-23 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980327#comment-15980327
 ] 

Stefan Bodewig commented on COMPRESS-384:
-

I still need to look into why the tar stream doesn't detect EOF, but want to 
share an idea, Jason. While most archiving formats haven't got any way of 
knowing when the stream is finished, most compression formats do. gzip, xz, 
bzip2 all know when the compressed stream is finished, so if you can make your 
archive creator produce a tar.gz rather than a tar you should be able to work 
around the problem - at the cost of higher processing time on both sides.
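
The point that a compressed stream knows where it ends can be illustrated with the JDK's own gzip classes, used here as a stand-in for the Commons Compress gzip stream so that the sketch has no dependencies. GzipEndDetection, gzip and isComplete are hypothetical names. A complete stream is read to its trailer and then reports end of stream, while a truncated one (for example a file still being written) fails with an EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipEndDetection {
    /** Compresses data into a complete gzip member (header, deflate data, trailer). */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Returns true if the bytes decode up to and including the gzip trailer. */
    static boolean isComplete(byte[] gz) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // discard the decompressed bytes, only the end condition matters here
            }
            return true;
        } catch (EOFException e) {
            return false; // input ended before the stream was finished
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Applied to a tar.gz, the same distinction tells a reader whether the producer has finished writing, independent of how the tar layer pads its last block.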



[jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile

2017-04-23 Thread Zbynek Vyskovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980311#comment-15980311
 ] 

Zbynek Vyskovsky commented on COMPRESS-388:
---

Pull request: https://github.com/apache/commons-compress/pull/21

Code coverage seems to be a bit challenging on small data, and maybe one of the 
mandatory InputStream methods is never called.

