date:20120326

Deprecate and then remove all methods that use the default encoding
---

 Key: IO-314
 URL: https://issues.apache.org/jira/browse/IO-314
 Project: Commons IO
  Issue Type: Improvement
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla
Priority: Minor


On Stack Overflow, I often see this kind of question: "When I read my text 
on a different computer, it's all garbled."

The underlying issue is that people don't understand the concept of encoding 
and therefore, they use the default encoding which breaks their code when they 
least expect it. Worse, it often causes data corruption without throwing 
exceptions.

Therefore my suggestion: Deprecate and then remove all methods that use the 
default encoding. Users should always specify an encoding when doing text I/O, 
to make sure that data cannot be corrupted even when they don't know what 
they're doing.
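
A minimal sketch of what "always specify an encoding" can look like with the plain JDK. The class and method names are made up for illustration and are not Commons IO API. It also shows the silent corruption described above: decoding UTF-8 bytes with the wrong charset throws no exception, it just yields different text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class ExplicitCharsetSketch {

    // Encodes text with an explicitly named charset instead of the platform default.
    static byte[] write(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Decodes bytes with an explicitly named charset.
    static String read(byte[] data, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```

Writing with UTF-8 and reading with UTF-8 round-trips the text; reading the same bytes as ISO-8859-1 returns a different string without any error being raised.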

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (IO-315) Replace all String encoding parameters with a value type

Replace all String encoding parameters with a value type
--

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla


Please create an interface Encoding plus a set of useful defaults (UTF_8, 
ISO_LATIN_1, CP_1250 and CP_1252).

Use this interface in all places where String encoding is used now. This 
would make the API more reliable, improve code reuse and reduce futile catch 
blocks for {{UnsupportedEncodingException}}.
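
One possible shape for such a value type, as a rough sketch only. The names `Encoding`, `StandardEncoding` and `EncodingSketch` are hypothetical, and the sketch assumes the type simply wraps `java.nio.charset.Charset`. The charset is looked up lazily because the two Windows code pages are not among the charsets every Java platform must provide.

```java
import java.nio.charset.Charset;

// Hypothetical value type; not part of Commons IO.
interface Encoding {
    Charset charset();
}

// Hypothetical set of defaults, named after the ones suggested in the issue.
enum StandardEncoding implements Encoding {
    UTF_8("UTF-8"),
    ISO_LATIN_1("ISO-8859-1"),
    CP_1250("windows-1250"),
    CP_1252("windows-1252");

    private final String charsetName;

    StandardEncoding(String charsetName) {
        this.charsetName = charsetName;
    }

    @Override
    public Charset charset() {
        // Throws the unchecked UnsupportedCharsetException if the JVM lacks the charset.
        return Charset.forName(charsetName);
    }
}

class EncodingSketch {
    // No checked UnsupportedEncodingException: String(byte[], Charset) does not declare one.
    static String decode(byte[] data, Encoding encoding) {
        return new String(data, encoding.charset());
    }
}
```

Callers that need an encoding outside the defaults could implement `Encoding` themselves, which keeps the API open without going back to String names.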


[jira] [Commented] (IO-294) Adding FileUtils.byteCountToDisplaySize(long size, boolean useSiUnits)


[ 
https://issues.apache.org/jira/browse/IO-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238227#comment-13238227
 ] 

Aaron Digulla commented on IO-294:
--

It would probably belong to a new class in the package 
{{org.apache.commons.lang3.text}} but I fear that nobody will ever find it 
there.

{{byteCountToDisplaySize()}} is a method that is strongly related to I/O - I 
can't imagine that we'd use it to print the size of String labels in a UI, for 
example.

So it should go into Commons IO somewhere, but maybe into {{IOUtils}} since it's 
not especially related to files.

And if you support I18n, you must allow callers to specify a locale. Otherwise, this 
code can't be used in a multi-language web app, for example.
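
A rough sketch of how a locale-aware variant might look. This is not the attached patch: the class name, the extra `Locale` parameter and the unit labels are assumptions, and the labels themselves are not localized here. Only the number formatting follows the locale.

```java
import java.math.RoundingMode;
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical sketch; not the code attached to IO-294.
class DisplaySizeSketch {
    private static final String[] SI_UNITS = {"B", "kB", "MB", "GB", "TB", "PB", "EB"};
    private static final String[] IEC_UNITS = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

    static String byteCountToDisplaySize(long size, boolean useSiUnits, Locale locale) {
        if (size < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        int base = useSiUnits ? 1000 : 1024;
        String[] units = useSiUnits ? SI_UNITS : IEC_UNITS;
        double value = size;
        int unit = 0;
        // Scale down until the value fits the current unit.
        while (value >= base && unit < units.length - 1) {
            value /= base;
            unit++;
        }
        // One fraction digit below 100, none for plain bytes or from 100 upwards.
        int digits = (unit == 0 || value >= 100) ? 0 : 1;
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);
        format.setRoundingMode(RoundingMode.HALF_UP);
        format.setMinimumFractionDigits(digits);
        format.setMaximumFractionDigits(digits);
        return format.format(value) + " " + units[unit];
    }
}
```

With this sketch, 1536 bytes in IEC units is formatted as `1.5 KiB` for `Locale.US` and `1,5 KiB` for `Locale.GERMANY`.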

 Adding FileUtils.byteCountToDisplaySize(long size, boolean useSiUnits)
 --

 Key: IO-294
 URL: https://issues.apache.org/jira/browse/IO-294
 Project: Commons IO
  Issue Type: New Feature
  Components: Utilities
Affects Versions: 2.1
Reporter: Jean-Noel Rouvignac
 Attachments: FileUtils.java, FileUtilsTest.java


 I have written a little Utility method that might benefit Commons IO:
 {code}
 public class FileUtils {
 /**
  * Returns a human-readable version of the file size (original is in 
  * bytes). The implementation has the following features:
  * <ul>
  * <li>Supports the SI or IEC units.</li>
  * <li>Supports I18n</li>
  * <li>Display a one digit remainder (rounded down if less than 5, 
  * rounded up otherwise)</li>
  * <li>Once the main unit is >= 100, drops the remainder which would be 
  * over precision.</li>
  * </ul>
  * 
  * @param size The number of bytes.
  * @param useSiUnits if false, uses the IEC (International 
  * Electrotechnical Commission) units (powers of 2), else uses SI 
  * (International System of Units) units (powers of 10).
  * @return A human-readable display value (includes units).
  */
 public static String byteCountToDisplaySize(long size, boolean 
 useSiUnits) {
 {code}


[jira] [Commented] (IO-286) FastByteArray*Stream implementations to replace synchronized JDK ByteArray*Stream


[ 
https://issues.apache.org/jira/browse/IO-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238231#comment-13238231
 ] 

Aaron Digulla commented on IO-286:
--

Comments:

# Always use spaces for indentation
# Allow overriding the allocation strategy. If the buffer grows fast, a 
different strategy could allocate bigger chunks in the beginning (= fewer 
overall allocations) and be more conservative later (so after the buffer has 
grown to 10MB, maybe 15MB is better than 20MB for the next round). To implement 
this, extract the code that calculates {{newBufferSize}} into a protected method.
# I'm also confused by {{writeBytes()}} but more about the loop inside: Why do 
you need a loop there? You already know how many bytes to append and how much 
buffer you have left - why not extend the buffer as necessary and then append 
all the new bytes in one go?
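
The last two points could be sketched roughly as follows. This is not the code from the attached patch: the class name, the initial size and the doubling policy are assumptions chosen for illustration, and integer overflow for very large buffers is not handled.

```java
import java.util.Arrays;

// Hypothetical unsynchronized buffer; not the class from the IO-286 patch.
class GrowableByteBuffer {
    private byte[] buffer = new byte[32];
    private int count;

    // Growth policy hook: subclasses may override it, e.g. to grow more
    // conservatively once the buffer is already large.
    protected int newBufferSize(int currentSize, int minCapacity) {
        return Math.max(currentSize * 2, minCapacity);
    }

    // Extends the buffer once if needed, then appends all bytes in one go.
    public void write(byte[] src, int offset, int length) {
        if (offset < 0 || length < 0 || offset > src.length - length) {
            throw new IndexOutOfBoundsException();
        }
        int required = count + length;
        if (required > buffer.length) {
            // Never trust the hook to return enough room.
            int newSize = Math.max(newBufferSize(buffer.length, required), required);
            buffer = Arrays.copyOf(buffer, newSize);
        }
        System.arraycopy(src, offset, buffer, count, length);
        count += length;
    }

    public int size() {
        return count;
    }

    public byte[] toByteArray() {
        return Arrays.copyOf(buffer, count);
    }
}
```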

 FastByteArray*Stream implementations to replace synchronized JDK 
 ByteArray*Stream
 

 Key: IO-286
 URL: https://issues.apache.org/jira/browse/IO-286
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Reporter: Paul Loy
Priority: Minor
  Labels: streams, synchronized
 Attachments: FastByteArrayOutputStream_commons-io.patch, 
 FastByteArrayOutputStream_commons-io[2].patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 In CASSANDRA-2820 I reintroduced the FastByteArrayInputStream and 
 FastByteArrayOutputStream to Cassandra. These streams are unsynchronized 
 versions of the Apache Harmony ByteArrayInputStream and ByteArrayOutputStream 
 respectively.
 During my own testing of the streams I found a big difference in 
 performance between the standard JDK BA*S streams and the FBA*S streams on most 
 JREs. Cassandra load testing also showed an improvement of up to 10% in 
 Cassandra performance using these streams.
 Thrift also has TByteArrayOutputStream, which contains a way to get the 
 underlying byte[] buffer without a deep copy; that would probably be a good 
 further enhancement.
 Patch to follow.


[jira] [Commented] (IO-255) XML handler to serialize/de-serialize FlieEntrys to/from XML

2012-03-26 Thread Aaron Digulla (Issue Comment Edited) (JIRA)


[ 
https://issues.apache.org/jira/browse/IO-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238232#comment-13238232
 ] 

Aaron Digulla commented on IO-255:
--

Please fix the typo in the bug title: FlieEntrys should be FileEntrys

 XML handler to serialize/de-serialize FlieEntrys to/from XML
 

 Key: IO-255
 URL: https://issues.apache.org/jira/browse/IO-255
 Project: Commons IO
  Issue Type: New Feature
Affects Versions: 2.0
Reporter: Niall Pemberton
Assignee: Niall Pemberton
Priority: Minor
 Attachments: FileEntryXmlHandler.java


 It may be useful to *capture* the state of a Filesystem and serialize it. 
 I've been playing with a handler that can serialize/de-serialize a FileEntry 
 and its children to/from XML.


[jira] [Issue Comment Edited] (IO-218) Introduce new filter input stream with replacement facilities


[ 
https://issues.apache.org/jira/browse/IO-218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238233#comment-13238233
 ] 

Aaron Digulla edited comment on IO-218 at 3/26/12 9:56 AM:
---

Isn't this a duplicate of issue IO-199?

  was (Author: digulla):
Isn't this a duplicate if issue IO-199?
  
 Introduce new filter input stream with replacement facilities
 -

 Key: IO-218
 URL: https://issues.apache.org/jira/browse/IO-218
 Project: Commons IO
  Issue Type: Improvement
  Components: Filters
Affects Versions: 1.4
 Environment: all environments
Reporter: Denis Zhdanov
 Attachments: ReplaceFilterInputStream.java, 
 ReplaceFilterInputStreamTest.java


 It seems convenient to have a FilterInputStream that can apply 
 predefined replacement rules to the data being read. 
 For example we may want to configure the following replacements:
 {noformat}
 {1, 2} -> {7, 8}
 {1} -> {9}
 {3, 2} -> {}
 {noformat}
 and apply them to the input like
 {noformat}
 {4, 3, 2, 1, 2, 1, 3}
 {noformat}
 in order to get a result like
 {noformat}
 {4, 7, 8, 9, 3}
 {noformat}
 I created a class that does this and attached it to this ticket. 
 A unit test class in JUnit 4 format is attached as well.
 So, the task is to review the provided classes, consider whether it's worth adding 
 them to the commons-io distribution and perform the inclusion in case of 
 a positive result.
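
 The example above can be reproduced with a small sketch over a byte array. This is not the attached ReplaceFilterInputStream: the class name is hypothetical, it works on a complete array instead of a stream, and it assumes that rules are tried in the order they are listed at each position.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical in-memory illustration; not the class attached to IO-218.
class ReplacementSketch {

    // Applies the rules from[i] -> to[i] from left to right; the first rule
    // that matches at the current position wins.
    static byte[] replace(byte[] input, byte[][] from, byte[][] to) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (pos < input.length) {
            int matched = -1;
            for (int rule = 0; rule < from.length; rule++) {
                if (matchesAt(input, pos, from[rule])) {
                    matched = rule;
                    break;
                }
            }
            if (matched >= 0) {
                out.write(to[matched], 0, to[matched].length);
                pos += from[matched].length;
            } else {
                out.write(input[pos]);
                pos++;
            }
        }
        return out.toByteArray();
    }

    private static boolean matchesAt(byte[] input, int pos, byte[] pattern) {
        // Empty patterns never match, otherwise the loop would not advance.
        if (pattern.length == 0 || pos + pattern.length > input.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (input[pos + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }
}
```

 A streaming implementation additionally has to buffer as many bytes as the longest pattern before it can decide whether a rule matches.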


[jira] [Commented] (IO-218) Introduce new filter input stream with replacement facilities


[ 
https://issues.apache.org/jira/browse/IO-218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238233#comment-13238233
 ] 

Aaron Digulla commented on IO-218:
--

Isn't this a duplicate if issue IO-199?

 Introduce new filter input stream with replacement facilities
 -

 Key: IO-218
 URL: https://issues.apache.org/jira/browse/IO-218
 Project: Commons IO
  Issue Type: Improvement
  Components: Filters
Affects Versions: 1.4
 Environment: all environments
Reporter: Denis Zhdanov
 Attachments: ReplaceFilterInputStream.java, 
 ReplaceFilterInputStreamTest.java


 It seems convenient to have a FilterInputStream that can apply 
 predefined replacement rules to the data being read. 
 For example we may want to configure the following replacements:
 {noformat}
 {1, 2} -> {7, 8}
 {1} -> {9}
 {3, 2} -> {}
 {noformat}
 and apply them to the input like
 {noformat}
 {4, 3, 2, 1, 2, 1, 3}
 {noformat}
 in order to get a result like
 {noformat}
 {4, 7, 8, 9, 3}
 {noformat}
 I created a class that does this and attached it to this ticket. 
 A unit test class in JUnit 4 format is attached as well.
 So, the task is to review the provided classes, consider whether it's worth adding 
 them to the commons-io distribution and perform the inclusion in case of 
 a positive result.


[jira] [Commented] (IO-233) Add Methods for Buffering Streams/Writers To IOUtils


[ 
https://issues.apache.org/jira/browse/IO-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238235#comment-13238235
 ] 

Aaron Digulla commented on IO-233:
--

It would be worth it if you added the encoding:

{code}
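public static BufferedReader buffer(InputStream inputStream, String encoding)
        throws UnsupportedEncodingException {
    // InputStreamReader(InputStream, String) declares this checked exception
    return new BufferedReader(new InputStreamReader(inputStream, encoding));
}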
{code}

which would be a nice API together with my issue IO-315: Replace all String 
encoding parameters with a value type

 Add Methods for Buffering Streams/Writers To IOUtils
 

 Key: IO-233
 URL: https://issues.apache.org/jira/browse/IO-233
 Project: Commons IO
  Issue Type: Improvement
  Components: Streams/Writers
Affects Versions: 2.0
 Environment: Java 1.4+
Reporter: Michael Wooten
Priority: Minor
 Fix For: 3.x

   Original Estimate: 2h
  Remaining Estimate: 2h

 I suggest adding utility methods for buffering streams and writers to the 
 IOUtils class. The methods would have the following signatures:
 BufferedInputStream buffer(InputStream inputStream)
 BufferedOutputStream buffer(OutputStream outputStream)
 BufferedReader buffer(Reader reader)
 BufferedWriter buffer(Writer writer)


[jira] [Commented] (IO-71) [io] PipedUtils

[
https://issues.apache.org/jira/browse/IO-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238238#comment-13238238
]

Aaron Digulla commented on IO-71:
-

For the record: I had the same issues and couldn't find a solution without a
second thread that doesn't need an unknown amount of memory.

Since this isn't easy to get right, I vote to add this code even though it
seems broken. My rationale is this:

1. The code is useful.
2. It's hard to get right.
3. If there is a better solution (i.e. one without the second thread), the
outside API doesn't change, so it would be easy to fix later.

[io] PipedUtils
---

Key: IO-71
URL: https://issues.apache.org/jira/browse/IO-71
Project: Commons IO
Issue Type: Improvement
Components: Utilities
Environment: Operating System: All
Platform: All
Reporter: David Smiley
Priority: Minor
Fix For: 3.x

Attachments: PipedUtils.zip, ReverseFilterOutputStream.patch

I developed some nifty code that takes an OutputStream and sort of reverses
it as if it were an InputStream. Error passing and handling close is dealt
with. It needs another thread to do the work which runs in parallel. It uses
Piped streams. I created this because I had to conform GZIPOutputStream to my
framework which demanded an InputStream.
See URL to source.
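
The basic technique can be sketched with the JDK's piped streams. This is not the attached PipedUtils code: the class and method names are hypothetical, the input is a plain byte array, and the error passing mentioned above is left out, so a failure in the writer thread only shows up as a truncated stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; not the attached PipedUtils implementation.
class PipedGzipSketch {

    // Returns an InputStream that delivers the gzip-compressed form of data.
    // A second thread pushes the data through GZIPOutputStream into the pipe.
    static InputStream gzipAsInputStream(final byte[] data) throws IOException {
        final PipedInputStream in = new PipedInputStream();
        final PipedOutputStream pipeOut = new PipedOutputStream(in);
        Thread writer = new Thread(() -> {
            // Closing the gzip stream also closes the pipe, which signals EOF to the reader.
            try (GZIPOutputStream gzip = new GZIPOutputStream(pipeOut)) {
                gzip.write(data);
            } catch (IOException e) {
                // Error passing to the reading side is omitted in this sketch.
            }
        });
        writer.setDaemon(true);
        writer.start();
        return in;
    }
}
```

The pipe's fixed-size buffer is what keeps memory use bounded: the writer thread blocks until the reader has consumed enough data.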


[jira] [Commented] (IO-237) Add Additional toFiles() and toURLs() Methods to FileUtils


[ 
https://issues.apache.org/jira/browse/IO-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238246#comment-13238246
 ] 

Aaron Digulla commented on IO-237:
--

I suggest closing this.

With Java 5, you can use the generic {{Arrays.asList()}} for this.

If you need to modify the list, you need to wrap the call in {{new 
ArrayList()}}. Due to internal optimizations, this is a pretty fast operation.
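
A short sketch of both variants, with hypothetical class and method names:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class AsListSketch {

    // Fixed-size view backed by the array: add() and remove() are not supported.
    static List<File> toFixedSizeList(File... files) {
        return Arrays.asList(files);
    }

    // Independent copy that can be modified freely.
    static List<File> toModifiableList(File... files) {
        return new ArrayList<>(Arrays.asList(files));
    }
}
```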

 Add Additional toFiles() and toURLs() Methods to FileUtils
 --

 Key: IO-237
 URL: https://issues.apache.org/jira/browse/IO-237
 Project: Commons IO
  Issue Type: Improvement
  Components: Utilities
Affects Versions: 2.0
 Environment: Java 1.5+
Reporter: Michael Wooten
 Fix For: 3.x

 Attachments: path-convert-fileArray-andURLArray-into-varargs.patch

   Original Estimate: 10h
  Remaining Estimate: 10h

 I suggest modifying the signatures of the toFiles() and toURLs() to use 
 varargs since that approach will automatically accept arrays and also allow 
 the user to send an arbitrary number of them.
 Convert File[] toFiles(URL[]) to File[] toFiles(URL...)
 Convert URL[] toURLs(File[]) to URL[] toURLs(File...)
 I also suggest adding new methods for converting a collection of URLs or 
 Files to an array, or to a List.
 File[] toFiles(Collection<URL>)
 List<File> toFilesList(URL...)
 List<File> toFilesList(Collection<URL>)
 URL[] toURLs(Collection<File>)
 List<URL> toURLsList(File...)
 List<URL> toURLsList(Collection<File>)


[jira] [Commented] (IO-249) Enhance closeQuietly to indicate success


[ 
https://issues.apache.org/jira/browse/IO-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238248#comment-13238248
 ] 

Aaron Digulla commented on IO-249:
--

I also like the handler solution but can we have a factory, please? That way, I 
could move all the logging code into my own handler implementation and just 
call {{handler.close()}} in my {{finally}} block.

 Enhance closeQuietly to indicate success
 

 Key: IO-249
 URL: https://issues.apache.org/jira/browse/IO-249
 Project: Commons IO
  Issue Type: Improvement
  Components: Utilities
Affects Versions: 2.0
Reporter: Paul Benedict
Assignee: Paul Benedict
Priority: Minor
 Fix For: 3.x

 Attachments: IO-249-CloseableHandler.patch


 A convention of some programmers is to emit a log warning when a resource 
 fails to close. Granted, such a condition is an error, but there's no 
 reasonable recourse to the failure. Using IOUtils.closeQuietly() is very 
 useful but all information about the success/failure is hidden. Returning 
 Throwable will give insight into the error for diagnostic purposes. This 
 change will be compatible with today's usage since the method currently 
 returns void.
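
 A minimal sketch of the idea, assuming a hypothetical class name and a method that returns the failure instead of swallowing it. It is not the code from the attached patch.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch; not the attached CloseableHandler patch.
class CloseSketch {

    // Closes the resource and returns the failure, or null if closing
    // succeeded or there was nothing to close.
    static Throwable closeQuietly(Closeable closeable) {
        if (closeable == null) {
            return null;
        }
        try {
            closeable.close();
            return null;
        } catch (IOException | RuntimeException e) {
            return e;
        }
    }
}
```

 Callers that want the warning described above can log the returned value when it is not null; callers that ignore the result behave as before.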


[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type

[
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238278#comment-13238278
]

Sebb commented on IO-315:
-

I don't think this is a good idea.
There are a lot of different encodings, and who is to say which ones are
useful?
There would still need to be a way to use the String encoding to allow for
encodings that are not provided by the interface.

Also, the code would still need to catch {{UnsupportedEncodingException}}.
As far as I know there is no requirement for a Java class-library to support
any specific encodings, though it would be a fairly useless implementation that
did not support UTF-8.

Replace all String encoding parameters with a value type
--

Key: IO-315
URL: https://issues.apache.org/jira/browse/IO-315
Project: Commons IO
Issue Type: New Feature
Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

Please create an interface Encoding plus a set of useful defaults (UTF_8,
ISO_LATIN_1, CP_1250 and CP_1252).
Use this interface in all places where String encoding is used now. This
would make the API more reliable, improve code reuse and reduce futile catch
blocks for {{UnsupportedEncodingException}}.

[jira] [Updated] (IO-255) XML handler to serialize/de-serialize FileEntry instances to/from XML

2012-03-26 Thread Sebb (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/IO-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated IO-255:


Summary: XML handler to serialize/de-serialize FileEntry instances to/from 
XML  (was: XML handler to serialize/de-serialize FlieEntrys to/from XML)

 XML handler to serialize/de-serialize FileEntry instances to/from XML
 -

 Key: IO-255
 URL: https://issues.apache.org/jira/browse/IO-255
 Project: Commons IO
  Issue Type: New Feature
Affects Versions: 2.0
Reporter: Niall Pemberton
Assignee: Niall Pemberton
Priority: Minor
 Attachments: FileEntryXmlHandler.java


 It may be useful to *capture* the state of a Filesystem and serialize it. 
 I've been playing with a handler that can serialize/de-serialize a FileEntry 
 and its children to/from XML.


[jira] [Commented] (JCS-94) LateralTCPService should implement getGroupKeys

2012-03-26 Thread Andrew Leamon (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/JCS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238284#comment-13238284
 ] 

Andrew Leamon commented on JCS-94:
--

Yes.  It's complete.  Planning to submit this week.  I just need to see if
my company requires me to do anything before making the submission...
  Drew

On Fri, Mar 23, 2012 at 11:09 AM, David Wood (Commented) (JIRA) 



 LateralTCPService should implement getGroupKeys
 ---

 Key: JCS-94
 URL: https://issues.apache.org/jira/browse/JCS-94
 Project: Commons JCS
  Issue Type: Improvement
  Components: TCP Lateral Cache
Affects Versions: jcs-1.3, jcs-1.4-dev
Reporter: Andrew Leamon

 Calling getGroupKeys on LateralTCPService throws new 
 UnsupportedOperationException( "Groups not implemented." );


[jira] [Commented] (IO-233) Add Methods for Buffering Streams/Writers To IOUtils


[ 
https://issues.apache.org/jira/browse/IO-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238298#comment-13238298
 ] 

Sebb commented on IO-233:
-

@Aaron
As I understand it, this issue is about adding buffering to I/O classes.
Converting between byte and character oriented classes is somewhat different.

 Add Methods for Buffering Streams/Writers To IOUtils
 

 Key: IO-233
 URL: https://issues.apache.org/jira/browse/IO-233
 Project: Commons IO
  Issue Type: Improvement
  Components: Streams/Writers
Affects Versions: 2.0
 Environment: Java 1.4+
Reporter: Michael Wooten
Priority: Minor
 Fix For: 3.x

   Original Estimate: 2h
  Remaining Estimate: 2h

 I suggest adding utility methods for buffering streams and writers to the 
 IOUtils class. The methods would have the following signatures:
 BufferedInputStream buffer(InputStream inputStream)
 BufferedOutputStream buffer(OutputStream outputStream)
 BufferedReader buffer(Reader reader)
 BufferedWriter buffer(Writer writer)


[jira] [Commented] (JCS-94) LateralTCPService should implement getGroupKeys

2012-03-26 Thread David Wood (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/JCS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238310#comment-13238310
 ] 

David Wood commented on JCS-94:
---

That's great to hear.  And, I understand the issues with the legal department.  
Good luck.

 LateralTCPService should implement getGroupKeys
 ---

 Key: JCS-94
 URL: https://issues.apache.org/jira/browse/JCS-94
 Project: Commons JCS
  Issue Type: Improvement
  Components: TCP Lateral Cache
Affects Versions: jcs-1.3, jcs-1.4-dev
Reporter: Andrew Leamon

 Calling getGroupKeys on LateralTCPService throws new 
 UnsupportedOperationException( "Groups not implemented." );


[jira] [Created] (IO-316) New API: BackupFileWriter

New API: BackupFileWriter
-

 Key: IO-316
 URL: https://issues.apache.org/jira/browse/IO-316
 Project: Commons IO
  Issue Type: Bug
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla
Priority: Minor


Add the new file based I/O class {{BackupFileWriter}} with the following 
properties:

- Saves the file to a temporary name
- Creates backup of existing file on {{close()}}
- Renames temp file to desired name on {{close()}}

The backup strategy (number of backups, backup file name) should be pluggable.

There should also be a hook to compare the temporary and the existing file and 
do the rename only when they are different. The default hook should always 
replace the file.

It should also be possible to override the temporary file name (including the 
path, so the temp file can be in the same directory or a different one or even 
on a different disk).
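
A rough sketch of the core behaviour, using `java.nio.file.Files` from Java 7. The class name, the fixed `.tmp` and `.bak` suffixes and the single-backup policy are assumptions; the pluggable backup strategy, the comparison hook and the configurable temp file location are left out.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the proposed class; not an existing Commons IO API.
class BackupFileWriterSketch extends Writer {
    private final Path target;
    private final Path temp;
    private final Path backup;
    private final Writer delegate;
    private boolean closed;

    BackupFileWriterSketch(Path target, Charset charset) throws IOException {
        this.target = target;
        // Assumed naming scheme: temp and backup files live next to the target.
        this.temp = target.resolveSibling(target.getFileName() + ".tmp");
        this.backup = target.resolveSibling(target.getFileName() + ".bak");
        this.delegate = Files.newBufferedWriter(temp, charset);
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        delegate.write(cbuf, off, len);
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        delegate.close();
        // Keep the previous version as a backup, then move the temp file into place.
        if (Files.exists(target)) {
            Files.move(target, backup, StandardCopyOption.REPLACE_EXISTING);
        }
        Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Until `close()` is called, readers of the target file still see the old content.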


[jira] [Commented] (DBUTILS-88) Make AsyncQueryRunner be a decorator around a QueryRunner

2012-03-26 Thread William R. Speirs (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/DBUTILS-88?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238335#comment-13238335
 ] 

William R. Speirs commented on DBUTILS-88:
--

I'll take another look and commit the patch. However, I've had trouble in the 
past and I'm unable to release a new version of DBUtils. I'll see if I can get 
some help on that and get this done by the end of the week.

 Make AsyncQueryRunner be a decorator around a QueryRunner
 -

 Key: DBUTILS-88
 URL: https://issues.apache.org/jira/browse/DBUTILS-88
 Project: Commons DbUtils
  Issue Type: Task
Reporter: Moandji Ezana
Priority: Minor
 Attachments: AsyncQueryRunner_wraps_QueryRunner.txt, 
 DBUTILS-88v1.patch, DBUTILS-88v2.patch


 AsyncQueryRunner duplicates much of the code in QueryRunner. Would it be 
 possible for AsyncQueryRunner to simply decorate a QueryRunner with async 
 functionality, in the same way a BufferedInputStream might decorate an 
 InputStream?
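
 The decorator idea can be sketched with stand-in types. `SimpleRunner` and `AsyncRunnerSketch` are hypothetical and deliberately much smaller than the real DbUtils classes; the point is only that the asynchronous wrapper delegates every call instead of duplicating the logic.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical stand-in for the synchronous runner; not the DbUtils API.
interface SimpleRunner {
    int update(String sql) throws Exception;
}

// Hypothetical decorator: adds asynchronous execution around any SimpleRunner.
class AsyncRunnerSketch {
    private final SimpleRunner delegate;
    private final ExecutorService executor;

    AsyncRunnerSketch(SimpleRunner delegate, ExecutorService executor) {
        this.delegate = delegate;
        this.executor = executor;
    }

    // Submits the call to the executor; the actual work stays in the delegate.
    Future<Integer> update(final String sql) {
        Callable<Integer> task = () -> delegate.update(sql);
        return executor.submit(task);
    }
}
```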


[jira] [Updated] (IO-287) Use terabyte (TB), petabyte (PB) and exabyte (EB) in FileUtils.byteCountToDisplaySize(long size)

2012-03-26 Thread Gary D. Gregory (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/IO-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory updated IO-287:
---

Summary: Use terabyte (TB), petabyte (PB) and exabyte (EB) in 
FileUtils.byteCountToDisplaySize(long size)  (was: Use terabyte (TB) , petabyte 
(PB) and exabyte (EB) in FileUtils.byteCountToDisplaySize(long size))

Fix typo in issue title.

 Use terabyte (TB), petabyte (PB) and exabyte (EB) in 
 FileUtils.byteCountToDisplaySize(long size)
 

 Key: IO-287
 URL: https://issues.apache.org/jira/browse/IO-287
 Project: Commons IO
  Issue Type: Improvement
  Components: Utilities
Affects Versions: 2.1
 Environment: Apache Maven 3.0.3 (r1075438; 2011-02-28 12:31:09-0500)
 Maven home: C:\Java\apache-maven-3.0.3\bin\..
 Java version: 1.6.0_24, vendor: Sun Microsystems Inc.
 Java home: C:\Program Files\Java\jdk1.6.0_24\jre
 Default locale: en_US, platform encoding: Cp1252
 OS name: windows 7, version: 6.1, arch: amd64, family: windows
Reporter: Gary D. Gregory
 Fix For: 2.2

 Attachments: IO-287-r1200701.patch.txt


 Use terabyte (TB), petabyte (PB) and exabyte (EB) in 
 FileUtils.byteCountToDisplaySize(long size).
 Currently, the code is commented out.


[jira] [Commented] (CSV-82) CSVRecord inconsistent behaviour when header mapping is not found

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/CSV-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238343#comment-13238343
]

Emmanuel Bourg commented on CSV-82:
---

I don't find this inconsistent. There are two cases to handle:

# The header is missing and a field is accessed by name
# The header is present but the column name requested doesn't exist

In case 1 an IllegalStateException is thrown, because the expectation of the
user is severely broken. He attempts to access the fields by name which is
completely impossible.

In case 2 a null value is returned, because some column may be optional. It's
not reasonable to throw an exception in this case, otherwise reading optional
columns in a try catch block will become very cumbersome.

If you want to differentiate between missing columns and null values this can
be done with a CSVRecord#has(String) method. But is it actually useful?
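
The two cases and the suggested has(String) method can be sketched as follows. `RecordSketch` is a hypothetical stand-in, not the actual CSVRecord class.

```java
import java.util.Map;

// Hypothetical stand-in for a CSV record; not the Commons CSV class.
class RecordSketch {
    private final String[] values;
    private final Map<String, Integer> mapping; // null when there is no header

    RecordSketch(String[] values, Map<String, Integer> mapping) {
        this.values = values;
        this.mapping = mapping;
    }

    String get(String name) {
        // Case 1: no header at all, so access by name cannot work.
        if (mapping == null) {
            throw new IllegalStateException("No header mapping was specified");
        }
        // Case 2: header present but column unknown, treated as an optional column.
        Integer index = mapping.get(name);
        return index != null ? values[index] : null;
    }

    // Lets callers tell a missing column apart from a null value.
    boolean has(String name) {
        return mapping != null && mapping.containsKey(name);
    }
}
```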

CSVRecord inconsistent behaviour when header mapping is not found
-

Key: CSV-82
URL: https://issues.apache.org/jira/browse/CSV-82
Project: Commons CSV
Issue Type: Bug
Reporter: Sebb

The CSVRecord#get(String) method has inconsistent behaviour.
If no header mapping was provided, then it throws IllegalStateException.
If the header name is not found, null is returned.
Apart from being inconsistent, it might be useful in the future to be able to
return null as a column value (as distinct from the empty string).
It should throw IllegalArgumentException for a missing header name, instead
of returning null.

[jira] [Commented] (CSV-78) Use Character instead of char for char fields except delimiter

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238354#comment-13238354
 ] 

Emmanuel Bourg commented on CSV-78:
---

I'm fine with Characters instead of primitives if there is no impact on the 
parser performance. This will certainly mean that the primitive values have to 
be cached in the parser.
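
A sketch of what caching the primitive values could mean. `FormatSketch` and `LexerSketch` are hypothetical names and not the Commons CSV classes: the format keeps a nullable Character, while the parser side unboxes it once so the per-character check works on primitives only.

```java
// Hypothetical format holder; null means "no encapsulator".
class FormatSketch {
    private final char delimiter;
    private final Character encapsulator;

    FormatSketch(char delimiter, Character encapsulator) {
        this.delimiter = delimiter;
        this.encapsulator = encapsulator;
    }

    // Passing null removes the encapsulator from the derived format.
    FormatSketch withEncapsulator(Character newEncapsulator) {
        return new FormatSketch(delimiter, newEncapsulator);
    }

    char getDelimiter() {
        return delimiter;
    }

    Character getEncapsulator() {
        return encapsulator;
    }
}

// Hypothetical parser-side cache of the primitive values.
class LexerSketch {
    private final boolean encapsulating;
    private final char encapsulator;

    LexerSketch(FormatSketch format) {
        Character c = format.getEncapsulator();
        this.encapsulating = c != null;
        // Unbox once here instead of on every character read.
        this.encapsulator = encapsulating ? c : '\0';
    }

    boolean isEncapsulator(int c) {
        return encapsulating && c == encapsulator;
    }
}
```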


 Use Character instead of char for char fields except delimiter
 --

 Key: CSV-78
 URL: https://issues.apache.org/jira/browse/CSV-78
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 Apart from the delimiter - which must be specified (obviously) - the other 
 char fields are optional.
 At present it's not possible to create a new format from an existing format 
 and remove (say) the encapsulation character.
 If the parameters were changed to Character instead of char, then it would be 
 possible to pass null.


[jira] [Commented] (CSV-82) CSVRecord inconsistent behaviour when header mapping is not found


[ 
https://issues.apache.org/jira/browse/CSV-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238362#comment-13238362
 ] 

Sebb commented on CSV-82:
-

bq. In case 2 a null value is returned, because some column may be optional.

I don't understand this concept of optional columns.
Surely any CSV header line needs to provide names for all the columns?
Otherwise, what's the point of the header line?

 CSVRecord inconsistent behaviour when header mapping is not found
 -

 Key: CSV-82
 URL: https://issues.apache.org/jira/browse/CSV-82
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 The CSVRecord#get(String) method has inconsistent behaviour.
 If no header mapping was provided, then it throws IllegalStateException.
 If the header name is not found, null is returned.
 Apart from being inconsistent, it might be useful in the future to be able to 
 return null as a column value (as distinct from the empty string).
 It should throw IllegalArgumentException for a missing header name, instead 
 of returning null.


[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type

2012-03-26 Thread Gary D. Gregory (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238371#comment-13238371
 ] 

Gary D. Gregory commented on IO-315:


There are six required encodings: 
http://docs.oracle.com/javase/6/docs/api/java/nio/charset/Charset.html

These are defined in a couple of Commons places:

- [lang]: org.apache.commons.lang3.CharEncoding
- [codec]: org.apache.commons.codec.CharEncoding

Gary

 Replace all String encoding parameters with a value type
 --

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

 Please create an interface Encoding plus a set of useful defaults (UTF_8, 
 ISO_LATIN_1, CP_1250 and CP_1252).
 Use this interface in all places where String encoding is used now. This 
 would make the API more reliable, improve code reuse and reduce futile catch 
 blocks for {{UnsupportedEncodingException}}.


[jira] [Updated] (MATH-773) You should be able to run evolution simulation for a certain amount of time.

2012-03-26 Thread Reid Hochstedler (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/MATH-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Hochstedler updated MATH-773:
--

Attachment: MATH-773.r2.txt

Revision 2, with Thomas's suggestions.

 You should be able to run evolution simulation for a certain amount of time.
 

 Key: MATH-773
 URL: https://issues.apache.org/jira/browse/MATH-773
 Project: Commons Math
  Issue Type: Improvement
Reporter: Reid Hochstedler
  Labels: genetics
 Attachments: MATH-773.r2.txt, MATH-773.txt

   Original Estimate: 2h
  Remaining Estimate: 2h

 You should be able to run your GeneticAlgorithm for a fixed amount of time.
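
A minimal sketch of a time-based stop, assuming hypothetical StopCheck, ElapsedTimeStop and EvolutionLoop types. These are illustrative only and are not taken from Commons Math or from the attached patch:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; these types are illustrative and are not taken from
// Commons Math or from the attached patch.
interface StopCheck {
    boolean isSatisfied();
}

final class ElapsedTimeStop implements StopCheck {
    private final long maxNanos;
    private long startNanos;
    private boolean started;

    ElapsedTimeStop(long maxTime, TimeUnit unit) {
        if (maxTime < 0) {
            throw new IllegalArgumentException("maxTime must be >= 0");
        }
        this.maxNanos = unit.toNanos(maxTime);
    }

    @Override
    public boolean isSatisfied() {
        if (!started) {
            // The clock starts on the first check, not at construction time.
            startNanos = System.nanoTime();
            started = true;
        }
        return System.nanoTime() - startNanos >= maxNanos;
    }
}

final class EvolutionLoop {
    // Runs one "generation" per iteration until the check fires; returns the count.
    static int evolve(Runnable nextGeneration, StopCheck stop) {
        int generations = 0;
        while (!stop.isSatisfied()) {
            nextGeneration.run();
            generations++;
        }
        return generations;
    }
}
```

The check is only evaluated between generations, so a run can overshoot the limit by up to the duration of one generation.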


[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type


[ 
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238378#comment-13238378
 ] 

Sebb commented on IO-315:
-

Thanks for the list.
However, I don't think that makes a significant difference.

 Replace all String encoding parameters with a value type
 --

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

 Please create an interface Encoding plus a set of useful defaults (UTF_8, 
 ISO_LATIN_1, CP_1250 and CP_1252).
 Use this interface in all places where String encoding is used now. This 
 would make the API more reliable, improve code reuse and reduce futile catch 
 blocks for {{UnsupportedEncodingException}}.


[jira] [Commented] (CSV-82) CSVRecord inconsistent behaviour when header mapping is not found

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/CSV-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238440#comment-13238440
]

Emmanuel Bourg commented on CSV-82:
---

The header line should indeed identify all the columns of the file, but the
parsing code might be ready to handle more columns than actually present.

I have a case where a file contains a date column and a time column. The file is
generated by a servlet and, depending on a query parameter, the time column
might be omitted. A single piece of parsing code handles both cases and treats
the time information as optional: it fetches the time column by name and simply
ignores a null value.
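
One way to reconcile the optional-column use case with a strict get(String) is an explicit presence check. The following is only a sketch: RecordSketch and its isMapped method are hypothetical names for illustration, not the Commons CSV CSVRecord class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; RecordSketch is not the Commons CSV CSVRecord class.
final class RecordSketch {
    private final Map<String, Integer> headerIndex; // null when no header mapping
    private final String[] values;

    RecordSketch(String[] header, String[] values) {
        this.values = values;
        if (header == null) {
            this.headerIndex = null;
        } else {
            this.headerIndex = new HashMap<String, Integer>();
            for (int i = 0; i < header.length; i++) {
                headerIndex.put(header[i], i);
            }
        }
    }

    // True only if the column name appears in the header.
    boolean isMapped(String name) {
        return headerIndex != null && headerIndex.containsKey(name);
    }

    // Strict lookup: both failure modes throw, so null never signals a missing column.
    String get(String name) {
        if (headerIndex == null) {
            throw new IllegalStateException("No header mapping was provided");
        }
        Integer index = headerIndex.get(name);
        if (index == null) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        return values[index];
    }
}
```

With such a check, the servlet case above could be written as record.isMapped("time") ? record.get("time") : null, which leaves null free to be used as a real column value later.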

CSVRecord inconsistent behaviour when header mapping is not found
-

Key: CSV-82
URL: https://issues.apache.org/jira/browse/CSV-82
Project: Commons CSV
Issue Type: Bug
Reporter: Sebb

[jira] [Commented] (CSV-72) CSVFormat.DEFAULT should be renamed as RFC4180

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238450#comment-13238450
 ] 

Emmanuel Bourg commented on CSV-72:
---

Because RFC 4180 is the only internet standard related to CSV? Excel is not a 
well defined format, the delimiter is locale specific. It would be a poor 
choice for a default format.

 CSVFormat.DEFAULT should be renamed as RFC4180
 --

 Key: CSV-72
 URL: https://issues.apache.org/jira/browse/CSV-72
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb
Priority: Minor

 CSVFormat.DEFAULT should be renamed as RFC4180.
 It's confusing to use the name DEFAULT.


[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type

[
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238469#comment-13238469
]

Aaron Digulla commented on IO-315:
--

My point is that everyone litters their code with string constants. String
constants are bad for various reasons and APIs should not endorse them. In my
own code, I use an interface so that everyone can add more encodings if they need
to, and afterwards I always know what is an encoding and what is text data
(so no mix-ups like {{FileUtils.write("UTF-8", "Hello, world")}}).

But I agree that commons IO is probably the wrong place to add them. Moving to
commons-lang (which also contains code that handles the exception).
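
A small sketch of the mix-up described above, using a hypothetical EncodeSketch helper (not a Commons IO or Commons Lang class). With a String-typed encoding, swapped arguments compile and only fail at run time. With a Charset-typed parameter, the same swap is a compile error:

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical helpers for illustration; they are not Commons IO methods.
final class EncodeSketch {
    // String-typed encoding: encode("UTF-8", "Hello, world") still compiles.
    static byte[] encode(String text, String encodingName) throws UnsupportedEncodingException {
        return text.getBytes(encodingName);
    }

    // Charset-typed encoding: passing the charset first is rejected by the compiler.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }
}
```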

Replace all String encoding parameters with a value type
--

Key: IO-315
URL: https://issues.apache.org/jira/browse/IO-315
Project: Commons IO
Issue Type: New Feature
Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

[jira] [Created] (LANG-795) Replace all String encoding parameters with Charset

Replace all String encoding parameters with Charset
-

 Key: LANG-795
 URL: https://issues.apache.org/jira/browse/LANG-795
 Project: Commons Lang
  Issue Type: New Feature
  Components: lang.*
Affects Versions: 3.1
Reporter: Aaron Digulla


In several places, String constants are used to specify an encoding for data.

Please add methods that accept {{Charset}} instead, and deprecate the existing 
methods.

The goal of this change is to reduce code smell (using String constants instead 
of a real value type).


[jira] [Commented] (LANG-795) Replace all String encoding parameters with Charset


[ 
https://issues.apache.org/jira/browse/LANG-795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238477#comment-13238477
 ] 

Aaron Digulla commented on LANG-795:


This would be a first step to reduce string constants in other commons projects 
like commons-io (https://issues.apache.org/jira/browse/IO-315)

 Replace all String encoding parameters with Charset
 -

 Key: LANG-795
 URL: https://issues.apache.org/jira/browse/LANG-795
 Project: Commons Lang
  Issue Type: New Feature
  Components: lang.*
Affects Versions: 3.1
Reporter: Aaron Digulla

 In several places, String constants are used to specify an encoding for data.
 Please add methods that accept {{Charset}} instead, and deprecate the 
 existing methods.
 The goal of this change is to reduce code smell (using String constants 
 instead of a real value type).


[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type


[ 
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238480#comment-13238480
 ] 

Aaron Digulla commented on IO-315:
--

If you would rather not add a new interface, how about supporting {{Charset}}? 
For example, it doesn't throw a checked exception, and eventually all the 
methods that accept a String will have to look up a {{Charset}} anyway.

I'll try to convince commons-lang to convert the String constants to Charset 
constants (https://issues.apache.org/jira/browse/LANG-795)

 Replace all String encoding parameters with a value type
 --

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

 Please create an interface Encoding plus a set of useful defaults (UTF_8, 
 ISO_LATIN_1, CP_1250 and CP_1252).
 Use this interface in all places where String encoding is used now. This 
 would make the API more reliable, improve code reuse and reduce futile catch 
 blocks for {{UnsupportedEncodingException}}.


[jira] [Commented] (CSV-77) RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/CSV-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238484#comment-13238484
]

Emmanuel Bourg commented on CSV-77:
---

Section 2.4 relates to a field value, but a blank line doesn't contain any
field. I agree this is open to interpretation.

Ignoring blank lines is probably the best default behavior. I often see csv
files with leading blank lines (because they were generated by a JSP for
example) or trailing blank lines. Getting an empty record in these cases is
never desirable.

Regarding the trimming of values, I agree this should not be part of the
default format.
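
To make the two choices concrete, here is a minimal sketch that skips blank lines but leaves values untrimmed. BlankLineSketch is a hypothetical class: it splits on commas only and does no quote handling, so it is neither a CSV parser nor the Commons CSV implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; it splits on commas only and ignores quoting, so it is
// not a CSV parser and not the Commons CSV implementation.
final class BlankLineSketch {
    static List<String[]> read(String input, boolean ignoreBlankLines) {
        List<String[]> records = new ArrayList<String[]>();
        BufferedReader reader = new BufferedReader(new StringReader(input));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (ignoreBlankLines && line.isEmpty()) {
                    continue; // no record is produced for a completely empty line
                }
                // Limit -1 keeps trailing empty fields; values are not trimmed.
                records.add(line.split(",", -1));
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return records;
    }
}
```

With ignoreBlankLines set to false, each empty line yields a record holding a single empty value, which is the case described above as never desirable.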

RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines
---

Key: CSV-77
URL: https://issues.apache.org/jira/browse/CSV-77
Project: Commons CSV
Issue Type: Bug
Reporter: Sebb

According to the RFC [1]:
Section 2.4
{quote}
... Spaces are considered part
of a field and should not be ignored.
{quote}
Also, the RFC does not mention that blank lines are to be ignored.
However, some of the alternate CSV specifications referenced by RFC 4180 do
say that spaces are ignored.
I've not yet found a mention to say that blank lines are to be ignored.
[1] http://tools.ietf.org/html/rfc4180

[jira] [Commented] (CSV-73) HSQLDB supports two different field separators

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-73?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238529#comment-13238529
 ] 

Emmanuel Bourg commented on CSV-73:
---

I don't understand the problem they tried to solve. The explanation is not very 
clear:

bq. Since HSQLDB treats CHAR and VARCHAR strings the same, the ability to 
assign a different separator to the latter is provided

 HSQLDB supports two different field separators
 --

 Key: CSV-73
 URL: https://issues.apache.org/jira/browse/CSV-73
 Project: Commons CSV
  Issue Type: New Feature
Reporter: Sebb

 HSQLDB supports a second field separator for VARCHAR fields according to:
 http://www.hsqldb.org/doc/2.0/guide/texttables-chapt.html#ttc_configuration
 Do we want to implement this?


[jira] [Issue Comment Edited] (IO-315) Replace all String encoding parameters with a value type

2012-03-26 Thread Sebb (Issue Comment Edited) (JIRA)


[ 
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238598#comment-13238598
 ] 

Sebb edited comment on IO-315 at 3/26/12 5:43 PM:
--

That makes more sense now, but I think it would be overkill to introduce a new 
interface here.

Using Charset would be better IMO.

Using Charset would convert the checked {{UnsupportedEncodingException}} into 
the unchecked {{UnsupportedCharsetException}}.
This should simplify application code that does not already catch 
{{IOException}}, though of course in Commons IO many methods throw IOE already.

AFAICT, parameters would need to be changed to use (e.g.) 
{{Charset.forName("UTF-8")}} instead of {{"UTF-8"}} so user code would be 
slightly longer.

  was (Author: s...@apache.org):
That makes a more sense now, but I think it would be overkill to introduce 
a new interface here.

Using Charset would be better IMO.

Using Charset would convert the checked {{UnsupportedEncodingException}} into 
the unchecked {{UnsupportedCharsetException}}.
This should simplify application code that does not already catch 
{{IOException}}, though of course in Commons IO many methods throw IOE already.

AFAICT, parameters would need to be changed to use (e.g.) 
{{Charset.forName("UTF-8")}} instead of {{"UTF-8"}} so user code would be 
slightly longer.
  
 Replace all String encoding parameters with a value type
 --

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

 Please create an interface Encoding plus a set of useful defaults (UTF_8, 
 ISO_LATIN_1, CP_1250 and CP_1252).
 Use this interface in all places where String encoding is used now. This 
 would make the API more reliable, improve code reuse and reduce futile catch 
 blocks for {{UnsupportedEncodingException}}.


[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type


[ 
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238598#comment-13238598
 ] 

Sebb commented on IO-315:
-

That makes a more sense now, but I think it would be overkill to introduce a 
new interface here.

Using Charset would be better IMO.

Using Charset would convert the checked {{UnsupportedEncodingException}} into 
the unchecked {{UnsupportedCharsetException}}.
This should simplify application code that does not already catch 
{{IOException}}, though of course in Commons IO many methods throw IOE already.

AFAICT, parameters would need to be changed to use (e.g.) 
{{Charset.forName("UTF-8")}} instead of {{"UTF-8"}} so user code would be 
slightly longer.
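
The difference in exception handling can be sketched with plain JDK calls. CharsetSketch is a hypothetical holder class; String.getBytes(String), String.getBytes(Charset) and Charset.forName are the standard JDK methods:

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical holder class; only standard JDK calls are used.
final class CharsetSketch {
    // String name: the checked exception must be handled even for "UTF-8",
    // which every Java platform is required to support.
    static byte[] withName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is a required charset", e);
        }
    }

    // Charset: no checked exception. An unknown name would fail inside
    // Charset.forName with the unchecked UnsupportedCharsetException.
    static byte[] withCharset(String text) {
        return text.getBytes(Charset.forName("UTF-8"));
    }
}
```

Both methods produce the same bytes; only the second avoids the catch block.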

 Replace all String encoding parameters with a value type
 --

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

 Please create an interface Encoding plus a set of useful defaults (UTF_8, 
 ISO_LATIN_1, CP_1250 and CP_1252).
 Use this interface in all places where String encoding is used now. This 
 would make the API more reliable, improve code reuse and reduce futile catch 
 blocks for {{UnsupportedEncodingException}}.


[jira] [Commented] (CSV-64) CSVPrinter does not distinguish null and the empty string

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238610#comment-13238610
 ] 

Emmanuel Bourg commented on CSV-64:
---

PostgreSQL can also use a special string for null values:

http://www.postgresql.org/docs/current/static/sql-copy.html

It supports two default text formats, a tab delimited format and a comma 
delimited format. The tab delimited format uses \N for null values. The comma 
delimited format uses an empty string.
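
A minimal sketch of a printer with a configurable null string, in the spirit of the PostgreSQL text format. NullAwarePrinter is a hypothetical class for illustration; it is not the Commons CSV CSVPrinter and does no quoting or escaping:

```java
// Hypothetical sketch; NullAwarePrinter is not the Commons CSV CSVPrinter and
// does no quoting or escaping.
final class NullAwarePrinter {
    private final char delimiter;
    private final String nullString; // text written in place of a null value

    NullAwarePrinter(char delimiter, String nullString) {
        if (nullString == null) {
            throw new IllegalArgumentException("nullString must not be null");
        }
        this.delimiter = delimiter;
        this.nullString = nullString;
    }

    String format(String... values) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            line.append(values[i] == null ? nullString : values[i]);
        }
        return line.toString();
    }
}
```

With a tab delimiter and "\\N" as the null string, a null and an empty string print differently. With an empty null string they collapse to the same output, which is the ambiguity this issue describes.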

 CSVPrinter does not distinguish null and the empty string
 -

 Key: CSV-64
 URL: https://issues.apache.org/jira/browse/CSV-64
 Project: Commons CSV
  Issue Type: Bug
  Components: Printer
Reporter: Sebb
 Fix For: 1.x


 CSVPrinter does not distinguish null and the empty string.
 There should be a way to denote that a string is null rather than empty.


[jira] [Commented] (CSV-73) HSQLDB supports two different field separators


[ 
https://issues.apache.org/jira/browse/CSV-73?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238623#comment-13238623
 ] 

Sebb commented on CSV-73:
-

Nor do I, but for whatever reason, they do use two different separators.

 HSQLDB supports two different field separators
 --

 Key: CSV-73
 URL: https://issues.apache.org/jira/browse/CSV-73
 Project: Commons CSV
  Issue Type: New Feature
Reporter: Sebb

 HSQLDB supports a second field separator for VARCHAR fields according to:
 http://www.hsqldb.org/doc/2.0/guide/texttables-chapt.html#ttc_configuration
 Do we want to implement this?


[jira] [Updated] (CSV-73) HSQLDB supports two different field separators

2012-03-26 Thread Sebb (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/CSV-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CSV-73:


Description: 
HSQLDB supports a second field separator for VARCHAR fields according to:

http://www.hsqldb.org/doc/2.0/guide/texttables-chapt.html#ttc_configuration

Do we want to implement this?

  was:
HSQLDB supports a second field separator for VARCHAR fields according to:

http://www.hsqldb.org/doc/2.0/guide/texttables-chapt.html#ttc_configuration

Do we want to implrment this?


 HSQLDB supports two different field separators
 --

 Key: CSV-73
 URL: https://issues.apache.org/jira/browse/CSV-73
 Project: Commons CSV
  Issue Type: New Feature
Reporter: Sebb

 HSQLDB supports a second field separator for VARCHAR fields according to:
 http://www.hsqldb.org/doc/2.0/guide/texttables-chapt.html#ttc_configuration
 Do we want to implement this?


[jira] [Commented] (CSV-77) RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/CSV-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238630#comment-13238630
]

Emmanuel Bourg commented on CSV-77:
---

DEFAULT follows RFC 4180 for printing. For parsing it makes a convenient choice
that is unspecified by the RFC, but I think it's in line with the
interoperability considerations at the end of the specification:

bq. Due to lack of a single specification, there are considerable differences
among implementations. Implementors should "be conservative in what you do, be
liberal in what you accept from others" when processing CSV files.

RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines
---

Key: CSV-77
URL: https://issues.apache.org/jira/browse/CSV-77
Project: Commons CSV
Issue Type: Bug
Reporter: Sebb

[jira] [Commented] (IO-315) Replace all String encoding parameters with a value type

2012-03-26 Thread Gary D. Gregory (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/IO-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238635#comment-13238635
 ] 

Gary D. Gregory commented on IO-315:


FYI: I'm experimenting with a Charsets constant class in [codec] now (not 
committed). 
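
For readers following along, such a constants class might look roughly like this. The class was not committed at the time of this comment, so the name CharsetConstants and the layout are assumptions. The six names are the charsets that the Charset Javadoc linked earlier in this thread lists as required on every Java platform:

```java
import java.nio.charset.Charset;

// Hypothetical sketch of a constants holder; the name and layout are
// assumptions, not the class being experimented with in [codec].
final class CharsetConstants {
    static final Charset US_ASCII = Charset.forName("US-ASCII");
    static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");
    static final Charset UTF_8 = Charset.forName("UTF-8");
    static final Charset UTF_16BE = Charset.forName("UTF-16BE");
    static final Charset UTF_16LE = Charset.forName("UTF-16LE");
    static final Charset UTF_16 = Charset.forName("UTF-16");

    private CharsetConstants() {
        // constants only, no instances
    }
}
```

Callers could then write text.getBytes(CharsetConstants.UTF_8) without a string literal or a catch block.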

 Replace all String encoding parameters with a value type
 --

 Key: IO-315
 URL: https://issues.apache.org/jira/browse/IO-315
 Project: Commons IO
  Issue Type: New Feature
  Components: Streams/Writers
Affects Versions: 2.1
Reporter: Aaron Digulla

 Please create an interface Encoding plus a set of useful defaults (UTF_8, 
 ISO_LATIN_1, CP_1250 and CP_1252).
 Use this interface in all places where String encoding is used now. This 
 would make the API more reliable, improve code reuse and reduce futile catch 
 blocks for {{UnsupportedEncodingException}}.


[jira] [Resolved] (CSV-77) RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines


 [ 
https://issues.apache.org/jira/browse/CSV-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved CSV-77.
-

Resolution: Fixed

 RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines
 ---

 Key: CSV-77
 URL: https://issues.apache.org/jira/browse/CSV-77
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 According to the RFC [1]:
 Section 2.4
 {quote}
 ... Spaces are considered part
 of a field and should not be ignored.
 {quote}
 Also, the RFC does not mention that blank lines are to be ignored.
 However, some of the alternate CSV specifications referenced by RFC 4180 do 
 say that spaces are ignored.
 I've not yet found a mention to say that blank lines are to be ignored.
 [1] http://tools.ietf.org/html/rfc4180


[jira] [Resolved] (CSV-72) CSVFormat.DEFAULT should be renamed as RFC4180


 [ 
https://issues.apache.org/jira/browse/CSV-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved CSV-72.
-

Resolution: Fixed

Added new RFC4180 constant

 CSVFormat.DEFAULT should be renamed as RFC4180
 --

 Key: CSV-72
 URL: https://issues.apache.org/jira/browse/CSV-72
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb
Priority: Minor

 CSVFormat.DEFAULT should be renamed as RFC4180.
 It's confusing to use the name DEFAULT.


[jira] [Commented] (IO-286) FastByteArrayStream implementations to replace syncronized JDK ByteArrayStream

2012-03-26 Thread Paul Loy (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/IO-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238668#comment-13238668
]

Paul Loy commented on IO-286:
-

It's funny you should say #3. This is a direct copy from the existing code
which does this. However I have already begun to optimize this code. I also
noticed that the write methods are maintaining 2 counters when only 1 is really
needed.

Regarding protected void writeBytes(byte[] b, int off, int len): I think this
should not be there.

I'll send a new patch tonight with all these.

FastByteArray*Stream implementations to replace syncronized JDK
ByteArray*Stream

Key: IO-286
URL: https://issues.apache.org/jira/browse/IO-286
Project: Commons IO
Issue Type: New Feature
Components: Streams/Writers
Reporter: Paul Loy
Priority: Minor
Labels: streams, synchronized
Attachments: FastByteArrayOutputStream_commons-io.patch,
FastByteArrayOutputStream_commons-io[2].patch

Original Estimate: 24h
Remaining Estimate: 24h

In CASSANDRA-2820 I reintroduced the FastByteArrayInputStream and
FastByteArrayOutputStream to cassandra. These streams are un-synchronized
versions of the Apache Harmony ByteArrayInputStream and ByteArrayOutputStream
respectively.
During my own testing of the streams I found a big difference in the
performance of the standard JDK BA*S streams and the FBA*S streams on most
JREs. Then cassandra load testing also showed an up to 10% improvement in
cassandra performance using these streams.
Then Thrift has TByteArrayOutputStream which contains a way to get the
underlying byte[] buffer without a deep copy that would probably be a good
further enhancement.
Patch to follow.
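
For illustration only, an unsynchronized output stream of the kind described could be sketched as follows. UnsyncByteArrayOutputStream is a hypothetical class written for this example. It is not the code from the attached patches, it is not thread-safe, and it returns a copy of the buffer rather than the underlying array:

```java
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical sketch; this is not the code from the attached patches.
final class UnsyncByteArrayOutputStream extends OutputStream {
    private byte[] buf;
    private int count; // single counter: number of valid bytes in buf

    UnsyncByteArrayOutputStream(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Negative capacity: " + initialCapacity);
        }
        buf = new byte[initialCapacity];
    }

    private void ensureCapacity(int minCapacity) {
        if (minCapacity > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, minCapacity));
        }
    }

    @Override
    public void write(int b) { // note: no synchronized keyword
        ensureCapacity(count + 1);
        buf[count++] = (byte) b;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        ensureCapacity(count + len);
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

    // Returns a copy of the written bytes.
    byte[] toByteArray() {
        return Arrays.copyOf(buf, count);
    }

    int size() {
        return count;
    }
}
```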

[jira] [Resolved] (CSV-54) Confusing semantic of the ignore leading/trailing spaces parameters


 [ 
https://issues.apache.org/jira/browse/CSV-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved CSV-54.
-

Resolution: Fixed

 Confusing semantic of the ignore leading/trailing spaces parameters
 ---

 Key: CSV-54
 URL: https://issues.apache.org/jira/browse/CSV-54
 Project: Commons CSV
  Issue Type: Bug
  Components: Parser
Reporter: Emmanuel Bourg
 Fix For: 1.0


 {{CSVFormat}} has two parameters to control how the leading and trailing 
 spaces around values are handled, but the actual behavior depends on the 
 value being enclosed in quotes or not.
 If the value is not enclosed in quotes, setting 
 {{leading/trailingSpacesIgnored}} to {{true}} will left or right trim the 
 value. For example with this input (using the default format):
 {code}a,  b  ,c{code}
 the second value will be equal to {{'b'}}.
 But if the value is enclosed in quotes, the value is no longer trimmed:
 {code}a," b ",c{code}
 this will give {{' b '}}.
 With quoted values the parser actually ignores the spaces between the 
 delimiter and the quote. Thus with this input:
 {code}a,  " b "  ,c{code}
 The value returned is {{' b '}}.
 If {{leading/trailingSpacesIgnored}} is set to {{false}}, we get instead {{' 
  b  '}} which is consistent with RFC 4180.


[jira] [Commented] (MATH-772) Change genetics component API to allow for different types of CrossoverPolicys and SelectionPolicys.

2012-03-26 Thread Reid Hochstedler (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/MATH-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238730#comment-13238730
 ] 

Reid Hochstedler commented on MATH-772:
---

A use case for more than two parents during crossover would be the [Three 
parent 
crossover|http://en.wikipedia.org/wiki/Crossover_(genetic_algorithm)#Three_parent_crossover]
 described on Wikipedia. However, I don't believe that is a common use case, so 
I've reworked my patch to allow only two-parent crossover. I have, however, 
kept the change to the SelectionPolicy API; my thinking is that allowing a 
larger pool of potential parents allows for more diversity.

 Change genetics component API to allow for different types of 
 CrossoverPolicys and SelectionPolicys.
 

 Key: MATH-772
 URL: https://issues.apache.org/jira/browse/MATH-772
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
Reporter: Reid Hochstedler
  Labels: api-change
 Attachments: MATH-772.txt





[jira] [Updated] (MATH-772) Change genetics component API to allow for different types of CrossoverPolicys and SelectionPolicys.

2012-03-26 Thread Reid Hochstedler (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/MATH-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Hochstedler updated MATH-772:
--

Attachment: MATH-772.r2.txt

 Change genetics component API to allow for different types of 
 CrossoverPolicys and SelectionPolicys.
 

 Key: MATH-772
 URL: https://issues.apache.org/jira/browse/MATH-772
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
Reporter: Reid Hochstedler
  Labels: api-change
 Attachments: MATH-772.r2.txt, MATH-772.txt





[jira] [Commented] (CSV-86) Remove current character input parameter from Lexer methods


[ 
https://issues.apache.org/jira/browse/CSV-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238827#comment-13238827
 ] 

Sebb commented on CSV-86:
-

I cannot see the point of dropping the current char from the simpleTokenLexer 
method call if the first thing it does is fetch the char by invoking 
in.readAgain().

 Remove current character input parameter from Lexer methods
 ---

 Key: CSV-86
 URL: https://issues.apache.org/jira/browse/CSV-86
 Project: Commons CSV
  Issue Type: Sub-task
  Components: Parser
Reporter: Benedikt Ritter
 Fix For: 1.0

 Attachments: CSV-86.patch





[jira] [Commented] (CSV-85) Allow comments to be returned in CSVRecord

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238831#comment-13238831
 ] 

Emmanuel Bourg commented on CSV-85:
---

How do you define a comment associated with a record?

 Allow comments to be returned in CSVRecord
 --

 Key: CSV-85
 URL: https://issues.apache.org/jira/browse/CSV-85
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 It might be useful to provide a comment field in the CSVRecord class.
 This would be null if no comment is present for that record.
 A line consisting of only a comment would have a size() of 0.


[jira] [Commented] (CSV-84) Comment handling

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/CSV-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238843#comment-13238843
]

Emmanuel Bourg commented on CSV-84:
---

I have never seen trailing comments in the wild. Is there an application
that actually produces this kind of file?

Regarding case 3, I would handle it exactly like case 2. There is no point
in returning an empty line. Empty lines are useless in almost every case.

Comment handling

Key: CSV-84
URL: https://issues.apache.org/jira/browse/CSV-84
Project: Commons CSV
Issue Type: New Feature
Reporter: Sebb

Comment handling is not currently fully documented / tested.
There are several possible situations:
1) trailing comment following one or more values
2) comment marker starts in the first column
3) comment marker starts in the first non-whitespace column
How should these be handled?
1) The trailing comment should be ignored
2) Entire line should be ignored, i.e. don't treat it as a blank line
3) This is trickier: if whitespace is being trimmed, treat as 2, else treat
as 1, i.e. there is a single value containing whitespace
It might also be useful to consider returning comments to the application
program as part of CSVRecord.

[jira] [Commented] (CSV-83) Provide a header encapsulator class

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238848#comment-13238848
 ] 

Emmanuel Bourg commented on CSV-83:
---

For what concrete purpose would this be useful?

 Provide a header encapsulator class
 ---

 Key: CSV-83
 URL: https://issues.apache.org/jira/browse/CSV-83
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 Might be useful to have a class to encapsulate the headers, rather than 
 assuming they are stored as a map.
 This would allow a more flexible implementation if required.
 Methods which it would be useful to have:
 - getIndex(name) - could return -1 for unknown name?
 - iterator()
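
The two methods listed in the description could look roughly like the following. This is a minimal sketch under stated assumptions, not a proposed Commons CSV class: the name HeaderSketch is hypothetical, the backing LinkedHashMap is just one possible storage choice, and duplicate column names are not handled.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical header wrapper; callers never see the backing map. */
final class HeaderSketch implements Iterable<String> {

    // Column name -> zero-based column index, kept in file order.
    private final Map<String, Integer> indexByName = new LinkedHashMap<>();

    HeaderSketch(String... names) {
        for (int i = 0; i < names.length; i++) {
            indexByName.put(names[i], i);
        }
    }

    /** Returns the column index, or -1 for an unknown name. */
    int getIndex(String name) {
        Integer index = indexByName.get(name);
        return index == null ? -1 : index;
    }

    /** Iterates over the column names in file order; removal is not supported. */
    @Override
    public Iterator<String> iterator() {
        return Collections.unmodifiableSet(indexByName.keySet()).iterator();
    }
}
```

Because the map is private, the storage could later be swapped (for example for an array plus a lookup table) without touching callers.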


[jira] [Commented] (CSV-76) Move CSVParser.getRecord() into a separate class?

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238855#comment-13238855
 ] 

Emmanuel Bourg commented on CSV-76:
---

I don't see the need to extract a 30-line method from a class with 150 
lines of code. CSVParser has a very reasonable size and is easy to read. Moving 
getRecord() to a separate class will make the code more difficult to follow.

 Move CSVParser.getRecord() into a separate class?
 -

 Key: CSV-76
 URL: https://issues.apache.org/jira/browse/CSV-76
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb
Priority: Minor

 Much of the CSVParser code is generic and does not depend on the parser 
 implementation.
 Moving the getRecord() method to its own class would make it easier to 
 follow, and would make it easier to provide alternative implementations.


[jira] [Commented] (CSV-85) Allow comments to be returned in CSVRecord


[ 
https://issues.apache.org/jira/browse/CSV-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238863#comment-13238863
 ] 

Sebb commented on CSV-85:
-

{code}
a,b,c#Inline comment (1)
#Separate comment (2)
d,e,f,#Inline comment (3)
{code}

In the first case, the record would have 3 values and a comment, in the second, 
0 values and a comment, and the 3rd would have 4 values and a comment.

At present, the code treats the comment in (1) as part of the 3rd value. It 
does recognise the comment in case (3), but reports 3 values.
Maybe only separate comment lines should be allowed, but that is not the case 
currently.

 Allow comments to be returned in CSVRecord
 --

 Key: CSV-85
 URL: https://issues.apache.org/jira/browse/CSV-85
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 It might be useful to provide a comment field in the CSVRecord class.
 This would be null if no comment is present for that record.
 A line consisting of only a comment would have a size() of 0.


[jira] [Commented] (CSV-84) Comment handling

[
https://issues.apache.org/jira/browse/CSV-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238874#comment-13238874
]

Sebb commented on CSV-84:
-

I don't know if there are formats with inline comments.
The code currently recognises comments at the start of a value as well as at
the start of a record.
This behaviour may be unintentional.

I agree that empty lines are generally not needed, but a comment line may not
be useless.

Comment handling

Key: CSV-84
URL: https://issues.apache.org/jira/browse/CSV-84
Project: Commons CSV
Issue Type: New Feature
Reporter: Sebb

[jira] [Commented] (CSV-83) Provide a header encapsulator class


[ 
https://issues.apache.org/jira/browse/CSV-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238876#comment-13238876
 ] 

Sebb commented on CSV-83:
-

bq. This would allow a more flexible implementation if required.

 Provide a header encapsulator class
 ---

 Key: CSV-83
 URL: https://issues.apache.org/jira/browse/CSV-83
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 Might be useful to have a class to encapsulate the headers, rather than 
 assuming they are stored as a map.
 This would allow a more flexible implementation if required.
 Methods which it would be useful to have:
 - getIndex(name) - could return -1 for unknown name?
 - iterator()


[jira] [Commented] (CSV-84) Comment handling

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238879#comment-13238879
 ] 

Emmanuel Bourg commented on CSV-84:
---

That's most likely unintentional. This case is not covered by the tests, and 
CSVPrinter always outputs the comments on a new line.

 Comment handling
 

 Key: CSV-84
 URL: https://issues.apache.org/jira/browse/CSV-84
 Project: Commons CSV
  Issue Type: New Feature
Reporter: Sebb

 Comment handling is not currently fully documented / tested.
 There are several possible situations:
 1) trailing comment following one or more values
 2) comment marker starts in the first column
 3) comment marker starts in the first non-whitespace column
 How should these be handled?
 1) The trailing comment should be ignored
 2) Entire line should be ignored, i.e. don't treat it as a blank line
 3) This is trickier: if whitespace is being trimmed, treat as 2, else treat 
 as 1, i.e. there is a single value containing whitespace
 It might also be useful to consider returning comments to the application 
 program as part of CSVRecord.


[jira] [Updated] (CSV-84) Clarify comment handling

2012-03-26 Thread Emmanuel Bourg (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/CSV-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Bourg updated CSV-84:
--

Fix Version/s: 1.0
   Issue Type: Improvement  (was: New Feature)
  Summary: Clarify comment handling  (was: Comment handling)

 Clarify comment handling
 

 Key: CSV-84
 URL: https://issues.apache.org/jira/browse/CSV-84
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb
 Fix For: 1.0


 Comment handling is not currently fully documented / tested.
 There are several possible situations:
 1) trailing comment following one or more values
 2) comment marker starts in the first column
 3) comment marker starts in the first non-whitespace column
 How should these be handled?
 1) The trailing comment should be ignored
 2) Entire line should be ignored, i.e. don't treat it as a blank line
 3) This is trickier: if whitespace is being trimmed, treat as 2, else treat 
 as 1, i.e. there is a single value containing whitespace
 It might also be useful to consider returning comments to the application 
 program as part of CSVRecord.


[jira] [Commented] (CSV-85) Allow comments to be returned in CSVRecord

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/CSV-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238900#comment-13238900
]

Emmanuel Bourg commented on CSV-85:
---

Assuming that inline comments are dropped, I would define a record comment as
the concatenation of the comment lines preceding the current record in the file
(up to the previous record).

1. The simple case, a one line comment:

{code}
# Comment
a,b,c,e,f
{code}

Expected record comment: Comment (the comment char and the leading space are
removed)

2. A comment on multiple lines

{code:none}
# Very Long
# Comment
a,b,c,e,f
{code}

Expected record comment: Very Long\nComment

3. A comment on multiple lines, with blank lines interleaved

{code:none}
# Very Long

# Comment
a,b,c,e,f
{code}

Here the result depends on the empty line handling. If empty lines are ignored
this is equivalent to the previous case. If empty lines are not ignored, this
will produce two records, the first one with the comment Very Long and the
second one with Comment.
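
The three cases above can be sketched as follows. This is only an illustration of the proposed definition, under stated assumptions: the names RecordCommentSketch, Rec and read are hypothetical, records are kept as raw lines rather than parsed values, only one leading space after the comment character is stripped, and this is not the Commons CSV parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "record comment = preceding comment lines joined by \n". */
final class RecordCommentSketch {

    /** A raw record line plus the comment collected before it (null if none). */
    static final class Rec {
        final String line;
        final String comment;

        Rec(String line, String comment) {
            this.line = line;
            this.comment = comment;
        }
    }

    static List<Rec> read(List<String> lines, char commentChar, boolean ignoreEmptyLines) {
        List<Rec> records = new ArrayList<>();
        StringBuilder pending = null; // comment lines seen since the previous record
        for (String line : lines) {
            if (!line.isEmpty() && line.charAt(0) == commentChar) {
                // Drop the comment char and a single leading space.
                String text = line.substring(1);
                if (text.startsWith(" ")) {
                    text = text.substring(1);
                }
                if (pending == null) {
                    pending = new StringBuilder(text);
                } else {
                    pending.append('\n').append(text);
                }
                continue;
            }
            if (line.isEmpty() && ignoreEmptyLines) {
                continue; // the pending comment carries over to the next record
            }
            // An empty line that is not ignored becomes a record of its own.
            records.add(new Rec(line, pending == null ? null : pending.toString()));
            pending = null;
        }
        return records;
    }
}
```

With empty lines ignored, case 3 yields one record whose comment is "Very Long\nComment"; with empty lines kept, it yields two records, matching the description above.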

Allow comments to be returned in CSVRecord
--

Key: CSV-85
URL: https://issues.apache.org/jira/browse/CSV-85
Project: Commons CSV
Issue Type: Improvement
Reporter: Sebb

It might be useful to provide a comment field in the CSVRecord class.
This would be null if no comment is present for that record.
A line consisting of only a comment would have a size() of 0.

[jira] [Commented] (CSV-85) Allow comments to be returned in CSVRecord


[ 
https://issues.apache.org/jira/browse/CSV-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238916#comment-13238916
 ] 

Sebb commented on CSV-85:
-

OK, that makes sense, and should be relatively easy to implement.

The only detail still missing is what to do about the header record (if 
specified).

It would be simplest to just treat it as a normal record, i.e. comments would 
be ignored when evaluating the column names.


 Allow comments to be returned in CSVRecord
 --

 Key: CSV-85
 URL: https://issues.apache.org/jira/browse/CSV-85
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 It might be useful to provide a comment field in the CSVRecord class.
 This would be null if no comment is present for that record.
 A line consisting of only a comment would have a size() of 0.


[jira] [Commented] (FILEUPLOAD-202) org.apache.commons.fileupload.FileUploadBase$IOFileUploadException: Processing of multipart/form-data request failed. Stream ended unexpectedly

2012-03-26 Thread Brett Okken (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/FILEUPLOAD-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238919#comment-13238919
 ] 

Brett Okken commented on FILEUPLOAD-202:


Many connectors can hold a backlog of pending connections. This is the backlog 
argument of the [ServerSocket 
constructor|http://docs.oracle.com/javase/6/docs/api/java/net/ServerSocket.html#ServerSocket(int,%20int)]
 in Java and the 
[ListenBacklog|http://httpd.apache.org/docs/2.2/mod/mpm_common.html#listenbacklog]
 directive in the Apache httpd server.
I would not be surprised if the 350 concurrent requests simply caused 94 
requests to go into backlog until threads freed up.
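
As a rough illustration of the backlog mentioned above: in Java, the second argument of the two-argument ServerSocket constructor is the requested maximum length of the queue of pending connections. The class name BacklogSketch and the values passed to it are hypothetical, the operating system may treat the backlog only as a hint, and this is not how any particular servlet container configures its connector.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper showing where the connection backlog is set in plain Java. */
final class BacklogSketch {

    private BacklogSketch() {
    }

    /**
     * Opens a listening socket.
     *
     * @param port    port to bind, or 0 to let the system pick a free one
     * @param backlog requested maximum number of connections waiting to be accepted
     */
    static ServerSocket open(int port, int backlog) throws IOException {
        // Connections arriving while all worker threads are busy wait in this
        // queue until accept() is called; they are not refused outright.
        return new ServerSocket(port, backlog);
    }
}
```

Connections sitting in that queue have not been accepted yet, so from the load generator's side they look like slow requests rather than failures.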

 org.apache.commons.fileupload.FileUploadBase$IOFileUploadException: 
 Processing of multipart/form-data request failed. Stream ended unexpectedly
 ---

 Key: FILEUPLOAD-202
 URL: https://issues.apache.org/jira/browse/FILEUPLOAD-202
 Project: Commons FileUpload
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: tina
  Labels: fileupload
 Fix For: 1.2.1

   Original Estimate: 1h
  Remaining Estimate: 1h

 I used this one to write the servlet
 http://www.servletworld.com/servlet-tutorials/servlet-file-upload-example.html
 I can successfully upload the file through localhost, however, when I use 
 Jmeter to test the app server using 300 threads, it
 will report this error:
 [10:40:23.577] {http--8080-244$1283730842} 
 WebApp[http://localhost:8080/OrderFile] CommonsFileUploadServlet: Error 
 encountered while parsing the request
 [10:40:23.577] {http--8080-244$1283730842} 
 org.apache.commons.fileupload.FileUploadBase$IOFileUploadException: 
 Processing of multipart/form-data request failed. Stream ended unexpectedly
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:371)
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:126)
 [10:40:23.577] {http--8080-244$1283730842}at 
 CommonsFileUploadServlet.doPost(CommonsFileUploadServlet.java:66)
 [10:40:23.577] {http--8080-244$1283730842}at 
 javax.servlet.http.HttpServlet.service(HttpServlet.java:153)
 [10:40:23.577] {http--8080-244$1283730842}at 
 javax.servlet.http.HttpServlet.service(HttpServlet.java:91)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:103)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:187)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:265)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:273)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.server.port.TcpConnection.run(TcpConnection.java:682)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:743)
 [10:40:23.577] {http--8080-244$1283730842}at 
 com.caucho.util.ThreadPool$Item.run(ThreadPool.java:662)
 [10:40:23.577] {http--8080-244$1283730842}at 
 java.lang.Thread.run(Thread.java:619)
 [10:40:23.577] {http--8080-244$1283730842} Caused by: 
 org.apache.commons.fileupload.MultipartStream$MalformedStreamException: 
 Stream ended unexpectedly
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.MultipartStream$ItemInputStream.makeAvailable(MultipartStream.java:982)
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.MultipartStream$ItemInputStream.read(MultipartStream.java:886)
 [10:40:23.577] {http--8080-244$1283730842}at 
 java.io.FilterInputStream.read(FilterInputStream.java:116)
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.util.LimitedInputStream.read(LimitedInputStream.java:125)
 [10:40:23.577] {http--8080-244$1283730842}at 
 java.io.FilterInputStream.read(FilterInputStream.java:90)
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.util.Streams.copy(Streams.java:96)
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.util.Streams.copy(Streams.java:66)
 [10:40:23.577] {http--8080-244$1283730842}at 
 org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:366)
 [10:40:23.577] {http--8080-244$1283730842}... 12 more
 Is it because of the size limit? The uploaded file size is 8KB.
 Thanks!


[jira] [Commented] (CSV-85) Allow comments to be returned in CSVRecord

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238931#comment-13238931
 ] 

Emmanuel Bourg commented on CSV-85:
---

Ok for ignoring the comments before the header.

 Allow comments to be returned in CSVRecord
 --

 Key: CSV-85
 URL: https://issues.apache.org/jira/browse/CSV-85
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 It might be useful to provide a comment field in the CSVRecord class.
 This would be null if no comment is present for that record.
 A line consisting of only a comment would have a size() of 0.


[jira] [Commented] (CSV-77) RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238939#comment-13238939
 ] 

Emmanuel Bourg commented on CSV-77:
---

{{CSVFormat.RFC4180}} adds little value to the API. For printing it's identical 
to {{CSVFormat.DEFAULT}}. And for parsing, ignoring blank lines is much more 
useful.

I would rather remove {{CSVFormat.RFC4180}} and improve the documentation of 
{{CSVFormat.DEFAULT}} to explain it follows RFC 4180 and ignores empty lines as 
a convenience.

 RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines
 ---

 Key: CSV-77
 URL: https://issues.apache.org/jira/browse/CSV-77
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 According to the RFC [1]:
 Section 2.4
 {quote}
 ... Spaces are considered part
 of a field and should not be ignored.
 {quote}
 Also, the RFC does not mention that blank lines are to be ignored.
 However, some of the alternate CSV specifications referenced by RFC 4180 do 
 say that spaces are ignored.
 I've not yet found a mention to say that blank lines are to be ignored.
 [1] http://tools.ietf.org/html/rfc4180


[jira] [Commented] (CSV-83) Provide a header encapsulator class

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238966#comment-13238966
 ] 

Emmanuel Bourg commented on CSV-83:
---

Sorry but without a real use case this tends to light up the overengineered 
warning in my mind.

Would this affect the public API?

 Provide a header encapsulator class
 ---

 Key: CSV-83
 URL: https://issues.apache.org/jira/browse/CSV-83
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 Might be useful to have a class to encapsulate the headers, rather than 
 assuming they are stored as a map.
 This would allow a more flexible implementation if required.
 Methods which it would be useful to have:
 - getIndex(name) - could return -1 for unknown name?
 - iterator()


[jira] [Commented] (CSV-77) RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines


[ 
https://issues.apache.org/jira/browse/CSV-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238991#comment-13238991
 ] 

Sebb commented on CSV-77:
-

I think we should keep the strict RFC4180 definition.

 RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines
 ---

 Key: CSV-77
 URL: https://issues.apache.org/jira/browse/CSV-77
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 According to the RFC [1]:
 Section 2.4
 {quote}
 ... Spaces are considered part
 of a field and should not be ignored.
 {quote}
 Also, the RFC does not mention that blank lines are to be ignored.
 However, some of the alternate CSV specifications referenced by RFC 4180 do 
 say that spaces are ignored.
 I've not yet found a mention to say that blank lines are to be ignored.
 [1] http://tools.ietf.org/html/rfc4180


[jira] [Commented] (CSV-77) RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239013#comment-13239013
 ] 

Emmanuel Bourg commented on CSV-77:
---

What's the point of adding a useless constant? It bloats the API for no purpose.

 RFC 4180 (DEFAULT) format is wrong; should not ignore spaces or blank lines
 ---

 Key: CSV-77
 URL: https://issues.apache.org/jira/browse/CSV-77
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 According to the RFC [1]:
 Section 2.4
 {quote}
 ... Spaces are considered part
 of a field and should not be ignored.
 {quote}
 Also, the RFC does not mention that blank lines are to be ignored.
 However, some of the alternate CSV specifications referenced by RFC 4180 do 
 say that spaces are ignored.
 I've not yet found a mention to say that blank lines are to be ignored.
 [1] http://tools.ietf.org/html/rfc4180


[jira] [Updated] (CSV-42) Lots of possible changes

2012-03-26 Thread Emmanuel Bourg (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/CSV-42?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Bourg updated CSV-42:
--

Description: 
I made a lot of changes to pretty much all of the classes in the csv package.  
I thought it would be better to put all of the changes here in one issue, 
but feel free to only take the parts you like (if any).  Hopefully if nothing 
else the test cases will be useful to you.

I'll try to list most of the changes here, but I'm sure I'm forgetting some. 
This should include all of the big changes at least. I focused mostly on the 
parser, but I also made a few changes to the printer classes (although I don't 
think I added any new test cases there).

h3. General Changes

- Changed all class names with CSV in them to use Csv. This is how it 
appears in the commons-lang escapeCsv methods and I think it's easier to read 
the class name when acronyms are not in all upper case. (x) {color:red}3 letter 
acronyms are usually kept in uppercase (for example URLConnection or SAXParser 
in the JDK, but there are some exceptions){color}

- Formatted the code. I used Eclipse with a version of the Java formatting 
style that uses spaces instead of tabs and with a few other small changes to 
try to make it more similar to the style of this code. The formatting was 
inconsistent before (sometimes 2 space indent, sometimes 4) which made it 
really hard to work on. (/) {color:green}DONE{color}

- Removed all deprecated methods/constructors (/) {color:green}DONE{color}

- Made all public classes final. If there is ever a need to create subclasses 
of them then this could be changed, but I think it would be better to at least 
start them as final (since once they are released as non-final it's hard to go 
back).

- A few bug fixes (and test cases for them)

h3. CsvParser

- There were a few bugs in special cases, so I made the smallest changes I 
could to the parser code to fix them.

- Added a lot of test cases. I created a test case for all bugs that I found, 
so even if you don't use my changes to this class you should be able to use the 
test cases to find all of the same bugs.

- Added a close method. (x) {color:red}The try-with-resources statement in 
Java 7 makes resource management much easier, so there is no need to add a 
close() method to the parser.{color}

- Renamed the nextValue method to getValue (so it is more consistent with the 
getAll and getLine method names). I think I would prefer to use a different 
method name prefix for all three of these (like readAll) since I wouldn't 
normally expect a get method to have side effects, but I didn't want to just 
change the names of the most used methods. (x) {color:red}This method has been 
removed, the parser now works line by line.{color}

- Changed the getLineNumber method to return the correct line number when there 
are multi-line values. (x) {color:red}The suggested code counts the number of 
records instead of the number of lines. For debugging it's better to return the 
actual line number.{color}

- Moved all of the lexer methods into an inner CsvLexer class that is 
completely independent of the CsvParser class. The methods were already 
separated out, so it wasn't a very big change. I also moved the lexer test 
cases into a new CsvLexerTest class. (/) {color:green}DONE{color}

- Got rid of the interpreting unicode escape options. This doesn't really have 
anything to do with parsing a CSV file so I think it should be left up to the 
user of the class to implement this if needed. As an example, I made a 
CsvParserUnicodeEscapeTest class that uses the code from the lexer in a Reader 
subclass. One nice thing is that with this implementation, the interpreted 
values can be used as the delimiter, encapsulator, etc. (/) {color:green}DONE - 
The unicode unescaping is now handled by a class implementing java.io.Reader 
(to be contributed to Commons IO).{color}

- Got rid of the escape option for the same reason as the unicode escape 
option. I replaced it with an encapsulator escape option that is only used as 
an escape operator on the encapsulator character.
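
The try-with-resources remark in the close() item above can be illustrated with plain JDK readers. This is a minimal sketch, assuming the caller owns the Reader handed to the parser; the class and method names are hypothetical and no Commons CSV class is involved.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: the caller's try block closes the underlying reader. */
final class TryWithResourcesSketch {

    private TryWithResourcesSketch() {
    }

    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        // The reader is closed when the block exits, normally or by exception.
        // Closing the BufferedReader also closes the wrapped source.
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

A parser that only wraps a Reader can leave closing to the code that opened it, as long as that code uses a block like this one.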

h3. ExtendedBufferedReader

- Greatly simplified this class. I removed all the methods that weren't being 
used (including keeping track of the line number) and changed the lookahead 
option to use the BufferedReader mark and reset methods. (/) {color:green}DONE 
- ExtendedBufferedReader is still counting the lines, but the mark/reset 
lookahead improved the performance by 30%.{color}
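
The mark/reset lookahead mentioned above can be sketched with plain JDK calls. This is a minimal illustration, not the actual ExtendedBufferedReader code; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;

/** Hypothetical one-character lookahead built on BufferedReader.mark/reset. */
final class LookaheadSketch {

    private LookaheadSketch() {
    }

    /**
     * Returns the next character without consuming it, or -1 at end of stream.
     */
    static int lookAhead(BufferedReader in) throws IOException {
        in.mark(1);          // remember the current position, allow 1 char of read-ahead
        int next = in.read();
        in.reset();          // rewind to the marked position
        return next;
    }
}
```

Since BufferedReader already keeps its own buffer, peeking this way needs no separate lookahead field in the wrapping class.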


h3. CsvStrategy

- I split this class into three classes: an abstract base class (CsvStrategy), 
a parser-specific version (CsvParseStrategy) and a printer-specific version 
(CsvPrintStrategy). I didn't like that the strategy was used for both parsing 
and printing even though some of the values only applied to parsing (and there 
could be values that apply only to printing as well). (x) {color:red}There 
aren't

[jira] [Updated] (CSV-42) Lots of possible changes

2012-03-26 Thread Emmanuel Bourg (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/CSV-42?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Bourg updated CSV-42:
--

Description: 
I made a lot of changes to pretty much all of the classes in the csv package.  
I thought it would be better to put all of the changes here in one issue, 
but feel free to only take the parts you like (if any).  Hopefully if nothing 
else the test cases will be useful to you.

I'll try to list most of the changes here, but I'm sure I'm forgetting some. 
This should include all of the big changes at least. I focused mostly on the 
parser, but I also made a few changes to the printer classes (although I don't 
think I added any new test cases there).

h3. General Changes

- Changed all class names with CSV in them to use Csv. This is how it 
appears in the commons-lang escapeCsv methods and I think it's easier to read 
the class name when acronyms are not in all upper case. (x) {color:red}3 letter 
acronyms are usually kept in uppercase (for example URLConnection or SAXParser 
in the JDK, but there are some exceptions){color}

- Formatted the code. I used Eclipse with a version of the Java formatting 
style that uses spaces instead of tabs and with a few other small changes to 
try to make it more similar to the style of this code. The formatting was 
inconsistent before (sometimes 2 space indent, sometimes 4) which made it 
really hard to work on. (/) {color:green}DONE{color}

- Removed all deprecated methods/constructors (/) {color:green}DONE{color}

- Made all public classes final. If there is ever a need to create subclasses 
of them then this could be changed, but I think it would be better to at least 
start them as final (since once they are released as non-final it's hard to go 
back).

- A few bug fixes (and test cases for them)

h3. CsvParser

- There were a few bugs for special cases, so I made the smallest changes I 
could to the parser code to fix these.

- Added a lot of test cases. I created a test case for all bugs that I found, 
so even if you don't use my changes to this class you should be able to use the 
test cases to find all of the same bugs.

- Added a close method. (x) {color:red} The try-with-resources statement in 
Java 7 makes resources management much easier, there is no need to add a 
close() method to the parser.{color}

- Renamed the nextValue method to getValue (so it is more consistent with the 
getAll and getLine method names). I think I would prefer to use a different 
method name prefix for all three of these (like readAll) since I wouldn't 
normally expect a get method to have side effects, but I didn't want to just 
change the names of the most used methods. (x) {color:red}This method has been 
removed, the parser now works line by line.{color}

- Changed the getLineNumber method to return the correct line number when there 
are multi-line values. (x) {color:red}The suggested code counts the number of 
records instead of the number of lines. For debugging it's better to return the 
actual line number.{color}

- Moved all of the lexer methods into an inner CsvLexer class that is 
completely independent of the CsvParser class. The methods were already 
separated out, so it wasn't a very big change. I also moved the lexer test 
cases into a new CsvLexerTest class. (/) {color:green}DONE{color}

- Got rid of the interpreting unicode escape options. This doesn't really have 
anything to do with parsing a CSV file so I think it should be left up to the 
user of the class to implement this if needed. As an example, I made a 
CsvParserUnicodeEscapeTest class that uses the code from the lexer in a Reader 
subclass. One nice thing is that with this implementation, the interpreted 
values can be used as the delimiter, encapsulator, etc. (/) {color:green}DONE - 
The unicode unescaping is now handled by a class implementing java.io.Reader 
(to be contributed to Commons IO).{color}

- Got rid of the escape option for the same reason as the unicode escape 
option. I replaced it with an encapsulator escape option that is only used as 
an escape operator on the encapsulator character.

h3. ExtendedBufferedReader

- Greatly simplified this class. I removed all the methods that weren't being 
used (including keeping track of the line number) and changed the lookahead 
option to use the BufferedReader mark and reset methods. (/) {color:green}DONE 
- ExtendedBufferedReader is still counting the lines, but the mark/reset 
lookahead improved the performance by 30%.{color}
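
As a rough illustration of the mark/reset lookahead described above, here is a minimal sketch. The class name LookaheadReader is hypothetical and this is not the actual ExtendedBufferedReader code; it only relies on the standard BufferedReader mark(int) and reset() methods.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not the actual ExtendedBufferedReader from Commons CSV.
class LookaheadReader extends BufferedReader {

    LookaheadReader(Reader in) {
        super(in);
    }

    /** Returns the next character without consuming it, or -1 at end of stream. */
    int lookAhead() throws IOException {
        mark(1);        // remember the current position; one char of read-ahead is enough
        int c = read(); // -1 at end of stream
        reset();        // rewind to the marked position
        return c;
    }
}
```

Calling lookAhead() and then read() should yield the same character, since the lookahead leaves the stream position where it was.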


h3. CsvStrategy

- I split this class into three classes: an abstract base class (CsvStrategy), 
a parser-specific version (CsvParseStrategy) and a printer-specific version 
(CsvPrintStrategy). I didn't like that the strategy was used for both parsing 
and printing even though some of the values only applied to parsing (and there 
could be values that apply only to printing as well). (x) {color:red}There 
aren't

[jira] [Created] (CSV-87) CSVParser.getRecords() returns null rather than empty List at EOF

2012-03-26 Thread Sebb (Created) (JIRA)

CSVParser.getRecords() returns null rather than empty List at EOF
-

 Key: CSV-87
 URL: https://issues.apache.org/jira/browse/CSV-87
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb


CSVParser.getRecords() returns null rather than empty List at EOF.

It's usually easier for applications to deal with empty lists than to have to 
check for null after every invocation of the method.

If the application really does need to know if the list is empty, then it can 
use a method such as isEmpty().
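
To illustrate the requested behaviour, here is a minimal sketch. RecordCollector is a hypothetical stand-in that treats each line as one record; it is not the Commons CSV CSVParser, and it only shows the "empty list instead of null" contract.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser; it is not the Commons CSV CSVParser.
class RecordCollector {

    /** Reads all remaining lines; returns an empty list (never null) at EOF. */
    static List<String> getRecords(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        List<String> records = new ArrayList<String>();
        String line;
        while ((line = reader.readLine()) != null) {
            records.add(line);
        }
        return records; // empty, not null, when there was nothing left to read
    }
}
```

With this contract a caller can iterate over the result directly, and can still call isEmpty() when it needs to detect that nothing was read.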


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CSV-83) Provide a header encapsulator class


[ 
https://issues.apache.org/jira/browse/CSV-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239032#comment-13239032
 ] 

Sebb commented on CSV-83:
-

It looks as though the header Map is not part of the public API.
If that is the case then using a separate encapsulator class is not necessary.
We could change the headers to be (e.g.) a sorted list - if that were more 
efficient - without affecting the API.

It's not good to over-engineer, but IMO it's worse to expose internal 
implementation details in the public API, which is why I raised the issue 
initially.

 Provide a header encapsulator class
 ---

 Key: CSV-83
 URL: https://issues.apache.org/jira/browse/CSV-83
 Project: Commons CSV
  Issue Type: Improvement
Reporter: Sebb

 Might be useful to have a class to encapsulate the headers, rather than 
 assuming they are stored as a map.
 This would allow a more flexible implementation if required.
 Methods which it would be useful to have:
 - getIndex(name) - could return -1 for unknown name?
 - iterator()
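
One possible shape for such a class, as a minimal sketch only. The class name Headers and the List-based storage are assumptions made for illustration; only getIndex(name) and iterator() come from the description above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; the class name and List-based storage are assumptions.
final class Headers implements Iterable<String> {

    private final List<String> names;

    Headers(String... names) {
        // Defensive copy so later changes to the caller's array are not visible.
        this.names = Arrays.asList(names.clone());
    }

    /** Returns the zero-based column index of the name, or -1 if it is unknown. */
    int getIndex(String name) {
        return names.indexOf(name);
    }

    /** Iterates over the header names in column order, without allowing removal. */
    @Override
    public Iterator<String> iterator() {
        return Collections.unmodifiableList(names).iterator();
    }
}
```

Because callers only see getIndex and iterator, the internal storage could later be swapped for a map or a sorted list without affecting them.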


[jira] [Commented] (CSV-75) ExtendedBufferReader does not handle EOL consistently

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239052#comment-13239052
 ] 

Emmanuel Bourg commented on CSV-75:
---

I added a test demonstrating the issue.

I wonder if the line counting should be handled by the lexer instead.

 ExtendedBufferReader does not handle EOL consistently
 -

 Key: CSV-75
 URL: https://issues.apache.org/jira/browse/CSV-75
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb
 Attachments: CSV-75.patch


 ExtendedBufferReader checks for '\n' (LF) in the read() methods, incrementing 
 linecount when found.
 However, the readLine() method calls BufferedReader.readLine() which treats 
 CR, LF and CRLF equally (and drops them).
 If the code is to be flexible in what it accepts, the class should also allow 
 for CR alone as a line terminator.
 It should work if the code increments the line counter for CR, and for LF if 
 the previous character was not CR.
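
The counting rule in the last sentence can be sketched as follows. LineTerminatorCounter is a hypothetical helper written for illustration; it is not the attached CSV-75 patch and works on a CharSequence rather than a Reader.

```java
// Hypothetical helper illustrating the counting rule; not the CSV-75 patch itself.
final class LineTerminatorCounter {

    /** Counts line terminators, treating CR, LF and CRLF each as one terminator. */
    static int count(CharSequence text) {
        int lines = 0;
        char previous = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r' || (c == '\n' && previous != '\r')) {
                lines++; // CR always counts; LF counts unless it completes a CRLF
            }
            previous = c;
        }
        return lines;
    }
}
```

A reader-based version would keep the previous character as a field, so that the rule also holds when a CRLF pair is split across two read() calls.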


[jira] [Commented] (CSV-87) CSVParser.getRecords() returns null rather than empty List at EOF

2012-03-26 Thread Emmanuel Bourg (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-87?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239071#comment-13239071
 ] 

Emmanuel Bourg commented on CSV-87:
---

+1

 CSVParser.getRecords() returns null rather than empty List at EOF
 -

 Key: CSV-87
 URL: https://issues.apache.org/jira/browse/CSV-87
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 CSVParser.getRecords() returns null rather than empty List at EOF.
 It's usually easier for applications to deal with empty lists than to have to 
 check for null after every invocation of the method.
 If the application really does need to know if the list is empty, then it can 
 use a method such as isEmpty().


[jira] [Commented] (CSV-75) ExtendedBufferReader does not handle EOL consistently


[ 
https://issues.apache.org/jira/browse/CSV-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239072#comment-13239072
 ] 

Sebb commented on CSV-75:
-

Yes, perhaps it should be done by the lexer. But the quickest fix would be to 
patch the reader.

I wonder whether ExtendedBufferReader is actually necessary; lookAhead() could 
easily be provided by the Lexer class.
And I'm not sure that readAgain() is really necessary.

 ExtendedBufferReader does not handle EOL consistently
 -

 Key: CSV-75
 URL: https://issues.apache.org/jira/browse/CSV-75
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb
 Attachments: CSV-75.patch


 ExtendedBufferReader checks for '\n' (LF) in the read() methods, incrementing 
 linecount when found.
 However, the readLine() method calls BufferedReader.readLine() which treats 
 CR, LF and CRLF equally (and drops them).
 If the code is to be flexible in what it accepts, the class should also allow 
 for CR alone as a line terminator.
 It should work if the code increments the line counter for CR, and for LF if 
 the previous character was not CR.


[jira] [Resolved] (CSV-87) CSVParser.getRecords() returns null rather than empty List at EOF


 [ 
https://issues.apache.org/jira/browse/CSV-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved CSV-87.
-

Resolution: Fixed

 CSVParser.getRecords() returns null rather than empty List at EOF
 -

 Key: CSV-87
 URL: https://issues.apache.org/jira/browse/CSV-87
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb

 CSVParser.getRecords() returns null rather than empty List at EOF.
 It's usually easier for applications to deal with empty lists than to have to 
 check for null after every invocation of the method.
 If the application really does need to know if the list is empty, then it can 
 use a method such as isEmpty().


[jira] [Commented] (CSV-75) ExtendedBufferReader does not handle EOL consistently


[ 
https://issues.apache.org/jira/browse/CSV-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239084#comment-13239084
 ] 

Sebb commented on CSV-75:
-

Enabled the new test and added a patch to fix it.
But we should still consider whether ExtendedBufferReader is really necessary.

 ExtendedBufferReader does not handle EOL consistently
 -

 Key: CSV-75
 URL: https://issues.apache.org/jira/browse/CSV-75
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb
 Attachments: CSV-75.patch


 ExtendedBufferReader checks for '\n' (LF) in the read() methods, incrementing 
 linecount when found.
 However, the readLine() method calls BufferedReader.readLine() which treats 
 CR, LF and CRLF equally (and drops them).
 If the code is to be flexible in what it accepts, the class should also allow 
 for CR alone as a line terminator.
 It should work if the code increments the line counter for CR, and for LF if 
 the previous character was not CR.


[jira] [Resolved] (CSV-75) ExtendedBufferReader does not handle EOL consistently


 [ 
https://issues.apache.org/jira/browse/CSV-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved CSV-75.
-

Resolution: Fixed

 ExtendedBufferReader does not handle EOL consistently
 -

 Key: CSV-75
 URL: https://issues.apache.org/jira/browse/CSV-75
 Project: Commons CSV
  Issue Type: Bug
Reporter: Sebb
 Attachments: CSV-75.patch


 ExtendedBufferReader checks for '\n' (LF) in the read() methods, incrementing 
 linecount when found.
 However, the readLine() method calls BufferedReader.readLine() which treats 
 CR, LF and CRLF equally (and drops them).
 If the code is to be flexible in what it accepts, the class should also allow 
 for CR alone as a line terminator.
 It should work if the code increments the line counter for CR, and for LF if 
 the previous character was not CR.


[jira] [Resolved] (CSV-83) Provide a header encapsulator class