[jira] [Commented] (NET-418) File truncated when transfer on ftp

2011-08-07 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/NET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080620#comment-13080620
 ] 

Sebb commented on NET-418:
--

What is the code you are using to upload the file?

Are you using binary or ascii mode?
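
For context on why the mode matters: in ASCII mode an FTP transfer may rewrite line endings, so a file sent from Windows to Linux can lose one byte per CR LF pair. The reported sizes differ by 17 172 261 - 17 170 762 = 1 499 bytes, which would be consistent with (though not proof of) 1 499 line endings being collapsed. The sketch below only illustrates that arithmetic; the class and method names are hypothetical, it does not use Commons Net, and it does not reproduce the reporter's file. With Commons Net, binary mode is normally selected by calling FTPClient.setFileType(FTP.BINARY_FILE_TYPE) after login.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Commons Net.
final class AsciiModeSketch {

    /** Counts CR LF pairs in the data. */
    static long countCrlf(byte[] data) {
        long count = 0;
        for (int i = 0; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                count++;
                i++; // skip the LF that was just matched
            }
        }
        return count;
    }

    /** Size the data would have if every CR LF were rewritten to a bare LF. */
    static long sizeAfterCrlfToLf(byte[] data) {
        return data.length - countCrlf(data);
    }

    public static void main(String[] args) {
        byte[] sample = "line one\r\nline two\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sample.length + " -> " + sizeAfterCrlfToLf(sample));
    }
}
```

If the number of CR LF pairs in the original file equals the size difference, an ASCII-mode transfer is the likely explanation and nothing was actually lost.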

 File truncated when transfer on ftp
 ---

 Key: NET-418
 URL: https://issues.apache.org/jira/browse/NET-418
 Project: Commons Net
  Issue Type: Bug
  Components: FTP
Affects Versions: 3.0.1
 Environment: Transfer from Windows server 2008 R2 64 bits to Linux 
 Centos 5.5 x86 64 bits Pro FTPD A31
Reporter: PARENT JP
 Attachments: notices.txt, notices_total.zip


 File after transfer is truncated.
 The original file is 17 172 261 bytes; the file after transfer is 17 170 762 bytes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (NET-418) File truncated when transfer on ftp

2011-08-09 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved NET-418.
--

Resolution: Not A Problem





[jira] [Closed] (NET-418) File truncated when transfer on ftp

2011-08-09 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb closed NET-418.







[jira] [Commented] (CODEC-125) Implement a Beider-Morse phonetic matching codec

2011-08-11 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083198#comment-13083198
 ] 

Sebb commented on CODEC-125:


Changing the API means more than a version bump; for Commons it generally requires 
a change of package name and Maven id so that old and new versions can co-exist.

So it's important to get the API correct before release if at all possible.

In the case of brand new code, maybe it would be possible to document it as 
being unstable, and therefore allow changes to the API. But this should be 
discussed on the developer list first.

 Implement a Beider-Morse phonetic matching codec
 

 Key: CODEC-125
 URL: https://issues.apache.org/jira/browse/CODEC-125
 Project: Commons Codec
  Issue Type: New Feature
Reporter: Matthew Pocock
Priority: Minor
 Attachments: Rule$4$1-All_Objects.html, acz.patch, bm-gg.diff, 
 bmpm.patch, bmpm.patch, bmpm.patch, bmpm.patch, bmpm.patch, bmpm.patch, 
 bmpm.patch, bmpm.patch, comparator.patch, fightingMemoryChurn.patch, 
 fightingMemoryChurn.patch, fixmeInvariant.patch, handleH.patch, 
 majorFix.patch, performanceAndBugs.patch, testAllChars-mem-profile.html, 
 testEncodeGna.patch


 I have implemented Beider Morse Phonetic Matching as a codec against the 
 commons-codec svn trunk. I would like to contribute this to commons-codec.





[jira] [Commented] (CODEC-125) Implement a Beider-Morse phonetic matching codec

2011-08-11 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083428#comment-13083428
 ] 

Sebb commented on CODEC-125:


@Gary. If there is already a need to break compat., then I don't have an issue 
with changing the bmpm api at the same time.





[jira] [Commented] (CODEC-125) Implement a Beider-Morse phonetic matching codec

2011-08-11 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083501#comment-13083501
 ] 

Sebb commented on CODEC-125:


@Gary: yes, I know it's new; my concern is about the next release after adding 
bmpm.

I am concerned that the bmpm API is not stable, and that an early release might 
entail an incompatible change later.
Hence the suggestion to discuss if that would require a package/maven name 
change, given that few external classes would be using it.





[jira] [Updated] (CODEC-30) [codec] Character ö or é not mapped in soundex encoding

2011-08-12 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-30:
--

Description: 
When calling soundex.soundex\(x) with x a string with a diacritical mark 
like ö or é the following exception occurs:

java.lang.ArrayIndexOutOfBoundsException: 131
at org.apache.commons.codec.language.Soundex.map(Soundex.java:199)
at org.apache.commons.codec.language.Soundex.getMappingCode(Soundex.java:157)

This happens when calling difference(s1, s2). In codec version 1.3-dev this 
exception occurs too.

Cheers Rogier

  was:
When calling soundex.soundex(x) with x a string with a diacritical mark 
like ö or é the following exception occurs:

java.lang.ArrayIndexOutOfBoundsException: 131
at org.apache.commons.codec.language.Soundex.map(Soundex.java:199)
at org.apache.commons.codec.language.Soundex.getMappingCode(Soundex.java:157)
This happens when calling difference(s1, s2). In codec version 1.3-dev this 
exception occurs too.

Cheers Rogier


Escape inadvertent special sequence
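
To illustrate how a letter with a diacritical mark can produce such an exception, here is a minimal sketch, assuming a lookup table with one entry per letter A..Z that is indexed by the upper-cased character minus 'A'. This is not the Commons Codec source; the class and method names are hypothetical, and returning '0' for unmapped characters is just one possible policy. Under this indexing, the reported index 131 would correspond to U+00C4 (131 + 65 = 196).

```java
// Hypothetical sketch; this is not the Commons Codec source.
final class SoundexIndexSketch {

    // Classic US English Soundex codes for the letters A..Z.
    private static final char[] MAPPING = "01230120022455012623010202".toCharArray();

    /** Index into MAPPING; only in range for the letters A..Z. */
    static int index(char ch) {
        return Character.toUpperCase(ch) - 'A';
    }

    /** Unguarded lookup: throws ArrayIndexOutOfBoundsException outside A..Z. */
    static char mapUnchecked(char ch) {
        return MAPPING[index(ch)];
    }

    /** Guarded lookup: treats unmapped characters as code '0' (an assumption). */
    static char mapChecked(char ch) {
        int i = index(ch);
        return (i >= 0 && i < MAPPING.length) ? MAPPING[i] : '0';
    }

    public static void main(String[] args) {
        char oUmlaut = '\u00F6'; // o-umlaut
        System.out.println("index = " + index(oUmlaut));
        System.out.println("checked = " + mapChecked(oUmlaut));
        try {
            mapUnchecked(oUmlaut);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }
    }
}
```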

 [codec] Character ö or é not mapped in soundex encoding
 -

 Key: CODEC-30
 URL: https://issues.apache.org/jira/browse/CODEC-30
 Project: Commons Codec
  Issue Type: Bug
Affects Versions: 1.2
 Environment: Operating System: All
 Platform: All
Reporter: Rogier Selie





[jira] [Created] (CODEC-127) Non-ascii characters in test source files

2011-08-13 Thread Sebb (JIRA)
Non-ascii characters in test source files
-

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb


Some of the test cases include characters in a native encoding (possibly 
UTF-8), rather than using Unicode escapes.

This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
compilation errors, which is how I found the issue), and possibly some 
transformations may corrupt the contents, e.g. fixing EOL.

I think we should have a rule of using Unicode escapes for all such non-ascii 
characters.
It's particularly important for non-ISO-8859-1 characters.
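
A minimal sketch of what applying such a rule mechanically could look like, assuming a hypothetical helper (UnicodeEscaper.escapeNonAscii, not part of Commons Codec or the JDK). It rewrites every char above 0x7F as a four-digit Unicode escape, which is roughly what native2ascii does when the input has been decoded correctly.

```java
// Hypothetical helper for illustration only.
final class UnicodeEscaper {

    /** Replaces every non-ASCII char with a backslash-u escape of four hex digits. */
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c <= 0x7F) {
                sb.append(c);
            } else {
                String hex = Integer.toHexString(c);
                sb.append("\\u");
                for (int pad = hex.length(); pad < 4; pad++) {
                    sb.append('0'); // left-pad to four digits
                }
                sb.append(hex);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The argument is written with an escape so this file itself stays ASCII.
        System.out.println(escapeNonAscii("M\u00FCller"));
    }
}
```

Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is also how Java source represents them.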

Some example classes with non-ascii characters:

{code}
binary\Base64Test.java:96 byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
664645214},
language\ColognePhoneticTest.java:130 String[][] data = 
{{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
language\ColognePhoneticTest.java:143 {ganz, Gänse},
language\DoubleMetaphoneTest.java:1222 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
language\DoubleMetaphoneTest.java:1227 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:375 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:395 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
{code}

The characters are probably not correct above, because I used a crude perl 
script to find them:

{code}
perl -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
{code}
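
For comparison, the same kind of scan can be sketched in Java. The class and method names here are hypothetical. The sketch works on lines that have already been read; one way to read them without decoding errors is Files.readAllLines(path, StandardCharsets.ISO_8859_1), since every byte maps to some character in that charset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class NonAsciiFinder {

    /** True if the line contains any char above 0x7F. */
    static boolean hasNonAscii(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) > 0x7F) {
                return true;
            }
        }
        return false;
    }

    /** Returns "name:lineNumber text" for each line that contains a non-ASCII char. */
    static List<String> find(String name, List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (hasNonAscii(line)) {
                hits.add(name + ":" + (i + 1) + " " + line);
            }
        }
        return hits;
    }
}
```

As with the Perl one-liner, what gets printed depends on the charset used for reading and for the console, so the output locates the lines but is not a reliable rendering of the characters.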

language\SoundexTest.java:367 in particular is incorrect, because it's supposed 
to be a single character.

Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
gives:

if (Character.isLetter('\ufffd'))

which is an unknown character.

Similarly for binary\Base64Test.java:96.

It's not all that clear what the Unicode escapes should be in these cases, but 
probably not the unknown character.

[Possibly the characters got mangled at some point, or maybe they have always 
been wrong]

The ColognePhoneticTest.java cases are less serious, as the characters are 
valid ISO-8859-1 (accented German), but given that the rest of the file uses 
Unicode escapes, I think they should be changed too (adding comments to say 
what they are, e.g. o-umlaut, u-umlaut).





[jira] [Commented] (CODEC-127) Non-ascii characters in test source files

2011-08-13 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084611#comment-13084611
 ] 

Sebb commented on CODEC-127:


The problem is that it's not possible to see what the test data is in the IDE 
(apart from the German chars).

Also, unless you tell SVN the encoding (e.g. via mime-type), diff e-mails (and 
possibly conversion to local EOL) may suffer.

Saving IDE settings in SVN is a non-starter, because there are many different 
IDEs, and it's anyway not possible to have the settings automatically picked 
up, as far as I know.

Have another look at the non-ISO-8859-1 characters and see if they are correct. 
I suspect not, as they all appear to be the Unicode replacement character (\ufffd), at 
least when treated as UTF-8.





[jira] [Commented] (CODEC-127) Non-ascii characters in test source files

2011-08-13 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084743#comment-13084743
 ] 

Sebb commented on CODEC-127:


Here's the full list of lines containing non-ASCII characters:

{code}
java/org/apache/commons/codec/language/ColognePhonetic.java:264private 
static final char[][] PREPROCESS_MAP = new char[][]{{'\u00C4', 'A'}, // ├âÔÇ×
java/org/apache/commons/codec/language/ColognePhonetic.java:265
{'\u00DC', 'U'}, // Ü
java/org/apache/commons/codec/language/ColognePhonetic.java:266
{'\u00D6', 'O'}, // ├âÔÇô
java/org/apache/commons/codec/language/ColognePhonetic.java:267
{'\u00DF', 'S'} // ├â┼©
java/org/apache/commons/codec/language/ColognePhonetic.java:388 * Converts 
the string to upper case and replaces germanic umlauts, and the 
├óÔé¼┼ô├â┼©├óÔé¼´┐¢.
test/org/apache/commons/codec/binary/Base64Test.java:96byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
test/org/apache/commons/codec/language/ColognePhoneticTest.java:110
{m├Ânchengladbach, 664645214},
test/org/apache/commons/codec/language/ColognePhoneticTest.java:130
String[][] data = {{bergisch-gladbach, 174845214}, 
{M├╝ller-L├╝denscheidt, 65752682}};
test/org/apache/commons/codec/language/ColognePhoneticTest.java:137
{Meyer, M├╝ller},
test/org/apache/commons/codec/language/ColognePhoneticTest.java:143
{ganz, Gänse},
test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1222
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1227
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
test/org/apache/commons/codec/language/SoundexTest.java:367if 
(Character.isLetter('´┐¢')) {
test/org/apache/commons/codec/language/SoundexTest.java:369
Assert.assertEquals(´┐¢000, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/SoundexTest.java:375
Assert.assertEquals(, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/SoundexTest.java:387if 
(Character.isLetter('´┐¢')) {
test/org/apache/commons/codec/language/SoundexTest.java:389
Assert.assertEquals(´┐¢000, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/SoundexTest.java:395
Assert.assertEquals(, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/bm/BeiderMorseEncoderTest.java:93
String[] names = { ácz, átz, Ignácz, Ignátz, Ignác };
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:47  
  { Nu├▒ez, spanish, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:49  
  { ─îapek, czech, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:52  
  { Küçük, turkish, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:55  
  { Ceauşescu, romanian, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:57  
  { ╬æ╬│╬│╬Á╬╗¤î¤Ç╬┐¤à╬╗╬┐¤é, greek, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:58  
  { ðƒÐâÐêð║ð©ð¢, cyrillic, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:59  
  { ÎøÎö΃, hebrew, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:60  
  { ácz, any, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:61  
  { átz, any, EXACT } });
{code}

Note the comment at ColognePhonetic.java:388 - this does not seem to make sense 
in any encoding, but I could be wrong.


[jira] [Issue Comment Edited] (CODEC-127) Non-ascii characters in test source files

2011-08-13 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084743#comment-13084743
 ] 

Sebb edited comment on CODEC-127 at 8/14/11 12:04 AM:
--

Here's the full list of lines containing non-ASCII characters:

{code}
java/org/apache/commons/codec/language/ColognePhonetic.java:264private 
static final char[][] PREPROCESS_MAP = new char[][]{{'\u00C4', 'A'}, // ├âÔÇ×
java/org/apache/commons/codec/language/ColognePhonetic.java:265
{'\u00DC', 'U'}, // Ü
java/org/apache/commons/codec/language/ColognePhonetic.java:266
{'\u00D6', 'O'}, // ├âÔÇô
java/org/apache/commons/codec/language/ColognePhonetic.java:267
{'\u00DF', 'S'} // ├â┼©
java/org/apache/commons/codec/language/ColognePhonetic.java:388 * Converts 
the string to upper case and replaces germanic umlauts, and the 
├óÔé¼┼ô├â┼©├óÔé¼´┐¢.
test/org/apache/commons/codec/binary/Base64Test.java:96byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
test/org/apache/commons/codec/language/ColognePhoneticTest.java:110
{m├Ânchengladbach, 664645214},
test/org/apache/commons/codec/language/ColognePhoneticTest.java:130
String[][] data = {{bergisch-gladbach, 174845214}, 
{M├╝ller-L├╝denscheidt, 65752682}};
test/org/apache/commons/codec/language/ColognePhoneticTest.java:137
{Meyer, M├╝ller},
test/org/apache/commons/codec/language/ColognePhoneticTest.java:143
{ganz, Gänse},
test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1222
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1227
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
test/org/apache/commons/codec/language/SoundexTest.java:367if 
(Character.isLetter('´┐¢')) {
test/org/apache/commons/codec/language/SoundexTest.java:369
Assert.assertEquals(´┐¢000, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/SoundexTest.java:375
Assert.assertEquals(, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/SoundexTest.java:387if 
(Character.isLetter('´┐¢')) {
test/org/apache/commons/codec/language/SoundexTest.java:389
Assert.assertEquals(´┐¢000, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/SoundexTest.java:395
Assert.assertEquals(, this.getSoundexEncoder().encode(´┐¢));
test/org/apache/commons/codec/language/bm/BeiderMorseEncoderTest.java:93
String[] names = { ácz, átz, Ignácz, Ignátz, Ignác };
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:47  
  { Nu├▒ez, spanish, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:49  
  { ─îapek, czech, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:52  
  { Küçük, turkish, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:55  
  { Ceauşescu, romanian, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:57  
  { ╬æ╬│╬│╬Á╬╗¤î¤Ç╬┐¤à╬╗╬┐¤é, greek, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:58  
  { ðƒÐâÐêð║ð©ð¢, cyrillic, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:59  
  { ÎøÎö΃, hebrew, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:60  
  { ácz, any, EXACT },
test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:61  
  { átz, any, EXACT } });
{code}

Note the comment at ColognePhonetic.java:388 - this does not seem to make sense 
in any encoding, but I could be wrong.
[You'll need to look at it in the source file itself - the Perl script I used 
is crude and does not display non-ASCII properly]

The other dubious entries are:

Base64Test.java:96
DoubleMetaphoneTest.java:1222
DoubleMetaphoneTest.java:1227
and most of the SoundexTest.java entries.


[jira] [Commented] (CODEC-127) Non-ascii characters in test source files

2011-08-13 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084752#comment-13084752
 ] 

Sebb commented on CODEC-127:


Just done a comparison of the various versions of ColognePhonetic.java in trunk.

The corruption of the comments on PREPROCESS_MAP occurred between r1080701 and 
r1087901 (April 1st, ironically).

This also corrupted other comments, and the string at line 382.
The SVN log message says "Annotate with @Override and @Deprecated"; were those 
added automatically perhaps?
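
One mechanism that would produce this kind of damage is double encoding: a tool reads UTF-8 bytes as a single-byte charset and then saves the result as UTF-8 again. The sketch below assumes the single-byte charset was windows-1252; that is a guess, not something the SVN log states, and the class and method names are hypothetical. It also assumes the JVM provides the windows-1252 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of double encoding; the charset choice is an assumption.
final class DoubleEncodingSketch {

    private static final Charset MISREAD_AS = Charset.forName("windows-1252");

    /** Simulates reading correctly encoded UTF-8 text with the wrong charset. */
    static String garble(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), MISREAD_AS);
    }

    /** Reverses garble() when every garbled char still maps back to its byte. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(MISREAD_AS), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sharpS = "\u00DF"; // German sharp s
        String garbled = garble(sharpS);
        System.out.println("garbled length: " + garbled.length());
        System.out.println("repaired equals original: " + repair(garbled).equals(sharpS));
    }
}
```

This kind of corruption is often reversible, because each original byte survives as a character. It stops being reversible once a byte with no mapping in the misread charset has been replaced, which may be why some of the comments cannot be recovered from the current file alone.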





[jira] [Commented] (CODEC-127) Non-ascii characters in test source files

2011-08-13 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084753#comment-13084753
 ] 

Sebb commented on CODEC-127:


SoundexTest appears to have been corrupted in r1075426 = r1080414.
The log comment says "Keep these files in UTF-8 encoding for proper Javadoc 
processing".
However, I suspect the file was originally in ISO-8859-1, not UTF-8.
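
If that suspicion is right, the damage would be expected to look like the sketch below: an ISO-8859-1 byte for an accented letter is not valid UTF-8 on its own, so a UTF-8 decoder substitutes the replacement character U+FFFD, and saving the result discards the original byte. The class and method names are hypothetical, and the sketch does not reproduce what any particular tool did in those revisions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a lossy mis-decode; not tied to any specific tool.
final class LossyDecodeSketch {

    /** Decodes ISO-8859-1 bytes as if they were UTF-8. */
    static String misreadAsUtf8(byte[] latin1Bytes) {
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = "M\u00FCller".getBytes(StandardCharsets.ISO_8859_1);
        String decoded = misreadAsUtf8(latin1);
        // The u-umlaut byte 0xFC is malformed UTF-8, so it becomes U+FFFD.
        System.out.println("contains U+FFFD: " + (decoded.indexOf('\uFFFD') >= 0));
        // Re-saving as UTF-8 writes the three bytes of U+FFFD, not the original byte.
        System.out.println("bytes after re-save: "
                + decoded.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Unlike double encoding, this cannot be undone from the file alone; the original characters have to come from an earlier revision.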


 Non-ascii characters in test source files
 -

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb

 Some of the test cases include characters in a native encoding (possibly 
 UTF-8), rather than using Unicode escapes.
 This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
 compilation errors, which is how I found the issue), and possibly some 
 transformations may corrupt the contents, e.g. fixing EOL.
 I think we should have a rule of using Unicode escapes for all such non-ascii 
 characters.
 It's particularly important for non-ISO-8859-1 characters.
 Some example classes with non-ascii characters:
 {code}
 binary\Base64Test.java:96 byte[] decode = 
 b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
 language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
 664645214},
 language\ColognePhoneticTest.java:130 String[][] data = 
 {{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
 language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
 language\ColognePhoneticTest.java:143 {ganz, Gänse},
 language\DoubleMetaphoneTest.java:1222 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
 language\DoubleMetaphoneTest.java:1227 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
 language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:375 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:395 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 {code}
 The characters are probably not correct above, because I used a crude perl 
 script to find them:
 {code}
 perl ne $.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV; 
 */*.java
 {code}
 language\SoundexTest.java:367 in particular is incorrect, because it's 
 supposed to be a single character.
 Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
 gives:
 if (Character.isLetter('\ufffd'))
 which is an unknown character.
 Similarly for binary\Base64Test.java:96.
 It's not all that clear what the Unicode escapes should be in these cases, 
 but probably not the unknown character.
 [Possibly the characters got mangled at some point, or maybe they have always 
 been wrong]
 The ColognePhoneticTest.java cases are less serious, as the characters are 
 valid ISO-8859-1 (accented German), but given that the rest of the file uses 
 Unicode escapes, I think they should be changed too (but add comments to say 
 what they are, e.g. o-umlaut, u-umlaut)





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085110#comment-13085110
 ] 

Sebb commented on CODEC-127:


What error do you get? Just curious.

I now get:

{code}
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:110
  {m├Ânchengladbach, 664645214},
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:130
  String[][] data = {{bergisch-gladbach, 174845214}, 
{M├╝ller-L├╝denscheidt, 65752682}};
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:137
 {Meyer, M├╝ller},
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:143
 {ganz, Gänse},
commons-codec-generics/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1222
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
commons-codec-generics/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1227
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/BeiderMorseEncoderTest.java:93
 String[] names = { ácz, átz, Ignácz, Ignátz, Ignác };
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:47
   { Nu├▒ez, spanish, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:49
   { ─îapek, czech, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:52
   { Küçük, turkish, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:55
   { Ceauşescu, romanian, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:57
   { ╬æ╬│╬│╬Á╬╗¤î¤Ç╬┐¤à╬╗╬┐¤é, greek, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:58
   { ðƒÐâÐêð║ð©ð¢, cyrillic, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:59
   { ÎøÎö΃, hebrew, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:60
   { ácz, any, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:61
   { átz, any, EXACT } });
{code}

and

{code}
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:110
 {m├Ânchengladbach, 664645214},
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:130
   String[][] data = {{bergisch-gladbach, 174845214}, 
{M├╝ller-L├╝denscheidt, 65752682}};
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:137
  {Meyer, M├╝ller},
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:143
  {ganz, Gänse},
commons-codec/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1227
  this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
commons-codec/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1232
  this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
commons-codec/src/test/org/apache/commons/codec/language/bm/BeiderMorseEncoderTest.java:93
  String[] names = { ácz, átz, Ignácz, Ignátz, Ignác };
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:47
   { Nu├▒ez, spanish, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:49
   { ─îapek, czech, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:52
   { Küçük, turkish, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:55
   { Ceauşescu, romanian, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:57
   { ╬æ╬│╬│╬Á╬╗¤î¤Ç╬┐¤à╬╗╬┐¤é, greek, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:58
   { ðƒÐâÐêð║ð©ð¢, cyrillic, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:59
   { ÎøÎö΃, hebrew, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:60
   { ácz, any, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:61
   { átz, any, EXACT } });
{code}

This was using an updated version of the script that uses File::Find to handle 
directory traversal better.
(Some lines above were shortened by manually removing leading spaces.)
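
For readers without Perl, the same kind of scan can be sketched in Java. This is not the script used above; the class name NonAsciiScanner and its methods are made up for illustration. It assumes Java 8 or later and reports "path:line text" in roughly the same shape as the output shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Commons Codec or of the Perl script.
final class NonAsciiScanner {

    /** Returns true if the line contains any char outside the 7-bit ASCII range. */
    static boolean hasNonAscii(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) > 0x7F) {
                return true;
            }
        }
        return false;
    }

    /** Walks root recursively and collects "path:lineNumber text" for each offending line. */
    static List<String> scan(Path root) throws IOException {
        List<Path> sources;
        try (Stream<Path> paths = Files.walk(root)) {
            sources = paths.filter(Files::isRegularFile)
                           .filter(p -> p.toString().endsWith(".java"))
                           .sorted()
                           .collect(Collectors.toList());
        }
        List<String> hits = new ArrayList<>();
        for (Path source : sources) {
            // ISO-8859-1 maps every byte to exactly one char, so decoding never
            // fails and any byte above 0x7F shows up as a non-ASCII char.
            List<String> lines = Files.readAllLines(source, StandardCharsets.ISO_8859_1);
            for (int i = 0; i < lines.size(); i++) {
                if (hasNonAscii(lines.get(i))) {
                    hits.add(source + ":" + (i + 1) + " " + lines.get(i));
                }
            }
        }
        return hits;
    }
}
```

Reading as ISO-8859-1 is a deliberate choice here: the goal is only to locate suspicious bytes, not to display them correctly, so the reported text may look mangled in the same way as the listings above.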

I think all the actual errors have now been 

[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Description: 
Some of the test cases include characters in a native encoding (possibly 
UTF-8), rather than using Unicode escapes.

This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
compilation errors, which is how I found the issue), and possibly some 
transformations may corrupt the contents, e.g. fixing EOL.

I think we should have a rule of using Unicode escapes for all such non-ascii 
characters.
It's particularly important for non-ISO-8859-1 characters.

Some example classes with non-ascii characters:

{code}
binary\Base64Test.java:96 byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
664645214},
language\ColognePhoneticTest.java:130 String[][] data = 
{{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
language\ColognePhoneticTest.java:143 {ganz, Gänse},
language\DoubleMetaphoneTest.java:1222 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
language\DoubleMetaphoneTest.java:1227 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:375 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:395 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
{code}

The characters are probably not correct above, because I used a crude perl 
script to find them:

{code}
perl -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
{code}

language\SoundexTest.java:367 in particular is incorrect, because it's supposed 
to be a single character.

Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
gives:

if (Character.isLetter('\ufffd'))

which is an unknown character.

Similarly for binary\Base64Test.java:96.

It's not all that clear what the Unicode escapes should be in these cases, but 
probably not the unknown character.

[Possibly the characters got mangled at some point, or maybe they have always 
been wrong]
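
One plausible way to end up with the unknown character is decoding bytes that are not valid UTF-8 as if they were. The sketch below is an illustration only (the class name ReplacementCharDemo is made up, and it says nothing about how the test files were actually damaged): a lone ISO-8859-1 byte for a-umlaut is not well-formed UTF-8, so Java's decoder substitutes U+FFFD.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class ReplacementCharDemo {

    /** Decodes bytes as UTF-8; malformed sequences are replaced with U+FFFD. */
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xE4 is a-umlaut in ISO-8859-1, but on its own it is not valid UTF-8.
        byte[] latin1 = { (byte) 0xE4 };
        String decoded = decodeAsUtf8(latin1);
        // Prints the code unit in escape form: \ufffd
        System.out.printf("\\u%04x%n", (int) decoded.charAt(0));
    }
}
```

Once the replacement has happened the original byte is gone, which would explain why the right escape cannot be recovered from the file alone.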

The ColognePhoneticTest.java cases are less serious, as the characters are 
valid ISO-8859-1 (accented German), but given that the rest of the file uses 
Unicode escapes, I think they should be changed too (but add comments to say 
what they are, e.g. o-umlaut, u-umlaut)

  was:
Some of the test cases include characters in a native encoding (possibly 
UTF-8), rather than using Unicode escapes.

This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
compilation errors, which is how I found the issue), and possibly some 
transformations may corrupt the contents, e.g. fixing EOL.

I think we should have a rule of using Unicode escapes for all such non-ascii 
characters.
It's particularly important for non-ISO-8859-1 characters.

Some example classes with non-ascii characters:

{code}
binary\Base64Test.java:96 byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
664645214},
language\ColognePhoneticTest.java:130 String[][] data = 
{{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
language\ColognePhoneticTest.java:143 {ganz, Gänse},
language\DoubleMetaphoneTest.java:1222 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
language\DoubleMetaphoneTest.java:1227 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:375 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:395 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
{code}

The characters are probably not correct above, because I used a crude perl 
script to find them:

{code}
perl ne $.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV; 
*/*.java
{code}

language\SoundexTest.java:367 in particular 

[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085128#comment-13085128
 ] 

Sebb commented on CODEC-127:


If you change Eclipse to set the container / resource / text file encoding to 
UTF-8 (since that is what the POM says), the files should display correctly, 
assuming they really are UTF-8.
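
Whether a file "really is UTF-8" can be checked mechanically. A minimal sketch, assuming the file contents are already in a byte array; the class name Utf8Check is made up for illustration. It uses a strict decoder that reports malformed input instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8Check {

    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder hit a byte sequence that is not legal UTF-8.
            return false;
        }
    }
}
```

A file saved as ISO-8859-1 with accented characters fails this check, while pure ASCII passes under either encoding.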

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085135#comment-13085135
 ] 

Sebb commented on CODEC-127:


See my fix to ColognePhoneticTest in trunk.

That now shows the native characters in comments for all Unicode escapes.

Two of the otherwise lowercase names were previously converted to the Unicode 
escapes for upper-case umlauts; I wonder if that was a mistake?
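
A mix-up like this is easy to miss by eye because the escapes for the two cases differ by a single hex digit. A small sketch of a check, with a made-up class name and sample strings that are not copied from ColognePhoneticTest:

```java
// Hypothetical helper, for illustration only.
final class EscapeCaseCheck {

    /** Returns true if the name contains an upper-case letter outside the ASCII range. */
    static boolean hasUpperCaseNonAscii(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c > 0x7F && Character.isUpperCase(c)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String lower = "m\u00F6nchengladbach"; // o-umlaut, lower case
        String upper = "m\u00D6nchengladbach"; // O-umlaut, upper case
        System.out.println(hasUpperCaseNonAscii(lower)); // false
        System.out.println(hasUpperCaseNonAscii(upper)); // true
    }
}
```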

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085145#comment-13085145
 ] 

Sebb commented on CODEC-127:


Sorry, forgot I was using a local module which handles DOS wildcards, see

http://docs.activestate.com/activeperl/5.14/lib/pods/perlwin32.html#command_line_wildcard_expansion

Either pass each file in separately, or create Wild.pm and use:

{code}
perl -MWild -ne $.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if 
m/\P{ASCII}/;$s=$ARGV; */*.java
{code}

Wild.pm only works for one level of directories.

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085149#comment-13085149
 ] 

Sebb commented on CODEC-127:


It's not that one cannot edit UTF-8; the problem is that it is easy to mangle 
non-ASCII characters by mistake.

The safest option is to use only ASCII, i.e. Unicode escapes, which are valid in 
both UTF-8 and ISO-8859-1 and all likely default encodings.

However, they are difficult to read, hence the comments on the lines.
If the comments get mangled, it will be obvious, because they won't look right; 
and it's relatively easy to fix them from the Unicode.

I don't think it's an option to use native characters in the non-comment code, 
because we already know they can get corrupted, and the corruption won't 
necessarily cause errors.

I don't see the harm in translating the code into comments; after all the 
translation can be done again.
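
To illustrate the convention, here is a minimal sketch of ASCII-only source with a descriptive comment, plus a tiny escaper that shows the translation is mechanical and repeatable. The class name UnicodeEscaper and the sample constant are made up; this is not the native2ascii tool, and it only handles chars in the Basic Multilingual Plane one code unit at a time.

```java
// Hypothetical helper, for illustration only.
final class UnicodeEscaper {

    // ASCII-only source: the escape carries the value, and the trailing
    // comment says which character it stands for.
    static final String MUELLER = "M\u00FCller"; // u-umlaut

    /** Replaces every char above 0x7F with its six-character backslash-u escape. */
    static String toEscaped(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the escaped form contains only ASCII, it survives being read or written under any of the likely default encodings, which is the point made above.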

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Issue Comment Edited] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085145#comment-13085145
 ] 

Sebb edited comment on CODEC-127 at 8/15/11 4:55 PM:
-

Sorry, forgot I was using a local module which handles DOS wildcards, see

http://docs.activestate.com/activeperl/5.14/lib/pods/perlwin32.html#command_line_wildcard_expansion

Either pass each file in separately, or create Wild.pm and use:

{code}
perl -MWild -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
{code}

Wild.pm only works for one level of directories.

  was (Author: s...@apache.org):
Sorry, forgot I was using a local module which handles DOS wildcards, see

http://docs.activestate.com/activeperl/5.14/lib/pods/perlwin32.html#command_line_wildcard_expansion

Either pass each file in separately, or create Wild.pm and use:

{code}
perl -MWild -ne $.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if 
m/\P{ASCII}/;$s=$ARGV; */*.java
{code}

Wild.pm only works for one level of directories.
  
 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085165#comment-13085165
 ] 

Sebb commented on CODEC-127:


Sorry, the closing quote (") was in the wrong place; it should have been before 
the file name params.

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb

 Some of the test cases include characters in a native encoding (possibly 
 UTF-8), rather than using Unicode escapes.
 This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
 compilation errors, which is how I found the issue), and possibly some 
 transformations may corrupt the contents, e.g. fixing EOL.
 I think we should have a rule of using Unicode escapes for all such non-ascii 
 characters.
 It's particularly important for non-ISO-8859-1 characters.
 Some example classes with non-ascii characters:
 {code}
 binary\Base64Test.java:96 byte[] decode = 
 b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
 language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
 664645214},
 language\ColognePhoneticTest.java:130 String[][] data = 
 {{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
 language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
 language\ColognePhoneticTest.java:143 {ganz, Gänse},
 language\DoubleMetaphoneTest.java:1222 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
 language\DoubleMetaphoneTest.java:1227 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
 language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:375 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:395 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 {code}
 The characters are probably not correct above, because I used a crude perl 
 script to find them:
 {code}
 perl -ne $.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if 
 m/\P{ASCII}/;$s=$ARGV; */*.java
 {code}
 language\SoundexTest.java:367 in particular is incorrect, because it's 
 supposed to be a single character.
 Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
 gives:
 if (Character.isLetter('\ufffd'))
 which is an unknown character.
 Similarly for binary\Base64Test.java:96.
 It's not all that clear what the Unicode escapes should be in these cases, 
 but probably not the unknown character.
 [Possibly the characters got mangled at some point, or maybe they have always 
 been wrong]
 The ColognePhoneticTest.java cases are less serious, as the characters are 
 valid ISO-8859-1 (accented German), but given that the rest of the file uses 
 Unicode escapes, I think they should be changed too (but add comments to say 
 what they are, e.g. o-umlaut, u-umlaut).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CHAIN-53) Global Update of Chain - Generics, JDK 1.5, Update Dependency Versions

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CHAIN-53?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085237#comment-13085237
 ] 

Sebb commented on CHAIN-53:
---

A major version bump is not required when changing the minimum Java version 
(though it would be sensible when making a major jump).

http://commons.apache.org/releases/versioning.html

 Global Update of Chain - Generics, JDK 1.5, Update Dependency Versions
 --

 Key: CHAIN-53
 URL: https://issues.apache.org/jira/browse/CHAIN-53
 Project: Commons Chain
  Issue Type: Improvement
Reporter: Elijah Zupancic
  Labels: newbie, patch

 As posted on the mailing list, I've done this work outside of an official 
 branch.
 Here is the source:
 http://elijah.zupancic.name/projects/commons-chain-v2-proof-of-concept.tar.gz
 And here is a diff:
 http://elijah.zupancic.name/projects/uber-diff
 In this patch:
 * Global upgrade to the JDK 1.5
 * Added @Override annotations
 * Upgraded to the Servlet 2.5 API
 * Upgraded to the Faces 2.1 API
 * Upgraded to the Portlet 2.0 API
 * Upgraded the Maven Parent POM version
 * Added generics support to Command so that Command's API looks like:
 public interface Command<T extends Context> {
 ...
     boolean execute(T context) throws Exception;
 }
 I'm very much new to the ASF and I was advised to file a bug in order to get 
 the process started for these changes to be integrated.
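
The generic Command signature described above could be exercised roughly as follows. This is only a sketch: the `Context` interface here is a simplified stand-in (assumed to be a `Map<String, Object>`), and `MapContext`, `GreetingCommand` and `GenericCommandDemo` are hypothetical names that do not come from the patch.

```java
import java.util.HashMap;
import java.util.Map;

class GenericCommandDemo {
    public static void main(String[] args) throws Exception {
        MapContext ctx = new MapContext();
        boolean done = new GreetingCommand().execute(ctx);
        System.out.println(done + " " + ctx.get("greeting"));
    }
}

// Simplified stand-in for the Chain Context (assumption: a plain String-keyed map).
interface Context extends Map<String, Object> {}

// The generic form from the description: the context type is a type parameter.
interface Command<T extends Context> {
    boolean execute(T context) throws Exception;
}

// Hypothetical concrete context used only for this example.
class MapContext extends HashMap<String, Object> implements Context {}

// A command bound to one context type, so no cast is needed inside execute().
class GreetingCommand implements Command<MapContext> {
    @Override
    public boolean execute(MapContext context) {
        context.put("greeting", "hello");
        return false; // this sketch treats false as "processing not complete"
    }
}
```

The point of the type parameter is that a command written for a specific context subtype gets that subtype directly, rather than casting from a raw Context.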





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085242#comment-13085242
 ] 

Sebb commented on CODEC-127:


Actually, DoubleMetaphoneTest is still corrupt; fixing now.

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb

 Some of the test cases include characters in a native encoding (possibly 
 UTF-8), rather than using Unicode escapes.
 This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
 compilation errors, which is how I found the issue), and possibly some 
 transformations may corrupt the contents, e.g. fixing EOL.
 I think we should have a rule of using Unicode escapes for all such non-ascii 
 characters.
 It's particularly important for non-ISO-8859-1 characters.
 Some example classes with non-ascii characters:
 {code}
 binary\Base64Test.java:96 byte[] decode = 
 b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
 language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
 664645214},
 language\ColognePhoneticTest.java:130 String[][] data = 
 {{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
 language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
 language\ColognePhoneticTest.java:143 {ganz, Gänse},
 language\DoubleMetaphoneTest.java:1222 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
 language\DoubleMetaphoneTest.java:1227 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
 language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:375 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:395 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 {code}
 The characters are probably not correct above, because I used a crude perl 
 script to find them:
 {code}
 perl -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if 
 m/\P{ASCII}/;$s=$ARGV;" */*.java
 {code}
 language\SoundexTest.java:367 in particular is incorrect, because it's 
 supposed to be a single character.
 Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
 gives:
 if (Character.isLetter('\ufffd'))
 which is U+FFFD, the Unicode replacement character (i.e. an unknown character).
 Similarly for binary\Base64Test.java:96.
 It's not all that clear what the Unicode escapes should be in these cases, 
 but probably not the unknown character.
 [Possibly the characters got mangled at some point, or maybe they have always 
 been wrong]
 The ColognePhoneticTest.java cases are less serious, as the characters are 
 valid ISO-8859-1 (accented German), but given that the rest of the file uses 
 Unicode escapes, I think they should be changed too (but add comments to say 
 what they are, e.g. o-umlaut, u-umlaut).
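
As an illustration of the proposed rule, the German test strings could be written with Unicode escapes plus a comment naming each character. This is a sketch only; the class name, constant names and the `isAscii` helper are made up for the example and are not taken from the Codec tests.

```java
class UnicodeEscapeDemo {
    // Each escape is followed by a comment naming the character it stands for.
    static final String CITY = "m\u00F6nchengladbach"; // o-umlaut
    static final String NAME = "M\u00FCller";          // u-umlaut
    static final String GEESE = "G\u00E4nse";          // a-umlaut

    // True if every char in the string is 7-bit ASCII.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        // The source file stays pure ASCII; the runtime strings do not.
        System.out.println(isAscii(CITY));
        System.out.println(NAME.length());
    }
}
```

Written this way the source compiles to the same strings whatever encoding the IDE or compiler assumes for the file.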





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085266#comment-13085266
 ] 

Sebb commented on CODEC-127:


Tried it here; works fine.

Probably an error in your Wild.pm, because I see the same if I omit the -MWild 
option.

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Sebb:

I get errors when I try your perl script on Windows with the latest perl (64 
bit) from ActiveState. Rather than use this space to figure out why, can you 
please run it again and check if we are done with this ticket? 

Thank you,
Gary)

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Sorry, the closing " was in the wrong place; it should have been before the 
file name params)

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Description: 
Some of the test cases include characters in a native encoding (possibly 
UTF-8), rather than using Unicode escapes.

This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
compilation errors, which is how I found the issue), and possibly some 
transformations may corrupt the contents, e.g. fixing EOL.

I think we should have a rule of using Unicode escapes for all such non-ascii 
characters.
It's particularly important for non-ISO-8859-1 characters.

Some example classes with non-ascii characters:

{code}
binary\Base64Test.java:96 byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
664645214},
language\ColognePhoneticTest.java:130 String[][] data = 
{{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
language\ColognePhoneticTest.java:143 {ganz, Gänse},
language\DoubleMetaphoneTest.java:1222 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
language\DoubleMetaphoneTest.java:1227 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:375 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:395 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
{code}

The characters are probably not correct above, because I used a crude perl 
script to find them:

{code}
perl -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" 
.java
{code}

language\SoundexTest.java:367 in particular is incorrect, because it's supposed 
to be a single character.

Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
gives:

if (Character.isLetter('\ufffd'))

which is U+FFFD, the Unicode replacement character (i.e. an unknown character).

Similarly for binary\Base64Test.java:96.

It's not all that clear what the Unicode escapes should be in these cases, but 
probably not the unknown character.
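
One way U+FFFD can appear is sketched below. It assumes, which this issue does not confirm, that the affected bytes were not valid UTF-8 (for example a single ISO-8859-1 byte for an accented letter) and were nevertheless decoded as UTF-8. The class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

class ReplacementCharDemo {
    // Decode raw bytes as UTF-8; malformed sequences are replaced with U+FFFD.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Gaense" with a-umlaut, stored as ISO-8859-1: the umlaut is the single byte 0xE4.
        byte[] latin1 = "G\u00E4nse".getBytes(StandardCharsets.ISO_8859_1);

        String wrong = decodeAsUtf8(latin1);
        String right = new String(latin1, StandardCharsets.ISO_8859_1);

        // The lone 0xE4 byte is not a complete UTF-8 sequence, so it is lost.
        System.out.println(wrong.indexOf('\uFFFD') >= 0);
        System.out.println(right.equals("G\u00E4nse"));
    }
}
```

Once a character has been replaced this way the original byte is gone, which would explain why the correct escape cannot be recovered from the file alone.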

[Possibly the characters got mangled at some point, or maybe they have always 
been wrong]

The ColognePhoneticTest.java cases are less serious, as the characters are 
valid ISO-8859-1 (accented German), but given that the rest of the file uses 
Unicode escapes, I think they should be changed too (but add comments to say 
what they are, e.g. o-umlaut, u-umlaut).

  was:
Some of the test cases include characters in a native encoding (possibly 
UTF-8), rather than using Unicode escapes.

This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
compilation errors, which is how I found the issue), and possibly some 
transformations may corrupt the contents, e.g. fixing EOL.

I think we should have a rule of using Unicode escapes for all such non-ascii 
characters.
It's particularly important for non-ISO-8859-1 characters.

Some example classes with non-ascii characters:

{code}
binary\Base64Test.java:96 byte[] decode = 
b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
664645214},
language\ColognePhoneticTest.java:130 String[][] data = 
{{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
language\ColognePhoneticTest.java:143 {ganz, Gänse},
language\DoubleMetaphoneTest.java:1222 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
language\DoubleMetaphoneTest.java:1227 
this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:375 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
this.getSoundexEncoder().encode(´┐¢));
language\SoundexTest.java:395 Assert.assertEquals(, 
this.getSoundexEncoder().encode(´┐¢));
{code}

The characters are probably not correct above, because I used a crude perl 
script to find them:

{code}
perl -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" 
*/*.java
{code}

language\SoundexTest.java:367 in 

[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: If I run the command as is, I get:
{quote}
Can't open perl script ne: No such file or directory
{quote})

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Can you post your .pm here or email to ggregory at apache dot org? )

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb





[jira] [Issue Comment Edited] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085110#comment-13085110
 ] 

Sebb edited comment on CODEC-127 at 8/15/11 8:07 PM:
-

I now get:

{code}
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:110
  {m├Ânchengladbach, 664645214},
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:130
  String[][] data = {{bergisch-gladbach, 174845214}, 
{M├╝ller-L├╝denscheidt, 65752682}};
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:137
 {Meyer, M├╝ller},
commons-codec-generics/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:143
 {ganz, Gänse},
commons-codec-generics/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1222
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
commons-codec-generics/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1227
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/BeiderMorseEncoderTest.java:93
 String[] names = { ácz, átz, Ignácz, Ignátz, Ignác };
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:47
   { Nu├▒ez, spanish, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:49
   { ─îapek, czech, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:52
   { Küçük, turkish, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:55
   { Ceauşescu, romanian, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:57
   { ╬æ╬│╬│╬Á╬╗¤î¤Ç╬┐¤à╬╗╬┐¤é, greek, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:58
   { ðƒÐâÐêð║ð©ð¢, cyrillic, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:59
   { ÎøÎö΃, hebrew, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:60
   { ácz, any, EXACT },
commons-codec-generics/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:61
   { átz, any, EXACT } });
{code}

and

{code}
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:110
 {m├Ânchengladbach, 664645214},
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:130
   String[][] data = {{bergisch-gladbach, 174845214}, 
{M├╝ller-L├╝denscheidt, 65752682}};
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:137
  {Meyer, M├╝ller},
commons-codec/src/test/org/apache/commons/codec/language/ColognePhoneticTest.java:143
  {ganz, Gänse},
commons-codec/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1227
  this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
commons-codec/src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java:1232
  this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
commons-codec/src/test/org/apache/commons/codec/language/bm/BeiderMorseEncoderTest.java:93
  String[] names = { ácz, átz, Ignácz, Ignátz, Ignác };
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:47
   { Nu├▒ez, spanish, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:49
   { ─îapek, czech, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:52
   { Küçük, turkish, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:55
   { Ceauşescu, romanian, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:57
   { ╬æ╬│╬│╬Á╬╗¤î¤Ç╬┐¤à╬╗╬┐¤é, greek, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:58
   { ðƒÐâÐêð║ð©ð¢, cyrillic, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:59
   { ÎøÎö΃, hebrew, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:60
   { ácz, any, EXACT },
commons-codec/src/test/org/apache/commons/codec/language/bm/LanguageGuessingTest.java:61
   { átz, any, EXACT } });
{code}

This was produced with an updated version of the script that uses File::Find to 
handle directory traversal better.
(Some lines shortened above by manually removing leading spaces)
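
For anyone without perl to hand, the same kind of scan could be sketched in Java. This is not the script used above; the class name `NonAsciiScan` and its methods are made up for the example, and the output format only approximates the listings.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonAsciiScan {
    // True if the line contains any char outside 7-bit ASCII.
    static boolean hasNonAscii(String line) {
        return line.chars().anyMatch(c -> c > 127);
    }

    // Walk the tree under root and report "file:lineNumber line" for each hit.
    static List<String> scan(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> p.toString().endsWith(".java"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        List<String> hits = new ArrayList<>();
        for (Path file : files) {
            // ISO-8859-1 maps every byte to one char, so odd bytes never abort the read.
            List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
            for (int i = 0; i < lines.size(); i++) {
                if (hasNonAscii(lines.get(i))) {
                    hits.add(file + ":" + (i + 1) + " " + lines.get(i));
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        scan(root).forEach(System.out::println);
    }
}
```

As with the perl output, the characters printed may not display correctly, since the scan only detects non-ASCII bytes and does not know the real encoding of each file.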

I think all the actual errors have now 

[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Typo - missing hyphen for flags)

 Non-ascii characters in source files
 

 Key: CODEC-127
 URL: https://issues.apache.org/jira/browse/CODEC-127
 Project: Commons Codec
  Issue Type: Bug
Reporter: Sebb

 Some of the test cases include characters in a native encoding (possibly 
 UTF-8), rather than using Unicode escapes.
 This can cause a problem for IDEs if they don't know the encoding (e.g. cause 
 compilation errors, which is how I found the issue), and possibly some 
 transformations may corrupt the contents, e.g. fixing EOL.
 I think we should have a rule of using Unicode escapes for all such non-ascii 
 characters.
 It's particularly important for non-ISO-8859-1 characters.
 Some example classes with non-ascii characters:
 {code}
 binary\Base64Test.java:96 byte[] decode = 
 b64.decode(SGVsbG{´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢}8gV29ybGQ=);
 language\ColognePhoneticTest.java:110 {m├Ânchengladbach, 
 664645214},
 language\ColognePhoneticTest.java:130 String[][] data = 
 {{bergisch-gladbach, 174845214}, {M├╝ller-L├╝denscheidt, 65752682}};
 language\ColognePhoneticTest.java:137 {Meyer, M├╝ller},
 language\ColognePhoneticTest.java:143 {ganz, Gänse},
 language\DoubleMetaphoneTest.java:1222 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, S);
 language\DoubleMetaphoneTest.java:1227 
 this.getDoubleMetaphone().isDoubleMetaphoneEqual(´┐¢, N);
 language\SoundexTest.java:367 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:369 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:375 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:387 if (Character.isLetter('´┐¢')) {
 language\SoundexTest.java:389 Assert.assertEquals(´┐¢000, 
 this.getSoundexEncoder().encode(´┐¢));
 language\SoundexTest.java:395 Assert.assertEquals(, 
 this.getSoundexEncoder().encode(´┐¢));
 {code}
 The characters are probably not correct above, because I used a crude perl 
 script to find them:
 {code}
 perl -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" *.java
 {code}
 language\SoundexTest.java:367 in particular is incorrect, because it's 
 supposed to be a single character.
 Now one might think that native2ascii -encoding UTF-8 would fix that, but it 
 gives:
 if (Character.isLetter('\ufffd'))
 which is an unknown character.
 Similarly for binary\Base64Test.java:96.
 It's not all that clear what the Unicode escapes should be in these cases, 
 but probably not the unknown character.
 [Possibly the characters got mangled at some point, or maybe they have always 
 been wrong]
 The ColognePhoneticTest.java cases are less serious, as the characters are 
 valid ISO-8859-1 (accented German), but given that the rest of the file uses 
 Unicode escapes, I think they should be changed too (but add comments to say 
 what they are, e.g. o-umlaut, u-umlaut)
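
One plausible explanation for the \ufffd result, offered as an assumption rather than something confirmed in this issue: if a file holds a single ISO-8859-1 byte for an accented character, that byte is malformed as UTF-8, and a UTF-8 decoder substitutes U+FFFD (the replacement character). A minimal Java sketch of that behaviour follows; the class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

class ReplacementCharDemo {

    // Decodes raw bytes as UTF-8; String's constructor replaces malformed
    // input with the replacement character U+FFFD instead of failing.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xF1 };              // n-tilde as one ISO-8859-1 byte
        byte[] utf8 = { (byte) 0xC3, (byte) 0xB1 };   // n-tilde as two UTF-8 bytes

        // The lone 0xF1 byte is not valid UTF-8, so it decodes to U+FFFD.
        System.out.println(Integer.toHexString(decodeUtf8(latin1).charAt(0)));
        // The two-byte sequence decodes to U+00F1 as intended.
        System.out.println(Integer.toHexString(decodeUtf8(utf8).charAt(0)));
    }
}
```

If that is what happened, the original byte is still recoverable by decoding the file as ISO-8859-1 instead; if the file already contained a literal U+FFFD, the original character is lost.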

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Tried it here; works fine.

Probably an error in your Wild.pm, because I see the same if I omit the -MWild 
option.)





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Perl:

I did all that and I get:

{noformat}
C:\svn\org\apache\commons\trunks-proper\codec>perl -MWild -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
syntax error at -e line 1, near "*."
Execution of -e aborted due to compilation errors.
{noformat}

I also have:

PERL5OPT=-MWild

in my environment.

Gary)





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Arg:
{noformat}
C:\svn\org\apache\commons\trunks-proper\codec>perl -MWild -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
Can't open */*.java: Invalid argument.
{noformat}
)





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: Sorry, forgot I was using a local module which handles DOS wildcards, see

http://docs.activestate.com/activeperl/5.14/lib/pods/perlwin32.html#command_line_wildcard_expansion

Either pass each file in separately, or create Wild.pm and use:

{code}
perl -MWild -ne "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
{code}

Wild.pm only works for one level of directories.)





[jira] [Updated] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated CODEC-127:
---

Comment: was deleted

(was: If I run:

{noformat}
perl -n -e "$.=1 if $s ne $ARGV;print qq($ARGV:$. $_) if m/\P{ASCII}/;$s=$ARGV;" */*.java
{noformat}

I get:
{noformat}
Can't open */*.java: Invalid argument.
{noformat}
)





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085301#comment-13085301
 ] 

Sebb commented on CODEC-127:


I think all the files are now fixed so that the code uses Unicode escapes; the 
only non-ASCII characters are now in comments.
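
As an illustration of the convention (a sketch, not the actual test code; the class and constant names are invented here), the German test strings can be written with Unicode escapes and a trailing comment naming the character:

```java
class UnicodeEscapeDemo {

    // Escaped forms of the accented names; the comments say which character
    // each escape stands for, so the source stays readable in plain ASCII.
    static final String MUELLER = "M\u00FCller";                    // u-umlaut
    static final String MOENCHENGLADBACH = "m\u00F6nchengladbach";  // o-umlaut

    public static void main(String[] args) {
        // The escape is translated by the compiler, so the string holds
        // one char for the umlaut, not six chars of escape text.
        System.out.println(MUELLER.length());
        System.out.println(Integer.toHexString(MUELLER.charAt(1)));
    }
}
```

Because the source file then contains only ASCII bytes, it compiles to the same strings whatever encoding the IDE or compiler assumes.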





[jira] [Created] (JCI-67) Dubious use of mkdirs() return code

2011-08-15 Thread Sebb (JIRA)
Dubious use of mkdirs() return code
---

 Key: JCI-67
 URL: https://issues.apache.org/jira/browse/JCI-67
 Project: Commons JCI
  Issue Type: Bug
Reporter: Sebb
Priority: Minor


FileResourceStore.java uses mkdirs() as follows:

{code}
final File parent = file.getParentFile();
if (!parent.exists()) {
    if (!parent.mkdirs()) {
        throw new IOException("could not create " + parent);
    }
}
{code}

Now mkdirs() returns true *only* if the method actually created the 
directories; it's theoretically possible for the directory to be created in the 
window between the exists() and mkdirs() invocations.

Also, the initial exists() call is redundant, because mkdirs() performs that 
check itself (in the reference implementation, at least).

I suggest the following instead:

{code}
final File parent = file.getParentFile();
if (!parent.mkdirs() && !parent.exists()) {
    throw new IOException("could not create " + parent);
}
{code}

If mkdirs() returns false, the code then checks whether the directory exists, 
so the exception is only thrown if the parent really could not be created.

The same code also appears in AbstractTestCase and 
FilesystemAlterationMonitorTestCase.





[jira] [Commented] (JCI-67) Dubious use of mkdirs() return code

2011-08-15 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/JCI-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085339#comment-13085339
 ] 

Sebb commented on JCI-67:
-

Safer would be the following, as it checks the path is actually a directory:

{code}
final File parent = file.getParentFile();
if (!parent.mkdirs() && !parent.isDirectory()) {
    throw new IOException("could not create " + parent);
}
{code}
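
The difference between the two checks can be seen when a regular file already occupies the path. The following is a small sketch under that assumption; MkdirsCheckDemo and its methods are illustrative names, not JCI code.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class MkdirsCheckDemo {

    // Check based on exists(): treats any existing path as success.
    static boolean failsWithExists(File dir) {
        return !dir.mkdirs() && !dir.exists();
    }

    // Check based on isDirectory(): succeeds only if the path is a directory.
    static boolean failsWithIsDirectory(File dir) {
        return !dir.mkdirs() && !dir.isDirectory();
    }

    // Runs both checks against a path that is a regular file, not a directory.
    // Returns { result of the exists() check, result of the isDirectory() check }.
    static boolean[] compareOnPlainFile() {
        try {
            File plain = File.createTempFile("mkdirs-demo", ".tmp");
            try {
                return new boolean[] { failsWithExists(plain), failsWithIsDirectory(plain) };
            } finally {
                plain.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = compareOnPlainFile();
        System.out.println("exists() check reports failure:      " + result[0]);
        System.out.println("isDirectory() check reports failure: " + result[1]);
    }
}
```

With a plain file in the way, mkdirs() returns false in both cases; only the isDirectory() variant goes on to report the problem.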





[jira] [Created] (IO-280) Dubious use of mkdirs() return code

2011-08-15 Thread Sebb (JIRA)
Dubious use of mkdirs() return code
---

 Key: IO-280
 URL: https://issues.apache.org/jira/browse/IO-280
 Project: Commons IO
  Issue Type: Bug
Reporter: Sebb
Priority: Minor


FileUtils.openOutputStream() has the following code:

{code}
File parent = file.getParentFile();
if (parent != null && parent.exists() == false) {
    if (parent.mkdirs() == false) {
        throw new IOException("File '" + file + "' could not be created");
    }
}
{code}

Now mkdirs() returns true only if the method actually created the directories; 
it's theoretically possible for the directory to be created in the window 
between the exists() and mkdirs() invocations. [Indeed the class actually 
checks for this in the forceMkdir() method]

It would be safer to use:

{code}
File parent = file.getParentFile();
if (parent != null && !parent.mkdirs() && !parent.isDirectory()) {
    throw new IOException("Directory '" + parent + "' could not be created"); // note changed text
}
{code}

Similarly elsewhere in the class where mkdirs() is used.
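
Since the same check would be needed wherever mkdirs() is used, it could be factored into one helper. The following is only a sketch of that idea; ParentDirs, ensureParentExists and tryEnsure are hypothetical names, not existing FileUtils methods.

```java
import java.io.File;
import java.io.IOException;

class ParentDirs {

    // Creates the parent directory of the given file if necessary.
    // A null parent (a bare relative file name) needs no directory.
    // mkdirs() returning false is not an error by itself: the directory may
    // already exist, possibly created concurrently, so isDirectory() decides.
    static void ensureParentExists(File file) throws IOException {
        File parent = file.getParentFile();
        if (parent != null && !parent.mkdirs() && !parent.isDirectory()) {
            throw new IOException("Directory '" + parent + "' could not be created");
        }
    }

    // Convenience wrapper for the demo: reports success as a boolean.
    static boolean tryEnsure(File file) {
        try {
            ensureParentExists(file);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"),
                "parentdirs-demo-" + System.nanoTime());
        File target = new File(new File(new File(base, "a"), "b"), "out.txt");

        System.out.println(tryEnsure(target));                    // first call creates a/b
        System.out.println(tryEnsure(target));                    // second call finds it present
        System.out.println(tryEnsure(new File("no-parent.txt"))); // no parent to create
    }
}
```

The second call is the interesting one: mkdirs() returns false there because nothing was created, yet the helper still succeeds.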





[jira] [Commented] (CODEC-127) Non-ascii characters in source files

2011-08-16 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085997#comment-13085997
 ] 

Sebb commented on CODEC-127:


I think Base64Test is OK - I looked back at the original commits, and found an 
uncorrupted version.

By the way, it was only Test files that needed fixing, apart from 
ColognePhonetic, where the fixes were only needed in comments anyway.





[jira] [Commented] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-08-22 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088637#comment-13088637
 ] 

Sebb commented on LANG-744:
---

The static code should probably just catch Exception.

Do we really want any RuntimeExceptions to escape into the calling code?
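
A minimal sketch of the catch-all approach, not the actual StringUtils code. The class name OptionalMethodLookup and the helper lookup() are hypothetical, and the class and method names being looked up are passed in as parameters purely for illustration. Any Exception, including the AccessControlException from the report (which is a RuntimeException), ends up as a null Method instead of a failed static initializer.

```java
import java.lang.reflect.Method;

public class OptionalMethodLookup {

    /**
     * Looks up a public no-argument method reflectively.
     * Returns null if the lookup fails for any reason.
     */
    static Method lookup(String className, String methodName) {
        try {
            return Class.forName(className).getMethod(methodName);
        } catch (Exception e) {
            // Covers ClassNotFoundException, NoSuchMethodException and
            // RuntimeExceptions such as SecurityException alike.
            return null;
        }
    }

    // Hypothetical optional dependency, resolved once in static code.
    private static final Method OPTIONAL = lookup("no.such.pkg.Helper", "run");

    /** Callers test this before attempting to use the optional method. */
    static boolean optionalAvailable() {
        return OPTIONAL != null;
    }
}
```

Note that catching Exception does not cover Errors such as NoClassDefFoundError; whether those should be caught as well is a separate decision.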

 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis

 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessClassInPackage.sun.text)
   at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
   at java.security.AccessController.checkPermission(AccessController.java:546)
   at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
   at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
   at java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
   at java.lang.Class.checkMemberAccess(Class.java:2164)
   at java.lang.Class.getMethod(Class.java:1602)
   at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Updated] (MATH-649) SimpleRegression needs the ability to suppress the intercept

2011-08-22 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated MATH-649:
--

Summary: SimpleRegression needs the ability to suppress the intercept  
(was: SimpleRegression needs the ability to surpress the intercept)

Typo

 SimpleRegression needs the ability to suppress the intercept
 

 Key: MATH-649
 URL: https://issues.apache.org/jira/browse/MATH-649
 Project: Commons Math
  Issue Type: New Feature
Affects Versions: 1.2, 2.1, 2.2
 Environment: JAVA
Reporter: greg sterijevski
Priority: Minor
  Labels: NOINTERCEPT, SIMPLEREGRESSION
 Fix For: 3.0

 Attachments: simplereg, simpleregtest

   Original Estimate: 2h
  Remaining Estimate: 2h

 The SimpleRegression class is a useful class for running regressions 
 involving one independent variable. It lacks the ability to constrain the 
 constant to be zero. I am attaching a patch which gives a constructor for 
 setting NOINT. I am also checking in two NIST data sets for noint estimation. 
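
 For reference, the no-intercept (regression through the origin) slope estimate is sum(x*y) / sum(x*x). The sketch below computes it directly. It is not the attached patch, and the class and method names are made up for illustration.

```java
public class NoInterceptSketch {

    /** Least-squares slope for the model y = b*x, i.e. with the constant forced to zero. */
    static double slopeThroughOrigin(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("x and y must be non-empty and of equal length");
        }
        double sumXY = 0.0;
        double sumXX = 0.0;
        for (int i = 0; i < x.length; i++) {
            sumXY += x[i] * y[i];
            sumXX += x[i] * x[i];
        }
        return sumXY / sumXX;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {2.0, 4.0, 6.0};
        System.out.println("slope = " + slopeThroughOrigin(x, y));
    }
}
```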





[jira] [Commented] (IO-281) WildcardFileFilter fails for wild card pattern with a '*' in it

2011-08-22 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088946#comment-13088946
 ] 

Sebb commented on IO-281:
-

Are you sure that the dir variable points to the correct directory? Try 
printing it out, and/or removing the filter.
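
One way to run that check, as a sketch using only java.io. The file name is the one from the report, resolved against the current working directory; the class and method names are hypothetical. File.listFiles() returns null when the path does not denote a directory, so printing isDirectory() for the variable quickly shows whether that is what is happening.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class DirCheckSketch {

    /** Returns the directory that contains the given file name, resolved canonically. */
    static File containingDirectory(String fileName) {
        try {
            return new File(fileName).getCanonicalFile().getParentFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // The canonical form of a file name is still a file, not its directory.
        File f = new File("320620110821433.rwd").getCanonicalFile();
        System.out.println("f   = " + f + ", isDirectory = " + f.isDirectory());

        File dir = containingDirectory("320620110821433.rwd");
        System.out.println("dir = " + dir + ", isDirectory = " + dir.isDirectory());
    }
}
```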

 WildcardFileFilter fails for wild card pattern with a '*' in it
 ---

 Key: IO-281
 URL: https://issues.apache.org/jira/browse/IO-281
 Project: Commons IO
  Issue Type: Bug
  Components: Filters
Affects Versions: 1.3.2
 Environment: Windows XP
Reporter: Dean Schulze
Priority: Blocker

 The code below reports no files found when there is a file matching the wild 
 card pattern.  If I enter this command in a DOS window in the same directory 
 it finds the file, so the wild card pattern is correct as far as DOS is 
 concerned:
 {code}
 C:\dean\clipper\src\metadata.mail>dir 320620110821433-*.RWD

  Directory of C:\dean\clipper\src\metadata.mail
 08/22/2011  12:36 PM 9,728 320620110821433-1.RWD
                1 File(s)          9,728 bytes
                0 Dir(s)  50,033,049,600 bytes free
 {code}
 This code should work according to the docs but it reports no file found:
 {code}
   void testFileNameFilter() throws IOException {

       String fileNamePrefix = "320620110821433";
       File f = new File(fileNamePrefix + ".rwd");
       String filterString = fileNamePrefix + "-*.RWD";
       FileFilter filter = new WildcardFileFilter(filterString, IOCase.SYSTEM);
       File dir = f.getCanonicalFile();
       File[] existingFiles = dir.listFiles(filter);

       if (existingFiles != null)
           for (File f2 : existingFiles)
               System.out.println(f2.getName());
       else
           System.out.println("No files found for " + filterString);
   }
 {code}





[jira] [Closed] (IO-281) WildcardFileFilter fails for wild card pattern with a '*' in it

2011-08-22 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb closed IO-281.
---


 WildcardFileFilter fails for wild card pattern with a '*' in it
 ---

 Key: IO-281
 URL: https://issues.apache.org/jira/browse/IO-281
 Project: Commons IO
  Issue Type: Bug
  Components: Filters
Affects Versions: 1.3.2
 Environment: Windows XP
Reporter: Dean Schulze
Priority: Blocker





[jira] [Resolved] (IO-281) WildcardFileFilter fails for wild card pattern with a '*' in it

2011-08-22 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved IO-281.
-

Resolution: Invalid

 WildcardFileFilter fails for wild card pattern with a '*' in it
 ---

 Key: IO-281
 URL: https://issues.apache.org/jira/browse/IO-281
 Project: Commons IO
  Issue Type: Bug
  Components: Filters
Affects Versions: 1.3.2
 Environment: Windows XP
Reporter: Dean Schulze
Priority: Blocker





[jira] [Reopened] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-08-23 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb reopened LANG-744:
---


Why do we care which Exceptions can be generated?

We take the same action in each case, so I don't see the point of enumerating 
the Exceptions, unless a different action is to be taken for some of them.

But even then, we would probably need a catch-all Exception.

 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2






[jira] [Created] (LANG-746) NumberUtils does not handle upper-case hex: 0X and -0X

2011-08-23 Thread Sebb (JIRA)
NumberUtils does not handle upper-case hex: 0X and -0X
--

 Key: LANG-746
 URL: https://issues.apache.org/jira/browse/LANG-746
 Project: Commons Lang
  Issue Type: Bug
Reporter: Sebb


NumberUtils.createNumber() should work equally for 0x1234 and 0X1234; currently 
0X1234 generates a NumberFormatException

Integer.decode() handles both upper and lower case hex.
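
A small sketch of a case-insensitive prefix check, shown next to Integer.decode() for comparison. This is not the NumberUtils code; the class and method names are hypothetical.

```java
public class HexPrefixSketch {

    /** True if s, after an optional leading minus sign, starts with 0x or 0X. */
    static boolean hasHexPrefix(String s) {
        String body = s.startsWith("-") ? s.substring(1) : s;
        return body.startsWith("0x") || body.startsWith("0X");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0x1234", "0X1234", "-0x1234", "-0X1234"}) {
            // Integer.decode() accepts both prefix cases, with or without the sign.
            System.out.println(s + " -> hex prefix: " + hasHexPrefix(s)
                    + ", decoded: " + Integer.decode(s));
        }
    }
}
```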





[jira] [Updated] (LANG-746) NumberUtils does not handle upper-case hex: 0X and -0X

2011-08-23 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LANG-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated LANG-746:
--

Affects Version/s: 3.0
   3.0.1
Fix Version/s: 3.0.2

 NumberUtils does not handle upper-case hex: 0X and -0X
 --

 Key: LANG-746
 URL: https://issues.apache.org/jira/browse/LANG-746
 Project: Commons Lang
  Issue Type: Bug
Affects Versions: 3.0, 3.0.1
Reporter: Sebb
 Fix For: 3.0.2






[jira] [Resolved] (LANG-746) NumberUtils does not handle upper-case hex: 0X and -0X

2011-08-23 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LANG-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved LANG-746.
---

Resolution: Fixed

URL: http://svn.apache.org/viewvc?rev=1160660&view=rev
Log:
LANG-746 NumberUtils does not handle upper-case hex: 0X and -0X

Modified:
   commons/proper/lang/trunk/src/main/java/org/apache/commons/lang3/math/NumberUtils.java
   commons/proper/lang/trunk/src/site/changes/changes.xml
   commons/proper/lang/trunk/src/test/java/org/apache/commons/lang3/math/NumberUtilsTest.java

 NumberUtils does not handle upper-case hex: 0X and -0X
 --

 Key: LANG-746
 URL: https://issues.apache.org/jira/browse/LANG-746
 Project: Commons Lang
  Issue Type: Bug
Affects Versions: 3.0, 3.0.1
Reporter: Sebb





[jira] [Created] (LANG-747) NumberUtils does not handle Long Hex numbers

2011-08-23 Thread Sebb (JIRA)
NumberUtils does not handle Long Hex numbers


 Key: LANG-747
 URL: https://issues.apache.org/jira/browse/LANG-747
 Project: Commons Lang
  Issue Type: Bug
Reporter: Sebb


NumberUtils.createLong() does not handle hex numbers, but createInteger() 
handles hex and octal.
This seems odd.

NumberUtils.createNumber() assumes that hex numbers can only be Integer.
Again, why not handle bigger Hex numbers?

==

It is trivial to fix createLong() - just use Long.decode() instead of valueOf().
It's not clear why this was not done originally - the decode() method was added 
to both Integer and Long in Java 1.2.

Fixing createNumber() is also fairly easy - if the hex string has more than 8 
digits, use Long.

Should we allow for leading zeros in an Integer? 
If not, the length check is trivial.
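
A sketch of the two ideas above, under the assumption that the input always carries a 0x/0X prefix. The class and method names are hypothetical and this is not the NumberUtils implementation.

```java
public class HexNumberSketch {

    /**
     * Decodes a string that is assumed to start with 0x/0X (optionally after
     * a minus sign). More than 8 hex digits selects Long, otherwise Integer.
     */
    static Number decodeHex(String s) {
        int prefixLength = s.startsWith("-") ? 3 : 2;
        int hexDigits = s.length() - prefixLength;
        if (hexDigits > 8) {
            return Long.decode(s);
        }
        return Integer.decode(s);
    }

    /** Long.valueOf() does not understand the hex prefix; this reports whether it parsed. */
    static boolean valueOfAcceptsHex(String s) {
        try {
            Long.valueOf(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

With a plain length check, leading zeros count as digits, so 0x000001234 comes back as a Long. Also note that 8-digit values above 0x7FFFFFFF are rejected by Integer.decode() with a NumberFormatException, so a real fix would probably need to treat that range separately.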





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-08-24 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13090224#comment-13090224
 ] 

Sebb commented on MATH-650:
---

Any change needs to bear in mind that the fields need to remain thread-safe, 
i.e. whatever generates them must publish the values safely to all threads. 
This is currently achieved by using the static{} block with final fields.

Seems to me there are two possible approaches to fix this:
- improve the performance of the existing code
- change the code to use initialisation on demand, so only the required parts 
are initialised.
Static holder classes can probably be used here to ensure safe publication.
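
A minimal sketch of the static holder idea, with a made-up table rather than the real FastMath data. All names here are hypothetical. The JVM initialises the holder class on first use and guarantees that this happens once and is visible to all threads, which is what gives safe publication without explicit locking.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyTablesSketch {

    // Counts table builds, only to make the lazy behaviour observable.
    static final AtomicInteger BUILDS = new AtomicInteger();

    /** Holder class: its static field is initialised on first use of the holder. */
    private static final class ExpTableHolder {
        static final double[] EXP_TABLE = buildExpTable();
    }

    private static double[] buildExpTable() {
        BUILDS.incrementAndGet();
        double[] table = new double[32];
        for (int i = 0; i < table.length; i++) {
            table[i] = Math.exp(i);
        }
        return table;
    }

    /** Needs no table, so it never triggers the holder. */
    static double floor(double x) {
        return Math.floor(x);
    }

    /** The first call initialises the holder; later calls just read the table. */
    static double expOfSmallInt(int n) {
        return ExpTableHolder.EXP_TABLE[n];
    }
}
```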

In the case of floor(), that does not need the calculated fields, so it would 
speed it up.

Note that the FastMath version of floor() is likely to have similar performance 
to Math.floor(), as it's not an algorithm that benefits from (or indeed needs) 
the FastMath approach.

It would be useful to have performance figures for more complicated 
calculations, where FastMath should start to show benefits.


 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-08-24 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13090308#comment-13090308
 ] 

Sebb commented on MATH-650:
---

bq. I wonder if we should have some way to compute these tables at compile time 
and have them simply loaded without recomputation.

Not sure the compiler can create the values.
But we could add code to print out the generated data, and then incorporate 
back into the source.

There should be no need to update it once created. However, to check the 
ongoing accuracy of the tables, the generating code could be moved into a test 
class, and used to compare against the fixed data. This would probably require 
some package-protected helper methods to give access to the private data. Or 
the generator code could remain in the FastMath class, to be called by the 
unit test code only.
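
A sketch of what that arrangement might look like, using a tiny made-up table instead of the real FastMath data. The names are hypothetical; the point is only that the literal values live in the source while the generator is kept for verification.

```java
public class PrecomputedTableSketch {

    // Values pasted into the source, as if printed once by the generator.
    static final double[] POWERS_OF_TWO = {1.0, 2.0, 4.0, 8.0, 16.0};

    /** The original generating code, kept only for verification. */
    static double[] generate(int size) {
        double[] table = new double[size];
        for (int i = 0; i < size; i++) {
            table[i] = Math.pow(2.0, i);
        }
        return table;
    }

    /** What a unit test would do: recompute and compare with the fixed data. */
    static boolean tableMatchesGenerator(double tolerance) {
        double[] expected = generate(POWERS_OF_TWO.length);
        for (int i = 0; i < expected.length; i++) {
            if (Math.abs(expected[i] - POWERS_OF_TWO[i]) > tolerance) {
                return false;
            }
        }
        return true;
    }
}
```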

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor





[jira] [Commented] (NET-420) Retrieving files from AS400 FTP systems returns null timestamps in FTPFile.getTimestamp

2011-08-25 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/NET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13091494#comment-13091494
 ] 

Sebb commented on NET-420:
--

Can you provide access to an AS400 FTP server for testing purposes?

If not, please at least provide a full listing as returned by the LIST command.

 Retrieving files from AS400 FTP systems returns null timestamps in 
 FTPFile.getTimestamp
 ---

 Key: NET-420
 URL: https://issues.apache.org/jira/browse/NET-420
 Project: Commons Net
  Issue Type: Bug
  Components: FTP
Affects Versions: 2.0
 Environment: Commons Net 2.0
 FTP System: AS400 systems
Reporter: Ramya Rajendiran
Priority: Critical

 We are trying to list files from AS400 systems and retrieve the timestamps 
 from these files using the following code:
 FTPClientConfig conf = new FTPClientConfig(FTPClientConfig.SYST_AS400);
 conf.setDefaultDateFormatStr("MM/dd/yy HH:mm:ss");
 ftpClient.configure(conf); 
 ftpClient.connect(hostName);
 FTPFile[] files = ftpClient.listFiles(remoteFileName);
 Calendar timeStamp = files[0].getTimestamp();
 The timeStamp returned is always null.
 I have also tried setting various other parsers, but that does not work either:
 FTPListParseEngine engine = 
 ftpClient.initiateListParsing("org.apache.commons.net.ftp.parser.OS400FTPEntryParser", remoteFileName);
 FTPFile[] files = engine.getNext(25);  
 The LIST command which is used internally in the FTPClient retrieves the 
 timestamps successfully. However, after parsing, the FTPFile has a null value 
 for the timestamp field.
 I tried the latest Commons Net 3.0.1 and the problem still exists.
 Please help us fix this problem. It is critical to us.





[jira] [Resolved] (IO-280) Dubious use of mkdirs() return code

2011-08-30 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/IO-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved IO-280.
-

Resolution: Fixed

 Dubious use of mkdirs() return code
 ---

 Key: IO-280
 URL: https://issues.apache.org/jira/browse/IO-280
 Project: Commons IO
  Issue Type: Bug
Reporter: Sebb
Priority: Minor

 FileUtils.openOutputStream() has the following code:
 {code}
 File parent = file.getParentFile();
 if (parent != null && parent.exists() == false) {
     if (parent.mkdirs() == false) {
         throw new IOException("File '" + file + "' could not be created");
     }
 }
 {code}
 Now mkdirs() returns true only if the method actually created the 
 directories; it's theoretically possible for the directory to be created in 
 the window between the exists() and mkdirs() invocations. [Indeed the class 
 actually checks for this in the forceMkdir() method]
 It would be safer to use:
 {code}
 File parent = file.getParentFile();
 if (parent != null && !parent.mkdirs() && !parent.isDirectory()) {
     throw new IOException("Directory '" + parent + "' could not be created"); // note changed text
 }
 {code}
 Similarly elsewhere in the class where mkdirs() is used.
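
 The same check can be tried in isolation with the sketch below. The class and method names are hypothetical and the directory name under java.io.tmpdir is arbitrary. Calling the helper twice shows that an already existing directory is not treated as a failure, even though mkdirs() itself returns false the second time.

```java
import java.io.File;

public class MkdirsSketch {

    /**
     * Returns true if dir exists as a directory afterwards, whether this call
     * created it or something else did in the meantime.
     */
    static boolean ensureDirectory(File dir) {
        return dir.mkdirs() || dir.isDirectory();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "mkdirs-sketch-" + System.nanoTime());
        System.out.println("first call:  " + ensureDirectory(dir));
        System.out.println("second call: " + ensureDirectory(dir));
        dir.delete(); // tidy up the empty directory
    }
}
```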





[jira] [Resolved] (NET-421) Problem connecting to TLS/SSL SMTP server using explicit mode

2011-08-30 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved NET-421.
--

Resolution: Fixed

 Problem connecting to TLS/SSL SMTP server using explicit mode
 -

 Key: NET-421
 URL: https://issues.apache.org/jira/browse/NET-421
 Project: Commons Net
  Issue Type: Bug
  Components: SMTP
Affects Versions: 3.0, 3.0.1
Reporter: Oliver Saggau
Priority: Critical
 Fix For: 3.1


 Just tried to send an email through the Gmail servers by doing the following:
 {code}AuthenticatingSMTPClient client = new AuthenticatingSMTPClient();
 client.connect("smtp.gmail.com", 587); // reply: 220 220 mx.google.com ESMTP
 client.login(); // reply: 250 250 mx.google.com at your service
 client.execTLS(); // reply: 220 2.0.0 Ready to start TLS
 client.auth(AUTH_METHOD.PLAIN, username, password); // exception
 ...{code}
 Unfortunately, after execTLS() I get a MalformedServerReplyException. I looked 
 at the SMTPSClient source code and found that the reader/writer are wrong 
 after execTLS() has been called. The performSSLNegotiation() method sets _input_ 
 and _output_ to the new input/output streams from the SSLSocket, but the 
 reader/writer still point to the values set inside _connectAction_().
 Possible fix for this issue:
 {code}public boolean execTLS() throws SSLException, IOException
 {
     if (!SMTPReply.isPositiveCompletion(sendCommand("STARTTLS")))
     {
         return false;
         //throw new SSLException(getReplyString());
     }
     performSSLNegotiation();
     _reader = new CRLFLineReader(new InputStreamReader(_input_, encoding));
     _writer = new BufferedWriter(new OutputStreamWriter(_output_, encoding));
     return true;
 }{code}





[jira] [Updated] (NET-421) Problem connecting to TLS/SSL SMTP server using explicit mode

2011-08-30 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated NET-421:
-

Affects Version/s: 3.0.1
Fix Version/s: 3.1

 Problem connecting to TLS/SSL SMTP server using explicit mode
 -

 Key: NET-421
 URL: https://issues.apache.org/jira/browse/NET-421
 Project: Commons Net
  Issue Type: Bug
  Components: SMTP
Affects Versions: 3.0, 3.0.1
Reporter: Oliver Saggau
Priority: Critical
 Fix For: 3.1






[jira] [Commented] (NET-420) Retrieving files from AS400 FTP systems returns null timestamps in FTPFile.getTimestamp

2011-08-30 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/NET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13093790#comment-13093790
 ] 

Sebb commented on NET-420:
--

Your code is using:

bq. conf.setDefaultDateFormatStr("MM/dd/yy HH:mm:ss");

yet the list looks like the following:

bq. -rwxrwxrwx 1 RAMYARAJ 0 22 Aug 25 22:31 file.txt

It's not surprising that the entries are not being parsed.

Try using a date format string that corresponds with the listing output, for 
example:

{code}
conf.setDefaultDateFormatStr("dd MMM yy HH:mm:ss");
{code}
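
A quick way to try candidate patterns outside the FTP client is sketched below. It assumes the format string follows java.text.SimpleDateFormat syntax and that the date portion of the listing line is "Aug 25 22:31"; the pattern "MMM d HH:mm" is only an illustration, and the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class ListingDateSketch {

    /** True if the text can be parsed with the given SimpleDateFormat pattern. */
    static boolean parses(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setLenient(false);
        try {
            format.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String listingDate = "Aug 25 22:31"; // taken from the listing line above
        for (String pattern : new String[] {"MM/dd/yy HH:mm:ss", "MMM d HH:mm"}) {
            System.out.println(pattern + " parses: " + parses(pattern, listingDate));
        }
    }
}
```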

 Retrieving files from AS400 FTP systems returns null timestamps in 
 FTPFile.getTimestamp
 ---

 Key: NET-420
 URL: https://issues.apache.org/jira/browse/NET-420
 Project: Commons Net
  Issue Type: Bug
  Components: FTP
Affects Versions: 2.0, 3.0.1
 Environment: Commons Net 2.0
 FTP System: AS400 systems
Reporter: Ramya Rajendiran
Priority: Critical





[jira] [Updated] (NET-420) Retrieving files from AS400 FTP systems returns null timestamps in FTPFile.getTimestamp

2011-08-30 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated NET-420:
-

Priority: Minor  (was: Critical)

 Retrieving files from AS400 FTP systems returns null timestamps in 
 FTPFile.getTimestamp
 ---

 Key: NET-420
 URL: https://issues.apache.org/jira/browse/NET-420
 Project: Commons Net
  Issue Type: Bug
  Components: FTP
Affects Versions: 2.0, 3.0.1
 Environment: Commons Net 2.0
 FTP System: AS400 systems
Reporter: Ramya Rajendiran
Priority: Minor





[jira] [Updated] (NET-420) Retrieving files from AS400 FTP systems returns null timestamps in FTPFile.getTimestamp

2011-08-30 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated NET-420:
-

Affects Version/s: 3.0.1

 Retrieving files from AS400 FTP systems returns null timestamps in 
 FTPFile.getTimestamp
 ---

 Key: NET-420
 URL: https://issues.apache.org/jira/browse/NET-420
 Project: Commons Net
  Issue Type: Bug
  Components: FTP
Affects Versions: 2.0, 3.0.1
 Environment: Commons Net 2.0
 FTP System: AS400 systems
Reporter: Ramya Rajendiran
Priority: Minor





[jira] [Commented] (NET-153) Add getCause method to CopyStreamException

2011-08-30 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/NET-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13093823#comment-13093823
 ] 

Sebb commented on NET-153:
--

Yes, the update does seem to have been lost.

However, now that we require Java 1.5, and getCause()/initCause() were added to 
Throwable in 1.4, would it not be better to use the underlying cause field from 
Throwable?

i.e. rather than overriding getCause, we should store the cause using 
initCause(), and update the method as follows:

{code}

public IOException getIOException()
{
return (IOException) getCause();
}
{code}

This could further be simplified to merge the initCause() in the super(message) 
method invocation once Java 1.6 is a minimum requirement.
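
A minimal sketch of that idea, assuming a simplified stand-in class (CopyStreamExceptionSketch and its members are illustrative names, not the actual Commons Net source):

```java
import java.io.IOException;

// Hypothetical, simplified stand-in for CopyStreamException.
class CopyStreamExceptionSketch extends IOException {
    private static final long serialVersionUID = 1L;

    private final long totalBytesTransferred;

    CopyStreamExceptionSketch(String message, long bytesTransferred, IOException exception) {
        super(message);
        // Store the wrapped exception in Throwable's own cause field (Java 1.4+).
        initCause(exception);
        this.totalBytesTransferred = bytesTransferred;
    }

    long getTotalBytesTransferred() {
        return totalBytesTransferred;
    }

    // No getCause() override is needed. The cast is safe here because the
    // constructor only ever accepts an IOException as the cause.
    IOException getIOException() {
        return (IOException) getCause();
    }
}
```

Because the cause lives in Throwable, the wrapped exception also shows up in the stack trace as "Caused by:" without any extra code.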


 Add getCause method to CopyStreamException
 --

 Key: NET-153
 URL: https://issues.apache.org/jira/browse/NET-153
 Project: Commons Net
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Dan Godfrey
Priority: Trivial
 Fix For: 2.0

 Attachments: CopyStreamException.patch


 Add a getCause method to CopyStreamException that has the same signature as 
 Throwable#getCause from JDK 1.4 and returns the wrapped IOException.
 This will override the existing getCause method in versions of Java >= 1.4 and 
 hence include the IOException's stack trace in the CopyStreamException's stack 
 trace, or just be ignored in Java 1.3.





[jira] [Commented] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-09-01 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095211#comment-13095211
 ] 

Sebb commented on LANG-744:
---

The message will be thrown even if the sun method is not needed; that does not 
seem right.

If the sun method is unavailable, the code that conditionally calls it throws 
UnsupportedOperationException:

"The stripAccents(CharSequence) method requires at least Java 1.6 or a Sun 
JVM"

We could record the Exception in the static block, and add it as the cause for 
the UOE.
It would then only appear when necessary.
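
A rough sketch of recording the failure and attaching it later, assuming hypothetical names throughout (OptionalFeature, and the placeholder class name "example.missing.Normalizer"); this is not the actual StringUtils code:

```java
// Hypothetical sketch; class and method names are illustrative only.
final class OptionalFeature {
    // The optional class, or null if it could not be loaded.
    private static final Class<?> OPTIONAL_CLASS;
    // Why loading failed, or null if it succeeded.
    private static final Throwable LOAD_FAILURE;

    static {
        Class<?> found = null;
        Throwable failure = null;
        try {
            // Placeholder name for a class that may be missing or forbidden.
            found = Class.forName("example.missing.Normalizer");
        } catch (ClassNotFoundException e) {
            failure = e;
        } catch (SecurityException e) {
            // AccessControlException is a subclass of SecurityException.
            failure = e;
        }
        OPTIONAL_CLASS = found;
        LOAD_FAILURE = failure;
    }

    private OptionalFeature() { }

    // Works whether or not the optional class is available.
    static String unaffected(String s) {
        return s.trim();
    }

    // Only this method reports the recorded failure, as the cause of the UOE.
    static String needsOptionalClass(String s) {
        if (OPTIONAL_CLASS == null) {
            throw new UnsupportedOperationException(
                    "Optional class is not available", LOAD_FAILURE);
        }
        return s;
    }
}
```

Callers that never touch the optional method never see the exception; those that do get the original failure as the cause.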

 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2


 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied 
 (java.lang.RuntimePermission accessClassInPackage.sun.text)
   at 
 java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
   at 
 java.security.AccessController.checkPermission(AccessController.java:546)
   at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
   at 
 com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
   at 
 java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
   at java.lang.Class.checkMemberAccess(Class.java:2164)
   at java.lang.Class.getMethod(Class.java:1602)
   at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Commented] (CHAIN-54) upgrate JUnit dependency to latest released version and adapt tests

2011-09-02 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CHAIN-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095982#comment-13095982
 ] 

Sebb commented on CHAIN-54:
---

Latest version is 4.9.

AFAIK this is backwards compatible, so tests don't *have* to be updated.
However, updating tests to the new way using annotations can make them easier 
to read and maintain.

 upgrate JUnit dependency to latest released version and adapt tests
 ---

 Key: CHAIN-54
 URL: https://issues.apache.org/jira/browse/CHAIN-54
 Project: Commons Chain
  Issue Type: Improvement
Affects Versions: 2.0
Reporter: Simone Tripodi
Assignee: Simone Tripodi
 Fix For: 2.0


 JUnit dependency has to be migrated to latest stable 4.X released - and tests 
 consequently have to be updated





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-04 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13096971#comment-13096971
 ] 

Sebb commented on MATH-650:
---

I think the simplest would be to just print out the values of the arrays at the 
end of the static block, and feed them back into the code.

Rather than deleting the setup code, it could be left as documentation - either 
commented out or disabled via an if (false) block.
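
As a sketch of the "print and feed back" step, a small helper could format a computed table as a Java array initialiser. TableDumper and toSource are hypothetical names, and the sketch assumes all values are finite (NaN and infinities would not produce valid literals):

```java
// Hypothetical helper; names are illustrative, not FastMath's.
final class TableDumper {
    private TableDumper() { }

    // Formats a computed table as a Java array initialiser that can be pasted into source.
    // Assumes every value is finite.
    static String toSource(String name, double[] table) {
        StringBuilder sb = new StringBuilder();
        sb.append("private static final double[] ").append(name).append(" = {\n");
        for (double v : table) {
            // Double.toString produces a literal that parses back to the same double.
            sb.append("    ").append(Double.toString(v)).append("d,\n");
        }
        sb.append("};\n");
        return sb.toString();
    }
}
```

Calling this at the end of the static block and pasting the output back would replace the runtime computation with constants.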

I've done most of the work to implement this.

Thoughts?

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-05 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097063#comment-13097063
 ] 

Sebb commented on MATH-650:
---

Yes, I did suggest that in an earlier comment.
However, it turns out to be quite a bit of extra work to do so, which I was 
hoping to avoid.
Also there is already a unit test which compares the accuracy with Dfp.

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-09-06 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13098076#comment-13098076
 ] 

Sebb commented on LANG-744:
---

Reworked static init in r1165701.

 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2


 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied 
 (java.lang.RuntimePermission accessClassInPackage.sun.text)
   at 
 java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
   at 
 java.security.AccessController.checkPermission(AccessController.java:546)
   at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
   at 
 com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
   at 
 java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
   at java.lang.Class.checkMemberAccess(Class.java:2164)
   at java.lang.Class.getMethod(Class.java:1602)
   at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Commented] (MATH-621) BOBYQA is missing in optimization

2011-09-06 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13098079#comment-13098079
 ] 

Sebb commented on MATH-621:
---

Hover over the comment you want to edit; there will be an edit icon in the top 
rhs of the grey box.

 BOBYQA is missing in optimization
 -

 Key: MATH-621
 URL: https://issues.apache.org/jira/browse/MATH-621
 Project: Commons Math
  Issue Type: New Feature
Affects Versions: 3.0
Reporter: Dr. Dietmar Wolz
 Fix For: 3.0

 Attachments: BOBYQA.math.patch, BOBYQA.v02.math.patch, 
 BOBYQAOptimizer.java.patch, BOBYQAOptimizer0.4.zip, bobyqa.zip, 
 bobyqa_convert.pl, bobyqaoptimizer0.4.zip, bobyqav0.3.zip

   Original Estimate: 8h
  Remaining Estimate: 8h

 During experiments with space flight trajectory optimizations I recently
 observed, that the direct optimization algorithm BOBYQA
 http://plato.asu.edu/ftp/other_software/bobyqa.zip
 from Mike Powell is significantly better than the simple Powell algorithm
 already in commons.math. It uses significantly fewer function calls and is
 more reliable for high dimensional problems. You can replace CMA-ES in many
 more application cases by BOBYQA than by the simple Powell optimizer.
 I would like to contribute a Java port of the algorithm.
 I maintained the structure of the original FORTRAN code, so the
 code is fast but not very nice.
 License status: Michael Powell has sent the agreement via snail mail
 - it hasn't arrived yet.
 Progress: The attached patch relative to the trunk contains both the
 optimizer and the related unit tests - which are all green now.  
 Performance:
 Performance difference (number of function evaluations)
 PowellOptimizer / BOBYQA for different test functions (taken from
 the unit test of BOBYQA, dimension=13 for most of the
 tests. 
 Rosen = 9350 / 1283
 MinusElli = 118 / 59
 Elli = 223 / 58
 ElliRotated = 8626 / 1379
 Cigar = 353 / 60
 TwoAxes = 223 / 66
 CigTab = 362 / 60
 Sphere = 223 / 58
 Tablet = 223 / 58
 DiffPow = 421 / 928
 SsDiffPow = 614 / 219
 Ackley = 757 / 97
 Rastrigin = 340 / 64
 The number for DiffPow should be discussed with Michael Powell,
 I will send him the details. 
 Open Problems:
 Some checkstyle violations because of the original Fortran source:
 - Original method comments were copied - doesn't follow javadoc standard
 - Multiple variable declarations in one line as in the original source
 - Problems related to goto conversions:
   gotos not convertible into loops were translated into a finite automaton 
 (switch statement)
   no default in switch
   fall through from previous case in switch
   which usually are bad style make no sense here.





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-07 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13098883#comment-13098883
 ] 

Sebb commented on MATH-650:
---

FastMath has been updated to use preset tables, eliminating the static setup 
code.

@Alexis: Would you be able to check if the changes have helped on Android?
There is a SNAPSHOT available at:

https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-math/3.0-SNAPSHOT/
commons-math-3.0-20110907.123252-61.jar

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-07 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099185#comment-13099185
 ] 

Sebb commented on MATH-650:
---

It appears that the new code is almost twice as fast as the old.
However, it can still take 20-30ms to initialise the class.

This seems to be because of the large array initialisations.
I hacked the code to comment out most of the array entries, leaving just one or 
two in each of the large arrays, and that improved the startup time to about 6 
times as fast - about 6-7ms. [Of course that code won't work properly]

So it might be worth attempting initialisation on demand, using a static holder 
class that contains the pre-calculated data.

There was also a slight speed up from removing all the unused initialisation 
code and its data items.

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-07 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099597#comment-13099597
 ] 

Sebb commented on MATH-650:
---

Yes, I looked at IODH. Turns out that the holder is not required.

Instead, one can use a static class which contains the initial data:

{code}
public class FastMath {
  // Nested class holding the preset data; initialised only on first use
  private static class lnMant {
    private static final double LN_MANT[][] = {
      ...
    };
  }
  ...
  double d = lnMant.LN_MANT[i][j];
  // was: double d = LN_MANT[i][j];
}
{code}

Very simple to implement; doing that plus commenting out all init code and data 
results in speed-up of about 6 times for FastMath.max().

Does not seem to affect performance of method calls once its table(s) has/ve 
been loaded.
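
A self-contained sketch of the same mechanism, with a counter added purely to make the lazy loading observable. LazyTables, SquareTable and the table contents are hypothetical; only the nested-class technique is taken from the comment above:

```java
// Hypothetical sketch; names and table contents are illustrative.
final class LazyTables {
    // Counts how often the nested class has been initialised (for demonstration only).
    static int tableLoads = 0;

    private LazyTables() { }

    // The JVM initialises this nested class only when one of its members is first used.
    private static class SquareTable {
        static final int[] SQUARES = build();

        private static int[] build() {
            tableLoads++;
            int[] t = new int[16];
            for (int i = 0; i < t.length; i++) {
                t[i] = i * i;
            }
            return t;
        }
    }

    // Does not touch the table, so it does not trigger its initialisation.
    static int max(int a, int b) {
        return a >= b ? a : b;
    }

    // The first call triggers initialisation of SquareTable; later calls reuse it.
    static int square(int i) {
        return SquareTable.SQUARES[i];
    }
}
```

Methods that never reference the nested class pay nothing for its data, which matches the behaviour described for FastMath.max().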

What remains to be decided is what to do with the init code. Some of it might 
be useful in its own right - Taylor expansions for sine/cosine etc. 
Perhaps create another class (SlowMath anyone?) in the same package. 
And/or move it to the test tree?

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-07 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099617#comment-13099617
 ] 

Sebb commented on MATH-650:
---

Another snapshot uploaded as commons-math-3.0-20110907.222813-62.jar

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-07 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099657#comment-13099657
 ] 

Sebb commented on MATH-650:
---

In my tests, I found that pre-calculating the data is about twice as fast as 
calculating it in the static block.

That seems a worthwhile improvement to me.

Converting the larger preset data tables to IOD gives a massive improvement for 
routines that don't need any of the IOD tables, and gives corresponding 
improvements for methods that only use some of the IOD tables.

It's also trivial to do, so I did it.

Tidying up the code to move the now-unused init code makes only a minor 
improvement to load times, but is worth it for readability and maintenance.

You can do the tests if you want.
Math 2.2 is the original FastMath implementation

https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-math/3.0-SNAPSHOT/
has the jars:
commons-math-3.0-20110907.123252-61.jar - preset arrays
commons-math-3.0-20110907.222813-62.jar - the IOD code

So yes, if an application makes lots of calls the overhead will gradually fade 
away, but the overhead is very large.
In my test I used exp(1000) which uses an IOD table. 
The repeat time for that is about 5000ns.
The first times are approx 40,000,000ns (original) and 8,000,000ns (current).

I agree that the lazy init does not help applications that use all the tables.
However all applications perform better if the table calculation is done 
beforehand.

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-658) Dead code in FastMath.pow(double, double) and some improvement in test coverage

2011-09-08 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13100397#comment-13100397
 ] 

Sebb commented on MATH-658:
---

The body format (3rd line onwards) is OK, but the header lines are incorrect.
They should look something like:

{code}
Index: src/main/java/org/apache/commons/math/util/FastMath.java
===
--- src/main/java/org/apache/commons/math/util/FastMath.java	(revision 1166437)
+++ src/main/java/org/apache/commons/math/util/FastMath.java	(working copy)
{code}

This was created by updating the file in an SVN working copy and then creating 
the patch (I used Eclipse, specifying project-relative mode, but svn diff would 
produce much the same output).

Your patch has completely different names and paths for the input and output 
files:

{code}
--- D:/DOCUME~1/tanguyy/LOCALS~1/Temp/FastMath.java-revBASE.svn000.tmp.java   jeu. sept.  8 16:28:36 2011
+++ D:/DONNEES/ATELIER_JAVA/workspace/Commons-Math_Trunk/src/main/java/org/apache/commons/math/util/FastMath.java   jeu. sept.  8 16:10:02 2011
{code}

This means it's impossible to apply the patch automatically.

However, it's not too difficult to fix the header lines, e.g. in the above case 
to:

{code}
--- FastMath.java   jeu. sept.  8 16:28:36 2011
+++ FastMath.java   jeu. sept.  8 16:10:02 2011
{code}

and the patch can then be applied in the appropriate directory.

No need to resubmit these particular patches, but if you submit any more please 
use the proper unified diff format relative to the top-level project directory, 
so paths start with src/.

 Dead code in FastMath.pow(double, double) and some improvement in test 
 coverage
 ---

 Key: MATH-658
 URL: https://issues.apache.org/jira/browse/MATH-658
 Project: Commons Math
  Issue Type: Improvement
Reporter: Yannick TANGUY
Priority: Minor
 Fix For: 3.0

 Attachments: FastMath.java.diff, FastMathTest.java, 
 FastMathTest.java.diff


 This issue concerns the FastMath class and its test class.
 (1) In the double pow(double, double) function, there are 2 identical if 
 blocks. The second one can be suppressed.
 if (y < 0 && y == yi && (yi & 1) == 1) {
 return Double.NEGATIVE_INFINITY;
 }
 // this block is never used - to be suppressed
 if (y < 0 && y == yi && (yi & 1) == 1) {
 return -0.0;
 }
 if (y > 0 && y == yi && (yi & 1) == 1) {
 return -0.0;
 }
 (2) To obtain better code coverage, we added some test cases in 
 FastMathTest.java (see attached file)
 - Added test for log1p
 - Added tests in testPowSpecialCases()
 - Added tests for a 100% coverage of acos().
 - Added tests for a 100% coverage of asin().
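
 The negative-zero branches in (1) can be cross-checked against java.lang.Math.pow, on the assumption that FastMath.pow is meant to give the same special-case results. NegativeZeroPow and its methods are hypothetical names for this sketch only:

```java
// Hypothetical sketch; not the FastMath source.
final class NegativeZeroPow {
    private NegativeZeroPow() { }

    // True if d is negative zero (0.0 == -0.0 compares equal, so compare the raw bits).
    static boolean isNegativeZero(double d) {
        return Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(-0.0);
    }

    // pow(-0.0, y): only the two live odd-integer branches are handled here.
    static double powOfNegativeZero(double y) {
        long yi = (long) y;
        boolean oddInteger = (y == yi) && ((yi & 1) == 1);
        if (y < 0 && oddInteger) {
            return Double.NEGATIVE_INFINITY;
        }
        if (y > 0 && oddInteger) {
            return -0.0;
        }
        // Every other exponent is delegated to the JDK.
        return Math.pow(-0.0, y);
    }
}
```

 With the duplicate block removed, each remaining branch is reachable: negative odd exponents give negative infinity and positive odd exponents give negative zero.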





[jira] [Resolved] (MATH-658) Dead code in FastMath.pow(double, double) and some improvement in test coverage

2011-09-08 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved MATH-658.
---

Resolution: Fixed

Patches applied.

 Dead code in FastMath.pow(double, double) and some improvement in test 
 coverage
 ---

 Key: MATH-658
 URL: https://issues.apache.org/jira/browse/MATH-658
 Project: Commons Math
  Issue Type: Improvement
Reporter: Yannick TANGUY
Priority: Minor
 Fix For: 3.0

 Attachments: FastMath.java.diff, FastMathTest.java, 
 FastMathTest.java.diff


 This issue concerns the FastMath class and its test class.
 (1) In the double pow(double, double) function, there are 2 identical if 
 blocks. The second one can be suppressed.
 if (y < 0 && y == yi && (yi & 1) == 1) {
 return Double.NEGATIVE_INFINITY;
 }
 // this block is never used - to be suppressed
 if (y < 0 && y == yi && (yi & 1) == 1) {
 return -0.0;
 }
 if (y > 0 && y == yi && (yi & 1) == 1) {
 return -0.0;
 }
 (2) To obtain better code coverage, we added some test cases in 
 FastMathTest.java (see attached file)
 - Added test for log1p
 - Added tests in testPowSpecialCases()
 - Added tests for a 100% coverage of acos().
 - Added tests for a 100% coverage of asin().





[jira] [Updated] (LANG-747) NumberUtils does not handle Long Hex numbers

2011-09-08 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LANG-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated LANG-747:
--

Component/s: lang.math.*

 NumberUtils does not handle Long Hex numbers
 

 Key: LANG-747
 URL: https://issues.apache.org/jira/browse/LANG-747
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.math.*
Reporter: Sebb

 NumberUtils.createLong() does not handle hex numbers, but createInteger() 
 handles hex and octal.
 This seems odd.
 NumberUtils.createNumber() assumes that hex numbers can only be Integer.
 Again, why not handle bigger Hex numbers?
 ==
 It is trivial to fix createLong() - just use Long.decode() instead of 
 valueOf().
 It's not clear why this was not done originally - the decode() method was 
 added to both Integer and Long in Java 1.2.
 Fixing createNumber() is also fairly easy - if the hex string has more than 8 
 digits, use Long.
 Should we allow for leading zeros in an Integer? 
 If not, the length check is trivial.
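
 A sketch of the length check described above, under simplifying assumptions: the input always starts with "0x" or "0X", there is no sign handling, and leading zeros count as digits. HexNumberDemo is a hypothetical name, not NumberUtils code:

```java
// Hypothetical sketch; not the NumberUtils implementation.
final class HexNumberDemo {
    private HexNumberDemo() { }

    // Assumes str starts with "0x" or "0X"; leading zeros count towards the digit total.
    static Number createHexNumber(String str) {
        int hexDigits = str.length() - 2;
        if (hexDigits > 8) {
            return Long.decode(str);
        }
        // Caveat: Integer.decode rejects 8-digit values above 0x7FFFFFFF.
        return Integer.decode(str);
    }
}
```

 Note that with this simple rule "0x0000000001" becomes a Long, which is exactly the leading-zeros question raised above.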





[jira] [Created] (LANG-752) Fix createLong() so it behaves like createInteger()

2011-09-08 Thread Sebb (JIRA)
Fix createLong() so it behaves like createInteger()
---

 Key: LANG-752
 URL: https://issues.apache.org/jira/browse/LANG-752
 Project: Commons Lang
  Issue Type: Sub-task
Reporter: Sebb


NumberUtils.createLong() does not handle hex numbers, but createInteger() 
handles hex and octal.

Fix it by using Long.decode() instead of Long.valueOf().
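
The difference between the two JDK methods can be seen in a small sketch. DecodeDemo is a hypothetical class, and returning null for null input is an assumption about how createLong() behaves:

```java
// Hypothetical sketch; not the NumberUtils implementation.
final class DecodeDemo {
    private DecodeDemo() { }

    // Mirrors the proposed fix: decode() understands 0x, 0X and # (hex) and a leading 0 (octal).
    static Long createLong(String str) {
        if (str == null) {
            return null; // assumed behaviour for null input
        }
        return Long.decode(str);
    }

    // The older behaviour: valueOf() accepts plain decimal text only.
    static boolean valueOfAccepts(String str) {
        try {
            Long.valueOf(str);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

One side effect worth keeping in mind is that decode() treats a leading zero as octal, so "010" decodes to 8 rather than 10.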





[jira] [Resolved] (LANG-752) Fix createLong() so it behaves like createInteger()

2011-09-08 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LANG-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved LANG-752.
---

   Resolution: Fixed
Fix Version/s: 3.0.2

 Fix createLong() so it behaves like createInteger()
 ---

 Key: LANG-752
 URL: https://issues.apache.org/jira/browse/LANG-752
 Project: Commons Lang
  Issue Type: Sub-task
  Components: lang.math.*
Reporter: Sebb
 Fix For: 3.0.2


 NumberUtils.createLong() does not handle hex numbers, but createInteger() 
 handles hex and octal.
 Fix it by using Long.decode() instead of Long.valueOf().





[jira] [Commented] (CONFIGURATION-454) Malformed pom uploaded to repositories

2011-09-08 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CONFIGURATION-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13100497#comment-13100497
 ] 

Sebb commented on CONFIGURATION-454:


Maven Central won't allow updates to jars etc. because that would mean builds 
were not repeatable.

They do allow hashes and sigs to be replaced, and they may allow the url to be 
fixed as it does not affect builds.

Worth reporting in case.

 Malformed pom uploaded to repositories
 --

 Key: CONFIGURATION-454
 URL: https://issues.apache.org/jira/browse/CONFIGURATION-454
 Project: Commons Configuration
  Issue Type: Bug
  Components: Build
Affects Versions: 1.6
Reporter: Kevin Meyer
Priority: Minor
  Labels: maven

 The pom downloaded, for example, from:
 http://uk.maven.org/maven2/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.pom
 is damaged: e.g. the url is given as:
 <url>http://commons.apache.org/${pom.artifactId.substring(8)}/</url>
 <directory>${basedir}</directory>
 etc.
 This affects the generated licenses for other projects.
 Compare with commons-collections, which is fine.
 The problem seems to be in trunk/project.xml?





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-08 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13100733#comment-13100733
 ] 

Sebb commented on MATH-650:
---

bq. As far as I understand, only the tables that are used are loaded.

Yes. I used lazy init for the larger tables only. 
There are two paired tables, each pair in its own class, and another table in a 
third class.
The tables are only referenced where they are used.

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor

 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (MATH-658) Dead code in FastMath.pow(double, double) and some improvement in test coverage

2011-09-08 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100734#comment-13100734
 ] 

Sebb commented on MATH-658:
---

Thanks - I did see the wildcard import, but left it as it is test code so not 
so important.

 Dead code in FastMath.pow(double, double) and some improvement in test 
 coverage
 ---

 Key: MATH-658
 URL: https://issues.apache.org/jira/browse/MATH-658
 Project: Commons Math
  Issue Type: Improvement
Reporter: Yannick TANGUY
Priority: Minor
 Fix For: 3.0

 Attachments: FastMath.java.diff, FastMathTest.java, 
 FastMathTest.java.diff


 This issue concerns the FastMath class and its test class.
 (1) In the double pow(double, double) function, there are 2 identical "if" 
 blocks. The second one can be suppressed.
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return Double.NEGATIVE_INFINITY;
 }
 // this block is never used - to be suppressed
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 if (y > 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 (2) To obtain better code coverage, we added some test cases in 
 FastMathTest.java (see attached file)
 - Added test for log1p
 - Added tests in testPowSpecialCases()
 - Added tests for a 100% coverage of acos().
 - Added tests for a 100% coverage of asin().
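
For illustration, here is a sketch of the branch structure once the duplicate is dropped. It assumes the first two conditions in the report test y < 0 and the third tests y > 0 (the comparison operators are not legible in this archive copy), and that the blocks apply to x == -0.0. The class and method names are hypothetical; the expected values are checked against java.lang.Math.pow, not FastMath.

```java
// Sketch of the special cases for pow(-0.0, y) with the unreachable duplicate removed.
// The method name is hypothetical; only the branch structure mirrors the report.
class NegativeZeroPow {
    /** Result for x == -0.0 and an odd integer y; NaN is used as a "not handled here" marker. */
    static double powOfNegativeZero(double y) {
        long yi = (long) y;
        if (y < 0 && y == yi && (yi & 1) == 1) {
            return Double.NEGATIVE_INFINITY;
        }
        // A second block with the same condition could never run,
        // because the block above has already returned.
        if (y > 0 && y == yi && (yi & 1) == 1) {
            return -0.0;
        }
        return Double.NaN;
    }
}
```

Comparing the raw bits (Double.doubleToRawLongBits) is one way to tell -0.0 from 0.0 in a test, since the == operator treats them as equal.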





[jira] [Commented] (MATH-658) Dead code in FastMath.pow(double, double) and some improvement in test coverage

2011-09-08 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100746#comment-13100746
 ] 

Sebb commented on MATH-658:
---

Also, just noticed some tab characters in the test class patch which I have 
just fixed. We don't allow tabs.

 Dead code in FastMath.pow(double, double) and some improvement in test 
 coverage
 ---

 Key: MATH-658
 URL: https://issues.apache.org/jira/browse/MATH-658
 Project: Commons Math
  Issue Type: Improvement
Reporter: Yannick TANGUY
Priority: Minor
 Fix For: 3.0

 Attachments: FastMath.java.diff, FastMathTest.java, 
 FastMathTest.java.diff


 This issue concerns the FastMath class and its test class.
 (1) In the double pow(double, double) function, there are 2 identical "if" 
 blocks. The second one can be suppressed.
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return Double.NEGATIVE_INFINITY;
 }
 // this block is never used - to be suppressed
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 if (y > 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 (2) To obtain better code coverage, we added some test cases in 
 FastMathTest.java (see attached file)
 - Added test for log1p
 - Added tests in testPowSpecialCases()
 - Added tests for a 100% coverage of acos().
 - Added tests for a 100% coverage of asin().





[jira] [Commented] (MATH-658) Dead code in FastMath.pow(double, double) and some improvement in test coverage

2011-09-09 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101203#comment-13101203
 ] 

Sebb commented on MATH-658:
---

Thanks, format looks OK now.

@Luc - sorry, should have noticed the incorrect testing code.

If I'm being picky, I'd say that code such as
{code}
// Logp of -1.0 should be -Inf
Assert.assertTrue(Double.isInfinite(FastMath.log1p(-1.0)));
{code}

would be better expressed as

{code}
Assert.assertTrue("Logp of -1.0 should be -Inf", Double.isInfinite(FastMath.log1p(-1.0)));
{code}

because it's then obvious what the error is without needing to check which line 
has failed.
[And what if the test class has been amended since the test run?]

No need to resubmit; I can fix that later, but please consider for future 
patches.
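
As a self-contained illustration of the point about assertion messages, the sketch below uses a local stand-in for JUnit 4's Assert.assertTrue(String, boolean) so that it runs without JUnit, and java.lang.Math.log1p in place of FastMath.log1p. The class and method names are hypothetical.

```java
// Self-contained sketch: a local stand-in for JUnit 4's Assert.assertTrue(String, boolean),
// so the example runs without JUnit on the classpath.
class MessageAssertSketch {
    static void assertTrue(String message, boolean condition) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** Uses java.lang.Math.log1p here; the patch under review tests FastMath.log1p. */
    static void checkLog1pAtMinusOne() {
        double result = Math.log1p(-1.0);
        // The message travels with the failure, so no line-number lookup is needed.
        assertTrue("log1p(-1.0) should be -Inf but was " + result,
                   result == Double.NEGATIVE_INFINITY);
    }
}
```

Note that Double.isInfinite() also accepts +Inf, so comparing against Double.NEGATIVE_INFINITY as above is a slightly stricter check than the one in the patch.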

 Dead code in FastMath.pow(double, double) and some improvement in test 
 coverage
 ---

 Key: MATH-658
 URL: https://issues.apache.org/jira/browse/MATH-658
 Project: Commons Math
  Issue Type: Improvement
Reporter: Yannick TANGUY
Priority: Minor
 Fix For: 3.0

 Attachments: FastMath.java.diff, FastMathTest.java.diff


 This issue concerns the FastMath class and its test class.
 (1) In the double pow(double, double) function, there are 2 identical "if" 
 blocks. The second one can be suppressed.
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return Double.NEGATIVE_INFINITY;
 }
 // this block is never used - to be suppressed
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 if (y > 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 (2) To obtain better code coverage, we added some test cases in 
 FastMathTest.java (see attached file)
 - Added test for log1p
 - Added tests in testPowSpecialCases()
 - Added tests for a 100% coverage of acos().
 - Added tests for a 100% coverage of asin().





[jira] [Resolved] (MATH-658) Dead code in FastMath.pow(double, double) and some improvement in test coverage

2011-09-09 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved MATH-658.
---

Resolution: Fixed

Hope this is now better resolved ...

 Dead code in FastMath.pow(double, double) and some improvement in test 
 coverage
 ---

 Key: MATH-658
 URL: https://issues.apache.org/jira/browse/MATH-658
 Project: Commons Math
  Issue Type: Improvement
Reporter: Yannick TANGUY
Priority: Minor
 Fix For: 3.0

 Attachments: FastMath.java.diff, FastMathTest.java.diff


 This issue concerns the FastMath class and its test class.
 (1) In the double pow(double, double) function, there are 2 identical "if" 
 blocks. The second one can be suppressed.
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return Double.NEGATIVE_INFINITY;
 }
 // this block is never used - to be suppressed
 if (y < 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 if (y > 0 && y == yi && (yi & 1) == 1) {
     return -0.0;
 }
 (2) To obtain better code coverage, we added some test cases in 
 FastMathTest.java (see attached file)
 - Added test for log1p
 - Added tests in testPowSpecialCases()
 - Added tests for a 100% coverage of acos().
 - Added tests for a 100% coverage of asin().





[jira] [Updated] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-11 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated MATH-650:
--

Attachment: FastMathLoadCheck.java

Very simple test to demonstrate effect of IOD and calculate.

Requires that FastMath.USE_PRECOMPUTED_TABLES be set to package-protected and 
non-final.

Should be set back to final before release.

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor
 Attachments: FastMathLoadCheck.java


 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Issue Comment Edited] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-11 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102300#comment-13102300
 ] 

Sebb edited comment on MATH-650 at 9/11/11 4:48 PM:


Very simple test to demonstrate effect of IOD and calculate.

Requires that FastMath.USE_PRECOMPUTED_TABLES be set to package-protected and 
non-final.
Should be set back to final before release.

  was (Author: s...@apache.org):
Very simple test to demonstrate effect of IOD and calculate.

Requires that FastMath.USE_PRECOMPUTED_TABLES be set to package-protected and 
non-final.

Should be set back to final before release.
  
 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor
 Attachments: FastMathLoadCheck.java


 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert





[jira] [Commented] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-09-12 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102665#comment-13102665
 ] 

Sebb commented on LANG-744:
---

It might be worth changing the static init to a lazy init (IOD).
This would reduce the overhead for applications that don't call stripAccents.

Even if it is possible to change permissions without reloading the class, I'm 
not sure we check the methods each time.
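
A rough sketch of what the lazy init could look like is below. It is not the LANG-744 patch: the class and method names are hypothetical, and it looks up java.text.Normalizer (Java 6+) rather than sun.text.Normalizer so that it runs on a current JDK. The point is only that the reflective lookup sits in a holder class and that a SecurityException is treated as "not available".

```java
import java.lang.reflect.Method;

// Hypothetical sketch, not the LANG-744 patch: the reflective lookup moves into a
// holder class, so it runs only when stripAccents-style code first needs it.
class NormalizerLookup {
    private static final class Holder {
        /** Null when the lookup is not permitted or the method is missing. */
        static final Method NORMALIZE = find();

        private static Method find() {
            try {
                Class<?> form = Class.forName("java.text.Normalizer$Form");
                Class<?> normalizer = Class.forName("java.text.Normalizer");
                return normalizer.getMethod("normalize", CharSequence.class, form);
            } catch (ClassNotFoundException e) {
                return null;
            } catch (NoSuchMethodException e) {
                return null;
            } catch (SecurityException e) {
                // AccessControlException is a SecurityException, so a sandbox
                // refusal lands here instead of escaping from a static initializer.
                return null;
            }
        }
    }

    /** Touching Holder.NORMALIZE is what triggers the one-time lookup. */
    static boolean isNormalizerAvailable() {
        return Holder.NORMALIZE != null;
    }
}
```

Because the holder is initialised once, the permission check is also made once; this matches the concern above about not re-checking the methods on every call.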



 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2


 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessClassInPackage.sun.text)
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
    at java.security.AccessController.checkPermission(AccessController.java:546)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
    at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
    at java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
    at java.lang.Class.checkMemberAccess(Class.java:2164)
    at java.lang.Class.getMethod(Class.java:1602)
    at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Commented] (CHAIN-57) Chain 2.0 does not build on older JDKs

2011-09-12 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CHAIN-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102691#comment-13102691
 ] 

Sebb commented on CHAIN-57:
---

I just committed an alternate fix that works for me on Sun Java 1.5 and does 
not require an unchecked cast.
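
For readers unfamiliar with this kind of failure, the sketch below shows one general way to avoid both return-type inference and an unchecked cast: pin the type parameter with a Class<T> argument and use Class.cast. This is only an assumption about the style of fix, not the change that was committed, and TypedContext is a hypothetical class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the committed CHAIN-57 fix.
class TypedContext {
    private final Map<Class<?>, Object> values = new HashMap<Class<?>, Object>();

    /** T is fixed by the Class<T> argument, so value must match the token. */
    <T> void put(Class<T> type, T value) {
        values.put(type, value);
    }

    /**
     * T is inferred from the argument rather than from the assignment context,
     * and Class.cast performs a checked cast, so no @SuppressWarnings("unchecked")
     * is needed.
     */
    <T> T get(Class<T> type) {
        return type.cast(values.get(type));
    }
}
```

Since T is determined by the method argument, a call such as ctx.get(String.class) should not depend on the compiler inferring T from how the result is used, which is where the quoted error message points.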

 Chain 2.0 does not build on older JDKs
 --

 Key: CHAIN-57
 URL: https://issues.apache.org/jira/browse/CHAIN-57
 Project: Commons Chain
  Issue Type: Bug
Affects Versions: 2.0
 Environment: OS name: linux version: 2.6.35-30-generic arch: 
 amd64 Family: unix
 Ubuntu 10.10 x64
 Versions Tested:
 {noformat}
 ibm-java2-x86_64-50 (1.5 j9vmxa6423ifx-20110624) [SUCCESS]
 Sun/Oracle 1.5.0_22 [FAILURE]
 OpenJdk 1.6.0_20 [SUCCESS]
 Sun/Oracle 1.6.0_11 [FAILURE]
 Sun/Oracle 1.6.0_21 [FAILURE]
 Sun/Oracle 1.6.0_27 [SUCCESS]
 ibm-java-x86_64-60 (1.6 jvmxa6460-20081105_25433) [FAILURE]
 ibm-java-x86_64-60 (1.6 jvmxa6460sr9-20110624_85526) [SUCCESS]
 {noformat}
Reporter: Elijah Zupancic
Priority: Minor

 Older versions of the JDK irrespective of vendor fail to compile chain v2.
 I recommend that we do not do any code changes, but rather inform the users 
 in the documentation to compile with a newer JDK version.
 The following is the typical output of a failed build. This particular output 
 is when I tried to build using the Sun/Oracle JDK 1.6.0_21.
 {noformat}
 mvn clean package
 [INFO] Scanning for projects...
 [INFO] 
 
 [INFO] Building Commons Chain
 [INFO]task-segment: [clean, package]
 [INFO] 
 
 [INFO] artifact org.apache.maven.plugins:maven-idea-plugin: checking for 
 updates from internal
 [INFO] Repository 'internal' will be blacklisted
 [INFO] [clean:clean {execution: default-clean}]
 [INFO] Deleting /home/elijah/dev/version-2.0-work/target
 [INFO] [antrun:run {execution: javadoc.resources}]
 [INFO] Executing tasks
 main:
  [copy] Copying 2 files to 
 /home/elijah/dev/version-2.0-work/target/apidocs/META-INF
 [INFO] Executed tasks
 [INFO] Setting property: classpath.resource.loader.class = 
 'org.codehaus.plexus.velocity.ContextClassLoaderResourceLoader'.
 [INFO] Setting property: velocimacro.messages.on = 'false'.
 [INFO] Setting property: resource.loader = 'classpath'.
 [INFO] Setting property: resource.manager.logwhenfound = 'false'.
 [INFO] [remote-resources:process {execution: default}]
 [INFO] [resources:resources {execution: default-resources}]
 [INFO] Using 'iso-8859-1' encoding to copy filtered resources.
 [INFO] Copying 2 resources to META-INF
 [INFO] [compiler:compile {execution: default-compile}]
 [INFO] Compiling 63 source files to 
 /home/elijah/dev/version-2.0-work/target/classes
 [INFO] [bundle:manifest {execution: bundle-manifest}]
 [WARNING] Warning in manifest for 
 commons-chain:commons-chain:jar:2.0-SNAPSHOT : Did not find matching referal 
 for !javax.portlet
 [INFO] [resources:testResources {execution: default-testResources}]
 [INFO] Using 'iso-8859-1' encoding to copy filtered resources.
 [INFO] Copying 2 resources
 [INFO] [compiler:testCompile {execution: default-testCompile}]
 [INFO] Compiling 37 source files to 
 /home/elijah/dev/version-2.0-work/target/test-classes
 [INFO] -
 [ERROR] COMPILATION ERROR : 
 [INFO] -
 [ERROR] /home/elijah/dev/version-2.0-work/src/test/java/org/apache/commons/chain/generic/DispatchCommandTestCase.java:[141,42] type parameters of <T>T cannot be determined; no unique maximal instance exists for type variable T with upper bounds T,java.lang.Object
 [INFO] 1 error
 [INFO] -
 [INFO] 
 
 [ERROR] BUILD FAILURE
 [INFO] 
 
 [INFO] Compilation failure
 /home/elijah/dev/version-2.0-work/src/test/java/org/apache/commons/chain/generic/DispatchCommandTestCase.java:[141,42] type parameters of <T>T cannot be determined; no unique maximal instance exists for type variable T with upper bounds T,java.lang.Object
 [INFO] 
 
 [INFO] For more information, run Maven with the -e switch
 [INFO] 
 
 [INFO] Total time: 15 seconds
 [INFO] Finished at: Wed Sep 07 08:09:12 PDT 2011
 [INFO] Final Memory: 51M/300M
 [INFO] 
 
 {noformat}


[jira] [Issue Comment Edited] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-09-12 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102665#comment-13102665
 ] 

Sebb edited comment on LANG-744 at 9/12/11 6:07 PM:


It might be worth changing the static init to a lazy init (IOD).
This would reduce the overhead for applications that don't call stripAccents.

Even if it is possible to change permissions without reloading the class, I 
don't think we should check the methods each time.



  was (Author: s...@apache.org):
It might be worth changing the static init to a lazy init (IOD).
This would reduce the overhead for applications that don't call stripAccents.

Even if it is possible to change permissions without reloading the class, I'm 
not sure we check the methods each time.


  
 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2

 Attachments: LANG-744.patch


 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessClassInPackage.sun.text)
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
    at java.security.AccessController.checkPermission(AccessController.java:546)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
    at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
    at java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
    at java.lang.Class.checkMemberAccess(Class.java:2164)
    at java.lang.Class.getMethod(Class.java:1602)
    at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Updated] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-09-12 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated LANG-744:
--

Attachment: LANG-744.patch

Patch to convert the static checks to IOD

 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2

 Attachments: LANG-744.patch


 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessClassInPackage.sun.text)
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
    at java.security.AccessController.checkPermission(AccessController.java:546)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
    at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
    at java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
    at java.lang.Class.checkMemberAccess(Class.java:2164)
    at java.lang.Class.getMethod(Class.java:1602)
    at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Commented] (COMPRESS-157) Wrong EOF detection in CBZip2InputStream

2011-09-16 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106082#comment-13106082
 ] 

Sebb commented on COMPRESS-157:
---

Are you sure the class is in Compress? I can only find 
BZip2CompressorInputStream which does not have the problem.

Commons VFS does have a file called CBZip2InputStream, and there is one 
instance of casting read() to a char:

{code}
while (bsLive < 1)
{
    char ch = 0;
    try
    {
        ch = (char) inputStream.read();
    }
    catch (IOException e)
    {
        compressedStreamEOF();
    }

    bsBuff = (bsBuff << 8) | (ch & 0xff);
    bsLive += 8;
}
{code}

That does look wrong.
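
To make the problem concrete: InputStream.read() returns an int, with -1 meaning end of stream, and casting that to char turns -1 into '\uffff', which can never compare equal to -1. The sketch below contrasts the two orders of operation. It is not the VFS or Compress code, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the pitfall and one possible correction; not the actual VFS or Compress code.
class EofCheckSketch {
    /** Mirrors the reported pattern: the cast turns -1 into '\uffff', so EOF goes unnoticed. */
    static boolean castFirstDetectsEof(InputStream in) throws IOException {
        char ch = (char) in.read();
        return ch == -1; // always false: a char is never negative
    }

    /** Keeps the int returned by read() and tests it before any narrowing cast. */
    static boolean checkFirstDetectsEof(InputStream in) throws IOException {
        int b = in.read(); // -1 at end of stream, otherwise 0..255
        return b == -1;    // narrow to char or byte only after this test
    }

    /** Runs both checks against an empty stream: returns {castFirst, checkFirst}. */
    static boolean[] demoOnEmptyStream() {
        try {
            return new boolean[] {
                castFirstDetectsEof(new ByteArrayInputStream(new byte[0])),
                checkFirstDetectsEof(new ByteArrayInputStream(new byte[0]))
            };
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the loop quoted above, the equivalent change would be to read into an int, call compressedStreamEOF() when it is -1, and only then mask it into bsBuff.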

 Wrong EOF detection in CBZip2InputStream
 

 Key: COMPRESS-157
 URL: https://issues.apache.org/jira/browse/COMPRESS-157
 Project: Commons Compress
  Issue Type: Bug
Reporter: Jan
Priority: Minor

 The following snippet from CBZip2InputStream does a wrong EOF check. The char 
 'thech' will never be equal to the integer '-1'. You have to check for 
 #read() returning -1 before casting to char. 
 I found the bug in 
 http://svn.wikimedia.org/svnroot/mediawiki/trunk/mwdumper/src/org/apache/commons/compress/bzip2/
  not in your TRUNK.
 {noformat}
 int zzi;
 char thech = 0;
 try
 {
 thech = (char)m_input.read();
 }
 catch( IOException e )
 {
 compressedStreamEOF();
 }
 if( thech == -1 ) //HERE
 {
 compressedStreamEOF();
 }
 {noformat}





[jira] [Created] (VFS-363) Wrong EOF detection in CBZip2InputStream

2011-09-16 Thread Sebb (JIRA)
Wrong EOF detection in CBZip2InputStream


 Key: VFS-363
 URL: https://issues.apache.org/jira/browse/VFS-363
 Project: Commons VFS
  Issue Type: Bug
Reporter: Sebb


See https://issues.apache.org/jira/browse/COMPRESS-157





[jira] [Commented] (MATH-650) FastMath has static code which slows the first access to FastMath

2011-09-17 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107265#comment-13107265
 ] 

Sebb commented on MATH-650:
---

Thanks, useful to know.
It would be interesting to know the times for the second invocation as well.

 FastMath has static code which slows the first access to FastMath
 -

 Key: MATH-650
 URL: https://issues.apache.org/jira/browse/MATH-650
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: Nightly Builds
 Environment: Android 2.3 (Dalvik VM with JIT)
Reporter: Alexis Robert
Priority: Minor
 Attachments: FastMathLoadCheck.java, LucTestPerformance.java


 Working on an Android application using Orekit, I've discovered that a simple 
 FastMath.floor() takes about 4 to 5 secs on a 1GHz Nexus One phone (only the 
 first time it's called). I've launched the Android profiling tool (traceview) 
 and the problem seems to be linked with the static portion of FastMath code 
 named // Initialize tables
 The timing resulted in :
 - FastMath.slowexp (40.8%)
 - FastMath.expint (39.2%)
  \- FastMath.quadmult() (95.6% of expint)
 - FastMath.slowlog (18.2%)
 Hoping that would help
 Thanks!
 Alexis Robert

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (DBUTILS-80) DbUtils.loadDriver catches Throwable

2011-09-20 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/DBUTILS-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated DBUTILS-80:


Attachment: dbutils-80.patch

 DbUtils.loadDriver catches Throwable
 

 Key: DBUTILS-80
 URL: https://issues.apache.org/jira/browse/DBUTILS-80
 Project: Commons DbUtils
  Issue Type: Bug
Reporter: Sebb
 Attachments: dbutils-80.patch


 DbUtils.loadDriver catches Throwable, which is a very bad idea.
 It should just catch Exception.
 Suggested patch to follow (also simplifies code)
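
As an illustration of the narrower catch (this is a hypothetical sketch, not the attached dbutils-80.patch, and the class name is invented):

```java
// Hypothetical sketch of the narrower catch; not the attached dbutils-80.patch.
class DriverLoaderSketch {
    /** Returns true if the named class could be loaded, false on any ordinary failure. */
    static boolean loadDriver(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (Exception e) {
            // ClassNotFoundException and other exceptions mean "driver not usable".
            // Errors such as OutOfMemoryError are deliberately left to propagate.
            return false;
        }
    }
}
```

Catching Exception still covers ClassNotFoundException, while a java.lang.Error is no longer silently converted into a false return value.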





[jira] [Updated] (DBUTILS-80) DbUtils.loadDriver catches Throwable

2011-09-20 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/DBUTILS-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb updated DBUTILS-80:


Affects Version/s: 1.3

 DbUtils.loadDriver catches Throwable
 

 Key: DBUTILS-80
 URL: https://issues.apache.org/jira/browse/DBUTILS-80
 Project: Commons DbUtils
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Sebb
 Attachments: dbutils-80.patch


 DbUtils.loadDriver catches Throwable, which is a very bad idea.
 It should just catch Exception.
 Suggested patch to follow (also simplifies code)





[jira] [Commented] (DBUTILS-81) DbUtils.loadDriver() uses Class.forName()

2011-09-22 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/DBUTILS-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112470#comment-13112470
 ] 

Sebb commented on DBUTILS-81:
-

Have you a suggested patch for this?

 DbUtils.loadDriver() uses Class.forName()
 -

 Key: DBUTILS-81
 URL: https://issues.apache.org/jira/browse/DBUTILS-81
 Project: Commons DbUtils
  Issue Type: Bug
Reporter: Simone Tripodi

 The {{Class.forName()}} statement should be avoided due to potential OSGi 
 issues - commons components are OSGi bundles!
 A ClassLoader should be used instead to [load 
 classes|http://download.oracle.com/javase/6/docs/api/java/lang/ClassLoader.html#loadClass(java.lang.String)],
 and a new method should be added that accepts a custom ClassLoader.
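
One possible shape for such a method is sketched below. The class name and both signatures are assumptions made for illustration, not an agreed DbUtils API.

```java
// Hypothetical sketch of the proposed overload; the signatures are assumptions.
class ClassLoaderDriverSketch {
    /** Convenience form: uses the loader that loaded this class. */
    static boolean loadDriver(String driverClassName) {
        return loadDriver(ClassLoaderDriverSketch.class.getClassLoader(), driverClassName);
    }

    /** Lets OSGi or container code pass the bundle's own loader. */
    static boolean loadDriver(ClassLoader classLoader, String driverClassName) {
        try {
            // loadClass alone does not run static initialisers; creating an
            // instance forces initialisation, which is when JDBC drivers register.
            classLoader.loadClass(driverClassName).getDeclaredConstructor().newInstance();
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

The instantiation step is a design choice: ClassLoader.loadClass() by itself would leave the driver class uninitialised, so a driver that registers itself in a static block would not yet be known to DriverManager.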





[jira] [Resolved] (DBUTILS-80) DbUtils.loadDriver catches Throwable

2011-09-22 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/DBUTILS-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved DBUTILS-80.
-

Resolution: Fixed

 DbUtils.loadDriver catches Throwable
 

 Key: DBUTILS-80
 URL: https://issues.apache.org/jira/browse/DBUTILS-80
 Project: Commons DbUtils
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Sebb
 Attachments: dbutils-80.patch


 DbUtils.loadDriver catches Throwable, which is a very bad idea.
 It should just catch Exception.
 Suggested patch to follow (also simplifies code)





[jira] [Commented] (LANG-744) StringUtils throws java.security.AccessControlException on Google App Engine

2011-09-22 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112992#comment-13112992
 ] 

Sebb commented on LANG-744:
---

Any objection to applying the patch to convert the method checks to IOD?

That will remove the overhead for applications that don't use stripAccents.

 StringUtils throws java.security.AccessControlException on Google App Engine
 

 Key: LANG-744
 URL: https://issues.apache.org/jira/browse/LANG-744
 Project: Commons Lang
  Issue Type: Bug
  Components: lang.*
Affects Versions: 3.0.1
 Environment: Google App Engine
Reporter: Clément Denis
 Fix For: 3.0.2

 Attachments: LANG-744.patch


 In the static initializer of org.apache.commons.lang3.StringUtils, there is 
 an attempt to load the class sun.text.Normalizer.
 Such a class is prohibited on Google App Engine, and the static initializer 
 throws a java.security.AccessControlException.
 {code}
 Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessClassInPackage.sun.text)
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
    at java.security.AccessController.checkPermission(AccessController.java:546)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
    at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:166)
    at java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1512)
    at java.lang.Class.checkMemberAccess(Class.java:2164)
    at java.lang.Class.getMethod(Class.java:1602)
    at org.apache.commons.lang3.StringUtils.<clinit>(StringUtils.java:739)
 {code}
 The exception should be caught in the catch clauses around 
 loadClass("sun.text.Normalizer").
 Commons lang 2 worked fine on GAE.





[jira] [Commented] (VFS-360) Migrate to HttpComponent HttpClient from the old Commons HttpClient

2011-09-24 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/VFS-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114105#comment-13114105
 ] 

Sebb commented on VFS-360:
--

HC4 uses a different groupId/artifactId and package names, so AFAICT VFS could 
be updated without affecting JackRabbit.

 Migrate to HttpComponent HttpClient from the old Commons HttpClient
 ---

 Key: VFS-360
 URL: https://issues.apache.org/jira/browse/VFS-360
 Project: Commons VFS
  Issue Type: Improvement
Reporter: Gary D. Gregory

 Migrate to HttpComponent HttpClient from the old Commons HttpClient.





[jira] [Resolved] (NET-460) _retrieveFile() blocks calling thread, on FTP I/O till the time file transfer is complete

2012-04-23 Thread Sebb (JIRA)

 [ 
https://issues.apache.org/jira/browse/NET-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebb resolved NET-460.
--

Resolution: Duplicate

 _retrieveFile() blocks calling thread, on FTP I/O till the time file transfer 
 is complete
 -

 Key: NET-460
 URL: https://issues.apache.org/jira/browse/NET-460
 Project: Commons Net
  Issue Type: Improvement
  Components: FTP
Affects Versions: 3.1
 Environment: linux/windows
Reporter: Agent Vinod
  Labels: newbie, patch

 The function _retrieveFile() in FTPClient.java does not respond to 
 interrupts from the calling thread.
 For example:
 A basic FTP client application has one main (parent) thread and one child 
 thread.
 The main thread handles everything except the FTPClient download/upload.
 The child thread handles only the FTPClient-related calls, mainly 
 _retrieveFile().
 Steps to reproduce:
 1) The main thread starts the child thread.
 2) The child thread is downloading a file using _retrieveFile(String 
 command, String remote, OutputStream local).
 3) After some time, the main thread interrupts the child thread to stop 
 (abort) the download.
 Expected behavior:
 The child thread immediately aborts the download and dies.
 Observed behavior:
 The child thread blocks in _retrieveFile(String command, String remote, 
 OutputStream local) until the file finishes downloading.
 Only then does the child thread respond to any interrupt from the parent 
 thread.
 My workaround:
 File:  FTPClient.java
 Class: FTPClient
 Step 1: Declare a field: private Socket mySocket;
 Step 2: In the method protected boolean _retrieveFile(String command, 
 String remote, OutputStream local) throws IOException {}
 comment out:  Socket socket;
 and instead use:  mySocket (the field declared in step 1)
 Step 3: In the method public boolean abort() throws IOException
 add the statement: Util.closeQuietly(mySocket);
 before the statement: return FTPReply.isPositiveCompletion(abor());
 This way, every time the main thread calls abort(), the active download 
 blocked on mySocket in _retrieveFile() is immediately interrupted and stopped, 
 raising an immediate exception and thus stopping the child thread (of course 
 one needs to catch this exception properly).
 I am not sure whether this is the right way of doing it, and I am afraid it 
 may break something else.
 I request that the core developers look into a better solution than this 
 workaround.
 Thanks





[jira] [Commented] (NET-460) _retrieveFile() blocks calling thread, on FTP I/O till the time file transfer is complete

2012-04-23 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/NET-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259566#comment-13259566
 ] 

Sebb commented on NET-460:
--

The abort() method sends an ABOR command to the server; that is its only 
function.
If the server fails to honour the ABOR command, that is not the fault of 
Commons NET.
Unfortunately, many FTP servers seem to stop processing the control channel 
whilst data transfer is in use.

The abort() method cannot be extended as suggested.

However, it might be possible to provide a new method which allows the data 
socket to be closed as suggested.
This is covered under NET-419.
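
The idea of closing the data socket from another thread can be illustrated with plain java.net sockets. This is only a sketch under stated assumptions: it does not use Commons Net, the class CloseToUnblock and the method closeUnblocksReader are hypothetical names, and a loopback server socket that never writes stands in for a stalled FTP data connection. A thread blocked in a read on a java.net.Socket stream is generally not woken by Thread.interrupt(), whereas Socket.close() is documented to make a thread blocked in I/O on that socket throw a SocketException.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CloseToUnblock {

    /** Returns true if closing the socket woke the blocked reader with an IOException. */
    static boolean closeUnblocksReader() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket data = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) { // peer never writes, so read() blocks

            CountDownLatch aboutToRead = new CountDownLatch(1);
            CountDownLatch readFailed = new CountDownLatch(1);

            Thread reader = new Thread(() -> {
                try {
                    aboutToRead.countDown();
                    data.getInputStream().read(); // blocks: no bytes ever arrive
                } catch (IOException e) {
                    readFailed.countDown();       // expected once the socket is closed
                }
            });
            reader.setDaemon(true);
            reader.start();

            aboutToRead.await();
            Thread.sleep(200);  // give the reader time to enter read()

            data.close();       // the "abort", issued from the controlling thread

            boolean woke = readFailed.await(5, TimeUnit.SECONDS);
            reader.join(5000);
            return woke && !reader.isAlive();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reader unblocked by close: " + closeUnblocksReader());
    }
}
```

The reader thread has to treat the resulting IOException as a deliberate cancellation rather than a transfer failure, which is the exception-handling caveat the reporter mentions.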




