[jira] [Commented] (TEXT-58) All uppercase methods?

2017-01-16 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEXT-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824763#comment-15824763
 ] 

Gilles commented on TEXT-58:


bq. Also, any thoughts on how to properly rename ISO8859_1_ESCAPE in camel 
case? iso8859-1-Escape?

{{iso8859-1-Escape}} is not a valid identifier.


> All uppercase methods?
> --
>
> Key: TEXT-58
> URL: https://issues.apache.org/jira/browse/TEXT-58
> Project: Commons Text
>  Issue Type: Improvement
>Reporter: Gilles
>Priority: Trivial
>  Labels: api, convention, standard
> Fix For: 1.0
>
>
> Class {{o.a.c.text.translate.EntityArrays}} contains methods with names in all 
> uppercase letters (and underscores).
> I understand that they create copies of _static_ constants (although even 
> that is not true since they return arrays!), but are you sure you want to 
> release a new component that does not follow the usual convention?
> I understand these come from LANG, but isn't it the right time to fix the API?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TEXT-59) Mutable static data

2017-01-16 Thread Gilles (JIRA)
Gilles created TEXT-59:
--

 Summary: Mutable static data
 Key: TEXT-59
 URL: https://issues.apache.org/jira/browse/TEXT-59
 Project: Commons Text
  Issue Type: Bug
Reporter: Gilles
 Fix For: 1.0


Class {{o.a.c.text.translate.EntityArrays}} contains methods that return mutable 
data.
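
To illustrate the general hazard (this is a minimal sketch, not the actual {{EntityArrays}} code; the class and method names below are hypothetical), a static array handed out directly can be modified by any caller, whereas an unmodifiable view rejects writes:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; it does not mirror the real EntityArrays internals.
final class MutableStaticDemo {
    private static final String[] NAMES = {"quot", "amp"};

    // Leaky: returns the shared array itself, so callers can overwrite entries.
    static String[] leakyNames() {
        return NAMES;
    }

    private static final List<String> SAFE_NAMES =
            Collections.unmodifiableList(Arrays.asList("quot", "amp"));

    // Safer: an unmodifiable view throws UnsupportedOperationException on writes.
    static List<String> safeNames() {
        return SAFE_NAMES;
    }

    public static void main(String[] args) {
        leakyNames()[0] = "changed";
        System.out.println(leakyNames()[0]); // the shared state was altered
        try {
            safeNames().set(0, "changed");
        } catch (UnsupportedOperationException e) {
            System.out.println("write rejected");
        }
    }
}
```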





[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824663#comment-15824663
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
--

Github user ansell commented on the issue:

https://github.com/apache/commons-rdf/pull/30
  
In the RDF4J/Sesame case, some users have requested, and others have 
complained about, both lowercasing (which was used in the past) and 
canonicalisation. RDF4J will therefore default to leaving case alone, but any 
user is free to switch on canonicalisation. Currently there isn't a 
lowercase-all-tags option, but that may also appear in the future.

For reference, the language tag canonicalisation procedure that RDF4J 
optionally uses, which relies on the JDK's copy of the IANA Language Subtag 
Registry, is:

```
new Locale.Builder().setLanguageTag(tag).build().toLanguageTag()
```

There are other possible methods, but the method above is the only one that 
I could find which throws an error if the original tag is ill-formed.
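
As a rough sketch of how that one-liner behaves (the wrapper class and method names here are my own, not RDF4J API), it can be wrapped so that ill-formed tags are detected via the `IllformedLocaleException` thrown by `Locale.Builder.setLanguageTag`:

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical helper; only the Locale.Builder call chain comes from the comment above.
final class LanguageTagCanonicalizer {

    // Returns the JDK's canonical form of a BCP47 tag; throws if the tag is ill-formed.
    static String canonicalize(String tag) {
        return new Locale.Builder().setLanguageTag(tag).build().toLanguageTag();
    }

    // Convenience check built on the exception behaviour.
    static boolean isWellFormed(String tag) {
        try {
            canonicalize(tag);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("EN-gb"));     // prints "en-GB"
        System.out.println(isWellFormed("not a tag")); // prints "false"
    }
}
```

Note that the result for deprecated or grandfathered tags can depend on the JDK version, so the example sticks to a simple language-region tag.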


> RDF-1.1 specifies that language tags need to be compared using lower-case
> -
>
> Key: COMMONSRDF-51
> URL: https://issues.apache.org/jira/browse/COMMONSRDF-51
> Project: Apache Commons RDF
>  Issue Type: Bug
>  Components: api
>Affects Versions: 0.3.0
>Reporter: Peter Ansell
>Assignee: Stian Soiland-Reyes
>
> The RDF-1.1 specification states that the [value space of Literal language 
> tags is 
> lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which 
> does not conflict with the case-insensitive specification in BCP47. The 
> Literal.equals and Literal.hashCode API contracts should specify that 
> language tags must be compared using lowercase, even if they are otherwise 
> stored and returned as upper-case by getLanguageTag. The API currently has 
> incorrect language by saying "character-by-character" for language tag 
> comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known 
> example where lowercase and uppercase do not round-trip as expected for 
> US-ASCII characters is Turkish [1]), so I would recommend actually stating 
> that .toLowerCase(Locale.ENGLISH) is used.
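
The Turkish pitfall mentioned in the description can be reproduced with a small sketch (the class name is illustrative only): under a Turkish locale, the ASCII letter `I` lowercases to the dotless `ı` (U+0131), while `Locale.ENGLISH` gives the expected ASCII result.

```java
import java.util.Locale;

// Illustrative only; shows why a fixed locale matters when lowercasing tags.
final class LocaleSafeLowercase {

    // Lowercases with a fixed locale, as recommended in the issue description.
    static String lower(String tag) {
        return tag.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // "IT" (Italian, upper-cased) becomes "\u0131t" under Turkish rules, not "it".
        System.out.println("IT".toLowerCase(turkish).equals("it")); // prints "false"
        System.out.println(lower("IT").equals("it"));               // prints "true"
    }
}
```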





[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824660#comment-15824660
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
--

Github user afs commented on the issue:

https://github.com/apache/commons-rdf/pull/30
  
@ansell mentions one of the reasons the wording for RDF 1.1 is not so direct 
- RDF 1.0 did not sanction the common normalization defined in BCP47 
canonicalization, although that actually requires consulting the registry as 
well.

Jena is lax by default, and retains the form as originally written. In 
practice, datasets seem to be internally consistent, all lower case or all 
syntax-canonical. 

Variations of case are different nodes in the general case but are 
`Node.sameValue` (compare) and cause matching in graph.find. Some storage 
layers may differ and canonicalize the form, in order to index.





[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824637#comment-15824637
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
--

Github user ansell commented on a diff in the pull request:

https://github.com/apache/commons-rdf/pull/30#discussion_r96309778
  
--- Diff: api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java ---
@@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
 assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
 }
 
+public void testCreateLiteralLangCaseInsensitive() throws Exception {
+// COMMONSRDF-51: Literal langtag may not be in lowercase, but
+// must be COMPARED (aka .equals and .hashCode()) in lowercase
+// as the language space is lower case.   
+final Literal lower = factory.createLiteral("Hello", "en-gb"); 
+final Literal upper = factory.createLiteral("Hello", "EN-GB"); 
+final Literal mixed = factory.createLiteral("Hello", "en-GB");
+
+
+assertEquals("en-gb", lower.getLanguageTag().get());
--- End diff --

RDF4J may not follow this in some cases. It may use the BCP47 normalisation 
conventions to obtain en-GB instead.




[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824636#comment-15824636
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
--

Github user ansell commented on a diff in the pull request:

https://github.com/apache/commons-rdf/pull/30#discussion_r96309546
  
--- Diff: api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java ---
@@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
 assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
 }
 
+public void testCreateLiteralLangCaseInsensitive() throws Exception {
--- End diff --

Does this need @Test annotation?




[jira] [Commented] (DAEMON-341) prunsrv injects garbage into ImagePath

2017-01-16 Thread Mikhail Dobrinin (JIRA)

[ 
https://issues.apache.org/jira/browse/DAEMON-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824409#comment-15824409
 ] 

Mikhail Dobrinin commented on DAEMON-341:
-

This may be resolved in version 1.0.15.

> prunsrv injects garbage into ImagePath
> --
>
> Key: DAEMON-341
> URL: https://issues.apache.org/jira/browse/DAEMON-341
> Project: Commons Daemon
>  Issue Type: Bug
>  Components: Procrun
>Affects Versions: 1.0.13
> Environment: Windows Server 2008 (not R2)
>Reporter: Mikhail Dobrinin
>
> Here is a reproducible example that works every time:
> {noformat}
> prunsrv.exe //IS//abcd.branch2 --StartMode=jvm 
> --StartClass=abc.abcdefghih.abcd.abcdef.abcd.MyImportantClass 
> --StartMethod=startService ++StartParams=abcd.branch2
> {noformat}
> The ImagePath entry for the service ends up being:
> {noformat}
> C:\path\to\prunsrv.exe 12-08.loɥ//RS//abcd.branch2
> {noformat}
> As you see, there is garbage inserted in front of the {{//RS//}} string.





[jira] [Updated] (DAEMON-341) prunsrv injects garbage into ImagePath

2017-01-16 Thread Mikhail Dobrinin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DAEMON-341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Dobrinin updated DAEMON-341:

Affects Version/s: (was: 1.0.15)
   1.0.13



[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824368#comment-15824368
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
--

Github user stain commented on the issue:

https://github.com/apache/commons-rdf/pull/30
  
There seems to be consensus on 
http://lists.w3.org/Archives/Public/public-rdf-comments/2017Jan/thread.html and 
http://lists.w3.org/Archives/Public/semantic-web/2017Jan/thread.html in the 
_Are literal language tags case sensitive?_ threads that it is not meant to be 
a change from RDF 1.0: language tags should still be compared case 
insensitively.

That should be in line with what this PR suggests: case-insensitive comparison 
in `.equals()` and `.hashCode()`.

Do you agree on that line, @afs and @ansell ..?
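
For illustration, a minimal sketch of that contract might look like the following. The `LangLiteral` class is hypothetical and is not the Commons RDF `Literal` interface or any existing implementation; it only shows the tag being stored as given while being compared in lowercase with a fixed locale.

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical value class, not part of Commons RDF.
final class LangLiteral {
    private final String lexicalForm;
    private final String languageTag; // kept exactly as supplied

    LangLiteral(String lexicalForm, String languageTag) {
        this.lexicalForm = Objects.requireNonNull(lexicalForm);
        this.languageTag = Objects.requireNonNull(languageTag);
    }

    String getLanguageTag() {
        return languageTag;
    }

    // Comparison form: lowercased with a fixed locale to avoid locale surprises.
    private String comparableTag() {
        return languageTag.toLowerCase(Locale.ENGLISH);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof LangLiteral)) {
            return false;
        }
        LangLiteral that = (LangLiteral) other;
        return lexicalForm.equals(that.lexicalForm)
                && comparableTag().equals(that.comparableTag());
    }

    @Override
    public int hashCode() {
        // Must use the same lowercased form as equals() to keep the contract.
        return Objects.hash(lexicalForm, comparableTag());
    }
}
```

With this sketch, `new LangLiteral("Hello", "en-gb")` and `new LangLiteral("Hello", "EN-GB")` are equal and share a hash code, while `getLanguageTag()` still returns each tag in its original case.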




[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824366#comment-15824366
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
--

Github user stain commented on the issue:

https://github.com/apache/commons-rdf/pull/30
  
This pull request returns `getLanguageTag()` in whatever case the 
underlying platform does (e.g. I think RDF4J and JSONLD-Java preserve casing, 
while Jena and Simple convert to lowercase).

I think it is only in `.equals()` and `.hashCode()` that we need case 
insensitivity.

There are arguments both ways: should we provide a consistent view across 
the implementations (e.g. always lowercase), or should we stay consistent 
with what the underlying implementation does (e.g. if it preserves casing 
for presentation purposes)? 

Commons RDF doesn't currently have any value handling mechanisms for, say, 
converting `"13.37"^^xsd:float` to a Java float `13.37f` (without going through 
the underlying implementations and related methods), or for determining value 
equality, so I think it is not too weird if Commons RDF doesn't do anything 
clever about language tags either (beyond spec compliance).

But if someone were to add a Commons RDF API for such literal value 
handling, it could be natural to also add "utils" methods for presenting or 
parsing language tags (e.g. `isLanguageTagEqual("en-us", "en-US")`) as well as 
hierarchical comparisons, something like `isSameLanguageTagFamily("en-us", 
"en-GB")`.
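
Purely as a sketch of what such utilities could look like (no such API exists in Commons RDF; the method names are taken from the suggestion above and the `LanguageTags` class is invented here), "family" is assumed to mean "same primary language subtag":

```java
import java.util.Locale;

// Hypothetical utility class; nothing here is existing Commons RDF API.
final class LanguageTags {

    // Case-insensitive equality of two whole language tags.
    static boolean isLanguageTagEqual(String a, String b) {
        return a.toLowerCase(Locale.ENGLISH).equals(b.toLowerCase(Locale.ENGLISH));
    }

    // Assumption: two tags are in the same family if their primary subtags match.
    static boolean isSameLanguageTagFamily(String a, String b) {
        return primarySubtag(a).equals(primarySubtag(b));
    }

    // Extracts the part before the first '-' and lowercases it.
    private static String primarySubtag(String tag) {
        int dash = tag.indexOf('-');
        String primary = dash < 0 ? tag : tag.substring(0, dash);
        return primary.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(isLanguageTagEqual("en-us", "en-US"));      // prints "true"
        System.out.println(isSameLanguageTagFamily("en-us", "en-GB")); // prints "true"
        System.out.println(isSameLanguageTagFamily("en", "fr"));       // prints "false"
    }
}
```

A fuller version would probably need to handle extended language subtags and scripts, which this simple prefix comparison ignores.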





[jira] [Commented] (IO-373) FileUtils.byteCountToDisplaySize improvement/rounding issues

2017-01-16 Thread Asaf Mesika (JIRA)

[ 
https://issues.apache.org/jira/browse/IO-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824112#comment-15824112
 ] 

Asaf Mesika commented on IO-373:


Any reason nothing is merged in this issue?

> FileUtils.byteCountToDisplaySize improvement/rounding issues
> 
>
> Key: IO-373
> URL: https://issues.apache.org/jira/browse/IO-373
> Project: Commons IO
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.4
>Reporter: Mark
>Priority: Minor
> Attachments: byteCountToDisplaySize.patch, 
> byteCountToHumanReadableGnu.patch
>
>
> Issue IO-226 is not fixed but closed.
> ?
> Here is my solution that also supports a user-defined precision in terms of a 
> maximum length of the digits part.





[jira] [Commented] (TEXT-58) All uppercase methods?

2017-01-16 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEXT-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824016#comment-15824016
 ] 

Gilles commented on TEXT-58:


I've not checked the code, but are those methods actually useful (other than 
being used internally by TEXT codes)?

If the functionality is necessary in the public API, and since much defensive 
copying happens anyway, why not define a single method
{code}
public String[][] getEscape(EscapeId id) {
    return copy(escapes.get(id));
}
{code}
where the {{EscapeId}} argument would be an {{enum}}, {{escapes}} a static 
instance of {{EnumMap<EscapeId, String[][]>}} and {{copy}} a private method 
performing the copy?
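
A self-contained sketch of that idea might look as follows. The `EscapeTables` class name, the enum constants and the table contents are placeholders chosen for illustration rather than the real {{EntityArrays}} data, and the accessor is made static here for brevity.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch only: names and table contents are assumptions, not Commons Text API.
final class EscapeTables {

    // Hypothetical identifiers for the available lookup tables.
    enum EscapeId { BASIC, APOS }

    private static final Map<EscapeId, String[][]> ESCAPES = new EnumMap<>(EscapeId.class);

    static {
        ESCAPES.put(EscapeId.BASIC, new String[][] {
            {"\"", "&quot;"}, {"&", "&amp;"}, {"<", "&lt;"}, {">", "&gt;"}
        });
        ESCAPES.put(EscapeId.APOS, new String[][] {
            {"'", "&apos;"}
        });
    }

    // Single accessor replacing one all-uppercase method per table.
    static String[][] getEscape(EscapeId id) {
        return copy(ESCAPES.get(id));
    }

    // Deep copy: clones each row so callers cannot alter the shared tables.
    private static String[][] copy(String[][] source) {
        String[][] result = new String[source.length][];
        for (int i = 0; i < source.length; i++) {
            result[i] = source[i].clone();
        }
        return result;
    }
}
```

Because every call returns a fresh deep copy, a caller that overwrites an entry in the returned array does not affect later callers.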




[jira] [Commented] (TEXT-58) All uppercase methods?

2017-01-16 Thread Rob Tompkins (JIRA)

[ 
https://issues.apache.org/jira/browse/TEXT-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823964#comment-15823964
 ] 

Rob Tompkins commented on TEXT-58:
--

What if we deprecated the methods in lieu of simply renaming them? Clearly the 
counterargument would be: why would you ever do a first release with 
something deprecated?

Also, any thoughts on how to properly rename {{ISO8859_1_ESCAPE}} in camel 
case? {{iso8859-1-Escape}}?
