[jira] [Commented] (TEXT-58) All uppercase methods?
[ https://issues.apache.org/jira/browse/TEXT-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824763#comment-15824763 ]

Gilles commented on TEXT-58:

bq. Also, any thoughts on how to properly rename ISO8859_1_ESCAPE in camel case? iso8859-1-Escape?

{{iso8859-1-Escape}} is not a valid identifier.

> All uppercase methods?
> ----------------------
>
>                 Key: TEXT-58
>                 URL: https://issues.apache.org/jira/browse/TEXT-58
>             Project: Commons Text
>          Issue Type: Improvement
>            Reporter: Gilles
>            Priority: Trivial
>              Labels: api, convention, standard
>             Fix For: 1.0
>
>
> Class {{o.a.c.text.translate.EntityArrays}} contains method names with all
> uppercase letters (and underscores).
> I understand that they create copies of _static_ constants (although even
> that is not true since they return arrays!), but are you sure you want to
> release a new component that does not follow the usual convention?
> I understand these come from LANG, but isn't it the right time to fix the API?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (TEXT-59) Mutable static data
Gilles created TEXT-59:
--------------------------

             Summary: Mutable static data
                 Key: TEXT-59
                 URL: https://issues.apache.org/jira/browse/TEXT-59
             Project: Commons Text
          Issue Type: Bug
            Reporter: Gilles
             Fix For: 1.0

Class {{o.a.c.text.translate.EntityArrays}} contains methods that return mutable data.
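One way this kind of leak can arise is sketched below. It assumes an accessor that returns a shallow `clone()` of a static `String[][]` table; the class `MutableTableSketch`, its table contents and both method names are hypothetical and are not taken from the `EntityArrays` source. The outer array is copied, but the rows are still shared, so a caller can overwrite the static data. A read-only `Map` view is shown as one possible alternative.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a lookup-table holder; not the Commons Text source.
final class MutableTableSketch {
    private static final String[][] TABLE = { { "&", "&amp;" }, { "<", "&lt;" } };

    // Shallow copy: a new outer array, but each row is still shared with TABLE.
    static String[][] shallow() {
        return TABLE.clone();
    }

    // Alternative: expose the pairs through an unmodifiable map view.
    static Map<String, String> readOnly() {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String[] row : TABLE) {
            pairs.put(row[0], row[1]);
        }
        return Collections.unmodifiableMap(pairs);
    }

    public static void main(String[] args) {
        try {
            readOnly().put("x", "y");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view rejected the write");
        }
        shallow()[0][1] = "changed";          // writes through to TABLE
        System.out.println(shallow()[0][1]);  // the shared row was modified
    }
}
```

Whether a map, a deep copy, or something else suits the component is a design decision for the maintainers; the sketch only illustrates the hazard.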
[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824663#comment-15824663 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user ansell commented on the issue:

    https://github.com/apache/commons-rdf/pull/30

In the RDF4J/Sesame case, we have had some users request, and some other users complain about, both lowercasing, which was used in the past, and canonicalisation, so RDF4J will default to leaving case alone, but any user is free to switch on the canonicalisation. Currently there isn't a lowercase-all-tags option, but that may also appear in the future.

For reference, the language tag canonicalisation procedure that RDF4J optionally uses, which relies on the JDK's copy of the IANA Language Subtag Registry, is:

```
new Locale.Builder().setLanguageTag(tag).build().toLanguageTag()
```

There are other possible methods, but the method above is the only one that I could find which throws an error if the original tag is ill-formed.

> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>
>                 Key: COMMONSRDF-51
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-51
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
>
> The RDF-1.1 specification states that the [value space of Literal language
> tags is
> lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which
> does not conflict with the case-insensitive specification in BCP47. The
> Literal.equals and Literal.hashCode API contracts should specify that
> language tags must be compared using lowercase, even if they are otherwise
> stored and returned as upper-case by getLanguageTag. The API currently has
> incorrect language by saying "character-by-character" for language tag
> comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a locale that is consistent (known
> example where lowercase and uppercase do not roundtrip as expected for
> US-ASCII characters is Turkish [1]), so I would recommend actually stating
> that .toLowerCase(Locale.ENGLISH) is used.
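The two procedures mentioned in this message can be tried side by side. The sketch below wraps the quoted `Locale.Builder` call and the recommended `toLowerCase(Locale.ENGLISH)` call in a hypothetical class `LanguageTagSketch`; the class and method names are illustrative only, and the ill-formed input is just an example.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical helper class; only the JDK calls inside it come from the thread.
final class LanguageTagSketch {
    // BCP 47 canonical casing via the JDK; throws IllformedLocaleException
    // when the tag is not well-formed.
    static String canonicalise(String tag) {
        return new Locale.Builder().setLanguageTag(tag).build().toLanguageTag();
    }

    // Lower-casing with a fixed locale, so the result does not depend on
    // the JVM's default locale.
    static String lower(String tag) {
        return tag.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(canonicalise("EN-gb"));
        System.out.println(lower("EN-GB"));

        // With Turkish casing rules, 'I' lower-cases to dotless U+0131,
        // so the result differs from the ASCII "fi".
        String turkish = "FI".toLowerCase(Locale.forLanguageTag("tr"));
        System.out.println(turkish.equals("fi"));

        try {
            canonicalise("not a tag");
        } catch (IllformedLocaleException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The Turkish case is why the issue description asks for an explicit locale rather than the no-argument `toLowerCase()`.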
[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824660#comment-15824660 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user afs commented on the issue:

    https://github.com/apache/commons-rdf/pull/30

@ansell mentions one of the reasons the wording for RDF 1.1 is not so direct - RDF 1.0 did not sanction the common normalization defined in BCP47 canonicalization, although that actually requires consulting the registry as well.

Jena is lax by default, and retains the form as originally written. In practice, datasets seem to be internally consistent, all lower case or all syntax-canonical. Variations of case are different nodes in the general case but are `Node.sameValue` (compare) and cause matching in graph.find. Some storage layers may differ and canonicalize the form, in order to index.
[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824637#comment-15824637 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user ansell commented on a diff in the pull request:

    https://github.com/apache/commons-rdf/pull/30#discussion_r96309778

    --- Diff: api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java ---
    @@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
             assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
         }

    +    public void testCreateLiteralLangCaseInsensitive() throws Exception {
    +        // COMMONSRDF-51: Literal langtag may not be in lowercase, but
    +        // must be COMPARED (aka .equals and .hashCode()) in lowercase
    +        // as the language space is lower case.
    +        final Literal lower = factory.createLiteral("Hello", "en-gb");
    +        final Literal upper = factory.createLiteral("Hello", "EN-GB");
    +        final Literal mixed = factory.createLiteral("Hello", "en-GB");
    +
    +
    +        assertEquals("en-gb", lower.getLanguageTag().get());
    --- End diff --

    RDF4J may not follow this in some cases. It may use the BCP47 normalisation conventions to obtain en-GB instead.
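The contract discussed here (store the tag as given, but compare it in lower case in `.equals()` and `.hashCode()`) can be sketched with a small value class. `LangLiteralSketch` below is hypothetical and is not the Commons RDF `Literal` interface or any implementation of it; it only mirrors the idea under the assumption that `toLowerCase(Locale.ENGLISH)` is the chosen folding.

```java
import java.util.Locale;
import java.util.Objects;
import java.util.Optional;

// Hypothetical minimal literal; not the Commons RDF Literal API.
final class LangLiteralSketch {
    private final String lexicalForm;
    private final String languageTag; // kept exactly as supplied

    LangLiteralSketch(String lexicalForm, String languageTag) {
        this.lexicalForm = Objects.requireNonNull(lexicalForm);
        this.languageTag = Objects.requireNonNull(languageTag);
    }

    // Returns the tag in its original casing.
    Optional<String> getLanguageTag() {
        return Optional.of(languageTag);
    }

    // Folded form used only for comparison and hashing.
    private String comparableTag() {
        return languageTag.toLowerCase(Locale.ENGLISH);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof LangLiteralSketch)) {
            return false;
        }
        LangLiteralSketch other = (LangLiteralSketch) obj;
        return lexicalForm.equals(other.lexicalForm)
                && comparableTag().equals(other.comparableTag());
    }

    @Override
    public int hashCode() {
        // Must hash the folded tag so that equal literals share a hash code.
        return Objects.hash(lexicalForm, comparableTag());
    }
}
```

Hashing the folded tag rather than the stored one is what keeps `hashCode()` consistent with the case-insensitive `equals()`.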
[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824636#comment-15824636 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user ansell commented on a diff in the pull request:

    https://github.com/apache/commons-rdf/pull/30#discussion_r96309546

    --- Diff: api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java ---
    @@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
             assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
         }

    +    public void testCreateLiteralLangCaseInsensitive() throws Exception {
    --- End diff --

    Does this need @Test annotation?
[jira] [Commented] (DAEMON-341) prunsrv injects garbage into ImagePath
[ https://issues.apache.org/jira/browse/DAEMON-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824409#comment-15824409 ]

Mikhail Dobrinin commented on DAEMON-341:
-----------------------------------------

This may be resolved in version 1.0.15.

> prunsrv injects garbage into ImagePath
> --------------------------------------
>
>                 Key: DAEMON-341
>                 URL: https://issues.apache.org/jira/browse/DAEMON-341
>             Project: Commons Daemon
>          Issue Type: Bug
>          Components: Procrun
>    Affects Versions: 1.0.13
>         Environment: Windows Server 2008 (not R2)
>            Reporter: Mikhail Dobrinin
>
> Here is a reproducible example that works every time:
> {noformat}
> prunsrv.exe //IS//abcd.branch2 --StartMode=jvm
> --StartClass=abc.abcdefghih.abcd.abcdef.abcd.MyImportantClass
> --StartMethod=startService ++StartParams=abcd.branch2
> {noformat}
> The ImagePath entry for the service ends up being:
> {noformat}
> C:\path\to\prunsrv.exe 12-08.loɥ//RS//abcd.branch2
> {noformat}
> As you see, there is garbage inserted in front of the {{//RS//}} string.
[jira] [Updated] (DAEMON-341) prunsrv injects garbage into ImagePath
[ https://issues.apache.org/jira/browse/DAEMON-341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Dobrinin updated DAEMON-341:
------------------------------------
    Affects Version/s:     (was: 1.0.15)
                       1.0.13
[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824368#comment-15824368 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user stain commented on the issue:

    https://github.com/apache/commons-rdf/pull/30

There seems to be consensus on http://lists.w3.org/Archives/Public/public-rdf-comments/2017Jan/thread.html and http://lists.w3.org/Archives/Public/semantic-web/2017Jan/thread.html in the _Are literal language tags case sensitive?_ threads that it is not meant to be a change from RDF 1.0 - that language tags should still be compared case insensitively.

That should be in line with what this PR suggests - case insensitive in `.equals()` and `.hashCode()`.

Do you agree on that line, @afs and @ansell ..?
[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824366#comment-15824366 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user stain commented on the issue:

    https://github.com/apache/commons-rdf/pull/30

This pull request returns `getLanguageTag()` in whatever case the underlying platform does (e.g. I think RDF4J and JSONLD-Java preserve casing, while Jena and Simple convert to lowercase). I think it is only in `.equals()` and `.hashCode()` we need case insensitivity.

There are arguments both ways on whether we should provide a consistent view across the implementations (e.g. always lowercase), or whether we should be consistent with what the underlying implementation does (e.g. if it preserves casing for presentation purposes).

Commons RDF doesn't have any value handling mechanisms now for, say, converting `"13.37"^^xsd:float` to a Java float `13.37f` (without going through the underlying implementations and related methods), or for determining value equality, so I think it is not too weird if Commons RDF doesn't do anything clever about language tags either (beyond spec compliance).

But if someone were to add a Commons RDF API for such literal value handling, it could be natural to also add "utils" methods for presenting or parsing language tags (e.g. `isLanguageTagEqual("en-us", "en-US")` as well as hierarchical comparisons, something like `isSameLanguageTagFamily("en-us", "en-GB")`).
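The two "utils" methods floated in this comment do not exist in Commons RDF; the following is only a sketch of what they might look like. The class name `LanguageTagUtilsSketch` is made up, and the reading of "family" as "same primary language subtag" is an assumption, since the comment does not define it.

```java
import java.util.Locale;

// Hypothetical helpers; not part of any Commons RDF release.
final class LanguageTagUtilsSketch {
    // Case-insensitive comparison using a fixed locale for the folding.
    static boolean isLanguageTagEqual(String first, String second) {
        return first.toLowerCase(Locale.ENGLISH)
                .equals(second.toLowerCase(Locale.ENGLISH));
    }

    // Assumption: two tags are in the same "family" when their primary
    // language subtag (the part before the first '-') matches.
    static boolean isSameLanguageTagFamily(String first, String second) {
        return primarySubtag(first).equals(primarySubtag(second));
    }

    private static String primarySubtag(String tag) {
        int dash = tag.indexOf('-');
        String head = dash < 0 ? tag : tag.substring(0, dash);
        return head.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(isLanguageTagEqual("en-us", "en-US"));
        System.out.println(isSameLanguageTagFamily("en-us", "en-GB"));
        System.out.println(isSameLanguageTagFamily("en-us", "fr-FR"));
    }
}
```

A fuller version would probably need range matching along the lines of BCP 47 rather than a plain prefix check.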
[jira] [Commented] (IO-373) FileUtils.byteCountToDisplaySize improvement/rounding issues
[ https://issues.apache.org/jira/browse/IO-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824112#comment-15824112 ]

Asaf Mesika commented on IO-373:
--------------------------------

Any reason nothing is merged in this issue?

> FileUtils.byteCountToDisplaySize improvement/rounding issues
> ------------------------------------------------------------
>
>                 Key: IO-373
>                 URL: https://issues.apache.org/jira/browse/IO-373
>             Project: Commons IO
>          Issue Type: Improvement
>          Components: Utilities
>    Affects Versions: 2.4
>            Reporter: Mark
>            Priority: Minor
>         Attachments: byteCountToDisplaySize.patch,
> byteCountToHumanReadableGnu.patch
>
>
> Issue IO-226 is not fixed but closed.
> ?
> Here is my solution that also supports a user-defined precision in terms of a
> maximum length of the digits part.
[jira] [Commented] (TEXT-58) All uppercase methods?
[ https://issues.apache.org/jira/browse/TEXT-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824016#comment-15824016 ]

Gilles commented on TEXT-58:

I've not checked the code, but are those methods actually useful (other than being used internally by TEXT code)?

If the functionality is necessary in the public API, and since much defensive copying happens anyway, why not define a single method
{code}
public String[][] getEscape(EscapeId id) {
    return copy(escapes.get(id));
}
{code}
where the {{EscapeId}} argument would be an {{enum}}, {{escapes}} a static instance of {{EnumMap}} and {{copy}} a private method performing the copy?
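The proposal above can be fleshed out into a compilable sketch. Everything beyond the names `getEscape`, `EscapeId`, `escapes` and `copy` is an assumption made for illustration: the enum constants, the table contents and the enclosing class `EscapeTablesSketch` are invented, and the method is made `static` here so that it can be called without an instance.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch of the single-accessor idea; not Commons Text code.
final class EscapeTablesSketch {
    // Hypothetical identifiers; the real set of tables is not listed in the thread.
    enum EscapeId { BASIC, APOS }

    private static final Map<EscapeId, String[][]> escapes = new EnumMap<>(EscapeId.class);

    static {
        escapes.put(EscapeId.BASIC, new String[][] {
            { "&", "&amp;" }, { "<", "&lt;" }, { ">", "&gt;" } });
        escapes.put(EscapeId.APOS, new String[][] { { "'", "&apos;" } });
    }

    // Single entry point: callers always receive their own copy.
    static String[][] getEscape(EscapeId id) {
        return copy(escapes.get(id));
    }

    // Copies the outer array and every row, so no row is shared with the map.
    private static String[][] copy(String[][] source) {
        String[][] result = new String[source.length][];
        for (int i = 0; i < source.length; i++) {
            result[i] = source[i].clone();
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] mine = getEscape(EscapeId.BASIC);
        mine[0][1] = "tampered";                               // only the copy changes
        System.out.println(getEscape(EscapeId.BASIC)[0][1]);   // table is unaffected
    }
}
```

One side effect of this shape is that the naming question raised elsewhere in the thread moves from method names to enum constants, where upper case with underscores is the usual Java convention.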
[jira] [Commented] (TEXT-58) All uppercase methods?
[ https://issues.apache.org/jira/browse/TEXT-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823964#comment-15823964 ]

Rob Tompkins commented on TEXT-58:
----------------------------------

What if we deprecated the methods in lieu of simply renaming them? Clearly the converse argument to that would be: why would you ever do a first release with something deprecated?

Also, any thoughts on how to properly rename {{ISO8859_1_ESCAPE}} in camel case? {{iso8859-1-Escape}}?