[GitHub] jena issue #237: JENA-1313: compare using a Collator when both literals are ...

kinow Sat, 29 Apr 2017 18:37:28 -0700

Github user kinow commented on the issue:

    https://github.com/apache/jena/pull/237
  
    Sorry about the mess. I reverted the previous changes and wanted to keep 
everything in the branch history in case we decide to go back that way, but 
I messed up a `git rebase`. I cherry-picked a few commits, and it looks OK now.
    
    This updated pull request now takes a different direction. Instead of 
changing the default behaviour based on language tags, it adds a two-parameter 
"collation" function. All changes are in ARQ.
    
    Please ignore comments, unit tests, code readability, etc. for now: this 
pull request is currently just a suggested alternative for JENA-1313, and it 
may be discarded again if this implementation turns out to have too many 
problems.
    
    FN_Collation.java contains the code for the new function. The first 
argument is a locale, used to find the collator. The second argument is the 
NodeValue (Expr). What the function does is quite simple, and possibly naïve: 
it extracts the string literal from the Expr, then creates a new NodeValue 
that holds both the string and the locale.
    
    Further down, NodeValueString was modified as well to keep track of the 
string's locale. Alternatively, we could create a new NodeValue subtype instead 
of adding an optional locale (a backward binary compatible change, as we add 
methods but do not change existing ones).
    
    Then, when the SortCondition in the Query is evaluated and 
NodeValueString#compare is called, it checks whether it was given a locale. 
If so, it sorts using that locale.
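
    To make the idea concrete, here is a minimal, self-contained sketch of 
the compare step. It is not the code in the pull request: LocalizedString and 
its collation(...) factory are hypothetical stand-ins for NodeValueString and 
the new function, and only java.text.Collator from the JDK is used.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-in for a string value that optionally remembers a locale.
// This is NOT Jena's NodeValueString; the name and shape are assumptions.
final class LocalizedString implements Comparable<LocalizedString> {
    private final String value;
    private final Locale locale; // null means "no collation requested"

    LocalizedString(String value, Locale locale) {
        this.value = value;
        this.locale = locale;
    }

    // Mirrors the two-argument idea: a locale tag plus the string literal.
    static LocalizedString collation(String languageTag, String literal) {
        return new LocalizedString(literal, Locale.forLanguageTag(languageTag));
    }

    String value() {
        return value;
    }

    @Override
    public int compareTo(LocalizedString other) {
        if (locale == null) {
            // No locale given: fall back to plain String ordering.
            return value.compareTo(other.value);
        }
        // Locale given: let that locale's Collator decide the order.
        return Collator.getInstance(locale).compare(value, other.value);
    }
}
```

    With plain String ordering "Zebra" comes before "apple" (uppercase 
letters have lower code points), while an English collator puts "apple" first.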
    
    Note that this function always operates in the string value space in 
ARQ: even when a literal has a language tag, the tag is discarded and only the 
string is used. In other words, any node with a string literal becomes a 
NodeValueString when this function is applied to it.
    
    With this, users can choose a collation that overrides any language 
tags. So if your data contains @en and @en-GB, you can use whichever collation 
you want in your query.
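
    As a rough illustration of the override (again a sketch with 
hypothetical names, not the ARQ code), the following drops each literal's 
language tag and sorts the remaining strings with the collator for whatever 
locale the caller picks. CollationOverride, Literal and sortWith are all 
made up for this example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CollationOverride {
    // Hypothetical literal: lexical form plus an optional language tag.
    static final class Literal {
        final String lexical;
        final String lang;

        Literal(String lexical, String lang) {
            this.lexical = lexical;
            this.lang = lang;
        }
    }

    // Ignores each literal's own language tag and sorts the plain strings
    // with the Collator of the locale chosen by the caller.
    static List<String> sortWith(String languageTag, List<Literal> literals) {
        Collator collator = Collator.getInstance(Locale.forLanguageTag(languageTag));
        List<String> plain = new ArrayList<>();
        for (Literal literal : literals) {
            plain.add(literal.lexical); // the language tag is discarded here
        }
        plain.sort(collator);
        return plain;
    }
}
```

    Calling sortWith("en", ...) or sortWith("fi", ...) on the same data 
would apply English or Finnish rules respectively, regardless of whether the 
individual literals were tagged @en or @en-GB.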
    
    Thoughts?
    
    Cheers
    Bruno



