On 21/01/14 16:58, Martynas Jusevičius wrote:
Andy,

what if I'm sending the query to a remote endpoint that does not
support Java style regex syntax? Do I need to use FmtUtils then?

FmtUtils does not have code to escape regex metacharacters.
You'll need to escape every metacharacter yourself by string manipulation. It's a shame that the standard Java runtime's quoting (Pattern.quote) relies on \Q...\E.

Perl has quotemeta, which implements its \Q\E.


Looking at the example from SPARQL 1.1 STR() [1]:

This query selects the set of people who use their work.example
address in their foaf profile:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
  WHERE { ?x foaf:name  ?name ;
             foaf:mbox  ?mbox .
          FILTER regex(str(?mbox), "@work\\.example$") }

How does "work.example" become "work\\.example"? Is one backslash
escaping the regex, and the second one escaping the literal? How do I
achieve this with Jena?

Yes. \\ puts a single real \ into the SPARQL string.

That's how you do it in Jena.
"." is a metacharacter so it needs escaping.
==> \.
but it's in a SPARQL string so syntax needs to be \\
==> \\.
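
The two steps above (escape for the regex, then escape for the SPARQL string) can be sketched in plain Java. This is only an illustration: the class SparqlRegexEscaper and its method names are hypothetical and not part of Jena, and the metacharacter set is an assumption chosen to be conservative.

```java
final class SparqlRegexEscaper {
    // Characters treated as regex metacharacters (an assumed, conservative set).
    private static final String META = "\\.*+?{}[]()|^$";

    /** Step 1: put a backslash in front of every regex metacharacter. */
    static String escapeRegex(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char ch = literal.charAt(i);
            if (META.indexOf(ch) >= 0) {
                sb.append('\\');
            }
            sb.append(ch);
        }
        return sb.toString();
    }

    /** Step 2: write the pattern as a double-quoted SPARQL string literal. */
    static String toSparqlString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '\\': sb.append("\\\\"); break; // one real \ becomes \\
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(ch);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // Only the literal part is escaped; the "$" anchor is added afterwards.
        String pattern = escapeRegex("@work.example") + "$";
        System.out.println("FILTER regex(str(?mbox), " + toSparqlString(pattern) + ")");
        // prints: FILTER regex(str(?mbox), "@work\\.example$")
    }
}
```

Because this uses backslash escapes rather than \Q\E, the resulting query text does not depend on the endpoint's regex engine being Java's.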


The Xerces implementation in REUtil.quoteMeta is:

public static String quoteMeta(String literal) {
        int len = literal.length();
        StringBuffer buffer = null;
        for (int i = 0;  i < len;  i ++) {
            int ch = literal.charAt(i);
            if (".*+?{[()|\\^$".indexOf(ch) >= 0) {
                if (buffer == null) {
                    buffer = new StringBuffer(i+(len-i)*2);
                    if (i > 0)  buffer.append(literal.substring(0, i));
                }
                buffer.append((char)'\\');
                buffer.append((char)ch);
            } else if (buffer != null)
                buffer.append((char)ch);
        }
        return buffer != null ? buffer.toString() : literal;
    }

and there are others via Google.

        Andy


[1] http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#func-str

Martynas

On Tue, Jan 21, 2014 at 10:10 AM, Andy Seaborne <a...@apache.org> wrote:
Works for me:

SELECT * {
   VALUES ?o { "+35" "abc+35def" }
   FILTER regex(?o , "\\Q+35\\E", "i")
}

and in Java you need \\\\ because there are two levels of escaping (the Java
string literal, then the SPARQL string).

[[
Regex: Pattern exception: java.util.regex.PatternSyntaxException: Dangling
meta character '+' near index 0
]]
which gives the game away: it's using Java regexes :-)

The regex engine is Java's and the string is used untouched.
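
A small stand-alone check of that behaviour, using only java.util.regex (no Jena involved). The class name DanglingPlusDemo and its helpers are hypothetical; the sketch just shows that an unescaped "+35" is rejected while the quoted and backslash-escaped forms match.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class DanglingPlusDemo {
    /** Returns true if java.util.regex accepts the pattern. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    /** Case-insensitive "find", roughly what regex(?o, pattern, "i") asks for. */
    static boolean found(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(compiles("+35"));                 // false: dangling '+'
        System.out.println(found("\\Q+35\\E", "abc+35def")); // true
        System.out.println(found("\\+35", "abc+35def"));     // true
        System.out.println(Pattern.quote("+35"));            // \Q+35\E
    }
}
```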

Strictly, it should be XSD v1 regular expressions, which are slightly different:

http://www.w3.org/TR/xmlschema-2/#regexs

and there is a strict Xerces provided alternative if you want exact XSD
regular expressions.

XSD and Java differ in only very small ways (e.g. XSD has one extra
modifier flag, "m"; XSD has no \Q\E; and XSD has "Is" for Unicode code
blocks inside \p, e.g. \p{IsMongolian}).

And now
http://www.w3.org/TR/xmlschema11-2/#regexs

         Andy


On 21/01/14 02:32, Joshua TAYLOR wrote:

My apologies.  I replied too quickly.  I just wrote this test with
Jena's command line tools. To match the string "+35", I had to use
"\\+35" in the query:

select ?label where {
    values ?label { "+35" "-35" }
    filter(regex(str(?label),"\\+35"))
}

---------
| label |
=========
| "+35" |
---------

That's _two_ backslashes in the query string, which means that in Java
you'd end up writing

String query = ... + "filter(regex(...,\"\\\\+35\")" + ...;
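
To see the layers peel off one at a time, here is a minimal sketch in plain Java. EscapingLevels and unescapeSparql are hypothetical names, and the unescaping is deliberately simplified to handle only \\ and \" (a real SPARQL parser handles more escapes).

```java
import java.util.regex.Pattern;

final class EscapingLevels {
    /** Simplified SPARQL string unescaping: handles only \\ and \" (assumption). */
    static String unescapeSparql(String body) {
        StringBuilder sb = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char ch = body.charAt(i);
            boolean hasNext = i + 1 < body.length();
            if (ch == '\\' && hasNext
                    && (body.charAt(i + 1) == '\\' || body.charAt(i + 1) == '"')) {
                sb.append(body.charAt(++i)); // keep the escaped character only
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String inQuery = "\\\\+35";              // Java literal with four backslashes
        System.out.println(inQuery);             // text in the SPARQL query: \\+35
        String regex = unescapeSparql(inQuery);
        System.out.println(regex);               // what the regex engine sees: \+35
        System.out.println(Pattern.matches(regex, "+35")); // true
        System.out.println(Pattern.matches(regex, "-35")); // false
    }
}
```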

Sorry for the hasty and inaccurate reply.

On Mon, Jan 20, 2014 at 9:27 PM, Joshua TAYLOR <joshuaaa...@gmail.com>
wrote:

On Mon, Jan 20, 2014 at 8:48 PM, Martynas Jusevičius
<marty...@graphity.org> wrote:

OK maybe "+35" was a bad example. But isn't "+" a special char in
SPARQL regex? And there are more like "*", "?" etc.
http://www.w3.org/TR/xpath-functions/#regex-syntax



Oh, good point.  But if it needs to be escaped with a backslash, then wouldn't

     filter regex(str(?label), "\+35")

be fine?  Note that if you're constructing this programmatically, you
might end up writing code like

      String queryString = ... + "filter regex(str(?label), \"\\+35\")" +
...;

--
Joshua Taylor, http://www.cs.rpi.edu/~tayloj/





