[ 
https://issues.apache.org/jira/browse/STANBOL-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977889#comment-14977889
 ] 

Rupert Westenthaler commented on STANBOL-877:
---------------------------------------------

An analysis based on the information provided in [1] showed that Virtuoso no 
longer accepts quotes in full text query strings. After some digging I found 
the specification of full text queries in the Virtuoso documentation at [2].

{code:none}
expr ::= proximity_expr
        | expr AND expr
        | expr OR expr
        | expr AND NOT  expr
        | '(' expr ')'

word_expr ::=
          word
        | '"' phrase '"'

proximity_expr ::=
          word_expr
        | proximity_expr NEAR word_expr

word ::=
        <word char>*

phrase ::=
          word
        | phrase <whitespace> word

word_char ::=  alphanumeric characters, '*',  ISO Latin accented characters.
{code}

To ensure that only alphanumeric characters are used in full text query parts, 
'{{\W}}' is now used for splitting query strings instead of '{{\s}}'.
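
The effect of the change can be illustrated with a minimal sketch. This is not the code committed to Stanbol: the class and method names are hypothetical, and the use of '{{\W+}}' together with filtering of empty parts is an assumption made for the example. Note that with Java's default regex flags '{{\W}}' matches everything outside [a-zA-Z0-9_], so '*' and accented characters also act as separators here, even though the grammar above would allow them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Stanbol code base.
class FullTextQuerySplit {

    // Splits a query string on runs of non-word characters so that quotes
    // and other punctuation never end up in a full text query part.
    static List<String> splitWords(String query) {
        List<String> words = new ArrayList<>();
        for (String part : query.split("\\W+")) {
            // A leading separator (e.g. an opening quote) yields an empty first element.
            if (!part.isEmpty()) {
                words.add(part);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Splitting on \s would keep the quotes attached to "quoted".
        System.out.println(splitWords("the \"quoted\" text"));
    }
}
```

Splitting on '{{\s}}' would have produced the part {{"quoted"}} including its quotes, which is what Virtuoso rejects.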

[2] http://docs.openlinksw.com/virtuoso/queryingftcols.html#textexprsyntax

> Double quote in query text cause sparql query to fail
> -----------------------------------------------------
>
>                 Key: STANBOL-877
>                 URL: https://issues.apache.org/jira/browse/STANBOL-877
>             Project: Stanbol
>          Issue Type: Bug
>          Components: Entityhub
>            Reporter: Florent ANDRE
>            Assignee: Rupert Westenthaler
>         Attachments: SPARQL-grammar-escapes-STANBOL-877_rw.patch, 
> escape-quote-877.patch
>
>
> With the use of NLP engines and some content with quoted text inside, quotes 
> can be in the string searched by the entityhub.
> Associated with an RDF store, the generated SPARQL query is not legal as the 
> double quote is not escaped.
> Patch submitted as I'm currently stuck at rev 1420034.
> This patch contains :
> * A unit test at the 
> query/clerezza/src/test/java/org/apache/stanbol/entityhub/query/clerezza/SparqlQueryUtilsTest.java
>  level
> * A quote escape in 
> generic/servicesapi/src/main/java/org/apache/stanbol/entityhub/servicesapi/query/TextConstraint.java
>  for escaping in all query generation cases
> * A removal in 
> generic/servicesapi/src/main/java/org/apache/stanbol/entityhub/servicesapi/util/PatternUtils.java
>  as it double-escaped something already escaped, which led to the characters 
> no longer being escaped during regex part generation.
> The whole project compiles with this patch at this rev.
> ++



