On 29/03/2023 14:24, Mikael Pesonen wrote:

The next line here was a REPLACE, which is why I used a regex:

VALUES ?class_label { " \\(häiriö\\)" " \\(löydös\\)" " \\(toimenpide\\)" }
?concept rdfs:label ?fsnl FILTER (REGEX(?fsnl, $class_label)) .
BIND (REPLACE(?fsnl, ?class_label, "") AS ?newl) .

But indeed, I didn't mean to use $ in $class_label; I have no idea what that syntax means. But that was not the cause here?

No.

>> org.apache.jena.sparql.expr.ExprException: REGEX: Pattern is not a
string: " \\(häiriö\\)"@fi

It says " \\(häiriö\\)"@fi

It has a language tag.

The query you show does not seem to be the query being run.

Regex patterns are xsd:strings, not language-tagged strings.



So to use constants, should I write the above like this?

?concept rdfs:label ?fsnl
FILTER (REGEX(?fsnl, " \\(häiriö\\)") || REGEX(?fsnl, " \\(löydös\\)") || REGEX(?fsnl, " \\(toimenpide\\)") ) .
BIND (REPLACE(?fsnl, " \\(häiriö\\)", "") AS ?newl1) .
BIND (REPLACE(?newl1, " \\(löydös\\)", "") AS ?newl2) .
BIND (REPLACE(?newl2, " \\(toimenpide\\)", "") AS ?newl) .

Try it on a small amount of data.

    Andy



On 29/03/2023 15.20, Andy Seaborne wrote:


On 29/03/2023 12:56, Rob @ DNR wrote:
Yes, you can filter these out. The logger in question is the class name shown; the log4j configuration will need to reference it by its fully qualified name, i.e. org.apache.jena.sparql.engine.iterator.QueryIterFilterExpr, and set it to ERROR/OFF to suppress these warnings.

Issuing millions of instances of the same identical warning certainly seems like a bug to me, especially since it is triggered by query input and so could potentially be abused as a DoS attack vector.

Rob


From: Mikael Pesonen <[email protected]>
Date: Wednesday, 29 March 2023 at 10:22
To: [email protected] <[email protected]>
Subject: Re: Strategies to avoid log flooding
Below is the log, so is it possible to filter just these out?

Unfortunately I don't recall the exact regex but it was related to
escaping parentheses, so maybe this or with one backslash:
...


VALUES ?class_label { " \\(häiriö\\)" " \\(löydös\\)" " \\(toimenpide\\)" }
?concept rdfs:label ?fsnl FILTER (REGEX(?fsnl, $class_label)) .

That does not align with the log message, which says the pattern is " \\(häiriö\\)"@fi

meaning the value bound to $class_label carries the language tag @fi.

Use str() to get the lexical part.
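
A minimal sketch of this suggestion, assuming the values that reach the pattern argument carry a language tag such as @fi. The VALUES line is copied from the query above as plain strings, for which STR() changes nothing; the SELECT wrapper and the rdfs prefix declaration are added here only to make the fragment a complete query:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?concept ?fsnl ?newl
WHERE {
  VALUES ?class_label { " \\(häiriö\\)" " \\(löydös\\)" " \\(toimenpide\\)" }
  ?concept rdfs:label ?fsnl .
  # STR() returns only the lexical form, so a language-tagged value
  # is passed to REGEX and REPLACE as a plain string pattern.
  FILTER (REGEX(?fsnl, STR(?class_label)))
  BIND (REPLACE(?fsnl, STR(?class_label), "") AS ?newl)
}
```

The pattern still comes from a variable here, so this only addresses the "Pattern is not a string" error, not the per-call regex compilation.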

The regex is potentially different every call. So the regex is compiled every call. (If it's the same, a constant, it is compiled once.)

Here, write as three calls, one per constant.

Or use CONTAINS, because a regex is unnecessary in this case.
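
A rough sketch of the CONTAINS variant, under the assumption that the goal is to strip one of the three suffixes from the label. The SELECT wrapper and prefix declaration are added for completeness, and the single REPLACE with an alternation is a variation chosen for this sketch rather than the "three calls" form described above:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?concept ?fsnl ?newl
WHERE {
  ?concept rdfs:label ?fsnl .
  # Plain substring tests: the parentheses need no escaping here.
  FILTER (CONTAINS(?fsnl, " (häiriö)") ||
          CONTAINS(?fsnl, " (löydös)") ||
          CONTAINS(?fsnl, " (toimenpide)"))
  # The pattern is a constant string, matching any of the three suffixes.
  BIND (REPLACE(?fsnl, " \\((häiriö|löydös|toimenpide)\\)", "") AS ?newl)
}
```

The CONTAINS arguments are plain strings, which are compatible with a first argument that is either plain or language-tagged.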

    Andy

...

So this is a bug not a feature and can be corrected?

Mar 27 13:13:33 insight-terms java[2512289]: [2023-03-27 13:13:33]
QueryIterFilterExpr WARN  Expression Exception in (regex ?fsnl ?class_label)
Mar 27 13:13:33 insight-terms java[2512289]:
org.apache.jena.sparql.expr.ExprException: REGEX: Pattern is not a
string: " \\(häiriö\\)"@fi
Mar 27 13:13:33 insight-terms java[2512289]: #011at
org.apache.jena.sparql.expr.E_Regex.makeRegexEngine(E_Regex.java:120)
~[fuseki-server.jar:4.6.1]
