[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2020-04-03 Thread Krinkle
Krinkle added a comment. I've merged the RFC bits into T176312, which seems to be about the same thing.

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2020-01-08 Thread Bawolff
Bawolff added a comment. Just as an aside, the dirt simple solution here would be to shell out to `grep -P` (or even just to php fed just the preg_match call) and rely on limit.sh to prevent undue resource usage.

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2019-10-22 Thread Addshore
Addshore added a comment. @Milimetric @Krinkle we will write something up that is a bit tidier and then pass it over in your direction.

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2019-03-27 Thread Milimetric
Milimetric added a comment. @Lucas_Werkmeister_WMDE do you think Tech Com can be useful here or should this task just live in a backlog?

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2018-07-25 Thread Krinkle
Krinkle added a comment. Hi, just wanted to mention that TechCom has been reading along here. Let us know if you need our input and/or would like an IRC meeting for finding more options, or narrowing down options, or approving a specific option.

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2018-07-23 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. safe-regex mainly works by measuring the star height of the regex, and disabling nesting is a crude way to limit the star height to 1, so that we don’t need to parse the regex. (a*b*[ac]*$ has multiple stars, but its star height is still one.) That said, a
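
To illustrate the star-height idea mentioned above, here is a minimal sketch in Python (not PHP, and not the safe-regex implementation). The function name `star_height` and the simplified regex grammar it accepts are assumptions made for this example: it only understands groups, character classes, escapes, and the quantifiers `*`, `+`, `?` and `{n,m}`, and it counts how deeply unbounded quantifiers are nested.

```python
def star_height(pattern: str) -> int:
    """Return the maximum nesting depth of unbounded quantifiers (*, +, {n,}).

    Hypothetical helper for illustration only: it handles a simplified
    regex grammar and is not a full PCRE parser.
    """
    pos = 0

    def parse_sequence() -> int:
        # The height of a sequence or alternation is the maximum over its atoms.
        nonlocal pos
        height = 0
        while pos < len(pattern) and pattern[pos] != ")":
            height = max(height, parse_quantified_atom())
        return height

    def parse_quantified_atom() -> int:
        nonlocal pos
        ch = pattern[pos]
        height = 0
        if ch == "(":
            pos += 1
            height = parse_sequence()
            pos += 1  # skip the closing ")"
        elif ch == "[":
            pos += 1
            if pos < len(pattern) and pattern[pos] == "^":
                pos += 1
            if pos < len(pattern) and pattern[pos] == "]":
                pos += 1  # a leading "]" is a literal member of the class
            while pos < len(pattern) and pattern[pos] != "]":
                pos += 2 if pattern[pos] == "\\" else 1
            pos += 1  # skip the closing "]"
        elif ch == "\\":
            pos += 2  # escape sequence such as \d
        else:
            pos += 1  # ordinary character, "|", anchor, etc.

        # At most one quantifier per atom, optionally followed by a
        # lazy ("?") or possessive ("+") modifier.
        if pos < len(pattern) and pattern[pos] in "*+?{":
            q = pattern[pos]
            if q == "{":
                close = pattern.find("}", pos)
                if close == -1:
                    return height  # unmatched brace: read it as a literal next
                unbounded = pattern[pos + 1:close].endswith(",")
                pos = close + 1
            else:
                unbounded = q in "*+"
                pos += 1
            if unbounded:
                height += 1
            if pos < len(pattern) and pattern[pos] in "?+":
                pos += 1
        return height

    height = 0
    while pos < len(pattern):
        height = max(height, parse_sequence())
        pos += 1  # skip an unbalanced ")" at top level, if any
    return height
```

Under these assumptions, `star_height("a*b*[ac]*$")` evaluates to 1 (several stars, none nested), while a nested pattern such as `(a+)+$` evaluates to 2. A "no nesting" rule is the cruder check that rejects anything that could exceed height 1 without parsing the pattern at all.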

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2018-07-17 Thread Smalyshev
Smalyshev added a comment. "I don’t think it’s possible to construct a ReDoS attack without any nesting" Wouldn't something like a*b*[ac]*$ be still dangerous? Maybe not as dangerous as nested ones, but seems to still have some evil potential. There's also this one:

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2018-07-17 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. As an interim solution, would it be okay to use unsandboxed preg_match for certain regexes which are known to be safe, and to continue to use SPARQL for all others? For example, currently, 2304 out of 2999 regexes in format constraints don’t contain any

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-10-03 Thread Halfak
Halfak added a comment. I'm not sure we'd make use of an external service. In our case, I think a more robust timeout and some testing gets us what we need.

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-25 Thread Smalyshev
Smalyshev added a comment. There's pcre.backtrack_limit and pcre.recursion_limit in PHP settings, which can be significantly reduced against defaults (which are very generous). However, I am not 100% sure it covers all scenarios where PCRE can get into the woods.

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-25 Thread daniel
daniel added a comment. "so the service should still be limited in CPU time." ...and RAM.
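
As a rough illustration of limiting both CPU time and RAM for an untrusted regex, the following Python sketch runs each match in a short-lived child process with resource limits. This is not the approach the task settled on, and everything named here is an assumption for the example: the helper `matches_with_limits`, the exit codes, the limit values, the use of Python's `re` instead of PCRE, and whole-string matching via `re.fullmatch`. It relies on the Unix-only `resource` module.

```python
import resource
import subprocess
import sys

# Child program: exit 0 on match, 10 on no match. Any other exit status
# (invalid pattern, killed by a limit) is treated as "unknown" by the parent.
_CHILD = (
    "import re, sys\n"
    "pattern, text = sys.argv[1], sys.argv[2]\n"
    "sys.exit(0 if re.fullmatch(pattern, text) else 10)\n"
)

CPU_SECONDS = 1                   # assumed budget, tune as needed
MEMORY_BYTES = 512 * 1024 * 1024  # assumed budget, tune as needed


def _cap(resource_id: int, value: int) -> None:
    """Lower the soft limit without exceeding the existing hard limit."""
    _, hard = resource.getrlimit(resource_id)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(resource_id, (value, hard))


def _limit_child() -> None:
    # Runs in the child between fork and exec.
    _cap(resource.RLIMIT_CPU, CPU_SECONDS)
    _cap(resource.RLIMIT_AS, MEMORY_BYTES)


def matches_with_limits(pattern: str, text: str, wall_seconds: float = 3.0):
    """Return True or False for a completed match, None if it could not finish."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            preexec_fn=_limit_child,
            timeout=wall_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 0:
        return True
    if proc.returncode == 10:
        return False
    return None  # crashed, invalid regex, or killed by a resource limit
```

A caller would treat `None` as "constraint could not be checked" rather than as a violation. Spawning a process per check is expensive, which is part of why the thread also discusses a longer-lived, restartable service.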

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-25 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. I’m still thinking of a per-request microservice (like minisparql) which gets torn down at the end of each request anyways – but even if the service handles multiple requests, we can just restart it, it’s stateless anyways. What I’m worried about is not

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-25 Thread daniel
daniel added a comment. In T176312#3630934, @Lucas_Werkmeister_WMDE wrote: If we’re going to check the regexes on a microservice, then we might as well use PCRE IMHO, for compatibility’s sake. I don’t think any of the regular regexes on Wikidata result in catastrophic runtime behavior, so if that

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-25 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. "AIUI, this would be called when editing" Not quite – constraint checks (including the “format” constraint) are currently run when a user who has enabled the checkConstraints gadget views an entity page or saves a statement. (And we plan to enable that gadget

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-25 Thread Joe
Joe added a comment. I think re2 seems like an interesting candidate. I would argue we still want to have a separate microservice running on a separate cluster from MediaWiki, for security reasons, and I would think it could be used to run the regular expression validations as well. AIUI, this

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-22 Thread daniel
daniel added a comment. In T176312#3626696, @Lucas_Werkmeister_WMDE wrote: "RE2 sounds nice… we could probably live without backreferences. But we’d need some way to use RE2 in PHP." There seems to be a PHP module for it: https://github.com/arraypad/php-re2 If that isn't maintained or

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-22 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. RE2 sounds nice… we could probably live without backreferences. But we’d need some way to use RE2 in PHP. I don’t think Lua patterns are powerful enough to check format constraints. There are a lot of format constraints on Wikidata already, many of them

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-21 Thread Anomie
Anomie added a comment. "Lua’s regexes" One item of note is that Lua's patterns don't support basic regex features such as general alternation (only alternation over a set of single characters) or applying the Kleene star to arbitrary subpatterns (only to single characters or character sets).

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-21 Thread Smalyshev
Smalyshev added a comment. I don't think there are many other regex libraries which have been verified as safe for arbitrary user input, with limited time and memory. PCRE is probably not one, especially pcre+PHP, I've seen a bunch of issues there. Also I think full PCRE is way over-powered for

[Wikidata-bugs] [Maniphest] [Commented On] T176312: Don’t check format constraint via SPARQL (safely evaluating user-provided regular expressions)

2017-09-20 Thread tstarling
tstarling added a comment. If you just want an approximately PCRE-like syntax, you could just translate the regex to a Lua pattern. Scribunto has equivalent code going in the other direction, in Scribunto_LuaUstringLibrary::patternToRegex(), which you could look at for ideas. Obviously you would