Hoo man has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/381229 )
Change subject: Skip simple strings in N3Quoter::escapeLiteral
......................................................................
Skip simple strings in N3Quoter::escapeLiteral
This makes the function about 50% faster on the first 5,000 Wikidata
entities, which makes generating the truthy dump for them about 7%
faster overall. The effect on the entire truthy dump might be larger
or smaller, but I'm not willing to benchmark this in detail.
Bug: T176844
Change-Id: I7141a58a022a98373c1390ee4e336fb9ee54f6c2
---
M src/N3Quoter.php
1 file changed, 6 insertions(+), 0 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/purtle refs/changes/29/381229/1
diff --git a/src/N3Quoter.php b/src/N3Quoter.php
index f7f1899..901e0b4 100644
--- a/src/N3Quoter.php
+++ b/src/N3Quoter.php
@@ -50,6 +50,12 @@
* @return string
*/
public function escapeLiteral( $s ) {
+ // Performance: If the entire string consists only of a safe subset of ASCII, let it through.
+ // Allowed are space (32), ! (33), # (35) to [ (91), and ] (93) to ~ (126). This excludes " (34) and \ (92).
+ if ( preg_match( '/^[#-\[\]-~ !]*\z/', $s ) ) {
+ 	return $s;
+ }
+
// String escapes. Note that the N3 spec is more restrictive than the Turtle and TR
// specifications, see <https://www.w3.org/TeamSubmission/n3/#escaping>
// and <https://www.w3.org/TR/turtle/#string>
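
For reviewers, here is a minimal standalone sketch of which strings the new early return would let through unchanged. It reuses the pattern from the patch. The helper name isSimpleLiteral and the sample strings are assumptions made up for illustration and are not part of purtle.

```php
<?php
// Hypothetical helper, not part of purtle: reports whether a string would
// take the early-return path, using the same pattern as the patch.
function isSimpleLiteral( string $s ): bool {
	// Allowed: space, "!", "#" through "[", and "]" through "~".
	// Everything else fails the match: '"' (34), "\" (92), control
	// characters, and every byte above 126 (so all non-ASCII text).
	return preg_match( '/^[#-\[\]-~ !]*\z/', $s ) === 1;
}

// Illustrative samples mapped to the result the character class implies.
$samples = [
	'Douglas Adams' => true,   // plain ASCII, nothing to escape
	'' => true,                // the empty string matches because of "*"
	'say "hi"' => false,       // contains a double quote
	'C:\\temp' => false,       // contains a backslash
	"line\nbreak" => false,    // contains a control character
	'Zürich' => false,         // contains non-ASCII bytes
];

foreach ( $samples as $text => $expected ) {
	$status = isSimpleLiteral( $text ) === $expected ? 'ok' : 'MISMATCH';
	echo $status . ': ' . json_encode( $text ) . "\n";
}
```

Strings that fail the check are not rejected. They simply continue to the existing escaping code, so the early return should only affect speed and not the output.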
--
To view, visit https://gerrit.wikimedia.org/r/381229
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7141a58a022a98373c1390ee4e336fb9ee54f6c2
Gerrit-PatchSet: 1
Gerrit-Project: purtle
Gerrit-Branch: master
Gerrit-Owner: Hoo man <[email protected]>