Re: [SMW-devel] {{#ask}}
On Thursday, 27 December 2007, cnit wrote:
>> (2) Query answering is done without any caching, and this is clearly a problem. While inline queries are computed only once and stored in the parser cache afterwards, Special:Ask has no caching facility at all. This needs to change in the future. Targeted cache invalidation might still be difficult and it is not clear whether the effort is needed (one could enable manual cache clearing like for pages). A new query cache -- design, architecture and implementation -- is needed here.
>
> Too much caching can hurt dynamic content - it's nice to have a page with a query updated at least once every hour or two.

Well, that is not the case for the current parser cache, in either MW or SMW. But of course it could be achieved with some server-side cronjobs.

> Speaking of Special:Ask, I believe it should be limited to registered users only. It might slow down the operation

Which is due to the lack of caching ...

> and also is suggestive for hackers trying to build an exploiting query.

My strong hope is that no such query is possible. If security issues with queries should exist, I would like to find them sooner rather than later. I expected that it would be possible to limit Special page access based on some MW mechanism already. Is there no way of configuring MediaWiki this way?

Markus

--
Markus Krötzsch
Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
phone +49 (0)721 608 7362, fax +49 (0)721 608 5998
[EMAIL PROTECTED]  www http://korrekt.org

Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
Re: [SMW-devel] SMW performance
> Forget my previous post. The problem goes away when I removed one template. It seems the performance issue is related to the application instead of the database.

Try setting up eAccelerator for PHP; maybe it would help a bit. Also, I believe that MW/SMW requires a dedicated server (co-location). We tried the usual low-cost hosting (shared with hundreds of other virtual hosts) and even MW alone was crawling. It was also slow under Windows; it is fine on a Linux server. Of course you could also try a dedicated MySQL server. MW even supports MySQL clustering and web load-balancing, because it is used by Wikipedia, one of the busiest and largest sites in the world.

Dmitriy
Re: [SMW-devel] {{#ask}}
> Well, that is not the case for the current parser cache, neither in MW nor in SMW. But of course it could be achieved with some server-side cronjobs.

Ah, I didn't know about MW cronjobs. That sounds nice. I will try to find some examples. Maybe you're right that such functionality shouldn't belong to the main application itself.

> Which is due to the lack of caching ...

Well, yes. Of course, if someone wants to slow down the site, he could use many different queries. But that can be traced in the Apache logs and banned by IP.

> My strong hope is that none such query is possible. If security issues with queries should exist, I would like to find them rather sooner than later.

I hope so, too.

> I expected that it would be possible to limit Special page access based on some MW mechanism already. Is there no way of configuring MediaWiki this way?

http://meta.wikimedia.org/wiki/Help:Special_page#Restricted_special_pages

For example, includes/SpecialBlockip.php contains the following check:

    # Permission check
    if( !$wgUser->isAllowed( 'block' ) ) {
        $wgOut->permissionRequired( 'block' );
        return;
    }

BUT, I've remembered that the "further results" links are Special:Ask with query parameters. In that case, further results would be unavailable to anonymous users, which is sad. It would only work if every ask query had its own ID, which would be passed to the further results page instead of the query itself... Maybe I am asking too much and an IP ban (see above) is enough.

Dmitriy
Re: [SMW-devel] {{#ask}}
On Friday, 28 December 2007, cnit wrote:
>> Well, that is not the case for the current parser cache, neither in MW nor in SMW. But of course it could be achieved with some server-side cronjobs.
>
> Ah, I didn't know about MW cronjobs. That sounds nice. Will try to find out some examples. Maybe you're right that such functionality shouldn't belong to main application itself..

What I meant was: a simple cron job can touch LocalSettings.php regularly to purge the MW cache globally. Not much interaction with MW is needed for that.

>> Which is due to the lack of caching ...
>
> Well, yes. Of course if someone wants to slow down the site, he could use many different queries. But, it can be traced with apache logs and banned by IP..
>
>> My strong hope is that none such query is possible. If security issues with queries should exist, I would like to find them rather sooner than later.
>
> I hope that, too.
>
>> I expected that it would be possible to limit Special page access based on some MW mechanism already. Is there no way of configuring MediaWiki this way?
>
> http://meta.wikimedia.org/wiki/Help:Special_page#Restricted_special_pages
>
> e.g. includes/SpecialBlockip.php contains the following check:
>
>     # Permission check
>     if( !$wgUser->isAllowed( 'block' ) ) {
>         $wgOut->permissionRequired( 'block' );
>         return;
>     }
>
> BUT, I've remembered that further results links are Special:Ask with query parameters. In such case, further results would be unavailable to anonymous users, which is sad. Only if every ask query had its own ID, which would be passed to further results page instead of query itself... Maybe I am asking too much and IP ban (see above) is enough.

I guess a strong solution for that will still take some time. One could of course store inline queries in some table, use an ID for each, and permit anyone to use ask with such an (internal) ID only, whereas making custom queries would require further permissions. But this is some more code, and I am not entirely convinced of that design.
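The stored-query idea above (inline queries kept in a table under an internal ID, custom queries gated by a permission) could be sketched roughly as follows. This is only an illustration in Python, not SMW code: the class name `StoredQueries`, the hash-based ID and the `may_ask_custom` flag are all assumptions made for the example.

```python
import hashlib


class StoredQueries:
    """Hypothetical registry that maps short IDs to inline query strings."""

    def __init__(self):
        self._by_id = {}

    def register(self, query: str) -> str:
        # Derive the ID from the query text, so re-saving a page with an
        # unchanged query yields the same ID (an assumption of this sketch).
        query_id = hashlib.sha1(query.encode("utf-8")).hexdigest()[:12]
        self._by_id[query_id] = query
        return query_id

    def resolve(self, query_id=None, custom_query=None, may_ask_custom=False):
        """Return the query string to run, or raise if it is not permitted."""
        if query_id is not None:
            # Anyone may re-run a query that already appears on a wiki page.
            if query_id not in self._by_id:
                raise KeyError(query_id)
            return self._by_id[query_id]
        if custom_query is not None and may_ask_custom:
            return custom_query
        raise PermissionError("custom queries require a further permission")
```

With such a registry, a "further results" link would carry only the ID, so anonymous users could follow it without being able to submit arbitrary queries.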
Did you experience problems with anonymous users accessing Special:Ask? On ontoworld it seems that a significant number of Special:Ask requests really do come from further results links.

Markus
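For the cron-job suggestion in this message (touching LocalSettings.php so that MediaWiki treats its caches as stale), a minimal sketch of the "touch" step might look like this. The script name and the paths in the crontab comment are placeholders, not taken from the thread.

```python
import os
import sys
import time


def touch(path: str) -> float:
    """Set the file's access and modification time to now; return the new mtime."""
    now = time.time()
    os.utime(path, (now, now))
    return os.stat(path).st_mtime


if __name__ == "__main__":
    # Hypothetical crontab entry running this once per hour:
    #   0 * * * * python3 /path/to/touch_settings.py /path/to/wiki/LocalSettings.php
    if len(sys.argv) > 1:
        touch(sys.argv[1])
```

The file's content is never modified; only its timestamp changes, which is all the approach described above relies on.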
Re: [SMW-devel] [PATCH] Support LIKE in queries
On Friday, 28 December 2007, Yaron Koren wrote:
> How about ~%substring% instead? The ~ is the symbol for pattern matching in Perl and some UNIX languages, and it might be a clearer indicator of function than %.

I would immediately use that, but IIRC the Halo extension has a similar syntax for a custom editing-distance database function (it requires a modified MySQL version, and probably also has significant performance issues). So the question is whether we want to overwrite that (assuming that this particular Halo function is not used widely), or is there another idea for doing it? Other imaginable operators on my keyboard would be #, , ?, @ -- none really as nice as ~ ...

Markus

> On Dec 27, 2007 2:16 PM, Markus Krötzsch [EMAIL PROTECTED] wrote:
>> Thanks. I have applied the patch, and added a way of configuring this feature: the parameter $smwgQComparators gives a (|-separated) list of supported comparators, and can be used to enable or disable any of <, >, !, and %. By default its value is '<|>|!|%'. In this way one can also disable ! or even < and >, if these are considered to be problematic.
>>
>> I wonder whether one should use another character instead of % as a wildcard inside the pattern string, so that no double-% confusion can arise. Would * be an alternative or would it be too confusing w.r.t. the old ask print requests? What about +? Corresponding examples (preprocessing would in each case ensure full compatibility with SQL):
>>
>> - %%substring%
>> - %*substring*
>> - %+substring+
>>
>> Cheers
>> Markus
>>
>> On Thursday, 20 December 2007, Asheesh Laroia wrote:
>>> On Thu, 20 Dec 2007, Thomas Bleher wrote:
>>>> Yesterday I needed LIKE queries for properties, so I added it to SMW (patch attached). It was surprisingly simple.
>>>
>>> This would be LIKE TOTALLY AWESOME to get in to Semantic MediaWiki. It would be great if later SMW could have Valgol support http://www.indwes.edu/Faculty/bcupp/things/computer/VALGOL.html.
>>>
>>> -- Asheesh.
>>>
>>> P.S. In all total like seriousness, queries with LIKE support are a good idea
>>>
>>> --
>>> The star of riches is shining upon you.
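To make the configurable comparator list discussed in this message more concrete, here is a small Python sketch of how a |-separated setting could be split and used to recognise a leading comparator in a query value. It is not the SMW implementation; the function names are invented for the example, and the default string '<|>|!|%' is an assumption (the angle brackets appear to have been lost in this archive copy).

```python
def parse_comparators(setting: str) -> list:
    """Split a |-separated setting such as '<|>|!|%' into comparator symbols."""
    return [symbol for symbol in setting.split("|") if symbol]


def split_condition(value: str, enabled: list):
    """Return (comparator, operand); '=' stands for plain equality.

    A comparator that is not in the enabled list is not recognised, so its
    character simply stays part of the operand.
    """
    # Try longer symbols first so that multi-character comparators would win.
    for comp in sorted(enabled, key=len, reverse=True):
        if value.startswith(comp):
            return comp, value[len(comp):]
    return "=", value
```

With the assumed default, `split_condition("%%substring%", parse_comparators("<|>|!|%"))` separates the leading `%` comparator from the pattern `%substring%`, which shows the "double-%" appearance the message worries about.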
Re: [SMW-devel] [PATCH] Support LIKE in queries
A lot of people are accustomed to the ? (single-character match) and * (multi-character match) format. It would be easy to escape the '_'s and '%'s in a match and then do a replace of ? to _ and * to %. (A little preg and \ could still easily escape those.)

I don't know about ~ though; in the languages I've used I recall ~ having something to do with regex. I'd rather save that character in case we want to be able to use REGEXP matching inside of SQL. From what I remember, I think most people with only a little insight into technical stuff would adjust most easily to this set:

  =   Equals
  >   Greater than
  >=  Greater than or equal to
  <=  Less than or equal to
  !   Not
  *   Multi-character match
  ?   Single-character match
  ~   regex

But I did have a thought about the @... It's not used anywhere afaik. I did make a suggestion on using a pattern to separate the comparators from the match value. It was using [[Property::comparator::match]], but as I now remember SMW lets you use :: to specify multiple properties. However, it may be a good idea if the separator were one which wouldn't conflict with other things. @ is not commonly used and does give people a bit of a hint about its use. Or, if you want something a little farther from what can actually be used in a title (to avoid clashing with things), the # is always invalid. Say, [[prop::comp@match]] or [[prop::comp#match]]. So for a not: [[Has value::!@Value]] or [[Has value::!#Value]].

I'm probably droning on now... But what about finding a good separator and allowing textual names, i.e. EQ[=], NOT/NEQ/[!] (!= could be thought of), LT[<], GT[>], REGEX(P)[~], LIKE[%_], wildcard[*?], etc...

There is also the possibility of using brackets to enclose a comparator instead of a separator. I can hardly think of many places which would use (NOT) at the start of a title ([[Has value::(NOT) Title]]) or, we also have the {} and [] type brackets.
[] is used by external links, but {} is only used in multiples, for templates or variables, and never singly. Templates and variables will already have been parsed out, so only the single braces remain, and as a bonus { and } are illegal in titles. So [[Has value::{NOT} Title]] is guaranteed never to clash with a legal title or match you can make. If you're worried about templates and parsing issues, those can't occur when you're using something like {{{1}}} as the title ([[Has value::{NOT} {{{1}}}]]), so there's no clash. The only potential clash is if someone wants to use {{{comparator|EQ}}} to specify the comparator. In that case, we could easily make { EQ } valid (trim spaces), so { {{{comparator|EQ}}} } would work.

But... now I'm droning a bit much...

~Daniel Friesen (Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)

Markus Krötzsch wrote:
> On Friday, 28 December 2007, Yaron Koren wrote:
>> How about ~%substring% instead? The ~ is the symbol for pattern matching in Perl and some UNIX languages, and it might be a clearer indicator of function than %.
>
> I would immediately use that, but IIRC the Halo extension has a similar syntax for a custom editing-distance database function (requires modified MySQL version, and probably also has significant performance issues). So the question is whether we want to overwrite that (assuming that this particular Halo function is not used widely), or is there another idea for doing it? Other imaginable operators on my keyboard would be #, , ?, @ -- none really as nice as ~ ...
>
> Markus
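The ?/* proposal in this message (escape any literal _ and % first, then map ? to _ and * to %, with a backslash to escape the wildcards themselves) could look roughly like the following Python sketch. The function name `glob_to_like` is invented for the example, and using the backslash as the LIKE escape character is an assumption that matches MySQL's default.

```python
def glob_to_like(pattern: str) -> str:
    """Translate a ?/* wildcard pattern into an SQL LIKE pattern.

    Literal % and _ are escaped with a backslash; a backslash in the input
    makes the following character literal (so \\* matches a real asterisk).
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: emit it literally, escaping it again if it
            # is special inside LIKE.
            i += 1
            ch = pattern[i]
            out.append("\\" + ch if ch in "%_\\" else ch)
        elif ch in "%_\\":
            out.append("\\" + ch)  # special in LIKE, so keep it literal
        elif ch == "*":
            out.append("%")  # multi-character match
        elif ch == "?":
            out.append("_")  # single-character match
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

The result is meant to be passed to the database as a bound parameter of a LIKE condition rather than pasted into the SQL text, so the usual quoting rules still apply on top of this translation.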