[SMW-devel] ? an invalid character in Type:URL

2007-10-14 Thread Audra Johnson
Line 54/55 of includes/SMW_DV_URI.php defines the ? character as  
invalid for URLs and URIs:

// simple check for invalid characters: '?', ' ', '{', '}'
$check1 = "@(\?|\}|\{| )+@";

When I change the check to no longer reject ?, it seems to work
fine, so now I'm wondering why ? was considered an invalid
character. I'm worried about what adverse effects might come from
accepting URLs that contain ?.

Many links to specific, unique pages on sites like bulletin boards,
blogs, and wikis contain a ?, so rejecting these URLs seems like a
harsh limitation.
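
A minimal sketch of the relaxed check discussed above, written in Python rather than SMW's PHP. Only the character set ('?', ' ', '{', '}') comes from the quoted comment; the function name and the allow_query flag are hypothetical and are not part of SMW_DV_URI.php.

```python
import re

# Characters the quoted comment lists as invalid: '?', ' ', '{', '}'.
STRICT_CHECK = re.compile(r"[?{} ]")
# Relaxed variant: still rejects ' ', '{', '}' but lets '?' through.
RELAXED_CHECK = re.compile(r"[{} ]")


def has_invalid_chars(url, allow_query=True):
    """Return True if url contains a character the chosen check rejects."""
    pattern = RELAXED_CHECK if allow_query else STRICT_CHECK
    return pattern.search(url) is not None


if __name__ == "__main__":
    url = "http://example.org/index.php?title=Foo"
    print(has_invalid_chars(url))                     # relaxed check
    print(has_invalid_chars(url, allow_query=False))  # original strict check
```

With the relaxed pattern, a URL with a query string passes while one containing a space or a brace is still rejected. Whether other parts of SMW rely on ? being excluded is not something this sketch can answer.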

--Audra

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel


[SMW-devel] Subquery performance seems to degrade in SMW 1.0 from SMW 0.7

2007-10-25 Thread Audra Johnson
I run a search with three subqueries, each of which only constrains
the category of the property's value:

[[property::[[Category:X  (93 members, 1 subcategory)
[[property::[[Category:Y  (45 members, no subcategories)
[[property::[[Category:Z  (122 members, 3 subcategories)

And it makes mysqld chug CPU for twenty minutes.  Now, I'd think I
was just asking too much of it by including too many subqueries,
but even two of the subqueries (Y and Z) above are enough to make it
choke for about five minutes.  One category (X) is done in just a
couple of seconds.  Meanwhile, the same searches on SMW 0.7 are
instantaneous or take no more than a second or two.  (Same server
under the same conditions, same data.)

If there's a quick answer to why this might be happening and how I  
can fix it, let me know, but I'll be trying to look at it in the  
meantime.



[SMW-devel] Search form parser hook

2007-11-14 Thread Audra Johnson
I wanted to put an SMW search box on the front page and other pages,
so I made a parser hook that will insert one.  I don't know
how useful it would be to others, but I put it up on the wiki in
case someone else wants one:

http://ontoworld.org/wiki/Semantic_MediaWiki_search_form_parser_hook




[SMW-devel] A new SMW Special page: Redirected annotations

2007-11-18 Thread Audra Johnson
A few weeks ago, I discovered that searches with a few subqueries  
were dreadfully slow and found the cause to be making annotated  
redirects equivalent to their targets.  I suggested making redirect  
equivalency a SMW option--but when I went scanning for how to  
implement it, I gleefully found that it was already included!  The  
setting is:


$smwgQEqualitySupport = true; // Should #redirects be evaluated as equality between page names?


(It should probably include a note about efficiency for its comment,  
just like the one for $smwgQDefaultNamespaces--taking out redirect  
equality support can make a HUGE difference for more complex  
queries.  HUGE being the difference between a three subquery search  
taking 34 minutes with redirect equivalency and .3 seconds without.)


In any case, my semantic wiki has been happily humming and searching
along since I set $smwgQEqualitySupport to false.  But I also need
a way to tell where annotations in the wiki are pointing to
redirected pages, so they can be fixed.


So I've made a special page for redirected annotations that extends  
SMWQueryPage.  This will make it easy for people to find where  
annotations are pointing to redirected pages and fix them to point at  
the right ones.


Messages that would be added for this page:

	// Messages for Redirected Annotations Special
	'redirectedannotations' => 'Redirected annotations',
	'smw_redirectedannotations_docu' => 'The object of these annotations points to a redirected page.',
	'smw_redirectedannotations_template' => 'On page $1, the annotation $2::$3 redirects to $4.',


It required one additional function to SMW_SQLStore in the "Special  
page functions" section:


function getRedirectedAnnotationsSpecial($requestoptions = NULL) {
	wfProfileIn("SMWSQLStore::getRedirectedAnnotationsSpecial (SMW)");
	$db =& wfGetDB( DB_SLAVE );
	$options = ' ORDER BY subject_title';

	// $requestoptions defaults to NULL, so guard before reading its fields
	if ($requestoptions !== NULL && $requestoptions->limit > 0) {
		$options .= ' LIMIT ' . $requestoptions->limit;
	}
	if ($requestoptions !== NULL && $requestoptions->offset > 0) {
		$options .= ' OFFSET ' . $requestoptions->offset;
	}

	extract( $db->tableNames('smw_relations', 'redirect') );

	// Find every relation whose object page is itself a redirect
	$res = $db->query("SELECT subject_title, subject_namespace, relation_title, object_title, object_namespace, rd_title, rd_namespace FROM $smw_relations "
		. "INNER JOIN $redirect ON $smw_relations.object_id = $redirect.rd_from"
		. $options, 'SMW::getRedirectedAnnotationsSpecial');
	$result = array();

	while($row = $db->fetchObject($res)) {
		$subject_page = Title::newFromText($row->subject_title, $row->subject_namespace);
		$relation_page = Title::newFromText($row->relation_title, SMW_NS_PROPERTY);
		$object_page = Title::newFromText($row->object_title, $row->object_namespace);
		$rd_page = Title::newFromText($row->rd_title, $row->rd_namespace);
		$result[] = array($subject_page, $relation_page, $object_page, $rd_page);
	}

	wfProfileOut("SMWSQLStore::getRedirectedAnnotationsSpecial (SMW)");
	return $result;
}
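
To try the join outside of MediaWiki, here is a small sketch in Python with an in-memory SQLite database. The table and column names follow the query above, but the schema is cut down to the columns the join needs, the rows are made-up toy data, and both function names are hypothetical. It is not SMW code, only an illustration of the lookup.

```python
import sqlite3


def build_toy_db():
    """Create a tiny in-memory database with made-up rows (assumption: toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE smw_relations (
            subject_title TEXT, relation_title TEXT,
            object_title TEXT, object_id INTEGER
        );
        CREATE TABLE redirect (rd_from INTEGER, rd_title TEXT);
        INSERT INTO smw_relations VALUES ('Page_A', 'Similar_to', 'Old_name', 10);
        INSERT INTO smw_relations VALUES ('Page_B', 'Similar_to', 'Real_page', 20);
        INSERT INTO redirect VALUES (10, 'New_name');
        """
    )
    return conn


def find_redirected_annotations(conn, limit=0, offset=0):
    """Return (subject, relation, object, redirect target) for each relation
    whose object page is a redirect."""
    sql = (
        "SELECT subject_title, relation_title, object_title, rd_title "
        "FROM smw_relations "
        "INNER JOIN redirect ON smw_relations.object_id = redirect.rd_from "
        "ORDER BY subject_title"
    )
    params = []
    if limit > 0:
        sql += " LIMIT ?"
        params.append(limit)
        # OFFSET is only valid together with LIMIT in SQLite
        if offset > 0:
            sql += " OFFSET ?"
            params.append(offset)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    for row in find_redirected_annotations(build_toy_db()):
        print(row)
```

Only Page_A is reported in the toy data, because its annotation points at a page that has a row in the redirect table; Page_B points at a real page and is skipped by the inner join.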

And it needs to be added in the enableSemantics() function in  
SMW_GlobalFunctions.php:


$wgSpecialPages['RedirectedAnnotations'] = array('SMWSpecialPage', 'RedirectedAnnotations', 'smwfDoSpecialRedirectedAnnotations', $smwgIP . '/specials/QueryPages/SMW_SpecialRedirectedAnnotations.php');


I've attached the PHP code for the page itself.

I am looking for input on:

* Whether the special page should be initialized and included only if
  $smwgQEqualitySupport = false
* Whether the page name RedirectedAnnotations is an adequate name
* Other wording that should be changed or tweaked.  For example, in
  smw_redirectedannotations_template and smw_redirectedannotations_docu:
  * Message text:  "the annotation Similar to::X redirects to Y." or
    "the relation Similar to::X redirects to Y.", since this only happens
    to relation annotations?  "The object of these annotations points to a
    redirected page." or "The object of these relations points to a
    redirected page."?
* Coding style


doQuery( $offset, $limit );
	wfProfileOut('smwfDoSpecialRedirectedAnnotations (SMW)');
	return $result;
}

class SMWRedirectedAnnotationsPage extends SMWQueryPage {

	function getName() {
		/// TODO: should probably use SMW prefix
		return "RedirectedAnnotations";
	}

	function isExpensive() {
		return false; /// disables caching for now
	}

	function isSyndicated() { 
		return false; ///TODO: why not?
	}

	function getPageHeader() {
		return '' . wfMsg('smw_redirectedannotations_docu') . "\n";
	}

	function formatResult( $skin, $result ) {
		global $wgLang, $wgExtraNamespaces;
		
		// Make links for the pages
		$subject_page = $skin->makeLin

Re: [SMW-devel] A new SMW Special page: Redirected annotations

2007-11-19 Thread Audra Johnson
I agree that for some wikis it's not a problem, and since the default
behavior in SMW is to treat redirects as equivalents, I assume that
many wikis want that.  However, wikis like mine that are geared
towards searching can't afford to have $smwgQEqualitySupport set to
true, because it makes searches much more resource intensive.  In my
tests, redirect equivalency was the most expensive component of a
search.  Here are the numbers from some of my benchmarks that
demonstrate this:

One property search ( [[property::object]] ) --
redirect equiv on: .05 sec, off: .00 sec
Two different properties search ( [[property1::object1]] [[property2::object2]] ) --
redirect equiv on: .06 sec, off: .00 sec
Small simple subquery ( [[property::[[Category:Small ) --
redirect equiv on: .09 sec, off: .00 sec
Small category disambiguation ( [[property::[[Category:Small1||Small2 ) --
redirect equiv on: .16 sec, off: .00 sec
Two relatively small simple subqueries ( [[property::[[Category:Small1 [[property::[[Category:Medium1 ) --
redirect equiv on: .50 sec, off: .01 sec
Large three simple subqueries ( [[property::[[Category:Hefty1 [[property::[[Category:Hefty2 [[note::[[Category:Hefty3 ) --
redirect equiv on: 34 min 48 sec, off: .3 sec

For the last one, it's a difference between a search that looks like  
this:

SELECT DISTINCT `page`.page_title as title, `page`.page_namespace as namespace, `page`.page_id as id
   FROM `cats17`, `prop13`, `cats11`, `prop7`, `cats5`, `prop1`, `page`
   INNER JOIN `smw_relations` AS rel0 ON rel0.subject_id=`page`.page_id
   LEFT JOIN `redirect` AS rd3 ON rd3.rd_from=rel0.object_id
   LEFT JOIN `page` AS rp4 ON (rd3.rd_title=rp4.page_title AND rd3.rd_namespace=rp4.page_namespace)
   INNER JOIN `categorylinks` AS cl2 ON ((cl2.cl_from=rel0.object_id) OR (rp4.page_id=cl2.cl_from))
   INNER JOIN `smw_relations` AS rel6 ON rel6.subject_id=`page`.page_id
   LEFT JOIN `redirect` AS rd9 ON rd9.rd_from=rel6.object_id
   LEFT JOIN `page` AS rp10 ON (rd9.rd_title=rp10.page_title AND rd9.rd_namespace=rp10.page_namespace)
   INNER JOIN `categorylinks` AS cl8 ON ((cl8.cl_from=rel6.object_id) OR (rp10.page_id=cl8.cl_from))
   INNER JOIN `smw_relations` AS rel12 ON rel12.subject_id=`page`.page_id
   LEFT JOIN `redirect` AS rd15 ON rd15.rd_from=rel12.object_id
   LEFT JOIN `page` AS rp16 ON (rd15.rd_title=rp16.page_title AND rd15.rd_namespace=rp16.page_namespace)
   INNER JOIN `categorylinks` AS cl14 ON ((cl14.cl_from=rel12.object_id) OR (rp16.page_id=cl14.cl_from))
   WHERE (prop1.title=rel0.relation_title AND (cats5.title=cl2.cl_to))
     AND (prop7.title=rel6.relation_title AND (cats11.title=cl8.cl_to))
     AND (prop13.title=rel12.relation_title AND (cats17.title=cl14.cl_to))
   ORDER BY `page`.page_title ASC LIMIT 21;

And a search that looks like this:

SELECT DISTINCT `page`.page_title as title, `page`.page_namespace as namespace, `page`.page_id as id
   FROM `cats17`, `prop13`, `cats11`, `prop7`, `cats5`, `prop1`, `page`
   INNER JOIN `smw_relations` AS rel0 ON rel0.subject_id=`page`.page_id
   INNER JOIN `categorylinks` AS cl2 ON ((cl2.cl_from=rel0.object_id))
   INNER JOIN `smw_relations` AS rel6 ON rel6.subject_id=`page`.page_id
   INNER JOIN `categorylinks` AS cl8 ON ((cl8.cl_from=rel6.object_id))
   INNER JOIN `smw_relations` AS rel12 ON rel12.subject_id=`page`.page_id
   INNER JOIN `categorylinks` AS cl14 ON ((cl14.cl_from=rel12.object_id))
   WHERE (prop1.title=rel0.relation_title AND (cats5.title=cl2.cl_to))
     AND (prop7.title=rel6.relation_title AND (cats11.title=cl8.cl_to))
     AND (prop13.title=rel12.relation_title AND (cats17.title=cl14.cl_to))
   ORDER BY `page`.page_title ASC LIMIT 21;
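
The structural difference between the two statements can be sketched as a small generator. This is a Python illustration, not SMW's query builder: the function name is hypothetical and the alias numbering is simplified to one index per subquery, whereas the real queries above use rel0/cl2/rd3/rp4 and so on.

```python
def build_subquery_joins(num_subqueries, equality_support):
    """Return the JOIN clauses for num_subqueries category subqueries.

    With equality_support, each subquery gets two extra LEFT JOINs and an
    OR in its categorylinks join condition, mirroring the first query above.
    """
    joins = []
    for i in range(num_subqueries):
        rel, cl, rd, rp = f"rel{i}", f"cl{i}", f"rd{i}", f"rp{i}"
        joins.append(
            f"INNER JOIN `smw_relations` AS {rel} "
            f"ON {rel}.subject_id=`page`.page_id"
        )
        if equality_support:
            # Resolve the object through the redirect table to its target page
            joins.append(
                f"LEFT JOIN `redirect` AS {rd} ON {rd}.rd_from={rel}.object_id"
            )
            joins.append(
                f"LEFT JOIN `page` AS {rp} ON ({rd}.rd_title={rp}.page_title "
                f"AND {rd}.rd_namespace={rp}.page_namespace)"
            )
            # Match the category on the object itself OR on its redirect target
            joins.append(
                f"INNER JOIN `categorylinks` AS {cl} ON "
                f"(({cl}.cl_from={rel}.object_id) OR ({rp}.page_id={cl}.cl_from))"
            )
        else:
            joins.append(
                f"INNER JOIN `categorylinks` AS {cl} ON "
                f"(({cl}.cl_from={rel}.object_id))"
            )
    return joins


if __name__ == "__main__":
    for clause in build_subquery_joins(3, equality_support=True):
        print(clause)
```

For three subqueries this yields twelve joins with redirect equivalency and six without, and every categorylinks join in the first case carries an OR condition.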

So, some wikis might like to annotate redirect pages and have them
not be equivalent to what they redirect to, as you describe.
But others might need to turn redirect equivalency off for
performance reasons.  For those, a redirected annotations page would
come in handy: it shows which annotations need changing so that they
are included in searches again.

--Audra

On Nov 19, 2007, at 7:23 AM, Mov GP 0 wrote:

> Hi,
> I don't think that it's a problem, because you can annotate
> Redirection-Pages too. This makes sense, because in the Wikipedia, not
> all information is stored atomically. Instead, Redirects are often
> redirecting to subsections of bigger articles. It is also possible to
> categorize or translate Redirection-Pages, so why not semantically
> annotate them? I think this makes sense.
>
> If this were not possible, the Articles in the Wikipedia
> would need to get rewritten so the information becomes more atomic.
>
> ys, MovGP0

Re: [SMW-devel] A new SMW Special page: Redirected annotations

2007-11-27 Thread Audra Johnson
I gave some benchmark examples here:

http://sourceforge.net/mailarchive/message.php?msg_name=A9698228-735A-4BEB-9CC7-CC76015BE4BE%40audrajohnson.com

I ran them on my MacBook Pro with a 2 GHz Intel Core Duo and 2 GB of
RAM.  I also included the final SQL SELECT that the example search
runs, comparing the SQL with redirects treated as equivalents against
the SQL without redirect handling.  (It's possible that the 34 minutes
would have been less if I hadn't gotten bored and started browsing the
internet while waiting for it to finish.)

My wiki isn't very large--3642 entries in the page table,
174 in redirect, and smw_relations has 10777 rows--so I don't think
it's an unwieldy data set.  The sizes of the categories in the search
were:  6 entries and no subcategories, 141 entries with 6
subcategories, 99 entries with 1 subcategory.

--Audra

On Nov 26, 2007, at 10:31 PM, Sergey Chernyshev wrote:

> Audra,
>
> Can you describe the size of your wiki and amount of redirects? 34  
> minutes seems to be a huge number.
> Can you also give an example of SMW query that ran that long?
>
>   Sergey

Re: [SMW-devel] {{#ask}}

2007-11-29 Thread Audra Johnson
I don't think doing the implementation would really be that hard.
There would need to be a refreshMetaSemantics.php maintenance
script, and some hooks into creating and saving pages and
maybe some other tasks like viewing.  It should probably only keep
metadata on pages set to have evaluated annotations as defined in
$smwgNamespacesWithSemanticLinks.  If implemented in the core, there
should be a global settings option so keeping this information can be
turned on or off.  But I could just as easily see this as an SMW
extension, and it might be better that way.

Either the special properties would need to be somewhat specific to  
keep from clashing (page last modified on, page created on, page  
edited by, number of page views, number of page edits), or users  
should be allowed to specify what the property namespaces should be.  
Users should also be allowed to customize what metadata gets saved,  
because if they don't care about storing how many times a page is  
viewed, they won't want the extra database hit to update that  
semantic bit of data every time a page is viewed.

I think I have a pretty clear idea of how it could be done, but
unfortunately I probably wouldn't be able to do an implementation
until late December.  So if nobody else does one by then, feel free
to hit me up and remind me, because it's actually something I kind of
want for myself.  Lately I've been thinking of other possible
automatically added semantic data, too, like a user ratings system
extension that works with SMW.

--Audra

On Nov 29, 2007, at 8:08 AM, Denny Vrandečić wrote:

> Any idea how to add page and wiki-meta data to SMW? The problem is, by
> simply adding further special properties (last modified date, creation
> date, etc.) it seems to clutter the property namespace... Well, doing
> the implementation is not trivial either, but heck  :)
>
> Cheers,
> denny
>
>
> Sergey Chernyshev wrote:
>> I use DPL for techpresentations.org but only because it has access
>> to page meta-data (in my case page creation dates). I wasn't
>> impressed with DPL's approach and prefer SMW approach which is about
>> semantic data storage.
>>
>>    Sergey
>>
>>
>> On Nov 28, 2007 10:52 AM, Jim Wilson <[EMAIL PROTECTED]> wrote:
>>
>> Of course, I am THRILLED that this is coming down the pipe.
>>
>> Sergey, If you're planning to do something crazy, I suggest checking
>> out DPL and RegExParserFunctions.
>> Combining {{#ask}} with {{#dpl}} and {{#regex}} can produce some very
>> neat combinations.  Also, I'm interested to see what you come up with
>> in the way of {{#ask}} queries.
>>
>> -- Jim
>>
>> On Nov 27, 2007 5:04 PM, Sergey Chernyshev <[EMAIL PROTECTED]> wrote:
>>> Perfect - it works great for what I was planning to use it for! Now
>>> almost no barriers are there ;)
>>>
>>> Sergey
>>>
>>>
>>>
>>>
>>> On Nov 27, 2007 4:43 PM, Markus Krötzsch <[EMAIL PROTECTED]> wrote:

 On Dienstag, 27. November 2007, Sergey Chernyshev wrote:
> WOW! Markus, this is great present for being back from vacation! ;)
> I'll test it on my instances as soon as I'll get some time with
> computer tomorrow.

 Great, hope you like it. Surprisingly, most of the work had to go into
 modifying Special:Ask to allow linking to queries using internal links,
 and into supporting the new separation of printout requests and
 queries (which also makes way for some more "presents"). I had to adopt
 the Special:ask interface a little to account for this. I will drop
 another short note about recent changes and then be offline for a few
 days. I guess RC3 would be in order after this.

 Markus

>
>  Sergey
>
> On Nov 23, 2007 4:01 PM, Markus Krötzsch <[EMAIL PROTECTED]> wrote:
>> And another note: {{#ask}} is in SVN (in a first version).
>>
>> Working example query:
>>
>> {{#ask: [[Category:Country]] [[borders::Nigeria]] |
>>  ?population|
>>  ?area#km² = ''Size''|
>>  format=list|
>>  limit = 3|
>>  link=all|
>>  intro=Test_|
>> }}
>>
>> Moreover, it is now of course possible to use templates and their
>> params rather freely in {{#ask}}. Actually even some very
>> unreasonable things work, but many cool things should also be
>> possible. One real issue might be that uninitiated users might nest
>> {{#ask}} in order to emulate -subqueries, even though the latter are
>> much more efficient and complete. Of course nesting sometimes is
>> desirable:
>>
>> {{#ask:
>> [[Category:Country]]
>> [[populatio