Ivan and Kingsley,

Thanks for your responses! Things are much clearer now: if I want to search URIs in the property field, I should use regex.
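
A minimal sketch of such a query, assuming the standard SPARQL regex() and str() functions, with "searchstring" still a placeholder rather than the real search string:

```sparql
# Hypothetical sketch: "searchstring" is a placeholder, not a real pattern.
# str(?p) converts the predicate URI to a plain string so regex() can match it.
select count(*)
where {
  ?s ?p ?o .
  filter regex(str(?p), "searchstring")
}
```

From isql, the query would be prefixed with the sparql keyword and ended with a semicolon, as in the quoted query below. Since a regex filter is not backed by the full-text index, it will probably be slower than bif:contains is on objects.
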
Marv.

--- On Mon, 12/1/08, Ivan Mikhailov <imikhai...@openlinksw.com> wrote:

> From: Ivan Mikhailov <imikhai...@openlinksw.com>
> Subject: Re: [Dbpedia-discussion] Text Searching in Virtuoso / 2 questions
> To: aunm...@yahoo.com
> Cc: dbpedia-discuss...@lists.sourceforge.net, virtuoso-users@lists.sourceforge.net
> Date: Monday, December 1, 2008, 5:22 AM
>
> Hello Marvin,
>
> > Why does bif:contains return faster than REGEX search,
>
> bif:contains returns faster because it uses a special full-text index to
> get the IDs of objects that contain the words mentioned in the query; it
> does not scan the whole table as a REGEX-based query does. The advantage
> of REGEX is flexibility: one may search for specific fragments of words
> or for special data like protein coding sequences. Moreover, bif:contains
> may be used only for variables that are directly bound in the object
> position of a triple, not for values of expressions of any other sort.
>
> > and why are they returning a different number of counted rows?
>
> Because bif:contains looks for phrases or independent words, and it may
> normalize words that use non-canonical Unicode chars, and it can search
> in XML/HTML documents. In addition, even if one and the same query
> string is valid for both REGEX and bif:contains, the meaning may
> differ. For REGEX, the pattern "Paris Hilton" is precisely two words
> delimited by a single whitespace byte. For bif:contains, "Paris Hilton"
> means that the document should contain the word "Paris" and the word
> "Hilton", in any place and in any order. See
> http://docs.openlinksw.com/virtuoso/queryingftcols.html
> for details of the bif:contains query string syntax.
>
> > The search string is not the real string but it does not change the
> > question. Which should we use?
>
> I'd strongly recommend reporting real details whenever the question is
> about a real problem on a real system -- this may result in really useful
> answers.
>
> > *Question 2*
> > Why can't I search a property or subject?
> >
> > SQL> sparql select count(*) where {?s ?p ?o. ?p bif:contains "searchstring"};
> >
> > *** Error 37000: [Virtuoso Driver][Virtuoso Server]SQ074: Line 1: SP031: SPARQL compiler: The group does not contain triple pattern with '$p' object before bif:contains() predicate
> > at line 1 of Top-Level:
> > sparql select count(*) where {?s ?p ?o. ?p bif:contains "searchstring"}
>
> bif:contains uses the free-text index on the table of distinct objects.
> Subjects and predicates are not objects; moreover, they are not texts at
> all. The query has failed because there's no triple pattern with ?p in
> the object (i.e. third) position of a triple.
>
> Best Regards,
>
> Ivan Mikhailov,
> OpenLink Software
> http://virtuoso.openlinksw.com