On Tuesday 03 February 2009 17:54:29 Ivan Mikhailov wrote:
> One can declare a new "language", as a named set of callbacks that split
> the text into words, normalize those words and decide which words should
> be indexed and which should not.

OK, that sounds good!
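
To check that I have understood the idea, here is a minimal sketch in C of 
what such a callback set might look like. Every name in it (text_language, 
next_word, normalize, is_indexable, index_text, the "x-demo" language) is 
hypothetical and made up for illustration; none of it is taken from 
libsrc/langfunc, and the real interface may well look different.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD 64

/* Hypothetical callback set; these names are NOT the Virtuoso API. */
typedef struct text_language {
    const char *name;
    /* Copies the next word at *cursor into word and advances *cursor.
       Returns 0 when no words remain, 1 otherwise. */
    int (*next_word)(const char **cursor, char *word, size_t cap);
    /* Rewrites word in place into its normalized form. */
    void (*normalize)(char *word);
    /* Returns nonzero if the normalized word should be indexed. */
    int (*is_indexable)(const char *word);
} text_language;

/* Splitter: a word is a maximal run of ASCII letters and digits.
   Words longer than cap - 1 characters are truncated. */
static int ascii_next_word(const char **cursor, char *word, size_t cap)
{
    const char *p = *cursor;
    size_t n = 0;
    while (*p && !isalnum((unsigned char)*p)) p++;
    if (!*p) { *cursor = p; return 0; }
    while (*p && isalnum((unsigned char)*p)) {
        if (n + 1 < cap) word[n++] = *p;
        p++;
    }
    word[n] = '\0';
    *cursor = p;
    return 1;
}

/* Normalizer: plain ASCII lowercasing. */
static void ascii_lowercase(char *word)
{
    for (; *word; word++) *word = (char)tolower((unsigned char)*word);
}

/* Filter: drop one-letter words and a tiny illustrative stop list. */
static int demo_is_indexable(const char *word)
{
    static const char *stop[] = { "the", "and", "of" };
    size_t i;
    if (strlen(word) < 2) return 0;
    for (i = 0; i < sizeof stop / sizeof stop[0]; i++)
        if (strcmp(word, stop[i]) == 0) return 0;
    return 1;
}

/* The "language": a name plus the three callbacks. */
static const text_language demo_language = {
    "x-demo", ascii_next_word, ascii_lowercase, demo_is_indexable
};

/* Runs the callbacks over text and writes the indexable terms to out,
   separated by single spaces. Returns the number of terms written;
   stops early if out is too small for the next term. */
static size_t index_text(const text_language *lang, const char *text,
                         char *out, size_t out_cap)
{
    char word[MAX_WORD];
    size_t count = 0, used = 0;
    assert(lang && text && out && out_cap > 0);
    out[0] = '\0';
    while (lang->next_word(&text, word, sizeof word)) {
        size_t len;
        lang->normalize(word);
        if (!lang->is_indexable(word)) continue;
        len = strlen(word);
        if (used + len + 2 > out_cap) break; /* separator + word + NUL */
        if (used) out[used++] = ' ';
        memcpy(out + used, word, len + 1);
        used += len;
        count++;
    }
    return count;
}
```

Calling index_text(&demo_language, ...) on a title would then give the list 
of terms that end up in the index, and a different language would just be 
another text_language value with its own three functions.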

> We did not re-invent bicycles in this area 
> and what was sufficient for other developers in other projects is
> probably useful for you as well.

Certainly!

> The related code is in the libsrc/langfunc
> directory of the VOS source tree. Unfortunately, that's C code, which makes
> things a bit more costly than Virtuoso/PL coding.

Yeah, but we have a guy on the project who prefers C++, so he could probably 
do that. Is there any further documentation available on how to do it?

> Even worse, there's no
> way to change the language used by the default RDF storage, so for now the
> trick is useful only for plain old application-specific tables.

Oh, OK, that sounds like a show-stopper for us, since we are using only the 
quad store. Is there a way to work around this?

Well, the idea we had was that we store everything that needs indexing in a 
separate field, e.g. in addition to dct:title we have a dumb:title, but this 
would lead to data duplication and is certainly less elegant than modifying 
the indexer to do the job.

Kind regards 

Kjetil Kjernsmo
-- 
Senior Knowledge Engineer
Mobile: +47 986 48 234
Email: kjetil.kjern...@computas.com   
Web: http://www.computas.com/


