Hi, concerning your questions about DASL:
1.) What content can be searched in a text manner in the first place

The current DASL draft says:

    The DAV:contains operator is an optional operator that provides
    content-based search capability. This operator implicitly searches
    against the text content of a resource, not against content of
    properties. The DAV:contains operator is intentionally not overly
    constrained, in order to allow the server to do the best job it can
    in performing the search.

This means that any kind of document that carries text in some form might be a candidate: plain text, Word documents, scanned documents in JPG (using some kind of OCR software), spoken words in a WAV file (would be a nice thing to implement :-), ...

The current generic implementation within Slide is a really "basic" implementation: it loads each document as a String and performs a text search within this string.

2.) If it is plain text only, what about character encoding

There is no special handling for character encoding in the generic <contains> implementation. Currently the default encoding is used to generate a String out of the contentBytes, which is then compared against the literal in <contains>.

3.) Provided plain text is not the only type of searchable text. What might be the mechanism to choose which types are supported for full text search and how this search is implemented.

You can implement your own BasicSearch. It should be possible to reuse the current implementation and override only the <contains> operator. An example of implementing your own BasicSearch can be found at \jakarta-slide\src\stores\org\apache\slide\search\basic\sample in the current HEAD revision of Slide.

4.) Which part of slide should be responsible for full text search? Is the kernel the right position?

The kernel is definitely not the right place for a performant content search. The right place for a SEARCH facility is the store implementation. An SQL system would of course make SQL calls on indexed columns; for a powerful text content search, indexes would have to be built in some way.
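As an aside on answers 1.) and 2.), here is a minimal sketch of what such a naive content match amounts to and why the choice of encoding matters. This is not the Slide code: the class NaiveContains and both of its methods are hypothetical names made up for illustration, and I am assuming the match is a plain substring test on the decoded content.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Slide.
final class NaiveContains {

    // Decode the content with the JVM default charset and do a substring test.
    // The result depends on the platform the server runs on.
    static boolean containsDefaultEncoding(byte[] contentBytes, String literal) {
        return new String(contentBytes, Charset.defaultCharset()).contains(literal);
    }

    // Same test, but the caller states which charset the content is stored in.
    static boolean contains(byte[] contentBytes, Charset charset, String literal) {
        return new String(contentBytes, charset).contains(literal);
    }

    public static void main(String[] args) {
        String literal = "Gr\u00fc\u00dfe"; // "Grüße", written with escapes to stay source-encoding neutral
        byte[] latin1 = literal.getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the charset the bytes were written in finds the literal.
        System.out.println(contains(latin1, StandardCharsets.ISO_8859_1, literal));
        // Decoding the same bytes as UTF-8 yields replacement characters, so the match fails.
        System.out.println(contains(latin1, StandardCharsets.UTF_8, literal));
    }
}
```

The point is only that the same bytes can match or not match depending on the charset used for decoding, so a store that knows the encoding of its content is in a better position to get this right than a generic default.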
For our Tamino WebDAV Server we have a Tamino-based SEARCH engine, which of course performs better than the generic SEARCH. The only place for a generic SEARCH that serves ALL possible stores is the kernel, and this is what is currently implemented.

Best regards,
Martin

-----Original Message-----
From: Oliver Zeigermann [mailto:[EMAIL PROTECTED]
Sent: Thursday, 4 September 2003 10:56
To: Slide Users Mailing List
Subject: Re: FW: When is the next production release of Slide?

Hi all,

judging from (little) experience and from threads in the mailing list, it appears to me that one of the main problems is that no store in Slide has been able to keep pace with the implementations of binding and improved ACL, and to a certain degree also DASL.

My impression is that with the introduction of DASL a new quality of service comes into Slide. Before DASL, Slide made no assumptions about what is stored or what type of content is managed. Besides meta information and path, DASL allows queries to refer to the content of a resource for full text search, which is a new quality. This raises the following questions:

1.) What content can be searched in a text manner in the first place
2.) If it is plain text only, what about character encoding
3.) Provided plain text is not the only type of searchable text. What might be the mechanism to choose which types are supported for full text search and how this search is implemented.
4.) Which part of slide should be responsible for full text search? Is the kernel the right position?

All this leads me to the impression that searching should be implemented in the stores, not in the kernel. To have a non-toy search in terms of speed and accurate semantics, you will need to use the services of an underlying store implementation (e.g. Oracle, Tamino or even plain files with Lucene).

Well, these are just my very limited thoughts. What do you people think?
Oliver

Pill, Juergen wrote:
> Hello,
>
> The current version of Slide does contain a series of bug fixes and a
> completed implementation of Delta-V (except the advanced features) and DASL.
> Currently we are working on two standards:
>
> 1) Bind
> 2) ACL
> 3) Delta-V (advanced features) is on the radar too.
>
> The quality of the CVS head is pretty good, IMO. If you have a policy not to
> use the current CVS head, or to compile the source, there is currently only
> the frozen version available.
>
> The lack of documentation is currently caused by a lack of people willing to
> donate documentation. But Christopher Taylor is contributing very good
> documentation already. This will be available very soon.
>
> Would there be someone who is willing to take the position as a release
> manager, when Slide 2 will be frozen? If everyone is happy with the current
> quality we could try to create a new version, once action (1) and (2) is
> completed.
>
> Best regards,
>
> Juergen
