Hi,

concerning your questions about DASL:

1.) What content can be searched in a text manner in the first place

the current DASL draft says:
The DAV:contains operator is an optional operator that provides
content-based search capability. This operator implicitly searches against
the text content of a resource, not against content of properties. The
DAV:contains operator is intentionally not overly constrained, in order to
allow the server to do the best job it can in performing the search. 

This means that any kind of document with textual content might be a
candidate: plain text, Word documents, scanned documents in JPG (using some
kind of OCR software), spoken words in a WAV file (would be a nice thing to
implement :-), ...

The current generic implementation within Slide is a really "basic"
implementation: it loads each document as a String and performs a plain text
search within that String.
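
To illustrate what such a basic approach amounts to, here is a minimal sketch.
It is not the actual Slide code: the class name NaiveContains and the method
contains are hypothetical, and the case-sensitive substring match is an
assumption.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration only, not a class from the Slide code base.
final class NaiveContains {
    private NaiveContains() {}

    // Decodes the whole content with the default charset and does a
    // case-sensitive substring search for the literal.
    static boolean contains(byte[] contentBytes, String literal) {
        String text = new String(contentBytes, Charset.defaultCharset());
        return text.indexOf(literal) >= 0;
    }

    public static void main(String[] args) {
        byte[] content = "WebDAV search with DASL".getBytes(Charset.defaultCharset());
        System.out.println(contains(content, "DASL"));   // prints "true"
        System.out.println(contains(content, "Lucene")); // prints "false"
    }
}
```

Every query has to load and scan the complete content of every candidate
resource, which is why this approach does not scale to large repositories.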

2.) If it is plain text only, what about character encoding

The generic <contains> implementation has no special handling for character
encoding. Currently the default encoding is used to build a String from the
contentBytes, and that String is compared against the literal in <contains>.
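
The practical consequence can be shown with a small sketch. The class
CharsetAwareContains is hypothetical and not part of Slide; it only assumes
that the charset of the content is known and can be passed in explicitly.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only, not a class from the Slide code base.
final class CharsetAwareContains {
    private CharsetAwareContains() {}

    // Decodes the content with an explicitly given charset before searching.
    static boolean contains(byte[] contentBytes, Charset charset, String literal) {
        return new String(contentBytes, charset).contains(literal);
    }

    public static void main(String[] args) {
        // "Gruesse" spelled with u-umlaut and sharp s, written as Unicode escapes
        // so the example does not depend on the source file encoding.
        String word = "Gr\u00fc\u00dfe";
        byte[] utf8Content = (word + " aus Darmstadt").getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the content was written in finds the word.
        System.out.println(contains(utf8Content, StandardCharsets.UTF_8, word));      // prints "true"
        // Decoding the same bytes with a different charset garbles the
        // non-ASCII characters, so the search misses the word.
        System.out.println(contains(utf8Content, StandardCharsets.ISO_8859_1, word)); // prints "false"
    }
}
```

With the default encoding, the result of a search for non-ASCII text therefore
depends on the machine the server runs on.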

3.) Provided plain text is not the only type of searchable text. What
might be the mechanism to choose which types are supported for full text
search and how this search is implemented.

You can implement your own BasicSearch. It should be possible to reuse the
current implementation and override the <contains> operator. You can find an
example of a custom BasicSearch at
\jakarta-slide\src\stores\org\apache\slide\search\basic\sample in the
current HEAD revision of Slide.
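
One possible mechanism for choosing the supported types is a registry of text
extractors keyed by content type, which a custom <contains> could consult. The
following is only a sketch of that idea: ExtractorRegistry and all of its
methods are hypothetical names and do not exist in Slide, and the sample
mentioned above may be structured quite differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry for illustration only, not a class from the Slide code base.
final class ExtractorRegistry {
    // Maps a content type such as "text/plain" to a function that turns raw bytes into text.
    private final Map<String, Function<byte[], String>> extractors = new HashMap<>();

    void register(String contentType, Function<byte[], String> extractor) {
        extractors.put(contentType, extractor);
    }

    boolean isSearchable(String contentType) {
        return extractors.containsKey(contentType);
    }

    // Content types without a registered extractor never match.
    boolean contains(String contentType, byte[] content, String literal) {
        Function<byte[], String> extractor = extractors.get(contentType);
        if (extractor == null) {
            return false;
        }
        return extractor.apply(content).contains(literal);
    }

    public static void main(String[] args) {
        ExtractorRegistry registry = new ExtractorRegistry();
        // Assumption: plain text content is stored as UTF-8.
        registry.register("text/plain", bytes -> new String(bytes, StandardCharsets.UTF_8));

        byte[] content = "full text search".getBytes(StandardCharsets.UTF_8);
        System.out.println(registry.contains("text/plain", content, "search")); // prints "true"
        System.out.println(registry.contains("image/jpeg", content, "search")); // prints "false"
    }
}
```

An extractor for Word documents or an OCR step for scanned images would be
registered in the same way, without touching the search logic itself.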

4.) Which part of slide should be responsible for full text search? Is 
the kernel the right position?

The kernel is definitely not the right place for a high-performance content
search. The right place for a SEARCH facility is the store implementation. An
SQL-based store would of course issue SQL calls on indexed columns; for a
powerful text content search, some kind of index has to be built. For our
Tamino WebDAV Server we have a Tamino-based SEARCH engine, which of course
performs better than the generic SEARCH.
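
To make the point about indexes concrete, here is a toy inverted index. It is
only a sketch of the general technique, not how Tamino or any Slide store
works: TinyInvertedIndex is a hypothetical class, and the ASCII-only,
lower-casing tokenizer is a simplifying assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index for illustration only, not a class from the Slide code base.
final class TinyInvertedIndex {
    // Maps each lower-cased token to the sorted set of resource URIs containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Tokenizes the text once, at store time, and records the URI for every token.
    void add(String uri, String text) {
        // \W+ splits on runs of non-word characters (ASCII word characters only).
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(uri);
            }
        }
    }

    // A query is a single map lookup instead of a scan over all content.
    Set<String> lookup(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("/files/a.txt", "Slide supports DASL search");
        index.add("/files/b.txt", "Search belongs in the store");
        System.out.println(index.lookup("search")); // prints "[/files/a.txt, /files/b.txt]"
        System.out.println(index.lookup("dasl"));   // prints "[/files/a.txt]"
    }
}
```

The cost of reading the content is paid once when a resource is stored, which
is something a store can do but a generic kernel-level search cannot.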

However, the only place for a generic SEARCH that serves ALL possible stores
is the kernel. This is what is currently implemented.


Best regards,
Martin


-----Original Message-----
From: Oliver Zeigermann [mailto:[EMAIL PROTECTED]
Sent: Thursday, 4 September 2003 10:56
To: Slide Users Mailing List
Subject: Re: FW: When is the next production release of Slide?


Hi all,

judging from (little) experience and from threads on the mailing list, it
appears to me that one of the main problems is that no store in Slide has
been able to keep pace with the implementations of binding and improved
ACL and, to a certain degree, also DASL.

My impression is that the introduction of DASL brings a new
quality of service into Slide. Before DASL, Slide made no assumptions about
what is stored or what type of content is managed. In addition to meta
information and path, DASL allows a search to refer to the content of a
resource for full text search, which is a new quality.

This raises the questions of
1.) What content can be searched in a text manner in the first place
2.) If it is plain text only, what about character encoding
3.) Provided plain text is not the only type of searchable text. What
might be the mechanism to choose which types are supported for full text
search and how this search is implemented.
4.) Which part of slide should be responsible for full text search? Is
the kernel the right position?

All this leads me to the impression that searching should be implemented
in the stores, not in the kernel. To have a non-toy search in terms of
speed and accurate semantics, you will need to use the services of an
underlying store implementation (e.g. Oracle, Tamino or even plain files
with Lucene).

Well, these are just my very limited thoughts. What do you people think?

Oliver


Pill, Juergen wrote:
> Hello,
> 
> The current version of Slide does contain a series of bug fixes and a
> completed implementation of Delta-V (except the advanced features) and
> DASL.
> Currently we are working on two standards:
> 
> 1) Bind
> 2) ACL
> 3) Delta-V (advanced features) is on the radar too.
> 
> The quality of the CVS head is pretty good, IMO. If you have a policy not
> to use the current CVS head, or to compile the source, there is currently
> only the frozen version available.
> 
> The lack of documentation is currently caused by a lack of people willing
> to donate documentation. But Christopher Taylor is contributing very good
> documentation already. This will be available very soon.
> 
> Would there be someone who is willing to take the position as a release
> manager, when Slide 2 will be frozen? If everyone is happy with the
> current quality we could try to create a new version, once actions (1)
> and (2) are completed.
> 
> Best regards,
> 
> Juergen


