I forgot to mention a few things:

On Wednesday, 25 February 2004 13:37, Daniel Florey wrote:
> Hi,
> I just checked in some classes for the extractor thing.
> I've implemented a very simple demo extractor that extracts data from xml
> documents by doing some configurable xpath queries. If you want to test
> this you have to enable the extractor trigger.
> This is done in the Domain.xml file in the event section:
>
>    <listener classname="org.apache.slide.extractor.ExtractorTrigger">
>        <configuration>
>           <extractor
>              classname="org.apache.slide.extractor.SimpleXmlExtractor"
>              uri="/files/articles/test.xml">

You can match by exact URIs as well as by URI substrings (e.g.
uri="/files/articles/"), and you can also match by content type. To extract
something from all Word documents, the configuration would look something
like this:

<listener classname="org.apache.slide.extractor.ExtractorTrigger">
    <configuration>
        <extractor classname="org.apache.slide.extractor.MSWordExtractor"
                   content-type="application/ms-word">

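The content type shown is only a placeholder; use whatever content type your
documents are actually stored with (the registered MIME type for Word
documents is application/msword). URI matching and content-type matching can
be combined.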
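
As a minimal sketch of such a combination, the fragment below restricts the
demo extractor to XML documents under /files/articles/. The text/xml content
type is an assumption chosen for illustration, and the sketch assumes that
both conditions have to match before the extractor runs. The instruction
elements are the ones from the original mail.

```xml
<listener classname="org.apache.slide.extractor.ExtractorTrigger">
    <configuration>
        <!-- uri and content-type given together; the text/xml value is an assumption -->
        <extractor classname="org.apache.slide.extractor.SimpleXmlExtractor"
                   uri="/files/articles/"
                   content-type="text/xml">
            <configuration>
                <instruction property="title" xpath="/article/title/text()" />
                <instruction property="summary" xpath="/article/summary/text()" />
            </configuration>
        </extractor>
    </configuration>
</listener>
```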

You can make your own extractor configurable by implementing the Configuration
interface. Have a look at the demo extractor; it is quite simple.
Regards,
Daniel

>              <configuration>
>                 <instruction property="title"
>                    xpath="/article/title/text()" />
>                 <instruction property="summary"
>                    xpath="/article/summary/text()" />
>              </configuration>
>           </extractor>
>        </configuration>
>     </listener>
>
> In this example only the document with uri = /files/articles/test.xml will
> be processed. If its content is:
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <article>
>       <title>Title of article</title>
>       <summary>The summary of this article</summary>
> </article>
>
> then new properties (title, summary) containing the extracted text become
> available. If an error occurs during extraction, an extractor exception is
> thrown and the file cannot be uploaded.
> Any comments are welcome.
> Regards,
> Daniel
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

