[EMAIL PROTECTED] wrote:
On 3/24/08, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
 atm the index fields are configured for each publication:
   <index id="default-live" analyzer="stopword_en"
          directory="lenya/pubs/default/work/lucene/index/live/index">
     <structure>
       <field id="url" type="keyword" />
       <field id="title" type="text" storetext="true"/>
       <field id="description" type="text" storetext="true"/>
       <field id="subject" type="keyword" storetext="true" />
       <field id="body" type="text" storetext="true"/>
     </structure>
   </index>
 IMO this is an inappropriate place for this configuration. Furthermore,
 it has to match the index XSLTs of all resource types.

 Wouldn't it be better to
 - index all meta data fields
 - configure the indexable fields for each resource type (they have to
   conform to the corresponding index XSLTs)?

 The index structure would be automatically derived from this
 configuration (basically the union of all fields). Changing the meta
 data or resource type configuration would certainly require re-indexing
 the whole content of the web application, but IMO this is not a big issue.

 WDYT?
 -- Andreas
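
The "union of all fields" idea above could be sketched roughly as follows. This is only an illustration in Python, not Lenya code: the per-resource-type dictionaries, the resource type names and the function derive_index_structure are all hypothetical, and only the field ids and types are taken from the configuration quoted above. The sketch also assumes that a field declared by several resource types must have the same type everywhere.

```python
# Hypothetical per-resource-type field declarations (not an existing
# Lenya configuration format). Field ids and types follow the quoted
# <structure> element; the resource type names are made up.
RESOURCE_TYPE_FIELDS = {
    "xhtml": {
        "url": "keyword",
        "title": "text",
        "description": "text",
        "body": "text",
    },
    "person": {
        "url": "keyword",
        "title": "text",
        "subject": "keyword",
    },
}


def derive_index_structure(per_type):
    """Return the union of all declared fields as {field id: field type}.

    Raises ValueError if two resource types declare the same field id
    with different types, since one index can hold only one definition.
    """
    structure = {}
    for resource_type in sorted(per_type):
        for field_id, field_type in per_type[resource_type].items():
            known = structure.setdefault(field_id, field_type)
            if known != field_type:
                raise ValueError(
                    "field %r: type %r in %r conflicts with %r"
                    % (field_id, field_type, resource_type, known)
                )
    return structure
```

Calling derive_index_structure(RESOURCE_TYPE_FIELDS) yields the five fields of the quoted configuration. Adding or changing a declaration changes the derived structure, which is the point at which the content would have to be re-indexed.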

I agree one configuration for all publications is a worthy goal.  My
version was an add-on to a Lenya 1.2.2 Publication and so was not
concerned with integration into core Lenya.  The current
implementation may be derivative.

Extracting all data from any document is great for the search terms.
All text should be included, or do we have field-level security?

No, we only have document-level security.

Should all properties be included?  Should the properties be
associated with the field (element) name?

ATM this is up to the resource type (done using a {resourceType}2index.xsl stylesheet), and IMO we can leave it like this, e.g. map

  <person>
    <name>Henry Hamster</name>
  </person>

to field

  <lucene:document>
    <lucene:field name="personName">Henry Hamster</lucene:field>
  </lucene:document>

It would be nice to have namespaced field names, though, to avoid clashes (see my other mail).
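
To make the mapping concrete, here is a minimal sketch of the naming rule implied by the example (element path "person/name" becomes field "personName"). In Lenya the mapping is done by the resource type's XSLT; the Python below is only a stand-in, and the function to_index_fields and its camel-case rule are assumptions, not part of Lenya or Lucene.

```python
import xml.etree.ElementTree as ET


def to_index_fields(xml_text):
    """Map every non-empty leaf element to a field {name: text}.

    The field name is the element path joined in camel case, e.g.
    person/name -> personName (an assumed convention for illustration).
    """
    fields = {}

    def walk(element, parent_path):
        path = parent_path + [element.tag]
        children = list(element)
        if not children:
            text = (element.text or "").strip()
            if text:
                name = path[0] + "".join(
                    part[:1].upper() + part[1:] for part in path[1:]
                )
                fields[name] = text
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), [])
    return fields


fields = to_index_fields("<person><name>Henry Hamster</name></person>")
```

Because the root element name becomes part of the field name, two resource types with different root elements do not collide in this sketch, but two types that share a root element name still would, which is where namespaced field names would help.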

-- Andreas


--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

