Re: Excerpts Question

Marcel Reutegger Wed, 04 Jun 2008 01:35:17 -0700

Hi Marc,

please check the following points:

- configuration changes to repository.xml only affect newly created workspaces, so make sure you also changed any existing workspace.xml files

- changes to the parameters 'supportHighlighting' and 'textFilterClasses' require that you re-index the workspace; otherwise only newly added resources are indexed according to the new value.
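
To check the first point quickly, something like the sketch below might help. It is only a rough illustration, not part of Jackrabbit: the class name WorkspaceConfigCheck and the method findWorkspacesMissingParam are made up for this example, and it assumes the default layout where each workspace keeps its configuration at <repository home>/workspaces/<name>/workspace.xml. It does a plain text search rather than parsing the XML, so treat a hit only as a hint to go and look at that file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Jackrabbit class.
public class WorkspaceConfigCheck {

    /**
     * Returns every workspace.xml under repoHome/workspaces that does not
     * contain a param element with the given name.
     * Assumption: the default directory layout workspaces/<name>/workspace.xml.
     */
    static List<Path> findWorkspacesMissingParam(Path repoHome, String paramName)
            throws IOException {
        List<Path> missing = new ArrayList<>();
        Path workspaces = repoHome.resolve("workspaces");
        if (!Files.isDirectory(workspaces)) {
            return missing; // nothing to check
        }
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(workspaces)) {
            for (Path dir : dirs) {
                Path config = dir.resolve("workspace.xml");
                if (!Files.isRegularFile(config)) {
                    continue; // not a workspace directory
                }
                String xml = new String(Files.readAllBytes(config), StandardCharsets.UTF_8);
                // Plain text match on name="..."; this is not a real XML parse.
                if (!xml.contains("name=\"" + paramName + "\"")) {
                    missing.add(config);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // First argument: repository home directory (defaults to the current directory).
        Path repoHome = Paths.get(args.length > 0 ? args[0] : ".");
        String[] params = {"supportHighlighting", "excerptProviderClass", "textFilterClasses"};
        for (String param : params) {
            for (Path config : findWorkspacesMissingParam(repoHome, param)) {
                System.out.println(config + " does not set " + param);
            }
        }
    }
}
```

Any file it reports would need the same parameters added to its SearchIndex element, followed by a re-index of that workspace as described in the second point.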


regards
 marcel

Marc Schriftman wrote:

Hey y'all

A quick excerpt question, if you don't mind. I've configured my repository
for excerpts:

<param name="supportHighlighting" value="true"/>
<param name="excerptProviderClass"
value="org.apache.jackrabbit.core.query.lucene.DefaultHTMLExcerpt"/>
<param name="textFilterClasses" value="
            org.apache.jackrabbit.extractor.HTMLTextExtractor,
            org.apache.jackrabbit.extractor.MsExcelTextExtractor,
            org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
            org.apache.jackrabbit.extractor.MsWordTextExtractor,
            org.apache.jackrabbit.extractor.PdfTextExtractor,
            org.apache.jackrabbit.extractor.PlainTextExtractor
"/>

and my code looks like this:

Query query = queryManager.createQuery(
        "//element(*, nt:resource)[jcr:contains(., '" + partial
        + "')]/(@jcr:uuid|rep:excerpt(.))", Query.XPATH);
RowIterator iter = query.execute().getRows();
while (iter.hasNext()) {
    final Row row = iter.nextRow();
    final String uuid = row.getValue("jcr:uuid").getString();
    final String excerpt = row.getValue("rep:excerpt(.)").getString();
    getWriter().println(excerpt);
}

and this is what I'm getting:

<excerpt><fragment>238b244d-8ed2-4e6b-b319-1c26256eb580 ...
63f7bdc2-0667-4366-bed8-5c0928fba5d2 ...
application/vnd.ms-powerpoint</fragment></excerpt>
<excerpt><fragment>0affc599-1dfc-4813-8c57-93a8d6349226 ...
f00a9ba8-7e69-4337-be02-49fcffc6fb72 ...
application/pdf</fragment></excerpt>


Anyone know what I'm doing wrong? It feels like it might be configuration
related, since that's not even the correct format for the
DefaultHTMLExcerpt, but what's with the guid weirdness?

Thanks in advance,

Marc Schriftman
