I think the main point to understand is that the xsl:for-each selects one 
managed foxml:datastream element at a time. Each such element has an ID 
attribute with a value, e.g. "PDF_DOC", and the IFname attribute of the 
IndexField element gets its value from the concat() function, e.g. 
"dsm.PDF_DOC". So the prerequisite is that your foxml has such a datastream, 
and it is the mimetype of that datastream that exts:getDatastreamText() uses 
to extract the text from the document.
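
As a rough sketch of the mapping described above: the PID "demo:1", the label 
and the REF value are made up for illustration, and the exact attributes in 
your own foxml may differ. A managed datastream and the index field the 
stylesheet would emit for it could look something like this:

```xml
<!-- Hypothetical managed datastream in the object's foxml
     (PID, LABEL and REF are assumptions, not taken from a real object) -->
<foxml:datastream ID="PDF_DOC" STATE="A" CONTROL_GROUP="M" VERSIONABLE="true">
  <foxml:datastreamVersion ID="PDF_DOC.0" LABEL="Example PDF"
                           MIMETYPE="application/pdf">
    <foxml:contentLocation TYPE="INTERNAL_ID" REF="demo:1+PDF_DOC+PDF_DOC.0"/>
  </foxml:datastreamVersion>
</foxml:datastream>

<!-- Roughly what the xsl:for-each produces for that datastream:
     IFname is concat('dsm.', @ID), the content comes from
     exts:getDatastreamText() -->
<IndexField IFname="dsm.PDF_DOC" index="TOKENIZED" store="YES" termVector="NO">
  ...text extracted from the PDF...
</IndexField>
```

If no datastream with CONTROL_GROUP="M" is present, the for-each selects 
nothing and no dsm.* field is written.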

Best regards,
Gert

From: Matteo Boschini
Sent: 19. oktober 2010 16:22
To: Fedora Users
Subject: Re: [fcrepo-user] dummy question on gsearch

Ok, solved but I do not understand why:
dsm is defined as TOKENIZED in basicFoxmlToLucene, but if I modify 
index.properties with this line:

fgsindex.untokenizedFields              = dsm

"magically" dsm gets indexed.
Why? What am I missing?

On Tue, Oct 19, 2010 at 3:35 PM, Matteo Boschini wrote:
Sorry for this very dummy/stupid question...
I've succeeded in setting up gsearch with full-text datastream Lucene indexing 
at least 4 times, but now I can no longer do it...

I have a BasicIndex config/setup, and am trying to get some PDF datastreams 
full-text indexed.

basicFoxmlToLucene has lines saying:

<xsl:for-each select="foxml:datastre...@control_group='M']">
  <IndexField index="TOKENIZED" store="YES" termVector="NO">
    <xsl:attribute name="IFname">
      <xsl:value-of select="concat('dsm.', @ID)"/>
    </xsl:attribute>
    <xsl:value-of select="exts:getDatastreamText($PID, $REPOSITORYNAME, @ID, $FEDORASOAP, $FEDORAUSER, $FEDORAPASS, $TRUSTSTOREPATH, $TRUSTSTOREPASS)"/>
  </IndexField>
</xsl:for-each>

thus I assume that Managed datastreams of a MIME type defined in 
fedoragsearch.properties (actually application/pdf, which is there by 
default) should be indexed, but they're not (I also checked with Luke).
And in fact, in browseIndex, the field name dsm.ID is not listed.

I'm surely missing something stupid, maybe someone out there can help me...


