All right, here comes the addBinaryValue method body:

...
// standard way of indexing
String jcrData = mappings.getPrefix(Name.NS_JCR_URI) + ":data";
if (jcrData.equals(fieldName)) {
    InternalValue type = getValue(NameConstants.JCR_MIMETYPE);
    if (type != null) {
        Metadata metadata = new Metadata();
        metadata.set(Metadata.CONTENT_TYPE, type.getString());
        // jcr:encoding is not mandatory
        InternalValue encoding = getValue(NameConstants.JCR_ENCODING);
        if (encoding != null) {
            metadata.set(Metadata.CONTENT_ENCODING, encoding.getString());
        }
        doc.add(createFulltextField(internalValue, metadata));
    }
} else {
    // everything else gets indexed as well
    MimeTypes gk = new MimeTypes();
    MimeType mimeType = gk.getMimeType(internalValue.getStream());
    Metadata metadata = new Metadata();
    metadata.set(Metadata.CONTENT_TYPE, mimeType.getName());
    doc.add(createFulltextField(internalValue, metadata));
}
...

My custom parser uses XHTMLContentHandler like this (and I can see it is
started every time a binary value with my custom mime type is added):

...
XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
xhtml.startDocument();
...
for (String value : keywords) {
    xhtml.characters(value);
    xhtml.characters(" ");
    //xhtml.element("p", value);
}
xhtml.endDocument();
...

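To make the effect of that loop concrete, here is a minimal sketch that uses only the JDK's SAX classes, not Tika itself. It assumes the receiving handler simply concatenates every characters() event, which is roughly what a text-extracting handler does. The names KeywordSaxSketch, TextCollector, emitKeywords and extract are hypothetical and exist only for this illustration.

```java
import java.util.List;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the parser loop above; plain JDK SAX, no Tika.
class KeywordSaxSketch {

    /** Assumed behaviour of the receiving side: append all character events. */
    static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        String text() {
            return text.toString();
        }
    }

    /** Emits each keyword followed by one space, like the loop in the parser. */
    static void emitKeywords(ContentHandler handler, List<String> keywords)
            throws SAXException {
        handler.startDocument();
        for (String value : keywords) {
            char[] chars = value.toCharArray();
            handler.characters(chars, 0, chars.length);
            // The separator keeps adjacent keywords from fusing into one token.
            handler.characters(new char[] {' '}, 0, 1);
        }
        handler.endDocument();
    }

    /** Runs the emitter against a collector and returns the gathered text. */
    static String extract(List<String> keywords) {
        TextCollector collector = new TextCollector();
        try {
            emitKeywords(collector, keywords);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return collector.text();
    }

    public static void main(String[] args) {
        // Brackets make the trailing separator visible.
        System.out.println("[" + extract(List.of("alpha", "beta")) + "]");
    }
}
```

If the text collected this way contains the expected words, the parser output itself is probably fine and the problem is more likely in which index field the text ends up in.
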
> Date: Mon, 30 Aug 2010 09:12:16 +0200
> Subject: Re: Searching for binary values
> From: [email protected]
> To: [email protected]
>
> 2010/8/27 Slavek Tecl <[email protected]>:
> > In my case the addBinaryValue has been overridden in my custom class so I'm
> > adding this field to the document as well.
>
> Is it possible that you made some error in this? I can't judge it without code
>
> Regards Ard
>
> >
> >> Date: Fri, 27 Aug 2010 17:16:56 +0200
> >> Subject: Re: Searching for binary values
> >> From: [email protected]
> >> To: [email protected]
> >>
> >> 2010/8/27 Slavek Tecl <[email protected]>:
> >> >
> >> > I'm looking for a clarification how the query is processed in my
> >> > customized jackrabbit instance. In my case the NodeIndexer is subclassed
> >> > so it can add the binary value to the indexed Document even if it does
> >> > not have nt:resource type. Then Tika has been customized with my
> >> > mimetype so the parser is able to recognize the binary stream through
> >> > its magic, and of course Tika's Parser object was implemented to
> >> > support the custom binary stream to extract words from it. If I run a
> >> > query on nt:resource nodes it correctly returns files including the
> >> > searched word as expected but when I invoke a similar query on a binary
> >> > property (and the content of this binary property is exactly the type of
> >> > the stream Tika can parse) it does not return anything - is there a way
> >> > out?
> >>
> >>
> >> Binary properties are only indexed on nodescope level, not on property
> >> level.
> >>
> >> See protected void addBinaryValue(Document doc,
> >> String fieldName,
> >> InternalValue internalValue) {
> >>
> >> and then specifically doc.add(createFulltextField(internalValue,
> >> metadata));
> >>
> >> in jr NodeIndexer
> >>
> >> Regards Ard
> >