once again in HTML...
here comes the addBinaryValue method body:...
// standard way of indexing
String jcrData = mappings.getPrefix(Name.NS_JCR_URI) + ":data";
if (jcrData.equals(fieldName)) {
    InternalValue type = getValue(NameConstants.JCR_MIMETYPE);
    if (type != null) {
        Metadata metadata = new Metadata();
        metadata.set(Metadata.CONTENT_TYPE, type.getString());
        // jcr:encoding is not mandatory
        InternalValue encoding = getValue(NameConstants.JCR_ENCODING);
        if (encoding != null) {
            metadata.set(Metadata.CONTENT_ENCODING, encoding.getString());
        }
        doc.add(createFulltextField(internalValue, metadata));
    }
} else {
    // everything else gets indexed as well
    MimeTypes gk = new MimeTypes();
    MimeType mimeType = gk.getMimeType(internalValue.getStream());
    Metadata metadata = new Metadata();
    metadata.set(Metadata.CONTENT_TYPE, mimeType.getName());
    doc.add(createFulltextField(internalValue, metadata));
}
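
For what it's worth, here is a toy model of Ard's remark (quoted further down) that binary properties are indexed on nodescope level only. It is plain Java with no Jackrabbit or Lucene classes; ScopeSketch, NODE_SCOPE, indexNode and matches are made-up names, and the real field names in Jackrabbit's index are not shown here. It only illustrates why a lookup scoped to the property name could come back empty while a node-scope lookup succeeds:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ScopeSketch {
    // Hypothetical stand-in for the node-scope full-text field.
    static final String NODE_SCOPE = "fulltext";

    // Builds a toy "document" for one node: field name -> indexed terms.
    // Text extracted from the binary is stored under the node-scope field
    // only; nothing is stored under the property's own name.
    static Map<String, List<String>> indexNode(String extractedText) {
        List<String> terms = new ArrayList<>();
        for (String term : extractedText.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        Map<String, List<String>> doc = new HashMap<>();
        doc.put(NODE_SCOPE, terms);
        return doc;
    }

    // Returns true if the given field of the document contains the word.
    static boolean matches(Map<String, List<String>> doc, String field, String word) {
        List<String> terms = doc.get(field);
        return terms != null && terms.contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = indexNode("alpha beta gamma");
        System.out.println(matches(doc, NODE_SCOPE, "beta")); // true
        System.out.println(matches(doc, "myBinary", "beta")); // false
    }
}
```

In query terms this would be the difference between jcr:contains(., 'beta') and jcr:contains(@myBinary, 'beta'), where myBinary is a hypothetical property name.
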
and here we have my custom parser (and I can see it's being started every time
a binary value with my custom mime type is added):
XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
xhtml.startDocument();
...fetch keywords...
for (String value : keywords) {
    xhtml.characters(value);
    xhtml.characters(" ");
}
xhtml.endDocument();
...
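
To check in isolation that keywords emitted this way really arrive as text on the receiving side, a plain-SAX sketch along these lines might help. It deliberately does not use Tika's XHTMLContentHandler; it uses only the org.xml.sax classes from the JDK, and KeywordSaxSketch, TextCollector, emitKeywords and extract are made-up names for illustration:

```java
import java.util.Arrays;
import java.util.List;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class KeywordSaxSketch {
    // Collects every characters() event into one string, roughly what a
    // text-extracting handler on the indexing side would do.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Emits each keyword followed by a space as SAX character events,
    // mirroring the loop in the parser snippet above.
    static void emitKeywords(ContentHandler handler, List<String> keywords)
            throws SAXException {
        handler.startDocument();
        for (String value : keywords) {
            char[] chars = (value + " ").toCharArray();
            handler.characters(chars, 0, chars.length);
        }
        handler.endDocument();
    }

    // Runs the emitter against a collector and returns the trimmed text.
    static String extract(List<String> keywords) {
        TextCollector collector = new TextCollector();
        try {
            emitKeywords(collector, keywords);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return collector.text.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(extract(Arrays.asList("alpha", "beta"))); // alpha beta
    }
}
```

If the collected text contains the keywords, the parser side is probably fine and the problem is more likely in how the field is indexed or queried.
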
----------------------------------------
> Date: Mon, 30 Aug 2010 09:52:28 +0200
> Subject: Re: Searching for binary values
> From: [email protected]
> To: [email protected]
>
> 2010/8/30 Slavek Tecl :
> >
> > Bloody hotmail, screwed my awesome formatting ;) Hope it's ok now.
>
> Hmmmm...not really
>
> >
> > here comes the addBinaryValue method body:...//standard way of
> > indexingString jcrData = mappings.getPrefix(Name.NS_JCR_URI) + ":data";if
> > (jcrData.equals(fieldName)) { InternalValue type =
> > getValue(NameConstants.JCR_MIMETYPE); if (type != null) { Metadata metadata
> > = new Metadata(); metadata.set(Metadata.CONTENT_TYPE, type.getString()); //
> > jcr:encoding is not mandatory InternalValue encoding =
> > getValue(NameConstants.JCR_ENCODING); if (encoding != null) {
> > metadata.set(Metadata.CONTENT_ENCODING, encoding.getString()); }
> > doc.add(createFulltextField(internalValue, metadata)); }} else {
> > //everything else gets indexed as well MimeTypes gk = new MimeTypes();
> > MimeType mimeType = gk.getMimeType(internalValue.getStream()); Metadata
> > metadata = new Metadata(); metadata.set(Metadata.CONTENT_TYPE,
> > mimeType.getName()); doc.add(createFulltextField(internalValue,
> > metadata));}...
> >
> > and here we have my custom parser (and I can see it's being started
> > everytime the binary value with my custom mime type is
> > added):XHTMLContentHandler xhtml = new XHTMLContentHandler(handler,
> > metadata);xhtml.startDocument();...fetch keywords...for(String value:
> > keywords) { xhtml.characters(value); xhtml.characters("
> > ");}xhtml.endDocument();...
> >> Date: Mon, 30 Aug 2010 09:31:47 +0200
> >> Subject: Re: Searching for binary values
> >> From: [email protected]
> >> To: [email protected]
> >>
> >> Slavek,
> >>
> >> I am no computer :-) Is there a way you could format this a little,
> >> into something human-readable?
> >>
> >>
> >> 2010/8/30 Slavek Tecl :
> >>>
> >>> All right, here comes the addBinaryValue method body: ... //standard way
> >>> of indexing String jcrData = mappings.getPrefix(Name.NS_JCR_URI) +
> >>> ":data"; if (jcrData.equals(fieldName)) { InternalValue type =
> >>> getValue(NameConstants.JCR_MIMETYPE); if (type != null) { Metadata
> >>> metadata = new Metadata(); metadata.set(Metadata.CONTENT_TYPE,
> >>> type.getString());
> >>> // jcr:encoding is not mandatory InternalValue encoding =
> >>> getValue(NameConstants.JCR_ENCODING); if (encoding != null) {
> >>> metadata.set(Metadata.CONTENT_ENCODING, encoding.getString()); }
> >>> doc.add(createFulltextField(internalValue, metadata)); } } else {
> >>> //everything else gets indexed as well MimeTypes gk = new MimeTypes();
> >>> MimeType mimeType = gk.getMimeType(internalValue.getStream());
> >>> Metadata metadata = new Metadata(); metadata.set(Metadata.CONTENT_TYPE,
> >>> mimeType.getName()); doc.add(createFulltextField(internalValue,
> >>> metadata)); } ...
> >>> my custom parser leverages XHTMLContentHandler like this (and I can see
> >>> it's being started every time a binary value with my custom mime type is
> >>> added):
> >>> ...XHTMLContentHandler xhtml = new XHTMLContentHandler(handler,
> >>> metadata);xhtml.startDocument();... for(String value: keywords) {
> >>> xhtml.characters(value); xhtml.characters(" "); //xhtml.element("p",
> >>> value); }xhtml.endDocument();...
> >>>> Date: Mon, 30 Aug 2010 09:12:16 +0200
> >>>> Subject: Re: Searching for binary values
> >>>> From: [email protected]
> >>>> To: [email protected]
> >>>>
> >>>> 2010/8/27 Slavek Tecl :
> >>>>> In my case addBinaryValue has been overridden in my custom class, so
> >>>>> I'm adding this field to the document as well.
> >>>>
> >>>> Is it possible that you made some error in this? I can't judge it
> >>>> without code
> >>>>
> >>>> Regards Ard
> >>>>
> >>>>>
> >>>>>> Date: Fri, 27 Aug 2010 17:16:56 +0200
> >>>>>> Subject: Re: Searching for binary values
> >>>>>> From: [email protected]
> >>>>>> To: [email protected]
> >>>>>>
> >>>>>> 2010/8/27 Slavek Tecl :
> >>>>>>>
> >>>>>>> I'm looking for clarification of how the query is processed in my
> >>>>>>> customized jackrabbit instance. In my case the NodeIndexer is
> >>>>>>> subclassed so it can add the binary value to the indexed Document
> >>>>>>> even if it does not have nt:resource type. Then Tika has been
> >>>>>>> customized with my mimetype so the parser is able to recognize the
> >>>>>>> binary stream through its magic, and of course Tika's Parser
> >>>>>>> object was implemented to support the custom binary stream to extract
> >>>>>>> words from it. If I run a query on nt:resource nodes it correctly
> >>>>>>> returns files including the searched word as expected, but when I
> >>>>>>> invoke a similar query on a binary property (and the content of this
> >>>>>>> binary property is exactly the type of the stream Tika can parse) it
> >>>>>>> does not return anything - is there a way out?
> >>>>>>
> >>>>>>
> >>>>>> Binary properties are only indexed on nodescope level, not on property
> >>>>>> level.
> >>>>>>
> >>>>>> See protected void addBinaryValue(Document doc,
> >>>>>> String fieldName,
> >>>>>> InternalValue internalValue) {
> >>>>>>
> >>>>>> and then specifically doc.add(createFulltextField(internalValue,
> >>>>>> metadata));
> >>>>>>
> >>>>>> in jr NodeIndexer
> >>>>>>
> >>>>>> Regards Ard
> >>>>>
> >>>
> >