Once again, this time as HTML mail.

Here comes the addBinaryValue method body:

...
// standard way of indexing
String jcrData = mappings.getPrefix(Name.NS_JCR_URI) + ":data";
if (jcrData.equals(fieldName)) {
    InternalValue type = getValue(NameConstants.JCR_MIMETYPE);
    if (type != null) {
        Metadata metadata = new Metadata();
        metadata.set(Metadata.CONTENT_TYPE, type.getString());
        // jcr:encoding is not mandatory
        InternalValue encoding = getValue(NameConstants.JCR_ENCODING);
        if (encoding != null) {
            metadata.set(Metadata.CONTENT_ENCODING, encoding.getString());
        }
        doc.add(createFulltextField(internalValue, metadata));
    }
} else {
    // everything else gets indexed as well
    MimeTypes gk = new MimeTypes();
    MimeType mimeType = gk.getMimeType(internalValue.getStream());

    Metadata metadata = new Metadata();
    metadata.set(Metadata.CONTENT_TYPE, mimeType.getName());
    doc.add(createFulltextField(internalValue, metadata));
}



And here is my custom parser (I can see it being started every time
a binary value with my custom MIME type is added):

XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
xhtml.startDocument();
...fetch keywords...
for (String value : keywords) {
    xhtml.characters(value);
    xhtml.characters(" ");
}
xhtml.endDocument();
...
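
To make Ard's point from the quoted thread below concrete (binary values are indexed at node scope only, not at property level), here is a minimal toy sketch in plain Java. It does not use Jackrabbit, Lucene or Tika at all. The names NodeScopeSketch, IndexedNode, NODE_SCOPE and the "payload" property are hypothetical and only there for illustration, and the way string properties are handled is an assumption of the toy, not a statement about NodeIndexer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy model of an index document; hypothetical, not Jackrabbit's real classes. */
public class NodeScopeSketch {

    /** Hypothetical field name standing in for the node-level fulltext field. */
    public static final String NODE_SCOPE = "node-fulltext";

    /** One indexed node: field name -> lower-cased tokens stored under that field. */
    public static final class IndexedNode {
        private final String path;
        private final Map<String, List<String>> fields = new HashMap<>();

        public IndexedNode(String path) {
            this.path = path;
        }

        public String getPath() {
            return path;
        }

        /** Splits text on whitespace and stores the tokens under the given field. */
        void addTokens(String field, String text) {
            List<String> tokens = fields.computeIfAbsent(field, k -> new ArrayList<>());
            for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    tokens.add(token);
                }
            }
        }

        /** True if the word was indexed under exactly this field. */
        public boolean contains(String field, String word) {
            List<String> tokens = fields.get(field);
            return tokens != null && tokens.contains(word.toLowerCase(Locale.ROOT));
        }
    }

    /** Text extracted from a binary goes to the node-scope field only. */
    public static void addBinaryText(IndexedNode node, String extractedText) {
        node.addTokens(NODE_SCOPE, extractedText);
    }

    /** Toy assumption: a string value is indexed under its own name and at node scope. */
    public static void addStringValue(IndexedNode node, String property, String value) {
        node.addTokens(property, value);
        node.addTokens(NODE_SCOPE, value);
    }

    public static void main(String[] args) {
        IndexedNode node = new IndexedNode("/docs/report");
        addStringValue(node, "title", "Quarterly report");
        addBinaryText(node, "alpha beta gamma");

        System.out.println(node.contains(NODE_SCOPE, "beta")); // node-scope lookup: true
        System.out.println(node.contains("payload", "beta"));  // property-scope lookup: false
        System.out.println(node.contains("title", "report"));  // string property: true
    }
}
```

If the real index behaves like this toy, a node-scope query such as jcr:contains(., 'beta') should match, while a property-scope query such as jcr:contains(@payload, 'beta') would find nothing, which is the symptom I am seeing.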







----------------------------------------
> Date: Mon, 30 Aug 2010 09:52:28 +0200
> Subject: Re: Searching for binary values
> From: [email protected]
> To: [email protected]
>
> 2010/8/30 Slavek Tecl :
> >
> > Bloody hotmail, screwed my awesome formatting ;)Hope it's ok now.
>
> Hmmmm...not really
>
> >
> > here comes the addBinaryValue method body:...//standard way of 
> > indexingString jcrData = mappings.getPrefix(Name.NS_JCR_URI) + ":data";if 
> > (jcrData.equals(fieldName)) { InternalValue type = 
> > getValue(NameConstants.JCR_MIMETYPE); if (type != null) { Metadata metadata 
> > = new Metadata(); metadata.set(Metadata.CONTENT_TYPE, type.getString()); // 
> > jcr:encoding is not mandatory InternalValue encoding = 
> > getValue(NameConstants.JCR_ENCODING); if (encoding != null) { 
> > metadata.set(Metadata.CONTENT_ENCODING, encoding.getString()); } 
> > doc.add(createFulltextField(internalValue, metadata)); }} else { 
> > //everything else gets indexed as well MimeTypes gk = new MimeTypes(); 
> > MimeType mimeType = gk.getMimeType(internalValue.getStream()); Metadata 
> > metadata = new Metadata(); metadata.set(Metadata.CONTENT_TYPE, 
> > mimeType.getName()); doc.add(createFulltextField(internalValue, 
> > metadata));}...
> >
> > and here we have my custom parser (and I can see it's being started 
> > everytime the binary value with my custom mime type is 
> > added):XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, 
> > metadata);xhtml.startDocument();...fetch keywords...for(String value: 
> > keywords) { xhtml.characters(value); xhtml.characters(" 
> > ");}xhtml.endDocument();...
> >> Date: Mon, 30 Aug 2010 09:31:47 +0200
> >> Subject: Re: Searching for binary values
> >> From: [email protected]
> >> To: [email protected]
> >>
> >> Slavek,
> >>
> >> I am no computer :-) Is there a way you can format this a little, into
> >> something human-readable?
> >>
> >>
> >> 2010/8/30 Slavek Tecl :
> >>>
> >>> All right, here comes the addBinaryValue method body: ... //standard way 
> >>> of indexing String jcrData = mappings.getPrefix(Name.NS_JCR_URI) + 
> >>> ":data"; if (jcrData.equals(fieldName)) { InternalValue type = 
> >>> getValue(NameConstants.JCR_MIMETYPE); if (type != null) { Metadata 
> >>> metadata = new Metadata(); metadata.set(Metadata.CONTENT_TYPE, 
> >>> type.getString());
> >>> // jcr:encoding is not mandatory InternalValue encoding = 
> >>> getValue(NameConstants.JCR_ENCODING); if (encoding != null) { 
> >>> metadata.set(Metadata.CONTENT_ENCODING, encoding.getString()); }
> >>> doc.add(createFulltextField(internalValue, metadata)); } } else { 
> >>> //everything else gets indexed as well MimeTypes gk = new MimeTypes(); 
> >>> MimeType mimeType = gk.getMimeType(internalValue.getStream());
> >>> Metadata metadata = new Metadata(); metadata.set(Metadata.CONTENT_TYPE, 
> >>> mimeType.getName()); doc.add(createFulltextField(internalValue, 
> >>> metadata)); } ...
> >>> my custom parser leverages XHTMLContentHandler like this (and I can see
> >>> it's being started every time the binary value with my custom mime type is
> >>> added):
> >>> ...XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, 
> >>> metadata);xhtml.startDocument();... for(String value: keywords) { 
> >>> xhtml.characters(value); xhtml.characters(" "); //xhtml.element("p", 
> >>> value); }xhtml.endDocument();...
> >>>> Date: Mon, 30 Aug 2010 09:12:16 +0200
> >>>> Subject: Re: Searching for binary values
> >>>> From: [email protected]
> >>>> To: [email protected]
> >>>>
> >>>> 2010/8/27 Slavek Tecl :
> >>>>> In my case addBinaryValue has been overridden in my custom class, so
> >>>>> I'm adding this field to the document as well.
> >>>>
> >>>> Is it possible that you made some error in this? I can't judge it 
> >>>> without code
> >>>>
> >>>> Regards Ard
> >>>>
> >>>>>
> >>>>>> Date: Fri, 27 Aug 2010 17:16:56 +0200
> >>>>>> Subject: Re: Searching for binary values
> >>>>>> From: [email protected]
> >>>>>> To: [email protected]
> >>>>>>
> >>>>>> 2010/8/27 Slavek Tecl :
> >>>>>>>
> >>>>>>> I'm looking for clarification on how a query is processed in my
> >>>>>>> customized Jackrabbit instance. In my case NodeIndexer is
> >>>>>>> subclassed so that it can add a binary value to the indexed Document
> >>>>>>> even if the node does not have the nt:resource type. Tika has been
> >>>>>>> customized with my MIME type so that it can recognize the binary
> >>>>>>> stream through its magic, and of course a Tika Parser was
> >>>>>>> implemented to support the custom binary stream and extract words
> >>>>>>> from it. If I run a query on nt:resource nodes, it correctly
> >>>>>>> returns the files containing the searched word, as expected. But when
> >>>>>>> I invoke a similar query on a binary property (and the content of this
> >>>>>>> binary property is exactly the type of stream Tika can parse), it
> >>>>>>> does not return anything. Is there a way out?
> >>>>>>
> >>>>>>
> >>>>>> Binary properties are only indexed at node-scope level, not at
> >>>>>> property level.
> >>>>>>
> >>>>>> See protected void addBinaryValue(Document doc,
> >>>>>> String fieldName,
> >>>>>> InternalValue internalValue) {
> >>>>>>
> >>>>>> and then specifically doc.add(createFulltextField(internalValue,
> >>>>>> metadata));
> >>>>>>
> >>>>>> in Jackrabbit's NodeIndexer
> >>>>>>
> >>>>>> Regards Ard
> >>>>>
> >>>
> >

                                          
