Hi
I have found that it is not an issue with POI. I extracted the text using
POI in a different way and the term is extracted properly. When I store the
text and retrieve it, the term exists. However, running the text through
the highlighter doesn't work.
I will post a test case with a plain text file on JIRA. Currently on a
cramped train!
Cheers
On 11 Mar 2009, at 18:11, markharw00d <markharw...@yahoo.co.uk> wrote:
If you can supply a JUnit test that recreates the problem, I think we
can start to make progress on this.
Amin Mohammed-Coleman wrote:
Hi
Apologies for re-sending this mail. Just wondering if anyone has
experienced the below. I'm not sure if this could happen due to the nature
of the document. It does seem strange that one term search returns a summary
while another does not, even though the same document is being returned.
I'm asking this so I can code around it if this is normal.
Apologies again for re-sending this mail.
Cheers
Amin
Sent from my iPhone
On 9 Mar 2009, at 07:50, Amin Mohammed-Coleman <ami...@gmail.com> wrote:
Hi
I am seeing some strange behaviour with the highlighter and I'm
wondering if anyone else is experiencing this. In certain
instances no summary is generated. I perform the search and it
returns the correct document. I can see that the Lucene document
contains the text in the field. However, after doing:
    SimpleHTMLFormatter simpleHTMLFormatter =
        new SimpleHTMLFormatter("<span class=\"highlight\"><b>", "</b></span>");
    // required for highlighting
    Query query2 = multiSearcher.rewrite(query);
    Highlighter highlighter =
        new Highlighter(simpleHTMLFormatter, new QueryScorer(query2));
    ...
    String text = doc.get(FieldNameEnum.BODY.getDescription());
    TokenStream tokenStream =
        analyzer.tokenStream(FieldNameEnum.BODY.getDescription(), new StringReader(text));
    String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
the string result is empty. This is very strange: if I try a
different term that exists in the document then I get a summary.
For example, I have a Word document that contains the terms
"document" and "aspectj". If I search for "document" I get the
correct document but no highlighted summary. However, if I search
using "aspectj" I get the same document with a highlighted summary.
Just to mention, I do rewrite the original query before
performing the highlighting.
I'm not sure what I'm missing here. Any help would be appreciated.
Cheers
Amin
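One way to "code around" an empty highlighter result while the cause is being investigated might look like the sketch below. It is not part of Lucene and not the fix for the problem described above: the class name SummaryFallback and the method summaryOrFallback are hypothetical, and the fallback simply shows the start of the stored text when no highlighted fragments come back.

```java
public class SummaryFallback {

    /**
     * Returns the highlighter output when it is non-empty. Otherwise returns
     * the first maxChars characters of the stored text, cut back to the last
     * space where possible and followed by the separator.
     * (Hypothetical helper, assumed for illustration only.)
     */
    public static String summaryOrFallback(String highlighted, String text,
                                           int maxChars, String separator) {
        // Use the highlighted fragments whenever the highlighter produced any.
        if (highlighted != null && highlighted.trim().length() > 0) {
            return highlighted;
        }
        if (text == null) {
            return "";
        }
        String trimmed = text.trim();
        // Short texts are returned whole, without a separator.
        if (trimmed.length() <= maxChars) {
            return trimmed;
        }
        String cut = trimmed.substring(0, maxChars);
        // Avoid ending in the middle of a word if a space is available.
        int lastSpace = cut.lastIndexOf(' ');
        if (lastSpace > 0) {
            cut = cut.substring(0, lastSpace);
        }
        return cut + separator;
    }

    public static void main(String[] args) {
        String text = "This document describes how aspectj weaving is configured.";
        // Empty highlighter result: falls back to the start of the text.
        System.out.println(summaryOrFallback("", text, 20, "..."));
        // Non-empty highlighter result: returned unchanged.
        System.out.println(summaryOrFallback("<b>aspectj</b> weaving", text, 20, "..."));
    }
}
```

With the code from the message above, the call would be something like summaryOrFallback(result, text, 200, "..."), where 200 is an arbitrary length chosen for the example.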
On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <ami...@gmail.com> wrote:
Hi
Got it working! Thanks again for your help!
Amin
On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman <ami...@gmail.com> wrote:
Thanks! The final piece that I needed to do for the project!
Cheers
Amin
On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> cool. i will use compression and store in index. is there anything
> special i need to do for decompressing the text? i presume i can just do
> doc.get("content")?
> thanks for your advice all!
No, just use Field.Store.COMPRESS when adding to the index and
Document.get() when fetching. The decompression is done automatically.
You may wonder why not enable compression for all fields. The reason is
that it adds overhead for very small and short fields, so you should only
use it for large contents (it's the same as compressing very small files
with ZIP/GZIP: these files mostly end up larger than without compression).
Uwe
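The size overhead on small values can be seen with a small JDK-only sketch. It uses java.util.zip.GZIPOutputStream as a stand-in to illustrate the ZIP/GZIP comparison above; it does not go through Lucene's Field.Store.COMPRESS code path, and the class name CompressionOverhead and the sample strings are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressionOverhead {

    /** Returns the number of bytes the input occupies after GZIP compression. */
    public static int gzippedSize(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream flushes the remaining data and the trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            // Not expected for an in-memory buffer; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // A short, id-like field value (hypothetical sample).
        byte[] small = "doc-42".getBytes(StandardCharsets.UTF_8);

        // A large, repetitive body standing in for extracted document text.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("the quick brown fox jumps over the lazy dog ");
        }
        byte[] large = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("small: " + small.length + " -> " + gzippedSize(small) + " bytes");
        System.out.println("large: " + large.length + " -> " + gzippedSize(large) + " bytes");
    }
}
```

The small value grows because the GZIP header and trailer alone take more bytes than the value itself, while the large repetitive text shrinks considerably. Exact numbers depend on the input and the JDK.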
---
------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org