Re: Highlighting Performance On Large Documents

2010-05-10 Thread Lance Norskog
To search in a field, it has to be indexed. You can store a field without indexing if you want to highlight it. If you index it with the term* options, it should highlight faster. Since these do not speed up highlighting, your analysis stack is probably very simple. The term* options are variations

Re: Highlighting Performance On Large Documents

2010-05-08 Thread Serdar Sahin
Hi, Thanks a lot for the replies. I had a chance to test them today. First of all, termVectors/termPositions/termOffsets did not help; they had very little effect. I also tried a workaround, but it is not as efficient as I had hoped. From these fields; field name=title type=text

Re: Highlighting Performance On Large Documents

2010-05-08 Thread Lance Norskog
If you want to highlight field X, doing the termOffsets/termPositions/termVectors will make highlighting that field faster. You should make a separate field and apply these options to that field. Now: doing a copyfield adds a value to a multiValued field. For a text field, you get a multi-valued
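
A minimal schema.xml sketch of the setup described above, assuming hypothetical field names `content` (searched) and `content_hl` (highlighted). Neither name comes from the thread; the `text` type is the one that appears in the field definitions quoted in other messages.

```xml
<!-- Hypothetical field names; only the attributes are the point. -->

<!-- Searched field: indexed, not stored. -->
<field name="content" type="text" indexed="true" stored="false"/>

<!-- Highlighted field: stored, with term vectors, positions and offsets. -->
<field name="content_hl" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

<!-- Copy the raw input into the highlighted field at index time. -->
<copyField source="content" dest="content_hl"/>
```

With a layout like this, a request would ask for highlights on the second field, for example with `hl=true&hl.fl=content_hl`.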

Re: Highlighting Performance On Large Documents

2010-05-08 Thread Serdar Sahin
Hi, Thanks. However as I said before, termOffsets/termPositions/termVectors had very little effect on the performance and I don't know why. I have done exactly what you are saying but highlighting 10 documents that have 200-400 A4 pages still takes around 2 seconds, depending on the query. I will

Re: Highlighting Performance On Large Documents

2010-05-08 Thread Serdar Sahin
Hi, Sorry for the second e-mail. The duplication problem was my mistake; it works now, and the query time dropped to 0.1 seconds, which is perfect. However, still if I use field name=short_text type=text indexed=false stored=true multiValued=true termVectors=true

Re: Highlighting Performance On Large Documents

2010-05-07 Thread Lance Norskog
Do you have these options turned on when you index the text field: termVectors/termPositions/termOffsets? Highlighting needs the information created by these analysis options. If they are not turned on, Solr has to load the document text and run the analyzer again with these options on, uses that

Highlighting Performance On Large Documents

2010-05-05 Thread Serdar Sahin
Hi, Currently, there are similar topics active in the mailing list, but I did not want to steal the topic. I have currently indexed 100,000 documents. They are Microsoft Office, PDF, etc. documents, which I convert to TXT files before indexing. Files are between 1-500 pages. When I search

Re: Highlighting Performance On Large Documents

2010-05-05 Thread Koji Sekiguchi
(10/05/05 22:08), Serdar Sahin wrote: Hi, Currently, there are similar topics active in the mailing list, but I did not want to steal the topic. I have currently indexed 100,000 documents. They are Microsoft Office, PDF, etc. documents, which I convert to TXT files before indexing. Files are