Re: Highlighter doesn't highlight wildcard queries after updating to 2.9.1/3.0.0

2010-01-10 Thread Mohsen Saboorian
OK, to answer my own question: I found from the following issue that if I do a query.rewrite(), highlighter doesn't work. https://issues.apache.org/jira/browse/LUCENE-1425 I did rewrite() in order to find all matched terms for example in a prefix query, but as this doesn't work anymore like Lucen

Re: Highlight the whole sentence instead of the partial matching terms

2010-01-10 Thread Li Leon
Just figured it out: I had missed including the external jar "lucene-memory-2.4.1.jar". With that jar included, "\"Giving and\"~11" only got "Giving" & "and" highlighted but not the whole sentence. Any ideas? Thanks, 2010/1/11 Li Leon > Hi all, > I wanted to use the following code to highlight the w

Highlight the whole sentence instead of the partial matching terms

2010-01-10 Thread Li Leon
Hi all, I wanted to use the following code to highlight the whole sentence with "\"something\"~n" to be parsed. The QueryParser part worked well, but when integrated with Highlighter, it ended up with an exception. Does anyone have any clue, as I'm investigating this? Thanks,

Re: Highlighter doesn't highlight wildcard queries after updating to 2.9.1/3.0.0

2010-01-10 Thread Mohsen Saboorian
The problem comes from this method: org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(Query, Map) The query passed to this method is of type org.apache.lucene.search.ConstantScoreQuery, but it matches none of the 'instanceof' checks in this method, so no WeightedSpanTerm is extra
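
The failure mode described in this message can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins written for this illustration, not the Lucene classes of the same names, and the extract method only mimics the general shape of an instanceof dispatch; it is not the code of WeightedSpanTermExtractor. It shows how a query type with no matching branch falls through silently and yields no terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExtractSketch {
    // Hypothetical stand-in types (assumptions for illustration, NOT Lucene's classes).
    interface Query {}

    static class TermQuery implements Query {
        final String term;
        TermQuery(String term) { this.term = term; }
    }

    static class BooleanQuery implements Query {
        final List<Query> clauses;
        BooleanQuery(List<Query> clauses) { this.clauses = clauses; }
    }

    // Stand-in for a rewritten wildcard/prefix query that hides its terms.
    static class ConstantScoreQuery implements Query {}

    // Collects terms by checking the concrete type of the query.
    static void extract(Query query, List<String> terms) {
        if (query instanceof TermQuery) {
            terms.add(((TermQuery) query).term);
        } else if (query instanceof BooleanQuery) {
            for (Query clause : ((BooleanQuery) query).clauses) {
                extract(clause, terms);
            }
        }
        // No branch matches ConstantScoreQuery, so nothing is added and
        // no error is raised: the caller simply gets no terms to highlight.
    }

    static List<String> termsOf(Query query) {
        List<String> terms = new ArrayList<>();
        extract(query, terms);
        return terms;
    }

    public static void main(String[] args) {
        Query bool = new BooleanQuery(Arrays.<Query>asList(
                new TermQuery("giving"), new TermQuery("and")));
        System.out.println(termsOf(bool));
        System.out.println(termsOf(new ConstantScoreQuery()));
    }
}
```

In this sketch the unhandled type produces an empty term list rather than an exception, which would be consistent with highlighting quietly doing nothing.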

Re: a complete solution for building a website search with lucene

2010-01-10 Thread Simon Willnauer
You should really look at Nutch. from the website http://lucene.apache.org/nutch: Nutch is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc. so

Re: a complete solution for building a website search with lucene

2010-01-10 Thread jyzhou817
Hi, Have you implemented such a web search in your web application development? As detailed as possible. example: 1) index: ? 2) search: Lucene Please do advise. Thanks. --- On Sat, 9/1/10, Simon Willnauer wrote: From: Simon Willnauer Subject: Re: a complete solution for building a websit

Re: a complete solution for building a website search with lucene

2010-01-10 Thread jyzhou817
Thanks. --- On Sat, 9/1/10, Simon Willnauer wrote: From: Simon Willnauer Subject: Re: a complete solution for building a website search with lucene To: java-user@lucene.apache.org Date: Saturday, 9 January, 2010, 6:16 PM I don't know that much about nutch but hadoop shouldn't really run under

London Search Social - this Tuesday, 12th January

2010-01-10 Thread Richard Marr
Hi all, Apologies for the cross-post. If you're near London on Tuesday the 12th Jan (i.e. this Tuesday) please come along and geek with us over a beer or two. All experience levels welcome, don't be scared. Details on the Meetup page below... (please sign up on there if you're interested in subseq

Re: Is there a way to limit the size of an index?

2010-01-10 Thread Dvora
I'm storing and reading the documents using Compass, not Lucene directly. I didn't touch those parameters, so I guess the default values are being used (I do see cfs files in the index). How does the ramBufferSizeMB parameter affect the file size? What value should I use in order to have 6MB files?

Re: Is there a way to limit the size of an index?

2010-01-10 Thread Michael McCandless
What did you set your IndexWriter ramBufferSizeMB to? That controls how large the initial segments are. Are you using compound file format? (that's the default). Mike On Sun, Jan 10, 2010 at 7:03 AM, Dvora wrote: > > Oh, as an exercise I tried to create 6MB files. Using the rule mentioned >

Re: Is there a way to limit the size of an index?

2010-01-10 Thread Dvora
Oh, as an exercise I tried to create 6MB files. Using the rule mentioned before, I set the maxMergeMB to 0.6 (and then 0.62, 0.64... 1.8) and used the default mergeFactor - I thought that should do for 6MB files... Michael McCandless-2 wrote: > > What settings did you use (mergeFactor, maxMe

Re: Is there a way to limit the size of an index?

2010-01-10 Thread Michael McCandless
What settings did you use (mergeFactor, maxMergeMB)? "nothing in the middle" is normal. Segment sizes are often rather quantized for a newly created index. I.e., a bunch of segs will be 1 MB, a bunch ~8, a bunch ~64 (assuming the IW flushes at ~ 1MB and mergeFactor is 8). Mike On Sat, Jan 9, 20
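
The quantized sizes mentioned in this message follow from simple arithmetic, sketched below under simplifying assumptions: every flush produces a segment of exactly the same size, and a merge of mergeFactor segments produces a segment whose size is their sum. Real indexes deviate from this (merged segments are often somewhat smaller than the sum of their inputs). The class and method names are hypothetical and no Lucene API is used.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentSizeSketch {
    // Approximate segment size (in MB) at each merge level. Level 0 is a
    // freshly flushed segment; each higher level is assumed to be the result
    // of merging mergeFactor segments from the level below.
    static List<Double> levelSizesMB(double flushMB, int mergeFactor, int levels) {
        List<Double> sizes = new ArrayList<>();
        double size = flushMB;
        for (int level = 0; level < levels; level++) {
            sizes.add(size);
            size *= mergeFactor;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Flush at ~1 MB with mergeFactor 8, as in the message above.
        System.out.println(levelSizesMB(1.0, 8, 3)); // prints [1.0, 8.0, 64.0]
    }
}
```

Under these assumptions there are no segments between the levels, which is why a new index tends to show clusters of sizes with "nothing in the middle".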