RE: lucene 4.3 seems to be much slower in indexing than lucene 3.6?

2013-07-27 Thread Zhang, Lisheng
Hi, That's a very good point; I will test with more realistic text (like a novel). Thanks very much for the help, Lisheng -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, July 27, 2013 3:42 AM To: Lucene Users Subject: Re: lucene 4.3 seems to

Query serialization/deserialization

2013-07-27 Thread Denis Bazhenov
I'm looking for a tool to serialize and deserialize Lucene queries. We have tried using Query.toString(), but some queries return a string that cannot be parsed by a QueryParser afterwards. The alternative is to use the standard Java serialization mechanism. The reason I'm trying to avoi

Re: need searcher example to read indexes generated by solr

2013-07-27 Thread Erick Erickson
Have you looked at either Blacklight or the Velocity Response Writer? The latter ships with Solr as standard; access it through the /browse handler. It's pretty easily customizable. Blacklight is here: http://projectblacklight.org/ Best Erick On Thu, Jul 25, 2013 at 1:14 PM, mlotfi wrote:

RE: Directly use inverted index

2013-07-27 Thread Uwe Schindler
For a complete in-memory index, use Linux "tmpfs" as an in-memory filesystem, and open the directory on this tmpfs with MMapDirectory. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [ma
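The tmpfs suggestion above rests on memory-mapped file access. As a rough illustration only, here is a minimal stdlib-only Java sketch of mapping a file read-only; it does not use Lucene at all, and the class name `MmapSketch`, the method `readViaMmap`, and the idea of passing a tmpfs directory as the first argument are assumptions made for this example. When the file lives on a tmpfs mount, the mapped pages are already in RAM.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapSketch {

    /** Maps the whole file read-only and returns its bytes decoded as UTF-8. */
    static String readViaMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0], if given, is a directory on a tmpfs mount.
        // Without an argument the system temp directory is used instead.
        Path dir = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("java.io.tmpdir"));
        Path file = Files.createTempFile(dir, "segment", ".bin");
        try {
            Files.write(file, "postings".getBytes(StandardCharsets.UTF_8));
            System.out.println(readViaMmap(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

This only shows the access pattern; in Lucene the mapping is handled by MMapDirectory, so application code would not normally call `FileChannel.map` itself.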

Re: Directly use inverted index

2013-07-27 Thread Michael McCandless
You can create your own Codec and implement your own formats, e.g. PostingsFormat controls how postings are encoded on disk. It's best to let the OS load the index into RAM ... as long as you have enough free RAM, the OS will use it to cache all hot pages from the index. We do have a RAMDirectory

Re: lucene 4.3 seems to be much slower in indexing than lucene 3.6?

2013-07-27 Thread Michael McCandless
It's also possible 4.x is slower than 3.x for purely random terms: the terms dictionary is completely different. But purely random terms are a poor test since they don't match the reality of a typical search index. Mike McCandless http://blog.mikemccandless.com On Fri, Jul 26, 2013 at 2:55 PM,

Re: ERROR: could not read any segments file in directory

2013-07-27 Thread Michael McCandless
OK, you have only one segments file, and it sounds like it was corrupted by the crash of your SAN. I don't think there's much you can do but re-index. Maybe move your index to local storage: it sounds like this SAN is not to be trusted. Separately, your segment numbers are truly immense. Do you

Re: Search a Part of the Sentence/Complete sentence in lucene 4.3

2013-07-27 Thread Michael McCandless
On Sat, Jul 27, 2013 at 3:20 AM, Ankit Murarka wrote: > Ok. I went through the Javadoc of PhraseQuery and tried using the position > argument to PhraseQuery. > > Problem encountered: > > My text contains : Still it is not happening and generally i will be able to > complete it at the earliest. > > The

Re: AnalyzingInfixSuggester

2013-07-27 Thread Michael McCandless
Right, you need those classes from src/test to compile the test case. Just run "ant test -Dtestcase=AnalyzingInfixSuggesterTest" from the lucene/suggest directory. Also, you cannot pass a "real index" to the suggester: it builds the index itself, when you call the .build method. This index is pr

Directly use inverted index

2013-07-27 Thread Airway Wong
Hi, I would like to use Lucene's inverted index directly as a building block for experimental purposes. 1. How can I customize the inverted list for a different format? Is there any example? 2. Is there an easy way to force load the complete index into memory? Thanks in advance.

Re: Search a Part of the Sentence/Complete sentence in lucene 4.3

2013-07-27 Thread Ankit Murarka
Ok. I went through the Javadoc of PhraseQuery and tried using the position argument to PhraseQuery. Problem encountered: My text contains : Still it is not happening and generally i will be able to complete it at the earliest. The user enters search string : 1. still happening and 2. still it is
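To make the position issue in this thread concrete, here is a minimal stdlib-only Java sketch of in-order phrase matching over token positions. It is a simplification written for illustration, not Lucene's PhraseQuery implementation: the class `PhraseSketch`, its tokenizer, and its notion of slop (extra positions allowed between the first and last phrase term, with terms required in order) are assumptions of this example. The second phrase in `main`, "not happening", is a hypothetical query chosen to show an exact match.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class PhraseSketch {

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Builds a tiny positional index: term -> ascending list of token positions. */
    static Map<String, List<Integer>> index(String text) {
        Map<String, List<Integer>> positions = new HashMap<>();
        List<String> tokens = tokenize(text);
        for (int i = 0; i < tokens.size(); i++) {
            positions.computeIfAbsent(tokens.get(i), k -> new ArrayList<>()).add(i);
        }
        return positions;
    }

    /**
     * Returns true if the phrase terms occur in order and the number of extra
     * positions between the first and last term is at most slop.
     */
    static boolean matches(String text, String phrase, int slop) {
        Map<String, List<Integer>> positions = index(text);
        List<String> terms = tokenize(phrase);
        if (terms.isEmpty()) {
            return false;
        }
        List<Integer> none = Collections.emptyList();
        for (int start : positions.getOrDefault(terms.get(0), none)) {
            int prev = start;
            boolean found = true;
            for (int t = 1; t < terms.size() && found; t++) {
                found = false;
                // Take the earliest occurrence after the previous term.
                for (int p : positions.getOrDefault(terms.get(t), none)) {
                    if (p > prev) {
                        prev = p;
                        found = true;
                        break;
                    }
                }
            }
            int extraGap = (prev - start) - (terms.size() - 1);
            if (found && extraGap <= slop) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "Still it is not happening and generally i will be able "
                + "to complete it at the earliest.";
        System.out.println(matches(text, "still happening", 0));
        System.out.println(matches(text, "still happening", 3));
        System.out.println(matches(text, "not happening", 0));
    }
}
```

In this sketch "still" is at position 0 and "happening" at position 4, so "still happening" needs three extra positions: it fails with slop 0 and succeeds with slop 3, while "not happening" matches exactly.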