RE: Posting Unicode data to Lucene not working during searching/retrieval!

2009-05-20 Thread Uwe Schindler
Indexed data is coming out in the same way as put in. Lucene works with Java Strings, so encoding is irrelevant. When you index your values, you must be sure to construct your index string/char arrays correctly using the UTF-8 encoding (e.g. by using a standard Java Reader, new String byte[], char
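
A minimal sketch of the point above, using only the JDK and no Lucene classes: bytes are decoded into a Java String with an explicit charset, and one common source of "?" marks (encoding into a charset that cannot represent the characters) is shown next to it. The class and method names are hypothetical, and StandardCharsets postdates this thread; on older JVMs the charset name "UTF-8" would be passed as a string instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Lucene or the original thread.
final class Utf8DecodeSketch {

    // Decode raw bytes with an explicit charset instead of the platform default.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Encode to US-ASCII and decode again; unmappable characters are replaced with '?'.
    static String roundTripAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String original = "\u0928\u092e\u0938\u094d\u0924\u0947"; // Devanagari sample text
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the same charset used for encoding restores the text.
        System.out.println(decodeUtf8(wire).equals(original));

        // A lossy charset anywhere in the pipeline turns the text into question marks.
        System.out.println(roundTripAscii(original));
    }
}
```

If the String is already correct when it reaches the indexing code, the question marks were most likely introduced before or after Lucene, for example while reading the input or printing the result.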

Posting Unicode data to Lucene not working during searching/retrieval!

2009-05-20 Thread KK
How do I post UTF-8 encoded Unicode data to a Lucene index? Do we have to specify something special, such as a flag saying that we're posting Unicode data? I tried to post some UTF-8 encoded data, but during retrieval I'm not able to see that data; there are just "?" marks in all those places. Earlier I was

corpus vocabulary

2009-05-20 Thread Ridzwan Aminuddin
Hi, can someone point me in the direction of how I can get a string array of the corpus/index vocabulary from the index using an IndexReader? Currently this is what I am doing: IndexReader reader = IndexReader.open(indexdirectorypath); termenumvar = reader.terms(); then I iterate through this ter

Re: About sort questions

2009-05-20 Thread hacklisp
Hi, balasubramanian. Thanks for your reply. Both first:25 and second:90 may or may not include 'java'. I have set doc#90's boost to 3.15 and doc#25's boost to 1.0. I think that is the key. I tried to set the query term boost to a proper value, but that did not fix it: one case is okay, but the other is not. balasubram

Re: About sort questions

2009-05-20 Thread balasubramanian sudaakeran
My guess is that this can happen when your document matches more than one condition. For example, first:25 could match lang:java as well? - Original Message From: hacklisp To: java-user@lucene.apache.org Sent: Thursday, May 21, 2009 10:03:52 AM Subject: About sort questions I search

About sort questions

2009-05-20 Thread hacklisp
I search for 'lisp' with a Lucene application using the following query string: uid:5^3 OR uid:10^2 OR lang:lisp I expect results as follows: first:5 (whose id is 5) second:10 (whose id is 10) others: other results sorted according to relevance. It is usually ok, but sometimes no

Re: CustomScoreQuery numerical precision problem

2009-05-20 Thread Simon Willnauer
Hi there On Tue, May 19, 2009 at 8:32 AM, ac wrote: > hello, > I am using CustomScoreQuery for result ranking. > A field of my documents is parsable as an integer value, the magnitude > of which exceeds the precision of the float type. > A sample value of this field is 24118569 > > However, due to
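
The precision limit mentioned in the quoted message can be checked with plain Java, independently of Lucene. A float has a 24-bit significand, so integers above 2^24 (16777216) are no longer all representable, and the sample value 24118569 from the message is one that is not. The class and method names below are hypothetical, a sketch rather than anything from the thread.

```java
// Hypothetical illustration class; uses only core Java, no Lucene classes.
final class FloatPrecisionSketch {

    // True if the int value comes back unchanged after a round trip through float.
    static boolean survivesFloat(int value) {
        return (int) (float) value == value;
    }

    public static void main(String[] args) {
        int sample = 24118569;       // sample field value quoted in the message
        int limit = 1 << 24;         // 16777216, top of the exactly representable range

        System.out.println("limit survives:  " + survivesFloat(limit));
        System.out.println("sample survives: " + survivesFloat(sample));
        System.out.println("sample as float: " + (int) (float) sample);
    }
}
```

Any scoring path that carries such a field value as a float will therefore see a rounded number, whatever the query class does afterwards.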

Re: read between the lines of an index

2009-05-20 Thread Erick Erickson
The Lucene In Action book (at least the first edition and, I presume, the second) has exactly this, called SynonymAnalyzer. The basic idea is that at index time you index your multiple terms with no increment between, so all your synonyms get indexed in the same position. I highly recommend the bo

read between the lines of an index

2009-05-20 Thread Timon Roth
Dear list, I want to add an entry to an index with a custom synonym list. For example, with the following text: [i worry about nothing because this world is crazy] and I want to add the two custom synonyms [anything]=>[nothing] and [lazy]=>[crazy] so that a search for lazy, crazy not

confusion with question mark

2009-05-20 Thread Timon Roth
Dear list, I'm searching through a Lucene (2.9) index built with the GermanAnalyzer (from the analyzers 2.9 package). When I search for the word deutschland (the query, parsed with the German analyzer, is transformed to deutschla) I get a few hits. When I search for deu?schland I get no results, bec

Re: RangeQuery & TooManyClausesException : Lucene 2.4

2009-05-20 Thread Michael McCandless
Woops -- disregard my comments. I was looking at the unreleased (2.9-dev) version of RangeQuery. In 2.4, RangeQuery will throw TooManyClauses, if the number of terms in the range exceeds BooleanQuery's maxClauseCount. ConstantScoreRangeQuery will not throw that exception. Mike On Wed, May 20, 2

RE: RangeQuery & TooManyClausesException : Lucene 2.4

2009-05-20 Thread Zhang, Lisheng
Hi, I did not see a setConstantScoreRewrite method in the RangeQuery class? Best regards, Lisheng -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, May 20, 2009 11:10 AM To: java-user@lucene.apache.org Subject: Re: RangeQuery & TooManyClause

Re: RangeQuery & TooManyClausesException : Lucene 2.4

2009-05-20 Thread Michael McCandless
Hmm... that's actually not true: RangeQuery will still throw that exception, unless you call setConstantScoreRewrite with true (at which point it does the same thing as ConstantScoreRangeQuery, i.e. that exception will not be thrown). The javadoc for RangeQuery is very misleading. (This happened when

RangeQuery & TooManyClausesException : Lucene 2.4

2009-05-20 Thread Joel Halbert
Hi, Looking at the docs for the 2.4 codebase, for RangeQuery http://lucene.apache.org/java/2_4_0/api/index.html?org/apache/lucene/search/RangeQuery.html there is a comment that a TooManyClauses exception is no longer thrown. Does this mean that it is now safe to use RangeQuery without worrying a

Performing Asynchronous Search

2009-05-20 Thread Amin Mohammed-Coleman
Hi All, this may not be a question for this mailing list but I wasn't sure where to start. Please accept my apologies if anyone thinks that this is not the appropriate place for this question. I am currently working on building a proof of concept search solution for my company using Lucene and Hi

Re: How to create a new index

2009-05-20 Thread KK
Thank you ag...@john. This is even better. I don't have to bother about the 3rd argument, right? I'll use the same one every time, both for registering a new core and for adding docs to an existing one. Thanks, KK. On Wed, May 20, 2009 at 6:54 PM, John Byrne wrote: > Hi KK, > > You're welcome!

Re: How to create a new index

2009-05-20 Thread John Byrne
Hi KK, You're welcome! BTW, I had a quick look at the Javadoc for IndexWriter and noticed this constructor: public IndexWriter(Directory d, Analyzer a) "Constructs an IndexWriter for the index in d, first creating it if it does not already exist." I think that might solve your problem and

Re: How to create a new index

2009-05-20 Thread Erick Erickson
Unless something about your problem space *requires* that you reopen the index, you're better off just opening it once, writing all your documents to it, then closing it. Although what you're doing will work, it's not very efficient. And the same thing is *especially* true of the searcher. There's

Re: How to create a new index

2009-05-20 Thread KK
Thanks a lot @John. That solved the problem and the other advice is really helpful. I'd have stumbled over that otherwise. This clarifies my doubt: every time I have to create a new index, I just call the IndexWriter with "true", thereby creating the directory, then start adding docs with "false" as th

Re: How to create a new index

2009-05-20 Thread John Byrne
I think the problem is that you are creating a new index every time you add a document: IndexWriter writer = new IndexWriter(trueIndexPath, new StandardAnalyzer(), true); The last argument, the boolean 'true', tells IndexWriter to overwrite any existing index in that directory. If you set that

Re: Searching index problems with tomcat

2009-05-20 Thread Matthew Hall
Right, so again, you are opening your index by reference there. Your application has to assume that the index it's looking for exists in the same directory where the application itself lives. Since you are deploying this application as a deployable war file that's not going to work really wel

Re: Searching index problems with tomcat

2009-05-20 Thread Ian Lea
Marco, you haven't answered Matt's question about where you are running it from. Tomcat's default directory may well not be the same as yours. I strongly suggest that you use a full path name and/or provide some evidence that your readers and writers are using the same directory and thus lucene i
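
The advice about full path names can be illustrated with plain Java. A relative path is resolved against the JVM's working directory (the user.dir system property), which under a servlet container is usually not the directory the developer has in mind. The relative name "index" and the class name below are hypothetical placeholders, a sketch rather than the code from this thread.

```java
import java.io.File;

// Hypothetical illustration class; "index" is a placeholder directory name.
final class IndexPathSketch {

    // Resolve a path the same way the JVM will when the file is actually opened.
    static File resolve(String path) {
        return new File(path).getAbsoluteFile();
    }

    public static void main(String[] args) {
        // The working directory differs between an IDE, a shell, and a container.
        System.out.println("working dir: " + System.getProperty("user.dir"));

        // A relative path silently follows the working directory.
        System.out.println("relative:    " + resolve("index"));
    }
}
```

Logging the resolved path from both the indexing and the searching code is a cheap way to gather the evidence asked for above.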

Re: How to create a new index

2009-05-20 Thread KK
Thank you very much. I'm using the one mentioned by @Anshum, but the problem is that after indexing some number of docs what I see is only the last one indexed, which clearly indicates that the index is getting overwritten. I'm posting my simple indexer and searcher herewith. Actually I'm trying to craw

Re: How to create a new index

2009-05-20 Thread Anshum
Hi KK, Easier still, you could just open the IndexWriter with the last (3rd) argument as true; this way the IndexWriter would create a new index as soon as you start indexing. Also, if you just leave the IndexWriter without the 3rd argument, it'd conditionally create a new directory i.e. only if

Re: How to create a new index

2009-05-20 Thread John Byrne
You can do this with pure Java. Create a file object with the path you want, check if it exists, and if not, create it: File newIndexDir = new File("/foo/bar"); if (!newIndexDir.exists()) { newIndexDir.mkdirs(); } The 'mkdirs()' method creates any necessary parent directories. If you want t
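
The same idea as a self-contained sketch, with the return value of mkdirs() checked so that a failure does not go unnoticed. The class name, method name, and the "demo-index" directory name are hypothetical; only java.io.File from the JDK is used.

```java
import java.io.File;

// Hypothetical helper for illustration; not taken from the original message.
final class IndexDirSketch {

    // Create the directory (and any missing parents) if it does not exist yet.
    static File ensureDir(File dir) {
        if (!dir.exists() && !dir.mkdirs()) {
            throw new IllegalStateException("Could not create " + dir);
        }
        return dir;
    }

    public static void main(String[] args) {
        // Placeholder location under the system temp directory.
        File indexDir = new File(System.getProperty("java.io.tmpdir"), "demo-index/a/b");
        System.out.println(ensureDir(indexDir).getAbsolutePath());
    }
}
```

Calling ensureDir a second time on the same path is harmless, since the existence check short-circuits before mkdirs() runs.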

How to create a new index

2009-05-20 Thread KK
How do I create a new index? Every time I need to do so, I have to create a new directory and put the path to that, right? How do I automate the creation of the new directory? I'm a new user of Lucene. Please help me out. Thanks, KK.

Re: Searching index problems with tomcat

2009-05-20 Thread Marco Lazzara
I've posted the indexing part, but I don't use this in my app. After I create the index, I put it in a folder like /home/marco/RDFIndexLucece and when I run the query I'm only searching (and not indexing). String[] fieldsearch = new String[] {"name", "synonyms", "propIn"}; //RDFinder rdfind = n

Re: Getting a score of a specific document

2009-05-20 Thread liat oren
Ok, I understand. I will use the HitCollector. Thanks a lot for all the explanations! Best, Liat 2009/5/18 Erick Erickson > As best I understand it, you DO NOT WANT A FILTER. Filters do not contribute > to scoring, therefore do not rank your documents. If you use > a filter, the most irrelevant do

Re: Using Luke on a Lucene Index in a Database

2009-05-20 Thread ChristophD
Ok, so let me clear it up. Lucene offers different types of Directories (org.apache.lucene.store.Directory) into which it stores the index data. Most people probably use the FSDirectory implementation which writes the index data as files into the filesystem. However, we use the DbDirectory implem