Hi,
I was looking into Lucene in-memory indexes using RAMDirectory.
It also provides something called MMapDirectory.
I want the indexes to persist, so I want to go for FSDirectory. But to enhance
the searching capability, I need to put the indexes onto
RAM. Now, the problem is how I can synchronise
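To the question above: a minimal sketch of one common approach, assuming Lucene 3.0 and a hypothetical index path — build the index on disk with FSDirectory for persistence, then copy it into a RAMDirectory for searching. The copy is a snapshot, which is the heart of the synchronisation question.

```java
import java.io.File;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class RamCopySearch {
    public static void main(String[] args) throws Exception {
        // Persistent index on disk (path is hypothetical).
        Directory disk = FSDirectory.open(new File("/path/to/index"));
        // Copy the on-disk index into RAM; searches then hit memory only.
        // Note the copy is a snapshot -- to see new documents you must
        // re-copy (or reopen) after the on-disk index changes.
        Directory ram = new RAMDirectory(disk);
        IndexSearcher searcher = new IndexSearcher(IndexReader.open(ram));
        // ... run searches against 'searcher' ...
        searcher.close();
    }
}
```

In practice the OS file cache (or MMapDirectory) often makes an explicit RAM copy unnecessary, which may be the simpler answer here.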
Have you run it through a memory profiler yet? Seems the obvious next step.
If that doesn't help, cut it down to the simplest possible
self-contained program that demonstrates the problem and post it here.
--
Ian.
On Thu, Mar 4, 2010 at 6:04 AM, ajay_gupta ajay...@gmail.com wrote:
Erick,
I agree, memory profiler or heap dump or small test case is the next
step... the code looks fine.
This is always a single thread adding docs?
Are you really certain that the iterator only iterates over 2500 docs?
What analyzer are you using?
Mike
On Thu, Mar 4, 2010 at 4:50 AM, Ian Lea
Hi there, Could someone help me with the usage of DuplicateFilters. Here is
my problem
I have created a search index on book Id , title ,and author from a database
of books which fall under various categories. Some books fall under more
than one category. Now, when I issue a search, I get back
Hi,
I would like to submit SpanQueries in Luke. AFAIK this isn't doable out
of the box.
What would be the way to go? Replace the built-in QueryParser by e.g.
the xml-query-parser from the contrib section?
Thanks,
Rene
You'd probably get much more pertinent answers asking
on the CLucene, see:
http://sourceforge.net/apps/mediawiki/clucene/index.php?title=Support
Erick
On Thu, Mar 4, 2010 at 3:42 AM, suman.hol...@zapak.co.in wrote:
Hi,
I
I'm using NOT_ANALYZED because I have a list of text items to index
where some of the items are single words and some of the items are two
words or more with punctuation. My problem is that sometimes one of the
words in an item with two or more words matches one of the single text
items. That
I'm still struggling with your overall goal here, but...
It sounds like what you're looking for is an exact match
in some cases but not others? In which case you could
think about indexing the info: field in a second field and
adding a clause against *that* field for your phrase case.
Yep. PerFieldAnalyzerWrapper seems to have solved my problem.
Thanks,
Paul
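For reference, a rough sketch of the PerFieldAnalyzerWrapper approach Paul mentions (Lucene 3.0; the "exact" field name is hypothetical):

```java
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

public class Analyzers {
    // Tokenize most fields normally, but keep the hypothetical "exact"
    // field as a single token so multi-word items only match as a whole.
    public static PerFieldAnalyzerWrapper build() {
        PerFieldAnalyzerWrapper analyzer =
            new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
        analyzer.addAnalyzer("exact", new KeywordAnalyzer());
        return analyzer; // pass the same wrapper to IndexWriter and QueryParser
    }
}
```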
-Original Message-
From: java-user-return-45289-paul.b.murdoch=saic@lucene.apache.org
[mailto:java-user-return-45289-paul.b.murdoch=saic@lucene.apache.org
] On Behalf Of Erick Erickson
Sent: Thursday,
If the field you want to use for deduping is ISBN, create a
DuplicateFilter using whatever your ISBN field name is as the field
name and pass that to one of the search methods that takes a filter.
If your index is large I'd be worried about performance and would look
at deduping at indexing time.
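A sketch of Ian's suggestion, assuming the DuplicateFilter from the contrib/queries module (Lucene 2.9/3.0) and a hypothetical field name "isbn". Note the dedupe field must actually be indexed, not just stored:

```java
import org.apache.lucene.search.DuplicateFilter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;

public class DedupeSearch {
    // Keep only one hit per distinct value of the "isbn" field.
    public static TopDocs search(IndexSearcher searcher, Query query)
            throws Exception {
        DuplicateFilter dedupe = new DuplicateFilter("isbn");
        return searcher.search(query, dedupe, 10);
    }
}
```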
On 2010-03-04 14:13, Rene Hackl-Sommer wrote:
Hi,
I would like to submit SpanQueries in Luke. AFAIK this isn't doable out
of the box.
What would be the way to go? Replace the built-in QueryParser by e.g.
the xml-query-parser from the contrib section?
The upcoming Luke 1.0.1 will support this.
Hi Andrzej,
Thanks! I'll keep my eyes open for that.
FWIW, implementing this by replacing the QueryParser with the CoreParser
worked fine.
Thanks again,
Rene
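For anyone following along, a sketch of the CoreParser approach Rene describes (xml-query-parser contrib, Lucene 3.0; the field name and query file are hypothetical):

```java
import java.io.FileInputStream;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;
import org.apache.lucene.xmlparser.CoreParser;

public class XmlQueryDemo {
    public static void main(String[] args) throws Exception {
        // CoreParser understands the XML query syntax, including span
        // queries (SpanNear, SpanOr, ...) that the standard QueryParser lacks.
        CoreParser parser = new CoreParser("contents",
                new StandardAnalyzer(Version.LUCENE_30));
        Query q = parser.parse(new FileInputStream("spanquery.xml"));
        System.out.println(q);
    }
}
```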
Am 04.03.2010 16:22, schrieb Andrzej Bialecki:
On 2010-03-04 14:13, Rene Hackl-Sommer wrote:
Hi,
I would like to submit
I tried MultiTermQuery in combination with setRewriteMethod:
MultiTermQuery mtq = new WildcardQuery(new Term(FIELD, queryString));
mtq.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
Did you also use Lucene 3.0.0?
I used Lucene.Net 2.9.2. Didn't it work?
DIGY
-Original Message-
From: halbtuerderschwarze [mailto:halbtuerderschwa...@web.de]
Sent: Thursday, March 04, 2010 6:15 PM
To: java-user@lucene.apache.org
Subject: RE: FastVectorHighlighter truncated queries
I tried MultiTermQuery in
Dear All
Hope someone can help. I'm trying to run the demo's that came with Lucene
(3.0.0). I extracted the tar.gz to a directory /home/paul/bin/lucene-3.0.0
and changed into the directory. The contents of the directory are as
follows:
total 2288
-rw-r--r-- 1 paul paul 3759 2009-11-16
:I was wondering why TF method gets a float parameter. Isn't frequency
: always considered to be integer?
:
:public abstract float tf(float freq)
Take a look at how PhraseQuery and SpanNearQuery use tf(float).
For simple terms (and TermQuery) tf is always an integer, but when dealing
Dear All
Further to my previous email I notice I made a mistake with the second
example. When I entered the second command it actually read:
java -cp org.apache.lucene.demo.IndexFiles docs
This is what gave the strange error about the docs Class. If I issue
the correct command:
java
On 2010-03-04 17:56, Otis Gospodnetic wrote:
Andrzej,
Does that mean the regular Lucene QP will get Span query syntax support (vs.
having it in that separate Surround QP)? Or maybe that already happened and I
missed it? :)
I wish that were the case ;)
No, this simply means that you will
Thanks for the reply.
Actually what I'm looking for is to have a kind of fuzzy memberships for the
terms of a document. That is, for each term of a document, I will have a
membership value for that term and each term will be in each document, at
most once.
For that, I will need float TF and IDF
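One way to experiment with that (a sketch, not an endorsement): override Similarity so tf() interprets the frequency as a membership degree directly. The class name and scaling choices below are hypothetical; it assumes Lucene 3.0's DefaultSimilarity:

```java
import org.apache.lucene.search.DefaultSimilarity;

// Interpret the stored frequency as a fuzzy membership value
// instead of applying the default sqrt(freq) damping.
public class MembershipSimilarity extends DefaultSimilarity {
    @Override
    public float tf(float freq) {
        return freq; // hypothetical: use the raw value as the score factor
    }

    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f; // hypothetical: neutralize IDF if memberships carry it
    }
}
// Install with: searcher.setSimilarity(new MembershipSimilarity());
```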
Doesn't your classpath need the full path to the jar, not just
the containing directory?
On Thu, Mar 4, 2010 at 1:22 PM, Paul Rogers paul.roge...@gmail.com wrote:
Dear All
Further to my previous email I notice I made a mistake with the second
example. When I entered the second command it
Erick
What a star!! Hadn't thought of that. Assumed (always a mistake) that the
classpath only pointed to the directory. Using the following command:
java -cp
/home/paul/bin/lucene-3.0.0/lucene-core-3.0.0.jar:/home/paul/bin/lucene-3.0.0/lucene-demos-3.0.0.jar
org.apache.lucene.demo.IndexFiles
Not with Lucene 3.0.1.
Tomorrow I will try it with 2.9.2.
Arne
--
View this message in context:
http://old.nabble.com/FastVectorHighlighter-truncated-queries-tp27709797p27786722.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
I don't think it is related to the Lucene version.
Please inspect the C# code below: fragments1 has no highlight info; on the
other hand, fragments2 has one.
RAMDirectory dir = new RAMDirectory();
IndexWriter wr = new IndexWriter(dir, new
Another newcomer to Lucene here. I've got the Lucene web demo up and running
on my test server. The indexing and search functions are working perfectly.
The problem I'm running into regards the format of URLs to found objects.
For instance, Lucene will return a hit like this:
Hi Mike and others,
I have a test case for you (attached) that exhibits a file descriptor leak in
ParallelReader.reopen(). I listed the OS, JDK, and snapshot of Lucene that I'm
using in the source code.
A loop adds just over 4000 documents to an index, reopening the index after
each, before
On 03/04/2010 06:52 PM, Justin wrote:
Hi Mike and others,
I have a test case for you (attached) that exhibits a file descriptor leak in
ParallelReader.reopen(). I listed the OS, JDK, and snapshot of Lucene that I'm
using in the source code.
A loop adds just over 4000 documents to an index,
Has this changed since 2.4.1? Our application didn't explicitly close the old
reader with 2.4.1, and that combination never had this problem.
- Original Message
From: Mark Miller markrmil...@gmail.com
To: java-user@lucene.apache.org
Sent: Thu, March 4, 2010 6:00:02 PM
Subject: Re: File descriptor
That was always the same with reopen(). It's documented in the javadocs, with a
short example:
http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/index/IndexReader.html#reopen()
also in 2.4.1:
http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/index/IndexReader.html#reopen()
See my other mail for your file descriptor leak.
A short note about your search code:
You should not directly instantiate a TopScoreDocCollector but instead use the
Searcher method that returns TopDocs. This has the benefit that the searcher
automatically chooses the right parameter for
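In code, the advice above amounts to something like this (Lucene 3.0; the query and hit count are placeholders):

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopScoreDocCollector;

public class SearchStyle {
    // Manual collector: you must guess the in-order/out-of-order flag.
    static TopDocs verbose(IndexSearcher searcher, Query query) throws Exception {
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(query, collector);
        return collector.topDocs();
    }

    // Preferred: let the searcher pick the right collector settings itself.
    static TopDocs preferred(IndexSearcher searcher, Query query) throws Exception {
        return searcher.search(query, 10);
    }
}
```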
We must have been getting lucky. Thanks Mark and Uwe!
- Original Message
From: Uwe Schindler u...@thetaphi.de
To: java-user@lucene.apache.org
Sent: Thu, March 4, 2010 6:20:56 PM
Subject: RE: File descriptor leak in ParallelReader.reopen()
That was always the same with reopen(). Its
Sorry,
small change:
You should not directly instantiate a TopScoreDocCollector but instead
use the Searcher method that returns TopDocs. This has the benefit
that the searcher automatically chooses the right parameter for scoring
docs out of order / in order. In your example, search would be a little
Makes sense. Thanks for the tip!
I haven't seen a response to my 2-pass scoring question, so maybe I've asked at
least one difficult one. :-)
- Original Message
From: Uwe Schindler u...@thetaphi.de
To: java-user@lucene.apache.org
Sent: Thu, March 4, 2010 6:32:06 PM
Subject: RE:
I think you should check the index first, using Luke (lukeall) to see if there
are duplicate books.
On Thu, 04 Mar 2010 20:43:26 +0800, ani...@ekkitab ani...@ekkitab.com
wrote:
Hi there, Could someone help me with the usage of DuplicateFilters. Here
is
my problem
I have created a
Hi Ian,
Thanks for your reply. We had actually done what you had suggested first,
and it wasn't working, so I was hoping for some sample code. But then we
found out that the field name on which we wanted the duplicate filter to be
applied was not actually indexed while adding it into the
Hi Zhangchi
Thanks for your reply.
We have about 3 million records (different ISBNs) in the database, and
slightly more documents than that, and we wouldn't want to do the deduping at
indexing time, because one book (one ISBN) can be available under 2 or
more categories (like fiction, comics