On Wed, Feb 24, 2010 at 08:42:02AM -0500, Grant Ingersoll said:
> What would it be?
Adding, deleting and updating of individual fields in a document.
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For addit
On Thu, Nov 12, 2009 at 09:20:30PM +0100, Uwe Schindler said:
> Which version of Lucene are you using and which Version constant do you pass
> to Analyzer and Query Parser? In 2.9.0 there was a bug/incorrect setting
> between the query parser and the Version.LUCENE_CURRENT / Version.LUCENE_29
> set
I have a document with the title "Here, there be dragons" and a body.
When I search for
Here, there be dragons
(no quotes)
with a title boost of 2.0 and a body boost of 0.8
I get the document as the first hit which is what I'd expect.
However, if I change the query to
"Here, there be dragons"
I know this is one of those "How long is a piece of string?" questions
but I'm curious as to the order of magnitude of indexing performance.
http://lucene.apache.org/java/docs/benchmarks.html
seems to indicate about 100-120 docs/s is pretty good for average sized
documents (say, an email or som
On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:
>
> When you previously saw corruption was it due to an OS or machine
> crash (or power cord got pulled)? If so, you were likely hitting
> LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
> at some point) but
I currently have a setup that indexes into RAM and then periodically
merges that into a disk-based index.
Searches are done from the disk-based index and deletes are handled by
keeping a list of deleted documents, filtering them out of search results
and applying the deletes to the index at merge tim
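A minimal, self-contained sketch of the delete-list half of that design, assuming each document carries a unique string id (the class and method names here are hypothetical, and the actual RAM-to-disk merge is omitted):

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Tracks ids deleted since the last merge. Search results are filtered
// against this set; at merge time the deletes are applied to the on-disk
// index and the set is cleared.
class PendingDeletes {
    private final Set<String> deletedIds = ConcurrentHashMap.newKeySet();

    void markDeleted(String id) {
        deletedIds.add(id);
    }

    // Drop hits whose id has a pending delete.
    List<String> filter(List<String> hitIds) {
        List<String> visible = new ArrayList<>();
        for (String id : hitIds) {
            if (!deletedIds.contains(id)) {
                visible.add(id);
            }
        }
        return visible;
    }

    // Called at merge time: hand back the ids to delete from the on-disk
    // index and forget them here.
    Set<String> drainForMerge() {
        Set<String> snapshot = new HashSet<>(deletedIds);
        deletedIds.removeAll(snapshot);
        return snapshot;
    }
}
```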
I'm looking at doing a system which looks something like this - I
have an IndexSearcher open on an on-disk index but all writes go to a
RAM-based IndexWriter. Periodically I do
1. Close IndexSearcher
2. Open new IndexWriter in same location
3. Use addIndexes with old
On Wed, Jul 25, 2007 at 05:49:41AM -0400, Michael McCandless said:
> Ahhh, OK. But do you have a segments_N file?
Yup.
> Yes, this is perfect. This is the "simple" option I described. The
> more complex option is to use a custom deletion policy which enables
> you to safely do backups (even i
On Wed, Jul 25, 2007 at 05:19:31AM -0400, Michael McCandless said:
> It's somewhat spooky that you have a write.lock present because that
> means you backed up while a writer was actively writing to the index
> which is a bit dangerous because if the timing is unlucky (backup does
> an "ls" but bef
On Wed, Jul 25, 2007 at 10:08:56AM +0100, me said:
> The data appears to be there - please tell me that I'm doing something
> stupid and I can recover from this.
It appears that by deleting the write.lock files everything has recovered.
Is this best practice? Have I just done something so terribly wr
We were affected by the great SF outage yesterday and apparently the
indexing machine crashed without being shut down properly.
I've taken a backup of the indexes which has the usual smattering of
write.lock, segments.gen, .cfs, .fdt, .fnm, .fdx etc. files and looks
to be about the right size.
I recently had a thought to do with DbDirectory - specifically, would it
be possible to use something like Oracle's inbuilt replication to have
multiple reader machines able to read the index with automatic
partitioning, redundancy and failover?
Also, what is performance like for DbDirectory
On Thu, May 24, 2007 at 09:28:30AM -0400, Erick Erickson said:
> If that's unacceptable, you can *still* open up a new reader in the
> background and warm it up before using it. "immediately" then
> becomes 5-10 seconds or so.
This is currently what I'm doing, using a list of previously performed
qu
I've built a Lucene system that gets rapidly updated - documents are
supposed to be searchable immediately after they've been indexed.
As such I have a Writer that puts new index, update and delete tasks
into a queue and then has a thread which consumes them and applies them
to the index using
I'm looking for some advice on dealing with malformed queries.
If a user searches for "yow!" then I get an exception from the query
parser. I can get round this by using QueryParser.escape(query) first
but then that prevents them from searching using other bits of the
query syntax such as "
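One common middle ground is to try the parse as-is and only fall back to escaping when a ParseException is thrown, so power users keep the full syntax while queries like "yow!" still work. Lucene's QueryParser.escape(String) does the escaping; the stand-in below just illustrates the idea (the exact set of special characters varies by Lucene version, so treat the SPECIAL string as an assumption):

```java
// Backslash-escape the characters Lucene's query syntax treats specially.
// This mirrors the spirit of QueryParser.escape(String); the precise
// character set below is an assumption and varies across Lucene versions.
class QueryEscaper {
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    static String escape(String query) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```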
On Tue, Apr 03, 2007 at 08:31:20AM -0400, Michael McCandless said:
> Optimize actually does its own flush before optimizing, so you don't
> need to call it yourself and in fact calling it after optimize will
> just be a harmless no-op.
Ah, that's good to know.
> You should be worried about this
I have an Indexer which inserts tasks onto a queue and then has a thread
which consumes the tasks (Index, Update or Delete) and executes them. If
the Indexer is shut down it stops the thread, waits until it's finished
its current task and then consumes any other tasks on the queue. Then it
runs
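A stripped-down sketch of that queue-and-worker shutdown pattern, with Runnable standing in for the Index/Update/Delete tasks (the names are illustrative, not from the original code):

```java
import java.util.concurrent.*;

// A single worker thread consumes tasks from a queue. shutdown() lets the
// worker finish its current task, drain anything still queued, then exit.
class TaskRunner {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private volatile boolean running = true;
    private final Thread worker = new Thread(() -> {
        // Keep going while we're live, or while queued tasks remain to drain.
        while (running || !queue.isEmpty()) {
            try {
                Runnable task = queue.poll(100, TimeUnit.MILLISECONDS);
                if (task != null) {
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    });

    TaskRunner() {
        worker.start();
    }

    void submit(Runnable task) {
        queue.add(task);
    }

    void shutdown() throws InterruptedException {
        running = false;   // signal: stop once the queue is drained
        worker.join();     // wait for the drain to complete
    }
}
```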
I've been reading through the spelling correction API and I'm confused.
It looks like you tell it the directory to hold the spelling correction
DB and then give it an IndexReader and a field to retrieve spelling
suggestions from.
But then I'd have to redo that operation every time a new document
On Fri, Dec 15, 2006 at 04:01:33PM +0530, Kapil Chhabra said:
> I have implemented such a feature. Just to add on to what Bhavin said,
> your results would be more relevant if you index only 2 & 3 token
> phrases and display a 3 token suggestion if the current search keyword
> consists of 2 toke
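Kapil's approach above amounts to extracting 2- and 3-token word shingles from the indexed text, each of which becomes a candidate suggestion phrase. A self-contained sketch of just the extraction step (inside an analyzer chain, Lucene's contrib ShingleFilter does the same job):

```java
import java.util.*;

// Break a text into 2- and 3-token shingles (word n-grams); each shingle
// would then be indexed as a candidate suggestion phrase.
class Shingles {
    static List<String> extract(String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int n = 2; n <= 3; n++) {
            for (int i = 0; i + n <= tokens.length; i++) {
                out.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + n)));
            }
        }
        return out;
    }
}
```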
Yahoo! has a search suggestion feature so that if you search for say
'shoes' then it also recommends
payless shoes, jordan shoes, aldo shoes, nike shoes, bakers shoes
and a bunch of others.
Has anyone built something like that in Lucene?
Simon
On Wed, Oct 04, 2006 at 01:55:06PM +, eks dev said:
> have you considered hadoop "light" messaging RPC, should have
> significantly smaller latencies than RMI
Yes, it's one of the things I'm looking at.
On Wed, Oct 04, 2006 at 08:14:38AM -0400, Haines, Ronald C. (LNG-DAY) said:
> I too am interested in learning more about a large scale distributed
> Lucene model.
I'm also building a large scale (billions of documents) Lucene index.
Preliminary experimentation with a RemoteSearch/ParallelMultiSear