I didn't want to let this drop on the floor, but I haven't had the
time to craft a response to it either. So, just for the record, I agree
that transactions would be nice. I think that it is important that the
solution address change visibility and concurrent transactions within
multiple
Why does it seem to you that C# is faster than Java?
In any case, generally the bottleneck isn't the VM. It's the I/O to
the disks...
Scott
The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on t
You can use:
BooleanQuery.setMaxClauseCount(int maxClauseCount);
to increase the limit.
On Sep 30, 2004, at 8:24 PM, Chris Fraschetti wrote:
I recently read in regards to my problem that date_field:[0820483200
TO 110448]
is evaluated into a series of boolean queries ... which has a cap of
1024.
Well, I do like the *, but apparently there are some people that are
using this with the null...
Scott
On Jun 10, 2004, at 7:15 PM, Erik Hatcher wrote:
On Jun 10, 2004, at 4:54 PM, Scott ganyo wrote:
It looks to me like Revision 1.18 broke it.
It seems this could be it:
revision 1.18
date: 2002
It looks to me like Revision 1.18 broke it.
On Jun 10, 2004, at 3:26 PM, Erik Hatcher wrote:
On Jun 10, 2004, at 4:07 PM, Terry Steichen wrote:
Well, I'm using 1.4 RC3 and the "null" range upper limit works just
fine for
searches in two of my fields; one is in the form of a canonical date
(eg,
2
At one point it definitely supported null for either term. I think
that has been removed/forgotten in the later revisions of the
QueryParser...
Scott
On Jun 10, 2004, at 1:24 PM, Erik Hatcher wrote:
On Jun 10, 2004, at 2:13 PM, Terry Steichen wrote:
Actually, QueryParser does support open-ended
I don't buy it. HashSet is but one implementation of a Set. By
choosing the HashSet implementation you are not only tying the class to
a hash-based implementation, you are tying the interface to *that
specific* hash-based implementation or its subclasses. In the end,
either you buy the con
I have. While document.add() itself doesn't increase over time, the
merge does. Ways of partially overcoming this include increasing the
mergeFactor (but this will increase the number of file handles used),
or building blocks of the index in memory and then merging them to
disk. This has bee
I am willing as well.
Scott
On Jan 29, 2004, at 12:04 PM, Boris Goldowsky wrote:
Strangely, the web site does not seem to list any vendors who provide
incident support for Lucene. That can't be right, can it?
Can anyone point me to organizations that would be willing to provide
support for Luce
No, you don't need required or prohibited, but you can't have both.
Here is a rundown:
* A required clause will allow a document to be selected if and only if
it contains that clause and will exclude any documents that don't.
* A prohibited clause will exclude any documents that contain that
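The semantics of required and prohibited clauses described above can be sketched with plain sets. This is an illustrative model, not Lucene's implementation; the class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical model (not Lucene code) of required/prohibited clause
// semantics: each "document" is just the set of terms it contains.
public class ClauseDemo {
    static boolean matches(Set<String> doc,
                           Set<String> required,
                           Set<String> prohibited) {
        // A required clause excludes any document that lacks it.
        if (!doc.containsAll(required)) return false;
        // A prohibited clause excludes any document that contains it.
        for (String term : prohibited) {
            if (doc.contains(term)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> doc = Set.of("lucene", "search");
        System.out.println(matches(doc, Set.of("lucene"), Set.of()));
        System.out.println(matches(doc, Set.of("lucene"), Set.of("search")));
    }
}
```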
I don't think adding extensive locking is necessary. What you are
probably experiencing is that you've closed the index before you're done
using it. If you aren't careful to close the index only after all
searches on it have been completed, you'll get an error like this.
Scott
[EMAIL PROTECT
Offhand, I would say that using 2 directories and merging them is
exactly what you want. It really shouldn't be all that complicated and
Lucene should handle the synchronization for you...
Scott
Dror Matalon wrote:
Hi folks,
We're in the process of adding search to our online RSS aggregator.
Hi Eugene,
Yes. Doug (Cutting) added this to eliminate OutOfMemory errors that
apparently some people were having. Unfortunately, it causes
backward-compatibility issues if you were used to using version 1.2.
So, you'll need to add a call like this:
BooleanQuery.setMaxClauseCount(Integer.MA
Yes. You can (and should for best performance) reuse an IndexSearcher
as long as you don't need access to changes made to the index. An open
IndexSearcher won't pick up changes to the index, so if you need to see
the changes, you will need to open a new searcher at that point.
Scott
Aviran M
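The reuse pattern described above can be sketched with a small holder class: keep one searcher open until the index actually changes, then reopen. This is a generic sketch with hypothetical names, not the Lucene API; a String stands in for the expensive searcher object.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch (hypothetical names, not Lucene API): reuse one
// "searcher" until the index version changes, then open a fresh one.
public class SearcherCache {
    private final AtomicLong indexVersion = new AtomicLong(0);
    private long openedAtVersion = -1;
    private String searcher; // stand-in for an expensive IndexSearcher

    // Called by the writer after committing changes to the index.
    public void markIndexChanged() { indexVersion.incrementAndGet(); }

    // Returns a cached searcher; reopens only when the index has changed.
    public synchronized String getSearcher() {
        long current = indexVersion.get();
        if (searcher == null || openedAtVersion != current) {
            searcher = "searcher@v" + current; // pretend this is expensive
            openedAtVersion = current;
        }
        return searcher;
    }
}
```

The point of the sketch is the contract, not the mechanism: readers see a stale but consistent view until they ask for a new searcher.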
Be careful with option 1. NFS and the Lucene file-based locking
mechanism don't get along very well. (See the archives for details...)
Scott
Lienhard, Andrew wrote:
I can think of three options:
1) Single index dir on a shared drive (NFS, etc.) which is mounted on each
app server.
2)
Do these implementations maintain file compatibility with the Java version?
Scott
Erik Hatcher wrote:
I'd love to see there be quality implementations of the Lucene API in
other languages, that are up to date with the latest Java codebase.
I'm embarking on a Ruby port, which I'm hosting at rub
Nifty cool! I'm gonna like this, I can tell already!
I'm having a really hard time actually using Luke, though, as all the
window panes and table columns are apparently of fixed size. Do you
think you could throw in the ability to resize the various window
panes and table columns? This wou
Wonderful! I can't wait to try this. I'll try to provide some
comparisons as I get to it, but I'd love to hear from anyone else that
tries this...
Thanks,
Scott
Francesco Bellomi wrote:
Hi,
I developed a Directory implementation that accesses an index stored on the
filesystem using memory-ma
FYI: The best thing I've found for both increasing speed and reducing
file handles is to use an IndexWriter on a RamDirectory for indexing and
then use IndexWriter.addIndexes() to write the result to disk. This is
subject to the amount of memory you have available, of course...
Scott
Armbrust,
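The buffer-in-memory-then-merge idea above can be sketched generically: accumulate cheap in-memory adds and pay for one disk merge per batch. This is a hypothetical stdlib illustration of the pattern, not Lucene's RAMDirectory code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Generic sketch of the pattern (not Lucene code): buffer "documents"
// in memory and merge each full batch to the on-disk store, trading
// memory for fewer file operations and file handles.
public class BatchedWriter {
    private final List<String> ramBuffer = new ArrayList<>();
    private final Path diskIndex;
    private final int maxBuffered;

    public BatchedWriter(Path diskIndex, int maxBuffered) {
        this.diskIndex = diskIndex;
        this.maxBuffered = maxBuffered;
    }

    public void addDocument(String doc) throws IOException {
        ramBuffer.add(doc);                       // cheap in-memory add
        if (ramBuffer.size() >= maxBuffered) flush();
    }

    public void flush() throws IOException {      // one disk merge per batch
        if (ramBuffer.isEmpty()) return;
        Files.write(diskIndex, ramBuffer,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        ramBuffer.clear();
    }
}
```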
+1. Support for transactions in Lucene is high on my list of desirable
features as well. I would love to have time to look into adding this,
but lately... well, you know how that goes.
Scott
Eric Jain wrote:
If you want to update a set of documents, you can remove their previous
version firs
We generally optimize only after a full index (re-)build or during
periods where the index is not being used.
Scott
Leo Galambos wrote:
An unoptimized index is not a problem for document additions; they take
constant time, regardless of the size of the index and regardless of
whether the index is
Not each time you search, but if you've modified the index since you
opened the searcher, you need to create a new searcher to get the changes.
Scott
Rob Outar wrote:
There is a reloading issue but I do not think lastModified is it:
static long lastModified(Directory directory)
Retur
It just marks the record as deleted. The record isn't actually removed
until the index is optimized.
Scott
Rob Outar wrote:
Hello all,
I used the delete(Term) method, then I looked at the index files,
only one
file changed "_1tx.del" I found references to the file still in some
of the
ind
I'm rather partial to Jini for distributed systems, but I agree that
JXTA would definitely be the way to go on this type of peer-to-peer
scenario.
Scott
[EMAIL PROTECTED] wrote:
I'll be doing something very similar some time in the next 12 months for
the project I'm working on. I'll be more th
n imagine how this improves the avg efficiency in my case if i
have 1 terms in "references". although i may be doing something
that was either not intended or ill-designed.
thanks, any thoughts?
alex
On Mon, 2002-11-11 at 10:50, Scott Ganyo wrote:
>Hi Alex,
>
>I just looked
Hi Alex,
I just looked at this and had the following thought:
The RangeQuery must continue to iterate after the first match is found
in order to match everything within the specified range. In other
words, if you have a range of "a" to "d", you can't stop with "a", you
need to continue to "d"
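The iteration Scott describes can be sketched with a sorted set: matching a range means walking every term from the lower bound through the upper bound, not stopping at the first hit. This is an illustrative stdlib sketch, not Lucene's RangeQuery implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Sketch of why a range query keeps iterating: with terms kept in
// sorted order, matching ["a" TO "d"] means visiting every term from
// the first match up to the upper bound.
public class RangeScan {
    public static List<String> termsInRange(TreeSet<String> terms,
                                            String lo, String hi) {
        // subSet walks the sorted terms from lo to hi, both inclusive.
        return new ArrayList<>(terms.subSet(lo, true, hi, true));
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(
                Arrays.asList("a", "apple", "banana", "cherry", "d", "zebra"));
        // Every term up to "d" is visited; "zebra" is excluded.
        System.out.println(termsInRange(terms, "a", "d"));
    }
}
```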
Actually, 10k isn't very large. We have indexes with more than 1M
records. It hasn't been a problem.
Scott
Tim Jones wrote:
Hi,
I am currently starting work on a project that requires indexing and
searching on potentially thousands, maybe tens of thousands, of text
documents.
I'm hoping tha
This sounds like an excellent start and would certainly be useful in a
number of scenarios, but it is not quite as generally useful as it could be
given its asynchronous nature. Generally expected database behavior is that
when a change is committed (and not before) it is immediately viewable in
Cool. But instead of adding a new class, why not change Hits to inherit
from Filter and add the bits() method to it? Then one could "pipe" the
output of one Query into another search without modifying the Queries...
Scott
> -Original Message-
> From: Doug Cutting [mailto:[EMAIL PROTECT
thing? It would seem that if there was an efficient implementation
of a forked file, perhaps that could be used instead of the set of files
that Lucene currently uses to represent a segment.
Scott
> -Original Message-
> From: Scott Ganyo [mailto:[EMAIL PROTECTED]]
> Sent: Tuesda
Are you closing the searcher after each when done?
No: Waiting for the garbage collector is not a good idea.
Yes: It could be a timeout on the OS holding the files handles.
Either way, the only real option is to avoid thrashing the searchers...
Scott
> -Original Message-
> From: Hang
Yup. Cache and reuse your Searcher as much as possible.
Scott
> -Original Message-
> From: Hang Li [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 23, 2002 9:59 AM
> To: Lucene Users List
> Subject: Too many open files?
>
>
> >
>
> I have seen a lot of postings about this topic. Any fi
done with them
rather than allowing finalization to take care of it.
Scott
> -Original Message-
> From: Doug Cutting [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 16, 2002 11:56 AM
> To: Lucene Users List
> Subject: Re: CachedSearcher
>
>
> Scott Ganyo w
I'd like to see the finalize() methods removed from Lucene entirely. In a
system with heavy load and lots of gc, using finalize() causes problems. To
wit:
1) I was at a talk at JavaOne last year where the gc performance experts
from Sun (the engineers actually writing the HotSpot gc) were givin
Deadlocks could be created if the order in which locks are obtained is not
consistent. Note, though, that the locks are obtained in the same order
each time throughout. (BTW: The inner lock is merely needed because the
wait/notify calls need to own the monitor.)
Naturally, you are free to make
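The invariant described above can be sketched in plain Java: deadlock between two locks is impossible as long as every code path acquires them in the same global order. This is an illustrative sketch of the principle, not the Lucene code under discussion.

```java
// Sketch of consistent lock ordering: every thread takes 'outer' first,
// then 'inner', never reversed -- so two threads can never each hold one
// lock while waiting for the other.
public class LockOrdering {
    private final Object outer = new Object();
    private final Object inner = new Object();
    private int value = 0;

    public void update() {
        synchronized (outer) {
            synchronized (inner) {
                value++;
            }
        }
    }

    public int read() {
        synchronized (outer) {
            synchronized (inner) {
                return value;
            }
        }
    }
}
```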
You are correct. Actually, there have been a few bug fixes since that was
posted. Here's a diff to an updated version:
@@ -19,11 +19,21 @@
*/
public class IndexAccessControl
{
- public static final Analyzer LUCENE_ANALYZER = new LuceneAnalyzer();
+ private static Analyzer s_defa
'm going to need some
> insider
> help to get through this one.
>
> N.
>
> -Original Message-
> From: Scott Ganyo [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, June 26, 2002 7:15 PM
> To: 'Lucene Users List'
> Subject: RE: Stress Testing Lucene
>
>
1) Are you sure that the index is corrupted? Maybe the file handles just
haven't been released yet. Did you try to reboot and try again?
2) To avoid the too-many files problem: a) increase the system file handle
limits, b) make sure that you reuse IndexReaders as much as you can rather
across r
Use the java -Xmx option to increase your heap size.
Scott
> -Original Message-
> From: Nader S. Henein [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, June 13, 2002 12:20 PM
> To: [EMAIL PROTECTED]
> Subject: Boolean Query + Memory Monster
>
>
>
> I have 1 Geg of memory on the machine w
Actually, [] denotes an inclusive range of Terms. Anyway, why not change
the syntax if this is bad...?
Scott
> -Original Message-
> From: Brian Goetz [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, February 20, 2002 10:08 AM
> To: Lucene Users List
> Subject: Re: Queryparser croaking on "
+1
> -Original Message-
> From: Matt Tucker [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, January 22, 2002 11:06 AM
> To: 'Lucene Users List'
> Subject: RE: JDK 1.1 vs 1.2+
>
>
> Hey all,
>
> I'd just like to chime in support for dropping JDK 1.1,
> especially if it
> would aid i18n in
We use Lucene extensively as a core part of our ASP product here at
eTapestry. In fact, we've built our database query engine on top of
it. We have been extremely pleased with the results.
Scott
Jeff Kunkle asks:
> Does anyone know of any companies or agencies using Lucene for their
> products
I think something like this would be a HUGE boon for us. We do a lot of
complex queries on a lot of different indexes and end up suffering from
severe garbage collection issues on our system. I'd be willing to help out
in any way to make this issue go away as soon as possible.
Scott
> -Ori
an I get Doug's example of indexing in memory and then
> writing it out
> to disk? I just recently subscribed to this list and I can't
> find it in the
> archive.
>
> Thanks.
> Paul
>
> -Original Message-
> From: Scott Ganyo [mailto:[EMAIL
ft side
of a BooleanQuery subtract. Sure, it works, but it ain't pretty...
Scott
> -Original Message-
> From: Doug Cutting [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, November 01, 2001 10:49 AM
> To: 'Lucene Users List'
> Subject: RE: Problems with prohibi
Yes. You have too many open files. There are a few things you can try. 1)
Increase the number of file handles your system has available. Yes, there
is a setting for this in Windows. 2) Make sure that you have the
IndexWriter.maxMergeDocs set to Integer.MAX_VALUE (the default). 3) Try
smalle
e anything about the range query in the syntax BNF.
>
> In regards to the exception, I would expect that searching on
> the query "[]"
> or "name:[]" would either find all documents or no documents,
> not throw an
> exception?
>
>
> -Original Mes
How difficult would it be to get BooleanQuery to do a standalone NOT, do you
suppose? That would be very useful in my case.
Scott
> -Original Message-
> From: Doug Cutting [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, October 31, 2001 2:36 PM
> To: 'Lucene Users List'
> Subject: RE: Pro
[ and ] are used for RangeQuery. They indicate an inclusive range. For
example: "name:[adam-scott]"
> -Original Message-
> From: Paul Friedman [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, October 31, 2001 2:03 PM
> To: '[EMAIL PROTECTED]'
> Subject: Brackets in query syntax?
>
>
> Ar
Message-
> From: Doug Cutting [mailto:[EMAIL PROTECTED]]
> Sent: Friday, October 19, 2001 9:33 PM
> To: Scott Ganyo; '[EMAIL PROTECTED]'
> Subject: RE: new Lucene release: 1.2 RC2
>
>
> > From: Scott Ganyo [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, Octobe
Oops... and the WildcardQuery issues that Robert Lebowitz just reported.
> -Original Message-
> From: Scott Ganyo [mailto:[EMAIL PROTECTED]]
> Sent: Friday, October 19, 2001 5:28 PM
> To: 'Doug Cutting'; '[EMAIL PROTECTED]'
> Subject: RE: new Lucene rel
Well, we know of at least two issues:
1) RAMDirectory not merging properly (reported by me)
2) Indexes left in an inconsistent state on crash (i don't remember who
reported this)
Are these to be left as known issues for 1.2?
Thanks,
Scott
> -Original Message-
> From: Doug Cutting [mail
Not sure about the rest, but if you've stored your dates in yyyymmdd format,
you can use a RangeQuery like so:
dateField:[20011001-null]
This would return all dates on or after October 1, 2001.
Scott
> -Original Message-
> From: W. Eliot Kimber [mailto:[EMAIL PROTECTED]]
> Sent: Tuesda
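The reason the yyyymmdd convention works with a term range is that zero-padded year-month-day strings compare lexicographically in the same order as the dates themselves. A minimal sketch (hypothetical class name, plain stdlib):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Why yyyymmdd works for range queries: zero-padded date strings sort
// lexicographically in chronological order, so a string range scan over
// the terms matches the intended date range.
public class DateTerms {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    public static String toTerm(LocalDate d) {
        return d.format(FMT);
    }

    public static void main(String[] args) {
        String a = toTerm(LocalDate.of(2001, 10, 1));   // "20011001"
        String b = toTerm(LocalDate.of(2001, 10, 15));
        // String order matches date order.
        System.out.println(a.compareTo(b) < 0);
    }
}
```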
> > P.S. At one point I tried doing an in-memory index using the
> > RAMDirectory
> > and then merging it with an on-disk index and it didn't work. The
> > RAMDirectory never flushed to disk... leaving me with an
> > empty index. I
> > think this is because of a bug in the mechanism that is
>
Thanks for the detailed information, Doug! That helps a lot.
Based on what you've said and on taking a closer look at the code, it looks
like by setting mergeFactor and maxMergeDocs to Integer.MAX_VALUE, an entire
index will be built in a single segment completely in memory (using the
RAMDirecto
We're having a heck of a time with too many file handles around here. When
we create large indexes, we often get thousands of temporary files in a
given index! Even worse, we just plain run out of file handles--even on
boxes where we've upped the limits as much as we think we can! We've played