Hi All,
Lucene scoring can favour shorter fields in documents, so in the
past we've had to pad out 'unreasonably' short fields to a set minimum
(with basically nonsense words). I'm wondering how others might have
dealt with this issue.
Another option is to have a custom Similarity class
: (with basically nonsense words), I'm wondering how others might have
: dealt with this issue.
:
: Another option is to have a custom Similarity class with an altered
: lengthNorm method?
That is what I would recommend ... it's exactly what SweetSpotSimilarity
does (you define a plateau of
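
To make the plateau idea concrete, here is a minimal sketch in plain Java with no Lucene dependency. The formula is modelled on the one documented for SweetSpotSimilarity; the class name, method names and the min/max/steepness values are hypothetical and chosen only for illustration, so check them against the Lucene version you actually run.

```java
public class PlateauLengthNorm {

    /** Classic length norm: shorter fields always score higher. */
    static double classicNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    /**
     * Plateau-style norm: every field whose length lies in [min, max]
     * gets the same norm of 1.0; the norm decays only outside that range.
     * The steepness value controls how fast it decays (assumed parameter).
     */
    static double plateauNorm(int numTerms, int min, int max, double steepness) {
        // Zero inside the plateau, grows with the distance from it outside.
        double distance = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return 1.0 / Math.sqrt(steepness * distance + 1.0);
    }

    public static void main(String[] args) {
        // Hypothetical plateau of 1..50 terms with steepness 0.5.
        for (int n : new int[] {1, 5, 50, 200}) {
            System.out.printf("%d terms: classic=%.3f plateau=%.3f%n",
                    n, classicNorm(n), plateauNorm(n, 1, 50, 0.5));
        }
    }
}
```

With a norm like this, a 1-term field and a 50-term field are treated alike, so there should be no need to pad short fields with nonsense words.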
Hi,
I have a huge number of documents which contain mainly numbers and dates
(German format dd.MM.), like this:
Tgr. gilt ab 01.01.99 01.01.99 01.01.99 01.01.99 01.01.99 01.01.99
01.01.99 01.01.99 01.01.99 01.01.99 01.01.99 01.01.99 46X0 01
0480101080512070010
I'm wondering if this is a problem that Lucene users have already
tackled. I have four copies of the application using a Lucene index.
They are located on two physical servers with two copies on each server
accessing two copies of the Lucene index. I use Windows FRS (File
Replication
: I think this would be too messy - currently we can be sure of the simple rule
: that documents added to the index get incrementally higher docids, i.e. the
: higher the docid the more recent is the document. I think it would be much
: simpler to write a FilterIndexReader that simply reverses
My index is only 4 MB. Is there a SQL backend for Lucene?
Russ
Michael McCandless wrote:
If you're able to tell Windows FRS which specific files to copy, then
SnapshotDeletionPolicy (in 2.3) should work for this.
It basically protects a consistent snapshot of your index, ensuring those
*How* do you want to search them? If it's simply exact matches, then
WhitespaceAnalyzer should work fine.
But if you want to, for example, look at date ranges or number
ranges, you'll have to be more clever.
What do you want to accomplish?
Best
Erick
On Feb 7, 2008 3:25 PM, [EMAIL PROTECTED]
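
As a rough illustration of the "more clever" handling that date ranges need: one common approach is to rewrite each date into a form whose string order matches its chronological order before indexing it. The sketch below is plain Java (java.time, so a much newer JDK than the one in use on this thread) and assumes the dates look like the dd.MM.yy values in the sample data. The class and method names are hypothetical, and the 1950..2049 window for two-digit years is an assumption that would have to match the real data.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class GermanDateTerms {

    // Assumption: two-digit years are mapped into the window 1950..2049.
    private static final DateTimeFormatter GERMAN_SHORT = new DateTimeFormatterBuilder()
            .appendPattern("dd.MM.")
            .appendValueReduced(ChronoField.YEAR, 2, 2, 1950)
            .toFormatter();

    /** Turns "01.01.99" into "19990101", which sorts chronologically as a string. */
    static String toSortableTerm(String germanDate) {
        LocalDate date = LocalDate.parse(germanDate, GERMAN_SHORT);
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    public static void main(String[] args) {
        System.out.println(toSortableTerm("01.01.99"));
        System.out.println(toSortableTerm("31.12.07"));
    }
}
```

Terms in this yyyyMMdd form compare correctly as plain strings, which is what a term-based range search relies on. For exact matches only, none of this is needed.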
If you're able to tell Windows FRS which specific files to copy, then
SnapshotDeletionPolicy (in 2.3) should work for this.
It basically protects a consistent snapshot of your index, ensuring
those files will not be deleted, while not blocking further updates
to the index.
Mike
Ruslan
No, FRS copies the whole directory. It's fairly fast, but if there is a
modification on both servers at the same time, there will be issues.
Russ
Michael McCandless wrote:
If you're able to tell Windows FRS which specific files to copy, then
SnapshotDeletionPolicy (in 2.3) should work for
With an index that small, I wonder why you bother with so many copies?
What kind of load are you hitting it with and how complex are the queries?
Because unless you have a *very* high query rate, I'd look at why my queries
were taking so long before complexifying things this way.
Best
Erick
On
Hi,
I want to create a function which takes a query string (in Lucene
syntax) and a content string, and returns whether the query matches
the content. This would mean,
query = +(apache) +(lucene OR httpd)
will match
content = HTTPD by Apache foundation is one of the most popular
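
To pin down the intended semantics, here is a toy sketch in plain Java. It does not parse Lucene query syntax and does not use Lucene at all: the query is passed in already split into required groups of alternative terms, and the class name, method names and lowercase tokenization are all assumptions made for illustration. Within Lucene itself, I believe the contrib MemoryIndex class is aimed at this kind of single-document matching and would be the place to look for a real implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class RequiredGroupsMatcher {

    /** Lowercases the content and splits it on anything that is not a letter or digit. */
    static Set<String> tokenize(String content) {
        String[] parts = content.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        return new HashSet<>(Arrays.asList(parts));
    }

    /**
     * Each inner list is one required (+) clause; the terms inside it are ORed.
     * The content matches only if every group contributes at least one term.
     */
    static boolean matches(List<List<String>> requiredGroups, String content) {
        Set<String> tokens = tokenize(content);
        for (List<String> group : requiredGroups) {
            boolean anyTermPresent = false;
            for (String term : group) {
                if (tokens.contains(term.toLowerCase(Locale.ROOT))) {
                    anyTermPresent = true;
                    break;
                }
            }
            if (!anyTermPresent) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hand-built equivalent of: +(apache) +(lucene OR httpd)
        List<List<String>> query = Arrays.asList(
                Arrays.asList("apache"),
                Arrays.asList("lucene", "httpd"));
        System.out.println(matches(query, "HTTPD by Apache foundation is one of the most popular"));
    }
}
```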