Not exactly sure what you're trying to do. You can easily generate a number
when you index each Document and insert it in a uid field (which is, BTW, what
I do), and if you base it on a timestamp plus some characteristic of the
document (which is also what I do), it should always be unique. As
Compared to what?
- Original Message -
From: Miguel Angel
To: [EMAIL PROTECTED]
Sent: Sunday, November 21, 2004 12:00 PM
Subject: disadvantages
What are the disadvantages of Lucene?
--
Miguel Angel Angeles R.
Asesoria en Conectividad y Servidores
Telf. 97451277
-
Kevin,
Sorry for the delay in replying. I think your idea for an external field
storage mechanism is excellent. I'd love to see it, and if I can, will be
willing to help make that happen.
Regards,
Terry
- Original Message -
From: Kevin A. Burton
To: Lucene Users List
Sent:
I think what Erik's asking is whether you can live with expressing your indexed date
in the form of YYYYMMDD, without the hour and minute extension. That will sharply
reduce the number of range query expansion terms. If you're using the timestamp as a
unique identifier, you might consider creat
I think what Sreedhar is asking for is the capability to form a "join" across multiple
indices - and if so, I could sure use that capability myself. However, I think
Lucene's logic focuses only on a single query, so I doubt if that's easily done.
- Original Message -
From: Otis Gos
ting is the padre of both Nutch and Lucene.
Otis
--- Terry Steichen <[EMAIL PROTECTED]> wrote:
> Otis,
>
> What's the relationship between Nutch and Lucene?
>
> Terry
> - Original Message -
> From: Otis Gospodnetic
> To: Lucen
Christoph,
Just curious - how are you currently using Term Vectors? They seem to be a neat
feature with lots of future promise, but I'm not sure how to best use them now.
Regards,
Terry
- Original Message -
From: Christoph Goller
To: Lucene Developers List
Sent: Thursday, Se
Otis,
What's the relationship between Nutch and Lucene?
Terry
- Original Message -
From: Otis Gospodnetic
To: Lucene Users List
Sent: Wednesday, September 15, 2004 7:29 AM
Subject: Re: Concurent operations with Lucene
Hello
Only 1 process can modify (add/delete) an ind
Jeez, Erik! Where's your sense of public spirit ;-)
Terry
PS: Glad to hear you're (finally!) nearing publication.
- Original Message -
From: Erik Hatcher
To: Lucene Users List
Sent: Tuesday, September 07, 2004 6:43 AM
Subject: Re: Lucene Book
On Sep 7, 2004, at 3:00 A
I suspect it has to do with the security restrictions of the applet, 'cause it doesn't
appear to be finding your Lucene jar file. Also, regarding the lock files, I believe
you can disable the locking stuff just for purposes like yours (read-only index).
Regards,
Terry
- Original Message
Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser. Is it possible
> to do so?
Closer inspection on the parsing:
<Boost> TOKEN : {
<NUMBER: (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT
}
where
<#_NUM_CHAR: [&q
< 1.0, no?
Otis
--- Terry Steichen <[EMAIL PROTECTED]> wrote:
> I can't get negative boosts to work with QueryParser. Is it possible
> to do so?
>
> TIA,
>
> Terry
>
>
>
I can't get negative boosts to work with QueryParser. Is it possible to do so?
TIA,
Terry
Luke runs just fine with 1.3.1. If you're using Windows, try highlighting
it with Windows Explorer, right-clicking on it, choosing the "Open with.."
menu option and selecting "javaw".
Regards,
Terry
- Original Message -
From: "Andrzej Bialecki" <[EMAIL PROTECTED]>
To: "Lucene Users Lis
+1
- Original Message -
From: "Eric Jain" <[EMAIL PROTECTED]>
To: "lucene-user" <[EMAIL PROTECTED]>
Sent: Friday, June 11, 2004 4:24 AM
Subject: NullAnalyzer
> There doesn't seem to be an Analyzer that doesn't do anything included
> with Lucene, is there? This would seem useful to prev
Speaking for myself, only a small number of my code modules currently treat
"null" as the open-ended range query term parameter. If the syntax change
from 'null' --> '*' was deemed otherwise desirable and the syntax transition
made very clearly, I could personally adjust to it without too much
dif
- Original Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Thursday, June 10, 2004 2:24 PM
Subject: Re: Open-ended range queries
> On Jun 10, 2004, at 2:13 PM, Terry Steichen wrote:
> > Ac
Actually, QueryParser does support open-ended ranges like : [term TO null].
Doesn't work for the lower end of the range (though that's usually less of a
problem).
Regards,
Terry
- Original Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Se
Erik,
When is "Lucene in Action" scheduled to be out?
Regards,
Terry
- Original Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Thursday, June 10, 2004 5:04 AM
Subject: Re: extensible query parser - Re: Proximity Searches behavior
This poses a couple of additional questions:
1) If you set the default slop factor in QueryParser to something greater
than 1, can you also use wildcards? (I ask that question because, to my
understanding, you can't combine the explicit proximity query syntax with
wildcards. That is, somethin
Nothing is wrong. When the maximum relevance score is greater than one, all
hit scores are normalized (making the highest score 1.0). When the maximum
score is less than 1, normalization does not occur. The more complex the
query, the more likely that the raw (non-normalized) score will be less
Prasad,
I think you'll have to provide more code so we can see what's actually going
on. BTW, I don't see you calling the UseCompoundFile method (unless you do
it inside indexFile/Directory) - I wonder if that could be an issue?
Regards,
Terry
PS: I run on XP/Pro just fine, so there's nothing
Erik,
Could you expand on this just a wee bit, perhaps with an example of how to
compute this vector angle?
TIA,
Terry
- Original Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, June 01, 2004 9:39 AM
Subject: Re: similarity
Just thought I'd pass on some info I just discovered. I've been successfully using
the CVS head version of Lucene as of about 2 months ago. I then got the formal
release (1.4-rc3) and tried it with my application, but it failed. I tried it with
some commandline test routines and they worked f
+1
- Original Message -
From: "Kevin Burton" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, May 18, 2004 2:43 PM
Subject: Internal full content store within Lucene
> Per the discussion the other day about storing content external to
> Lucene I think we h
> 1.3" work around, or to convert the anonymous inner classes
> to named inner classes?
>
> this is the only 1.4 dependency that I know of.
>
>
> > -Original Message-
> > From: Terry Steichen [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, May 12,
>
Sent: Wednesday, May 12, 2004 8:04 AM
Subject: Re: new Lucene release: 1.4 RC3
> I don't recall any JDK 1.4 methods/classes being used, and I just saw
> Doug replacing one AssertException (1.4) with RuntimeException.
>
> Are there some 1.4 dependencies I'm not aware of
I presume this still requires Java 1.4 to build, but will run with Java 1.3?
Regards,
Terry
- Original Message -
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, May 11, 2004 4:51 PM
Subject: new Lucene release: 1.4 RC3
> Version 1.4
Erik,
Maybe you could donate some of those demo modules (and the accompanying
article/text) to Lucene, so they'd be incorporated officially in the
website?
Regards,
Terry
- Original Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Fri
I think that if you include the indexing timestamp in the Document you
create when indexing, you could sort on this and only pick the first 100.
Regards,
Terry
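A minimal sketch of the sort-and-truncate idea above, in plain Java with no
Lucene calls. The Hit class, its indexedAt field and the firstN helper are
hypothetical names made up for illustration, and newest-first ordering is an
assumption; reverse the comparator if the oldest entries are wanted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class FirstByTimestamp {

    // Hypothetical stand-in for a search hit carrying the timestamp that was
    // stored in the Document at indexing time.
    public static final class Hit {
        public final String id;
        public final long indexedAt;

        public Hit(String id, long indexedAt) {
            this.id = id;
            this.indexedAt = indexedAt;
        }
    }

    // Returns at most n hits, newest first, without modifying the input list.
    public static List<Hit> firstN(List<Hit> hits, int n) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong((Hit h) -> h.indexedAt).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }
}
```

Calling FirstByTimestamp.firstN(hits, 100) would then give the 100 most
recently indexed hits, or all of them if there are fewer than 100.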
- Original Message -
From: "Alan Smith" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, April 27, 2004 8:02 AM
Subjec
Andrzej,
Sorry for misspelling your name. My Polish sucks.
Terry
- Original Message -
From: "Terry Steichen" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Thursday, April 22, 2004 7:56 PM
Subject: Re: Stemmer Benefits/Costs
&
>
Sent: Thursday, April 22, 2004 5:37 PM
Subject: Re: Stemmer Benefits/Costs
> Terry Steichen wrote:
>
> > I've been experimenting with the Porter and Snowball stemmers. It
> > seems to me that one of the most valuable benefits these provide is
> > the capability t
I've been experimenting with the Porter and Snowball stemmers. It seems to me that
one of the most valuable benefits these provide is the capability to generalize phrase
terms. As a very simple example, without the stemmer, I might need to include three
phrase terms in my query: "north korea",
"\n with message: " + e.getMessage());
}
}
}
----- Original Message -
From: "Terry Steichen" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, March 31, 2004 11:47 AM
Subject: Re: Wierd Search Behavior
> No, they're typos in the e-mail
MAIL PROTECTED]>
Sent: Wednesday, March 31, 2004 9:55 AM
Subject: Re: Wierd Search Behavior
> On Mar 31, 2004, at 9:49 AM, Terry Steichen wrote:
> > I'm experiencing some very puzzling search behavior. I am using the
> > CVS head I pulled about a week ago. I use the Sta
I'm experiencing some very puzzling search behavior. I am using the CVS head I pulled
about a week ago. I use the StandardAnalyzer and QueryParser. I have a collection of
XML documents indexed. One field is "subhead", and here's what I find with different
queries:
subhead:(missile defense)
Joachim,
I believe you'll have to replace the default Similarity class with one of
your own. Not sure exactly what the settings should be - maybe some other
list members can give you specifics. Otherwise, you'll probably have to
experiment with it.
Regards,
Terry
- Original Message -
inal Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, March 22, 2004 7:06 AM
Subject: Re: Final Hits
> How exactly would you take advantage of a subclassable Hits class?
>
>
> On Mar 21, 2
AIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, March 22, 2004 2:46 AM
Subject: Re: SpanXXQuery Usage
> Only in unit tests, so far.
>
> Otis
>
> --- Terry Steichen <[EMAIL PROTECTED]> wrote:
> > Is there any documentation (othe
Does anyone know why the Hits class is final (thus preventing it from being
subclassed)?
Regards,
Terry
Is there any documentation (other than that in the source) on how to use the new
SpanxxQuery features? Specifically: SpanNearQuery, SpanNotQuery, SpanFirstQuery and
SpanOrQuery?
Regards,
Terry
I tend to agree (but with the same uncertainty as to why I feel that way).
Regards,
Terry
- Original Message -
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, March 08, 2004 2:34 PM
Subject: Re: Sys properties Was: java.io.tmpdir as
Doug,
What you say makes a good deal of sense to me. Could you give us a relative
sense of the "slowness" of different operators?
Regards,
Terry
- Original Message -
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, February 17, 2004 1:1
nesday, January 21, 2004 2:04 PM
Subject: Re: Query Term Questions
> On Jan 21, 2004, at 1:07 PM, Terry Steichen wrote:
> > Unfortunately, using positive boost factors less than 1 causes the
> > parser to
> > barf the same as do negative boost factors.
>
> Are you sur
Morus,
Unfortunately, using positive boost factors less than 1 causes the parser to
barf the same as do negative boost factors.
Regards,
Terry
- Original Message -
From: "Morus Walter" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, January 21, 2004 10:5
OTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, January 21, 2004 9:31 AM
Subject: Re: Query Term Questions
> On Jan 20, 2004, at 10:22 AM, Terry Steichen wrote:
> > 1) Is there a way to set the query boost factor depending not on the
> >
By the silence, I gather that the answers to my questions are "no", "no" and
"no".
Regards,
Terry
- Original Message -
From: "Terry Steichen" <[EMAIL PROTECTED]>
To: "Lucene Users Group" <[EMAIL PROTECTED]>
Sent: Tuesday, J
1) Is there a way to set the query boost factor depending not on the presence of a
term, but on the presence of two specific terms? For example, I may want to boost the
relevance of a document that contains both "iraq" and "clerics", but not boost the
relevance of documents that contain only on
Maybe you're using wildcards (which cause the query to get expanded). Just
go in and set the varb to something very large (provided that doing so
doesn't give you an OutOfMemory error - which is why that limit was set).
HTH,
Terry
- Original Message -
From: "Karl Koch" <[EMAIL PROTECTED
I just aborted a re-indexing operation (because it was taking too much time - will run
it overnight instead). But I was surprised by what I found in the index directory,
which contained a total of 1,402 index files! It started out with 36 files with the
name of "_I9a.*", followed by groups of
ster than xerces, you might want to look at these. You might
> want to look at http://dom4j.org/.
>
> Dror
>
>
> >
> > Regards
> >
> > Scott
> >
> > -Original Message-
> > From: Terry Steichen [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday
Scott,
Here are some figures to use for comparison. Using the latest Lucene
release, I index about 200 similar-sized XML files at a time, on a Windows
XP machine (2GHz). First I create a new index, which adds the documents at
a rate of about 8 per second (I don't recall what the cpu % is during
Paul,
I just started using 1.3 final (labeled 1.4 RC1) and ran into a similar
problem (though I'm not using the compound file option). My code ran just
fine all the way through 1.3RC3, but with the latest release, the
reader.delete() threw a "Lock obtain timed out" IOException. What I finally
di
What kind of response is this? (e.g. "apparently so.") Is this a problem
or not?
Regards,
Terry Steichen
- Original Message -
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, October
Erik's analysis is comprehensive and useful. I think this example reflects
a common (and understandable) oversight - that wildcards do *not* work with
a phrase. Got caught on that many times myself. Also there may be
confusion about the format -> field:(term1 term2), in that the examples
provide
You can also use a RangeQuery. If you index the field of numeric data, say
'score', as a string, then you can do things like: score:[75 TO 80]. Only
extra work is that you need to pad the actual score with enough 0's (such
that 9 becomes 09, etc.) to cover the expected range.
Regards,
Terry
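A minimal sketch of the padding step, in plain Java with no Lucene calls. The
class and method names are made up for illustration; the only point is that
zero-padded strings sort in the same order as the numbers they represent,
which is what a string-based range needs.

```java
public class PaddedScore {

    // Left-pads a non-negative integer with zeros to a fixed width so that
    // string (lexicographic) order matches numeric order.
    public static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled here");
        }
        String digits = Integer.toString(value);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value does not fit in width " + width);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    // Builds an inclusive range expression in the field:[low TO high] syntax
    // mentioned above, padding both ends to the same width.
    public static String rangeQuery(String field, int low, int high, int width) {
        return field + ":[" + pad(low, width) + " TO " + pad(high, width) + "]";
    }
}
```

The same pad() call would have to be applied to the value at indexing time,
with the same width, so that the indexed strings and the query bounds line up.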
--
highest one is set to 1, and the
others are proportionately lower?
Regards,
Terry
- Original Message -----
From: "Terry Steichen" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Thursday, September 18, 2003 10:10 AM
Subject: Re: Lucene Sc
About a month ago, timeouts were added to Lock (and they seem to make a lot of good
sense). However, because of this enhancement, in using the latest CVS my application
broke - I keep getting the message "Lock obtain timed out". I looked through the
source in an attempt to figure out a quick w
nesday, September 17, 2003 11:15 PM
Subject: Re: Lucene Scoring Behavior
> Hmm. This makes no sense to me. Can you supply a reproducible
> standalone test case?
>
> Doug
>
> Terry Steichen wrote:
> > Doug,
> >
> > (1) No, I did *not* boost the pub_date field, eith
[EMAIL PROTECTED]>
Sent: Wednesday, September 17, 2003 5:51 PM
Subject: Re: Lucene Scoring Behavior
> Terry Steichen wrote:
> > 0.03125 = fieldNorm(field=pub_date, doc=90992)
> > 1.0 = fieldNorm(field=pub_date, doc=90970)
>
> It looks like the fieldNorm's are what dif
tion {
> if (term.field() == "date") {
>return 1.0f;
> } else {
>return super.idf(term, searcher);
> }
>}
> }
>
> Or you could just give date clauses of your query a very small boost
> (e.g., .0001) so that other clauses domina
I've run across some puzzling behavior regarding scoring. I have a set of documents
which contain, among others, a date field (whose contents is a string in the YYYYMMDD
format). When I query on the date 20030917 (that is, today), I get 157 hits, all of
which have a score of .23000652. If I u
I've often found the use of query-based boosting to be very beneficial. This is
particularly so when it's easy to identify the term that I want to stand out as a
primary selector.
However, I've come across quite a few other cases where it would be easier (and more
logical) to apply a negativ
sers List" <[EMAIL PROTECTED]>
Sent: Saturday, August 30, 2003 12:09 AM
Subject: Re: Keyword search with space and wildcard
> On Friday 29 August 2003 10:02, Terry Steichen wrote:
> > I agree. One problem, however, that new (and not-so-new) Lucene users face
> > is a
Tatu,
I agree. One problem, however, that new (and not-so-new) Lucene users face
is a learning curve when they want to get past the simplest and most obvious
uses of Lucene. For example, I don't think any of the docs mention the fact
that you can't combine a phrase and a wildcard query. Other t
load area hold it) ?
> >
> > Jan
> >
> > - Original Message -
> > From: "Lukas Zapletal" <[EMAIL PROTECTED]>
> > To: "Lucene Users List" <[EMAIL PROTECTED]>
> > Sent: Friday, August 29, 2003 12:14 PM
> > Subje
If I understand your issue correctly, I think what you're experiencing is
the fact that you can have a phrase query "hello world", or a wildcard query
+hell* +wor*, but you can't mix the two together. As far as I've found,
that's a basic limitation you just have to live with. (Of course, if
someo
I just switched to RC2 and found that a number of queries now don't work. (When I
switch back to RC1 they work fine.) Can't seem to figure out a pattern regarding
those that don't work versus those (the vast majority) that still work fine. I looked
in the RC2 source and noticed that the dates
usual question of what is
> >actually interesting: high frequency, low frequency or the mid range).
> >
> >Indexing would probably be quite expensive since Lucene doesn't seem to
> >support changes in the index, and the index for the terms would change
> >all the time
hange
> all the time. We haven't implemented it yet, but it shouldn't be hard to
> code. I just wouldn't expect good performance when indexing large
> collections.
>
> Peter
>
>
> Terry Steichen wrote:
>
> >Is it possible without extensive additional co
Is it possible without extensive additional coding to use Lucene to conduct a search
based on a document rather than a query? (One use of this would be to refine a search
by selecting one of the hits returned from the initial query and subsequently
retrieving other documents "like" the selected
--- Original Message -
From: "Venkatraman, Shiv" <[EMAIL PROTECTED]>
To: "'Terry Steichen '" <[EMAIL PROTECTED]>; "'Lucene Users List '"
<[EMAIL PROTECTED]>
Sent: Saturday, May 31, 2003 11:31 AM
Subject: RE: searching data indexed
Shiv,
Searching in Lucene is field-based. Thus you must specify the field to be
searched - the only 'exception' is that one field is defined as default. If
you want to search across multiple fields, I believe you must create a
concatenation of the individual fields into a single one during the i
Amit,
I don't exactly know what your problem is, but I'm using a configuration not
too different from yours with no problems - so at least you know it's
possible.
I have an index of about 125MB which I use on various machines, including an
old Windows98/SE 400MHz notebook. I used the default Mer
Probably tokenized 1234 as a string and treated '-' as a separator. See
previous discussion on "query".
Regards,
Terry
- Original Message -
From: "Lixin Meng" <[EMAIL PROTECTED]>
To: "'Lucene Users List'" <[EMAIL PROTECTED]>
Sent: Tuesday, March 25, 2003 9:16 PM
Subject: Tokenize negati
Arsineh,
There was some discussion on this list about this topic earlier. As I
recall, escaping a '-' doesn't work (for reasons I don't recall -
something about interaction of analyzer and tokenizer, I think). To handle
this for my own purposes, I believe I modified the QueryParser.jj source
+1
- Original Message -
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, March 10, 2003 10:38 AM
Subject: Indexing and searching database data
> Hello,
>
> Would anyone be interested in ability to use Lucene search on the data
fr
Samuel,
Not exactly sure of your question. But, if the path is known at the time of
indexing, you just insert it in the Document that is created as part of the
indexing. If you don't know the path till later, you might insert a partial
path at index time and add the exact location when you use i
bug, could you please provide a complete,
> self-contained test case? You could, for example, model this after the
> TestSimilarity class in the test code hierarchy.
>
> The lengthNorm(String,int) method is called when you index the document.
>
> Doug
>
> Terry Steichen
r that the new lengthNorm() method is being
called.
It's probably some silly goof, but I can't figure out where it is.
If you (or anyone else, of course) have any ideas/suggestions, I'd
appreciate them.
Regards,
Terry
- Original Message -----
From: "Terry Steichen" &l
Samir,
The size of the index depends on (a) the size of the documents, (b) the
number of fields per document, (c) the fields that are kept in the index.
The time taken to index depends on the same plus the characteristics of the
processor and storage i/o. With so many variables, I don't think the
are trying this anyway, and looking for ways to improve
> indexing times... Could you perhaps try to replace use of
> java.io.RandomAccessFile in FSDirectory implementation, with the
> attached implementation? It supposedly increases I/O throughput by
> orders of magnitude, by using part
Mike,
By way of comparison, I've got a collection of about 50,000 XML files, each
of which averages about 8K. It takes about 1.25 hours to index (on a 1.8GHz
machine). I use basically the standard configuration (mergeFactor, etc.)
and I've got about 30 fields per document. I add about 200 new o
ing it as an encoded
space). That would explain the behavior. I will confirm this later on
today.
Regards,
Terry
- Original Message -
From: "Terry Steichen" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Saturday, February 15, 20
ED]>
Sent: Saturday, February 15, 2003 7:41 PM
Subject: Re: Syntax Problem
> Terry Steichen wrote:
> > I have an index which, when searched with this query ("cloning clone
> > animal") produces 1103 hits. A different, more narrow query
> > ("(cloning clone) AND a
I have an index which, when searched with this query ("cloning clone animal") produces
1103 hits. A different, more narrow query ("(cloning clone) AND animal") produces
only 19 hits.
What's puzzling to me is that if I try a different (but supposedly identical) form of
the more narrow query ("+
, 2003 1:57 PM
Subject: Re: Computing Relevancy Differently
> Terry Steichen wrote:
> > Can you give me an idea of what to replace the lengthNorm() method with to,
> > for example, remove any special weight given to shorter matching documents?
>
> The goal of the default imp
iginal Message -
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Friday, February 07, 2003 2:37 PM
Subject: Re: Computing Relevancy Differently
> Terry Steichen wrote:
> > I read all the relevant references I co
Is there an existing API that allows you to conduct a search such that only hits with
a score greater than X are returned?
Regards,
Terry
in advance
> Nellai...
> - Original Message -
> From: "Terry Steichen" <[EMAIL PROTECTED]>
> To: "Lucene Users List" <[EMAIL PROTECTED]>
> Sent: Monday, February 03, 2003 7:50 PM
> Subject: Re: regarding Query parser for relation
Nellai,
Sounds like you want to use a range query.
Regards,
Terry
- Original Message -
From: "Nellai" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, February 03, 2003 5:10 AM
Subject: regarding Query parser for relational operators
Hi,
Is there any way
I believe that the tokenizer treats a dash as a token separator. Hence, the
only way, as I recall, to eliminate this behavior is to modify
QueryParser.jj so it doesn't do this. However, doing this can cause some
other problems, like hyphenated words at a line break and the like.
(Of course, if y
Lukas,
I believe that "this" is a stop word, so it is stripped out.
Regards,
Terry
- Original Message -
From: "Lukas Zapletal" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Sunday, February 02, 2003 11:47 AM
Subject: Wildchars in phrase
> Hello all!
>
> Why a
Leo,
From my experience, as I update the index (without optimizing), the number
of physical index files grows. I typically use the number of files as an
indicator as to when optimization is required. While I don't think Lucene
itself has any API to check this, a shell script or the application
I admit to a bit of frustration.
With the past several messages, I simply asked (or, more accurately, tried
to ask) how to alter the way that Lucene ranks relevancy, and I asked
whether the selective boost mechanism might do the trick. I admitted that I
don't know (nor care to know) the theory be
rom: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Saturday, January 25, 2003 2:09 AM
Subject: Re: Computing Relevancy Differently
> Check the lucene-user archives, search for subject "custom scoring api
> questio
How would one go about altering the formula for relevancy? (That is, which modules
and which code?) I'm certain that the current algorithm is well founded in logic and
probably works well in many environments.
However, I find that, as I index news stories, the current algorithm frequently
d
associated with the Term?
> Yes, I believe so.
>
> --- Terry Steichen <[EMAIL PROTECTED]> wrote:
> > Otis,
> >
> > Didn't somebody (Doug?) also mention that a keyword in a shorter
> > document is
> > deemed more significant than in a longer one (be
Otis,
Didn't somebody (Doug?) also mention that a keyword in a shorter document is
deemed more significant than in a longer one (because, I guess, it
represents a larger percentage of the document)?
Regards,
Terry
- Original Message -
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Luc
Erik,
That's good. Now I don't have to keep proving what is, is. Glad it finally
made sense.
Regards,
Terry
- Original Message -
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, January 22, 2003 11:43 PM
Subject: Re: Range queries