Dear Luceners,
I wonder if there is any pre-defined option to read stop-word from a file?
Any comment is highly appreciated.
Thanks in advance & Best regards,
Mungkol
Is it possible to make a phrase query fuzzy?
It could be a quick and not so dirty replacement for hidden markov
models and thus produce great results for spell checking and other
natural language classifications.
Hej Paul,
> I have implemented the DistanceComparatorSource
> example from Lucene In Action (my Bible) and it works
> great. We are now in the situation where we have
> nearly a million documents in our index and the
> performance of this implementation has degraded.
>
I have had the same problem w
Hi Paul,
I don't have any first-hand experience with this, but your suggestion about
pluggable analyzers sounds both reasonable and interesting to me. One thing
you did not mention as a mechanism for figuring out which analyzer to use is
language identification (like the one you can find among
Hi everyone,
We are currently using Lucene to index correspondence between various
people, who may or may not use the same language in their discussions to
each other. Think an email system where participants might use the
language that seems most appropriate to the thought at the time, just a
You are correct. Reuse IndexSearcher.
Otis
- Original Message
From: Amol Bhutada <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, March 15, 2006 7:14:49 PM
Subject: closing searcher
Hi,
I am using Lucene in a J2EE-based web application. I have created one instance
Hi,
I am using Lucene in a J2EE-based web application. I have created one instance
of the reader and searcher objects and am trying to use it for all searches from
different users without recreating or refreshing the reader and searcher objects.
Is this fine?
I am asking this because I am reading
http://www.
Hello,
I recently came across this email in the Lucene user list and am
interested in this article. I tried to access it from the link you
provided, but couldn't find any link to access it. Do you still have an
electronic copy?
Thanks,
Kevin Runde
-Original Message-
From: Malcolm [mailt
The Javadoc should have all the info.
If not - Lucene in Action - http://www.lucenebook.com/search?query=multisearcher
If not - Lucene in Action's free code that includes code with MultiSearcher, as
you can see from snippets at the above URL.
Otis
- Original Message
From: Brian <[EMAIL
: I have implemented the DistanceComparatorSource
: example from Lucene In Action (my Bible) and it works
: great. We are now in the situation where we have
: nearly a million documents in our index and the
: performance of this implementation has degraded.
: Can someone please spare a couple of
Hi Doug,
Yes, it should probably be called "edit-distance-like" or something.
It should definitely say so in the JavaDoc because I've seen this
propagate to people's articles (it was Erik Hatcher's I think, but I'm
not sure).
But what then would the criteria for matching at all be? Right
This doesn't sound like a Lucene problem, at least the way you've described
it. For example, Lucene can't search on any field that isn't indexed (and
most of yours aren't indexed).
Given that, it seems like your option (c) is the way to go. Seems like a
simple RDBMS schema with 3 tables woul
Hi,
I have implemented the DistanceComparatorSource
example from Lucene In Action (my Bible) and it works
great. We are now in the situation where we have
nearly a million documents in our index and the
performance of this implementation has degraded.
I have downloaded and am trying to understand
Dawid Weiss wrote:
I get the concept implemented in PhraseQuery but isn't calling it an
edit distance a little bit far fetched?
Yes, it should probably be called "edit-distance-like" or something.
Only the marginal elements
(minimum and maximum distance from their respective query positions)
: What about such solution:
: Split path like string into smaller tokens and index them as separate words
eg:
: #Top/World/Poland/# #Top/World/# #Top/#
i would be careful about your use of the word "token" in that sentence,
but yes indexing each of the directory like paths as keywords and doing
Hello Everyone,
I currently have an IndexSearch working great!
What I want to do now is move to a multi-index
search. What's the best way to go about it? Is it a
simple process? Any thoughts would be appreciated.
Thanks, B
> No queries on other fields (news metadata etc) will be performed.
Do you mean that a full text search on the news text isn't required?
I might be wrong, but it seems to me that this doesn't sound like typical
Lucene usage.
I'd go for option (c) (but not just one table :-)
Bye,
Fabio
P.S.:
how
Jason: you really don't need to send the same message 4 times in one
night. You've got to give people time to sleep, and eat, and take care of
other things that don't involve a computer :)
: Can we add a module to lucene so that we are able to use our own similarity
: measure to calculate the si
Hi all,
I'm required to develop an application for searching over news items.
There will be thousands of news items, each one will be assigned
directly to a list of millions of customerIDs. The query will be done
by passing a customerID and will return all news items associated to
it. Furthermore,
> What I did is this:
>
> TermsFilter filter = new TermsFilter();
> filter.addTerm(new Term("date", "20060304 TO
> 20060304"));
The Term object's constructor in your example does not
parse the "20060304 TO 20060304" string. A term is
supposed to represent a single term exactly as it
appears in y
Hi there,
I get the concept implemented in PhraseQuery but isn't calling it an
edit distance a little bit far fetched? Only the marginal elements
(minimum and maximum distance from their respective query positions) are
taken into account. Consider this example:
phrase: a b c d
term p
> Try making both terms mandatory with "+"
>
> "+date:[20040101 TO 20040101] +Paris"
That was it... however, it does not exactly suit my needs. I want to
create a few combo boxes to let the user create a datefilter (From and
To) on the search queries using a webform.
Now if he chooses 20040101
On 3/15/06, Samuru Jackson <[EMAIL PROTECTED]> wrote:
> search = "date:[20040101 TO 20040101] Paris"
> Somehow this range search does not work. I still get the same results
> as without the date:[..]
Try making both terms mandatory with "+"
"+date:[20040101 TO 20040101] +Paris"
http://lucene
Hi!
I am having some trouble using the inclusive range search explained in
Lucene in Action in ch. 2.5.5
What I do is to add several fields to the index this way:
document.add(Field.Keyword("id", key));
document.add(Field.Keyword("type", type));
document.add(Field.Text("text",text));
document.add(Fi
See http://issues.apache.org/jira/browse/LUCENE-481
It was for the trunk at the time, but it's not difficult to apply it to
the 1.4.3 sources manually...
-Original Message-
From: WATHELET Thomas [mailto:[EMAIL PROTECTED]
Sent: Wednesday 15 March 2006 14:53
To: java-user@lucene.apache.org
Yes, I use Lucene 1.4.3.
Could you send me the link for this patch?
Thanks in advance
-Original Message-
From: Vanlerberghe, Luc [mailto:[EMAIL PROTECTED]
Sent: Wednesday 15 March 2006 13:38
To: java-user@lucene.apache.org
Subject: RE: segments.new
Are you using Lucene 1.4.3 ?
There's a b
Hi!
Another option for a term query would be an analyzer, which creates keywords
from paths, building them from every neighbouring pair in the path. So you
could query for paths anywhere in the hierarchy, and you don't have to start
from the top level hierarchy like in the approach mentioned be
I've just got a "docs out of order" error.
I have a database that is indexed every time an update occurs. The
index was OK for the last 3 weeks, and now, after the system threw
an exception because of a write lock that was not released (and I
deleted it), I am receiving this:
Can anyone help
Full stack
Sorry, that should have read:
Query query1 = null;
if (cat != null && !cat.equals("")) {  // compare string contents, not references
    Term term = new Term("parentPath", cat);
    query1 = new TermQuery(term);
    Hits hits = is.search(query1);
}
("parentPath" substituted for "category").
kieran wrote:
Alternatively, you could examine each path, and index each of its
Alternatively, you could examine each path, and index each of its
"parent" paths (perhaps in a field named "parentPath").
i.e.
Top/World/Poland/Abc
would result in the following three values being indexed:
Top
Top/World
Top/World/Poland
You can then use a TermQuery instead of a PrefixQuery.
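
The expansion described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names (ParentPaths, parentPaths) are hypothetical and not part of Lucene, and adding each returned value to the "parentPath" field as a single untokenized term is left to the indexing code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists the "parent" paths of a
// '/'-separated category path, shortest first.
final class ParentPaths {

    static List<String> parentPaths(String path) {
        List<String> parents = new ArrayList<String>();
        int slash = path.indexOf('/');
        while (slash >= 0) {
            // Everything before this separator is one ancestor path;
            // a leading '/' would give an empty ancestor, so skip it.
            if (slash > 0) {
                parents.add(path.substring(0, slash));
            }
            slash = path.indexOf('/', slash + 1);
        }
        return parents;
    }

    public static void main(String[] args) {
        // Prints one ancestor path per line for the example from the post.
        for (String parent : parentPaths("Top/World/Poland/Abc")) {
            System.out.println(parent);
        }
    }
}
```

Each returned string would be indexed as its own value, so a single exact-match query on one ancestor path finds every document beneath it.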
F
Ok thanks
-Original Message-
From: Vanlerberghe, Luc [mailto:[EMAIL PROTECTED]
Sent: Wednesday 15 March 2006 13:38
To: java-user@lucene.apache.org
Subject: RE: segments.new
Are you using Lucene 1.4.3 ?
There's a bug report in JIRA (LUCENE-481) with a patch that solves this.
On Windows, f
Replying to myself, I hate this :(
What about such solution:
Split the path-like string into smaller tokens and index them as separate words, e.g.:
#Top/World/Poland/# #Top/World/# #Top/#
so if I ask about word #Top/# I will get all the results for this
category, without making so many boolean queries.
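
A minimal sketch of how such marker words could be generated, assuming the "#" + path + "/#" form shown in the example above. The class and method names (CategoryMarkers, markers) are hypothetical, and the sketch also assumes each marker ends up in the index as one unanalyzed term, since an analyzer that splits on "/" or drops "#" would break the exact match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds one marker word for the
// category path itself and for each of its ancestors, longest first.
final class CategoryMarkers {

    static List<String> markers(String path) {
        List<String> out = new ArrayList<String>();
        // Ignore a single trailing separator so "Top/World/" and "Top/World" agree.
        String current = path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
        while (!current.isEmpty()) {
            out.add("#" + current + "/#");
            int slash = current.lastIndexOf('/');
            // Drop the last path segment; stop once no segments remain.
            current = slash < 0 ? "" : current.substring(0, slash);
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the marker words for the example path, separated by spaces.
        StringBuilder line = new StringBuilder();
        for (String marker : markers("Top/World/Poland")) {
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(marker);
        }
        System.out.println(line);
    }
}
```

Searching for the single word #Top/# would then match every document whose category lies anywhere under Top.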
Is the
Are you using Lucene 1.4.3 ?
There's a bug report in JIRA (LUCENE-481) with a patch that solves this.
On Windows, files cannot be deleted while they are open and before the
patch, calling getCurrent or isCurrent in one process could block
another one from updating the segments file.
The patch in
On 3/14/06, Mordo, Aviran (EXP N-NANNATEK) <[EMAIL PROTECTED]> wrote:
> You need to index the field as a keyword, or use an analyzer that will
> not strip the / from the string
>
> Aviran
> http://www.aviransplace.com
Field is indexed as Keyword, I was using StandardAnalyzer(), but
currently I try
Yes - this 1.4 bug is what induced us to upgrade to 1.9! So, finding the same
problem in a different guise in 1.9 is quite an unfortunate coincidence!
-Original Message-
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: 14 March 2006 19:38
To: java-user@lucene.apache.org
Subject: Re: S
Hi Thomas
I have been getting similar errors and am trying to investigate the cause.
My current thinking is that it is caused by my virus checker opening
the files. The error only occurs on Windows. When I run the same
test on Linux I do not get the error.
Not much help I know... but at least yo
Hi,
Can we add a module to Lucene so that we are able to use our own similarity
measure to calculate the similarity between documents and queries? As Lucene
has defined its own measure, we can do little with it.
Considering the documents and queries represented as the vectors, we only
need one clas
Hi,
I am having trouble with the indexing process: sometimes I get an
error saying that the file segments.new can't be renamed or deleted, or
something like that.
What's happening?
"Raghavendra Prabhu" <[EMAIL PROTECTED]> wrote on 15/03/2006 08:37:25 AM:
> Hi
>
> The problem I am facing is that the query is case sensitive.
>
> If I type in capital letters I do not get any results, and if I type in
> small letters I do get results.
>
> Is there anything by which i