--- Tea Yu <[EMAIL PROTECTED]> wrote:
> David,
>
> For me, I don't want a search for "in print" to give
> results from "in printer".
> I'd consider that case over-stemmed.
Here the "in" won't be considered as it is a stopword
in most of the analyzers. I know it is in
StandardAnalyzer. So searc
David,
For me, I don't want a search for "in print" to give results from "in printer".
I'd consider that case over-stemmed.
I was also not that satisfied recently when "effective" was stemmed to "effect" by
Snowball.
Cheers
Tea
> Hi David
>
> I like KStem more than Porter / Snowball - but sti
Doug Cutting wrote:
David Spencer wrote:
[1] The user enters a query like:
recursize descent parser
[2] The search code parses this and sees that the 1st word is not a
term in the index, but the next 2 are. So it ignores the last 2 terms
("descent" and "parser") and suggests alternatives t
Your analysis sounds correct.
At base, a weight is a normalized tf*idf. So a document weight is:
docTf * idf * docNorm
and a query weight is:
queryTf * idf * queryNorm
where queryTf is always one.
So the product of these is (docTf * idf * docNorm) * (idf * queryNorm),
which indeed contains id
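
A minimal sketch of the arithmetic above, in plain Java rather than Lucene's own classes. The class name, method names and the sample numbers are made up for illustration; only the formulas come from the message.

```java
public class WeightSketch {
    // Document-side weight: docTf * idf * docNorm.
    static double docWeight(double docTf, double idf, double docNorm) {
        return docTf * idf * docNorm;
    }

    // Query-side weight: queryTf * idf * queryNorm, where queryTf is always one.
    static double queryWeight(double idf, double queryNorm) {
        double queryTf = 1.0;
        return queryTf * idf * queryNorm;
    }

    // Product of the two weights; idf appears once in each factor.
    static double score(double docTf, double idf, double docNorm, double queryNorm) {
        return docWeight(docTf, idf, docNorm) * queryWeight(idf, queryNorm);
    }

    public static void main(String[] args) {
        // Illustrative values only, not taken from a real index.
        System.out.println(score(2.0, 3.0, 0.5, 0.25));
    }
}
```

Doubling idf while holding the other inputs fixed multiplies the product by four, since idf enters once through the document weight and once through the query weight.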
Hi guys!
I've posted previously that Hits.doc(x) was taking a long time. Turns out it
has to do with a date range in our query. We usually do date ranges like this:
Date:[(lucene date field) - (lucene date field)]
Sometimes the begin date is "0" which is what we get from
DateField.dateT
David Spencer wrote:
[1] The user enters a query like:
recursize descent parser
[2] The search code parses this and sees that the 1st word is not a term
in the index, but the next 2 are. So it ignores the last 2 terms
("descent" and "parser") and suggests alternatives to
"recursize"...thu
Andrzej Bialecki wrote:
I was wondering about the way you build the n-gram queries. You
basically don't care about their position in the input term. Originally
I thought about using PhraseQuery with a slop - however, after checking
the source of PhraseQuery I realized that this probably wouldn't
Andrzej Bialecki wrote:
David Spencer wrote:
...or prepare in advance a fast lookup index - split all existing
terms to bi- or trigrams, create a separate lookup index, and then
simply for each term ask a phrase query (phrase = all n-grams from
an input term), with a slop > 0, to get similar existi
David Spencer wrote:
...or prepare in advance a fast lookup index - split all existing
terms to bi- or trigrams, create a separate lookup index, and then
simply for each term ask a phrase query (phrase = all n-grams from
an input term), with a slop > 0, to get similar existing terms.
This should be
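
A rough sketch of the term-splitting step described above, in plain Java with no Lucene dependency. The class and method names are hypothetical. Counting shared n-grams is a simplification used here only to show why a misspelling still lands near the right term; it is not the phrase-query-with-slop lookup proposed in the message.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NGramSketch {
    // Split a term into overlapping n-grams, e.g. trigrams for n = 3.
    static List<String> ngrams(String term, int n) {
        List<String> grams = new ArrayList<String>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    // Number of distinct n-grams two terms have in common (position is ignored).
    static int sharedGrams(String a, String b, int n) {
        Set<String> common = new HashSet<String>(ngrams(a, n));
        common.retainAll(new HashSet<String>(ngrams(b, n)));
        return common.size();
    }

    public static void main(String[] args) {
        System.out.println(ngrams("recursive", 3));
        System.out.println(sharedGrams("recursize", "recursive", 3));
    }
}
```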
Tate Avery wrote:
I get a NullPointerException shown (via Apache) when I try to access http://www.searchmorph.com/kat/spell.jsp
How embarrassing!
Sorry!
Fixed!
T
-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 3:23 PM
To: Lucene Users List
I get a NullPointerException shown (via Apache) when I try to access
http://www.searchmorph.com/kat/spell.jsp
T
-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 3:23 PM
To: Lucene Users List
Subject: NGramSpeller contribution -- Re: com
Andrzej Bialecki wrote:
David Spencer wrote:
I can/should send the code out. The logic is that for any terms in a
query that have zero matches, go thru all the terms(!) and calculate
the Levenshtein string distance, and return the best matches. A more
intelligent way of doing this is to instead
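
For reference, a small sketch of the brute-force approach described above: compute the Levenshtein distance from the unmatched word to every candidate term and keep the closest. This is plain Java with hypothetical names and a hard-coded term list standing in for the index's terms; it is not the code referred to in the message.

```java
import java.util.Arrays;
import java.util.List;

public class LevenshteinSketch {
    // Classic dynamic-programming edit distance (insert, delete, substitute all cost 1).
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Scan every candidate term and return the one with the smallest distance.
    static String closest(String word, List<String> terms) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String term : terms) {
            int d = distance(word, term);
            if (d < bestDistance) {
                bestDistance = d;
                best = term;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in for the terms of an index.
        List<String> terms = Arrays.asList("descent", "parser", "recursive", "recursion");
        System.out.println(closest("recursize", terms));
    }
}
```

The scan is linear in the number of terms, which is the cost the message flags with "(!)".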
Hi David
I like KStem more than Porter / Snowball. It still has limitations,
although it performs better because it has a dictionary to augment the rules.
Note that KStem will also treat "print" and "printer" as two distinct terms,
probably treating them as verb and noun respectively.
Cheers
Pete Lewis
Hi George
There are lots of problems with Porter stemmers: they are not great for English and get
worse for other languages.
If you look at:
http://snowball.tartarus.org/demo.php
You'll see the Snowball demo - this is basically another instance of Porter.
If you enter "print" and "printer" and submit the
Honey George wrote:
Hi,
This might be more of a question related to the
PorterStemmer algorithm than to Lucene, but
if anyone has the knowledge please share.
You might want to also try the Snowball stemmer:
http://jakarta.apache.org/lucene/docs/lucene-sandbox/snowball/
And KStem:
http://c
You could receive the group name as an input from the
user and construct a BooleanQuery internally which
will query only the group field based on the user
input. So the user need not append the group name to
the search string.
Thanks,
George
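
A hedged sketch of the idea in query-string form, in plain Java so it runs without Lucene on the classpath. The field name "group" and the helper name are assumptions for illustration. The message suggests building a BooleanQuery directly; this shows the equivalent query-parser syntax instead, where a leading "+" marks a required clause. It does no escaping of special characters in the user's input.

```java
public class GroupQuerySketch {
    // Require the (hypothetical) "group" field to match, and require the user's own query too.
    static String restrictToGroup(String group, String userQuery) {
        return "+group:" + group + " +(" + userQuery + ")";
    }

    public static void main(String[] args) {
        // The user types only "hello"; the group clause is added internally.
        System.out.println(restrictToGroup("group1", "hello"));
    }
}
```

Whether the group value matches will depend on how the field was indexed and on the analyzer applied at query time, so the exact casing of the group name may matter.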
--- mahaveer jain <[EMAIL PROTECTED]> wrote:
> If
Hi,
This might be more of a question related to the
PorterStemmer algorithm than to Lucene, but
if anyone has the knowledge please share.
I am using the PorterStemFilter that comes with Lucene
and it turns out that searching for the word 'printer'
does not return a document containing the
If I have understood correctly, you mean that the search query has to be
"Group1" AND "Hello" (if "hello" is what I want to search for)?
Cocula Remi <[EMAIL PROTECTED]> wrote:
A keyword is not tokenized; that's why you won't be able to search over a part of it.
You'd rather use a Text fie
A keyword is not tokenized; that's why you won't be able to search over a part of it.
You'd be better off using a Text field.
About creating a special field :
IndexWriter Ir =
File f =
Document doc = new Document();
if (f.toString().startsWith("C:\\tomcat\\webapps\\Root\\Group1"))
{
Hi,
I've used the following Ant targets for build scripts that required
platform dependent work. In the example here, the property
"catalina.home" is set according to what platform we're running on. You
can adapt as needed.
Well, in my case the path is a Keyword. I tried that earlier and it does not seem to
work in a single index file.
Can you explain a bit more about adding group1 and group2?
Cocula Remi <[EMAIL PROTECTED]> wrote:
Well you could add a field to each of your Documents whose value would be eith
Well you could add a field to each of your Documents whose value would be either
"group1" or "group2".
Or you could use the path to your files ...
-Original Message-
From: mahaveer jain [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 5:49 PM
To: [EMAIL PROTECTED]
Subject: RE: He
I am clear on looping recursively to index all the files under the Root folder.
But the problem is that I want to search only in group1 or group2. Is it possible to
search only in one of the group folders?
Cocula Remi <[EMAIL PROTECTED]> wrote:
You just have to loop recursively over the C:\tomcat\w
You just have to loop recursively over the C:\tomcat\webapps\Root tree to create your
index.
Yes, you can index databases; you will just have to write a mechanism that is able to
create org.apache.lucene.document.Document objects from the database.
For instance :
- connect JDBC
- run a query for obtaining a
Hi
I have implemented text-based search using Lucene. It was wonderful playing around with
it.
Now I want to enhance the application.
I have a Root folder, and under it I have many other folders that are group specific,
say (group1, group2, .. so on). The Root folder is in C:\tomcat\webapps\Roo
Karthik,
You are still being a bit cryptic and making it hard for me to
comprehend what the problem is, but here are some general pieces of
advice with Ant related to what I think you are doing:
* There is no need to use conditional logic to have a different set of
properties for different oper
Natarajan.T wrote:
Ok you are correct ...
Suppose if I type "what java" then how can I handle...
You don't have to handle it; Lucene does it. If you don't like how
Lucene handles it then you may extend
the functionality.
If you use the same analyzer for indexing and searching then you will
fi
Hi
Erik
1) Using Ant and Build.xml, I want to run
org.apache.lucene.demo.IndexFiles to create an index folder.
2) The problem is that the same Build.xml is to be used across operating systems for
creating the index.
3) The path to Lucene1-4-final.jar is in a different directory on each
O/S...
Ok you are correct ...
Suppose if I type "what java" then how can I handle...
Regards,
Natarajan.
-Original Message-
From: sergiu gordea [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 7:38 PM
To: Lucene Users List
Subject: Re: Search PharseQuery
Natarajan.T wrote:
>Hi,
>
Perhaps we should @deprecate the contributions page like we did with
the Powered By page, and migrate it to the wiki?
Erik
On Sep 13, 2004, at 6:50 PM, Daniel Naber wrote:
On Friday 10 September 2004 15:48, Chas Emerick wrote:
PDFTextStream should be added to the 'Document Converters' sec
I'm not following what you want very clearly, but there is an
<index> task in Lucene's Sandbox.
Please post what you are trying, and I'd be happy to help once I see
the details.
Erik
On Sep 12, 2004, at 4:44 PM, Karthik N S wrote:
Hi
Guys
Apologies..
The Task for me is to build the Ind
Natarajan.T wrote:
Hi,
Thanks for your response.
For example search keyword is like below...
Language "what is java"
Token 1: language
Token 2: what is java(like google)
Regards,
Natarajan.
Lucene works exactly as you describe above with a simple correction ...
The analyzer has a list of s
Interesting!
http://kasparov.skife.org/blog/2004/09/13#lucene-graphs
Hi,
Thanks for your response.
For example search keyword is like below...
Language "what is java"
Token 1: language
Token 2: what is java(like google)
Regards,
Natarajan.
-Original Message-
From: Aad Nales [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 5:19 PM
Hi
I am new to Lucene.
Could anyone tell me how to set the relevance order in which the search results
are displayed?
Are there any online examples available on this topic?
I welcome your suggestions.
Thanks & Regards
E.Faisal
--- "Natarajan.T" <[EMAIL PROTECTED]>
wrote:
> I am trying to extend the current behavior.
You might have already seen a mail from Cocula Remi on
this. Please provide more details of the problem for
specific comments - basically the problem you are
facing and/or what behavior you are trying to ex
I am trying to extend the current behavior.
Regards,
Natarajan.
-Original Message-
From: Honey George [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 5:09 PM
To: Lucene Users List
Subject: Re: Search PharseQuery
--- "Natarajan.T" <[EMAIL PROTECTED]>
wrote:
> Hi All,
>
>
Hi,
Not sure if this is what you need, but I created a last-name filter, which
for Dutch means handling potential multi-word last names like "van der Vaart". In
order to process these I created a finite state machine that queried
these last names. Since I only needed the filter at index time and I
never use it f
--- "Natarajan.T" <[EMAIL PROTECTED]>
wrote:
> Hi All,
>
>
>
> How do I implement PhraseQuery API?
What exactly do you mean by implement? Are you trying to
extend the current behavior or only trying to find out
the usage?
Thanks,
George
Hi Sergiu,
String queryString = "\"waht is java\"";
Query q = QueryParser.parse(queryString, "field", new
StandardAnalyzer());
System.out.println(q.toString());
This is enough to get started; consult the Lucene API for more information
Have you tested the above query? This search keyword is not
String queryString = "\"waht is java\"";
Query q = QueryParser.parse(queryString, "field", new StandardAnalyzer());
System.out.println(q.toString());
This is enough to get started; consult the Lucene API for more information
Sergiu
Natarajan.T wrote:
Hi,
Thanks for your mail, that link says only th
Hi,
Thanks for your mail. That link only explains the theory, but I need a
sample.
Regards,
Natarajan.
-Original Message-
From: Cocula Remi [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 2:58 PM
To: Lucene Users List
Subject: RE: Search PharseQuery
Use QueryParser.
Use QueryParser.
please take a look at
http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html
It's pretty clear.
-Original Message-
From: Natarajan.T [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 11:26 AM
To: 'Lucene Users List'
Subject: Search PharseQuery
Hi All,
Hi All,
How do I implement PhraseQuery API? Please send me some sample code. (How
can I handle "java is platform" as a single word?)
Regards,
Natarajan.
On Tuesday 14 September 2004 08:32, Jiří Kuhn wrote:
> The error is thrown in exactly the same point as before. This morning I
> downloaded Lucene from CVS, now the jar is lucene-1.5-rc1-dev.jar, JVM
> is 1.4.2_05-b04, both Linux and Windows.
Now I can reproduce the problem. I first tried running
45 matches