On 06-May-2011, at 2:04 AM, Paul Libbrecht wrote:
On 6 May 2011, at 00:20, Otis Gospodnetic wrote:
Thus far, only search testing has provided us with any analytics measures
(precision and recall). We, of course, construct the test suites from the
logs.
Interesting. It
Where do you get your Lucene/Solr downloads from?
[X] ASF Mirrors (linked in our release announcements or via the Lucene website)
[] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
[] I/we build them from source via an SVN/Git checkout.
[] Other (someone in your company mirrors
Frank --
Lucene can definitely do this stuff. This review of the Query Syntax might
offer you some insight:
http://lucene.apache.org/java/2_4_0/queryparsersyntax.html
Specifically, you can look up Fuzzy Searches and Synonyms. There are a
couple of key ways to handle synonyms, so you might
:33 PM, N Hira nh...@cognocys.com wrote:
Frank --
Lucene can definitely do this stuff. This review of the Query Syntax might
offer you some insight:
http://lucene.apache.org/java/2_4_0/queryparsersyntax.html
Specifically, you can look up Fuzzy Searches and Synonyms. There are a
couple
I don't know of a single tutorial that puts it all together, but the rich
documents feature implemented in SOLR-284 would be where I would start:
https://issues.apache.org/jira/browse/SOLR-284
Look here if you're using Solr 1.4 -- it should address your needs:
I use svn diff --change revisionNumber to get the list of files associated
with a given commit.
You might also want to look at http://freshmeat.net/projects/svnweb/
HTH
-h
From: Erick Erickson erickerick...@gmail.com
To: java-user java-user@lucene.apache.org
I think it speaks to the maturity of the project ... Lucene has
solved some of the easier problems in the problem space and the ones
that remain are ... difficult.
I recently introduced Lucene/Nutch to a group of ~10 relatively
capable Java developers. While they find it easy to use,
To you as well, Weiwei Wang.
You can theoretically release your project under a license that is very similar
to the Apache license at any time, presuming you are licensing rights related
to your project. To create a project that is maintained by the Apache Software
Foundation, you should
Which analyzer do you use in luke?
The general practice is to use the same analyzer for indexing and searching.
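To illustrate why the analyzers should match, here is a minimal sketch. It deliberately does not use the Lucene API: the two "analyzers" below are hypothetical stand-ins written with the standard library only, and the sample text is made up. The point is only that a term analyzed one way at index time cannot be found by a query analyzed another way.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy stand-in for an analyzer; this is NOT the Lucene Analyzer API.
class ToyAnalyzerDemo {

    // "Lowercasing" analysis: split on non-alphanumerics and lowercase each token.
    static List<String> lowercasing(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // "Keyword" analysis: the whole input is kept as one unmodified token.
    static List<String> keyword(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    }

    // A query matches when at least one query token equals an indexed token.
    static boolean matches(List<String> indexedTokens, List<String> queryTokens) {
        Set<String> indexed = new HashSet<>(indexedTokens);
        for (String q : queryTokens) {
            if (indexed.contains(q)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = lowercasing("Apache Lucene");
        // Same analysis on both sides: the query term is lowercased too.
        System.out.println(matches(indexed, lowercasing("LUCENE")));
        // Mismatched analysis: the query keeps its original case.
        System.out.println(matches(indexed, keyword("LUCENE")));
    }
}
```

The same reasoning applies when inspecting an index in a tool: a query only finds terms in the form in which they were actually written to the index.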
Good luck.
-h
- Original Message
From: Rathinapriya Nagalingam rnaga...@in.ibm.com
To: java-user@lucene.apache.org
Sent: Friday, October 2, 2009 10:51:42 AM
Subject:
Good summary, Shai.
I've missed some of this thread as well, but does anyone know what happened to
the suggestion about query manipulation?
e.g., query(about us) = query(about us, aboutus)
query(credit card) = query(credit card, creditcard)
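
A minimal sketch of that kind of query manipulation, assuming the rule is simply "add a variant with the spaces removed". The class and method names are hypothetical, and the result is a plain list of strings rather than a Lucene query object:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Lucene.
class QueryExpander {

    // Returns the original phrase plus, for multi-word phrases,
    // a variant with all whitespace removed.
    static List<String> expand(String phrase) {
        String trimmed = phrase.trim();
        List<String> variants = new ArrayList<>();
        variants.add(trimmed);
        String joined = trimmed.replaceAll("\\s+", "");
        if (!joined.equals(trimmed)) {
            variants.add(joined);
        }
        return variants;
    }

    public static void main(String[] args) {
        System.out.println(expand("about us"));
        System.out.println(expand("credit card"));
    }
}
```

Each variant would then be added to the search as an alternative, so that a document containing either form can match.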
Regards,
-h
- Original Message
Lazzara
marco.lazz...@gmail.com
wrote:
I attach the file testIndex.zip. Run the query with:
PHILIPCIMIANO, or
RESEARCHER.
I use StandardAnalyzer. Is it a problem?
Marco Lazzara
2009/5/27 N. Hira nh...@cognocys.com
Not sure if this applies here, but that tends to happen when
/
testIndex,fieldsearch);
try {
this.paths = this.rdfind.Search(text, path);
} catch (ParseException e1) {
e1.printStackTrace();
} catch (IOException e1) {
e1.printStackTrace();
}
Marco Lazzara
2009/5/27 N. Hira nh
);*/
}
isearcher.close();
return resultingpaths;
}
2009/5/27 N. Hira nh...@cognocys.com
Thanks.
Could you also post the code for RDFinder.Search() and the output from
query.toString() when text is PHILIPCIMIANO?
-h
On 27-May-2009, at 12:40 PM, Marco Lazzara wrote
Cool!
1. So you are creating a parser with { name, synonyms, propIn }, correct?
2. Sorry -- I meant the output of query.toString(); I'm expecting to see
something like this when the sentence parameter is set to philipcimiano:
name:philipcimiano synonyms:philipcimiano propIn:philipcimiano
Marco,
Does the part of the web app that is responsible for searching have permissions
to read /home/marco/testIndex?
Could you add some code to your searching app to print out the directory
listing to confirm?
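Something along these lines would do, as a rough sketch using only java.io.File. The class and method names are made up for illustration; the default path is the one mentioned above. It should be run from inside the web app, as the same user, so that it reflects the permissions the searcher really has:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper; not part of Lucene.
class IndexDirCheck {

    // Describes the directory and each entry in it, one line per item.
    static List<String> describe(File dir) {
        List<String> lines = new ArrayList<>();
        lines.add(dir.getAbsolutePath()
                + " exists=" + dir.exists()
                + " isDirectory=" + dir.isDirectory()
                + " canRead=" + dir.canRead());
        // listFiles() returns null if the path is not a directory
        // or the listing cannot be read.
        File[] children = dir.listFiles();
        if (children == null) {
            lines.add("(listing unavailable)");
            return lines;
        }
        Arrays.sort(children);
        for (File f : children) {
            lines.add(f.getName() + " " + f.length() + " bytes canRead=" + f.canRead());
        }
        return lines;
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "/home/marco/testIndex";
        for (String line : describe(new File(path))) {
            System.out.println(line);
        }
    }
}
```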
Also, I may have missed this posting, but could you provide the answer from
Step
marco marco 58 2009-05-24 12:00 segments_c
-rw-r--r-- 1 marco marco 20 2009-05-24 12:00 segments.gen
2009/5/26 N Hira nh...@cognocys.com
Marco,
Does the part of the web app that is responsible for searching have
permissions to read /home/marco/testIndex?
Could you add some code
I think I understand what you're describing as a link map to be a
tag cloud where each tag is a frequent or strong term.
We did something like this as an experiment (without Lucene):
http://www.cognocys.com/prospector/news.html
If you're talking about something similar, then I think you can
You can search the lucene and solr mailing lists for denormalize
but the general response is to try one of:
1. de-normalize the data while indexing
- advantage: one query
- disadvantage: data repetition
2. use 2 indices
- advantage: no need for repetition; this is
I'm not an expert, so please take this with a grain of salt, but if
you return the Hits object, you are inadvertently holding on to
that IndexSearcher, right?
According to the FAQ (http://wiki.apache.org/lucene-java/ImproveSearchingSpeed),
iterating over all Hits will result in
I don't know how much of this is a Lucene problem, but -- as I'm sure
you will inevitably hear from others on the list -- it depends on
what your definition of similar is.
By similar, do you mean:
1. Identical, except for variations in case (upper/lower)
2. Allow 1., but also allow
. For Life.
N. Hira wrote:
I don't know how much of this is a Lucene problem, but -- as I'm
sure you will inevitably hear from others on the list -- it
depends on what your definition of similar is.
By similar, do you mean:
1. Identical, except for variations in case (upper/lower)
2. Allow 1
This isn't ideal, but if you have a defined list of such terms, you
may find it easier to filter these terms out into a separate field
for indexing.
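
A minimal sketch of that filtering step, assuming whitespace tokenization and a hypothetical term list. The field names "body" and "special" are made up for illustration, and the output is a plain map rather than a Lucene Document:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical pre-processing helper; not part of Lucene.
class FieldSplitter {

    // Routes each whitespace-separated token to the "special" field if it is
    // on the defined list, and to the "body" field otherwise.
    static Map<String, List<String>> split(String text, Set<String> specialTerms) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("body", new ArrayList<String>());
        fields.put("special", new ArrayList<String>());
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            String field = specialTerms.contains(token) ? "special" : "body";
            fields.get(field).add(token);
        }
        return fields;
    }
}
```

The "special" values could then be indexed without the analysis that would otherwise mangle them, and queried through that field.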
-h
--
Hira, N.R.
Solutions Architect
Cognocys, Inc.
(773) 251-7453
On
How about an attribute (fullyIndexed=true/false) to keep track of whether the
indexing was successful?
We used a similar attribute for a similar problem, but stored it in the
accompanying database instead.
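
A minimal sketch of the idea, with an in-memory map standing in for the accompanying database. All names here are hypothetical; a real setup would persist the flag and set it to true only after the index write has succeeded:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical status store; the map stands in for a database table.
class IndexingStatusStore {

    private final Map<String, Boolean> fullyIndexed = new HashMap<>();

    // Call before indexing starts: the document is known but not yet complete.
    void begin(String docId) {
        fullyIndexed.put(docId, false);
    }

    // Call only after indexing of the document has succeeded.
    void complete(String docId) {
        fullyIndexed.put(docId, true);
    }

    boolean isFullyIndexed(String docId) {
        return Boolean.TRUE.equals(fullyIndexed.get(docId));
    }

    // Keeps only the hits whose documents were fully indexed.
    List<String> filterHits(List<String> hitIds) {
        List<String> kept = new ArrayList<>();
        for (String id : hitIds) {
            if (isFullyIndexed(id)) {
                kept.add(id);
            }
        }
        return kept;
    }
}
```

Filtering the hits after the search, as above, is one way to exclude partially indexed documents; unknown document IDs are treated as not fully indexed.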
-h
- Original Message
From: Michael McCandless [EMAIL PROTECTED]
To:
in Document addition
In your scenario, it might work, but I wonder how you generate hits,
excluding the fullyindexed=false.
-Original Message-
From: N Hira [mailto:[EMAIL PROTECTED]
Sent: 19 May 2008 18:31
To: java-user@lucene.apache.org
Subject: Re: Transaction semantics in Document addition
Please review:
http://wiki.apache.org/lucene-java/LuceneFAQ
I suspect your question is answered as:
How do I make sure that a match in a document title has greater
weight than a match in a document body?
-h
--
Hira,
Thank you for this. Luke has been *extremely* helpful.
-h
--
Hira, N.R.
Solutions Architect
Cognocys, Inc.
On 04-Feb-2008, at 10:17 PM, Andrzej Bialecki wrote:
Hi all,
I just released Luke 0.8, the Lucene Index Toolbox. As
Hi Liaqat,
Are you sure that the Urdu characters are being correctly interpreted
by the JVM even during the file I/O operation?
I would expect Unicode characters to be encoded as multi-byte
sequences and so, the string-matching operations would fail (if the
literals are different from
Can you explain the problem you're trying to address from the user's
perspective?
From the description you've provided, you may want to look up
Faceted Searching. Another option may be to use a HitCollector,
but it would help us if you could describe the problem at a higher
level.
Donna,
If I understand the problem correctly, it is: given a [job
description], find [candidates] that we would not otherwise find. That
seems to be a user-weighted similarity problem more than a simple
search problem.
IOW:
1. Given a [job description], create a set of queries that look for
Could you show us the relevant source from doBodySearch()?
-h
On Tue, 2007-07-24 at 19:58 -0400, Askar Zaidi wrote:
I ran some tests and it seems that the slowness comes from the Lucene calls
made when I call doBodySearch. If I remove that call, Lucene gives me results
in 5 seconds; otherwise it takes
;
}
I really need to optimize doBodySearch(...) as this takes the most
time.
thanks guys,
Askar
On 7/24/07, N. Hira [EMAIL PROTECTED] wrote:
Could you show us the relevant source from doBodySearch()?
-h
On Tue, 2007-07-24 at 19:58
Max,
We use a (customized version of) Lucene as part of our Cognocys IAM
product, which is also available with Oracle RDBMS.
I can tell you that the software is used at Medtronic, a global medical
technology company.
-h
---
We had a similar problem. We discovered that the eden/from space was running
out of memory, so we made two changes that seem to have helped:
1. Reduce [Max]PermSize to 128M
2. Use the concurrent garbage collector
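
To confirm which memory pool is actually filling up before changing settings, a small report like the following may help. It is a sketch using the standard java.lang.management API; the class name is made up, and pool names vary by JVM and collector. The exact flags we used are not given here; on HotSpot JVMs of that era the two changes would typically be expressed as -XX:MaxPermSize=128m and -XX:+UseConcMarkSweepGC, but treat that as an assumption for your environment.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper built on the standard management API.
class MemoryPoolReport {

    // Returns one line per memory pool with its current usage.
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            lines.add(pool.getName()
                    + ": used=" + usage.getUsed()
                    + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```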
Good luck.
-h
--- Ross Rankin [EMAIL PROTECTED] wrote:
We keep getting
Alberto,
It might be helpful if you would provide the full stack-trace.
We use Lucene with our web application like many other projects. I can assure
you that there is no basic incompatibility, but you may indeed be experiencing
something specific to your environment.
-h
Alberto