Re: Searching index problems with tomcat

2009-05-26 Thread N Hira
Marco, Does the part of the web app that is responsible for searching have permissions to read "/home/marco/testIndex"? Could you add some code to your searching app to print out the directory listing to confirm? Also, I may have missed this posting, but could you provide the answer from Ste

Re: Searching index problems with tomcat

2009-05-26 Thread N Hira
arco marco 58 2009-05-24 12:00 segments_c -rw-r--r-- 1 marco marco 20 2009-05-24 12:00 segments.gen 2009/5/26 N Hira > > Marco, > > Does the part of the web app that is responsible for searching have > permissions to read "/home/marco/testIndex"? > > Could y

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
ds (for example) only 3 files of 5. Any ideas??? Marco Lazzara 2009/5/27 N Hira Sorry for the confusion -- I checked the archive and I could not find a message where you have been able to open the index using Luke. Have you been able to do that? I see that you have reported the crea

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
a wrote: I attach the file testIndex.zip. Run the query with: PHILIPCIMIANO, or RESEARCHER. I use StandardAnalyzer. Is it a problem? Marco Lazzara 2009/5/27 N. Hira Not sure if this applies here, but that tends to happen when the analyzer you use for indexing is different from the o

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
with: no segments* file found in org.apache.lucene.store It means that Lucene recognizes the index (when it isn't empty), but in the web app it obtains no results. Marco Lazzara 2009/5/27 N. Hira Okay -- if the problem is not the number of results, then let's clarify the pr

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
1.printStackTrace(); } Marco Lazzara 2009/5/27 N. Hira Okay -- that helps. So we know that searching the same files with Luke works, but with the web app does not. Can you please re-post the fragment of code that opens your index and uses the query? If you haven't alr

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
archer.doc(sc.doc).get(Field); System.out.println(res); results.add(res);*/ } isearcher.close(); return resultingpaths; } 2009/5/27 N. Hira Thanks. Could you also post the code for RDFinder.Search() and the output from query.toString() w

Re: Searching index problems with tomcat

2009-05-27 Thread N Hira
Cool! 1. So you are creating a parser with { name, synonyms, propIn }, correct? 2. Sorry -- I meant the output of "query.toString()"; I'm expecting to see something like this when the sentence parameter is set to philipcimiano: name:philipcimiano synonyms:philipcimiano propIn:philipcimian

Re: Searching doubt

2009-08-04 Thread N Hira
Good summary, Shai. I've missed some of this thread as well, but does anyone know what happened to the suggestion about query manipulation? e.g., query (about us) => query("about us", "aboutus") query(credit card) => query("credit card", "creditcard") Regards, -h - Original Message
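The query manipulation suggested above can be sketched in plain Java. This is only an illustration under assumptions: the class name QueryVariants and the method expand() are hypothetical, and no Lucene API is used. It produces the original phrase plus a run-together variant, which a caller could then combine into an OR query.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryVariants {

    /**
     * Returns the phrase itself and, for multi-word phrases, a second
     * variant with the spaces removed ("about us" -> "aboutus").
     * Hypothetical helper, not part of Lucene.
     */
    public static List<String> expand(String phrase) {
        // Collapse runs of whitespace so "about   us" behaves like "about us".
        String normalized = phrase.trim().replaceAll("\\s+", " ");
        List<String> variants = new ArrayList<>();
        variants.add(normalized);

        String joined = normalized.replace(" ", "");
        if (!joined.equals(normalized)) {
            // Only add the joined form when it differs from the original.
            variants.add(joined);
        }
        return variants;
    }

    public static void main(String[] args) {
        System.out.println(expand("about us"));    // prints [about us, aboutus]
        System.out.println(expand("credit card")); // prints [credit card, creditcard]
    }
}
```

A single-word query comes back unchanged, so the expansion is safe to apply to every incoming query.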

Re: lucene 2.4.1 : document in index but not returned in search

2009-10-02 Thread N Hira
Which analyzer do you use in luke? The general practice is to use the same analyzer for indexing and searching. Good luck. -h - Original Message From: Rathinapriya Nagalingam To: java-user@lucene.apache.org Sent: Friday, October 2, 2009 10:51:42 AM Subject: lucene 2.4.1 : document i
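To see why the same analyzer should be used for indexing and searching, here is a toy sketch in plain Java. It is not Lucene code: the class AnalyzerMismatchDemo and the two "analyzers" (simple functions from text to a term) are hypothetical stand-ins, assuming one side lowercases and the other does not.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class AnalyzerMismatchDemo {

    // Hypothetical stand-ins for analyzers: each maps raw text to the
    // term that is stored in, or looked up from, the index.
    public static final Function<String, String> LOWERCASING =
            text -> text.toLowerCase(Locale.ROOT);
    public static final Function<String, String> VERBATIM = text -> text;

    // Toy inverted index: term -> ids of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    public void index(int docId, String text, Function<String, String> analyzer) {
        postings.computeIfAbsent(analyzer.apply(text), k -> new HashSet<>()).add(docId);
    }

    public Set<Integer> search(String text, Function<String, String> analyzer) {
        return postings.getOrDefault(analyzer.apply(text), Collections.emptySet());
    }

    public static void main(String[] args) {
        AnalyzerMismatchDemo demo = new AnalyzerMismatchDemo();
        demo.index(1, "PHILIPCIMIANO", LOWERCASING);

        // Different analyzer at search time: the stored term is
        // "philipcimiano", the looked-up term is "PHILIPCIMIANO".
        System.out.println(demo.search("PHILIPCIMIANO", VERBATIM));    // prints []

        // Same analyzer at search time: the terms agree.
        System.out.println(demo.search("PHILIPCIMIANO", LOWERCASING)); // prints [1]
    }
}
```

The document is in the index in both cases; only the term produced at search time decides whether it is found, which matches the "in index but not returned" symptom.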

Re: Time of processing hits.doc()

2007-11-18 Thread N. Hira
Can you explain the problem you're trying to address from the user's perspective? From the description you've provided, you may want to look up "Faceted Searching". Another option may be to use a HitCollector, but it would help us if you could describe the problem at a higher level. Re

Re: StopWords problem

2007-12-27 Thread N. Hira
Hi Liaqat, Are you sure that the Urdu characters are being correctly interpreted by the JVM even during the file I/O operation? I would expect Unicode characters to be encoded as multi-byte sequences and so, the string-matching operations would fail (if the literals are different from the

Re: [ANN] Luke 0.8 released

2008-02-04 Thread N. Hira
Thank you for this. Luke has been *extremely* helpful. -h -- Hira, N.R. Solutions Architect Cognocys, Inc. On 04-Feb-2008, at 10:17 PM, Andrzej Bialecki wrote: Hi all, I just released Luke 0.8, the Lucene Index Toolbox. As

Re: Search Order

2008-05-05 Thread N. Hira
Please review: http://wiki.apache.org/lucene-java/LuceneFAQ I suspect your question is answered as: How do I make sure that a match in a document title has greater weight than a match in a document body? -h -- Hira, N.

Re: Transaction semantics in Document addition

2008-05-19 Thread N Hira
How about an attribute (fullyIndexed=true/false) to keep track of whether the indexing was successful? We used a similar attribute for a similar problem, but stored it in the accompanying database instead. -h - Original Message From: Michael McCandless <[EMAIL PROTECTED]> To: java-

Re: Transaction semantics in Document addition

2008-05-19 Thread N Hira
ics in Document addition In your scenario, it might work, but I wonder how you generate hits, excluding the fullyindexed=false. -Original Message- From: N Hira [mailto:[EMAIL PROTECTED] Sent: 19 May 2008 18:31 To: java-user@lucene.apache.org Subject: Re: Transaction semantics in Do

Re: searching for C++

2008-06-24 Thread N. Hira
This isn't ideal, but if you have a defined list of such terms, you may find it easier to filter these terms out into a separate field for indexing. -h -- Hira, N.R. Solutions Architect Cognocys, Inc. (773) 251-7453 On 24-Ju

Re: Similarity percentage between two Strings

2008-09-03 Thread N. Hira
I don't know how much of this is a Lucene problem, but -- as I'm sure you will inevitably hear from others on the list -- it depends on what your definition of "similar" is. By similar, do you mean: 1. Identical, except for variations in case (upper/lower) 2. Allow 1., but also allow prefix

Re: Similarity percentage between two Strings

2008-09-03 Thread N. Hira
pen Source. For Life. N. Hira wrote: I don't know how much of this is a Lucene problem, but -- as I'm sure you will inevitably hear from others on the list -- it depends on what your definition of "similar" is. By similar, do you mean: 1. Identical, except for variations

Re: Lucene Memory Leak

2008-09-05 Thread N. Hira
I'm not an expert, so please take this with a grain of salt, but if you return the Hits object, you are inadvertently "holding on" to that IndexSearcher, right? According to the FAQ (http://wiki.apache.org/lucene-java/ImproveSearchingSpeed), iterating over all Hits will result in addition

Re: Search all Related Documents

2008-09-18 Thread N. Hira
You can search the lucene and solr mailing lists for "denormalize" but the general response is to try one of: 1. de-normalize the data while indexing - advantage: one query - disadvantage: data repetition 2. use 2 indices - advantage: no need for repetition; this is necessa

Re: Link map over results? or term freq

2008-10-16 Thread N. Hira
I think I understand what you're describing as a "link map" to be a "tag cloud" where each tag is a "frequent" or "strong" term. We did something like this as an experiment (without Lucene): http://www.cognocys.com/prospector/news.html If you're talking about something similar, then I think yo

Re: How to get a apache public license

2009-12-23 Thread N Hira
To you as well, Weiwei Wang. You can theoretically release your project under a license that is very similar to the Apache license at any time, presuming you are licensing rights related to your project. To create a project that is maintained by the Apache Software Foundation, you should proba

Re: If you could have one feature in Lucene...

2010-02-25 Thread N. Hira
I think it speaks to the maturity of the project ... Lucene has solved some of the easier problems in the problem space and the ones that remain are ... difficult. I recently introduced Lucene/Nutch to a group of ~10 relatively capable Java developers. While they find it easy to use, they

Re: Searching Subversion comments:

2010-03-08 Thread N Hira
I use "svn diff --change " to get the list of files associated with a given commit. You might also want to look at http://freshmeat.net/projects/svnweb/ HTH -h From: Erick Erickson To: java-user Sent: Mon, March 8, 2010 2:48:41 PM Subject: Searching Subversi

Re: Lucene Newbie Questions

2010-05-31 Thread N Hira
Frank -- Lucene can definitely do this stuff. This review of the Query Syntax might offer you some insight: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html Specifically, you can look up "Fuzzy Searches" and "Synonyms". There are a couple of key ways to handle synonyms, so you might

Re: Lucene Newbie Questions

2010-05-31 Thread N Hira
I'm already inside a Java-based web application it would seem like both SOLR and Lucene would be plausible. I'm curious what other factors I should know about in determining if SOLR or Lucene is right for me. Can SOLR be used within a web application (as a library) or is it only a standalone app? Fra

Re: Solr tutorial

2010-05-31 Thread N Hira
I don't know of a single tutorial that puts it all together, but the "rich documents" feature implemented in Solr-284 would be where I would start: https://issues.apache.org/jira/browse/SOLR-284 Look here if you're using Solr 1.4 -- it should address your needs: http://wiki.apache.org/solr/Extra

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread N. Hira
Where do you get your Lucene/Solr downloads from? [X] ASF Mirrors (linked in our release announcements or via the Lucene website) [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors

Re: Lucene Index of Oracle RDBMS

2007-01-08 Thread N. Hira
Max, We use a (customized version of) Lucene as part of our Cognocys IAM product, which is also available with Oracle RDBMS. I can tell you that the software is used at Medtronic, a global medical technology company. -h ---

Re: Fine Tuning Lucene implementation

2007-07-24 Thread N. Hira
Could you show us the relevant source from doBodySearch()? -h On Tue, 2007-07-24 at 19:58 -0400, Askar Zaidi wrote: > I ran some tests and it seems that the slowness is from Lucene calls when I > do "doBodySearch", if I remove that call, Lucene gives me results in 5 > seconds. otherwise it takes

Re: Fine Tuning Lucene implementation

2007-07-24 Thread N. Hira
String str = doc.get("item"); > int tmp = Integer.parseInt(str); > if(tmp==id) > score = hits.score(i); > } > > return score; > } > > I really need to optimize doBodySearch(...) as this t

Re: How to implement cut of score ?

2007-08-13 Thread N. Hira
Donna, If I understand the problem correctly, it is: given a [job description], find [candidates] that we would not otherwise find. That seems to be a "user-weighted similarity" problem more than a simple search problem. IOW: 1. Given a [job description], create a set of queries that look for

Re: Thoughts on Search Analytics?

2011-05-06 Thread N. Hira
On 06-May-2011, at 2:04 AM, Paul Libbrecht wrote: > > Le 6 mai 2011 à 00:20, Otis Gospodnetic a écrit : >>> thus far, only search-testing has provided some analytics measures for us >>> (precision and recall ones). We, of course, construct the test-suites from >>> the >>> logs. >> >> Inter

Re: Problems with Lucene

2006-06-01 Thread N Hira
Alberto, It might be helpful if you would provide the full stack-trace. We use Lucene with our web application like many other projects. I can assure you that there is no basic incompatibility, but you may indeed be experiencing something specific to your environment. -h

Re: JVM Crash

2006-06-13 Thread N Hira
We had a similar problem. We discovered that it was basically that eden/from was out of memory and made two changes and that seems to have helped: 1. Reduce [Max]PermSize to 128M 2. Use the concurrent garbage collector Good luck. -h --- Ross Rankin <[EMAIL PROTECTED]> wrote: > We keep gettin