Hi Doron,
Thank you very much for your time and for the detailed explanations. This is
exactly what I meant and I am happy to see I understood correctly.
I am now using the Snowball analyzer, which seems to work very well.
Thanks again and good day,
Barak Hecht.
If mergeFactor is set to 2 and no optimize() is ever done on the index,
what is the impact on
1) the number of opened files during indexing
2) the number of opened files during searching
3) the search speed
4) the indexing speed
??
HC
---
I would say that, over time, the number of files will grow, and continue
growing if you never perform
an optimize(). After some very helpful advice from Erick I settled on a
mergeFactor of 30, and since I do the indexing in large batches, I perform
an optimize() only at the end of the indexi
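For intuition on why a low mergeFactor keeps the file count small: Lucene's logarithmic merge policy merges every mergeFactor same-sized segments into one larger segment, so after n flushes roughly "the sum of the digits of n in base mergeFactor" segments remain. A toy simulation of just that merge arithmetic (this is not Lucene code, only a model of the policy):

```java
import java.util.HashMap;
import java.util.Map;

public class MergeSim {
    // Toy model of a logarithmic merge policy: each flush creates a
    // size-1 segment; whenever mergeFactor equal-sized segments exist,
    // they are merged into one segment mergeFactor times as large.
    static int segmentsAfter(int flushes, int mergeFactor) {
        Map<Long, Integer> bySize = new HashMap<Long, Integer>(); // size -> count
        for (int i = 0; i < flushes; i++) {
            long size = 1L;
            while (true) {
                int n = bySize.getOrDefault(size, 0) + 1;
                if (n < mergeFactor) { bySize.put(size, n); break; }
                bySize.put(size, 0);   // merge the full level away...
                size *= mergeFactor;   // ...carrying into one bigger segment
            }
        }
        int total = 0;
        for (int c : bySize.values()) total += c;
        return total;
    }

    public static void main(String[] args) {
        // After 1000 flushes: mergeFactor 2 leaves 6 segments (the
        // binary digit sum of 1000), mergeFactor 30 leaves 14.
        System.out.println(segmentsAfter(1000, 2));  // 6
        System.out.println(segmentsAfter(1000, 30)); // 14
    }
}
```

So a small mergeFactor keeps few files open but merges (and thus re-writes data) far more often, which is the indexing-speed trade-off being asked about.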
Dawid Weiss wrote:
> You could also try splitting the document into paragraphs and use Carrot2's
> Lingo algorithm (www.carrot2.org) on a paragraph-level to extract clusters.
> Labelling routine in Lingo should extract 'key' phrases; this analysis is
> heavily frequency-based, but... you know, y
On 5/7/07, Doron Cohen <[EMAIL PROTECTED]> wrote:
With a query parser set to allowLeadingWildcard, this should do:
( +item -price:* ) ( +item +price:[0100 TO 0150] )
or, to avoid the too-many-clauses risk:
( +item -price:[MIN TO MAX]) ( +item +price:[0100 TO 0150] )
where MIN and MAX cover (at least)
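A side note on the zero-padded bounds (0100, 0150): Lucene's text range queries compare terms lexicographically, so numeric values must be indexed in a fixed width or "99" would sort after "150". A quick pure-Java check of that ordering (the pad helper is illustrative, not a Lucene API):

```java
public class PaddedRange {
    // Lucene's [A TO B] ranges compare terms as strings, so numbers
    // must be encoded in a fixed width to sort numerically.
    static String pad(int price) {
        return String.format("%04d", price);
    }

    public static void main(String[] args) {
        // Unpadded, "99" sorts after "150" lexicographically:
        System.out.println("99".compareTo("150") > 0);        // true
        // Padded, string order matches numeric order:
        System.out.println(pad(99).compareTo(pad(150)) < 0);  // true
        System.out.println("price:[" + pad(100) + " TO " + pad(150) + "]");
    }
}
```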
Hi Arsen,
I've seen another commercial one from a company called Connexor
(www.connexor.com) . It has a decent part-of-speech tagger that could be
used in keyphrase extraction with some heuristics on top of it.
-vishal.
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
S
The contrib/benchmark addition can help you characterize many of
these scenarios, especially if you write a DocMaker and QueryMaker
for your collection.
On May 8, 2007, at 5:30 AM, Stadler Hans-Christian wrote:
If mergeFactor is set to 2 and no optimize() is ever done on the
index,
what i
Last week I sent a message with a question about FuzzyQuery. I am working with
Lucene 2.1.0.
I would like to retrieve documents containing the strings
"société américaine" and "sociétés américaines"
using a fuzzy query on the string "société américain".
I created a method "getDocuments" and I
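One thing worth checking here: assuming a tokenizing analyzer, "société américaine" is indexed as two separate terms, and FuzzyQuery compares the query term against each indexed term using an edit-distance similarity (in Lucene 2.1, roughly 1 - distance / min(length), with a default threshold of 0.5). A self-contained check that both the singular and plural forms clear that threshold:

```java
public class FuzzySim {
    // Levenshtein edit distance, the measure FuzzyQuery is based on.
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        return d[a.length()][b.length()];
    }

    // FuzzyQuery-style similarity: 1 - distance / min(len, len).
    static float similarity(String query, String term) {
        return 1f - (float) editDistance(query, term)
                  / Math.min(query.length(), term.length());
    }

    public static void main(String[] args) {
        // Both indexed terms clear the default 0.5 threshold:
        System.out.println(similarity("américain", "américaine"));  // ~0.89
        System.out.println(similarity("américain", "américaines")); // ~0.78
    }
}
```

So with the default minimum similarity, a query like américain~ should match both forms; if it does not, the analyzer (accent or stemming behavior) is the first thing to inspect.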
Here is a very good tool for keyphrase extraction. It is GNU-licensed and
easy to integrate with Lucene.
http://www.paynter.info/academia/Kea.php
best
jose
On 5/8/07, Bill Janssen <[EMAIL PROTECTED]> wrote:
Dawid Weiss wrote:
> You could also try splitting the document into paragraphs and use Carr
I have a use case, in which I need to select the Analyzer based on a Locale.
For example:
"nl" => DutchAnalyzer
"nl_BE" => DutchAnalyzer
"fr" => FrenchAnalyzer
"foobar" => StandardAnalyzer (fallback)
I was wondering if lucene has any sort of "AutomaticAnalyzerResolver"
class that could do this f
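A minimal sketch of such a resolver (the class and method names here are hypothetical; real code would register and return Analyzer instances rather than the name strings used to keep the sketch self-contained). The key piece is the fallback chain: exact locale, then bare language, then the default:

```java
import java.util.HashMap;
import java.util.Map;

public class AnalyzerResolver {
    // Hypothetical registry; real code would map to Analyzer instances
    // (DutchAnalyzer, FrenchAnalyzer, ...) instead of names.
    private final Map<String, String> registry = new HashMap<String, String>();
    private final String fallback;

    AnalyzerResolver(String fallback) { this.fallback = fallback; }

    void register(String locale, String analyzer) { registry.put(locale, analyzer); }

    // Try the full locale ("nl_BE"), then the bare language ("nl"),
    // then fall back to the default.
    String resolve(String locale) {
        String found = registry.get(locale);
        if (found == null && locale.indexOf('_') >= 0) {
            found = registry.get(locale.substring(0, locale.indexOf('_')));
        }
        return found != null ? found : fallback;
    }

    public static void main(String[] args) {
        AnalyzerResolver r = new AnalyzerResolver("StandardAnalyzer");
        r.register("nl", "DutchAnalyzer");
        r.register("fr", "FrenchAnalyzer");
        System.out.println(r.resolve("nl_BE"));  // DutchAnalyzer
        System.out.println(r.resolve("foobar")); // StandardAnalyzer
    }
}
```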
There is nothing canned that I know of. I'm also not sure how this
would be used. If you're using a single index, how are you going
to index, then search using these analyzers? Or is there some
other magic going on?
Consider your document with a field "text". If you index into this
field with dif
Hi,
- Original Message
From: Stadler Hans-Christian <[EMAIL PROTECTED]>
If mergeFactor is set to 2 and no optimize() is ever done on the index,
what is the impact on
1) the number of opened files during indexing
OG: it will grow a little, but frequently go down as Lucene merges segments
: There is nothing canned that I know of. I'm also not sure how this
: would be used. If you're using a single index, how are you going
: to index, then search using these analyzers? Or is there some
: other magic going on?
i suspect the use case is "shipped" software product, where you want to
h
I don't understand why I'm getting the results I'm getting.
If I search for "pandock*" I get 6 results
Np-pandock
Np-pandock-L
Np-pandock-1
Np-pandock-2
Np-pandock
Np-pandock-L1
If I search for np-pandock I get
Np-pandock
Np-pandock-L
If I search for pandock I get
Np-pandock
First question: What analyzers are you using at index and search time?
Second question: Have you tried using query.toString() to see how
the query parses? If so, you should post the results.
Third question: Have you used Luke to examine your index, to see
what's actually in there (which may surp
Mark Miller wrote:
The only commercial options that I have seen do not have a web presence
(that I know of or can find) and I don't recall the company names (only
peripherally involved).
Are we talking about Yahoo's buzz index and
Amazon's SIPs or CAPs?
I actually think the most interesting a
Here are the queries.
In my code I have:

    System.out.println("luceneQuery: " + luceneQuery);
    query = MultiFieldQueryParser.parse(luceneQuery.toString(),
        IndexerExternal.CARTABLE, IndexerExternal.getAnalyzer());
    System.out.printl
On Tuesday 08 May 2007 23:42, John Powers wrote:
> I've had problems with luke in the past not being able to read the
> files.
Just make sure you specify the directory, not the files when opening an
index with Luke. Also use the latest version (0.7).
Regards
Daniel
--
http://www.danielnaber
I am indexing documents periodically every hour. I have a scenario.
For example, when you are indexing every hour and a large document set
is present, it takes >1 hr to index the documents. Now you are
already behind indexing for the next hour. How do you design
something that is robust?
thanks.
Don't do it that way? Is this an actual or a theoretical
scenario? And do you reasonably expect it to become actual?
Otherwise, why bother?
And you've got other problems here. If you're indexing that
much data, you'll soon outgrow your disk. Unless you're
replacing most of the documents.
But assu
: For example, when you are indexing every hour and a large document set
: is present, it takes >1 hr to index the documents. Now you are
: already behind indexing for the next hour. How do you design
: something that is robust?
fundamentally, this question is really about issues in a producer/con
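One standard shape for that producer/consumer decoupling is a bounded queue between the document feed and a single indexer thread: the hourly job just enqueues, and backpressure (a full queue) throttles it instead of letting runs overlap. A minimal sketch, with the actual addDocument() call replaced by a counter so it stays self-contained:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class IndexPipeline {
    private static final String POISON = "__STOP__"; // shutdown marker

    // Feed 'docs' documents through a bounded queue to one indexer
    // thread; returns how many documents were "indexed".
    static int run(int docs) throws InterruptedException {
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(100);
        final int[] indexed = {0};

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    String doc;
                    while (!(doc = queue.take()).equals(POISON)) {
                        indexed[0]++; // stand-in for writer.addDocument(...)
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        consumer.start();

        // Producer: if the indexer falls behind, put() blocks rather
        // than letting unbounded hourly batches pile up in memory.
        for (int i = 0; i < docs; i++) queue.put("doc-" + i);
        queue.put(POISON);
        consumer.join(); // indexed[0] is safely visible after join()
        return indexed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(500)); // 500
    }
}
```

With this shape, "more than an hour of documents" simply means the queue stays busy across hour boundaries; nothing is lost and nothing overlaps.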
Hello everyone,
I am getting the following exception while searching:
Lock obtain timed out: java.io.IOException: Lock obtain timed out:
[EMAIL
PROTECTED]:\WINDOWS\TEMP\lucene-22e0ad3c019e26a6e2991b0e6ed97e1c-commit.lock
I have implemented MultiSearcher only, No other methods are updating/addi
How can I get the tag word score in Lucene? Suppose that you have searched a
tag word and 3 hit documents
are now found.
1 - How could someone find the number of occurrences in each document, so it
could be used to sort the results?
Also I want to have some other policies for ranking the results. What should
I do t
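On the occurrence-count part: if term vectors are stored at indexing time, Lucene can report per-document term frequencies (via TermFreqVector in the 2.x API). As a toy illustration of just the frequency-based part of ranking (real Lucene scoring also folds in idf, length norms, and boosts), sorting documents by raw term count looks like:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TfRank {
    // Raw occurrences of a (lowercased, whitespace-split) term in a doc.
    static int termFreq(String doc, String term) {
        int n = 0;
        for (String tok : doc.toLowerCase().split("\\s+"))
            if (tok.equals(term)) n++;
        return n;
    }

    // Order document indexes by descending term frequency.
    static Integer[] rank(final List<String> docs, final String term) {
        Integer[] order = new Integer[docs.size()];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return termFreq(docs.get(b), term) - termFreq(docs.get(a), term);
            }
        });
        return order;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "lucene search search search",
            "no match here",
            "search once");
        System.out.println(Arrays.toString(rank(docs, "search"))); // [0, 2, 1]
    }
}
```

For custom ranking policies beyond this, the usual Lucene answer is a custom Similarity or sorting on a stored field rather than post-processing hits by hand.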