RE: Optimize completely in memory with a FSDirectory?

2006-04-06 Thread Max Pfingsthorn
Hi, thanks for your suggestion. I thought about the same thing, but somehow it didn't seem like such a good idea... Now that I think about it, it would take the same I/O load (in terms of flushing many megabytes to disk) as optimizing in memory with the FSDirectory. Another weird thing we observed

nested phrase queries

2006-04-06 Thread Michael Dodson
Can phrase queries be nested the same way boolean queries can be nested? I want a user query to be translated into a boolean query (say, x AND (y OR z)), and I want those terms to be within a certain distance of each other (approximately within the same sentence, so the slop would be

Re: nested phrase queries

2006-04-06 Thread Erik Hatcher
On Apr 6, 2006, at 8:47 AM, Michael Dodson wrote: Can phrase queries be nested the same way boolean queries can be nested? Yes... using SpanNearQuery instead of PhraseQuery. I want a user query to be translated into a boolean query (say, x AND (y OR z)), and I want those terms to be
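A minimal sketch of the nesting Erik suggests, against the span API; the field name "contents", the literal terms x/y/z, and the slop of 10 are all placeholders:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spans.SpanNearQuery;
    import org.apache.lucene.search.spans.SpanOrQuery;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.SpanTermQuery;

    public class NestedSpans {
        // x near (y OR z), within 10 positions of each other, in any order
        public static SpanQuery build() {
            SpanQuery x = new SpanTermQuery(new Term("contents", "x"));
            SpanQuery yOrZ = new SpanOrQuery(new SpanQuery[] {
                new SpanTermQuery(new Term("contents", "y")),
                new SpanTermQuery(new Term("contents", "z"))
            });
            return new SpanNearQuery(new SpanQuery[] { x, yOrZ }, 10, false);
        }
    }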

Re: nested phrase queries

2006-04-06 Thread mark harwood
The XMLQueryParser in the contrib section also handles Spans (as well as a few other Lucene queries/filters not represented by the standard QueryParser). Here's an example of a complex query from the JUnit test: <?xml version="1.0" encoding="UTF-8"?> <SpanOr fieldName="contents"> <SpanNear slop="8"

*easy* way to perform range searches on numeric values

2006-04-06 Thread Bill Snyder
Hello, How can I configure Lucene to handle numeric range searches? (This question has been asked 100 times, I'm sure.) I've tried the suggestions on the SearchNumericalFields wiki page. This seems to work for simple queries. Searching for line:[1 to 10] gives me lines 1 thru 10 of the
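For reference, the workaround usually suggested for this (and roughly what the SearchNumericalFields page describes, if memory serves) is to zero-pad the values so their lexicographic order matches numeric order, and to build the RangeQuery programmatically from padded terms. A rough sketch, with the field name "line" and the padding width chosen arbitrarily (negative values would need extra handling):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.RangeQuery;

    public class PaddedRange {
        static final int WIDTH = 10;  // arbitrary width for this sketch

        // zero-pad so lexicographic order matches numeric order
        static String pad(long n) {
            StringBuffer buf = new StringBuffer(Long.toString(n));
            while (buf.length() < WIDTH) buf.insert(0, '0');
            return buf.toString();
        }

        // index time: one untokenized term per value
        static void addLineField(Document doc, long lineNumber) {
            doc.add(new Field("line", pad(lineNumber), Field.Store.YES, Field.Index.UN_TOKENIZED));
        }

        // search time: build the range from padded terms instead of relying on QueryParser
        static Query lineRange(long lo, long hi) {
            return new RangeQuery(new Term("line", pad(lo)), new Term("line", pad(hi)), true);
        }
    }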

Multiple Indexes Search

2006-04-06 Thread Yang Sun
Hi, Just wondering if there is any way to search two indexes with relations like in a relational database. For example, index1 has the fields pid and content; index2 has the fields cid, record, and pid. I want to search keyword1 in content and keyword2 in record, and they should
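Lucene itself has no joins, so a common workaround is a two-pass search: query the second index first, collect the matching pids, and turn them into a boolean clause for the first index. A sketch using the field names from the question (index paths and keywords are placeholders, and pid must be a stored field in index2); note that BooleanQuery's default limit of 1024 clauses makes this impractical when many pids match:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;

    public class TwoPassJoin {
        public static void main(String[] args) throws Exception {
            // pass 1: collect the pids whose "record" field matches keyword2
            IndexSearcher s2 = new IndexSearcher("path/to/index2");
            Hits h2 = s2.search(new TermQuery(new Term("record", "keyword2")));
            BooleanQuery pidClause = new BooleanQuery();
            for (int i = 0; i < h2.length(); i++) {
                pidClause.add(new TermQuery(new Term("pid", h2.doc(i).get("pid"))),
                              BooleanClause.Occur.SHOULD);
            }
            // pass 2: keyword1 in "content" AND any of the collected pids, against index1
            BooleanQuery join = new BooleanQuery();
            join.add(new TermQuery(new Term("content", "keyword1")), BooleanClause.Occur.MUST);
            join.add(pidClause, BooleanClause.Occur.MUST);
            Hits h1 = new IndexSearcher("path/to/index1").search(join);
            System.out.println(h1.length() + " joined hits");
        }
    }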

Re: StopAnalyzer and apostrophes

2006-04-06 Thread Marvin Humphrey
I wrote: It looks like StopAnalyzer tokenizes by letter, and doesn't handle apostrophes. So, the input "I don't know" produces these tokens: [don] [t] [know]. Is that right? It's not right. StopAnalyzer does tokenize letter by letter, but 't' is a stopword, so the tokens are:
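A quick way to check what an analyzer actually emits is to dump the token stream; a small sketch (the field name "contents" is arbitrary):

    import java.io.StringReader;
    import org.apache.lucene.analysis.StopAnalyzer;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;

    public class DumpTokens {
        public static void main(String[] args) throws Exception {
            TokenStream ts = new StopAnalyzer().tokenStream("contents",
                    new StringReader("I don't know"));
            for (Token t = ts.next(); t != null; t = ts.next()) {
                System.out.println(t.termText());  // 't' should be missing from the output
            }
        }
    }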

DateField vs DateTools

2006-04-06 Thread John Smith
Hi We are in the process of upgrading Lucene from 1.2 to 1.9. There used to be 2 methods in DateField.java in 1.2 public static String MIN_DATE_STRING() public static String MAX_DATE_STRING() This basically gave the minimum and the maximum dates we could index

Question related to using FieldCacheImpl

2006-04-06 Thread John Smith
Hi, I need to access the min and max values of a particular field in the index as soon as a searcher is initialized. I don't need them later. Looking at old newsgroup mails, I found a few recommendations. One was to keep the min and max fields external to the index. But this will not work

RE: Data structure of a Lucene Index

2006-04-06 Thread Dmitry Goldenberg
Ideally, I'd love to see an article explaining both in detail: the index structure as well as the merge algorithm...

Re: DateField vs DateTools

2006-04-06 Thread Daniel Naber
On Thursday, 06 April 2006 19:50, John Smith wrote: I have not drilled down into the implementation details too much, but what was the reason for getting rid of these methods in Lucene 1.9? There is no limit on the given dates in DateTools (within the limits of what Java's Calendar/Date
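Since DateTools strings sort lexicographically, the old MIN/MAX constants can be replaced by encoding whatever sentinel dates the application needs. A small sketch; the 1970/2038 boundaries below are arbitrary placeholders:

    import java.util.Calendar;
    import java.util.GregorianCalendar;
    import org.apache.lucene.document.DateTools;

    public class DateBounds {
        public static void main(String[] args) {
            Calendar min = new GregorianCalendar(1970, Calendar.JANUARY, 1);
            Calendar max = new GregorianCalendar(2038, Calendar.JANUARY, 1);
            String minTerm = DateTools.dateToString(min.getTime(), DateTools.Resolution.DAY);
            String maxTerm = DateTools.dateToString(max.getTime(), DateTools.Resolution.DAY);
            // prints something like 19700101 .. 20380101 (exact values depend on time zone)
            System.out.println(minTerm + " .. " + maxTerm);
        }
    }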

RE: Distributed Lucene.. - clustering as a requirement

2006-04-06 Thread Dmitry Goldenberg
I firmly believe that clustering support should be a part of Lucene. We've tried implementing it ourselves and so far have been unsuccessful. We tried storing Lucene indices in a database that is the back-end repository for our app in a clustered environment and could not overcome the

Re: nested phrase queries

2006-04-06 Thread Erik Hatcher
Seeing this worries me: we'll see users creating XML strings, then parsing them to get the desired query. I've seen this a lot with QueryParser, but it would be even more gross to see folks do this with the XML syntax. So, here's my community service message for the day: if you're

Re: Question related to using FieldCacheImpl

2006-04-06 Thread John Smith
Thank you. JS --- Yonik Seeley [EMAIL PROTECTED] wrote: On 4/6/06, John Smith [EMAIL PROTECTED] wrote: // inherit javadocs public String[] getStrings(IndexReader reader, String field) The string array I get back, is it guaranteed that the first non-null value I encounter in
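For the original min/max question, the usual suggestion is to walk the term enumeration rather than the FieldCache, since terms for a field come back in sorted order. A sketch, with the index path and field name as placeholders:

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermEnum;

    public class MinMax {
        public static void main(String[] args) throws Exception {
            IndexReader reader = IndexReader.open("path/to/index");
            String field = "price";
            String min = null, max = null;
            // terms() positions the enum at the first term of this field;
            // terms are sorted, so the first is the min and the last is the max
            TermEnum terms = reader.terms(new Term(field, ""));
            while (terms.term() != null && terms.term().field().equals(field)) {
                if (min == null) min = terms.term().text();
                max = terms.term().text();
                if (!terms.next()) break;
            }
            terms.close();
            reader.close();
            System.out.println("min=" + min + " max=" + max);
        }
    }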

Re: Distributed Lucene.. - clustering as a requirement

2006-04-06 Thread Chris Lamprecht
What about using lucene just for searching (i.e., no stored fields except maybe one ID primary key field), and using an RDBMS for storing the actual documents? This way you're using lucene for what lucene is best at, and using the database for what it's good at. At least up to a point -- RDBMSs
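A sketch of the split Chris describes: Lucene stores only an untokenized id plus the searchable text, and each hit's id is used to fetch the real row from the database. The JDBC URL, table, and column names here are made up for the example:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;

    public class SearchThenFetch {
        public static void main(String[] args) throws Exception {
            IndexSearcher searcher = new IndexSearcher("path/to/index");
            Query q = new QueryParser("content", new StandardAnalyzer()).parse("some words");
            Hits hits = searcher.search(q);

            Connection db = DriverManager.getConnection("jdbc:...", "user", "pass");
            PreparedStatement ps = db.prepareStatement("SELECT body FROM documents WHERE id = ?");
            for (int i = 0; i < hits.length(); i++) {
                ps.setString(1, hits.doc(i).get("id"));  // only the id is stored in Lucene
                ResultSet rs = ps.executeQuery();
                if (rs.next()) {
                    // render rs.getString("body") however the application needs
                }
                rs.close();
            }
            ps.close();
            db.close();
            searcher.close();
        }
    }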

RE: Distributed Lucene.. - clustering as a requirement

2006-04-06 Thread Dmitry Goldenberg
I think it's a good idea. For an enterprise-level application, Lucene appears to be too file-system-centric and too byte-sequence-centric a technology. Just my opinion. The Directory API is just too low-level. I'd be OK with an RDBMS-based Directory implementation I could take and use. But generally, I

Question about Lucene's search algorithm

2006-04-06 Thread inge santoso
Hi all, I'm still new to Lucene. I'm in the last year of my bachelor's degree in Computer Science. My final thesis is about indexing and searching in Lucene 1.4.3. I've read the “Space Optimizations for Total Ranking” paper. My main question is: 1. What search

doc.get(contents)

2006-04-06 Thread miki sun
Dear all, I got a java.lang.NullPointerException at java.io.StringReader.<init>(StringReader.java:33) when processing the following code: for (int i = 0; i < theHits.length(); i++) { Document doc = theHits.doc(i); String contents = doc.get("contents"); TokenStream tokenStream =
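A guess at the cause, for what it's worth: doc.get("contents") returns null when the field was indexed but not stored, and passing that null to StringReader throws exactly this NullPointerException. A defensive rewrite of the loop (the analyzer variable is assumed from the elided part of the original code):

    for (int i = 0; i < theHits.length(); i++) {
        Document doc = theHits.doc(i);
        String contents = doc.get("contents");
        if (contents == null) {
            // field was indexed but not stored (Field.Store.NO at index time)
            continue;
        }
        // "analyzer" stands in for whatever Analyzer the original code uses
        TokenStream tokenStream = analyzer.tokenStream("contents", new StringReader(contents));
        // ... use tokenStream as before ...
    }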

Getting count of documents matching a query?

2006-04-06 Thread Tom Hill
Hi - Is there a fast way (not easy, but speedy) of getting the count of documents that match a query? I need the count, and don't need the docs at this point. If I had a simple query, (e.g. book) I can use docFreq(), and it's lightning fast. If I just run it as a query it's much slower. I'm

Re: Getting count of documents matching a query?

2006-04-06 Thread Chris Hostetter
: I need the count, and don't need the docs at this point. If I had a : simple query, (e.g. book) I can use docFreq(), and it's lightning : fast. If I just run it as a query it's much slower. I'm just : wondering if I did a custom scorer / similarity / hitcollector, how : much faster than a query
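For reference, one way to get a bare count without building Hits is a HitCollector that only increments a counter (a sketch against the 1.9-era API; whether it beats docFreq() for single terms is another matter):

    import org.apache.lucene.search.HitCollector;

    public class CountCollector extends HitCollector {
        public int count = 0;
        public void collect(int doc, float score) {
            count++;  // ignore the score, just tally matching docs
        }
    }

    // usage:
    // CountCollector counter = new CountCollector();
    // searcher.search(query, counter);
    // int numHits = counter.count;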

Re: highlighting - fuzzy search

2006-04-06 Thread Daniel Noll
Fisheye wrote: HashSet terms = new HashSet(); query.rewrite(reader).extractTerms(terms); Ok, but this delivers every term, not just a list of words the Levenshtein algorithm produced with similarity. I asked a similar thing in the past about term highlighting in general,
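One way to narrow this down is to rewrite only the fuzzy clause rather than the whole query, and extract terms from that; a sketch, where fuzzyQuery and reader stand for the FuzzyQuery and IndexReader already in use:

    // assumes: import java.util.*; import org.apache.lucene.index.Term;
    Query expanded = fuzzyQuery.rewrite(reader);  // a BooleanQuery over the terms the expansion matched
    Set terms = new HashSet();
    expanded.extractTerms(terms);
    for (Iterator it = terms.iterator(); it.hasNext();) {
        Term t = (Term) it.next();
        System.out.println(t.text());             // only words produced by the Levenshtein expansion
    }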

Re: StopAnalyzer and apostrophes

2006-04-06 Thread Daniel Noll
Marvin Humphrey wrote: I wrote: It looks like StopAnalyzer tokenizes by letter, and doesn't handle apostrophes. So, the input "I don't know" produces these tokens: [don] [t] [know]. Is that right? It's not right. StopAnalyzer does tokenize letter by letter, but 't' is a stopword, so

Re: StopAnalyzer and apostrophes

2006-04-06 Thread Marvin Humphrey
On Apr 6, 2006, at 4:23 PM, Daniel Noll wrote: Marvin Humphrey wrote: I wrote: It looks like StopAnalyzer tokenizes by letter, and doesn't handle apostrophes. So, the input "I don't know" produces these tokens: [don] [t] [know]. Is that right? It's not right. StopAnalyzer does

Calling addDocument twice for the same document

2006-04-06 Thread Daniel Noll
Hi all. I have a situation where a Document is constructed with a bunch of strings and a couple of readers. An error may occur while reading from the readers, and in these situations, we want to remove the reader and then try to index the same document again. I've made a test case which
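A sketch of the retry idea: a Reader-backed field can only be consumed once, so the retry has to rebuild the Document with fresh readers (or without the reader fields). buildDocument and openReaders below are hypothetical helpers standing in for however the Document is assembled in the real code:

    // assumes: IndexWriter writer; import java.io.IOException;
    Document doc = buildDocument(strings, openReaders());
    try {
        writer.addDocument(doc);
    } catch (IOException e) {
        // the readers may be partially consumed or broken at this point, so
        // build a fresh Document and drop (or reopen) the reader-backed fields
        Document retry = buildDocument(strings, null);
        writer.addDocument(retry);
    }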