date:20061003

Re: QueryParser syntax French Operator

2006-10-03 Thread Ronnie Kolehmainen

Wouldn't the easiest fix be to just alter the user's query string before passing it to QueryParser (moving the semantics of your search app outside of Lucene)? Something like: str.replaceAll(" ET ", " AND ").replaceAll(" OU ", " OR ").replaceAll(" SAUF ", " NOT ") Quoting Mark Miller <[EMAIL PROT
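
The pre-processing idea above can be sketched in plain Java roughly as follows. This is only an illustration under assumptions: the class name FrenchOperators and the method translate are made up for the example, and the use of regex word boundaries (instead of literal spaces) is a variation that is not from the thread. Note that it also rewrites ET/OU/SAUF inside quoted phrases, which may or may not be what you want.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Lucene): rewrites French boolean
// operators to the AND/OR/NOT keywords that QueryParser understands.
class FrenchOperators {
    private static final Map<String, String> OPERATORS = new LinkedHashMap<>();
    static {
        OPERATORS.put("ET", "AND");
        OPERATORS.put("OU", "OR");
        OPERATORS.put("SAUF", "NOT");
    }

    static String translate(String query) {
        String result = query;
        for (Map.Entry<String, String> e : OPERATORS.entrySet()) {
            // \b restricts the match to whole upper-case words,
            // so "ETAGE" and lower-case "et" are left untouched.
            result = result.replaceAll("\\b" + e.getKey() + "\\b", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // The translated string would then be handed to QueryParser.
        System.out.println(translate("chat ET chien SAUF oiseau"));
    }
}
```

The translated string is what you would pass to the parser, so the French keywords never reach the grammar in the .jj file.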

Re: Number Proximity Query

2006-10-03 Thread Chris Hostetter

: From my searches, there seems to be a FunctionQuery in Solr that can do this : type of query. But I am using pure Lucene, and trying to port Solr code over : (to create my own version of FunctionQuery) looks too complicated because of : code dependency on other Solr code such as ValueSource, et

Re: Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-03 Thread Scott

Hi, > Well, the first question is always "are you opening/closing your > IndexSearchers for each request on your remote machines?". This is always a > no-no. This is also a question for your single-searcher version. Yes I know, each search slave (RMI server) has a single instance of IndexSearc

Number Proximity Query

2006-10-03 Thread KEGan

Hi, Is there a way to query all numbers that are close to a particular number (query), and score by how close they are to that number (query)? To illustrate further, assume a document with a single field "num", and the value for this field can only be an integer. Now, let's say there are 3 docum
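
To make the requested behaviour concrete, here is a small plain-Java sketch of "score by closeness". It is not a Lucene query and not the FunctionQuery mentioned elsewhere in the thread; the scoring formula 1 / (1 + |value - query|), the class name NumberProximity and the sample numbers are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration: rank integer field values by distance to a query number.
class NumberProximity {
    // Assumed scoring function: 1.0 for an exact match, falling towards 0 with distance.
    static double score(int value, int query) {
        // Widen to long so the subtraction cannot overflow.
        return 1.0 / (1.0 + Math.abs((long) value - query));
    }

    // Returns a copy of the values, best match first.
    static List<Integer> rank(List<Integer> values, int query) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparingDouble((Integer v) -> -score(v, query)));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample values for the "num" field.
        System.out.println(rank(Arrays.asList(1, 10, 6), 5));
    }
}
```

Inside Lucene the same idea would have to live in the scoring path (for example a custom query or a function-style query), which is what the replies in this thread discuss.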

Re: QueryParser syntax French Operator

2006-10-03 Thread Mark Miller

Oh wouldn't we all. I want this too. Unfortunately, it's an elusive beast at best. As I am sure you know, JavaCC generates code based on the grammar and so it is very hard to alter the grammar after JavaCC'ing it. If you relax the 'add' part then you might be able to do something with QueryPars

Re: QueryParser syntax French Operator

2006-10-03 Thread Erik Hatcher

Currently AND/OR/NOT are hardcoded into the .jj file. A patch to make this configurable would be welcome! Erik On Oct 3, 2006, at 11:15 AM, Patrick Turcotte wrote: Hi, Is there a way to add / replace the text for the boolean operators used by the query parser? We would like to

Re: [Lucene 2.0]How to recover index?

2006-10-03 Thread Chris Lu

I don't think it can be recovered. It's better to validate the index files beforehand, or make sure only one thread is updating the index files and that the index is closed properly. Chris Lu -- http://www.dbsight.net Instant Lucene Search on Any Database/Application On 10/3

Re: Lucene scoring question (how to boost leading terms match)

2006-10-03 Thread Chris Hostetter

: does not pour affinity information into the score - i.e. both doc1 and doc2 : in your example would get the same score, and the SpanFirstQuery would only : allow you to limit the set of returned documents - Hoss, do you agree with : this? Oh ... hmmm ... i think you're right. SpanScorer scores

Re: Lucene scoring question (how to boost leading terms match)

2006-10-03 Thread Doron Cohen

If I understand the question, you do not want to boost in advance a certain doc, but rather score higher those documents containing the search term closer to the start of the document. There is more to define here - for instance, if doc1 has 5 words but doc2 has 1,000,000 words, would you still pr

java.io.IOException: term out of order --> HELP

2006-10-03 Thread Michael J. Prichard

We get this when trying to optimize index: Exception in thread "main" java.io.IOException: term out of order at org.apache.lucene.index.TermInfosWriter.add(TermInfosWriter.java:95) at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:305) at org.apache.lucene.index.SegmentM

Re: Lucene scoring question (how to boost leading terms match)

2006-10-03 Thread Chris Hostetter

Take a look at the SpanFirstQuery ... to do "A*" type searches inside of a SpanQuery you'll either need to use SpanRegexQuery, or roll your own SpanPrefixQuery out of a SpanOrQuery containing SpanTermQuery instances. : I have a question about the lucene scoring. In my following example, : how can I ensu

Re: MultiFieldQueryParser vs concatenated field

2006-10-03 Thread Chris Hostetter

: Well, as always, it depends ... My first thought is that I'd index things : in separate fields as it gives you more options. For instance, let's say : that you have name and phone fields and decide that the name field is more : important than the phone number. Your options for boosting "anything

Re: Search in HTML code

2006-10-03 Thread Erick Erickson

Sure, anything's possible. Whether Lucene is your best bet may be another question. But in this example, you're not using Lucene to do anything except store the strings. By storing all the data as UN_TOKENIZED, all you're doing is a regex match on the entire HTML text of each document. You might

Lucene scoring question (how to boost leading terms match)

2006-10-03 Thread qaz zaq

Hi, I have a question about Lucene scoring. In the following example, how can I ensure that doc1 has a higher score than doc2 if I search for "A*"? In other words, I want to boost the docs whose leading terms match. doc1: Aterm Bterm Cterm doc2: Bterm Aterm Cterm

native Java DB (eg, Derby) to store the index: performance comparision?..

2006-10-03 Thread Vladimir Olenin

Hi, I've been wondering if anyone has tried to compare the performance of any 'native' Java DB as an index storage mechanism vs the Lucene custom implementation? I'm assuming that DB products should provide some functionality for 'free' right out of the box (correct me if I'm wrong): - easily manageable

Lucene query syntax description in German

2006-10-03 Thread Aleksei Valikov

Hi folks, Does anybody have the description of Lucene query syntax in German? Thanks! Bye. /lexi

QueryParser syntax French Operator

2006-10-03 Thread Patrick Turcotte

Hi, Is there a way to add / replace the text for the boolean operators used by the query parser? We would like to replace (or even better, add), "AND", "OR" and "NOT" by "ET", "OU" and "SAUF". Is there a way to configure the QueryParser to do it? We know we could always modify QueryParser.jj to

Re: Search in HTML code

2006-10-03 Thread John Bugger

My crawler indexes crawled pages with this code: Document doc = new Document(); doc.add(new Field("body", page.getHtmlData(), Store.YES, Index.UN_TOKENIZED)); doc.add(new Field("url", page.getUrl(), Store.YES, Index.UN_TOKENIZED)); doc.add(new Field("title", page.getTitle(), Store.YES, Index.TO

Re: AW: get terms by positions

2006-10-03 Thread Grant Ingersoll

I should note, though, that we do this using the Lucene index, using the TermDocs, etc. On Oct 3, 2006, at 8:42 AM, Grant Ingersoll wrote: We often calculate co-occurrence information as an offline task and store it and then it is just a simple lookup at run time. You just have to put to

Re: AW: get terms by positions

2006-10-03 Thread Nicolas Lalevée

On Tuesday 03 October 2006 12:14, Renzo Scheffer wrote: > I try to get back a list of all left or right neighbours of a searchterm. > Then I will count them to get back the Information, how often a specific > word is used as neighbour of the searchterm. I know that the results are > variable accor

Re: Indexing In Lucene

2006-10-03 Thread Nicolas Lalevée

On Tuesday 03 October 2006 14:27, Nicolas Lalevée wrote: > On Tuesday 03 October 2006 12:06, W.H. van Atteveldt wrote: > > I don't know what you're doing but the to: header is empty in your email > > which is really annoying (since I rely on the to: to sort my mail) > > Strange. Looking to the so

Re: AW: get terms by positions

2006-10-03 Thread Grant Ingersoll

We often calculate co-occurrence information as an offline task and store it and then it is just a simple lookup at run time. You just have to put together the appropriate loops based on the window size that you want for any given term. Probably not efficient if your index is changing a l
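
The "appropriate loops based on the window size" could look roughly like the sketch below. It works on an already tokenized list of strings rather than on the Lucene index (the thread mentions doing it via TermDocs, which is not shown here); the class name CoOccurrence and the method neighbours are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical offline co-occurrence counter over a token list.
class CoOccurrence {
    // Counts every token that appears within `window` positions
    // to the left or right of each occurrence of `term`.
    static Map<String, Integer> neighbours(List<String> tokens, String term, int window) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(term)) {
                continue;
            }
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (j != i) {
                    counts.merge(tokens.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "c", "b", "d");
        // window = 1 gives the direct left/right neighbours of "b".
        System.out.println(neighbours(tokens, "b", 1));
    }
}
```

With a window of 1 this yields the direct left and right neighbours asked about in the original question; the resulting map is what you would store for the run-time lookup.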

Re: Indexing In Lucene

2006-10-03 Thread Nicolas Lalevée

On Tuesday 03 October 2006 12:06, W.H. van Atteveldt wrote: > I don't know what you're doing but the to: header is empty in your email > which is really annoying (since I rely on the to: to sort my mail) Strange. Looking at the source of Ajani's mail, there is : To: And my filter worked : I put

Re: I need your opinion about working with big index and frecuently updates

2006-10-03 Thread Enrique Lamas

Thank you very much Erik, I'll think about it and will do some tests. Bye - Original Message - From: "Erick Erickson" <[EMAIL PROTECTED]> To: Sent: Tuesday, October 03, 2006 1:42 PM Subject: Re: I need your opinion about working with big index and frecuently updates Think about In

Re: MultiFieldQueryParser vs concatenated field

2006-10-03 Thread Erick Erickson

Well, as always, it depends ... My first thought is that I'd index things in separate fields as it gives you more options. For instance, let's say that you have name and phone fields and decide that the name field is more important than the phone number. Your options for boosting "anything in the

Re: I need your opinion about working with big index and frecuently updates

2006-10-03 Thread Erick Erickson

Think about IndexModifier to change your index, although the documentation does state that it's better to batch your deletions together and batch your additions together if possible. 100Mb is not, in my experience, a very big index, so I really don't anticipate many problems. Do note that you can

Re: Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-03 Thread Erick Erickson

Well, the first question is always "are you opening/closing your IndexSearchers for each request on your remote machines?". This is always a no-no. This is also a question for your single-searcher version. What is your performance if you only go to one server? I'd start by finding out what happen

MultiFieldQueryParser vs concatenated field

2006-10-03 Thread Volodymyr Bychkoviak

In my application I need to implement search across several fields. Which is the better approach in terms of relevance scoring: index in separate fields and search using MultiFieldQueryParser, or index everything as a concatenated field and search using this field? Thanks in advance. -- regards, Volody

Indexing In Lucene

2006-10-03 Thread Ajani, Akil \(Cognizant\)

Hi, Can anyone tell me how indexing takes place in Lucene (in depth)? I will be thankful if anyone can help me. Thanks & Regards, Akil Ajani Cognizant Technology Solutions India Pvt. Ltd. Plot # 26, Rajiv Gandhi Infotech Park, MIDC Hinjewadi, Pune 411057 Tel: (91) (20) 40201100 e

Re: [Lucene 2.0]How to recover index?

2006-10-03 Thread zhu jiang

Can anyone help me??? 2006/10/3, zhu jiang <[EMAIL PROTECTED]>: Hi all, In some situations, index files may throw a "read past EOF" exception so that the index cannot be used any more. I wonder how to recover the index files in such a situation? -- Thanks, Jiang -- Thanks, Jiang

Re: DateTools again

2006-10-03 Thread Volodymyr Bychkoviak

Thanks for the detailed explanation. John Haxby wrote: Volodymyr Bychkoviak wrote: The user has an input (JavaScript calendar) on the page where he can choose a date to include in the search. Search resolution is day resolution. If the user enters the same date at a different time of day he will get different

Re: DateTools again

2006-10-03 Thread John Haxby

Volodymyr Bychkoviak wrote: The user has an input (JavaScript calendar) on the page where he can choose a date to include in the search. Search resolution is day resolution. If the user enters the same date at a different time of day he will get different results (because the calendar will also set the current hour

AW: get terms by positions

2006-10-03 Thread Renzo Scheffer

I am trying to get a list of all left or right neighbours of a search term. Then I will count them to find out how often a specific word is used as a neighbour of the search term. I know that the results vary according to the Analyzer/Filter used. It's just an experiment and fir

RE: Indexing In Lucene

2006-10-03 Thread W.H. van Atteveldt

I don't know what you're doing but the to: header is empty in your email which is really annoying (since I rely on the to: to sort my mail) > -Original Message- > From: Ajani, Akil (Cognizant) [mailto:[EMAIL PROTECTED] > Sent: Tuesday, 3 October 2006 10:47 > Subject: Indexing In Lucene > >

Indexing In Lucene

2006-10-03 Thread Ajani, Akil \(Cognizant\)

Hi, Can you tell me how indexing takes place in Lucene (in depth)? If a document has 1n indices, then which algorithm does it use, and which information retrieval model does it use... Thanks & Regards, Akil Ajani Cognizant Technology Solutions India Pvt. Ltd. Plot # 26, Rajiv Gandhi Infotech Park, MID

Indexing In Lucene

2006-10-03 Thread Ajani, Akil \(Cognizant\)

Hi, Can you tell me how indexing takes place in Lucene (in depth)? If a document has 1n indices, then which algorithm does it use, and which information retrieval model does it use... Thanks & Regards, Akil Ajani Cognizant Technology Solutions India Pvt. Ltd. Plot # 26, Rajiv Gandhi Infotech Park, MID

Re: DateTools again

2006-10-03 Thread Volodymyr Bychkoviak

Ok, I'll try to explain a bit. The user has an input (JavaScript calendar) on the page where he can choose a date to include in the search. Search resolution is day resolution. If the user enters the same date at a different time of day he will get different results (because the calendar will also set the current
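
One way to picture "day resolution" is to drop the time of day before the value is indexed or searched, as in the plain-Java sketch below. The class name DayResolution, the yyyyMMdd pattern and the choice of GMT are assumptions made for the example; Lucene's own DateTools class offers a day resolution for the same purpose, so check its Javadoc before rolling your own.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: reduces a Date to a day-resolution string.
class DayResolution {
    static String toDay(Date date) {
        // Locale.ROOT keeps digits and calendar independent of the default locale.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
        // A fixed time zone is an assumption; the same zone must be used
        // at index time and at search time, otherwise days will not line up.
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date morning = new Date(0L);               // 1970-01-01 00:00 GMT
        Date evening = new Date(20L * 3600 * 1000); // 1970-01-01 20:00 GMT
        // Both instants collapse to the same day string.
        System.out.println(toDay(morning) + " " + toDay(evening));
    }
}
```

Two instants on the same GMT day map to the same string, so the hour that the calendar widget adds no longer changes the result.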

I need your opinion about working with big index and frecuently updates

2006-10-03 Thread Enrique Lamas

Hi, I'm working with a 100Mb index. Because of application requirements, the indexed information is frequently updated, with plenty of modifications, deletions and additions. I think Lucene is a very powerful search tool once the index is created, but I'm not sure if update index frec

38 matches