date:20070724

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Dmitry

Askar, why do you need to add +id:? thanks, dt, www.ejinz.com search engine news forms - Original Message - From: "Askar Zaidi" <[EMAIL PROTECTED]> To: ; <[EMAIL PROTECTED]> Sent: Wednesday, July 25, 2007 12:39 AM Subject: Re: Fine Tuning Lucene implementation Hey Hira , Thanks so mu

RE: What replaced org.apache.lucene.document.Field.Text?

2007-07-24 Thread Liu_Andy2

Please refer to the FAQ entry "How do I get code written for Lucene 1.4.x to work with Lucene 2.x?": http://wiki.apache.org/lucene-java/LuceneFAQ#head-86d479476c63a2579e867b75d4faa9664ef6cf4d Andy -Original Message- From: Lindsey Hess [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 25, 2007 12:31 PM To

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

Hey Hira, Thanks so much for the reply. Much appreciated. Quote: Would it be possible to just include a query clause? - i.e., instead of just contents:, also add +id: How can I do that? I see my query as: +contents:harvard +contents:business +contents:review where the search phrase w
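One way to picture the suggestion quoted above is as plain string building: each word of the phrase becomes a required (+) clause on contents, and one more required clause pins the match to a single item. This is only a sketch. The class RequiredClauseQuery and its build method are hypothetical helpers, not part of Lucene, and the field name id is taken from the quoted suggestion; the actual index may call it something else. It also assumes the words contain no query-parser special characters, which would need escaping.

```java
import java.util.Locale;
import java.util.StringJoiner;

/** Hypothetical helper (not a Lucene class): builds a query string in query-parser syntax. */
final class RequiredClauseQuery {

    /**
     * Emits one required (+) clause per word of the phrase on the given field,
     * then appends a required clause that restricts the match to one item id.
     */
    static String build(String field, String phrase, String idField, int id) {
        StringJoiner query = new StringJoiner(" ");
        for (String word : phrase.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // Lower-casing mirrors the lower-cased terms shown in the thread.
                query.add("+" + field + ":" + word.toLowerCase(Locale.ROOT));
            }
        }
        query.add("+" + idField + ":" + id);
        return query.toString();
    }

    public static void main(String[] args) {
        // Prints: +contents:harvard +contents:business +contents:review +id:42
        System.out.println(build("contents", "Harvard Business Review", "id", 42));
    }
}
```

The resulting string could then be handed to whatever query parser the application already uses. For the extra clause to match, the id field would have to be indexed, which is an assumption about the index rather than something the thread states.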

What replaced org.apache.lucene.document.Field.Text?

2007-07-24 Thread Lindsey Hess

I'm trying to get some relatively old Lucene code to compile (please see below), and it appears that Field.Text has been deprecated. Can someone please suggest what I should use in its place? Thank you. Lindsey public static void main(String args[]) throws Exception {

Re: Fine Tuning Lucene implementation

2007-07-24 Thread N. Hira

I'm no expert on this (so please accept the comments in that context) but 2 things seem weird to me: 1. Iterating over each hit is an expensive proposition. I've often seen people recommending a HitCollector. 2. It seems that doBodySearch() is essentially saying, do this search and return the

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Grant Ingersoll

Inline below On Jul 24, 2007, at 8:14 PM, Askar Zaidi wrote: Sure. public float doBodySearch(Searcher searcher,String query, int id){ try{ score = search(searcher, query,id); } catch(IOException io){}

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Mark Miller

Are you sure you are using the same Searcher for every search? Don't open a new one unless you have modified the index. You are iterating over every hit with the Hits class. You don't ever want to do this. Use a HitCollector if you want to iterate over more than a hundred or so hits. You will f

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

Sure. public float doBodySearch(Searcher searcher,String query, int id){ try{ score = search(searcher, query,id); } catch(IOException io){} catch(ParseException pe){}

Re: Fine Tuning Lucene implementation

2007-07-24 Thread N. Hira

Could you show us the relevant source from doBodySearch()? -h On Tue, 2007-07-24 at 19:58 -0400, Askar Zaidi wrote: > I ran some tests and it seems that the slowness is from Lucene calls when I > do "doBodySearch", if I remove that call, Lucene gives me results in 5 > seconds. otherwise it takes

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

Shall I setMergeFactor = 2 ? Slow indexing is not a bother. On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote: > > I ran some tests and it seems that the slowness is from Lucene calls when > I do "doBodySearch", if I remove that call, Lucene gives me results in 5 > seconds. otherwise it takes ab

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

I ran some tests and it seems that the slowness is from Lucene calls when I do "doBodySearch", if I remove that call, Lucene gives me results in 5 seconds. otherwise it takes about 50 seconds. But I need to do Body search and that field contains lots of text. The field is . How can I optimize that

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Grant Ingersoll

Sorry, I mistyped. I don't mean the get methods, I mean the doTagSearch, doTitleSearch, etc. As for the stop watch, not really sure what to make of that... Try System.currentTimeMillis()... You can get just the fields you want when loading a Document by using the FieldSelector API on
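As a rough illustration of the System.currentTimeMillis() suggestion, the sketch below wraps a single call and reports the elapsed wall-clock time, so each search method can be timed separately instead of timing the whole request by hand. SearchTimer and timeMillis are hypothetical names, and the sleep is only a stand-in for a real call such as doBodySearch.

```java
/** Hypothetical timing helper: measures one call with System.currentTimeMillis(). */
final class SearchTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            try {
                // Stand-in for the real work, e.g. a call to doBodySearch(...).
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("stand-in search took " + elapsed + " ms");
    }
}
```

Timing doTagSearch, doTitleSearch and doBodySearch separately in this way would show which call accounts for most of the total.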

Query parsing?

2007-07-24 Thread Lindsey Hess

Hi, I'm building an application that needs to translate one query format into another. For example, my application generates the following query from a UI: ((title="Gone With The Wind") (title="Brave New World")) and internally I need to convert it into this format so that I can make a web s

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

Can someone please tell me how to cache results in Lucene ? I know the classes, but I don't know how to go about it. thanks, Askar On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote: > > Thanks for the reply. > > I am timing the entire search process with a stop watch, a bit ghetto > style. My get

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

Thanks for the reply. I am timing the entire search process with a stop watch, a bit ghetto style. My getXXX methods are: Document doc = hits.doc(i); String str = doc.get("item"); So you can see that I am retrieving the entire document in a search query. Ideally , I'd like to just retrieve the F

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Grant Ingersoll

Where are you getting your numbers from? That is, where are your timers? Are you timing the rs.next() loop, or the individual calls to Lucene? What do the getX methods look like? How big are your queries? How big is your index? Essentially, we need more info to really help you. Fr

Re: Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

I have 512MB RAM allocated to JVM Heap. If I double my system RAM from 768MB to say 2GB or so, and give JVM 1.5GB Heap space, will I get quicker results ? Can I expect results which take 1 minute to be returned in 30 seconds with more RAM ? Should I also get a more powerful CPU ? A real server cla

Lucene and Eastern languages (Japanese, Korean and Chinese)

2007-07-24 Thread Shaw, James

Hi, guys, I found Analyzers for Japanese, Korean and Chinese, but not stemmers; the Snowball stemmers only include European languages. Does stemming not make sense for ideograph-based languages (i.e., no stemming is needed for Japanese, Korean and Chinese)? Also for spell checking, does the defau

FieldCache for Search

2007-07-24 Thread Askar Zaidi

Hey Guys, From what I understand, FieldCache is used to store only the field required for search. I am using a Document object and then using doc.get("item"). One of my fields is HUGE, so using Document will slow things down. How can I use FieldCache? An example? thanks, AZ

Fine Tuning Lucene implementation

2007-07-24 Thread Askar Zaidi

Hey Guys, I just finished up using Lucene in my application. I have data in a database , so while indexing I extract this data from the database and pump it into the index. Specifically , I have the following data in the index: where itemID is just a number (primary key in the DB) tags : te

Re: ArrayIndexOutOfBoundsException on TermScorer

2007-07-24 Thread Rafael Rossini

Got it. I don't have a clue whether this corruption was caused by hardware failure, but that is possible because we suffer a lot of power failures from time to time. But the thing is that I've been using Lucene for a long time and I never got this kind of exception. The thing is that I'd l

Re: ArrayIndexOutOfBoundsException on TermScorer

2007-07-24 Thread Yonik Seeley

On 7/24/07, Rafael Rossini <[EMAIL PROTECTED]> wrote: I did a little debugging and found that in the TermScorer, the byte[] norms has size = 1.119.933, which is the number of docs in my index, and there is a docID = 1226511, that is, if the "doc" variable in the method is the docID. I tried to access t

Re: ArrayIndexOutOfBoundsException on TermScorer

2007-07-24 Thread Rafael Rossini

I did a little debugging and found that in the TermScorer, the byte[] norms has size = 1.119.933, which is the number of docs in my index, and there is a docID = 1226511, that is, if the "doc" variable in the method is the docID. I tried to access this document with reader.document() and got a * java.io

Re: Lucene 2.2 + Not Merging Segments

2007-07-24 Thread Harini Raghavan

I figured out the problem. The issue had nothing to do with Lucene 2.2. I had accidentally reset the default mergeFactor to 1000. This was the reason it was not merging the segments. With the default mergeFactor, the indexing is working perfectly fine. Thanks, Harini On 7/24/07, Michael McCandle

Re: Search for null

2007-07-24 Thread Jay Yu

daniel rosher wrote: Perhaps you can use a filter in the following way. -Create a filter (via QueryFilter) that would contain all documents that do not have null values for the field Interesting: what does the QueryFilter look like? Isn't it just as hard as finding out what docs have the null

Re: ArrayIndexOutOfBoundsException on TermScorer

2007-07-24 Thread Rafael Rossini

I don't know the exact date of the build, but it is certainly before July 4, and before the LUCENE-843 patch was committed. My index has 1.119.934 docs in it and is about 8.2G. I really don't know how to reproduce this; the only query for which I get this error, so far, is "brasil"... and I don't know

Re: Search for null

2007-07-24 Thread Erick Erickson

Nobody can answer that question; you have to test in your particular situation. Filters are very efficient to use once created, can be created once and used often, etc. Adding a special value to stand for an empty field is conceptually simple, and queries are straightforward. Unless you can dem

Re: ArrayIndexOutOfBoundsException on TermScorer

2007-07-24 Thread Michael McCandless

That looks spooky. It looks like either the norms array is not large enough or that docID is too large. Do you know how many docs you have in your index? Is this easy to reproduce, maybe on a smaller index? There was a very large change recently (LUCENE-843) to speed up indexing and it's possi

Re: Multiple Languages with Lucene (Arabic & English)

2007-07-24 Thread Erick Erickson

You'll also find lots of discussion about indexing multiple languages if you search the mail archive for things like multiple language. I think one thing you're missing is that Lucene indexes data however you tell it to. You have both total control over and total responsibility for how things are

ArrayIndexOutOfBoundsException on TermScorer

2007-07-24 Thread Rafael Rossini

Hello all, I'm using Solr in an app, but I'm getting an error that might be a Lucene problem. When I perform a simple query like q = brasil I'm getting this exception: java.lang.ArrayIndexOutOfBoundsException: 1226511 at org.apache.lucene.search.TermScorer.score(TermScorer.java:74) at org

Re: Search for null

2007-07-24 Thread Yonik Seeley

On 7/24/07, daniel rosher <[EMAIL PROTECTED]> wrote: Perhaps you can use a filter in the following way. -Create a filter (via QueryFilter) that would contain all documents that do not have null values for the field -flip the bits of the filter so that it now contains documents that have null valu

Re: Search for null

2007-07-24 Thread testn

Would it be more efficient to create an additional inverted field where I assign a value to that field only when the field I would like to search is NULL? daniel rosher wrote: > > Perhaps you can use a filter in the following way. > > -Create a filter (via QueryFilter) that would contain all d

Re: Multiple Languages with Lucene (Arabic & English)

2007-07-24 Thread Grant Ingersoll

On Jul 24, 2007, at 3:21 AM, Elie Choueiri wrote: Hi I'm new to searching and am trying to use Lucene to search English & Arabic documents. I've got a bunch of questions (hopefully you'll find some interesting!) and am hoping someone's gone through some of them and has some answers fo

Re: Search for null

2007-07-24 Thread daniel rosher

Perhaps you can use a filter in the following way. -Create a filter (via QueryFilter) that would contain all documents that do not have null values for the field -flip the bits of the filter so that it now contains documents that have null values for a field -Use the filter in conjunction with subs
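The "flip the bits" step can be sketched with java.util.BitSet alone, without touching an index. The example assumes the filter's result is available as a BitSet with one bit per document number (in Lucene 2.x a filter exposes its matches that way) and that maxDoc is the number of document slots. NullFieldBits and invert are hypothetical names used only for illustration.

```java
import java.util.BitSet;

/** Hypothetical helper: inverts a "field has a value" bit set over all document numbers. */
final class NullFieldBits {

    /**
     * Returns a new BitSet in which a bit is set exactly for the document numbers
     * in [0, maxDoc) that are NOT set in hasValue. The input is left unchanged.
     */
    static BitSet invert(BitSet hasValue, int maxDoc) {
        BitSet missing = (BitSet) hasValue.clone();
        missing.flip(0, maxDoc);
        return missing;
    }

    public static void main(String[] args) {
        BitSet hasValue = new BitSet(6);
        hasValue.set(0);
        hasValue.set(2);
        hasValue.set(3);
        // Prints: {1, 4, 5}
        System.out.println(invert(hasValue, 6));
    }
}
```

One caveat with this approach: flipping over the whole range also turns on the bits of any deleted documents, so in a real index those would probably need to be cleared again before the result is used as a filter.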

Multiple Languages with Lucene (Arabic & English)

2007-07-24 Thread Elie Choueiri

Hi I'm new to searching and am trying to use Lucene to search English & Arabic documents. I've got a bunch of questions (hopefully you'll find some interesting!) and am hoping someone's gone through some of them and has some answers for me! First, do I have to worry about the Arabic Analyz


35 matches
