date:20080506

Re: lucene farsi problem

2008-05-06 Thread Vizzini

Sorry for cross posting, but why the word 'Farsi' instead of 'Persian'? No one says Lucene français or Español, or Deutsch - so why Farsi? Please read the following article, I found it quite enlightening. http://www.cais-soas.com/CAIS/Languages/persian_not_farsi.htm PV -- View this message i

Re: Are those runtime errors about the jdk, or lucene's jar, or my code?

2008-05-06 Thread crspan

Thanks so much, Mike. Those runtime errors were caused by one corrupted index, somehow corrupted during scp. It has nothing to do with lucene 2.3.2. For those who come by this thread: Please "CheckIndex". That would have saved me many hours of fruitless debugging. Cheers, Charlie Michael M

RE: How to make a query that associates 2 index files

2008-05-06 Thread Michael Siu

Yes, there is a many-to-one mapping to the content index. And the size of content data varies, say from 1K to multiple Gs. That is why it is not wise to repeat the same content in an index document. Thanks for telling that the doc IDs are not constant. Yes, the keys to content are generated on the f

Re: Help to solve an issue when upgrading Lucene-Oracle integration to lucene 2.3.1

2008-05-06 Thread Marcelo Ochoa

Hi Mike: Well, the problem is consistent, but to test the code and the project an Oracle 11g database is necessary :( I don't know why the computation of the bufferUpto variable is wrong in the last step; during all other calls pool.buffers.length is consistently 366, so I assume that it's OK. And t

Re: Help to solve an issue when upgrading Lucene-Oracle integration to lucene 2.3.1

2008-05-06 Thread Michael McCandless

Hi Marcelo, Hmmm something is not right. Somehow the byte slices, which DocumentsWriter uses to hold the postings in RAM, became corrupt. Is this easily reproduced? Mike Marcelo Ochoa wrote: Hi Lucene experts: I am working on upgrading the Lucene-Oracle integration project to latest Lucene 2.

Help to solve an issue when upgrading Lucene-Oracle integration to lucene 2.3.1

2008-05-06 Thread Marcelo Ochoa

Hi Lucene experts: I am working on upgrading the Lucene-Oracle integration project to the latest Lucene 2.3.1 code. After correcting a minor issue in the OJVMDirectory file implementation I have the integration running with the latest 2.3.1 code. But it only works with small indexes, I think index which are lower

Re: How to make a query that associates 2 index files

2008-05-06 Thread Erick Erickson

Sure, just include different fields in different docs in your index. Then, when you search, since each term is on a field, docs without that field are excluded from the search. But this is really not very different in terms of a solution than your earlier one. You still have the issue of searching

RE: How to make a query that associates 2 index files

2008-05-06 Thread Michael Siu

My problem is: the [content] value can be huge. Duplicating it in more than one index document wastes disk space (and search time?). In addition, when new documents are added to the second index, it will be faster to just index the linked [content] once (in first index file) and any subsequent refe

Re: Are those runtime errors about the jdk, or lucene's jar, or my code?

2008-05-06 Thread Michael McCandless

Hi, Could you run org.apache.lucene.index.CheckIndex on your index and post the result? Are these exceptions easily reproduced starting from scratch (new index)? More responses/questions below: crspan wrote: -- OS: Linux lg99 2.6.5-7.276-smp #1 SMP Fri Sep 28 20:33:22 AKDT 2007 x86

Are those runtime errors about the jdk, or lucene's jar, or my code?

2008-05-06 Thread crspan

-- OS: Linux lg99 2.6.5-7.276-smp #1 SMP Fri Sep 28 20:33:22 AKDT 2007 x86_64 x86_64 x86_64 GNU/Linux -- Lucene: 2.3.2 (tried 2.2.0 as well, since the index was built around 2.2.0, jdk1.6.0_01 ) -- JDK: Sun jdk1.6.0_06 ( from jdk-6u6-linux-x64.bin ) & Sun jdk1.5.0_15 ( from jdk-1_5_0_

Re: Postcode/zipcode search

2008-05-06 Thread mark harwood

Can you not convert all postcodes to coordinates and do actual distance-based matching? You will have to pay Royal Mail or 3rd party suppliers to get hold of the PAF data required for this geocoding (despite having funded this already as a UK tax payer- g) Cheers Mark - Original Messa

Re: Postcode/zipcode search

2008-05-06 Thread AJ Weber

Maybe I'm oversimplifying it, and maybe this isn't what you desire, but... What about breaking the postcode into two (or three) different fields? Seems easy to parse on the ingestion-side, as you just break the string at the "middle" space. Then store "postal_area", "postal_street", and option

RE: Postcode/zipcode search

2008-05-06 Thread Will Johnson

You could split up the field into 2 separate fields: Postcode:NW10 7NY -> post1:NW10 post2:7NY Then rewrite user's queries using the same logic: i.e. if they enter 1 term 'NW10' it gets rewritten to post1:NW10, if they enter 2 terms post1:NW10 AND post2:7NY. It also lets you do fuzzy search ie pos
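The rewrite rule described in this snippet can be sketched with plain string handling. This is only a rough illustration: the class and method names are made up for the example, the field names post1 and post2 are taken from the message, and the upper-casing step assumes both fields were indexed in upper case. It produces a query string and does not call any Lucene API.

```java
import java.util.Locale;

// Illustrative sketch only: the class and method names are hypothetical and
// are not part of Lucene.
class PostcodeQueryRewriter {

    // Rewrites raw user input into a query string over the two fields
    // post1 (area part) and post2 (street part).
    // Assumption: both fields were indexed in upper case.
    static String rewrite(String userInput) {
        String[] terms = userInput.trim().toUpperCase(Locale.ROOT).split("\\s+");
        if (terms.length == 1 && !terms[0].isEmpty()) {
            // One term: search the area part only.
            return "post1:" + terms[0];
        }
        if (terms.length == 2) {
            // Two terms: require both the area part and the street part.
            return "post1:" + terms[0] + " AND post2:" + terms[1];
        }
        throw new IllegalArgumentException("Expected one or two terms: " + userInput);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("NW10"));
        System.out.println(rewrite("nw10 7ny"));
    }
}
```

The resulting string would still have to be handed to whatever query parser the application uses, with an analyzer that leaves the postcode parts intact.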

Re: Multiple Field search

2008-05-06 Thread Erick Erickson

Well, it's the one I'd use. Whether it's the best or not is...er...not so certain. Erick On Tue, May 6, 2008 at 12:37 PM, Kelvin Foo Chuan Lyi <[EMAIL PROTECTED]> wrote: > Thanks... that's what I thought of ... but was wondering if that was the > best method to do so... i guess it is then... :)

Re: Postcode/zipcode search

2008-05-06 Thread Erick Erickson

Have you looked at PrefixQuery? If that doesn't work for you, could you give a few more examples of expected inputs and outputs? Best Erick On Tue, May 6, 2008 at 12:28 PM, Chris Mannion <[EMAIL PROTECTED]> wrote: > Hi all > > I've got a bit of a niggling problem with how one of my searches is >

Re: How to make a query that associates 2 index files

2008-05-06 Thread Chris Lu

No easy way unless you merge your 2 indexes into:

Index:
[who]     [accessed]   [key]   [content]
David     1/1/2007     Abc     "blah blah 123 ..."
Someone   1/2/2005     Abc     "blah blah 123 ..."
Guess     12/1/2000    Xyz

Re: Postcode/zipcode search

2008-05-06 Thread Grant Ingersoll

You might have a look at using a phrase query when you have more than one term in the query in addition to your term query, but giving the phrase query more weight (i.e. give an exact match more weight) and keep your original tokenization process. Something like: "NW10 7NY"^5 OR NW10 OR 7NY
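A query string of the shape shown in this snippet could be assembled as follows. This is a minimal sketch using only string handling: the class name, the method name and the boost parameter are made up for the example, and nothing here calls Lucene itself.

```java
// Illustrative sketch only: the class and method names are hypothetical and
// are not part of Lucene.
class BoostedPostcodeQuery {

    // Builds "<all terms as a phrase>"^boost OR term1 OR term2 ... so that an
    // exact phrase match is weighted above matches on the individual terms.
    static String build(String userInput, int phraseBoost) {
        String[] terms = userInput.trim().split("\\s+");
        if (terms.length == 1) {
            // A single term needs no phrase clause.
            return terms[0];
        }
        StringBuilder query = new StringBuilder();
        query.append('"').append(String.join(" ", terms)).append('"');
        query.append('^').append(phraseBoost);
        for (String term : terms) {
            query.append(" OR ").append(term);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("NW10 7NY", 5));
    }
}
```

The boost value of 5 comes from the example in the message; a suitable value for a real index would have to be found by experiment.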

Re: Multiple Field search

2008-05-06 Thread Kelvin Foo Chuan Lyi

Thanks... that's what I thought of ... but was wondering if that was the best method to do so... i guess it is then... :) On Wed, May 7, 2008 at 12:32 AM, Erick Erickson <[EMAIL PROTECTED]> wrote: > One of my favorite quotes from Roger Zelazny... "postulating > infinity, the rest is easy". > >

Re: How to make a query that associates 2 index files

2008-05-06 Thread Erick Erickson

You don't. You really have to roll your own solution here, there's no "inter-index" awareness that I know of in Lucene. Typically, people either do a half-half solution (that is, put the text search in Lucene and leave the DB parts in the DB) or de-normalize the data in a Lucene index so you don't

Re: Multiple Field search

2008-05-06 Thread Erick Erickson

One of my favorite quotes from Roger Zelazny... "postulating infinity, the rest is easy". In this case, "infinity" is how you break up your query. The easy part is making your search return what you want. Assuming you know that you want "greatest" and "hits" to go against the title field and "bea

Postcode/zipcode search

2008-05-06 Thread Chris Mannion

Hi all I've got a bit of a niggling problem with how one of my searches is working as opposed to how my users would like it to work. We're indexing on UK postcodes, which are in the format of a 3 or 4 character area code followed by a 3 or 4 character street specific code, e.g. "NW10 7NY" or "M1

How to make a query that associates 2 index files

2008-05-06 Thread Michael Siu

Hi, I am a newbie to Lucene. I have a question about making a query that associates 2 index files: - One index has the content index for a list of documents and a key to the document. That means the Lucene document of this index contains 2 fields: the 'content' and the 'key'. - another index

Multiple Field search

2008-05-06 Thread Kelvin Foo Chuan Lyi

I'm new to lucene and have a question on how to create a query for the following example... Say I have two fields, Title and Description, with the following data Item 1 Title: The greatest hits Description : Collection of the best music from The Beatles. Item 2 Title: U2 collections Description :

Re: Filtering a SpanQuery

2008-05-06 Thread Paul Elschot

Eran, On Tuesday 06 May 2008 10:15:10 Eran Sevi wrote: > Hi, > > I am looking for a way to filter a SpanQuery according to some other > query (on another field from the one used for the SpanQuery). I need > to get access to the spans themselves of course. > I don't care about the scoring of the

RE: lucene farsi problem

2008-05-06 Thread esra

Hi Steven, I tried the class and it works fine with the locale parameter "ar". Actually we are using "fa" for Farsi and "ar" for Arabic. I have added a little control for the locale parameter in my code and now I can see the correct results. Thank you very much for your help. Esra.

Re: index corruption with latest lucene

2008-05-06 Thread Gopikrishnan Subramani

Thanks Mike. Sorry, I should have mentioned that I'm using 1.6.0_04. I happened to look at the thread a while ago and used -Xbatch but that didn't help, which made me think maybe it's a different issue. I'll try with -Xint before downgrading to 1.6.0_03 to be doubly sure. -Gopi On 5/6/08, Michae

Re: index corruption with latest lucene

2008-05-06 Thread Michael McCandless

Could you provide more detail on how you hit these two exceptions? Are they reproducible from scratch (creating a new index)? Are you using multiple threads against IndexWriter? Is autoCommit true or false? Any prior exceptions hit? Do your documents have varying number/configuration

Re: index corruption with latest lucene

2008-05-06 Thread Michael McCandless

Are you using JRE 1.6.0_04 or 1.6.0_05? This sounds exactly the same as this: http://www.gossamer-threads.com/lists/lucene/java-user/59650 If it is the same issue, which seems to be a bug in the hotspot compiler, downgrading to JRE 1.6.0_03, or running Java with -Xbatch (forces up-fron

Re: index corruption with latest lucene

2008-05-06 Thread Gopikrishnan Subramani

[ Sorry if I'm hijacking this thread, if you feel this error is unrelated to this thread, I'll move this to a separate thread. ] Even after upgrading to 2.3.1 I'm running into index corruption problems. I'm posting below the exception that is generated while searching. The stack trace looks like,

Filtering a SpanQuery

2008-05-06 Thread Eran Sevi

Hi, I am looking for a way to filter a SpanQuery according to some other query (on another field from the one used for the SpanQuery). I need to get access to the spans themselves of course. I don't care about the scoring of the filter results and just need the positions of hits found in the docu

30 matches