Hi

Apologies.....

The problem is not with Optics or 'O' ,

Since the 3rd and 6th Document is Indexed as

Document 3 contains   =   ELECTRONICS  DIGITAL CAMERA 0PTICS
Document 6 contains   =   ELECTRONICS  DIGITAL CAMERA OPTICS CABEL


A search Criteria for 'digital camera Optics'  should return only 3rd
document Only and

NOT 6th Document.

For a Query in either 'DIGITAL CAMERA 0PTICS'  or 'OPTICS CAMERAS DIIGTAL'
or

'CAMERA OPTICS' to return similar hits, we have to Inject the 'IGNOREWORDS'
to the query as  ' -cabel '

for Natural Language processing.

Now the main problem is to find the permutation /combination of all such
words to be fed into the field 'IGNOREWORDS'
for the query to be processed.


Any Ideas Please

Karthik




-----Original Message-----
From: Paul Elschot [mailto:[EMAIL PROTECTED]
Sent: Monday, June 06, 2005 10:33 PM
To: java-user@lucene.apache.org
Subject: Re: RE REQUEST: SPECIFIC HIT


On Monday 06 June 2005 08:40, Karthik N S wrote:
> Hi
>
> Guys.
>
> Apologies.....
>
> with refrence to my last main dted  Mon, 14 Mar 2005
>
>
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200503.mbox/%3COBE
> [EMAIL PROTECTED]
>
> I would like to again request some Help in the search Concepts.
>
> I have Indexed documents sucessfully and they would be
>
> Document 1 contains   =   ELECTRONICS  DIGITAL CAMERA
> Document 2 contains   =   ELECTRONICS  DIGITAL CAMERA BATTERY ACCESSORIES
> Document 3 contains   =   ELECTRONICS  DIGITAL CAMERA 0PTICS
> Document 4 contains   =   ELECTRONICS  DIGITAL CAMERA ACCESSORIES
> Document 5 contains   =   ELECTRONICS  DIGITAL CAMERA CABEL ACCESSORIES
> Document 6 contains   =   ELECTRONICS  DIGITAL CAMERA OPTICS CABEL
> ACCESSORIES
> Document 7 contains   =   ELECTRONICS  DIGITAL CAMERA APPERAL ACCESSORIES
>
> On Search  "Digital Camera Optics" , the hit has to return me 3rd Document
> ONLY
> instead of other Documents  [ The word DIGITAL CAMERA is common word in
all
> cases and could be in any order of sequence].
>
> To Solve this Problem I creating a new Field called 'IGNORE WORD' and this
> field would be as shown below
>
> Document 1 contains   =   ELECTRONICS  DIGITAL CAMERA
> 'IGNORE WORD = BATTERY,ACCESSORIES,0PTICS,CABEL,APPERAL
>
> Document 2 contains   =   ELECTRONICS  DIGITAL CAMERA BATTERY ACCESSORIES
> 'IGNORE WORD = ACCESSORIES,0PTICS,CABEL,APPERAL
>
> Document 3 contains   =   ELECTRONICS  DIGITAL CAMERA 0PTICS
> 'IGNORE WORD = BATTERY,ACCESSORIES,CABEL,APPERAL
>
> Document 4 contains   =   ELECTRONICS  DIGITAL CAMERA ACCESSORIES
> 'IGNORE WORD = BATTERY,0PTICS,CABEL,APPERAL
>
> Document 5 contains   =   ELECTRONICS  DIGITAL CAMERA CABEL ACCESSORIE
> 'IGNORE WORD = BATTERY,0PTICS,APPERAL
>
> Document 6 contains   =   ELECTRONICS  DIGITAL CAMERA OPTICS CABEL
> ACCESSORIES
> 'IGNORE WORD = BATTERY,APPERAL
>
> Document 7 contains   =   ELECTRONICS  DIGITAL CAMERA APPERAL ACCESSORIES
> 'IGNORE WORD = BATTERY,0PTICS,CABEL
>
>
> For Every search I feed the 'IGNORE WORD' to the query such as
>
>
> Search  = DIGITAL CAMERA 0PTICS
> Query   = +KEYSRC:Digital +KEYSRC:Camera +KEYSRC:Cabel -KEYSRC:(BATTERY
> ACCESSORIES CABEL APPERAL)
>
> The resultant hit would be the 3rd doc instead of 3rd and 5th..
>
>
> The Problem here is of 2 conditions
>
> 1) Search could be  DIGITAL CAMERA 0PTICS  or OPTICS CAMERAS DIIGTAL  or
> CAMERA OPTICS should retrieve same hit results.

Here (and also above) 'optics' is written in to forms:
starting with a zero as 0PTICS, and
starting with a capital O as OPTICS.

Is that perhaps part of the problem?

Regards,
Paul Elschot


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to