On Aug 8, 2005, at 7:44 AM, Karthik N S wrote:
   I  would like to reformat the Question  slightly ,
   Words without double Quotes may also be present in the String.
Also I have to apply the STOP - Analyzer to filter out common English
words appearing within.

Do u mind giving me a bit of src hint for the same...
[ I am googled out of ideas ]

Well, the Analyzers chapter in Lucene in Action will give you all the basics you need to write custom pieces of an analyzer :)

A custom Tokenizer is where you'll want to start with, perhaps, though there are other possibilities such as tokenizing quotes separately and then having a filter buffer when they are encountered and combine multiple tokens into one when within quotes. The only hints I have would be examples from within Lucene's codebase itself. You'll need to maintain some state (whether within quotes or not), but otherwise it should be relatively straightforward.

To provide many more hints would require implementing it myself, I'm afraid.

    Erik




with regards
karthik


-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Monday, August 08, 2005 4:23 PM
To: java-user@lucene.apache.org
Subject: Re: Reply Split Search Word


To have an analyzer split that string into 1-5 as you have listed
will require you write a custom Analyzer to tokenize with double
quotes in mind like that.

     Erik

On Aug 8, 2005, at 12:06 AM, Karthik N S wrote:


Hi

Luceners

Apologies.....

As I have already replied,Using Analysis I have tried on all Analyzers
(including Standard Analyzer)
But not able to achive the required COMPLETS WORD Split.

My I/p String would be a lengthy one as below

String sKey = "\"" + "Dough Cutting" + "\"" +  "  " +  "Otis
Gospodnetic"  +
"   " +
              "\"" + "Erik Hatcher" + "\""  + "  " +    "Authors of
" + "\"" +
"Lucene In Action" + "\"";

The required split of complete words should return

   1) "Dough Cutting"
   2) Otis Gospodnetic
   3) "Erik Hatcher"
   4) Authors of
   5) "Lucene In Action"

Plz Note :- Words with "\"" are complete split words....

I am shure some Analyzer code inside Lucene is handling this task.


som how can one achive this task..

with regards
Karthik

-----Original Message-----
From: Mordo, Aviran (EXP N-NANNATEK) [mailto:[EMAIL PROTECTED]
Sent: Friday, August 05, 2005 7:58 PM
To: java-user@lucene.apache.org
Subject: RE: Split Search Word


The StandardAnalyzer should work just fine with it, It will break the
search string to 5 search terms.

HTH

Aviran
http://www.aviransplace.com

  _____

From: Karthik N S [mailto:[EMAIL PROTECTED]
Sent: Friday, August 05, 2005 1:57 AM
To: LUCENE
Subject: Split Search Word



Hi Luceners

Apologies.....

I  have along Search String as given below...



SearchWord =  "\"" + "Dough Cutting" + "\"" +  "  " +  "Otis
Gospodnetic"  +  "   " + "\"" + "Erik Hatcher" + "\""  + "  " +
                           "Authors of " + "\"" + "Lucene In Action"
+"\"";

And prior to searching the Index ,I need the Words to be Split.

SearchWord   =

   1)   "\"" + "Dough Cutting" + "\""
   2)   "Otis Gospodnetic"
   3)  "\"" + "Erik Hatcher" + "\""
   4)  "Authors of "
   5) "\"" +"Lucene In Action" +"\""

I am shure some Analyzer within Lucene is performin the task.
So some body please Tell me Howto

[ I already used Analysis/Paralysis code to check ,but no help ]




WITH WARM REGARDS
HAVE A NICE DAY
[ N.S.KARTHIK]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to