Re: Editing StopWordList

2010-12-21 Thread manjula wijewickrema
Hi Gupta,

Thanx a lot for your reply. But I could not understand whether I could
modify (adding more words) to the default stop word list or should I have to
make a new list as an array as follows.
public string[] NEW_STOP_WORDS = { a, and, are, as, at, be,
but, by, for, if, in, into, is, no, not, of, on, or,

s, such, t, that, the, their, then, there, these, they,
this, to, was, will, with,
inc,incorporated,co.,ltd,ltd., we, you, your, us,
etc...};

then call it as follows,

SnowballAnalyzer analyzer = *new* SnowballAnalyzer(English,
StopAnalyzer.NEW_STOP_WORDS );
Am I correct?
Or if not could you explain me how can I do this?

Thanx in advance.
Manjula.

On Tue, Dec 21, 2010 at 10:36 AM, Anshum ansh...@gmail.com wrote:

 Hi Manjula,
 You could initialize the Analyzer using a modified stop word set. Use
 the *StopAnalyzer.ENGLISH_STOP_WORDS_SET
 *to get the default stopset and then add your own words to it. You could
 then initialize the analyzer using this new stop set instead of the default
 stop set.
 Hope that helps.

 --
 Anshum Gupta
 http://ai-cafe.blogspot.com


 On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
 manjul...@gmail.comwrote:

  Hi,
 
  1) In my application, I need to add more words to the stop word list.
  Therefore, is it possible to add more words into the default lucene stop
  word list?
 
  2) If is it possible, then how can I do this?
 
  Appreciate any comment from you.
 
  Thanks,
  Manjula.
 



Re: Editing StopWordList

2010-12-21 Thread Anshum
The default stopword list is final and so you'd have to copy this to some
other object and add terms to it instead of being able to directly modify
it. So all said and done, what I meant when I said you could use it was that
you could read that object and construct your own set (almost as case 2
below).


--
Anshum Gupta
http://ai-cafe.blogspot.com


On Tue, Dec 21, 2010 at 3:54 PM, manjula wijewickrema
manjul...@gmail.comwrote:

 Hi Gupta,

 Thanx a lot for your reply. But I could not understand whether I could
 modify (adding more words) to the default stop word list or should I have
 to
 make a new list as an array as follows.
 * public string[] NEW_STOP_WORDS = { a, and, are, as, at, be,
 but, by, for, if, in, into, is, no, not, of, on,
 or,

 s, such, t, that, the, their, then, there, these, they,
 this, to, was, will, with,
 inc,incorporated,co.,ltd,ltd., we, you, your, us,
 etc...};
 *
 then call it as follows,

 SnowballAnalyzer analyzer = *new* SnowballAnalyzer(English,
 StopAnalyzer.NEW_STOP_WORDS );
 Am I correct?
 Or if not could you explain me how can I do this?

 Thanx in advance.
 Manjula.

 On Tue, Dec 21, 2010 at 10:36 AM, Anshum ansh...@gmail.com wrote:

  Hi Manjula,
  You could initialize the Analyzer using a modified stop word set. Use
  the *StopAnalyzer.ENGLISH_STOP_WORDS_SET
  *to get the default stopset and then add your own words to it. You could
  then initialize the analyzer using this new stop set instead of the
 default
  stop set.
  Hope that helps.
 
  --
  Anshum Gupta
  http://ai-cafe.blogspot.com
 
 
  On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
  manjul...@gmail.comwrote:
 
   Hi,
  
   1) In my application, I need to add more words to the stop word list.
   Therefore, is it possible to add more words into the default lucene
 stop
   word list?
  
   2) If is it possible, then how can I do this?
  
   Appreciate any comment from you.
  
   Thanks,
   Manjula.
  
 



Re: Editing StopWordList

2010-12-20 Thread Anshum
Hi Manjula,
You could initialize the Analyzer using a modified stop word set. Use
the *StopAnalyzer.ENGLISH_STOP_WORDS_SET
*to get the default stopset and then add your own words to it. You could
then initialize the analyzer using this new stop set instead of the default
stop set.
Hope that helps.

--
Anshum Gupta
http://ai-cafe.blogspot.com


On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
manjul...@gmail.comwrote:

 Hi,

 1) In my application, I need to add more words to the stop word list.
 Therefore, is it possible to add more words into the default lucene stop
 word list?

 2) If is it possible, then how can I do this?

 Appreciate any comment from you.

 Thanks,
 Manjula.