Re: [Zope-dev] Stop words/vocabulary

2001-02-10 Thread Dieter Maurer

Hi Arno,

Arno Gross writes:
 > I now have a German stop word list and would like to
 > apply it to my current ZCatalog 'NewsCatalog'. But how?
 > Or should I copy my list into the source (not a good idea)?
I told you that you can have stop words.

What I did not tell you is that you should not have them:

  In my view stop words are a bad thing, invented
  when computers were slow and storage expensive.

  The only thing they do now is make life more difficult:

    You search for a word that happens to be a stop
    word and you get no hits, usually without any useful
    indication of the problem.

    Phrase searches become a nightmare with stop words
    (at least if one tries to stick to the correct
    semantics).

    If you change the stop word list, your index becomes
    inconsistent and needs reindexing.

    How should stop words be handled with advanced
    search facilities such as phonetic searches,
    search patterns, or misspelling-tolerant searches?
    Every time you want a clear semantic
    specification for your searches, stop words
    get in your way.
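
A minimal sketch of the first and third problems above, using a toy
inverted index rather than ZCatalog. Every name in it ("build_index",
"search", the sample documents) is hypothetical and only illustrates
the behaviour described:

```python
# Toy inverted index; this is not ZCatalog code.
def build_index(docs, stop_words):
    """Map each non-stop word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word in stop_words:
                continue  # stop words never reach the index
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, word, stop_words):
    """Look up a single word; a stop word silently yields no hits."""
    word = word.lower()
    if word in stop_words:
        return set()
    return index.get(word, set())


docs = {1: "der Hund bellt", 2: "die Katze und der Hund"}
stop_words = {"der", "die", "und"}
index = build_index(docs, stop_words)

# Searching for a stop word returns nothing, with no hint as to why.
hits_for_stop_word = search(index, "der", stop_words)
hits_for_hund = search(index, "Hund", stop_words)

# Removing "und" from the list later does not help until everything is
# reindexed, because the old index never stored the word.
new_stop_words = stop_words - {"und"}
stale_hits = search(index, "und", new_stop_words)
fresh_hits = search(build_index(docs, new_stop_words), "und", new_stop_words)
```

The stale index and the rebuilt index disagree about "und", which is
the inconsistency that forces a reindex.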


  Thus, think again about whether you really want stop words.


But, as I said, you can have them.
And I will help you get them if you think this is necessary.

The "Vocabulary" (Products.ZCatalog.Vocabulary.Vocabulary)
has a method "manage_stop_syn".
It is currently defined empty and not exposed as a view.
You could implement it and add it to
"Products.ZCatalog.Vocabulary.Vocabulary.manage_options".

The "SearchIndex.Lexicon.Lexicon" has a method
"set_stop_syn" to set the stopword dict.
What you can do:

  Put your stopwords or synonyms into a file in Python dictionary
  syntax.
  Make the file selectable in "manage_stop_syn",
  read and "eval" it (this yields a Python dict), and pass
  the result to "set_stop_syn".

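The steps above might look roughly like the following sketch. It is
written in present-day Python and makes several assumptions:
"DemoLexicon" is a hypothetical stand-in that only mimics
"set_stop_syn" (it is not the real Lexicon class), "load_stop_syn" is
an invented helper, and "ast.literal_eval" is swapped in for plain
"eval" so that the file can contain only literals and cannot run code.
The "None means stop word" convention follows the impression described
elsewhere in this thread:

```python
import ast
import os
import tempfile


class DemoLexicon:
    """Hypothetical stand-in for a lexicon; only mimics set_stop_syn."""

    def __init__(self):
        self.stop_syn = {}

    def set_stop_syn(self, stop_syn):
        self.stop_syn = stop_syn


def load_stop_syn(path):
    """Read a file containing one Python dict literal and return the dict."""
    with open(path, encoding="utf-8") as f:
        # literal_eval accepts only literals, so the file cannot run code
        mapping = ast.literal_eval(f.read())
    if not isinstance(mapping, dict):
        raise ValueError("stop word file must contain a dict literal")
    return mapping


# Example file content: None marks a stop word, a string is a replacement.
content = "{'der': None, 'die': None, 'und': None, 'autos': 'auto'}"
with tempfile.NamedTemporaryFile(
    "w", suffix=".py", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(content)

lexicon = DemoLexicon()
lexicon.set_stop_syn(load_stop_syn(tmp.name))
os.remove(tmp.name)  # clean up the temporary example file
```
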
As Chris pointed out, the GlobbingLexicon does not yet
support stop words. The reason is probably that
the author did not know what stop words and synonyms
should mean for a search process with wildcard characters.
If you do know, then "GlobbingLexicon" is
easily extended along the lines of "Lexicon".


Dieter

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] Stop words/vocabulary

2001-02-09 Thread Christopher Petrilli

Arno Gross [[EMAIL PROTECTED]] wrote:
> Hello Dieter,
> I now have a German stop word list and would like to
> apply it to my current ZCatalog 'NewsCatalog'. But how?
> Or should I copy my list into the source (not a good idea)?

Currently, if you use a GlobbingLexicon, then you don't get
stop words.  If you use a regular Lexicon you do.  These are
stored in the file---yes, a very bad idea.  Ideally they should be
managed through the web.  The easy option is to change the code a
little to make it read them from disk.
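
A rough sketch of what "read them from disk" could mean, assuming a
plain text file with one stop word per line and "#" comment lines.
The file format, the function name "read_stop_words", and the file name
in the comment are all assumptions for illustration, not existing Zope
code:

```python
import io


def read_stop_words(lines):
    """Build a word -> None mapping from an iterable of text lines."""
    stop_syn = {}
    for line in lines:
        word = line.strip().lower()
        if not word or word.startswith("#"):
            continue  # skip blank lines and comments
        stop_syn[word] = None
    return stop_syn


# io.StringIO stands in for something like open("german_stop_words.txt").
sample = io.StringIO("# German stop words\nder\ndie\n\nUnd\n")
stop_syn = read_stop_words(sample)
```
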

Chris
-- 
| Christopher Petrilli
| [EMAIL PROTECTED]




Re: [Zope-dev] Stop words/vocabulary

2001-02-08 Thread Arno Gross

Hello Dieter,
I now have a German stop word list and would like to
apply it to my current ZCatalog 'NewsCatalog'. But how?
Or should I copy my list into the source (not a good idea)?
Thanks.

On Thu, 08 Feb 2001, Dieter Maurer wrote:
> Arno Gross writes:
>  > Can I apply stop words in a ZCatalog?
> You can:
> 
>   ZCatalog's "Lexicon"s (--> SearchIndex.Lexicon.Lexicon)
>   have a method "set_stop_syn" to provide a mapping
>   for synonyms and stop words.
> 
>   I think the source documentation is wrong:
> 
>     It says the mapping maps words to a list of synonyms.
>     When I looked at the code in "SearchIndex/Splitter.c",
>     I had the impression that a map from a word
>     to a single replacement is expected.
>     A "None" replacement signifies a stop word;
>     other replacements can be used for stemming
>     or synonyms.
> 
>  > Are there stop words available for German?
>  > If not, I would try to compose a stop word list for German
>  > and publish it.
> Good!
> 
> 
> 
> Dieter



Re: [Zope-dev] Stop words/vocabulary

2001-02-08 Thread Dieter Maurer

Arno Gross writes:
 > Can I apply stop words in a ZCatalog?
You can:

  ZCatalog's "Lexicon"s (--> SearchIndex.Lexicon.Lexicon)
  have a method "set_stop_syn" to provide a mapping
  for synonyms and stop words.

  I think the source documentation is wrong:

    It says the mapping maps words to a list of synonyms.
    When I looked at the code in "SearchIndex/Splitter.c",
    I had the impression that a map from a word
    to a single replacement is expected.
    A "None" replacement signifies a stop word;
    other replacements can be used for stemming
    or synonyms.
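
To make that reading of the mapping concrete, here is a small sketch in
plain Python. It is not the Splitter code; "apply_stop_syn" and the
sample words are invented for illustration, and the semantics are only
the impression described above (a single replacement per word, with
"None" meaning "drop this word"):

```python
def apply_stop_syn(words, stop_syn):
    """Filter and rewrite words according to a word -> replacement mapping."""
    result = []
    for word in words:
        if word not in stop_syn:
            result.append(word)      # unknown words pass through unchanged
            continue
        replacement = stop_syn[word]
        if replacement is None:
            continue                 # None marks a stop word: drop it
        result.append(replacement)   # synonym or stem replaces the word
    return result


stop_syn = {"und": None, "der": None, "autos": "auto", "pkw": "auto"}
words = "der pkw und die autos".split()
normalized = apply_stop_syn(words, stop_syn)
```

Here "der" and "und" disappear, "pkw" and "autos" both become "auto",
and "die" passes through because it is not in the mapping.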

 > Are there stop words available for German?
 > If not, I would try to compose a stop word list for German
 > and publish it.
Good!



Dieter




[Zope-dev] Stop words/vocabulary

2001-02-08 Thread Arno Gross

Can I apply stop words in a ZCatalog?
Are there stop words available for German?
If not, I would try to compose a stop word list for German
and publish it.

Thanks 
   Arno Gross, [EMAIL PROTECTED]
