Hey all,

I set up a Context/Intermedia/Text/whateverTheHell index on 8.1.7.4 on
HP/UX to index about 250000 description fields so that our users can
search on them.  That was two years ago, and now someone has discovered at
least one issue.

One description contains something like:

        BLEAH,120,1/4W

Using the default lexer, this stupidly parses into tokens of "BLEAH",
"120,1" and "4W" instead of "BLEAH", "120", and "1/4W" (or even "BLEAH",
"120", "1" and "4W").  I think this is because of the default NUMGROUP for
US language settings, which is a comma (",").  So when a user looks for
"120 AND 1/4W", this description is missed because "120" never gets
indexed as a token of its own by the default lexer.
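
To make the suspected behavior concrete, here is a rough sketch in Python. It is a toy model, not Oracle's lexer: the function name tokenize and its numgroup parameter are made up for illustration, and the only rule it assumes is "keep the group character inside a token when it sits directly between two digits".

```python
def tokenize(text, numgroup=","):
    """Toy tokenizer (an assumption, not Oracle's algorithm).

    Splits on anything that is not a letter or digit, except that
    `numgroup` is kept inside a token when it sits directly between
    two digits. Pass numgroup=None to model "no grouping character".
    """
    tokens = []
    current = ""
    for i, ch in enumerate(text):
        between_digits = (
            ch == numgroup
            and 0 < i < len(text) - 1
            and text[i - 1].isdigit()
            and text[i + 1].isdigit()
        )
        if ch.isalnum() or between_digits:
            current += ch
        else:
            # Any other character ends the current token.
            if current:
                tokens.append(current)
            current = ""
    if current:
        tokens.append(current)
    return tokens


with_group = tokenize("BLEAH,120,1/4W")
without_group = tokenize("BLEAH,120,1/4W", numgroup=None)

print(with_group)      # ['BLEAH', '120,1', '4W']
print(without_group)   # ['BLEAH', '120', '1', '4W']
print("120" in with_group)  # False, so a search for 120 finds nothing
```

Under that assumed rule the comma after BLEAH splits normally, but the comma between "120" and "1" gets swallowed into one bogus number, which matches what the index appears to be doing.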

There can be numerous other issues with NUMGROUP when lexing a
free-format description, so I really don't want a NUMGROUP at all.  I tried
setting it to null using:

        ctx_ddl.set_attribute('MYLEXER','NUMGROUP','');

...but this bombs with:

        ORA-20000: interMedia Text error:
        DRG-10705: invalid value NULL for attribute NUMGROUP

Other than trying to find some char that never shows up in any of the 250K
rows, is there a way to turn this off?  The thing that gets me is that
"120,1" isn't even a proper number, but ConTerMedText thinks it is and
tokenizes it.
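
For the fallback of hunting for a safe character, a minimal sketch in Python of how the candidates could be narrowed down, assuming the descriptions have been dumped to a list of strings. The function name unused_characters and the sample rows are hypothetical, and this says nothing about which characters Oracle will actually accept for NUMGROUP.

```python
import string


def unused_characters(descriptions, candidates=string.punctuation):
    """Return the candidate characters that appear in none of the descriptions.

    This is stricter than needed (only characters between digits would
    matter for number grouping), but it is the simplest safe check.
    """
    seen = set()
    for text in descriptions:
        seen.update(text)
    return [ch for ch in candidates if ch not in seen]


# Hypothetical sample rows standing in for the real description column.
sample = ["BLEAH,120,1/4W", "WIDGET #4 (RED)"]
unused = unused_characters(sample)

print("," in unused)  # False, the comma is in use
print("~" in unused)  # True, the tilde never appears in the sample
```

Anything this reports would still have to be tried against set_attribute, and new rows could start using the character later, which is why I'd rather just turn the thing off.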

TIA,
Rich

Rich Jesse                           System/Database Administrator
[EMAIL PROTECTED]                  Quad/Tech Inc, Sussex, WI USA
Author: Jesse, Rich
  INET: [EMAIL PROTECTED]
