Mark,
This is a new in-house developed  the school directory system. The
source data come from all over the different systems. The data quite
dirty in some case. I am not sure how the business rule was defined yet.
The developer knew her query was nasty, so she turned on me for help.
This query is part of the whole process. As you said, it compares all
the possible same person feed from all the systems (med, sis,law...)and
insert it to another table. I think you and Stephane's suggestions are
very valuable and I will definitely ask her to review her design of this
processing. (sigh, we are not that experienced and good on this) After I
work more with developer and get more knowledge of this application. I'd
like to ask some detail of your experience in such searching logic.

many thanks,

Joan 


Mark Richard wrote:
> 
> I tend to agree with Stephane...
> 
> It looks like that query was possibly generated by a tool.  If not you
> should go talk to the person who wrote it.  Find out what the business
> rules are and write it from scratch.  In reality the query keeps hitting
> the same two tables, presumably looking for rows that have just been loaded
> which match existing rows in another table based on the names matching in
> some fashion (ie: perhaps incorrect firstname/middlename usage, etc).
> 
> Some of Stephane's suggestions will also help.  We store customer names as
> case sensitive but we obviously need to do a case-insensitive search from
> our application to find customers.  The easiest way out - store the name
> twice (we have firstname and firstnameasupper, lastname and
> lastnameasupper, etc).  Obviously the indexes sit on the "asupper" columns.
> Sure there is a slight amount of duplicated data, but it is done for a
> reason and let's us search customers very fast whist still preserving the
> nuance's that customers like to see when you send a bill to them.
> 
> Perhaps you need to store some columns which have had "replace" and "upper"
> applied.  Finally since the query seems to accept pretty much any
> combination of lastname, firstname, middlename you could concatenate the
> three columns into a "namesearch" column are just perform three instr
> searches on that column, as opposed to trying every possibility - that
> should save Oracle some heartache.
> 
> Regards,
>      Mark.
> 
> 
>                     Stephane
>                     Faroult              To:     Multiple recipients of list 
>ORACLE-L <[EMAIL PROTECTED]>
>                     <sfaroult@orio       cc:
>                     le.com>              Subject:     Re: slow query help
>                     Sent by:
>                     [EMAIL PROTECTED]
>                     om
> 
> 
>                     18/12/2002
>                     07:58
>                     Please respond
>                     to ORACLE-L
> 
> 
> 
> Joan Hsieh wrote:
> >
> > Hi,
> >
> > This is the query bothered us very much recently. It run at least 15
> > min. and sometimes crashed the temp tablespace. Do you have any idea how
> > to make it run better. Our developer tried used two cursors to compare
> > the result, but the result is not optimized. We tried used last name and
> > first name function based index on sm_new_load table.  If someone
> > interested in this query, I can sent you the execution plan.
> > Thanks in advanced,
> >
> > Joan
> >
> > SELECT T11.TRUNK TRUNK_FOUND,T21.ID TARGET_ID,T21.SSN
> > TARGET_SSN,T21.FULLNAME
> >   TARGET_FULLNAME,T21.LASTNAME TARGET_LASTNAME,T21.MIDDLENAME
> >   TARGET_MIDDLENAME,T21.FIRSTNAME TARGET_FIRSTNAME,T21.DOB TARGET_DOB,
> >   T21.GENDER TARGET_GENDER
> > FROM
> >  SM_NEW_LOAD T21,PR_IDENTITY T11  WHERE ( T21.ROWID,T11.ROWID  ) IN
> > (SELECT
> >   T22.ROWID,T12.ROWID   FROM
> > PR_IDENTITY T12,SM_NEW_LOAD T22  WHERE T22.SSN
> >   IN ( T12.SSN,T12.HRID,SISID,MEDID,AFFID  ) OR T22.ID IN (
> > T12.SSN,T12.HRID,
> >   SISID,MEDID,AFFID  ) OR EXISTS  (SELECT 1   FROM
> >  PR_ALT_IDS  WHERE TRUNK = T12.TRUNK  AND ALT_ID IN ( T22.SSN,T22.ID
> > ))  UNION ALL
> > SELECT T22.ROWID,
> >   T12.ROWID   FROM
> > SM_NEW_LOAD T22,PR_IDENTITY T12  WHERE
> >   REPLACE(UPPER(T12.LASTNAME),'-',' ') =
> > REPLACE(UPPER(T22.LASTNAME),'-',' ')
> >    AND UPPER(T12.FIRSTNAME) = UPPER(T22.FIRSTNAME)  UNION ALL
> > SELECT
> >   T23.ROWID,T13.ROWID   FROM
> >  SM_NEW_LOAD T23,PR_IDENTITY T13  WHERE
> >   REPLACE(UPPER(T13.LASTNAME),'-',' ') =
> > REPLACE(UPPER(T23.LASTNAME),'-',' ')
> >    AND ((INSTR(UPPER(T13.FIRSTNAME),UPPER(T23.MIDDLENAME),1) = 1  AND
> >   INSTR(UPPER(T13.MIDDLENAME),UPPER(T23.FIRSTNAME),1) = 1 ) OR
> >   (INSTR(UPPER(T23.FIRSTNAME),UPPER(T13.MIDDLENAME),1) = 1  AND
> >   INSTR(UPPER(T23.MIDDLENAME),UPPER(T13.FIRSTNAME),1) = 1 )) UNION ALL
> > SELECT
> >   T24.ROWID,T14.ROWID   FROM
> > SM_NEW_LOAD T24,PR_IDENTITY T14  WHERE
> >   UPPER(T14.LASTNAME) = UPPER(T24.FIRSTNAME)  AND UPPER(T14.FIRSTNAME) =
> >   UPPER(T24.LASTNAME)  UNION ALL SELECT T25.ROWID,T15.ROWID   FROM
> >   SM_NEW_LOAD T25,PR_IDENTITY T15  WHERE (INSTR(' ' ||
> >   REPLACE(UPPER(T15.LASTNAME),'-',' ')  || ' ' ,' ' ||
> > UPPER(T25.LASTNAME)
> >   || ' ' ) > 0  OR INSTR(' ' || REPLACE(UPPER(T25.LASTNAME),'-',' ')  ||
> > ' ' ,
> >   ' ' || UPPER(T15.LASTNAME)  || ' ' ) > 0 ) AND UPPER(T15.FIRSTNAME) =
> >   UPPER(T25.FIRSTNAME)  AND (T15.MIDDLENAME IS NULL  OR T25.MIDDLENAME
> > IS
> >   NULL  OR UPPER(SUBSTR(T15.MIDDLENAME,1,1)) =
> > UPPER(SUBSTR(T25.MIDDLENAME,1,
> >   1)) ) UNION ALL
> > SELECT T27.ROWID,T17.ROWID   FROM
> >  PR_IDENTITY T17,
> >   SM_NEW_LOAD T27  WHERE UPPER(T27.LASTNAME) = UPPER(T17.LASTNAME)  AND
> >   (INSTR(UPPER(T27.FIRSTNAME),UPPER(T17.FIRSTNAME),1) > 0  OR
> >   INSTR(UPPER(T17.FIRSTNAME),UPPER(T27.FIRSTNAME),1) > 0 ) AND
> >   (INSTR(UPPER(T27.MIDDLENAME),UPPER(T17.MIDDLENAME),1) > 0  OR
> >   INSTR(UPPER(T17.MIDDLENAME),UPPER(T27.MIDDLENAME),1) > 0 ))
> > --
> 
> Joan,
> 
>    I think that you have a design problem here. First it is obvious that
> if all your data was entered in uppercase, it would help. If you are on
> 8.1.7 or above, function-based index should help. Otherwise the code is
> obviously awkward. What use is (a.rowid, b.rowid) in (select c.rowid,
> d.rowid ...) ? Quite obviously you are scanning the same data twice. All
> your 'replace', 'instr', 'upper' etc. are performance killers. The
> easiest thing to do is possibly to denormalise PR_IDENTITY, add a column
> (trigger-entered) which contains data you can search without applying
> functions of death to it - and index it.
> --
> HTH,
> 
> Stephane Faroult
> Oriole Software
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.com
> --
> Author: Stephane Faroult
>   INET: [EMAIL PROTECTED]
> 
> Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
> San Diego, California        -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from).  You may
> also send the HELP command for other information (like subscribing).
> 
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>    Privileged/Confidential information may be contained in this message.
>           If you are not the addressee indicated in this message
>        (or responsible for delivery of the message to such person),
>             you may not copy or deliver this message to anyone.
> In such case, you should destroy this message and kindly notify the sender
>            by reply e-mail or by telephone on (61 3) 9612-6999.
>    Please advise immediately if you or your employer does not consent to
>                 Internet e-mail for messages of this kind.
>         Opinions, conclusions and other information in this message
>               that do not relate to the official business of
>                          Transurban City Link Ltd
>          shall be understood as neither given nor endorsed by it.
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.com
> --
> Author: Mark Richard
>   INET: [EMAIL PROTECTED]
> 
> Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
> San Diego, California        -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from).  You may
> also send the HELP command for other information (like subscribing).
-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Joan Hsieh
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Reply via email to