James,

Thanks again.  Great food for thought.

Is there a performance hit with either of the XML Parsers you suggested?
If so, what has been your experience?
Do these XML Parsers use something called "XPath" to get the data from the
XML Document?

Background:
I have created a pre-joined result set of all the records (<CITATION>) and
stored them in an Oracle9i table (a.k.a. Materialized View).  This basically
takes all the records and joins them together into a denormalized
spreadsheet to eliminate the expensive joins at runtime.  One of the
columns in my table contains an XML document compiled from all the tables
and columns used for TEXT indexing and searching.

Like I said, the documents are stored in an Oracle9i database table column
as an XML document of type XMLType (clob or Character Large Object).  There
is some functionality in the database to parse the XML Document but as of
now, I do not know how elaborate or expansive these functions are.  Ideally,
I am trying to make the Oracle9i Database do the work for me before
returning the result set to the ColdFusion Application Server for
processing.  So far I've been able to get a list of <agent> tags for all my
<agents> in a <Citation>.

I wonder if the XML functionality in the Oracle9i database exists and is as
good as that of the two parsers you suggested?

I also have the option to store the values in separate columns.  The problem
I am running into is that multiple <agents> can exist for each <CITATION>
and I only want one record in the table (a.k.a. Materialized View) for each
<CITATION>.

XML Document
--------------------------------------
<?xml version="1.0"?>
<citation>
  <accno>60793</accno>
  <title>Parc Andre-Citroen</title>
  <settitle sid="929">Paris: Parks and Gardens, 1615-1992</settitle>
  <callnumber>lGxFR B343a A53ed05</callnumber>
  <agents>
    <agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure,
b.1929</agent>
    <agent aid="2481" primary="NO" acutter="B847">Bruggen, Coosje van,
b.1942</agent>
    <agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b.
1947</agent>
    <agent primary="NO" aid="9388" acutter="V668">Viguier, Jean-Paul</agent>
    <agent acutter="J474" primary="NO" aid="9389">Jodry,
Jean-Francois</agent>
    <agent primary="NO" aid="9390" acutter="C585">Clement, Gilles</agent>
    <agent primary="NO" aid="9391">Provost, Alain</agent>
  </agents>
</citation>
-------------------------------------
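On the XPath question, here is a minimal sketch of what path-based access to the <agent> elements can look like. It is written in Python with the standard library's xml.etree.ElementTree, which supports only a limited XPath subset; it is an illustration of the idea, not a statement about how the MS parser, the Apache parser, or Oracle9i behave. The variable names and the shortened sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the citation document above (three of the seven agents).
CITATION_XML = """<?xml version="1.0"?>
<citation>
  <accno>60793</accno>
  <agents>
    <agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>
    <agent aid="2481" primary="NO" acutter="B847">Bruggen, Coosje van, b.1942</agent>
    <agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b. 1947</agent>
  </agents>
</citation>"""

root = ET.fromstring(CITATION_XML)

# "./agents/agent" is an XPath-style path relative to the <citation> root.
# Attributes are looked up by name, so their order in the tag does not matter.
agents = [(a.get("aid"), a.get("primary"), a.text)
          for a in root.findall("./agents/agent")]

# Attribute predicates are part of the supported subset:
# select only the agent marked primary="YES".
primary = root.find("./agents/agent[@primary='YES']")
```

Because the parser hands back attributes by name, the "attributes in any order" problem disappears with this approach.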
Thanks
Troy

------------------------------------------
Troy Simpson
Applications Analyst/Programmer - MCSE, OCP DBA
North Carolina State University Libraries
Campus Box 7111 | Raleigh | North Carolina
ph.919.515.3855 | fax.919.513.3330
[EMAIL PROTECTED]

-----Original Message-----
From: James Ang [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 23, 2002 1:14 PM
To: CF-Talk
Subject: Re: REReplace and RegExp


Troy,

What you need is a 2-part parser. There isn't an easy way unless you
decide to use MS XML Parser or the Apache.org Java parser to parse the
XML.

If you decide not to use the Apache or the MS XML parsers, here's how
your tag parser would work:

Step 1: Retrieve a start tag one at a time:
<agent([[:space:]]*|[[:space:]]+[^>]*)>

Step 2: Retrieve the individual attributes of the tag retrieved in Step 1.

Step 3: Perform transformation of the attributes from Step 2.

Step 4: Perform transformation of the tag from Step 1.

Step 5: Place the transformed string back into the XML input stream OR
place the transformed stream into your output stream.

Step 6: If not end of file/stream, go to Step 1.
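The steps above can be sketched roughly as follows. This is a Python illustration of the two-part idea, not the attached ColdFusion UDF: one regular expression finds each <agent> element, a second one pulls the attributes out of the start tag in whatever order they appear. The function names, and the choice to collapse line-wrap whitespace in the agent's name, are assumptions made for the example.

```python
import re

# Step 1: match one complete <agent ...>text</agent> element at a time.
# The start-tag pattern mirrors the one above, so <agents> is not matched.
AGENT_RE = re.compile(r"<agent(\s*|\s+[^>]*)>([^<]*)</agent>", re.IGNORECASE)

# Step 2: pull name="value" pairs out of the start tag, in any order.
ATTR_RE = re.compile(r'([A-Za-z_][\w-]*)\s*=\s*"([^"]*)"')


def agent_to_anchor(match):
    # Steps 2 and 3: collect the attributes by name, then pick the one we need.
    attrs = {name.lower(): value for name, value in ATTR_RE.findall(match.group(1))}
    aid = attrs.get("aid", "")
    # Collapse whitespace left over from line wrapping inside the element text.
    text = " ".join(match.group(2).split())
    # Step 4: build the replacement tag.
    return '<a href="results.cfm?c=aid&q=%s">%s</a>' % (aid, text)


def transform(fragment):
    # Steps 5 and 6: re.sub writes each replacement into the output and
    # repeats until the input is exhausted.
    return AGENT_RE.sub(agent_to_anchor, fragment)
```

Calling transform() on a string of <agent> elements returns the same string with each element rewritten as an anchor; text outside the matched elements is left alone.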

Attached is some sample code that might help. It is meant for CFAS 5. I
wrote it thinking that it would solve your problem until I re-read your
posting. :P The attached file should provide some insight, I hope. :)
For the code to work in CFAS 4.5.x, you will need to convert the UDF to
Custom Tags.

Good luck. :)

Back to *real* work. (This list is too much fun.)

James Ang
Senior Programmer
MedSeek, Inc.
[EMAIL PROTECTED]


----- Original Message -----
From: "Troy Simpson" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Tuesday, April 23, 2002 8:04 AM
Subject: RE: REReplace and RegExp


James,

Thanks for the response.  It has given me other ideas about how to
approach this.

It appears that the solution you provided only replaces the tags, which
is part of the desired solution.  I also need to obtain the values of the
attributes and put them in different attributes for the <a> tag.  The
real kicker is that the attributes in the <agent> tag can be in any
order.  For example:

From this:
<agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>
<agent aid="2481" acutter="B847" primary="NO">Bruggen, Coosje van, b.1942</agent>

To this:
<a href="results.cfm?c=aid&q=157">Oldenburg, Claes Thure, b.1929</a>
<a href="results.cfm?c=aid&q=2481">Bruggen, Coosje van, b.1942</a>

So far, I've come up with this, which is not complete:
  REReplaceNoCase(
    agentList,
    '<agent[[:space:]]+primary="(YES|NO)"[[:space:]]+aid="([0-9]*)"[[:space:]]+acutter="([a-z0-9]*)">(.[^<]*?)</agent>',
    '<a href="results.cfm?c=aid&q=\2">\4</a>',
    "ALL")>

Background:
I am using ColdFusion 4.51 on a Windows2000/IIS5 server.  The <agent>
tags come from an XML document that looks like this:

<?xml version="1.0"?>
<citation>
  <accno>60793</accno>
  <title>Parc Andre-Citroen</title>
  <settitle sid="929">Paris: Parks and Gardens, 1615-1992</settitle>
  <callnumber>lGxFR B343a A53ed05</callnumber>
  <agents>
    <agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure,
b.1929</agent>
    <agent aid="2481" primary="NO" acutter="B847">Bruggen, Coosje van,
b.1942</agent>
    <agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b.
1947</agent>
    <agent primary="NO" aid="9388" acutter="V668">Viguier,
Jean-Paul</agent>
    <agent acutter="J474" primary="NO" aid="9389">Jodry,
Jean-Francois</agent>
    <agent primary="NO" aid="9390" acutter="C585">Clement,
Gilles</agent>
    <agent primary="NO" aid="9391">Provost, Alain</agent>
  </agents>
</citation>


Thanks,
Troy

------------------------------------------
Troy Simpson
Applications Analyst/Programmer - MCSE, OCP DBA
North Carolina State University Libraries
Campus Box 7111 | Raleigh | North Carolina
ph.919.515.3855 | fax.919.513.3330
[EMAIL PROTECTED]

-----Original Message-----
From: James Ang [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 22, 2002 5:45 PM
To: CF-Talk
Subject: RE: REReplace and RegExp


Try this:

REReplaceNoCase(agents, "(</?)agent([[:space:]]*>|[[:space:]]+[^>]*>)",
"\1a\2", "ALL")

I have tested this code on CFAS 5 on WinXP.
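For readers without a ColdFusion server handy, the same pattern can be tried in Python's re module. This is a rough equivalent rather than the tested CFAS 5 code: the POSIX class [[:space:]] is written as \s, and the function name is an assumption made for the example.

```python
import re

# Same idea as the REReplaceNoCase call above: capture "<" or "</", skip the
# word "agent", and capture the rest of the tag so it can be re-emitted.
TAG_RE = re.compile(r"(</?)agent(\s*>|\s+[^>]*>)", re.IGNORECASE)


def rename_agent_tags(text):
    # \g<1> and \g<2> are the two captured groups; "a" replaces "agent".
    return TAG_RE.sub(r"\g<1>a\g<2>", text)
```

Note that this only renames the tag: the original attributes (primary, aid, acutter) are carried over unchanged into the <a> tag, and <agents> is left untouched because the pattern requires whitespace or ">" right after "agent".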

James Ang
Senior Programmer
MedSeek, Inc.


-----Original Message-----
From: Troy Simpson [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 22, 2002 2:15 PM
To: CF-Talk
Subject: REReplace and RegExp


Dear CF-Talkers:

I have a string in the following format (I've added carriage returns for
readability):

<cfset agents =
'<agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure,
b.1929</agent>' & '<agent primary="NO" aid="2481"
acutter="B847">Bruggen, Coosje van, b.1942</agent>' & '<agent
primary="YES" aid="9387" acutter="B343">Berger, Patrick, b.
1947</agent>'
>

I want to process the string to look like this (replace the AGENT tags
with ANCHOR tags):

<cfset agents =
'<a href="results.cfm?c=aid&q=157">Oldenburg, Claes Thure, b.1929</a>' &
'<a href="results.cfm?c=aid&q=2481">Bruggen, Coosje van, b.1942</a>' &
'<a href="results.cfm?c=aid&q=9387">Berger, Patrick, b. 1947</a>'
>

I have somewhat accomplished this like so but still need some work and
have become a little lost.

REReplaceNoCase(
    agents,
    '<agent[[:space:]]+primary="(YES|NO)"[[:space:]]+aid="([0-9]*)"[[:space:]]+acutter="([a-z0-9]*)">(.*)?</agent>',
    ' <a href="results.cfm?c=aid&q=\2">\4</a> ',
    "ALL")>

**Another problem is that the AGENT attributes can be in any order, which
really throws a wrench into things.

Does anyone have any advice on how to approach this?
I would really appreciate it.

Thanks,
Troy


------------------------------------------
Troy Simpson
Applications Analyst/Programmer - MCSE, OCP-DBA
North Carolina State University Libraries
Campus Box 7111 | Raleigh | North Carolina
ph.919.515.3855 | fax.919.513.3330
[EMAIL PROTECTED]




