James, Thanks again. Great food for thought.
Is there a performance hit with either of the XML parsers you suggested? If so, what has been your experience? Do these XML parsers use something called "XPath" to get the data from the XML document?

Background: I have created a pre-joined result set of all the records (<CITATION>) and stored it in an Oracle9i table (a.k.a. materialized view). This basically takes all the records and joins them together into a denormalized spreadsheet to eliminate the expensive joins at runtime. One of the columns in my table contains an XML document compiled from all the tables and columns used for text indexing and searching. Like I said, the documents are stored in an Oracle9i database table column as an XML document of type XMLType (CLOB, or Character Large Object).

There is some functionality in the database to parse the XML document, but as of now I do not know how elaborate or expansive these functions are. Ideally, I am trying to make the Oracle9i database do the work for me before returning the result set to the ColdFusion Application Server for processing. So far I've been able to get a list of <agent> tags for all the <agents> in a <citation>. I wonder whether the XML functionality in the Oracle9i database exists and is as good as that of the two parsers you suggested.

I also have the option to store the values in separate columns. The problem I am running into is that multiple <agent> elements can exist for each <CITATION>, and I only want one record in the table (a.k.a. materialized view) for each <CITATION>.
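To illustrate the XPath question above, here is a minimal sketch in Python (not CFML or Oracle SQL) using the standard library's ElementTree, which supports a limited XPath subset. It uses a trimmed copy of the citation document from this thread. The function names `agent_records` and `primary_agent` are hypothetical, and nothing here reflects how the Oracle9i, MS XML, or Apache parsers expose XPath.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the citation document from this thread.
CITATION_XML = """<?xml version="1.0"?>
<citation>
  <accno>60793</accno>
  <agents>
    <agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>
    <agent aid="2481" primary="NO" acutter="B847">Bruggen, Coosje van, b.1942</agent>
    <agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b. 1947</agent>
    <agent primary="NO" aid="9391">Provost, Alain</agent>
  </agents>
</citation>"""


def agent_records(xml_text):
    """Return one (aid, name, is_primary) tuple per <agent> element."""
    root = ET.fromstring(xml_text)
    # The path is relative to the <citation> root element.
    return [
        (agent.get("aid"), agent.text, agent.get("primary") == "YES")
        for agent in root.findall("./agents/agent")
    ]


def primary_agent(xml_text):
    """Return the name of the agent marked primary="YES", or None."""
    root = ET.fromstring(xml_text)
    # An attribute predicate selects the node regardless of attribute order.
    node = root.find("./agents/agent[@primary='YES']")
    return None if node is None else node.text


if __name__ == "__main__":
    for record in agent_records(CITATION_XML):
        print(record)
    print(primary_agent(CITATION_XML))
```

Because a parser reads attributes by name, their order inside the <agent> tag does not matter, which is the main advantage over a single fixed-order regular expression.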
XML Document
--------------------------------------
<?xml version="1.0"?>
<citation>
  <accno>60793</accno>
  <title>Parc Andre-Citroen</title>
  <settitle sid="929">Paris: Parks and Gardens, 1615-1992</settitle>
  <callnumber>lGxFR B343a A53ed05</callnumber>
  <agents>
    <agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>
    <agent aid="2481" primary="NO" acutter="B847">Bruggen, Coosje van, b.1942</agent>
    <agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b. 1947</agent>
    <agent primary="NO" aid="9388" acutter="V668">Viguier, Jean-Paul</agent>
    <agent acutter="J474" primary="NO" aid="9389">Jodry, Jean-Francois</agent>
    <agent primary="NO" aid="9390" acutter="C585">Clement, Gilles</agent>
    <agent primary="NO" aid="9391">Provost, Alain</agent>
  </agents>
</citation>
-------------------------------------

Thanks
Troy

------------------------------------------
Troy Simpson
Applications Analyst/Programmer - MCSE, OCP DBA
North Carolina State University Libraries
Campus Box 7111 | Raleigh | North Carolina
ph.919.515.3855 | fax.919.513.3330
[EMAIL PROTECTED]

-----Original Message-----
From: James Ang [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 23, 2002 1:14 PM
To: CF-Talk
Subject: Re: REReplace and RegExp

Troy,

What you need is a 2-part parser. There isn't an easy way unless you decide to use the MS XML Parser or the Apache.org Java parser to parse the XML.

If you decide not to use the Apache or the MS XML parsers, here's what your tag parser would do:

Step 1: Retrieve a start tag one at a time:
        <agent([[:space:]]*|[[:space:]]+[^>]*)>
Step 2: Retrieve the individual attributes of the tag retrieved in Step 1.
Step 3: Perform transformation of the attributes in Step 2.
Step 4: Perform transformation of the tag in Step 1.
Step 5: Place the transformed string back into the XML input stream OR place the transformed string into your output stream.
Step 6: If not end of file/stream, go to Step 1.

Attached is some sample code that might help.
It is meant for CFAS 5. I wrote it thinking that it would solve your problem until I re-read your posting. :P The attached file should provide some insight, I hope. :)

For the code to work in CFAS 4.5.x, you will need to convert the UDF to custom tags.

Good luck. :) Back to *real* work. (This list is too much fun.)

James Ang
Senior Programmer
MedSeek, Inc.
[EMAIL PROTECTED]

----- Original Message -----
From: "Troy Simpson" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Tuesday, April 23, 2002 8:04 AM
Subject: RE: REReplace and RegExp

James,

Thanks for the response. It has given me other ideas about how to approach this.

It appears that the solution you provided only replaces the tags, which is part of the desired solution. I also need to obtain the values of the attributes and put them in different attributes for the <a> tag. The real kicker is that the attributes in the <agent> tag can be in any order.

For example:

From this:
<agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>
<agent aid="2481" acutter="B847" primary="NO">Bruggen, Coosje van, b.1942</agent>

To this:
<a href="results.cfm?c=aid&q=157">Oldenburg, Claes Thure, b.1929</a>
<a href="results.cfm?c=aid&q=2481">Bruggen, Coosje van, b.1942</a>

So far, I've come up with this, which is not complete:

REReplaceNoCase(
    agentList,
    '<agent[[:space:]]+primary="(YES|NO)"[[:space:]]+aid="([0-9]*)"[[:space:]]+acutter="([a-z0-9]*)">(.[^<]*?)</agent>',
    '<a href="results.cfm?c=aid&q=\2">\4</a>',
    "ALL")>

Background: I am using ColdFusion 4.51 on a Windows2000/IIS5 server.
The <agent> tags come from an XML document that looks like this:

<?xml version="1.0"?>
<citation>
  <accno>60793</accno>
  <title>Parc Andre-Citroen</title>
  <settitle sid="929">Paris: Parks and Gardens, 1615-1992</settitle>
  <callnumber>lGxFR B343a A53ed05</callnumber>
  <agents>
    <agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>
    <agent aid="2481" primary="NO" acutter="B847">Bruggen, Coosje van, b.1942</agent>
    <agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b. 1947</agent>
    <agent primary="NO" aid="9388" acutter="V668">Viguier, Jean-Paul</agent>
    <agent acutter="J474" primary="NO" aid="9389">Jodry, Jean-Francois</agent>
    <agent primary="NO" aid="9390" acutter="C585">Clement, Gilles</agent>
    <agent primary="NO" aid="9391">Provost, Alain</agent>
  </agents>
</citation>

Thanks,
Troy

------------------------------------------
Troy Simpson
Applications Analyst/Programmer - MCSE, OCP DBA
North Carolina State University Libraries
Campus Box 7111 | Raleigh | North Carolina
ph.919.515.3855 | fax.919.513.3330
[EMAIL PROTECTED]

-----Original Message-----
From: James Ang [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 22, 2002 5:45 PM
To: CF-Talk
Subject: RE: REReplace and RegExp

Try this:

REReplaceNoCase(agents, "(</?)agent([[:space:]]*>|[[:space:]]+[^>]*>)", "\1a\2", "ALL")

I have tested this code on CFAS 5 on WinXP.

James Ang
Senior Programmer
MedSeek, Inc.

-----Original Message-----
From: Troy Simpson [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 22, 2002 2:15 PM
To: CF-Talk
Subject: REReplace and RegExp

Dear CF-Talkers:

I have a string in the following format (I've added carriage returns for readability):

<cfset agents =
    '<agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>' &
    '<agent primary="NO" aid="2481" acutter="B847">Bruggen, Coosje van, b.1942</agent>' &
    '<agent primary="YES" aid="9387" acutter="B343">Berger, Patrick, b. 1947</agent>'
>

I want to process the string to look like this (replace the AGENT tags with ANCHOR tags):

<cfset agents =
    '<a href="results.cfm?c=aid&q=157">Oldenburg, Claes Thure, b.1929</a>' &
    '<a href="results.cfm?c=aid&q=2481">Bruggen, Coosje van, b.1942</a>' &
    '<a href="results.cfm?c=aid&q=9387">Berger, Patrick, b. 1947</a>'
>

I have somewhat accomplished this like so, but it still needs some work and I have become a little lost.

REReplaceNoCase(
    agent,
    '<agent[[:space:]]+primary="(YES|NO)"[[:space:]]+aid="([0-9]*)"[[:space:]]+acutter="([a-z0-9]*)">(.*)?</agent>',
    ' <a href="results.cfm?c=aid&q=\2">\4</a> ',
    "ALL")>

**Another problem is that the AGENT attributes can be in any order, which really throws a wrench into things.

Anyone have any advice on how to approach this? I would really appreciate it.

Thanks,
Troy

------------------------------------------
Troy Simpson
Applications Analyst/Programmer - MCSE, OCP-DBA
North Carolina State University Libraries
Campus Box 7111 | Raleigh | North Carolina
ph.919.515.3855 | fax.919.513.3330
[EMAIL PROTECTED]
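The two-part approach James describes in his reply earlier in this thread (match each <agent> tag, then read its attributes by name) can be sketched as follows. This is a Python illustration, not CFML and not James's attached code. The names `AGENT_RE`, `ATTR_RE`, `agent_to_anchor`, and `agents_to_anchors` are hypothetical, and the sketch assumes double-quoted attribute values and no nested tags inside <agent>.

```python
import re

# Part 1: match a whole <agent ...>text</agent> element.
# The optional group must start with whitespace, so <agents> is not matched.
AGENT_RE = re.compile(r"<agent(\s[^>]*)?>(.*?)</agent>", re.IGNORECASE | re.DOTALL)

# Part 2: pull name="value" pairs out of the start tag, in any order.
ATTR_RE = re.compile(r'([A-Za-z_][\w-]*)\s*=\s*"([^"]*)"')


def agent_to_anchor(match):
    """Build the <a> replacement for one matched <agent> element."""
    attr_text = match.group(1) or ""
    attrs = {name.lower(): value for name, value in ATTR_RE.findall(attr_text)}
    aid = attrs.get("aid")
    if aid is None:
        # Assumption: an <agent> without an aid is left unchanged.
        return match.group(0)
    # The href format is the one given in the thread.
    return '<a href="results.cfm?c=aid&q={}">{}</a>'.format(aid, match.group(2))


def agents_to_anchors(text):
    """Replace every <agent> element in text with an anchor tag."""
    return AGENT_RE.sub(agent_to_anchor, text)


if __name__ == "__main__":
    sample = (
        '<agent primary="NO" aid="157" acutter="O469">Oldenburg, Claes Thure, b.1929</agent>'
        '<agent aid="2481" acutter="B847" primary="NO">Bruggen, Coosje van, b.1942</agent>'
    )
    print(agents_to_anchors(sample))
```

Splitting the work this way avoids encoding every possible attribute order in one pattern: the outer expression only finds the element, and the attribute lookup is by name.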