Ok,

This one is killing me.. And my CFMX server...

Currently I'm using the following code to parse some text:

<cfset teststr = definition>
<!--- Find the first {...} span; st.pos[1] is 0 when nothing matches --->
<cfset st = ReFindNoCase("{[^}]*}", teststr, 1, true)>
<cfloop condition="st.pos[1] GT 0">
    <!--- The matched text, braces included --->
    <cfset replaceme = mid(teststr, st.pos[1], st.len[1])>
    <!--- The same text with the surrounding braces stripped --->
    <cfset theword = mid(replaceme, 2, len(replaceme)-2)>
    <cfset teststr = ReplaceNoCase(teststr, replaceme, "<a href="""">#theword#</a>", "all")>
    <cfset st = ReFindNoCase("{[^}]*}", teststr, 1, true)>
</cfloop>

The text usually resembles the first example below, and by the time the
parser has run, everything inside the curly braces has been turned into a
hyperlink.

Fungi \Fun"gi\, n. pl. (Bot.)
   A group of thallophytic plants of low organization, destitute
   of chlorophyll, in which reproduction is mainly accomplished
   by means of asexual spores, which are produced in a great
   variety of ways, though sexual reproduction is known to occur
   in certain {Phycomycetes}, or so-called algal fungi.

   Note: The Fungi appear to have originated by degeneration
         from various alg[ae], losing their chlorophyll on
         assuming a parasitic or saprophytic life. By some they
         are divided into the subclasses {Phycomycetes}, the
         lower or algal fungi; the {Mesomycetes}, or
         intermediate fungi; and the {Mycomycetes}, or the
         higher fungi; by others into the {Phycomycetes}; the
         {Ascomycetes}, or sac-spore fungi; and the
         {Basidiomycetes}, or basidial-spore fungi.

My problem, which I have just discovered, is nested braces. With the code
above, the process goes into an infinite loop and eats all available RAM in
about 30 seconds.

The following text kills my algorithm!

Cryptogamia \Cryp`to*ga"mi*a\ (kr?p`t?-g?"m?-?), n.; pl.
   {Cryptogami[ae]} (-?). [NL., fr. Gr. krypto`s hidden, secret
   + ga`mos marriage.] (Bot.)
   The series or division of flowerless plants, or those never
   having true stamens and pistils, but propagated by spores of
   various kinds.

   Note: The subdivisions have been variously arranged. The
         following arrangement recognizes four classes: -- I.
         {{Pteridophyta}, or {Vascular Acrogens}.} These include
         Ferns, {Equiseta} or Scouring rushes, {Lycopodiace[ae]}
         or Club mosses, {Selaginelle[ae]}, and several other
         smaller orders. Here belonged also the extinct coal
         plants called {Lepidodendron}, {Sigillaria}, and
         {Calamites}. II. {{Bryophita}, or {Cellular Acrogens}}.
         These include {Musci}, or Mosses, {Hepatic[ae]}, or
         Scale mosses and Liverworts, and possibly
         {Charace[ae]}, the Stoneworts. III. {{Alg[ae]}}, which
         are divided into {Floride[ae]}, the Red Seaweeds, and
         the orders {Dictyote[ae]}, {O["o]spore[ae]},
         {Zo["o]spore[ae]}, {Conjugat[ae]}, {Diatomace[ae]}, and
         {Cryptophyce[ae]}. IV. {{Fungi}}. The molds, mildews,
         mushrooms, puffballs, etc., which are variously grouped
         into several subclasses and many orders. The {Lichenes}
         or Lichens are now considered to be of a mixed nature,
         each plant partly a Fungus and partly an Alga.

One solution would be to repeatedly pre-parse the text to remove or escape
the extra braces, but I'm not inclined to do this, as I believe I should be
able to fix my regular expression, which is currently "{[^}]*}".
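
For illustration, here is a minimal sketch of the kind of regular-expression
fix I have in mind. It is not tested on CFMX. The idea is to exclude "{" as
well as "}" from the character class so that only innermost pairs can match,
link those in a single REReplace pass using the \1 backreference, and then
strip whatever braces are left. The variable name "linked", the shortened
sample string, the empty href, and the decision to simply discard the outer
braces are all assumptions made for the example.

```cfml
<!--- Shortened sample with nested braces, taken from the Cryptogamia entry --->
<cfset teststr = "I. {{Pteridophyta}, or {Vascular Acrogens}.} IV. {{Fungi}}.">

<!--- Pass 1: link innermost pairs only; [^{}]* cannot cross another brace.
      \1 in the replacement is the text captured between the braces. --->
<cfset linked = REReplace(teststr, "\{([^{}]*)\}", "<a href="""">\1</a>", "all")>

<!--- Pass 2: drop the leftover outer braces (assumed to be unwanted) --->
<cfset linked = REReplace(linked, "[{}]", "", "all")>

<!--- Show the resulting markup as text --->
<cfoutput>#HTMLEditFormat(linked)#</cfoutput>
```

On the sample string this should leave Pteridophyta, Vascular Acrogens and
Fungi each wrapped in its own link, with no braces left over. Because there
is no loop, the replacement cannot run away, even on input with unbalanced
braces.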

Does anyone have any suggestions???

Thanks in advance

Paul