Dear P.O., dear Gil:

the version below should tackle almost all cases, except of:

  * if a programlisting element content starts with blanks followed by one or 
more [CR-]LF then no
    stripping takes place: it is assumed that the white space is there 
intentionally
  * if a programlisting element content consists of whitespace and/or {CR-]LF 
only then not
    stripping takes place: it is assumed that the white space is there 
intentionally
  * if there are blanks that follow the last [CR-]LF sequence before the end 
tag of a programlisting
    element: it is assumed that the last [CR-]LF followed by whitespace is 
there intentionally

So these cases may account for differences to Erich's post-process script.

Another change that took place: now only those xml files get rewritten that 
have their content
changed. Therefore, if you run the script 
("stripBlankLinesFromProgramlisting.rex") multiple times,
rewrites only take place the very first time. At the end of the run you will 
get brief statistics
indicating how many files got changed/rewritten.

Here the script:

    ---rgf, 2020-02-02, 2020-02-22, 2020-02-23: strip CR-LF from ooRexx xml 
program listings
    start=.dateTime~new
    call sysfiletree "*.xml", "files.", "FOS"
    end =.dateTime~new
    .count~bVerbose=.false     -- if .true, shows files with programlisting 
attributes and if rewritten
    say "SysFileTree duration:" end-start", about to process" files.0 "files"

    len=files.0~length
    do i=1 to files.0
       say i~right(len)":" files.i
       call stripBlankLines files.i
    end
    end =.dateTime~new
    say "found" .count~counter "<programlisting> elements," -
                    "(".count~attrCount "with attributes),"         -
                    "rewrote" .count~rewriteCounter "of" files.0 "files,"    -
                    "duration:" end-start

    ::routine stripBlankLines
       parse arg fileName

       inStr=.stream~new(fileName)~~open("read")
       chars=inStr~chars
       allChars=inStr~charin(1,chars)
       inStr~close

       startPgmListing ="<programlisting>"
       startNeedle="<programlisting"
       endNeedle  ="</programlisting>"
       cdataStart ="<![CDATA["
       cdataEnd   ="]]>"
       crlf       ="0d0a"x
       mbOut=.mutableBuffer~new(,chars)
       bDirty=.false
       do while allChars<>""
          parse var allChars before (startNeedle) attributes ">" program 
(endNeedle) allChars
          if attributes<>"" then
          do
             attrCount=.count~increaseAttrCount
             if .count~bVerbose then say "..." startNeedle || attributes">" 
"..." "attrCount="attrCount
          end

          if program="" then
          do
             if allChars="" then  -- arrived at end of file
                mbOut~append(before)
             else  -- maybe a placeholder of whitespace, leave as is
             do
                mbOut~append(before, startPgmListing, program, endNeedle)
               .count~increase
             end
          end
          else     -- strip leading and trailing CR-LF characters
          do
             -- check for CDATA-section, remove leading trailing blank lines 
there as well
             if program~pos(cdataStart)>0 then
             do
                parse var program (cdataStart) program (cdataEnd)
                strippedProgram=program~strip("both",crlf)
                if strippedProgram="" then    -- maybe a placeholder of 
whitespace, leave as is
                   strippedProgram=program

                bDirty=(program<>strippedProgram)

                if attributes="" then
                   mbOut~append(before, startPgmListing, cdataStart,            
  strippedProgram, cdataEnd, endNeedle)
                else
                   mbOut~append(before, startNeedle, attributes, ">", 
cdataStart, strippedProgram, cdataEnd, endNeedle)
             end
             else
             do
                strippedProgram=program~strip("both",crlf)
                if strippedProgram="" then    -- maybe a placeholder of 
whitespace, leave as is
                   strippedProgram=program

                bDirty=(program<>strippedProgram)

                if attributes="" then
                   mbOut~append(before, startPgmListing,              
strippedProgram, endNeedle)
                else
                   mbOut~append(before, startNeedle, attributes, ">", 
strippedProgram, endNeedle)
             end

             .count~increase
          end
       end

          -- write new file, if strip changes took place
      if bDirty then
      do
          if .count~bVerbose then say "... rewriting (bDirty=.true) ..."
         .count~increaseRewriteCounter
         .stream~new(fileName)~~open("write 
replace")~~charout(mbOut~string)~close
      end

    ::class count
    ::attribute counter        class
    ::attribute attrCount      class
    ::attribute rewriteCounter class
    ::attribute bVerbose       class    -- if .true will indicate attributes 
and files that get rewritten

    ::method    init           class
      expose counter attrCount rewriteCounter
      counter       =0
      attrCount     =0
      rewriteCounter=0
      bVerbose      =.false

    ::method    increase       class
      expose counter
      counter+=1
      return counter

    ::method    increaseAttrCount class
      expose attrCount
      attrCount+=1
      return attrCount

    ::method    increaseRewriteCounter class
       expose rewriteCounter
       rewriteCounter+=1
       return rewriteCounter

Running it here yields as output (in verbose mode, see top of program):

    F:\work\svn\oorexx\docs\trunk>stripBlankLinesFromProgramlisting.rex
    SysFileTree duration: 00:00:00.000000, about to process 301 files
      1: F:\work\svn\oorexx\docs\trunk\buildmachine\en-US\admin.xml
    ... cut ...

     80: F:\work\svn\oorexx\docs\trunk\oodialog\en-US\utilityclasses.xml
    ... <programlisting id="exampleLoWordClsDlgUtil"> ... attrCount=2
    ... <programlisting id="exampleSLoWordClsDlgUtil"> ... attrCount=3
    ... <programlisting id="exampleNewClsPoint"> ... attrCount=4
    ... cut ...

     92: F:\work\svn\oorexx\docs\trunk\oorexx\en-US\Notices.xml
     93: 
F:\work\svn\oorexx\docs\trunk\oorexx\publish\oorexx\en-US\Conventions.xml
    ... <programlisting language="Java"> ... attrCount=5
    ... cut ...

    76: F:\work\svn\oorexx\docs\trunk\rexxpg\en-US\api.xml
    ... <programlisting language="C++"> ... attrCount=6
        ... cut ...
    ... <programlisting language="C++"> ... attrCount=275
    177: F:\work\svn\oorexx\docs\trunk\rexxpg\en-US\Author_Group.xml
    ... cut ...

    180: F:\work\svn\oorexx\docs\trunk\rexxpg\en-US\classicapi.xml
    ... <programlisting language="C++"> ... attrCount=276
        ... cut ...
    ... <programlisting language="C++"> ... attrCount=342

    181: F:\work\svn\oorexx\docs\trunk\rexxpg\en-US\command.xml
    ... cut ...

    301: F:\work\svn\oorexx\docs\trunk\winextensions\en-US\winregistry.xml
    found 6450 <programlisting> elements, (342 with attributes), rewrote 149 of 
301 files, duration: 00:00:01.305000

Rerunning it immediately thereafter, yields:

    F:\work\svn\oorexx\docs\trunk>stripBlankLinesFromProgramlisting.rex
    SysFileTree duration: 00:00:00.016000, about to process 301 files
      1: F:\work\svn\oorexx\docs\trunk\buildmachine\en-US\admin.xml
    ... cut ...
    301: F:\work\svn\oorexx\docs\trunk\winextensions\en-US\winregistry.xml
    found 6450 <programlisting> elements, (342 with attributes), rewrote 0 of 
301 files, duration: 00:00:00.277000

The svn revert statement to undo the changes would be:

    svn revert -R *

---rony


_______________________________________________
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel

Reply via email to