Re: Out of Memory Error Transforming large XML file.

Dick Applebaum Tue, 15 Jun 2004 18:06:51 -0700

That's essentially what the subroutines do -- for lines of interest --
many lines are discarded.

Are you suggesting that I list-loop an initial pass against the entire
file creating an array, then parse the array line-by-line?

Mmmmm.

Dick

On Jun 15, 2004, at 5:56 PM, Barney Boisvert wrote:

> I can't help you with the CFFILE issue (though you could probably
> implement it in java without great difficulty), but it seems that
> converting the file to an array would get around your issues with line
> pointers.  You can add or remove lines in the middle, keep track of
> where you are in the file, and do arbitrary reads and skips at any
> point.  I don't know what the overhead of that would be, but I'd
> imagine it wouldn't be any slower than the loop.
>
>  Cheers,
>  barneyb
>
>  > -----Original Message-----
>  > From: Dick Applebaum [mailto:[EMAIL PROTECTED]
>  > Sent: Tuesday, June 15, 2004 5:45 PM
>  > To: CF-Talk
>  > Subject: Re: Out of Memory Error Transforming large XML file.
>  >
>  > I finally chose to parse it myself, recoding existing Perl routines
>  > into CFMX scripts.
>  >
>  > So far it looks promising -- Most Perl structure, syntax & RegExps
>  > convert quite easily to CFScript.
>  >
>  > There are still a few disadvantages to CFMX compared with Perl:
>  >
>  > 1) CF lacks a file readLn (readLine), so you must read the entire
>  > file into memory.  You can then approximate a Perl chomp (readLn)
>  > by CF list-looping over the file with a newline delimiter.  Note:
>  > Using CF list-looping presents the next line with each iteration
>  > and performs well (ListGetAt() is too slow to be acceptable).
>  >
>  > 2) Perl maintains a pointer to the current line position in a file --
>  > this means you can chomp (readLn) the file in subroutines, then
>  > return to the caller with the current line positioned where the
>  > subroutine left it.  CF list-looping has no similar construct, nor
>  > the ability to list-loop from-to -- this means that the logic of the
>  > main routine gets more complex and there is more main-subroutine
>  > overhead.
>  >
>  > For example the xml file is divided into two major sections:  tracks
>  > followed by playlists.
>  >
>  > The Perl program chomps till it finds the first track
>  > ----passes control to a get_tracks subroutine
>  > ------subroutine chomps tracks until first play list, then returns
>  > Main program passes control to get_playlists subroutine
>  > ------subroutine chomps playlists until EOF, then returns
>  > Main program continues
>  >
>  > In the CF program
>  >
>  > Main Program loops over the list line by line.
>  > if found first track and not first playlist
>  > ---call get_tracks sub to process this track (line) only & return
>  > if found first playlist
>  > ---call get_playlists sub to process this playlist (line) only & return
>  > Main program continues
>  >
>  > As you can see, there is additional testing and call overhead for
>  > each line in the CF solution -- all because CF can't readLn a file
>  > nor manipulate the list-loop position in a subroutine.
>  >
>  > For the latter I would like to see CFML improved to include:
>  > 1) an implied list pointer for each list loop
>  > 2) ListGetFirst, ListGetLast, ListGetAt functions that reposition
>  > the list pointer
>  >
>  > Also cf list-loop with from and to parameters
>  >
>  > Also CFFile readln
>  >
>  > Dick
>
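Barney's aside about doing the read in java, combined with the "keep track
of where you are" idea, might look roughly like the sketch below.  It is an
illustration only, not code from either poster.  Everything named here is
hypothetical: the LineCursorSketch class, the LineCursor helper, and the
"track:" / "playlist:" line markers, which stand in for whatever the real
XML lines look like.  The only library call relied on for the line-at-a-time
read is java.io.BufferedReader.readLine().  Because the same cursor object
is handed to each subroutine, each one leaves the position where it stopped,
as in the Perl flow described above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the thread.
class LineCursorSketch {

    // Reads one line at a time and remembers the position between calls.
    static final class LineCursor {
        private final BufferedReader in;
        private String lookahead; // next unread line, or null at end of file

        LineCursor(Reader source) {
            in = new BufferedReader(source);
            lookahead = read();
        }

        private String read() {
            try {
                return in.readLine(); // returns null at end of file
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        boolean hasNext() { return lookahead != null; }

        // Look at the next line without consuming it.
        String peek() { return lookahead; }

        // Consume and return the next line.
        String next() {
            String line = lookahead;
            lookahead = read();
            return line;
        }
    }

    // Assumed markers, for illustration only.
    static boolean isTrack(String line) { return line.startsWith("track:"); }
    static boolean isPlaylist(String line) { return line.startsWith("playlist:"); }

    // Consumes lines up to (not including) the first playlist line.
    static List<String> getTracks(LineCursor cur) {
        List<String> tracks = new ArrayList<>();
        while (cur.hasNext() && !isPlaylist(cur.peek())) {
            String line = cur.next();
            if (isTrack(line)) {
                tracks.add(line.substring("track:".length()).trim());
            } // lines that are not of interest are discarded
        }
        return tracks;
    }

    // Consumes the rest of the input, starting where getTracks stopped.
    static List<String> getPlaylists(LineCursor cur) {
        List<String> playlists = new ArrayList<>();
        while (cur.hasNext()) {
            String line = cur.next();
            if (isPlaylist(line)) {
                playlists.add(line.substring("playlist:".length()).trim());
            }
        }
        return playlists;
    }

    // Main routine: skip to the first track, then hand off to each subroutine.
    static List<List<String>> parse(Reader source) {
        LineCursor cur = new LineCursor(source);
        while (cur.hasNext() && !isTrack(cur.peek())) {
            cur.next();
        }
        List<String> tracks = getTracks(cur);
        List<String> playlists = getPlaylists(cur);
        return List.of(tracks, playlists);
    }

    public static void main(String[] args) {
        String sample = "header\ntrack: A\nnoise\ntrack: B\nplaylist: P1\nplaylist: P2\n";
        System.out.println(parse(new StringReader(sample)));
    }
}
```

The peek() method is there so get_tracks can stop at the first playlist line
without swallowing it.  For a real file, a java.io.FileReader could be passed
to parse() in place of the StringReader, so only one line is buffered at a
time rather than the whole file.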
