Re: [OpenBD] Memory Issue while looping over large file

Alex Skinner Thu, 12 Jan 2012 10:43:39 -0800

Seeing some code would be good. How are you doing the read?

I googled and found something like this:


<cfscript>
// Define the file to read, use forward slashes only
FileName="C:/Example/ReadMe.txt";
// Initialize Java file IO: a FileReader wrapped in a BufferedReader
FileIOClass=createObject("java","java.io.FileReader");
FileIO=FileIOClass.init(FileName);
LineIOClass=createObject("java","java.io.BufferedReader" );
LineIO=LineIOClass.init(FileIO);
</cfscript>

<CFSET EOF=0>
<CFLOOP condition="NOT EOF">
    <!--- Read in next line --->
    <CFSET CurrLine=LineIO.readLine()>
    <!--- If CurrLine is not defined, we have reached the end of file --->
    <CFIF NOT IsDefined("CurrLine")>
        <CFSET EOF=1>
        <CFBREAK>
    </CFIF>
    <CFOUTPUT>#CurrLine#<br></CFOUTPUT><CFFLUSH>
</CFLOOP>
<!--- Close the reader so the file handle is released --->
<CFSET LineIO.close()>


Is your solution similar?
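
For comparison, the accumulate-until-the-closing-tag logic described in the quoted message below could be sketched in plain Java roughly like this. It is only a sketch under assumptions: NodeExtractor and NodeHandler are hypothetical names (not part of OpenBD or the original script), at most one node is assumed to start or end on any given line, and the handler stands in for the xmlparse / xpath / database step. Unlike the original script it skips lines outside a node instead of trimming them afterwards.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: hands each <tagName>...</tagName> block to a
// callback while reading the input one line at a time.
public class NodeExtractor {

    // Hypothetical callback; the real script would run xmlparse, the
    // xpath searches and the database work here.
    public interface NodeHandler {
        void handle(String nodeXml);
    }

    // Returns the number of nodes passed to the handler.
    // Assumption: at most one node starts or ends on any single line.
    public static int extract(Reader source, String tagName, NodeHandler handler)
            throws IOException {
        String openTag = "<" + tagName;          // prefix match only
        String closeTag = "</" + tagName + ">";
        StringBuilder node = new StringBuilder();
        boolean inside = false;
        int count = 0;

        // try-with-resources closes the reader even if the handler throws
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!inside) {
                    int start = line.indexOf(openTag);
                    if (start < 0) {
                        continue;                // outside a node: keep nothing
                    }
                    line = line.substring(start); // drop text before the node
                    inside = true;
                }
                int end = line.indexOf(closeTag);
                if (end < 0) {
                    node.append(line).append('\n');
                    continue;
                }
                // Append up to and including the closing tag, drop the rest
                node.append(line, 0, end + closeTag.length());
                handler.handle(node.toString());
                count++;
                node.setLength(0);               // reuse the buffer for the next node
                inside = false;
            }
        }
        return count;
    }
}
```

The buffer is reset after every node, so only one node's text is held at a time; memory should stay flat as long as the handler does not keep references to the nodes it has already processed.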

A

On 12 January 2012 17:57, Aaron J. White <[email protected]> wrote:

> Hey all,
>
> I am receiving an OutOfMemory error while running a script that is
> trying to loop over a 1.2gb+ xml file (~ 12 million lines). I'm not
> really sure if what I am doing is just horrible and there is a better
> way or if it is a memory issue in openbd.
>
> I have assigned tomcat 2gb max memory. While I'm running the script I
> can see the memory usage slowly creep up in task manager. With 4gb of
> ram on the vps I get to about 7 million lines before tomcat gives up.
> When I had 3gb of ram on the server and 1gb applied to Tomcat I could
> only get to about 4 million lines.
>
> Here's the logic behind what I am doing.
>
> I am interested in one particular node in the large file, so I loop
> over the file line by line. As I loop, if the line does not contain
> the end of the node I'm looking for, then I do
> <cfset locals.exampleNode &= locals.line />
> Once I hit a line that contains the end of the node
> ( </example_node> ), I do a few operations to clean up any extra text
> from the front and back of the node string and then convert it to xml
> with xmlparse.
>
> Once I have the node as xml I push it to another function that does
> several things.
> ** uses xpath to grab particular information from the node. Seven
> xpath searches are done on each node unless I decide to skip the node
> after the first two xpath searches.
> ** Depending on the content I either add the information to my
> database, update the information, or skip it. I have about 5 tables
> that are getting modified from the script. A few of the unimportant
> queries use background="yes".
> The whole script runs in a cfthread so it doesn't time out.
>
> Can anyone give any insight? Also, I could post a code example, but
> my script is about 600 lines long.
>
> --
> online documentation: http://openbd.org/manual/
>   google+ hints/tips: https://plus.google.com/115990347459711259462
>     http://groups.google.com/group/openbd?hl=en
>
>     Join us @ http://www.OpenCFsummit.org/ Dallas, Feb 2012
>



-- 
Alex Skinner
Managing Director
Pixl8 Interactive

Tel: +448452600726
Email: [email protected]
Web: pixl8.co.uk

