Hey all,

I am getting an OutOfMemory error while running a script that loops
over a 1.2 GB+ XML file (~12 million lines). I'm not sure whether my
approach is just horrible and there is a better way, or whether this
is a memory issue in OpenBD.

I have assigned Tomcat 2 GB max memory. While the script runs I can
see the memory usage slowly creep up in Task Manager. With 4 GB of
RAM on the VPS I get to about 7 million lines before Tomcat gives up.
When the server had 3 GB of RAM and Tomcat had 1 GB, I could only get
to about 4 million lines.

Here's the logic behind what I am doing.

I am interested in one particular node in the large file, so I loop
over the file line by line. If the current line does not contain the
end of the node I'm looking for, I append it to a buffer:
<cfset locals.exampleNode &= locals.line />
Once I hit a line that contains the end of the node
( </example_node> ), I do a few operations to clean up any extra text
from the front and back of the node string and then convert it to XML
with xmlParse().
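
Stripped down, the loop looks roughly like this. It is a sketch rather
than the real script: locals.filePath, the processNode() function name
and the exact clean-up logic are placeholders, and it assumes the
opening <example_node tag is always present in the buffer by the time
the closing tag shows up.

```cfml
<cfset locals = {} />
<cfset locals.filePath = "/path/to/large-file.xml" /> <!--- placeholder path --->
<cfset locals.endTag = "</example_node>" />
<cfset locals.exampleNode = "" />

<!--- cfloop with file= hands back one line per iteration --->
<cfloop file="#locals.filePath#" index="locals.line">
    <cfif NOT find(locals.endTag, locals.line)>
        <!--- still inside (or before) the node: keep buffering --->
        <cfset locals.exampleNode &= locals.line />
    <cfelse>
        <cfset locals.exampleNode &= locals.line />

        <!--- cut away any text before the opening tag and after the closing tag --->
        <cfset locals.startPos = find("<example_node", locals.exampleNode) />
        <cfset locals.endPos = find(locals.endTag, locals.exampleNode) + len(locals.endTag) />
        <cfset locals.nodeString = mid(locals.exampleNode, locals.startPos, locals.endPos - locals.startPos) />

        <!--- parse just this one node and hand it off --->
        <cfset locals.nodeXml = xmlParse(locals.nodeString) />
        <cfset processNode(locals.nodeXml) />

        <!--- reset the buffer so the next node starts from an empty string --->
        <cfset locals.exampleNode = "" />
    </cfif>
</cfloop>
```

The buffer is reset after every node, so in principle only one node's
worth of text should be held at a time.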

Once I have the node as XML I pass it to another function that does
several things:
** Uses XPath to grab particular information from the node. Seven
XPath searches are done on each node, unless I decide to skip the node
after the first two.
** Depending on the content, I either add the information to my
database, update it, or skip it. About 5 tables get modified by the
script. A few of the unimportant queries use background="yes".
The whole script runs in a cfthread so it doesn't time out.
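
The per-node function has roughly the shape below. Again this is only
an illustrative sketch: the XPath expressions, the skip condition, the
datasource name (myDsn) and the table and column names are all made
up, and only one of the queries is shown.

```cfml
<cffunction name="processNode" access="private" returntype="void" output="false">
    <cfargument name="node" type="any" required="true" />
    <cfset var locals = {} />

    <!--- the first two XPath searches decide whether the node is skipped --->
    <cfset locals.ids = xmlSearch(arguments.node, "/example_node/id") />
    <cfset locals.types = xmlSearch(arguments.node, "/example_node/type") />
    <cfif arrayLen(locals.ids) EQ 0 OR arrayLen(locals.types) EQ 0>
        <cfreturn />
    </cfif>

    <!--- ...the remaining five XPath searches would go here... --->
    <cfset locals.names = xmlSearch(arguments.node, "/example_node/name") />
    <cfif arrayLen(locals.names) EQ 0>
        <cfreturn />
    </cfif>

    <!--- one of the "unimportant" queries, pushed to the background --->
    <cfquery datasource="myDsn" background="yes">
        INSERT INTO example_log (node_id, node_name)
        VALUES (
            <cfqueryparam value="#locals.ids[1].xmlText#" cfsqltype="cf_sql_varchar" />,
            <cfqueryparam value="#locals.names[1].xmlText#" cfsqltype="cf_sql_varchar" />
        )
    </cfquery>
</cffunction>
```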

Can anyone give any insight? I could also post some example code, but
my script is about 600 lines long.
