Hey all, I am getting an OutOfMemoryError while running a script that loops over a 1.2 GB+ XML file (~12 million lines). I'm not sure whether what I am doing is just horrible and there is a better way, or whether it is a memory issue in OpenBD.
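Posting the whole script isn't practical, so here's a stripped-down sketch of the loop I describe below (the file path and the processNode() function are placeholders, and the clean-up I do before parsing is condensed to a comment — the real script is much longer):

```cfml
<!--- Simplified sketch, not the actual script. Path and processNode() are placeholders. --->
<cfset locals.fileObj = fileOpen("C:\data\bigfile.xml", "read") />
<cfset locals.exampleNode = "" />
<cfloop condition="NOT fileIsEOF(locals.fileObj)">
    <cfset locals.line = fileReadLine(locals.fileObj) />
    <cfset locals.exampleNode &= locals.line />
    <cfif locals.line CONTAINS "</example_node>">
        <!--- real script first strips any extra text from the front/back of the string --->
        <cfset locals.nodeXml = xmlParse(locals.exampleNode) />
        <cfset processNode(locals.nodeXml) />
        <!--- reset the accumulator for the next node --->
        <cfset locals.exampleNode = "" />
    </cfif>
</cfloop>
<cfset fileClose(locals.fileObj) />
```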
I have assigned Tomcat 2 GB max memory. While the script is running I can watch the memory usage slowly creep up in Task Manager. With 4 GB of RAM on the VPS I get to about 7 million lines before Tomcat gives up; when the server had 3 GB of RAM with 1 GB given to Tomcat, I could only get to about 4 million lines.

Here's the logic behind what I am doing. I am interested in one particular node in the large file, so I loop over the file line by line. If the current line does not contain the end of the node I'm looking for, I append it with <cfset locals.exampleNode &= locals.line />. Once I hit a line that contains the end of the node (</example_node>), I do a few operations to clean up any extra text from the front and back of the node string and then convert it to XML with xmlParse(). Once I have the node as XML I pass it to another function that does several things:

** It uses XPath to grab particular information from the node. Seven XPath searches are done on each node, unless I decide to skip the node after the first two searches.

** Depending on the content, it either adds the information to my database, updates it, or skips the node. About 5 tables get modified by the script, and a few of the unimportant queries use background="yes".

The whole script runs in a cfthread so it doesn't time out.

Can anyone give any insight? I could post some example code, but my script is about 600 lines long.

--
online documentation: http://openbd.org/manual/
google+ hints/tips: https://plus.google.com/115990347459711259462
http://groups.google.com/group/openbd?hl=en
Join us @ http://www.OpenCFsummit.org/ Dallas, Feb 2012
