Hi,
I don't get the point of this experiment: if v2.7 can parse the 48 MB file only 10 times while v2.8 can parse it 15 times, it means 2.8 uses less memory for the same DOM tree (I guess you are not releasing the DOM tree between the parse() operations, so all of them are kept in memory). As for your added code, you are concatenating the input file into a std::string, so it makes sense that 16 * 48 MB = 768 MB crashes your application (by the way, having 2 GB of memory doesn't imply that the program can find a contiguous chunk of 800 MB).
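
To put rough numbers on the concatenation point, here is a back-of-the-envelope sketch. The helper name peakBytesDuringAppend is hypothetical, and the model is an assumption: it supposes the std::string has to reallocate to an exact-fit buffer on every append, so the old buffer, the new buffer, the raw read buffer and the temporary std::string are all alive at once. Real implementations grow capacity geometrically, so the actual figures will differ.

```cpp
#include <iostream>

// Hypothetical helper (not part of Xerces or DOMCount).
// Estimates the bytes alive while iteration `iter` (0-based) appends one
// file of `fileBytes` bytes to the accumulated string, ASSUMING the string
// reallocates to an exact-fit buffer on every append.
unsigned long long peakBytesDuringAppend(unsigned long long fileBytes,
                                         unsigned iter)
{
    const unsigned long long oldString = fileBytes * iter;        // contents so far
    const unsigned long long newString = fileBytes * (iter + 1);  // reallocated buffer
    const unsigned long long rawBuffer = fileBytes + 1;           // pszBuffer, with '\0'
    const unsigned long long temporary = fileBytes;               // std::string( pszBuffer )
    return oldString + newString + rawBuffer + temporary;
}

// Prints the estimate for the first `iterations` passes of the test loop.
void printEstimates(unsigned long long fileBytes, unsigned iterations)
{
    const unsigned long long MB = 1024ULL * 1024ULL;
    for (unsigned i = 0; i < iterations; ++i)
    {
        std::cout << "iteration " << (i + 1)
                  << ": string " << (fileBytes * (i + 1)) / MB << " MB"
                  << ", estimated peak " << peakBytesDuringAppend(fileBytes, i) / MB
                  << " MB" << std::endl;
    }
}
```

Calling printEstimates(48ULL * 1024 * 1024, 16) from main gives, under this model, about 1584 MB for the 16th iteration, which does not leave much room in the 2 GB of user address space a 32-bit Windows process gets by default, and the largest piece (768 MB) has to be contiguous.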

Alberto

At 13.38 17/10/2007 -0500, Li, PingShan (Kansas City) wrote:
We use Xerces in our C++ project to load XML files as DOM trees.

We have one question about the memory usage of Xerces-C++. I made a small modification to the DOMCount sample project provided with Xerces to demonstrate the question.

The operating system is Windows XP Professional. Visual Studio 2003 (VC7.1) is used for the testing.



The program is tested on a box with 2 GB of RAM.



The Test.xml used here is a 48 MB XML file.

For xerces 2.7:

If I add the following code to DOMCount.cpp, I can run 10 iterations before I get an "out of memory" exception. But if I comment out the other code and run only the added code, I can run up to 16 iterations before I get the "out of memory" exception. I would expect that once "XMLPlatformUtils::Terminate()" has been called, there should be no difference in the number of iterations the added code can run before it gets the "out of memory" exception.

We used Process Explorer (http://www.microsoft.com/technet/sysinternals/utilities/processexplorer.mspx) to help us find out the memory usage of the program. The only thing that came to our attention is the virtual memory used by Xerces. Physical memory is released after XMLPlatformUtils::Terminate, but virtual memory stays at the same level.

Then I thought I would try the same code with Xerces 2.8. To my surprise, I can run up to 15 iterations before I get the out of memory exception. If I run only the added code, it throws the out of memory exception on the 16th iteration.

Is there anything that 2.7 users need to pay attention to? Could anybody please tell me why there is a difference in the number of iterations before I get the "out of memory" exception in 2.7?

Thank you

PingShan Li


    //
    // Delete the parser itself. Must be done prior to calling Terminate, below.
    //

    parser->release();

    // And call the termination method
    XMLPlatformUtils::Terminate();



/////////////////////////////////////////////////////////////////////////////
    // Code added for testing
    std::string test;
    int stringSize( 0 );
    for ( int i = 0; i < 100; ++i )
    {
      _sleep( 10 );
      std::cout << i << " " << stringSize << std::endl;

      FILE *hFile = fopen( "C:\\Test\\test.xml", "rb" );
      if ( hFile )
      {
        // Get the file size so we can allocate our buffer.
        fseek( hFile, 0, SEEK_END );
        const int nLength = ftell( hFile );
        fseek( hFile, 0, SEEK_SET );
        char* pszBuffer = new char[ nLength + 1 ];
        fread( pszBuffer, sizeof( char ), nLength, hFile );
        pszBuffer[ nLength ] = '\0';
        test += std::string( pszBuffer );
        stringSize += nLength;
        delete [] pszBuffer;
        fclose( hFile );
      }
    }

////////////////////////////////////////////////////////////////////////////////






