https://bugzilla.wikimedia.org/show_bug.cgi?id=21937
Summary: mwdumper uses too much memory
Product: mwdumper
Version: unspecified
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: enhancement
Priority: Normal
Component: general
AssignedTo: br...@pobox.com
ReportedBy: gti...@gmail.com

I tried to run the GUI version of the newest revision (r60229) of mwdumper under Java 6 update 17 on an Intel Core i7 with 3.25 GB of RAM and WinXP SP3, and it gave this error:

Exception in thread "Thread-8" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.lang.StringCoding.safeTrim(Unknown Source)
        at java.lang.StringCoding.access$300(Unknown Source)
        at java.lang.StringCoding$StringEncoder.encode(Unknown Source)
        at java.lang.StringCoding.encode(Unknown Source)
        at java.lang.String.getBytes(Unknown Source)
        at com.mysql.jdbc.StringUtils.getBytes(StringUtils.java:493)
        at com.mysql.jdbc.StringUtils.getBytes(StringUtils.java:603)
        at com.mysql.jdbc.ByteArrayBuffer.writeStringNoNull(ByteArrayBuffer.java:544)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1638)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:2972)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:2902)
        at com.mysql.jdbc.Statement.execute(Statement.java:529)
        at org.mediawiki.importer.SqlServerStream.writeStatement(SqlServerStream.java:25)
        at org.mediawiki.importer.SqlWriter.flushInsertBuffer(SqlWriter.java:195)
        at org.mediawiki.importer.SqlWriter.bufferInsertRow(SqlWriter.java:184)
        at org.mediawiki.importer.SqlWriter15.writeRevision(SqlWriter15.java:68)
        at org.mediawiki.importer.PageFilter.writeRevision(PageFilter.java:67)
        at org.mediawiki.dumper.ProgressFilter.writeRevision(ProgressFilter.java:56)
        at org.mediawiki.importer.XmlDumpReader.closeRevision(XmlDumpReader.java:346)
        at org.mediawiki.importer.XmlDumpReader.endElement(XmlDumpReader.java:204)
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(Unknown Source)

According to the Java docs, the default max heap size is 1/4 of the physical memory, that is, around 800 MB. Since a single revision is at most 2 MB, there is no reason for mwdumper to require that much space. (It ran on the huwiki full history dump, writing directly to the database.)

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
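
The heap figure quoted in the report (around 800 MB, which matches one quarter of 3.25 GB) is an estimate, and the limit a given JVM actually applies may differ depending on its version and on whether it runs in client or server mode. A minimal sketch for checking the real limit follows; the class HeapCheck and its helper methods are hypothetical and not part of mwdumper, and only the standard java.lang.Runtime calls are used.

```java
// Hypothetical helper, not part of mwdumper: prints the heap limits of the
// running JVM so the estimate in the report can be compared with reality.
final class HeapCheck {
    static final long MIB = 1024L * 1024L;

    // Converts a byte count to whole mebibytes (rounded down).
    static long toMiB(long bytes) {
        return bytes / MIB;
    }

    // Rule-of-thumb estimate only: one quarter of physical memory, in MiB.
    // The real default depends on the JVM version and mode (assumption).
    static long quarterOfPhysicalMiB(double physicalGiB) {
        return (long) (physicalGiB * 1024.0 / 4.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most the JVM will ever try to use for the heap.
        System.out.println("max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory() is what is currently reserved; freeMemory() is the
        // unused part of that reservation.
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
        System.out.println("estimate for 3.25 GiB (MiB): "
                + quarterOfPhysicalMiB(3.25));
    }
}
```

If the reported maximum turns out to be far below the estimate, starting the JVM with an explicit limit (for example `java -Xmx1024m HeapCheck`) shows whether the limit is being picked up. Raising the limit would only be a workaround and would not explain why so much memory is needed in the first place.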