Re: [Tutor] Trying to parse a HUGE(1gb) xml file in python

Stefan Behnel Tue, 21 Dec 2010 02:25:46 -0800

Alan Gauld, 21.12.2010 10:58:

"David Hutto" wrote

http://www.google.com/search?client=ubuntu&channel=fs&q=parsing+gigabyte+xml+python&ie=utf-8&oe=utf-8


Eeek! One of the listings says:

22 Jan 2009 ... Stripping Illegal Characters from XML in Python >>

... I'd be asking Python to process 6.4 gigabytes of CSV into
6.5 gigabytes of XML 1. ..... In fact, what happened was that
the parsing didn't work and the whole db was ...

And I thought a 1G file was extreme... Do these people stop to think that
with XML as much as 80% of their "data" is just description (i.e. the tags)?

As I already said, it compresses well. In run-length compressed XML files,
the tags can easily take up a negligible amount of space compared to the
more widely varying data content (although that also commonly tends to
compress rather well). And depending on how fast your underlying storage
is, decompressing and parsing the file may still be faster than parsing a
huge uncompressed file directly. So, again, the sheer uncompressed file
size is *not* a very interesting argument.
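
To illustrate the point about decompressing and parsing in one go, here is a
minimal sketch that streams a gzip-compressed XML file through the standard
library's incremental parser. The function name count_elements, the tag name
"record" and the file name are made up for the example, and gzip is only one
possible compression format; none of this is taken from the original
poster's data.

```python
import gzip
import xml.etree.ElementTree as ET


def count_elements(path, tag):
    """Count <tag> elements in a gzip-compressed XML file.

    Hypothetical helper for illustration; 'path' and 'tag' are
    whatever your own data uses.
    """
    count = 0
    # gzip.open() decompresses on the fly, so the uncompressed
    # document never has to exist as a whole on disk or in memory.
    with gzip.open(path, "rb") as stream:
        # iterparse() yields each element as soon as its end tag is seen.
        for event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == tag:
                count += 1
                # Release the text and children of a finished record.
                elem.clear()
    return count
```

Something like count_elements("data.xml.gz", "record") would then walk the
whole file without unpacking it first. Note that the root element still
keeps references to the emptied records, so for very large inputs you would
probably also want to remove them from their parent as you go. Whether this
beats parsing the uncompressed file depends on your storage, so it is worth
timing both on your own machine.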


Stefan

