Hi All,

My goal is to read the Project Gutenberg (http://www.gutenberg.org/) RDF catalog, parse it into a Python structure, and pull out data for each record.

The catalog is a Dublin Core RDF/XML file, divided into one section per book, each holding the details for that book. I have done a lot of research on this problem. I've tried tools such as pyrple, sax/dom/minidom, and some others, both standard and nonstandard to a Python installation. None of them has been able to read this file successfully, and those that can even see the data can take up to half an hour to load it with 2 GB of RAM.

So you all know what I'm talking about, the file is located at:
http://www.gutenberg.org/feeds/catalog.rdf.bz2

Does anyone have suggestions for a parser or converter that would let me view this file and extract data from it? Any help is appreciated.

Thanks,
Brandon McGinty
[EMAIL PROTECTED]
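
For reference, the load time and memory use described above are typical of parsers that build the whole document tree before returning. One possible direction is a streaming parse that handles one book record at a time and then discards it. The following is only a minimal sketch using the standard library's `bz2` and `xml.etree.ElementTree.iterparse`. The namespace URIs, the `pgterms:etext` record element, and the `dc:title` / `dc:creator` child elements are assumptions about the catalog layout and should be checked against the actual file; `iter_records` and `iter_catalog` are hypothetical helper names. It also assumes titles and creators are plain text directly inside those elements, which may not hold for every record.

```python
import bz2
import xml.etree.ElementTree as ET

# Assumed namespaces and element names; verify them against the real catalog.
DC = "{http://purl.org/dc/elements/1.1/}"
PGTERMS = "{http://www.gutenberg.org/rdfterms/}"


def iter_records(stream, record_tag=PGTERMS + "etext"):
    """Yield one dict per record element without keeping the whole tree."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield {
                "title": elem.findtext(DC + "title"),
                "creator": elem.findtext(DC + "creator"),
            }
            # Drop records that are already processed so memory stays bounded.
            root.clear()


def iter_catalog(path):
    """Stream records straight from the bz2-compressed catalog file."""
    with bz2.open(path, "rb") as stream:
        yield from iter_records(stream)
```

Something like `for record in iter_catalog("catalog.rdf.bz2"): print(record["title"])` would then walk the catalog one record at a time, without decompressing it to disk first.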
-- http://mail.python.org/mailman/listinfo/python-list