Protocol buffers, thrift?
On 11/3/08 4:07 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:

Zhou, Yunqing wrote:
> embedded database cannot handle large-scale data, not very efficient
> I have about 1 billion records.
> these records should be passed through some modules.
> I mean a data exchange format similar to XML but more flexible and
> efficient.

JSON
CSV
erlang-style records (name,value,value,value)
RDF-triples in non-XML representations

For all of these, you need to test with data that includes things like
high unicode characters, single and double quotes, to see how well they
get handled.

you can actually append with XML by not having opening/closing tags,
just stream out the entries to the tail of the file
<entry>...</entry>

To read this in an XML parser, include it inside another XML file:

<?xml version="1.0"?>
<!DOCTYPE log [
  <!ENTITY log SYSTEM "log.xml">
]>
<file>
&log;
</file>

I've done this for very big files, as long as you aren't trying to load
it in-memory to a DOM, things should work

--
Steve Loughran                  http://www.1060.org/blogxter/publish/5
Author: Ant in Action           http://antbook.org/
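
The append-then-wrap approach described above can be sketched roughly as follows in Python. This is only an illustration under some assumptions: the helper names (append_entry, read_entries, demo), the wrapper file name wrapper.xml, and the sample records are all hypothetical and not from the original message. It uses the standard library SAX parser so nothing is loaded into a DOM. Python's SAX parser does not resolve external entities by default, so the sketch turns that on explicitly, which should only be done for files you trust.

```python
import os
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges
from xml.sax.saxutils import escape

# Wrapper document from the message: pulls log.xml in as an external entity.
WRAPPER = """<?xml version="1.0"?>
<!DOCTYPE log [
  <!ENTITY log SYSTEM "log.xml">
]>
<file>
&log;
</file>
"""


def append_entry(log_path, text):
    """Append one <entry> to the tail of the log; the log has no root tag."""
    with open(log_path, "a", encoding="utf-8") as f:
        # escape() handles &, < and >; quotes are legal in element text.
        f.write("<entry>" + escape(text) + "</entry>\n")


class EntryHandler(ContentHandler):
    """Streams entries to a callback instead of building a tree."""

    def __init__(self, on_entry):
        super().__init__()
        self._on_entry = on_entry
        self._buf = None  # None while outside an <entry>

    def startElement(self, name, attrs):
        if name == "entry":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "entry":
            self._on_entry("".join(self._buf))
            self._buf = None


def read_entries(wrapper_path, on_entry):
    """Parse the wrapper file; log.xml is resolved relative to it."""
    parser = xml.sax.make_parser()
    # Off by default in current Python versions; enable only for trusted input.
    parser.setFeature(feature_external_ges, True)
    parser.setContentHandler(EntryHandler(on_entry))
    parser.parse(wrapper_path)


def demo():
    """Hypothetical round trip with quotes, markup characters and high unicode."""
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "log.xml")
    wrapper_path = os.path.join(workdir, "wrapper.xml")
    with open(wrapper_path, "w", encoding="utf-8") as f:
        f.write(WRAPPER)
    samples = ["plain", "it's \"quoted\"", "a < b & c", "clef \U0001D11E"]
    for text in samples:
        append_entry(log_path, text)
    collected = []
    read_entries(wrapper_path, collected.append)
    return samples, collected
```

In real use the callback would process each record as it arrives rather than collecting them in a list; the list here only exists so the round trip can be checked against the tricky characters mentioned earlier in the thread.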