Hi, Are there any known concerns with serializing large data sets with Thrift? I am looking to serialize messages with 10-150K records, sometimes amounting to ~30 MB per message. These messages are serialized for storage.
I have been experimenting with Google protobuf and saw this in the documentation ( http://code.google.com/apis/protocolbuffers/docs/techniques.html ): "Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy." FWIW, I did switch to the delimited write/parse API (Java only) as recommended in that doc, and it works well. But the Python protobuf implementation lacks this API and is slow.

Thanks,
Abhay
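For anyone curious what the delimited approach boils down to: the idea is length-prefixed framing, so each record is written with its size up front and the reader can stream records back one at a time instead of parsing one huge message. Below is a minimal self-contained sketch of that framing in plain Java (using a fixed 4-byte length prefix rather than protobuf's varint, and raw byte arrays rather than generated message classes, so it runs without protobuf at all). The class and method names are just for illustration, not any library's API.

```java
import java.io.*;
import java.util.*;

// Sketch of length-prefixed record framing, the same idea behind
// protobuf's delimited write/parse API. Rather than one ~30 MB message,
// each record is framed with a 4-byte length so the reader can stream
// records back individually.
public class DelimitedFraming {

    // Write one serialized record, preceded by its length.
    static void writeRecord(DataOutputStream out, byte[] record) throws IOException {
        out.writeInt(record.length);
        out.write(record);
    }

    // Read the next record, or return null at end of stream.
    static byte[] readRecord(DataInputStream in) throws IOException {
        int len;
        try {
            len = in.readInt();
        } catch (EOFException eof) {
            return null; // clean end of stream
        }
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }

    public static void main(String[] args) throws IOException {
        // Round-trip a couple of records through an in-memory stream.
        List<byte[]> records = Arrays.asList(
                "record-1".getBytes("UTF-8"),
                "record-2".getBytes("UTF-8"));

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        for (byte[] r : records) {
            writeRecord(out, r);
        }

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        byte[] r;
        int count = 0;
        while ((r = readRecord(in)) != null) {
            System.out.println(new String(r, "UTF-8"));
            count++;
        }
        System.out.println("read " + count + " records");
    }
}
```

In practice the per-record payload would be a serialized Thrift or protobuf message instead of a raw string; the framing layer stays the same either way.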
