John R. Frank created CASSANDRA-5575:
----------------------------------------

             Summary: permanent client failures: attempting batch_mutate on data that serializes to more than thrift_framed_transport_size_in_mb fails forever
                 Key: CASSANDRA-5575
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5575
             Project: Cassandra
          Issue Type: Bug
            Reporter: John R. Frank

Since batch_mutate is a Thrift interface, it unifies all of the data in a batch into a single Thrift message. This means that clients cannot easily predict whether a batch will exceed thrift_framed_transport_size_in_mb.

Thrift's client libraries do not yet raise an exception on exceeding the frame size: https://issues.apache.org/jira/browse/THRIFT-1324

So, Cassandra clients are doomed to the infinite loop illustrated here: http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3calpine.deb.2.00.1305101202190.25...@computableinsights.com%3E

I still don't understand why Cassandra has both of these parameters -- the second parameter appears to be superfluous:

{code:borderStyle=solid}
# Frame size for thrift (maximum field length).
thrift_framed_transport_size_in_mb: 1500

# The max length of a thrift message, including all fields and
# internal thrift overhead.
thrift_max_message_length_in_mb: 1600
{code}

(Note the monstrous message sizes we are now using to avoid zombie clients; this is clearly too brittle to go into production. Is Cassandra really only for small batches?)

Possible solutions:

1) fix Thrift, catch the error inside all the Cassandra clients, subdivide the batch, and raise a further error if an individual message is too large.

2) change batch_mutate to serialize each mutation separately and assemble the messages into a Thrift transmission controlled more directly by the client

3) plan the end-of-life of the Thrift interfaces to Cassandra and replace them with something else -- the new "binary streaming" protocol we've been hearing about?

Other ideas?
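
The subdivision step in possible solution 1 could look roughly like the following on the client side. This is a minimal sketch in Python, not the API of Thrift or of any existing Cassandra client: split_batch, MutationTooLargeError, size_of and overhead_bytes are hypothetical names, and the sketch assumes the client can estimate each mutation's serialized size before sending (here, mutations are simply bytes objects and their size is their length).

```python
class MutationTooLargeError(Exception):
    """Raised when a single mutation alone cannot fit in one frame."""


def split_batch(mutations, max_frame_bytes, size_of=len, overhead_bytes=0):
    """Yield sub-batches whose estimated size stays within max_frame_bytes.

    mutations       -- iterable of mutations (bytes objects in this sketch)
    max_frame_bytes -- frame limit in bytes, e.g. derived from
                       thrift_framed_transport_size_in_mb
    size_of         -- hypothetical estimator of one mutation's serialized size
    overhead_bytes  -- assumed fixed per-message overhead (an assumption,
                       not a figure taken from Thrift)
    """
    batch, batch_size = [], overhead_bytes
    for mutation in mutations:
        m_size = size_of(mutation)
        # A mutation that cannot fit even on its own is a hard error:
        # retrying it would loop forever.
        if overhead_bytes + m_size > max_frame_bytes:
            raise MutationTooLargeError(
                "mutation of %d bytes exceeds frame limit of %d bytes"
                % (m_size, max_frame_bytes)
            )
        # Close the current sub-batch before it would overflow the frame.
        if batch and batch_size + m_size > max_frame_bytes:
            yield batch
            batch, batch_size = [], overhead_bytes
        batch.append(mutation)
        batch_size += m_size
    if batch:
        yield batch


if __name__ == "__main__":
    parts = list(split_batch([b"a" * 4, b"b" * 4, b"c" * 4], max_frame_bytes=10))
    print([len(p) for p in parts])
```

Each yielded sub-batch would then be sent as its own batch_mutate call, and a MutationTooLargeError would be surfaced to the caller instead of being retried.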