The workarounds we can apply at the Cassandra level have too high a cost:benefit ratio. The long-term fix is to move to Avro.

2010/3/26 Ted Zlatanov <t...@lifelogs.com>:
> On Fri, 26 Mar 2010 07:48:43 -0500 Jonathan Ellis <jbel...@gmail.com> wrote:
>
> JE> 2010/3/26 Ted Zlatanov <t...@lifelogs.com>:
>>> I know this has been discussed in tickets and here previously. I just
>>> wanted to comment on it because of the upcoming 0.6 release.
>>>
>>> In my environment I patch Cassandra to prevent the OOM errors from
>>> malformed incoming Thrift data, which as everyone knows let anyone crash
>>> the servers hard with a netcat invocation. For those who don't know the
>>> story, see https://issues.apache.org/jira/browse/THRIFT-601
>>>
>>> I think the OOM guard should be in the Cassandra releases, at least as
>>> an option. Just because Thrift doesn't give us airbags doesn't mean we
>>> don't need brakes.
>
> JE> Catching OOME is a bug, not a fix. OOME is the JVM saying "I give up;
> JE> you're screwed." The JVM isn't stable anymore.
>
> I didn't know that, thanks for explaining. I thought the JVM could
> recover.
>
> Can we patch the Thrift-generated Java code, at least, set the read
> length, or do something else? I hate to give up on this just because
> Thrift is broken (as we've discussed, there's no viable Thrift
> replacement yet, and we won't allow users to replace the Thrift API with
> their own implementation as I proposed with IPluggableAPI).
>
> Thanks
> Ted
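
For anyone following the "set the read length" idea in the quoted message, here is a minimal sketch of the general technique: check a declared length against a cap before allocating a buffer for it, rather than trying to catch the OOME afterwards. This is not Thrift's generated code and not the patch Ted describes. `BoundedFrameReader`, `readFrame`, and the 16 MB cap are hypothetical names and values picked for illustration, and the sketch assumes a simple 4-byte big-endian length prefix followed by the payload.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration only; not part of Thrift or Cassandra.
final class BoundedFrameReader {
    // Assumed cap; a real deployment would choose its own limit.
    static final int MAX_FRAME_BYTES = 16 * 1024 * 1024;

    // Reads one length-prefixed frame, refusing to allocate for a
    // declared length that is negative or larger than maxBytes.
    static byte[] readFrame(DataInputStream in, int maxBytes) throws IOException {
        int length = in.readInt(); // 4-byte big-endian length prefix
        if (length < 0 || length > maxBytes) {
            throw new IOException("Rejected frame with declared length " + length);
        }
        byte[] buf = new byte[length]; // allocation is now bounded by maxBytes
        in.readFully(buf);             // throws EOFException if the payload is short
        return buf;
    }

    public static void main(String[] args) {
        // Four bytes of garbage that decode to a length of about 2 GB.
        byte[] bogus = {0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xff};
        try {
            readFrame(new DataInputStream(new ByteArrayInputStream(bogus)), MAX_FRAME_BYTES);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the check is that the oversized request is refused with an ordinary exception before any large allocation happens, so the JVM never reaches the OOME state described above.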