@Ed, what you just said reminded me a lot of RAMP transactions. I did a blog post on it here: http://rustyrazorblade.com/2015/11/ramp-made-easy/
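The short version of the data-model idea, as a sketch (untested, and the table/column names here are mine, not from the post): every denormalized write carries the id of the transaction that produced it plus the full set of sibling keys written in the same transaction, so a reader that sees metadata referencing rows it didn't get knows it has a fractured read and can do a second round to fetch the missing versions.

    -- Sketch only (invented names): tx metadata rides along with each row.
    CREATE TABLE messages_by_user (
        user_id  text,
        msg_id   timeuuid,
        body     text,
        tx_id    timeuuid,        -- id of the transaction that wrote this row
        tx_keys  set<text>,       -- all keys written by that transaction
        PRIMARY KEY (user_id, msg_id)
    );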
I've been considering doing a follow-up on how to do a Cassandra data model enabling RAMP transactions, but that takes time, and I have almost zero of that.

On Wed, Dec 7, 2016 at 9:16 AM Edward Capriolo <edlinuxg...@gmail.com> wrote:

> I have been circling around a thought process over batches. Now that
> Cassandra has aggregating functions, it might be possible to write a type
> of record that has an END_OF_BATCH type marker, so the data can be
> suppressed from view until it is all there.
>
> IE you write something like a checksum record that an intelligent client
> can use to tell whether the rest of the batch is complete.
>
> On Wed, Dec 7, 2016 at 11:58 AM, Voytek Jarnot <voytek.jar...@gmail.com> wrote:
>
> > Been about a month since I gave up on it, but it was very much related
> > to the stuff you're dealing with... basically Cassandra just stepping
> > on its own... errrrr, tripping over its own feet streaming MVs.
> >
> > On Dec 7, 2016 10:45 AM, "Benjamin Roth" <benjamin.r...@jaumo.com> wrote:
> >
> > > I meant the MV thing.
> > >
> > > On 07.12.2016 17:27, "Voytek Jarnot" <voytek.jar...@gmail.com> wrote:
> > >
> > > > Sure, about which part?
> > > >
> > > > The default batch size warn threshold is 5kb. I've increased it to
> > > > 30kb, and will need to increase it to 40kb (8x the default) to
> > > > avoid WARN log messages about batch sizes. I do realize it's just a
> > > > WARNing, but may as well avoid those if I can configure them away.
> > > > That said, having to increase it so substantially (and we're only
> > > > dealing with 5 tables) is making me wonder whether I'm taking the
> > > > correct approach in using batches to guarantee atomicity.
> > > >
> > > > On Wed, Dec 7, 2016 at 10:13 AM, Benjamin Roth <benjamin.r...@jaumo.com> wrote:
> > > >
> > > > > Could you please be more specific?
> > > > >
> > > > > On 07.12.2016 17:10, "Voytek Jarnot" <voytek.jar...@gmail.com> wrote:
> > > > >
> > > > > > Should've mentioned - running 3.9. Also - please do not
> > > > > > recommend MVs: I tried, they're broken, we punted.
> > > > > >
> > > > > > On Wed, Dec 7, 2016 at 10:06 AM, Voytek Jarnot <voytek.jar...@gmail.com> wrote:
> > > > > >
> > > > > > > The low default value for batch_size_warn_threshold_in_kb is
> > > > > > > making me wonder if I'm perhaps approaching the problem of
> > > > > > > atomicity in a non-ideal fashion.
> > > > > > >
> > > > > > > With one data set duplicated/denormalized into 5 tables to
> > > > > > > support queries, we use batches to ensure inserts make it to
> > > > > > > all tables or none. This works fine, but I've had to bump the
> > > > > > > warn threshold and fail threshold substantially (8x higher
> > > > > > > for the warn threshold). This, in turn, makes me wonder, with
> > > > > > > a default setting so low, whether I'm solving this problem in
> > > > > > > the canonical/standard way.
> > > > > > >
> > > > > > > Mostly just looking for confirmation that we're not
> > > > > > > unintentionally doing something weird...
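FWIW, for anyone landing on this thread later: the all-or-nothing pattern Voytek describes is a single logged batch across the denormalized tables, roughly like this (table and column names invented):

    -- Sketch (invented names). A LOGGED batch guarantees all statements
    -- eventually apply or none do; it does NOT make them visible
    -- atomically, and it costs more than separate unlogged writes.
    BEGIN BATCH
        INSERT INTO events_by_id   (event_id, ts, payload)  VALUES (?, ?, ?);
        INSERT INTO events_by_user (user_id, ts, event_id)  VALUES (?, ?, ?);
        INSERT INTO events_by_day  (day, ts, event_id)      VALUES (?, ?, ?);
    APPLY BATCH;

The knobs in question are batch_size_warn_threshold_in_kb (default 5) and batch_size_fail_threshold_in_kb (default 50) in cassandra.yaml; the check is against the serialized mutation size, so five denormalized copies of the same payload add up quickly.

And Ed's end-of-batch marker, as I read it, could be as simple as an expected row count written as the last piece (again, an invented schema, not anything that exists today):

    -- Sketch of the checksum/marker idea: the writer inserts the data
    -- rows, then the static expected_rows value last; a client treats
    -- the batch as incomplete until
    --     SELECT count(*) FROM batch_rows WHERE batch_id = ?;
    -- matches expected_rows.
    CREATE TABLE batch_rows (
        batch_id      timeuuid,
        seq           int,
        payload       text,
        expected_rows int static,   -- the END_OF_BATCH marker
        PRIMARY KEY (batch_id, seq)
    );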