Commented and added a munin graph, if it helps. For the record, I’m happy with -par performance for now.
/Janne

On 24 Oct 2014, at 18:59, Sean Bridges <sean.brid...@gmail.com> wrote:

> Janne,
>
> I filed CASSANDRA-8177 [1] for this. Maybe comment on the jira that you are
> having the same problem.
>
> Sean
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-8177
>
> On Thu, Oct 23, 2014 at 2:04 PM, Janne Jalkanen <janne.jalka...@ecyrd.com>
> wrote:
>
> On 23 Oct 2014, at 21:29, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Thu, Oct 23, 2014 at 9:33 AM, Sean Bridges <sean.brid...@gmail.com> wrote:
>> The change from parallel to sequential is very dramatic. For a small
>> cluster with 3 nodes, using cassandra 2.0.10, a parallel repair takes 2
>> hours, and io throughput peaks at 6 mb/s. Sequential repair takes 40 hours,
>> with average io around 27 mb/s. Should I file a jira?
>>
>> As you are an actual user actually encountering the problem I had only
>> conjectured about, you are the person best suited to file such a ticket on
>> the reasonableness of the -par default. :D
>
> Hm? I’ve been banging my head against the exact same problem (cluster size
> five nodes, RF=3, ~40GB/node) - parallel repair takes about 6 hrs whereas
> serial takes some 48 hours or so. In addition, the compaction impact is
> roughly the same - that is, there’s the same number of compactions triggered
> per minute, but serial runs eight times more of them. There does not seem to
> be a difference between the node response latency during parallel or serial
> repair.
>
> NB: We do increase our compaction throughput during calmer times, and lower
> it during busy times, and the serial compaction takes enough time to hit the
> busy period - that might also have an impact on the overall performance.
>
> If I had known that this had so far been a theoretical problem, I would’ve
> spoken up earlier. Perhaps serial repair is not the best default.
>
> /Janne