Can you explain to me what the correlation is between growing SSTables and
repair?
Until your mail, I was sure that repair was only for making data consistent
between nodes.

Regards


On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar <ronibaltha...@gmail.com>
wrote:

> Which error are you getting when running repairs?
> You need to run repair on your nodes within gc_grace_seconds (e.g.,
> weekly). Your tables have data that are not read frequently. You can run
> "repair -pr" on all nodes. Since you do not have deletes, you will not
> have trouble with that. If you do have deletes, it is better to increase
> gc_grace_seconds before the repair.
>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
> After repair, try running "nodetool cleanup".
>
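> As a minimal sketch of that weekly routine (assuming default nodetool
> connection settings; run on each node in turn):
>
> nodetool repair -pr
> # after repair has completed on all nodes:
> nodetool cleanup
>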
> Check if the number of SSTables goes down after that... The pending
> compactions should decrease as well...
>
> Cheers,
>
> Roni Balthazar
>
>
>
>
> On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam <ptrstp...@gmail.com> wrote:
> > 1) We tried to run repairs, but they usually do not succeed. We had
> > Leveled compaction before. Last week we ALTERed the tables to STCS,
> > because the guys from DataStax suggested that we should not use Leveled
> > compaction without SSDs (a sketch of that ALTER is below, after this
> > list). After this change we did not run any repair. Anyway, I don't
> > think it will change anything in the SSTable count - if I am wrong,
> > please let me know.
> >
> > 2) I did this. My tables are 99% write-only. It is an audit system.
> >
> > 3) Yes, I am using the default values.
> >
> > 4) In both operations I am using LOCAL_QUORUM.
> >
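> > For reference, the compaction change mentioned in 1) was along these
> > lines (keyspace and table names are placeholders):
> >
> > ALTER TABLE audit_ks.audit_events
> >   WITH compaction = {'class': 'SizeTieredCompactionStrategy'};
> >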
> > I am almost sure that the READ timeouts happen because of too many
> > SSTables. Anyway, first I would like to fix the large number of pending
> > compactions. I still don't know how to speed them up.
> >
> >
> > On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar
> > <ronibaltha...@gmail.com> wrote:
> >>
> >> Are you running repairs within gc_grace_seconds? (default is 10 days)
> >>
> >>
> >> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
> >>
> >> Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
> >> that you do not read often.
> >>
> >> Are you using default values for the properties
> >> min_compaction_threshold(4) and max_compaction_threshold(32)?
> >>
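> >> Both thresholds, like cold_reads_to_omit, are STCS compaction
> >> subproperties and can be set via CQL, e.g. (table name is a
> >> placeholder):
> >>
> >> ALTER TABLE my_ks.my_table
> >>   WITH compaction = {'class': 'SizeTieredCompactionStrategy',
> >>                      'cold_reads_to_omit': 0.0,
> >>                      'min_threshold': 4,
> >>                      'max_threshold': 32};
> >>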
> >> Which Consistency Level are you using for reading operations? Check if
> >> you are not reading from DC_B due to your Replication Factor and CL.
> >>
> >>
> >> http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
> >>
> >>
> >> Cheers,
> >>
> >> Roni Balthazar
> >>
> >> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> >> > I don't have problems with DC_B (the replica); only in DC_A (my
> >> > system writes only to it) do I have read timeouts.
> >> >
> >> > I checked the SSTable count in OpsCenter and I have:
> >> > 1) In DC_A it is about the same (+-10%) for the last week, with a
> >> > small increase over the last 24h (it is 15000-20000 SSTables,
> >> > depending on the node).
> >> > 2) In DC_B the last 24h shows up to a 50% decrease, which is a nice
> >> > prognosis. Now I have fewer than 1000 SSTables.
> >> >
> >> > What did you measure during system optimizations? Or do you have an
> >> > idea what else I should check?
> >> > 1) I looked at CPU idle (one node is 50% idle, the rest are 70% idle).
> >> > 2) Disk queue -> mostly near zero: avg 0.09. Sometimes there are
> >> > spikes.
> >> > 3) System RAM usage is almost full.
> >> > 4) In Total Bytes Compacted most lines are below 3 MB/s. For DC_A in
> >> > total it is less than 10 MB/s; in DC_B it looks much better (avg is
> >> > around 17 MB/s).
> >> >
> >> > Anything else?
> >> >
> >> >
> >> >
> >> > On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
> >> > <ronibaltha...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> You can check if the number of SSTables is decreasing. Look for the
> >> >> "SSTable count" information of your tables using "nodetool cfstats".
> >> >> The compaction history can be viewed using "nodetool
> >> >> compactionhistory".
> >> >>
> >> >> About the timeouts, check this out:
> >> >>
> >> >>
> >> >> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
> >> >> Also try running "nodetool tpstats" to see the thread pool
> >> >> statistics. It can help you find out whether you are having
> >> >> performance problems. If you have too many pending tasks or dropped
> >> >> messages, you may need to tune your system (e.g., the driver's
> >> >> timeout, concurrent reads, and so on).
> >> >>
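> >> >> For example (keyspace and table names are placeholders):
> >> >>
> >> >> nodetool cfstats my_ks.my_table | grep "SSTable count"
> >> >> nodetool compactionhistory
> >> >> nodetool tpstats
> >> >>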
> >> >> Regards,
> >> >>
> >> >> Roni Balthazar
> >> >>
> >> >> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> >> >> > Hi,
> >> >> > Thanks for your "tip"; it looks like something changed - I still
> >> >> > don't know if it is OK.
> >> >> >
> >> >> > My nodes started doing more compaction, but it looks like some
> >> >> > compactions are really slow.
> >> >> > The IO is idle and the CPU is quite OK (30%-40%). We set the
> >> >> > compaction throughput to 999, but I do not see a difference.
> >> >> >
> >> >> > Can we check anything more? Or do you have any method to monitor
> >> >> > the progress with the small files?
> >> >> >
> >> >> > Regards
> >> >> >
> >> >> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
> >> >> > <ronibaltha...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> Yes... I had the same issue, and setting cold_reads_to_omit to 0.0
> >> >> >> was the solution...
> >> >> >> The number of SSTables decreased from many thousands to below a
> >> >> >> hundred, and most of the SSTables are now much bigger, at several
> >> >> >> gigabytes each.
> >> >> >>
> >> >> >> Cheers,
> >> >> >>
> >> >> >> Roni Balthazar
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstp...@gmail.com>
> >> >> >> wrote:
> >> >> >> > After some diagnostics (we have not set cold_reads_to_omit
> >> >> >> > yet): compactions are running, but VERY slowly, with "idle" IO.
> >> >> >> >
> >> >> >> > We have a lot of data files in Cassandra. In DC_A it is about
> >> >> >> > ~120000 (counting only xxx-Data.db); DC_B has only ~4000.
> >> >> >> >
> >> >> >> > I don't know if this changes anything, but:
> >> >> >> > 1) In DC_A the avg size of a Data.db file is ~13 MB. I have a
> >> >> >> > few really big ones, but most are really small (almost 10000
> >> >> >> > files are less than 100 MB).
> >> >> >> > 2) In DC_B the avg size of a Data.db file is much bigger,
> >> >> >> > ~260 MB.
> >> >> >> >
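> >> >> >> > (For reference, counts like these can be gathered with
> >> >> >> > something like the following, assuming the default data
> >> >> >> > directory:)
> >> >> >> >
> >> >> >> > find /var/lib/cassandra/data -name '*-Data.db' | wc -l
> >> >> >> > find /var/lib/cassandra/data -name '*-Data.db' -size -100M | wc -l
> >> >> >> >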
> >> >> >> > Do you think that the above flag will help us?
> >> >> >> >
> >> >> >> >
> >> >> >> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstp...@gmail.com>
> >> >> >> > wrote:
> >> >> >> >>
> >> >> >> >> I set the compaction throughput to 999 permanently, and it
> >> >> >> >> doesn't change anything. The IO is still the same. The CPU is
> >> >> >> >> idle.
> >> >> >> >>
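> >> >> >> >> (To double check that the setting took effect - assuming your
> >> >> >> >> nodetool version provides the command:)
> >> >> >> >>
> >> >> >> >> nodetool getcompactionthroughput
> >> >> >> >>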
> >> >> >> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
> >> >> >> >> <ronibaltha...@gmail.com>
> >> >> >> >> wrote:
> >> >> >> >>>
> >> >> >> >>> Hi,
> >> >> >> >>>
> >> >> >> >>> You can run "nodetool compactionstats" to view statistics on
> >> >> >> >>> compactions.
> >> >> >> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the
> >> >> >> >>> number of SSTables when you use Size-Tiered compaction.
> >> >> >> >>> You can also create a cron job to increase the compaction
> >> >> >> >>> throughput during the night or when your IO is not busy.
> >> >> >> >>>
> >> >> >> >>> From http://wiki.apache.org/cassandra/NodeTool:
> >> >> >> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
> >> >> >> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
> >> >> >> >>>
> >> >> >> >>> Cheers,
> >> >> >> >>>
> >> >> >> >>> Roni Balthazar
> >> >> >> >>>
> >> >> >> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstp...@gmail.com>
> >> >> >> >>> wrote:
> >> >> >>> > One thing I do not understand: in my case, compaction is
> >> >> >>> > running permanently.
> >> >> >>> > Is there a way to check which compactions are pending? The only
> >> >> >>> > information available is the total count.
> >> >> >> >>> >
> >> >> >> >>> >
> >> >> >> >>> > On Monday, February 16, 2015, Ja Sam <ptrstp...@gmail.com>
> >> >> >> >>> > wrote:
> >> >> >> >>> >>
> >> >> >>> >> Of course, I made a mistake. I am using 2.1.2. Anyway, nightly
> >> >> >>> >> builds are available from
> >> >> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
> >> >> >>> >>
> >> >> >>> >> I read about cold_reads_to_omit; it looks promising. Should I
> >> >> >>> >> also set the compaction throughput?
> >> >> >> >>> >>
> >> >> >>> >> P.S. I am really sad that I didn't read this before:
> >> >> >>> >> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
> >> >> >> >>> >>
> >> >> >> >>> >>
> >> >> >> >>> >>
> >> >> >>> >> On Monday, February 16, 2015, Carlos Rolo <r...@pythian.com>
> >> >> >>> >> wrote:
> >> >> >> >>> >>>
> >> >> >>> >>> Hi, 100% in agreement with Roland.
> >> >> >>> >>>
> >> >> >>> >>> The 2.1.x series is a pain! I would never recommend the
> >> >> >>> >>> current 2.1.x series for production.
> >> >> >>> >>>
> >> >> >>> >>> Clocks are a pain, and check your connectivity! Also check
> >> >> >>> >>> tpstats to see if your thread pools are being overrun.
> >> >> >> >>> >>>
> >> >> >> >>> >>> Regards,
> >> >> >> >>> >>>
> >> >> >> >>> >>> Carlos Juzarte Rolo
> >> >> >> >>> >>> Cassandra Consultant
> >> >> >> >>> >>>
> >> >> >> >>> >>> Pythian - Love your data
> >> >> >> >>> >>>
> >> >> >> >>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
> >> >> >> >>> >>> linkedin.com/in/carlosjuzarterolo
> >> >> >> >>> >>> Tel: 1649
> >> >> >> >>> >>> www.pythian.com
> >> >> >> >>> >>>
> >> >> >> >>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
> >> >> >> >>> >>> <r.etzenham...@t-online.de> wrote:
> >> >> >> >>> >>>>
> >> >> >> >>> >>>> Hi,
> >> >> >> >>> >>>>
> >> >> >>> >>>> 1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0
> >> >> >>> >>>> (suggested by Al Tobey from DataStax)
> >> >> >>> >>>> 7) minimal reads (usually none, sometimes a few)
> >> >> >>> >>>>
> >> >> >>> >>>> Those two points keep me repeating an answer I got. First,
> >> >> >>> >>>> where did you get 2.1.3 from? Maybe I missed it; I will have
> >> >> >>> >>>> a look. But if it is 2.1.2, which is the latest released
> >> >> >>> >>>> version, that version has many bugs - most of them I got
> >> >> >>> >>>> kicked by while testing 2.1.2. I got many problems with
> >> >> >>> >>>> compactions not being triggered on column families not being
> >> >> >>> >>>> read, and with compactions and repairs not being completed.
> >> >> >>> >>>> See
> >> >> >>> >>>>
> >> >> >>> >>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
> >> >> >>> >>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
> >> >> >> >>> >>>>
> >> >> >>> >>>> Apart from that, how are those two datacenters connected?
> >> >> >>> >>>> Maybe there is a bottleneck.
> >> >> >> >>> >>>>
> >> >> >>> >>>> Also, do you have ntp up and running on all nodes to keep
> >> >> >>> >>>> all clocks in tight sync?
> >> >> >> >>> >>>>
> >> >> >> >>> >>>> Note: I'm no expert (yet) - just sharing my 2 cents.
> >> >> >> >>> >>>>
> >> >> >> >>> >>>> Cheers,
> >> >> >> >>> >>>> Roland
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>> --
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >
> >> >> >> >>
> >> >> >> >>
> >> >> >> >
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
