Ours has a similar issue and I am working to solve it this weekend.
In our case, STCS keeps making one huge table's sstable file bigger and
bigger after each compaction (this is the nature of STCS, nothing wrong with it).
Even though almost all the data has a 30-day TTL, the tombstones are not
evicted, because the largest file is waiting for 3 other files of similar size
before its next compaction.  That largest file is 99.99% tombstones.

Use the command:  nodetool upgradesstables -a <keyspace> <table>
It will rewrite all existing sstables and evict the tombstones.
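For example (the keyspace and table names here are just placeholders, use the
big table you find below; I believe the syntax is the same on 2.x and 3.x, but
check nodetool help upgradesstables on your version):

nodetool upgradesstables -a my_keyspace my_big_table
nodetool compactionstats -H    # to watch the rewrite progress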

In your case, first do a few checks:
1. cd  /data/disk03/cassandra/data_prod/data
du -ks * | sort -n
to find which tables use the most space
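Since you have multiple data directories, you can run the same check across
all disks at once, something like this (the path is just based on yours,
adjust as needed):

du -ks /data/disk*/cassandra/data_prod/data/*/* | sort -n | tail -20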

2.  check the snapshots of the big tables found above;
it's possible that old snapshots are the cause.
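You can check that from nodetool instead of walking the directories, something
like this (listsnapshots should be available since 2.1, if I remember
correctly; the keyspace name is just a placeholder):

nodetool listsnapshots
nodetool clearsnapshot my_keyspace    # removes all snapshots of that keyspace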

3.  cd into the table directory and run
sstablemetadata <sstable Data.db file>
to see whether a lot of tombstones are droppable
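The line to look for is the estimated droppable tombstones ratio, e.g.
something like this (the file name is only an example, yours will differ):

sstablemetadata mc-1234-big-Data.db | grep -i droppable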

4.
ls -lhS /data/disk*/cassandra/data_prod/data/<that_keyspace>/<that_table>*/*Data.db
Look at all the sstable files and you will see which ones go into the next
compaction.

From what I have observed, small compactions seem to go to disks randomly,
but when the size is large, they go to the disk which has more free space.

5.  If the biggest file is too big, it will wait a long time for its next
compaction. You may test (sorry, this was not my case, so I am not 100% sure):
1) on newer Cassandra (3.0+), you may try nodetool compact -s  (it will split the output)
2) on older Cassandra versions, stop Cassandra and use sstablesplit
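A rough sketch of both options (again, not tested on my side, so treat the
names and paths as examples; for sstablesplit, -s is the target size in MB if
I remember correctly, and it must only be run while Cassandra is stopped):

# newer versions: major compaction with split output
nodetool compact -s my_keyspace my_big_table

# older versions: stop Cassandra first, then split the huge Data.db file
sstablesplit -s 50 /data/disk03/cassandra/data_prod/data/my_keyspace/my_big_table*/*Data.db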


Hope it helps

Thanks,

James


On Fri, Mar 9, 2018 at 7:14 AM, Kyrylo Lebediev <kyrylo_lebed...@epam.com>
wrote:

> Not sure where I heard this, but AFAIK data imbalance when multiple
> data_directories are in use is a known issue for older versions of
> Cassandra. This might be the root-cause of your issue.
>
> Which version of C* are you using?
>
> Unfortunately, don't remember in which version this imbalance issue was
> fixed.
>
>
> -- Kyrill
> ------------------------------
> *From:* Yasir Saleem <yasirsaleem9...@gmail.com>
> *Sent:* Friday, March 9, 2018 1:34:08 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: uneven data movement in one of the disk in Cassandra
>
> Hi Alex,
>
> no active compaction, right now.
>
>
>
>
> On Fri, Mar 9, 2018 at 3:47 PM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
> On Fri, Mar 9, 2018 at 11:40 AM, Yasir Saleem <yasirsaleem9...@gmail.com>
> wrote:
>
> Thanks, Nicolas Guyomar
>
> I am new to cassandra, here is the properties which I can see in yaml
> file:
>
> # of compaction, including validation compaction.
> compaction_throughput_mb_per_sec: 16
> compaction_large_partition_warning_threshold_mb: 100
>
>
> To check currently active compaction please use this command:
>
> nodetool compactionstats -H
>
> on the host which shows the problem.
>
> --
> Alex
>
>
>
