Cassandra 4 ViewMutationStage

2021-03-25 Thread Cameron Zemek
When is ViewMutationStage ever used? I am trying to write a test that will
result in tasks getting executed on this stage, in order to verify metrics
for ViewMutationStage.

I have been going through the code and see that StorageProxy::mutateMV will
call:

asyncWriteBatchedMutations(wrappers, localDataCenter, Stage.VIEW_MUTATION);

if the MV endpoint is not this endpoint. This ends up leading to a call to
StorageProxy::sendToHintedReplicas.

But it will only use that stage if performing the mutation locally, and I
haven't come up with a situation where it is ever going to perform locally
here. So I never get a single task on ViewMutationStage, as it sends
messages with verb MUTATION_REQ that are handled on MutationStage.

Considering local MV mutations are executed on the MUTATION stage, and
anything non-local in sendToHintedReplicas will also end up on the MUTATION
stage, what is the purpose of ViewMutationStage? Even if something did get
executed on that stage, it would only be under very specific conditions. It
seems like this stage should either be removed, or everything in
StorageProxy::mutateMV should be using the stage as the name implies.
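
To make the point concrete, here is a tiny self-contained sketch (the names
are hypothetical, this is not Cassandra's actual code) of the routing
behaviour I am describing - the requested stage only matters on the
local-apply branch, everything remote becomes a MUTATION_REQ that the peer
runs on its MutationStage:

    // Illustrative model only - hypothetical types, not Cassandra's API.
    public class StageRoutingSketch
    {
        enum Stage { MUTATION, VIEW_MUTATION }

        /** The stage the work effectively runs on for a given destination. */
        static Stage effectiveStage(boolean destinationIsLocalNode, Stage requestedStage)
        {
            if (destinationIsLocalNode)
                return requestedStage; // only path where VIEW_MUTATION would be honoured
            return Stage.MUTATION;     // remote: sent as MUTATION_REQ, handled on MutationStage
        }

        public static void main(String[] args)
        {
            System.out.println(effectiveStage(true, Stage.VIEW_MUTATION));  // VIEW_MUTATION
            System.out.println(effectiveStage(false, Stage.VIEW_MUTATION)); // MUTATION
        }
    }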


Attention to serious bug CASSANDRA-15081

2019-09-11 Thread Cameron Zemek
We have had multiple customers hit this CASSANDRA-15081 issue now, where
after upgrading from older versions the sstables contain an unknown column
(one that is not present in dropped_columns in the schema).

This bug is serious, as reads return incorrect results and if you run scrub
it will drop the row. So I am hoping to bring it some attention and have the
issue resolved. Note I have included a patch that I think does not cause
any regressions elsewhere.
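
Just to illustrate the failure mode in miniature (simplified, hypothetical
types - this is not Cassandra's code): the reader finds a column in the
sstable that is neither in the live schema nor recorded under
dropped_columns, so there is no definition to decode it with, hence the
incorrect reads and scrub dropping the row.

    import java.util.Map;
    import java.util.Optional;

    // Hypothetical illustration only of the CASSANDRA-15081 failure mode.
    final class UnknownColumnExample
    {
        static String resolve(Map<String, String> liveColumns,
                              Map<String, String> droppedColumns,
                              String columnInSSTable)
        {
            return Optional.ofNullable(liveColumns.get(columnInSSTable))
                    .or(() -> Optional.ofNullable(droppedColumns.get(columnInSSTable)))
                    .orElseThrow(() -> new IllegalStateException(
                            "Unknown column " + columnInSSTable + " in sstable"));
        }

        public static void main(String[] args)
        {
            Map<String, String> live = Map.of("a", "int");
            Map<String, String> dropped = Map.of(); // the dropped_columns entry is missing
            try
            {
                resolve(live, dropped, "b");
            }
            catch (IllegalStateException e)
            {
                System.out.println(e.getMessage()); // Unknown column b in sstable
            }
        }
    }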

Regards,
Cameron


Re: Repairing question

2017-06-25 Thread Cameron Zemek
> When you perform a non-incremental repair data is repaired but not marked
> as repaired since this requires anti-compaction to be run.

Not sure since what version, but in 3.10 at least (I think it's been this
way since the 3.x series) full repair does do anti-compaction and marks
sstables as repaired.
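
If anyone wants to check this on their own cluster, here is a rough sketch
of reading the table-level metric over JMX. The object name pattern and the
PercentRepaired metric name are my assumption from the versions I have
looked at, so verify against your version; host, port, keyspace and table
are placeholders. Checking an sstable with the sstablemetadata tool (it
reports the repairedAt timestamp) is another way to confirm.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class CheckPercentRepaired
    {
        public static void main(String[] args) throws Exception
        {
            // Assumed defaults: local node, JMX on 7199, no authentication.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try
            {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // Metric name assumed - same value that `nodetool tablestats`
                // prints as "Percent repaired".
                ObjectName name = new ObjectName(
                        "org.apache.cassandra.metrics:type=Table,keyspace=mykeyspace,scope=mytable,name=PercentRepaired");
                System.out.println("Percent repaired: " + mbs.getAttribute(name, "Value"));
            }
            finally
            {
                connector.close();
            }
        }
    }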

On 23 June 2017 at 06:30, Paulo Motta  wrote:

> > This attribute seems to be only modified when executing "nodetool repair
> > [keyspace] [table]", but not when executing with other options like
> > --in-local-dc or --pr.
>
> This is correct behavior because this metric actually represents the
> percentage of SSTables incrementally repaired - and marked as repaired
> - which doesn't happen when you execute a non-incremental repair
> (--full, --in-local-dc, --pr). When you perform a non-incremental
> repair data is repaired but not marked as repaired since this requires
> anti-compaction to be run.
>
> Actually this "percent repaired" display name is a bit misleading,
> since it sounds like data needs to be repaired while you could be
> running non-incremental repairs and still have data 100% repaired, so
> we should probably open a ticket to rename that to "Percent
> incrementally repaired" or similar.
>
>
> 2017-06-22 14:38 GMT-05:00 Javier Canillas :
> > Hi,
> >
> > I have been thinking about scheduling a daily routine to force repairs
> > on a cluster to maintain its health.
> >
> > I saw that by running nodetool tablestats [keyspace] there is an
> > attribute called "Percent repaired" that shows the percentage of data
> > repaired on each table.
> >
> > This attribute seems to be only modified when executing "nodetool repair
> > [keyspace] [table]", but not when executing with other options like
> > --in-local-dc or --pr.
> >
> > My main concern is about building the whole Merkle tree for a big table.
> > I have also checked repairing by token ranges, but this also does not
> > seem to modify this attribute of the table.
> >
> > Is this expected behavior? Or is there something missing in the code
> > that needs to be fixed?
> >
> > My "maintenance" script would call nodetool tablestats for each keyspace
> > that has replication_factor > 0, check the value of "Percent repaired"
> > for each table and, in case it is below some threshold, execute a repair
> > on it.
> >
> > Any ideas?
> >
> > Thanks in advance.
> >
> > Javier.
>


Re: New contribution - Burst Hour Compaction Strategy

2017-06-14 Thread Cameron Zemek
The main issue I see with this is "Read all the SSTables and detect which
partition keys are present in more than the compaction minimum threshold
value". This is quite expensive and will use quite a lot of I/O to
calculate. What makes writing a compaction strategy so difficult is
deciding what compactions to do without spending lots of I/O. I would
like to see benchmarks of this compaction strategy and where it offers
benefits over the existing compaction strategies. In particular, you state
one of the motivations is to "Continuous high I/O of LCS -> addressed by
the scheduling feature of BHCS", however I don't see how you have achieved
this given the read I/O required to calculate which sstables contain a key.
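
To make the concern concrete, here is a deliberately naive, self-contained
sketch (KeySource is a hypothetical stand-in, not Cassandra's API) of what
"detect which partition keys are present in more than the minimum threshold
of SSTables" implies: every key of every candidate sstable has to be read
from disk just to decide what to compact, before any compaction work even
starts.

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical stand-in for "an sstable we can iterate partition keys from";
    // in reality this means walking each sstable's keys/index on disk.
    interface KeySource
    {
        Iterable<String> partitionKeys();
    }

    final class OverlapScan
    {
        /** Counts, per key, how many sstables contain it - O(total keys) of read I/O. */
        static Map<String, Integer> keyCounts(List<KeySource> sstables)
        {
            Map<String, Integer> counts = new HashMap<>();
            for (KeySource sstable : sstables)
                for (String key : sstable.partitionKeys())
                    counts.merge(key, 1, Integer::sum);
            return counts;
        }

        /** Keys present in at least minThreshold sstables - the BHCS candidate set. */
        static long candidates(Map<String, Integer> counts, int minThreshold)
        {
            return counts.values().stream().filter(c -> c >= minThreshold).count();
        }

        public static void main(String[] args)
        {
            List<KeySource> sstables = List.of(
                    () -> List.of("a", "b", "c"),
                    () -> List.of("b", "c"),
                    () -> List.of("c"));
            Map<String, Integer> counts = keyCounts(sstables);
            System.out.println(counts + " -> keys in >= 2 sstables: " + candidates(counts, 2));
        }
    }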

On 15 June 2017 at 07:49, Pedro Gordo  wrote:

> Hi
>
> I've addressed the issues with Git. I believe this is what Stefan was
> asking for: https://github.com/sedulam/cassandra/tree/12201
> I've also added more tests for BHCS, including more for wide rows following
> Jeff's suggestion.
>
> Thanks for the directions so far! If there's something else you would like
> to see tested or some metrics, please let me know what would be relevant.
>
> All the best
>
>
> Pedro Gordo
>
> On 13 June 2017 at 15:43, Pedro Gordo  wrote:
>
> > Hi all
> >
> > Although a couple of people engaged with me directly to talk about BHCS,
> > I would also like to get the community's opinion on this, so I thought I
> > could get the discussion started by saying what the advantages would be
> > and in which types of tables BHCS would do a good job. Please keep in
> > mind that all my assumptions are without any real world experience on
> > Cassandra, so this is where I expect to see some input from the C*
> > veterans to help me steer the BHCS implementation in the right direction
> > if needed. This is a long email, so there's a TLDR if you don't want to
> > read everything. This is intended for high-level discussion. For
> > code-level discussion, please refer to the document in JIRA.
> > I'm aware that some might not like that no compaction occurs outside of
> > the burst hour, but I thought of solutions for that, so please read the
> > planned improvements below.
> >
> > *TL;DR*
> > BHCS tries to address these issues with the current compaction
> > strategies:
> > - Necessity of allocating large storage during big compactions in STCS ->
> > Through the sstable_max_size property of BHCS, we can keep SSTables below
> > a certain size, so we wouldn't have issues with size during compaction.
> > - We might get to a point where, to return the results of a query, we
> > need to read from a large number of SSTables -> BHCS addresses this by
> > making sure that the number of SSTables where a key exists will be
> > consistently maintained at a low level after every compaction. The number
> > of SSTables where a key exists is configurable, so in the limit, you
> > could set it to 1 for optimal read performance.
> > - Continuous high I/O of LCS -> addressed by the scheduling feature of
> > BHCS.
> >
> > *Longer explanation:*
> >
> > *Where would it be advantageous using BHCS?*
> > - Read-bound tables: due to BHCS maintaining the number of key copies at
> > a low level, the read speed would be consistently fast. Since there are
> > not a lot of writes in this type of table, even if there are new SSTables
> > produced containing that key, the number of SSTables containing that key
> > would be set again to 1 after burst hour (BH).
> > - Write-bound tables: in this scenario, there are a lot of SSTables
> > created outside of BH, but few reads, so the issue with existing
> > strategies would be a continuous high I/O dedicated to compaction. With
> > BHCS during these active hours, we would have an increase in disk size,
> > but I assume that this disk increase outside the BH would be tolerable
> > since a lot of space would be released during the burst. Still, if that's
> > a big issue, I plan to address this with improvement (1).
> >
> > *Where is BHCS NOT recommended and what improvements can be done to make
> > it viable?*
> > - Read- and write-heavy tables: because outside BH, SSTables would
> > increase until the burst kicks in, there can be an increase in the read
> > speed and disk used space. This could also be solved with improvement
> > (1), (3) or (5).
> >
> > *Planned Improvements:*
> > (1) - The user could indicate that he wants continuous compaction. This
> > would change the strategy in such a way that outside of the Burst Hour,
> > STCS would be used to maintain an acceptable read speed and disk used
> > space. And then when BH would kick in, it would set key copies and disk
> > size again to optimal levels.
> > (2) - During table creation, the user might not be aware of the
> > compaction configurable details, so a user-friendly configuration would
> > be provided. If the user sets the table as a Write-and-Read heavy table,
> > then improvement (1) would be 

Re: Repair Management

2017-05-18 Thread Cameron Zemek
Here is what I have done so far:
https://github.com/apache/cassandra/compare/trunk...instaclustr:repair_management

> I'm not sure what you mean by "coordinator repair commands". Do you mean
> full repairs?

By coordinator repair I meant the repair command from the coordinator node,
that is, the repair command from StorageService::repairAsync. Hopefully the
branch above shows what I mean.
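
To make it more concrete, the kind of surface I have in mind is roughly the
following (every name here is hypothetical - nothing in this sketch is an
existing Cassandra API), which a nodetool repair_admin --list / --terminate
style command could drive:

    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch of a coordinator-side repair admin MBean.
    public interface CoordinatorRepairAdminMBean
    {
        /**
         * One entry per active repair command on this coordinator, e.g.
         * {command=1, keyspace=mykeyspace, tables=colfamilya,colfamilyb,
         *  incremental=true, parallelism=parallel, progress=5%}.
         */
        List<Map<String, String>> listRepairCommands();

        /** Cancel a single repair command by its coordinator-assigned number. */
        boolean terminateRepairCommand(int commandNumber);

        /** Equivalent of ssProxy.forceTerminateAllRepairSessions(). */
        void terminateAllRepairCommands();
    }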





On 19 May 2017 at 03:16, Blake Eggleston <beggles...@apple.com> wrote:

> > I am looking to improve monitoring and management of repairs (so far I
> > have a patch for adding ActiveRepairs to table/keyspace metrics) and came
> > across ActiveRepairServiceMBean but this appears to be limited to
> > incremental repairs. Is there a reason for this?
>
> The incremental repair stuff was just the first set of jmx controls added
> to ActiveRepairService. ActiveRepairService is involved in all repairs
> though.
>
> > I was looking to add something very similar to this nodetool repair_admin
> > but it would work on co-ordinator repair commands.
>
> I'm not sure what you mean by "coordinator repair commands". Do you mean
> full repairs?
>
> > What is the purpose of the current repair_admin? If I wish to add the
> > above, should I rename the MBean to say
> > org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
> > command to inc_repair_admin?
>
> nodetool help repair_admin says its purpose is to "list and fail
> incremental repair sessions". However, failing incremental repair sessions
> doesn't mean that it cancels the validation/sync, just that it releases
> the sstables that were involved in the repair back into the unrepaired
> data set. I don't see any reason why you couldn't add this functionality
> to the existing RepairService mbean. That said, before getting into mbean
> names, it's probably best to come up with a plan for cancelling validation
> and sync on each of the replicas involved in a given repair. As far as I
> know (though I may be wrong), that's not currently supported.
>
> On May 17, 2017 at 7:36:51 PM, Cameron Zemek (came...@instaclustr.com)
> wrote:
>
> I am looking to improve monitoring and management of repairs (so far I
> have a patch for adding ActiveRepairs to table/keyspace metrics) and came
> across ActiveRepairServiceMBean but this appears to be limited to
> incremental repairs. Is there a reason for this?
>
> I was looking to add something very similar to this nodetool repair_admin
> but it would work on co-ordinator repair commands.
>
> For example:
> $ nodetool repair_admin --list
> Repair#1 mykeyspace columnFamilies=colfamilya,colfamilyb;
> incremental=True;
> parallelism=parallel progress=5%
>
> $ nodetool repair_admin --terminate 1
> Terminating repair command #1 (19f00c30-1390-11e7-bb50-ffb920a6d70f)
>
> $ nodetool repair_admin --terminate-all # calls
> ssProxy.forceTerminateAllRepairSessions()
> Terminating all repair sessions
> Terminated repair command #2 (64c44230-21aa-11e7-9ede-cd6eb64e3786)
>
> What is the purpose of the current repair_admin? If I wish to add the
> above, should I rename the MBean to say
> org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
> command to inc_repair_admin?
>
>


Repair Management

2017-05-17 Thread Cameron Zemek
I am looking to improve monitoring and management of repairs (so far I have
a patch for adding ActiveRepairs to table/keyspace metrics) and came across
ActiveRepairServiceMBean but this appears to be limited to incremental
repairs. Is there a reason for this?

I was looking to add something very similar to this nodetool repair_admin
but it would work on co-ordinator repair commands.

For example:
$ nodetool repair_admin --list
Repair#1 mykeyspace columnFamilies=colfamilya,colfamilyb; incremental=True;
parallelism=parallel progress=5%

$ nodetool repair_admin --terminate 1
Terminating repair command #1 (19f00c30-1390-11e7-bb50-ffb920a6d70f)

$ nodetool repair_admin --terminate-all  # calls
ssProxy.forceTerminateAllRepairSessions()
Terminating all repair sessions
Terminated repair command #2 (64c44230-21aa-11e7-9ede-cd6eb64e3786)

What is the purpose of the current repair_admin? If I wish to add the above,
should I rename the MBean to say
org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
command to inc_repair_admin?