Thank you, Maxim, for summarizing this; it makes what you are proposing
to contribute much clearer.

I am OK with that in general but:

1) The table should be capped in size (I think your original patch was
already written with space limitations in mind). Since we want to keep
this on the heap, we should definitely bound its size.
2) If both tables are capped in size, could we end up with an entry in
compaction_operations_linked_tasks that has no counterpart in
compaction_operations_status (or vice versa)? Could it be done in such
a way that when an entry in compaction_operations_status is dropped (as
new entries come in, old ones are discarded), dropping that entry
automatically drops all related rows in
compaction_operations_linked_tasks?
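
To illustrate what I mean in point 2, here is a minimal sketch (the class
and method names are hypothetical, not from the actual patch): an
insertion-ordered map capped at a maximum size, where evicting the eldest
status entry drops its linked task rows in the same step, so the two views
can never reference each other inconsistently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch only. The status history is an insertion-ordered
// LinkedHashMap capped at maxEntries; when the eldest status entry is
// evicted, its linked compaction-task rows are removed in the same step.
class OperationHistory
{
    private final int maxEntries;
    // operationId -> compaction task ids linked to that operation
    private final Map<UUID, List<UUID>> tasksByOp = new HashMap<>();
    // operationId -> status; insertion order, so the eldest is evicted first
    private final LinkedHashMap<UUID, String> statusByOp;

    OperationHistory(int maxEntries)
    {
        this.maxEntries = maxEntries;
        this.statusByOp = new LinkedHashMap<>()
        {
            @Override
            protected boolean removeEldestEntry(Map.Entry<UUID, String> eldest)
            {
                boolean evict = size() > OperationHistory.this.maxEntries;
                if (evict)
                    tasksByOp.remove(eldest.getKey()); // cascade the drop
                return evict;
            }
        };
    }

    void recordStatus(UUID opId, String status)
    {
        statusByOp.put(opId, status);
    }

    void linkTask(UUID opId, UUID compactionId)
    {
        tasksByOp.computeIfAbsent(opId, k -> new ArrayList<>()).add(compactionId);
    }

    int statusCount()
    {
        return statusByOp.size();
    }

    boolean hasTasks(UUID opId)
    {
        return tasksByOp.containsKey(opId);
    }
}
```

A real implementation would of course also need to be thread-safe (the
compaction manager is touched from several threads), but the point is
only that a single eviction path keeps both collections consistent.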

For the record, we also investigated whether the existing
system.compaction_history table could somehow be used for this, but that
appears to be problematic: we would probably need to change the schema /
add new columns, which is not straightforward for a system table. We
could use the compaction_properties column, which is a map, and put all
the details there, but that becomes quite awkward to query, and even if
we managed it, it still would not address all the issues the two-table
approach does.

On Fri, Sep 27, 2024 at 4:21 PM Maxim Muzafarov <mmu...@apache.org> wrote:

> Hello everyone,
>
> I still need a few more eyes on [1][2], but this time I'm going to try
> and do some marketing for the feature I'm talking about, so...
>
>
> We are trying to bridge the gap between the API that is called and the
> compaction process that MAY or MAY NOT be called as a result, and make
> users aware of what is happening inside the cluster with their running
> commands. Currently, this can only be observed by reading logs, which
> is inconvenient both for operators and for audit subsystems that track
> the node internals.
>
> What we want to do is store the history of running operations for the
> compaction manager in a small collection in the java heap and fill
> this gap with virtual tables on top of this data collection, namely:
>
> - compaction_operations_status - has (operation_type, operation_id)
> primary key and exposes the status of the cleanup command as a whole.
> It may or may not trigger the compaction process and the compaction
> may or may not appear in the sstable_tasks virtual table (active
> compactions);
> - compaction_operations_linked_tasks - has (operation_type,
> operation_id, compaction_id) as its primary key and shows the
> relationship between the user-triggered operation and the compaction
> process invoked as a result;
>
> The CASSANDRA-19760 [1] issue covers only the cleanup command and
> demonstrates the approach; all other commands, which can be identified
> by the OperationType class, could be implemented in follow-up issues.
>
>
> Examples:
>
> - The definition of these new virtual tables looks like:
> https://gist.github.com/Mmuzaf/2d3006f5b654d54e7cabc343cd73a2a3
>
> - The output when we run the cleanup command, but it doesn't trigger
> the compaction:
>
> https://gist.github.com/user-attachments/assets/9089d5c1-70d4-475f-9cf7-cc16dff48699
>
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-19760
> [2] https://github.com/apache/cassandra/pull/3412/files
>
> On Mon, 15 Jul 2024 at 21:06, Maxim Muzafarov <mmu...@apache.org> wrote:
> >
> > Hello everyone,
> >
> > I would like to gently ask for help in reviewing the following issue
> > that we've been facing for a while:
> > https://issues.apache.org/jira/browse/CASSANDRA-19760
> >
> > When a cleanup command is called, the compaction process under the
> > hood is triggered accordingly. However, if there is nothing to compact
> > or the cleanup command returns with a status other than SUCCESSFUL,
> > there is no way to get the execution results of the command that was
> > run. This is especially true when using any kind of
> > automation/scripting on top of JMX or as a nodetool wrapper.
> >
> > I propose to keep these history results in memory for some time and
> > expose them via a virtual table so that a user can query it to check
> > the status.
> >
> > Any suggestions are welcome. I believe other commands like verify,
> > scrub, etc. can be exposed in the same way.
>
