[ 
https://issues.apache.org/jira/browse/CASSANDRA-16818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396622#comment-17396622
 ] 

Andres de la Peña edited comment on CASSANDRA-16818 at 8/10/21, 11:41 AM:
--------------------------------------------------------------------------

I don't think that behaviour is unexpected given the limitations of 
{{RepeatableRunner}} and the reliance on the order in which the test are run.

The tests using {{assertPostTestEnv(true)}} are those that run beyond the 
instantiation of {{SSTableMetadata}}. Those tests may or may not load the 
schema, that's why we don't do any checks about schema loading. Placing the 
tests that never load the schema at the beginning of the test should guarantee 
that those tests can verify that the schema is not loaded without the risk of 
finding it loaded by one of the tests that may or may not load it. However, 
{{RepeatableRunner}} breaks this guarantee because it runs all the iterations 
in the same jvm, so some of the tests that never load the schema are run after 
the tests that may or may not load the schema from the previous iteration. This 
way, most runs pass a few initial iterations because the schema hasn't been 
loaded yet and, once an unlucky iteration has loaded it, they start to 
consistently fail.

To see this apparently random schema loading, we can remove all the tests in 
the class except one of the tests loading the table metadata, for example 
{{testTSFormatArg}}. Then we replace {{assertPostTestEnv(true)}} by 
{{assertPostTestEnv(false)}} in that test, so it verifies that the schema is 
not loaded. If we run the test repeatedly a few hundred times, we can see that 
the test survives a few rounds and then it starts to consistently fail. 
Conversely, if we replace the call to {{assertPostTestEnv}} by a call to 
{{assertSchemaLoaded}}, we will see the opposite behaviour, it fails a few 
times and then it consistently succeeds.
{quote}Change iterations to 10 and test passes
{quote}
It seems that the schema class is only loaded after a higher number of 
iterations, I have no clue about why that happens.
{quote}Replace all {{assertPostTestEnv(true);}} with 
{{assertPostTestEnv(false);}} and test passes
{quote}
That skips all checks about schema loading so it can never fail them, do you 
mean the opposite, replacing {{assertPostTestEnv(false)}} by 
{{assertPostTestEnv(true)}}? That would also fail.


was (Author: adelapena):
I don't think that behaviour is unexpected given the limitations of 
{{RepeatableRunner}} and the reliance on the order in which the test are run.

The tests using {{assertPostTestEnv(true)}} are those that run beyond the 
instantiation of {{SSTableMetadata}}. Those tests may or may not load the 
schema, that's why in we don't do any checks about schema loading. Placing the 
tests that never load the schema at the beginning of the test should guarantee 
that those tests can verify that the schema is not loaded without the risk of 
finding it loaded by one of the tests that may or may not load it. However, 
{{RepeatableRunner}} breaks this guarantee because it runs all the iterations 
in the same jvm, so some of the tests that never load the schema are run after 
the tests that may or may not load the schema from the previous iteration. This 
way, most runs pass a few initial iterations because the schema hasn't been 
loaded yet and, once an unlucky iteration has loaded it, they start to 
consistently fail.

To see this apparently random schema loading, we can remove all the tests in 
the class except one of the tests loading the table metadata, for example 
{{testTSFormatArg}}. Then we replace {{assertPostTestEnv(true)}} by 
{{assertPostTestEnv(false)}} in that test, so it verifies that the schema is 
not loaded. If we run the test repeatedly a few hundred times, we can see that 
the test survives a few rounds and then it starts to systematically fail. 
Conversely, if we replace the call to {{assertPostTestEnv}} by a call to 
{{assertSchemaLoaded}}, we will see the opposite behaviour, it fails a few 
times and then it consistently succeeds.
{quote}Change iterations to 10 and test passes
{quote}
It seems that the schema class is only loaded after a higher number of 
iterations, I have no clue about why that happens.
{quote}Replace all {{assertPostTestEnv(true);}} with 
{{assertPostTestEnv(false);}} and test passes
{quote}
That skips all checks about schema loading so it can never fail them, do you 
mean the opposite, replacing {{assertPostTestEnv(false)}} by 
{{assertPostTestEnv(true)}}? That would also fail.

> Fix usage instructions about sstabledump -k and -x options
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-16818
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16818
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tool/sstable
>            Reporter: Andres de la Peña
>            Assignee: Andres de la Peña
>            Priority: Normal
>             Fix For: 4.0.1, 4.1
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The options {{-k}} and {{-x}} of {{sstabledump}} admit multiple arguments, so 
> users can include or exclude multiple partitions. The intended usage is, for 
> example:
> {code:java}
> $ sstablepartitions <sstable_path> -k 1
> $ sstablepartitions <sstable_path> -k 1 2 3
> {code}
> However, the following command will fail:
> {code:java}
> $ sstablepartitions -k 1 <sstable_path>
> You must supply exactly one sstable
> usage: sstabledump <sstable file path> <options>
> Dump contents of given SSTable to standard output in JSON format.
>  -d         CQL row per line internal representation
>  -e         enumerate partition keys only
>  -k <arg>   Partition key
>  -t         Print raw timestamps instead of iso8601 date strings
>  -x <arg>   Excluded partition key
> {code}
> This command fails because the sstable path is considered a fourth partition 
> key, and the mandatory argument for the sstable path is missed. While I think 
> this behaviour is correct, it can be a bit confusing for users, especially 
> when the information about usage describes both {{-k}} and {{-x}} as 
> single-argument.
> I think that at least we should fix the description of the options to 
> indicate that there could be multiple included/excluded keys, and probably 
> improve the message about the missing sstable path when those options are 
> used.
> Alternatively we could modify the options to have a single argument and allow 
> to repeat them, so we could accept things like:
> {code:java}
> $ sstablepartitions -k 1 <sstable_path>
> $ sstablepartitions -k 1 -k 2 -k 3 <sstable_path>
> $ sstablepartitions <sstable_path> -k 1 -k 2 -k 3
> {code}
> The main downside of the latter approach is that the change in the syntax of 
> the command might cause compatibility issues. Also we would need to upgrade 
> {{commons-cli}} to at least 1.2 due to CLI-137.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to