[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435921#comment-17435921 ] Berenguer Blasi edited comment on CASSANDRA-17070 at 10/29/21, 10:46 AM: - They didn't fail sort to speak... They show the 13200 passed tests with no _test_ failures. It failed the run bc of the wilcard. It is easy to remove the noise by switching to {{test}} instead of {{testsome}}. Removing the cleanup logic breaks the {{CQLTester}} teardown when it tries to drop tables that have attached MVs. So I want to give that a thought... was (Author: bereng): They didn't fail sort to speak... They show the 13200 passed tests with no _test_ failures. It failed the run bc of the wilcard. It is easy to fix it by switching to {{test}} instead of {{testsome}}. Removing the cleanup logic breaks the {{CQLTester}} teardown when it tries to drop tables that have attached MVs. So I want to give that a thought... > ViewComplexTest hardening > - > > Key: CASSANDRA-17070 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17070 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.x > > > I have seen a number of times already the {{ViewComplexTest}} family timeout > on test method teardown. This leaves a dirty env behind triggering the > following test methods to fail on it. This ticket aims at hardening them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435885#comment-17435885 ] Andres de la Peña edited comment on CASSANDRA-17070 at 10/29/21, 10:24 AM: --- It seems that the multiplexer runs in the PR have failed due to the wildcard. I think it would work if we use {{ant test}} instead of {{ant testsome}}, with the unqualified class name: {code:java} .circleci/generate.sh -m \ -e REPEATED_UTEST_TARGET=test \ -e REPEATED_UTEST_COUNT=500 \ -e REPEATED_UTEST_CLASS=ViewComplex*Test {code} The wildcard is a great idea, it would have saved me time with the above runs :) I think we can try some more runs without the duplicated cleanup logic and see how it goes, wdyt? was (Author: adelapena): It seems that the multiplexer runs have failed due to the wildcard. I think it would work if we use {{ant test}} instead of {{ant testsome}}, with the unqualified class name: {code} .circleci/generate.sh -m \ -e REPEATED_UTEST_TARGET=test \ -e REPEATED_UTEST_COUNT=500 \ -e REPEATED_UTEST_CLASS=ViewComplex*Test {code} The wildcard is a great idea, it would have saved me time with the above runs :) I think we can try some more runs without the duplicated cleanup logic and see how it goes, wdyt? > ViewComplexTest hardening > - > > Key: CASSANDRA-17070 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17070 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.x > > > I have seen a number of times already the {{ViewComplexTest}} family timeout > on test method teardown. This leaves a dirty env behind triggering the > following test methods to fail on it. This ticket aims at hardening them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435840#comment-17435840 ] Berenguer Blasi edited comment on CASSANDRA-17070 at 10/29/21, 8:14 AM: [~adelapena] the reason to run that code twice was that the test keeps a list of the views. So if a cleanup failed the next one would pick up from the failing point But +1 to the abstract class and unique MV name. I ported your changes to my PR but: I deleted the extra cleanup which is not needed anymore, I changed the config to run {{ViewComplex*Test}} instead of having one commit per repeat test and I attached results to the PR with 100 repeats and 0 failures (CASSANDRA-17083). I am only on the fence about the cleanup logic, which we could remove now there are unique MV names, but on the other hand it's nice to have it. I think I am +1 on how things are right now. Wdyt? was (Author: bereng): [~adelapena] the reason to run that code twice was that the test keeps a list of the views. So if a cleanup failed the next one would pick up from the failing point But +1 to the abstract class and unique MV name. I ported your changes to my PR but: I deleted the extra cleanup which is not needed anymore, I changed the config to run {{ViewComplex*Test}} instead of having one commit per repeat test and I attached results to the PR with 100 repeats and 0 failures. I am only on the fence about the cleanup logic, which we could remove now there are unique MV names, but on the other hand it's nice to have it. I think I am +1 on how things are right now. Wdyt? > ViewComplexTest hardening > - > > Key: CASSANDRA-17070 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17070 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.x > > > I have seen a number of times already the {{ViewComplexTest}} family timeout > on test method teardown. This leaves a dirty env behind triggering the > following test methods to fail on it. This ticket aims at hardening them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435840#comment-17435840 ] Berenguer Blasi edited comment on CASSANDRA-17070 at 10/29/21, 8:07 AM: [~adelapena] the reason to run that code twice was that the test keeps a list of the views. So if a cleanup failed the next one would pick up from the failing point But +1 to the abstract class and unique MV name. I ported your changes to my PR but: I deleted the extra cleanup which is not needed anymore, I changed the config to run {{ViewComplex*Test}} instead of having one commit per repeat test and I attached results to the PR with 100 repeats and 0 failures. I am only on the fence about the cleanup logic, which we could remove now there are unique MV names, but on the other hand it's nice to have it. I think I am +1 on how things are right now. Wdyt? was (Author: bereng): [~adelapena] the reason to run that code twice was that the test keeps a list of the views. So if a cleanup failed the next one would pick up from the failing point But +1 to the abstract class and unique MV name. I ported your changes to my PR but: I deleted the extra cleanup which is not needed anymore, I changed the config to run {{ViewComplex*Test}} instead of having one commit per repeat test and I attached results to the PR. I am only on the fence about the cleanup logic, which we could remove now there are unique MV names, but on the other hand it's nice to have it. I think I am +1 on how things are right now. Wdyt? > ViewComplexTest hardening > - > > Key: CASSANDRA-17070 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17070 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.x > > > I have seen a number of times already the {{ViewComplexTest}} family timeout > on test method teardown. This leaves a dirty env behind triggering the > following test methods to fail on it. This ticket aims at hardening them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435553#comment-17435553 ] Andres de la Peña edited comment on CASSANDRA-17070 at 10/28/21, 4:54 PM: -- I'm not sure I understand why the MV cleanup can fail. Would the second round of cleanup guarantee that the MV cleanups succeed, or would it reduce the chances of having leftovers because it's unlikely to have the same error twice? Maybe it's just giving more time for the first async cleanup to happen? In any case, here are 500 runs of each ViewComplex test: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1090/workflows/4362eefe-a13a-4a8c-a8eb-9e543a74623a/jobs/10187] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1091/workflows/a9185c89-ceff-4ddd-90c0-960281233fe3/jobs/10189] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1092/workflows/1f74edb4-3a41-47fd-b95c-0edc1d514a20/jobs/10191] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1093/workflows/f2e69851-19ea-4ab6-9315-5af9d721ec3d/jobs/10193] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1094/workflows/0d25bf72-7c90-4287-92d6-21a751082e21/jobs/10196] >From these runs only a test runner for {{ViewComplexTest}} is failing due to >timeouts. It's not strictly related to the fix, but it seems that there is some repeated code among the five tests in which {{ViewComplexTest}} was split by CASSANDRA-16670. Probably we should do the same thing that we did with {{ViewTest}} in CASSANDRA-16777 and extract a superclass to avoid code duplication. This would make changes like this one a bit easier to do. I gave it a go in [this commit|https://github.com/adelapena/cassandra/commit/204f3bbe69aae9165c0570224bfd811c8bbd4d07], and it saves us some 176 lines of code, which I think is always nice. Here are some repeated runs for the tests using a common superclass, with no failures so far: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1096/workflows/ded0dfe1-e736-468a-babb-0c007a6bf3c9/jobs/10201] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1097/workflows/544b429d-b712-428c-b97d-edd50076992a/jobs/10203] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1098/workflows/c062d815-2e26-4a75-aa64-656bfc48c8cc/jobs/10205] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1099/workflows/1f834191-b30b-4380-87ce-da257fac731e/jobs/10207] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1095/workflows/3df8138d-1d7b-4283-be3c-a23cb2c0bc40] An alternative/complementary approach to avoid problems with leftovers from previous runs is making sure that all MVs are created with a new, unique name. This way the tests don't collide with any surviving MVs if the cleanup has failed or if it hasn't finished. {{CQLTester}} does this with keyspace, table, function and aggregate names. I gave a go to this generation of unique names [here|https://github.com/adelapena/cassandra/commit/e1928447e4139f1ad4126d9ed34e3de9d3bceb10]. It is just for {{ViewComplexTest}}, but we may consider having a more generic and reusable solution for creating/cleaning MVs in {{CQLTester}}, although I think we don't have to do it here. wdyt? was (Author: adelapena): I'm not sure I understand why the MV cleanup can fail. Would the second round of cleanup guarantee that the MV cleanups succeed, or would it reduce the chances of having leftovers because it's unlikely to have the same error twice? Maybe it's just giving more time for the first async cleanup to happen? In any case, here are 500 runs of each ViewComplex test: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1090/workflows/4362eefe-a13a-4a8c-a8eb-9e543a74623a/jobs/10187] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1091/workflows/a9185c89-ceff-4ddd-90c0-960281233fe3/jobs/10189] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1092/workflows/1f74edb4-3a41-47fd-b95c-0edc1d514a20/jobs/10191] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1093/workflows/f2e69851-19ea-4ab6-9315-5af9d721ec3d/jobs/10193] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1094/workflows/0d25bf72-7c90-4287-92d6-21a751082e21/jobs/10196] >From this only a test runner for {{ViewComplexTest}} is failing due to >timeouts. It's not strictly related to the fix, but it seems there is some repeated code among the five tests in which {{ViewCo
[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435553#comment-17435553 ] Andres de la Peña edited comment on CASSANDRA-17070 at 10/28/21, 4:49 PM: -- I'm not sure I understand why the MV cleanup can fail. Would the second round of cleanup guarantee that the MV cleanups succeed, or would it reduce the chances of having leftovers because it's unlikely to have the same error twice? Maybe it's just giving more time for the first async cleanup to happen? In any case, here are 500 runs of each ViewComplex test: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1090/workflows/4362eefe-a13a-4a8c-a8eb-9e543a74623a/jobs/10187] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1091/workflows/a9185c89-ceff-4ddd-90c0-960281233fe3/jobs/10189] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1092/workflows/1f74edb4-3a41-47fd-b95c-0edc1d514a20/jobs/10191] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1093/workflows/f2e69851-19ea-4ab6-9315-5af9d721ec3d/jobs/10193] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1094/workflows/0d25bf72-7c90-4287-92d6-21a751082e21/jobs/10196] >From this only a test runner for {{ViewComplexTest}} is failing due to >timeouts. It's not strictly related to the fix, but it seems there is some repeated code among the five tests in which {{ViewComplexTest}} was split by CASSANDRA-16670. Probably we should do the same thing that we did with {{ViewTest}} in CASSANDRA-16777 and extract a superclass to avoid code duplication. This would changes like this one a bit easier to do. I gave it a go in [this commit|https://github.com/adelapena/cassandra/commit/204f3bbe69aae9165c0570224bfd811c8bbd4d07], and it saves us some 176 lines of code, which I think is always nice. Here are some repeated runs for the tests using a common superclass, with no failures so far: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1096/workflows/ded0dfe1-e736-468a-babb-0c007a6bf3c9/jobs/10201] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1097/workflows/544b429d-b712-428c-b97d-edd50076992a/jobs/10203] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1098/workflows/c062d815-2e26-4a75-aa64-656bfc48c8cc/jobs/10205] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1099/workflows/1f834191-b30b-4380-87ce-da257fac731e/jobs/10207] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1095/workflows/3df8138d-1d7b-4283-be3c-a23cb2c0bc40] An alternative/complementary approach to avoid problems with leftovers from previous runs is making sure that all MVs are created with a new, unique name. This way the tests don't collide with any surviving MVs if the cleanup has failed or if it hasn't finished. {{CQLTester}} does this with keyspace, table, function and aggregate names. I gave a go to this generation of unique names [here|https://github.com/adelapena/cassandra/commit/e1928447e4139f1ad4126d9ed34e3de9d3bceb10]. It is just for {{ViewComplexTest}}, but we may consider having a more generic and reusable solution for creating/cleaning MVs in {{CQLTester}}, although I think we don't have to do it here. wdyt? was (Author: adelapena): I'm not sure I understand why the MV cleanup can fail. Would the second round of cleanup guarantee that the MV cleanups succeed, or would it reduce the chances of having leftovers because it's unlikely to have the same error twice? Maybe it's just giving more time for the first async cleanup to happen? In any case, here are 500 runs of each ViewComplex test: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1090/workflows/4362eefe-a13a-4a8c-a8eb-9e543a74623a/jobs/10187] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1091/workflows/a9185c89-ceff-4ddd-90c0-960281233fe3/jobs/10189] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1092/workflows/1f74edb4-3a41-47fd-b95c-0edc1d514a20/jobs/10191] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1093/workflows/f2e69851-19ea-4ab6-9315-5af9d721ec3d/jobs/10193] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1094/workflows/0d25bf72-7c90-4287-92d6-21a751082e21/jobs/10196] >From this only a test runner for {{ViewComplexTest}} is failing due to >timeouts. It's not strictly related to the fix, but it seems there is some repeated code among the five tests in which {{ViewComplexTest}} was s
[jira] [Comment Edited] (CASSANDRA-17070) ViewComplexTest hardening
[ https://issues.apache.org/jira/browse/CASSANDRA-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435553#comment-17435553 ] Andres de la Peña edited comment on CASSANDRA-17070 at 10/28/21, 4:43 PM: -- I'm not sure I understand why the MV cleanup can fail. Would the second round of cleanup guarantee that the MV cleanups succeed, or would it reduce the chances of having leftovers because it's unlikely to have the same error twice? Maybe it's just giving more time for the first async cleanup to happen? In any case, here are 500 runs of each ViewComplex test: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1090/workflows/4362eefe-a13a-4a8c-a8eb-9e543a74623a/jobs/10187] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1091/workflows/a9185c89-ceff-4ddd-90c0-960281233fe3/jobs/10189] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1092/workflows/1f74edb4-3a41-47fd-b95c-0edc1d514a20/jobs/10191] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1093/workflows/f2e69851-19ea-4ab6-9315-5af9d721ec3d/jobs/10193] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1094/workflows/0d25bf72-7c90-4287-92d6-21a751082e21/jobs/10196] >From this only a test runner for {{ViewComplexTest}} is failing due to >timeouts. It's not strictly related to the fix, but it seems there is some repeated code among the five tests in which {{ViewComplexTest}} was split by CASSANDRA-16670. Probably we should do the same thing that we did with {{ViewTest}} in CASSANDRA-16777 and extract a superclass to avoid code duplication. This would changes like this one a bit easier to do. I gave it a go in [this commit|https://github.com/adelapena/cassandra/commit/204f3bbe69aae9165c0570224bfd811c8bbd4d07], and it saves us some 176 lines of code, which I think is always nice. Here are some repeated runs for the test using a common superclass, with no failures so far: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1096/workflows/ded0dfe1-e736-468a-babb-0c007a6bf3c9/jobs/10201] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1097/workflows/544b429d-b712-428c-b97d-edd50076992a/jobs/10203] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1098/workflows/c062d815-2e26-4a75-aa64-656bfc48c8cc/jobs/10205] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1099/workflows/1f834191-b30b-4380-87ce-da257fac731e/jobs/10207] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1095/workflows/3df8138d-1d7b-4283-be3c-a23cb2c0bc40] An alternative/complementary approach to avoid problems with leftovers from previous test runs is making sure that all MVs are created with a new, unique name. This way the tests don't collide with any surviving MVs if the cleanup has failed or if it hasn't finished. {{CQLTester}} does this with keyspace, table, function and aggregate names. I gave it a go to this generation of unique names [here|https://github.com/adelapena/cassandra/commit/e1928447e4139f1ad4126d9ed34e3de9d3bceb10]. It is just for {{ViewComplexTest}}, but we may consider having a more generic and reusable solution for creating/cleaning MVs in {{CQLTester}}, although I think we don't have to do it here. wdyt? was (Author: adelapena): I'm not sure I understand why the MV cleanup can fail. Would the second round of cleanup guarantee that the MV cleanups succeed, or would it reduce the chances of having leftovers because it's unlikely to have the same error twice? Maybe it's just giving more time for the first async cleanup to happen? In any case, here are 500 runs of each ViewComplex test: * [ViewComplexDeletionsTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1090/workflows/4362eefe-a13a-4a8c-a8eb-9e543a74623a/jobs/10187] * [ViewComplexLivenessTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1091/workflows/a9185c89-ceff-4ddd-90c0-960281233fe3/jobs/10189] * [ViewComplexTTLTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1092/workflows/1f74edb4-3a41-47fd-b95c-0edc1d514a20/jobs/10191] * [ViewComplexTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1093/workflows/f2e69851-19ea-4ab6-9315-5af9d721ec3d/jobs/10193] * [ViewComplexUpdatesTest|https://app.circleci.com/pipelines/github/adelapena/cassandra/1094/workflows/0d25bf72-7c90-4287-92d6-21a751082e21/jobs/10196] >From this only a test runner for {{ViewComplexTest}} is failing due to >timeouts. It's not strictly related to the fix, but it seems there is some repeated code among the five tests in which {{ViewComplexTest}