Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread David Capwell
Hmmm, did you try without --remote?  We 100% rely on git hooks, and this is what 
we do:

 $ grep -r 'git submodule' .build/
.build//sh/bump-accord.sh:  git submodule status modules/accord
.build//sh/change-submodule.sh:  git submodule set-url "${path}" "${url}"
.build//sh/change-submodule.sh:  git submodule set-branch --branch "${branch}" "${path}"
.build//sh/change-submodule.sh:  git submodule update --remote
.build//sh/development-switch.sh:  [ "$exists" == false ] && error "git submodule $a does not exist"
.build//sh/development-switch.sh:git submodule set-url "${path}" "../cassandra-${name}.git"
.build//sh/development-switch.sh:git submodule set-branch --branch "${branch}" "${path}"
.build//git/git-hooks/post-checkout/100-update-submodules.sh:  git submodule update --init --recursive

> so perhaps this is a "first run tax" for submodule + worktree.

We never had such a thing in Accord… but we seem to be running commands 
slightly differently than you are...
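
For reference, the difference between the two invocations can be sketched as 
below. Without --remote, "git submodule update" checks out the commit the 
superproject records; with --remote, it fetches and checks out the tip of the 
submodule's tracked branch. The function names are made up for illustration 
and are not part of the .build/ scripts.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Check out each submodule at the commit recorded in the superproject.
# This is the form the post-checkout hook listed above runs.
update_to_recorded_commit() {
  git submodule update --init --recursive
}

# Fetch each submodule's tracked branch (submodule.<name>.branch in
# .gitmodules, or the remote's default when unset) and check out its tip,
# ignoring the commit recorded in the superproject.
update_to_branch_tip() {
  git submodule update --init --recursive --remote
}
```

The second form is the one that can move a submodule to a commit nobody has 
recorded yet, so it is usually reserved for deliberate bumps.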

> On Jun 1, 2023, at 1:06 PM, Josh McKenzie  wrote:
> 
>> Josh, do you see any reports on what isn’t working?  I think most people 
>> don’t touch 1% of what git can do… so it might be that 10% is broken but 
>> that no one in our domain actually touches that path?
> I was changing .gitmodules in harry to point to a branch, and git just straight 
> up went out to lunch when I tried to "git submodule update --init --recursive 
> --remote" or any variation thereof. Reproducing today in a worktree with 
> GIT_TRACE, it looks like the submodule command is hanging on:
> 
>> 16:00:48.253406 git.c:460   trace: built-in: git index-pack 
>> --stdin --fix-thin '--keep=fetch-pack 32955 on Joshuas-MacBook-Pro.local' 
>> --check-self-contained-and-connected
>> 
> 
> On a whim I just let it run and it finally got unstuck after probably 5+ 
> minutes; this might just be down to me being impatient and the default 
> logging on git being... completely silent. =/
> 
> Looks like subsequent runs aren't hanging on that and are hopping right 
> through, so perhaps this is a "first run tax" for submodule + worktree.
> 
> On Thu, Jun 1, 2023, at 2:05 PM, David Capwell wrote:
>> To be clear, we only use the relative syntax during development and not long 
>> lived feature branches like cep-15-accord; we use https address there.  So 
>> when you create a PR you switch to relative paths (if-and-only-if you change 
>> the submodule), then on merge you switch back to https pointing to apache.  
>> So the main issue has been when 2 authors try to work together (such as 
>> during review of a PR)
>> 
>>> On Jun 1, 2023, at 10:15 AM, David Capwell  wrote:
>>> 
>>> Most edge cases we have seen in Accord are working with feature branches 
>>> from other authors where we use relative paths to make sure the git@ vs 
>>> https:// doesn’t become a problem for CI (submodule points to https:// to 
>>> work in CI, but if you do that during feature development it gets annoying 
>>> to push to GitHub… so we do ../cassandra-accord.git so git respects w/e 
>>> protocol you are using).  In 1-2 people’s environments, when they checked 
>>> out another author’s logic the C* remote was correct, but the Accord one was 
>>> still pointing to Apache (which doesn’t have the feature branch)…. This is 
>>> trivial to fix, and might be a bug with our git hooks…. But still calling 
>>> out as it has been an issue.
>>> 
>>> Josh, do you see any reports on what isn’t working?  I think most people 
>>> don’t touch 1% of what git can do… so it might be that 10% is broken but 
>>> that no one in our domain actually touches that path?
>>> 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread Josh McKenzie
> Josh, do you see any reports on what isn’t working?  I think most people 
> don’t touch 1% of what git can do… so it might be that 10% is broken but that 
> no one in our domain actually touches that path?
I was changing .gitmodules in harry to point to a branch, and git just straight 
up went out to lunch when I tried to "git submodule update --init --recursive 
--remote" or any variation thereof. Reproducing today in a worktree with 
GIT_TRACE, it looks like the submodule command is hanging on:

> 16:00:48.253406 git.c:460   trace: built-in: git index-pack 
> --stdin --fix-thin '--keep=fetch-pack 32955 on Joshuas-MacBook-Pro.local' 
> --check-self-contained-and-connected
> 

On a whim I just let it run and it finally got unstuck after probably 5+ 
minutes; this might just be down to me being impatient and the default logging 
on git being... completely silent. =/

Looks like subsequent runs aren't hanging on that and are hopping right 
through, so perhaps this is a "first run tax" for submodule + worktree.
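
One way to make a slow first fetch visible instead of silent, as a rough 
sketch: when GIT_TRACE is set to an absolute file path, git appends its trace 
to that file, and --progress forces progress reporting from the submodule 
fetch. The function name and the log-file argument are assumptions made for 
this example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a submodule update with tracing written to a log file.
# $1 = absolute path of the trace log; remaining args go to git unchanged.
trace_submodule_update() {
  local log=$1
  shift
  # GIT_TRACE=<absolute path> appends trace lines (including child commands
  # such as index-pack) to that file rather than printing them to stderr.
  GIT_TRACE="$log" git submodule update --init --recursive --progress "$@"
}
```

Running "tail -f" on the log in a second terminal shows whether the command 
is still working or genuinely stuck.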

On Thu, Jun 1, 2023, at 2:05 PM, David Capwell wrote:
> To be clear, we only use the relative syntax during development and not long 
> lived feature branches like cep-15-accord; we use https address there.  So 
> when you create a PR you switch to relative paths (if-and-only-if you change 
> the submodule), then on merge you switch back to https pointing to apache.  
> So the main issue has been when 2 authors try to work together (such as 
> during review of a PR)
> 
>> On Jun 1, 2023, at 10:15 AM, David Capwell  wrote:
>> 
>> Most edge cases we have seen in Accord are working with feature branches 
>> from other authors where we use relative paths to make sure the git@ vs 
>> https:// doesn’t become a problem for CI (submodule points to https:// to 
>> work in CI, but if you do that during feature development it gets annoying 
>> to push to GitHub… so we do ../cassandra-accord.git so git respects w/e 
>> protocol you are using).  In 1-2 people’s environments, when they checked out 
>> another author’s logic the C* remote was correct, but the Accord one was 
>> still pointing to Apache (which doesn’t have the feature branch)…. This is 
>> trivial to fix, and might be a bug with our git hooks…. But still calling 
>> out as it has been an issue.
>> 
>> Josh, do you see any reports on what isn’t working?  I think most people 
>> don’t touch 1% of what git can do… so it might be that 10% is broken but 
>> that no one in our domain actually touches that path?
>> 
>>> On May 31, 2023, at 12:36 PM, Josh McKenzie  wrote:
>>> 
>>> Bumping into worktree + submodule pain on some harry related work; it looks 
>>> like "git worktree" and submodules are not currently fully implemented:
>>> 
>>> https://git-scm.com/docs/git-worktree#_bugs
>>> BUGS
>>> 
>>> Multiple checkout in general is still experimental, and the support for 
>>> submodules is incomplete. It is NOT recommended to make multiple checkouts 
>>> of a superproject.
>>> 
>>> I rely pretty heavily on worktrees and I know a lot of other folks who do 
>>> too. This is a dealbreaker for me in terms of adding anything else as a 
>>> submodule and I'd like to know if the accord folks have been running into 
>>> any worktree related woes w/the accord integration.
>>> 
>>> 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread David Capwell
To be clear, we only use the relative syntax during development, not on 
long-lived feature branches like cep-15-accord; we use the https address there. 
So when you create a PR you switch to relative paths (if and only if you change 
the submodule), then on merge you switch back to https pointing to apache.  So 
the main issue has been when 2 authors try to work together (such as during 
review of a PR).
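
The switch between the two URL styles can be sketched roughly as below. The 
scripts under .build/ use "git submodule set-url"; this sketch instead writes 
.gitmodules directly with "git config --file" so that it also runs where the 
submodule has not been cloned. The function names are hypothetical, and the 
apache GitHub URL pattern is an assumption based on "https pointing to 
apache".

```shell
#!/usr/bin/env bash
set -euo pipefail

# Point a submodule entry in .gitmodules at a new URL.
# $1 = submodule name as written in .gitmodules, $2 = URL
set_gitmodules_url() {
  git config --file .gitmodules "submodule.$1.url" "$2"
}

# Development: a relative URL is resolved against the superproject's remote,
# so the submodule uses whichever protocol (git@ or https://) that remote uses.
# $1 = submodule name, $2 = project suffix (e.g. accord)
use_relative_url() {
  set_gitmodules_url "$1" "../cassandra-$2.git"
}

# Merge: an absolute https URL that CI can clone without SSH credentials.
use_https_url() {
  set_gitmodules_url "$1" "https://github.com/apache/cassandra-$2.git"
}
```

For example, "use_relative_url modules/accord accord" before opening a PR and 
"use_https_url modules/accord accord" before merge. After either change, 
"git submodule sync" copies the new URL into an already-cloned submodule.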

> On Jun 1, 2023, at 10:15 AM, David Capwell  wrote:
> 
> Most edge cases we have seen in Accord are working with feature branches from 
> other authors where we use relative paths to make sure the git@ vs https:// 
> doesn’t become a problem for CI (submodule points to https:// to work in CI, 
> but if you do that during feature development it gets annoying to push to 
> GitHub… so we do ../cassandra-accord.git so git respects w/e protocol you are 
> using).  In 1-2 people’s environments, when they checked out another author’s 
> logic the C* remote was correct, but the Accord one was still pointing to 
> Apache (which doesn’t have the feature branch)…. This is trivial to fix, and 
> might be a bug with our git hooks…. But still calling out as it has been an 
> issue.
> 
> Josh, do you see any reports on what isn’t working?  I think most people 
> don’t touch 1% of what git can do… so it might be that 10% is broken but that 
> no one in our domain actually touches that path?
> 
>> On May 31, 2023, at 12:36 PM, Josh McKenzie  wrote:
>> 
>> Bumping into worktree + submodule pain on some harry related work; it looks 
>> like "git worktree" and submodules are not currently fully implemented:
>> 
>> https://git-scm.com/docs/git-worktree#_bugs
>> BUGS
>> Multiple checkout in general is still experimental, and the support for 
>> submodules is incomplete. It is NOT recommended to make multiple checkouts 
>> of a superproject.
>> 
>> I rely pretty heavily on worktrees and I know a lot of other folks who do 
>> too. This is a dealbreaker for me in terms of adding anything else as a 
>> submodule and I'd like to know if the accord folks have been running into 
>> any worktree related woes w/the accord integration.
>> 
>> 
>> On Sun, May 28, 2023, at 10:14 AM, Alex Petrov wrote:
>>> Regarding approachability, one of the things I thought is worth adding is a 
>>> DSL. I feel like there's enough functionality in Harry and there's enough 
>>> information for anyone who needs to write even an involved test out there, 
>>> but adoption doesn't usually start with complex use-cases, so it could be 
>>> that making it extremely simple to generate the data and validating that 
>>> written data is where it's supposed to be, should help adoption a lot. 
>>> Unfortunately, more complex use-cases such as group-by support, or SAI 
>>> testing will require a bit more knowledge and writing an involved model, so 
>>> I do not see any shortcuts we can take here.
>>> 
>>> > I do think that moving Harry in-tree would improve approachability
>>> 
>>> I think it's similar as it is with in-jvm dtest api. I feel like we would 
>>> evolve it more actively if we didn't have to cut a release before every 
>>> commit. In other words, I think that changing Harry code and extending 
>>> functionality will be easier, which I think will eventually lead to quicker 
>>> adoption. But of course the act of moving itself does not increase 
>>> adoption, it just comes from better ergonomics.
>>> 
>>> 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread David Capwell
Most edge cases we have seen in Accord involve working with feature branches 
from other authors, where we use relative paths to make sure git@ vs https:// 
doesn’t become a problem for CI (the submodule points to https:// to work in 
CI, but if you do that during feature development it gets annoying to push to 
GitHub… so we use ../cassandra-accord.git so git respects whatever protocol 
you are using).  In 1-2 people’s environments, when they checked out another 
author’s changes the C* remote was correct, but the Accord one was still 
pointing to Apache (which doesn’t have the feature branch). This is trivial to 
fix, and might be a bug with our git hooks, but I'm still calling it out as it 
has been an issue.
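
A minimal sketch of checking for the stale-remote case described above, 
assuming the submodule's name in .gitmodules equals its path (the default for 
"git submodule add"). Both function names are illustrative. Note that a 
relative URL in .gitmodules is stored in the clone in resolved form, so the 
two printed values are meant to be compared by eye.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the URL declared in .gitmodules next to the URL the cloned
# submodule actually fetches from.
# $1 = submodule path (assumed to equal its name in .gitmodules)
show_submodule_urls() {
  local path=$1
  echo "declared: $(git config --file .gitmodules --get "submodule.$path.url" || echo '<none>')"
  echo "in use:   $(git -C "$path" remote get-url origin 2>/dev/null || echo '<not cloned>')"
}

# Copy the URLs from .gitmodules into the existing clones, then move each
# submodule to the commit recorded in the superproject.
resync_submodules() {
  git submodule sync --recursive
  git submodule update --init --recursive
}
```

If "in use" still names the Apache repository after checking out someone 
else's feature branch, running resync_submodules should bring it in line.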

Josh, do you see any reports on what isn’t working?  I think most people don’t 
touch 1% of what git can do… so it might be that 10% is broken but that no one 
in our domain actually touches that path?

> On May 31, 2023, at 12:36 PM, Josh McKenzie  wrote:
> 
> Bumping into worktree + submodule pain on some harry related work; it looks 
> like "git worktree" and submodules are not currently fully implemented:
> 
> https://git-scm.com/docs/git-worktree#_bugs
> BUGS
> Multiple checkout in general is still experimental, and the support for 
> submodules is incomplete. It is NOT recommended to make multiple checkouts of 
> a superproject.
> 
> I rely pretty heavily on worktrees and I know a lot of other folks who do 
> too. This is a dealbreaker for me in terms of adding anything else as a 
> submodule and I'd like to know if the accord folks have been running into any 
> worktree related woes w/the accord integration.
> 
> 
> On Sun, May 28, 2023, at 10:14 AM, Alex Petrov wrote:
>> Regarding approachability, one of the things I thought is worth adding is a 
>> DSL. I feel like there's enough functionality in Harry and there's enough 
>> information for anyone who needs to write even an involved test out there, 
>> but adoption doesn't usually start with complex use-cases, so it could be 
>> that making it extremely simple to generate the data and validating that 
>> written data is where it's supposed to be, should help adoption a lot. 
>> Unfortunately, more complex use-cases such as group-by support, or SAI 
>> testing will require a bit more knowledge and writing an involved model, so 
>> I do not see any shortcuts we can take here.
>> 
>> > I do think that moving Harry in-tree would improve approachability
>> 
>> I think it's similar as it is with in-jvm dtest api. I feel like we would 
>> evolve it more actively if we didn't have to cut a release before every 
>> commit. In other words, I think that changing Harry code and extending 
>> functionality will be easier, which I think will eventually lead to quicker 
>> adoption. But of course the act of moving itself does not increase adoption, 
>> it just comes from better ergonomics.
>> 
>> 
>> On Thu, May 25, 2023, at 8:03 PM, Abe Ratnofsky wrote:
>>> I'm seeing a few distinct topics here:
>>> 
>>> 1. Harry's adoption and approachability
>>> 
>>> I agree that approachability is one of Harry's main improvement areas right 
>>> now. If our goal is to produce a fuzz testing framework for the Cassandra 
>>> project, then adoption by contributors and usage for new feature 
>>> development are reasonable indicators for whether we're achieving that 
>>> goal. If Harry is not getting adopted by contributors outside of Apple, and 
>>> is not getting used for new feature development, then we should make an 
>>> effort to understand why. I don't think that a several-hour seminar is the 
>>> best point of leverage to achieve those goals.
>>> 
>>> Here's what I think we do need:
>>> 
>>> - The README should be understandable by anyone interested in writing a 
>>> fuzz test
>>> - Example tests should be runnable from a fresh clone of Cassandra, in an 
>>> IDE or on the command line
>>> - Examples of how we would test new features (like CEP-7, CEP-29, etc) with 
>>> the fuzz testing framework
>>> 
>>> I find the JVM dtest framework accomplishes similar goals, and one reason 
>>> is because there are plenty of examples, and it's relatively easy to copy 
>>> and paste one example and have it do what you'd like. I believe the same 
>>> approach would work for a fuzz testing framework.
>>> 
>>> Some of these tasks above are already done for Harry, such as better IDE 
>>> support for samples. This will be available in OSS Harry shortly.
>>> 
>>> 2. Moving Harry in-tree vs. in submodule
>>> 
>>> As I understand it, making Harry a submodule of Cassandra would make it 
>>> easier to deal with versioning, since we wouldn't have to do the entire 
>>> release dance we need to do for dtest-api, but I don't see this as a big 
>>> improvement to approachability.
>>> 
>>> I do think that moving Harry in-tree would improve approachability, for the 
>>> same reason as the JVM dtests. It's nice to write a feature or fix, find a 
>>> similar JVM dtest, copy, paste, and edit, and have something 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-31 Thread Josh McKenzie
Bumping into worktree + submodule pain on some harry related work; it looks 
like "git worktree" and submodules are not currently fully implemented:

https://git-scm.com/docs/git-worktree#_bugs
BUGS

Multiple checkout in general is still experimental, and the support for 
submodules is incomplete. It is NOT recommended to make multiple checkouts of a 
superproject.

I rely pretty heavily on worktrees and I know a lot of other folks who do too. 
This is a dealbreaker for me in terms of adding anything else as a submodule 
and I'd like to know if the accord folks have been running into any worktree 
related woes w/the accord integration.
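
For anyone trying to reproduce this, the worktree + submodule workflow can be 
sketched roughly as below. "git worktree add" does not populate submodules, so 
they have to be initialised inside the new worktree as a separate step. The 
function name and its arguments are assumptions made for this example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Create a worktree for an existing branch and populate its submodules.
# $1 = directory for the new worktree, $2 = existing branch name
add_worktree_with_submodules() {
  local path=$1 branch=$2
  git worktree add "$path" "$branch"
  # Submodules in a fresh worktree start out unpopulated, so this fetches
  # them; --progress keeps a long first fetch from looking like a hang.
  git -C "$path" submodule update --init --recursive --progress
}
```

Usage would look like "add_worktree_with_submodules ../cassandra-harry-wt 
my-branch", where both the directory and the branch name are placeholders.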


On Sun, May 28, 2023, at 10:14 AM, Alex Petrov wrote:
> Regarding approachability, one of the things I thought is worth adding is a 
> DSL. I feel like there's enough functionality in Harry and there's enough 
> information for anyone who needs to write even an involved test out there, 
> but adoption doesn't usually start with complex use-cases, so it could be 
> that making it extremely simple to generate the data and validating that 
> written data is where it's supposed to be, should help adoption a lot. 
> Unfortunately, more complex use-cases such as group-by support, or SAI 
> testing will require a bit more knowledge and writing an involved model, so I 
> do not see any shortcuts we can take here.
> 
> > I do think that moving Harry in-tree would improve approachability
> 
> I think it's similar as it is with in-jvm dtest api. I feel like we would 
> evolve it more actively if we didn't have to cut a release before every 
> commit. In other words, I think that changing Harry code and extending 
> functionality will be easier, which I think will eventually lead to quicker 
> adoption. But of course the act of moving itself does not increase adoption, 
> it just comes from better ergonomics.
> 
> 
> On Thu, May 25, 2023, at 8:03 PM, Abe Ratnofsky wrote:
>> I'm seeing a few distinct topics here:
>> 
>> 1. Harry's adoption and approachability
>> 
>> I agree that approachability is one of Harry's main improvement areas right 
>> now. If our goal is to produce a fuzz testing framework for the Cassandra 
>> project, then adoption by contributors and usage for new feature development 
>> are reasonable indicators for whether we're achieving that goal. If Harry is 
>> not getting adopted by contributors outside of Apple, and is not getting 
>> used for new feature development, then we should make an effort to 
>> understand why. I don't think that a several-hour seminar is the best point 
>> of leverage to achieve those goals.
>> 
>> Here's what I think we do need:
>> 
>> - The README should be understandable by anyone interested in writing a fuzz 
>> test
>> - Example tests should be runnable from a fresh clone of Cassandra, in an 
>> IDE or on the command line
>> - Examples of how we would test new features (like CEP-7, CEP-29, etc) with 
>> the fuzz testing framework
>> 
>> I find the JVM dtest framework accomplishes similar goals, and one reason is 
>> because there are plenty of examples, and it's relatively easy to copy and 
>> paste one example and have it do what you'd like. I believe the same 
>> approach would work for a fuzz testing framework.
>> 
>> Some of these tasks above are already done for Harry, such as better IDE 
>> support for samples. This will be available in OSS Harry shortly.
>> 
>> 2. Moving Harry in-tree vs. in submodule
>> 
>> As I understand it, making Harry a submodule of Cassandra would make it 
>> easier to deal with versioning, since we wouldn't have to do the entire 
>> release dance we need to do for dtest-api, but I don't see this as a big 
>> improvement to approachability.
>> 
>> I do think that moving Harry in-tree would improve approachability, for the 
>> same reason as the JVM dtests. It's nice to write a feature or fix, find a 
>> similar JVM dtest, copy, paste, and edit, and have something useful.
>> 
>> 3. General subdivision of Cassandra projects
>> 
>> This topic has come up quite a few times recently - around shared utilities 
>> (CEP-10 concurrency primitives, etc), dtest-api, query parser, etc. The 
>> project has tried out a few different approaches on composition of separate 
>> projects. Hopefully in the near future we find the one that works best and 
>> can start this process of splitting out libraries.
>> 
>> --
>> Abe
>> 
>>> On May 25, 2023, at 6:36 AM, Josh McKenzie  wrote:
>>> 
>>>> I would really like us to split out utilities into a common project
>>> +1 to the sentiment.
>>> 
>>> Would also advocate strongly for it being more tightly integrated with the 
>>> base project than what we've been doing with our ecosystem (i.e. completely 
>>> separate projects, not submodules), mostly from a discoverability and 
>>> workflow standpoint.
>>> 
>>> I'm definitely salty about having to have 4 IDE's / projects open just to 
>>> work on the entire stack.
>>> 
>>> On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-28 Thread Alex Petrov
Regarding approachability, one thing I think is worth adding is a DSL. I feel 
like there's enough functionality in Harry, and enough information out there, 
for anyone who needs to write even an involved test. But adoption doesn't 
usually start with complex use-cases, so making it extremely simple to generate 
data and validate that written data is where it's supposed to be should help 
adoption a lot. Unfortunately, more complex use-cases such as group-by support 
or SAI testing will require a bit more knowledge and writing an involved model, 
so I do not see any shortcuts we can take here.

> I do think that moving Harry in-tree would improve approachability

I think it's similar to the in-jvm dtest API. I feel like we would evolve it 
more actively if we didn't have to cut a release before every commit. In other 
words, I think that changing Harry code and extending functionality will be 
easier, which I think will eventually lead to quicker adoption. But of course 
the act of moving does not itself increase adoption; adoption comes from better 
ergonomics.


On Thu, May 25, 2023, at 8:03 PM, Abe Ratnofsky wrote:
> I'm seeing a few distinct topics here:
> 
> 1. Harry's adoption and approachability
> 
> I agree that approachability is one of Harry's main improvement areas right 
> now. If our goal is to produce a fuzz testing framework for the Cassandra 
> project, then adoption by contributors and usage for new feature development 
> are reasonable indicators for whether we're achieving that goal. If Harry is 
> not getting adopted by contributors outside of Apple, and is not getting used 
> for new feature development, then we should make an effort to understand why. 
> I don't think that a several-hour seminar is the best point of leverage to 
> achieve those goals.
> 
> Here's what I think we do need:
> 
> - The README should be understandable by anyone interested in writing a fuzz 
> test
> - Example tests should be runnable from a fresh clone of Cassandra, in an IDE 
> or on the command line
> - Examples of how we would test new features (like CEP-7, CEP-29, etc) with 
> the fuzz testing framework
> 
> I find the JVM dtest framework accomplishes similar goals, and one reason is 
> because there are plenty of examples, and it's relatively easy to copy and 
> paste one example and have it do what you'd like. I believe the same approach 
> would work for a fuzz testing framework.
> 
> Some of these tasks above are already done for Harry, such as better IDE 
> support for samples. This will be available in OSS Harry shortly.
> 
> 2. Moving Harry in-tree vs. in submodule
> 
> As I understand it, making Harry a submodule of Cassandra would make it 
> easier to deal with versioning, since we wouldn't have to do the entire 
> release dance we need to do for dtest-api, but I don't see this as a big 
> improvement to approachability.
> 
> I do think that moving Harry in-tree would improve approachability, for the 
> same reason as the JVM dtests. It's nice to write a feature or fix, find a 
> similar JVM dtest, copy, paste, and edit, and have something useful.
> 
> 3. General subdivision of Cassandra projects
> 
> This topic has come up quite a few times recently - around shared utilities 
> (CEP-10 concurrency primitives, etc), dtest-api, query parser, etc. The 
> project has tried out a few different approaches on composition of separate 
> projects. Hopefully in the near future we find the one that works best and 
> can start this process of splitting out libraries.
> 
> --
> Abe
> 
>> On May 25, 2023, at 6:36 AM, Josh McKenzie  wrote:
>> 
>>> I would really like us to split out utilities into a common project
>> +1 to the sentiment.
>> 
>> Would also advocate strongly for it being more tightly integrated with the 
>> base project than what we've been doing with our ecosystem (i.e. completely 
>> separate projects, not submodules), mostly from a discoverability and 
>> workflow standpoint.
>> 
>> I'm definitely salty about having to have 4 IDE's / projects open just to 
>> work on the entire stack.
>> 
>> On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
>>> This was not a talk, but rather an interactive workshop, unfortunately will 
>>> not work in a recorded way, but I am trying to work out ways to preserve 
>>> this.
>>> 
>>> On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
>>>> Since the talk was not accepted for Cassandra Summit, would it be possible 
>>>> to record it as a simple YouTube video and publish it so that the detailed 
>>>> information about how to use Harry is not lost?
>>>> 
>>>> On Thu, May 25, 2023 at 7:36 AM Alex Petrov  wrote:
>>>>> While we are at it, we may also want to pull the in-jvm dtest API as a 
>>>>> submodule, and actually move some tests that are common between the 
>>>>> branches there.
>>>>> 
>>>>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>>>>> Isn’t the other reason Accord 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Abe Ratnofsky
I'm seeing a few distinct topics here:

1. Harry's adoption and approachability

I agree that approachability is one of Harry's main improvement areas right 
now. If our goal is to produce a fuzz testing framework for the Cassandra 
project, then adoption by contributors and usage for new feature development 
are reasonable indicators for whether we're achieving that goal. If Harry is 
not getting adopted by contributors outside of Apple, and is not getting used 
for new feature development, then we should make an effort to understand why. I 
don't think that a several-hour seminar is the best point of leverage to 
achieve those goals.

Here's what I think we do need:

- The README should be understandable by anyone interested in writing a fuzz 
test
- Example tests should be runnable from a fresh clone of Cassandra, in an IDE 
or on the command line
- Examples of how we would test new features (like CEP-7, CEP-29, etc) with the 
fuzz testing framework

I find the JVM dtest framework accomplishes similar goals, and one reason is 
because there are plenty of examples, and it's relatively easy to copy and 
paste one example and have it do what you'd like. I believe the same approach 
would work for a fuzz testing framework.

Some of these tasks above are already done for Harry, such as better IDE 
support for samples. This will be available in OSS Harry shortly.

2. Moving Harry in-tree vs. in submodule

As I understand it, making Harry a submodule of Cassandra would make it easier 
to deal with versioning, since we wouldn't have to do the entire release dance 
we need to do for dtest-api, but I don't see this as a big improvement to 
approachability.

I do think that moving Harry in-tree would improve approachability, for the 
same reason as the JVM dtests. It's nice to write a feature or fix, find a 
similar JVM dtest, copy, paste, and edit, and have something useful.

3. General subdivision of Cassandra projects

This topic has come up quite a few times recently - around shared utilities 
(CEP-10 concurrency primitives, etc), dtest-api, query parser, etc. The project 
has tried out a few different approaches on composition of separate projects. 
Hopefully in the near future we find the one that works best and can start this 
process of splitting out libraries.

--
Abe

> On May 25, 2023, at 6:36 AM, Josh McKenzie  wrote:
> 
>> I would really like us to split out utilities into a common project
> +1 to the sentiment.
> 
> Would also advocate strongly for it being more tightly integrated with the 
> base project than what we've been doing with our ecosystem (i.e. completely 
> separate projects, not submodules), mostly from a discoverability and 
> workflow standpoint.
> 
> I'm definitely salty about having to have 4 IDEs / projects open just to 
> work on the entire stack.
> 
> On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
>> This was not a talk, but rather an interactive workshop, unfortunately will 
>> not work in a recorded way, but I am trying to work out ways to preserve 
>> this.
>> 
>> On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
>>> Since the talk was not accepted for Cassandra Summit, would it be possible 
>>> to record it as a simple youtube video and publish it so that the detailed 
>>> information about how to use Harry is not lost?
>>> 
>>> On Thu, May 25, 2023 at 7:36 AM Alex Petrov wrote:
>>> 
>>> While we are at it, we may also want to pull the in-jvm dtest API as a 
>>> submodule, and actually move some tests that are common between the 
>>> branches there.
>>> 
>>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>>> Isn’t the other reason Accord works well as a submodule that it has no 
>>>> dependencies on C* proper? Harry does at the moment, right? (Not that we 
>>>> couldn’t address that…just trying to think this through…)
>>>> 
>>>>> On May 24, 2023, at 6:54 PM, Benedict wrote:
>>>>> 
>>>>> In this case Harry is a testing module - it’s not something we will 
>>>>> develop in tandem with C* releases, and we will want improvements to be 
>>>>> applied across all branches.
>>>>> 
>>>>> So it seems a natural fit for submodules to me.
>>>>> 
>>>>>> On 24 May 2023, at 21:09, Caleb Rackliffe wrote:
>>>>>> 
>>>>>> > Submodules do have their own overhead and edge cases, so I am mostly 
>>>>>> > in favor of using for cases where the code must live outside of tree 
>>>>>> > (such as jvm-dtest that lives out of tree as all branches need the 
>>>>>> > same interfaces)
>>>>>> 
>>>>>> Agreed. Basically where I've ended up on this topic.
>>>>>> 
>>>>>> > We could go over some interesting examples such as testing 2i (SAI)
>>>>>> 
>>>>>> +100
>>>>>> 
>>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:
>>>>>> 
>>>>>> > I'm about to need to harry test for the paging across tombstone work 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Josh McKenzie
> I would really like us to split out utilities into a common project
+1 to the sentiment.

Would also advocate strongly for it being more tightly integrated with the base 
project than what we've been doing with our ecosystem (i.e. completely separate 
projects, not submodules), mostly from a discoverability and workflow 
standpoint.

I'm definitely salty about having to have 4 IDEs / projects open just to work 
on the entire stack.

On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
> This was not a talk, but rather an interactive workshop, unfortunately will 
> not work in a recorded way, but I am trying to work out ways to preserve this.
> 
> On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
>> Since the talk was not accepted for Cassandra Summit, would it be possible 
>> to record it as a simple youtube video and publish it so that the detailed 
>> information about how to use Harry is not lost?
>> 
>> On Thu, May 25, 2023 at 7:36 AM Alex Petrov  wrote:
>>> While we are at it, we may also want to pull the in-jvm dtest API as a 
>>> submodule, and actually move some tests that are common between the 
>>> branches there.
>>> 
>>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>>> Isn’t the other reason Accord works well as a submodule that it has no 
>>>> dependencies on C* proper? Harry does at the moment, right? (Not that we 
>>>> couldn’t address that…just trying to think this through…)
>>>> 
>>>>> On May 24, 2023, at 6:54 PM, Benedict wrote:
>>>>> 
>>>>> In this case Harry is a testing module - it’s not something we will 
>>>>> develop in tandem with C* releases, and we will want improvements to be 
>>>>> applied across all branches.
>>>>> 
>>>>> So it seems a natural fit for submodules to me.
>>>>> 
>>>>>> On 24 May 2023, at 21:09, Caleb Rackliffe wrote:
>>>>>> 
>>>>>> > Submodules do have their own overhead and edge cases, so I am mostly 
>>>>>> > in favor of using for cases where the code must live outside of tree 
>>>>>> > (such as jvm-dtest that lives out of tree as all branches need the 
>>>>>> > same interfaces)
>>>>>> 
>>>>>> Agreed. Basically where I've ended up on this topic.
>>>>>> 
>>>>>> > We could go over some interesting examples such as testing 2i (SAI)
>>>>>> 
>>>>>> +100
>>>>>> 
>>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:
>>>>>>> > I'm about to need to harry test for the paging across tombstone work 
>>>>>>> > for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's 
>>>>>>> > where my own overlapping fuzzing came in). In the process, I'll see 
>>>>>>> > if I can't distill something really simple along the lines of how 
>>>>>>> > React approaches it (https://react.dev/learn).
>>>>>>> 
>>>>>>> We can pick that up as an example, sure. 
>>>>>>> 
>>>>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>>>>> workshop,
>>>>>>>> I'm about to need to harry test for the paging across tombstone work 
>>>>>>>> for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's 
>>>>>>>> where my own overlapping fuzzing came in). In the process, I'll see if 
>>>>>>>> I can't distill something really simple along the lines of how React 
>>>>>>>> approaches it (https://react.dev/learn).
>>>>>>>> 
>>>>>>>> Ideally we'd be able to get something together that's a high level "In 
>>>>>>>> the next 15 minutes, you will know and understand A-G and have access 
>>>>>>>> to N% of the power of harry" kind of offer.
>>>>>>>> 
>>>>>>>> Honestly, there's a *lot* in our ecosystem where we could benefit from 
>>>>>>>> taking a page from their book in terms of onboarding and getting 
>>>>>>>> started IMO.
>>>>>>>> 
>>>>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>>>>> > I wonder if a mini-onboarding session would be good as a community 
>>>>>>>>> > session - go over Harry, how to run it, how to add a test?  Would 
>>>>>>>>> > that be the right venue?  I just would like to see how we can not 
>>>>>>>>> > only plug it in to regular CI but get everyone that wants to add a 
>>>>>>>>> > test be able to know how to get started with it.
>>>>>>>>> 
>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>>>>> workshop, but unfortunately it got declined. Goes without saying, we 
>>>>>>>>> can still do it online, time and resources permitting. But again, I 
>>>>>>>>> do not think it should be barring us from making Harry a part of the 
>>>>>>>>> codebase, as it already is. In fact, we can be iterating on the 
>>>>>>>>> development quicker having it in-tree. 
>>>>>>>>> 
>>>>>>>>> We could go over some interesting examples such as testing 2i (SAI), 
>>>>>>>>> modelling Group By tests, or testing repair. If there is enough 
>>>>>>>>> appetite and collaboration in the community, I will see if we can 
>>>>>>>>> pull something like that together. Input on _what_ you would like to 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Alex Petrov
This was not a talk, but rather an interactive workshop, unfortunately will not 
work in a recorded way, but I am trying to work out ways to preserve this.

On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
> Since the talk was not accepted for Cassandra Summit, would it be possible to 
> record it as a simple youtube video and publish it so that the detailed 
> information about how to use Harry is not lost?
> 
> On Thu, May 25, 2023 at 7:36 AM Alex Petrov  wrote:
>> While we are at it, we may also want to pull the in-jvm dtest API as a 
>> submodule, and actually move some tests that are common between the branches 
>> there.
>> 
>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>> Isn’t the other reason Accord works well as a submodule that it has no 
>>> dependencies on C* proper? Harry does at the moment, right? (Not that we 
>>> couldn’t address that…just trying to think this through…)
>>> 
>>>> On May 24, 2023, at 6:54 PM, Benedict wrote:
>>>> 
>>>> In this case Harry is a testing module - it’s not something we will 
>>>> develop in tandem with C* releases, and we will want improvements to be 
>>>> applied across all branches.
>>>> 
>>>> So it seems a natural fit for submodules to me.
>>>> 
>>>>> On 24 May 2023, at 21:09, Caleb Rackliffe wrote:
>>>>> 
>>>>> > Submodules do have their own overhead and edge cases, so I am mostly in 
>>>>> > favor of using for cases where the code must live outside of tree (such 
>>>>> > as jvm-dtest that lives out of tree as all branches need the same 
>>>>> > interfaces)
>>>>> 
>>>>> Agreed. Basically where I've ended up on this topic.
>>>>> 
>>>>> > We could go over some interesting examples such as testing 2i (SAI)
>>>>> 
>>>>> +100
>>>>> 
>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:
>>>>>> > I'm about to need to harry test for the paging across tombstone work 
>>>>>> > for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's 
>>>>>> > where my own overlapping fuzzing came in). In the process, I'll see if 
>>>>>> > I can't distill something really simple along the lines of how React 
>>>>>> > approaches it (https://react.dev/learn).
>>>>>> 
>>>>>> We can pick that up as an example, sure. 
>>>>>> 
>>>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>>>> workshop,
>>>>>>> I'm about to need to harry test for the paging across tombstone work 
>>>>>>> for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where 
>>>>>>> my own overlapping fuzzing came in). In the process, I'll see if I 
>>>>>>> can't distill something really simple along the lines of how React 
>>>>>>> approaches it (https://react.dev/learn).
>>>>>>> 
>>>>>>> Ideally we'd be able to get something together that's a high level "In 
>>>>>>> the next 15 minutes, you will know and understand A-G and have access 
>>>>>>> to N% of the power of harry" kind of offer.
>>>>>>> 
>>>>>>> Honestly, there's a *lot* in our ecosystem where we could benefit from 
>>>>>>> taking a page from their book in terms of onboarding and getting 
>>>>>>> started IMO.
>>>>>>> 
>>>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>>>> > I wonder if a mini-onboarding session would be good as a community 
>>>>>>>> > session - go over Harry, how to run it, how to add a test?  Would 
>>>>>>>> > that be the right venue?  I just would like to see how we can not 
>>>>>>>> > only plug it in to regular CI but get everyone that wants to add a 
>>>>>>>> > test be able to know how to get started with it.
>>>>>>>> 
>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>>>> workshop, but unfortunately it got declined. Goes without saying, we 
>>>>>>>> can still do it online, time and resources permitting. But again, I do 
>>>>>>>> not think it should be barring us from making Harry a part of the 
>>>>>>>> codebase, as it already is. In fact, we can be iterating on the 
>>>>>>>> development quicker having it in-tree. 
>>>>>>>> 
>>>>>>>> We could go over some interesting examples such as testing 2i (SAI), 
>>>>>>>> modelling Group By tests, or testing repair. If there is enough 
>>>>>>>> appetite and collaboration in the community, I will see if we can pull 
>>>>>>>> something like that together. Input on _what_ you would like to see / 
>>>>>>>> hear / tested is also appreciated. Harry was developed out of a strong 
>>>>>>>> need for large-scale testing, which also has informed many of its 
>>>>>>>> APIs, but we can make it easier to access for interactive testing / 
>>>>>>>> unit tests. We have been doing a lot of that with Transactional 
>>>>>>>> Metadata, too. 
>>>>>>>> 
>>>>>>>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>>>>>>>> > thoughts here?
>>>>>>>> 
>>>>>>>> Yes, sorry for not responding on this thread earlier. I can not 
>>>>>>>> understate how excited I am about this, and 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Benedict
I would really like us to split out utilities into a common project, 
personally. It would be nice to work with a shared palette, including for 
dtest-api, accord, Harry etc.

I think it would help clean up the codebase a bit too, as we have some 
(minimal) tight coupling with utilities and the C* process.

But doubt we have the time for that anytime soon.

On 25 May 2023, at 05:04, Caleb Rackliffe wrote:

> Isn’t the other reason Accord works well as a submodule that it has no
> dependencies on C* proper? Harry does at the moment, right? (Not that we
> couldn’t address that…just trying to think this through…)
>
> On May 24, 2023, at 6:54 PM, Benedict wrote:
>
> In this case Harry is a testing module - it’s not something we will
> develop in tandem with C* releases, and we will want improvements to be
> applied across all branches.
>
> So it seems a natural fit for submodules to me.
>
> On 24 May 2023, at 21:09, Caleb Rackliffe wrote:
>
> > Submodules do have their own overhead and edge cases, so I am mostly in
> > favor of using for cases where the code must live outside of tree (such as
> > jvm-dtest that lives out of tree as all branches need the same interfaces)
>
> Agreed. Basically where I've ended up on this topic.
>
> > We could go over some interesting examples such as testing 2i (SAI)
>
> +100
>
> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:
>
> > I'm about to need to harry test for the paging across tombstone work for
> > https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> > own overlapping fuzzing came in). In the process, I'll see if I can't
> > distill something really simple along the lines of how React approaches
> > it (https://react.dev/learn).
>
> We can pick that up as an example, sure.
>
> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop,
>
> I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> own overlapping fuzzing came in). In the process, I'll see if I can't
> distill something really simple along the lines of how React approaches
> it (https://react.dev/learn).
>
> Ideally we'd be able to get something together that's a high level "In the
> next 15 minutes, you will know and understand A-G and have access to N% of
> the power of harry" kind of offer.
>
> Honestly, there's a lot in our ecosystem where we could benefit from
> taking a page from their book in terms of onboarding and getting started
> IMO.
>
> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>
> > I wonder if a mini-onboarding session would be good as a community
> > session - go over Harry, how to run it, how to add a test?  Would that be
> > the right venue?  I just would like to see how we can not only plug it in
> > to regular CI but get everyone that wants to add a test be able to know
> > how to get started with it.
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop, but unfortunately it got declined. Goes without saying, we can
> still do it online, time and resources permitting. But again, I do not
> think it should be barring us from making Harry a part of the codebase, as
> it already is. In fact, we can be iterating on the development quicker
> having it in-tree.
>
> We could go over some interesting examples such as testing 2i (SAI),
> modelling Group By tests, or testing repair. If there is enough appetite
> and collaboration in the community, I will see if we can pull something
> like that together. Input on _what_ you would like to see / hear / tested
> is also appreciated. Harry was developed out of a strong need for
> large-scale testing, which also has informed many of its APIs, but we can
> make it easier to access for interactive testing / unit tests. We have been
> doing a lot of that with Transactional Metadata, too.
>
> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
> > thoughts here?
>
> Yes, sorry for not responding on this thread earlier. I can not understate
> how excited I am about this, and how important I think this is. Time
> constraints are somehow hard to overcome, but I hope the results brought by
> TCM will make it all worth it.
>
> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>
> I think pulling Harry into the tree will make adoption easier for the
> folks. I have been a bit swamped with Transactional Metadata work, but I
> wanted to make some of the things we were using for testing TCM available
> outside of TCM branch. This includes a bunch of helper methods to perform
> operations on the clusters, data generation, and more useful stuff. Of
> course, the question always remains about how much time I want to spend
> porting it all to Gossip, but I think we can find a reasonable compromise.
>
> I would not set this improvement as a prerequisite to pulling Harry into
> the main branch, but rather interpret it as a commitment from myself to
> take community input and make it more approachable by the day.
>
> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>
> importantly it’s a million times better than the dtest-api process - 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Claude Warren, Jr via dev
Since the talk was not accepted for Cassandra Summit, would it be possible
to record it as a simple youtube video and publish it so that the detailed
information about how to use Harry is not lost?

On Thu, May 25, 2023 at 7:36 AM Alex Petrov  wrote:

> While we are at it, we may also want to pull the in-jvm dtest API as a
> submodule, and actually move some tests that are common between the
> branches there.
>
> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>
> Isn’t the other reason Accord works well as a submodule that it has no
> dependencies on C* proper? Harry does at the moment, right? (Not that we
> couldn’t address that…just trying to think this through…)
>
> On May 24, 2023, at 6:54 PM, Benedict  wrote:
>
> 
>
> In this case Harry is a testing module - it’s not something we will
> develop in tandem with C* releases, and we will want improvements to be
> applied across all branches.
>
> So it seems a natural fit for submodules to me.
>
>
> On 24 May 2023, at 21:09, Caleb Rackliffe 
> wrote:
>
> 
> > Submodules do have their own overhead and edge cases, so I am mostly in
> favor of using for cases where the code must live outside of tree (such as
> jvm-dtest that lives out of tree as all branches need the same interfaces)
>
> Agreed. Basically where I've ended up on this topic.
>
> > We could go over some interesting examples such as testing 2i (SAI)
>
> +100
>
>
> On Wed, May 24, 2023 at 1:40 PM Alex Petrov  wrote:
>
>
> > I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> own overlapping fuzzing came in). In the process, I'll see if I can't
> distill something really simple along the lines of how React approaches it (
> https://react.dev/learn).
>
> We can pick that up as an example, sure.
>
> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop,
>
> I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> own overlapping fuzzing came in). In the process, I'll see if I can't
> distill something really simple along the lines of how React approaches it (
> https://react.dev/learn).
>
> Ideally we'd be able to get something together that's a high level "In the
> next 15 minutes, you will know and understand A-G and have access to N% of
> the power of harry" kind of offer.
>
> Honestly, there's a *lot* in our ecosystem where we could benefit from
> taking a page from their book in terms of onboarding and getting started
> IMO.
>
> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>
> > I wonder if a mini-onboarding session would be good as a community
> session - go over Harry, how to run it, how to add a test?  Would that be
> the right venue?  I just would like to see how we can not only plug it in
> to regular CI but get everyone that wants to add a test be able to know how
> to get started with it.
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop, but unfortunately it got declined. Goes without saying, we can
> still do it online, time and resources permitting. But again, I do not
> think it should be barring us from making Harry a part of the codebase, as
> it already is. In fact, we can be iterating on the development quicker
> having it in-tree.
>
> We could go over some interesting examples such as testing 2i (SAI),
> modelling Group By tests, or testing repair. If there is enough appetite
> and collaboration in the community, I will see if we can pull something
> like that together. Input on _what_ you would like to see / hear / tested
> is also appreciated. Harry was developed out of a strong need for
> large-scale testing, which also has informed many of its APIs, but we can
> make it easier to access for interactive testing / unit tests. We have been
> doing a lot of that with Transactional Metadata, too.
>
> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
> thoughts here?
>
> Yes, sorry for not responding on this thread earlier. I can not understate
> how excited I am about this, and how important I think this is. Time
> constraints are somehow hard to overcome, but I hope the results brought by
> TCM will make it all worth it.
>
> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>
> I think pulling Harry into the tree will make adoption easier for the
> folks. I have been a bit swamped with Transactional Metadata work, but I
> wanted to make some of the things we were using for testing TCM available
> outside of TCM branch. This includes a bunch of helper methods to perform
> operations on the clusters, data generation, and more useful stuff. Of
> course, the question always remains about how much time I want to spend
> porting it all to Gossip, but I think we can find a reasonable compromise.
>
> I would not set this improvement as a 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Alex Petrov
While we are at it, we may also want to pull the in-jvm dtest API as a 
submodule, and actually move some tests that are common between the branches 
there.

On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
> Isn’t the other reason Accord works well as a submodule that it has no 
> dependencies on C* proper? Harry does at the moment, right? (Not that we 
> couldn’t address that…just trying to think this through…)
> 
>> On May 24, 2023, at 6:54 PM, Benedict  wrote:
>> 
>> 
>> In this case Harry is a testing module - it’s not something we will develop 
>> in tandem with C* releases, and we will want improvements to be applied 
>> across all branches.
>> 
>> So it seems a natural fit for submodules to me.
>> 
>> 
>>> On 24 May 2023, at 21:09, Caleb Rackliffe  wrote:
>>> 
>>> > Submodules do have their own overhead and edge cases, so I am mostly in 
>>> > favor of using for cases where the code must live outside of tree (such 
>>> > as jvm-dtest that lives out of tree as all branches need the same 
>>> > interfaces)
>>> 
>>> Agreed. Basically where I've ended up on this topic.
>>> 
>>> > We could go over some interesting examples such as testing 2i (SAI)
>>> 
>>> +100
>>> 
>>> 
>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov  wrote:
>>>> > I'm about to need to harry test for the paging across tombstone work for 
>>>> > https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my 
>>>> > own overlapping fuzzing came in). In the process, I'll see if I can't 
>>>> > distill something really simple along the lines of how React approaches 
>>>> > it (https://react.dev/learn).
>>>> 
>>>> We can pick that up as an example, sure. 
>>>> 
>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>> workshop,
>>>>> I'm about to need to harry test for the paging across tombstone work for 
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my 
>>>>> own overlapping fuzzing came in). In the process, I'll see if I can't 
>>>>> distill something really simple along the lines of how React approaches 
>>>>> it (https://react.dev/learn).
>>>>> 
>>>>> Ideally we'd be able to get something together that's a high level "In 
>>>>> the next 15 minutes, you will know and understand A-G and have access to 
>>>>> N% of the power of harry" kind of offer.
>>>>> 
>>>>> Honestly, there's a *lot* in our ecosystem where we could benefit from 
>>>>> taking a page from their book in terms of onboarding and getting started 
>>>>> IMO.
>>>>> 
>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>> > I wonder if a mini-onboarding session would be good as a community 
>>>>>> > session - go over Harry, how to run it, how to add a test?  Would that 
>>>>>> > be the right venue?  I just would like to see how we can not only plug 
>>>>>> > it in to regular CI but get everyone that wants to add a test be able 
>>>>>> > to know how to get started with it.
>>>>>> 
>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>> workshop, but unfortunately it got declined. Goes without saying, we can 
>>>>>> still do it online, time and resources permitting. But again, I do not 
>>>>>> think it should be barring us from making Harry a part of the codebase, 
>>>>>> as it already is. In fact, we can be iterating on the development 
>>>>>> quicker having it in-tree. 
>>>>>> 
>>>>>> We could go over some interesting examples such as testing 2i (SAI), 
>>>>>> modelling Group By tests, or testing repair. If there is enough appetite 
>>>>>> and collaboration in the community, I will see if we can pull something 
>>>>>> like that together. Input on _what_ you would like to see / hear / 
>>>>>> tested is also appreciated. Harry was developed out of a strong need for 
>>>>>> large-scale testing, which also has informed many of its APIs, but we 
>>>>>> can make it easier to access for interactive testing / unit tests. We 
>>>>>> have been doing a lot of that with Transactional Metadata, too. 
>>>>>> 
>>>>>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>>>>>> > thoughts here?
>>>>>> 
>>>>>> Yes, sorry for not responding on this thread earlier. I can not 
>>>>>> understate how excited I am about this, and how important I think this 
>>>>>> is. Time constraints are somehow hard to overcome, but I hope the 
>>>>>> results brought by TCM will make it all worth it.
>>>>>> 
>>>>>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>>>>>> I think pulling Harry into the tree will make adoption easier for the 
>>>>>>> folks. I have been a bit swamped with Transactional Metadata work, but 
>>>>>>> I wanted to make some of the things we were using for testing TCM 
>>>>>>> available outside of TCM branch. This includes a bunch of helper 
>>>>>>> methods to perform operations on the clusters, data generation, and 
>>>>>>> more useful stuff. Of course, the question always remains about how 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Caleb Rackliffe
Isn’t the other reason Accord works well as a submodule that it has no 
dependencies on C* proper? Harry does at the moment, right? (Not that we 
couldn’t address that…just trying to think this through…)

On May 24, 2023, at 6:54 PM, Benedict wrote:

> In this case Harry is a testing module - it’s not something we will
> develop in tandem with C* releases, and we will want improvements to be
> applied across all branches.
>
> So it seems a natural fit for submodules to me.
>
> On 24 May 2023, at 21:09, Caleb Rackliffe wrote:
>
> > Submodules do have their own overhead and edge cases, so I am mostly in
> > favor of using for cases where the code must live outside of tree (such as
> > jvm-dtest that lives out of tree as all branches need the same interfaces)
>
> Agreed. Basically where I've ended up on this topic.
>
> > We could go over some interesting examples such as testing 2i (SAI)
>
> +100
>
> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:
>
> > I'm about to need to harry test for the paging across tombstone work for
> > https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> > own overlapping fuzzing came in). In the process, I'll see if I can't
> > distill something really simple along the lines of how React approaches
> > it (https://react.dev/learn).
>
> We can pick that up as an example, sure.
>
> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop,
>
> I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> own overlapping fuzzing came in). In the process, I'll see if I can't
> distill something really simple along the lines of how React approaches
> it (https://react.dev/learn).
>
> Ideally we'd be able to get something together that's a high level "In the
> next 15 minutes, you will know and understand A-G and have access to N% of
> the power of harry" kind of offer.
>
> Honestly, there's a lot in our ecosystem where we could benefit from
> taking a page from their book in terms of onboarding and getting started
> IMO.
>
> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>
> > I wonder if a mini-onboarding session would be good as a community
> > session - go over Harry, how to run it, how to add a test?  Would that be
> > the right venue?  I just would like to see how we can not only plug it in
> > to regular CI but get everyone that wants to add a test be able to know
> > how to get started with it.
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop, but unfortunately it got declined. Goes without saying, we can
> still do it online, time and resources permitting. But again, I do not
> think it should be barring us from making Harry a part of the codebase, as
> it already is. In fact, we can be iterating on the development quicker
> having it in-tree.
>
> We could go over some interesting examples such as testing 2i (SAI),
> modelling Group By tests, or testing repair. If there is enough appetite
> and collaboration in the community, I will see if we can pull something
> like that together. Input on _what_ you would like to see / hear / tested
> is also appreciated. Harry was developed out of a strong need for
> large-scale testing, which also has informed many of its APIs, but we can
> make it easier to access for interactive testing / unit tests. We have been
> doing a lot of that with Transactional Metadata, too.
>
> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
> > thoughts here?
>
> Yes, sorry for not responding on this thread earlier. I can not understate
> how excited I am about this, and how important I think this is. Time
> constraints are somehow hard to overcome, but I hope the results brought by
> TCM will make it all worth it.
>
> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>
> I think pulling Harry into the tree will make adoption easier for the
> folks. I have been a bit swamped with Transactional Metadata work, but I
> wanted to make some of the things we were using for testing TCM available
> outside of TCM branch. This includes a bunch of helper methods to perform
> operations on the clusters, data generation, and more useful stuff. Of
> course, the question always remains about how much time I want to spend
> porting it all to Gossip, but I think we can find a reasonable compromise.
>
> I would not set this improvement as a prerequisite to pulling Harry into
> the main branch, but rather interpret it as a commitment from myself to
> take community input and make it more approachable by the day.
>
> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>
> importantly it’s a million times better than the dtest-api process - which
> stymies development due to the friction.
>
> This is my major concern.
>
> What prompted this thread was harry being external to the core codebase
> and the lack of adoption and usage of it having led to atrophy of certain
> aspects of it, which then led to redundant implementation of some fuzz
> testing and lost time.
>
> We'd all be better served to have this closer to the main codebase as a
> forcing function to 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Benedict
In this case Harry is a testing module - it’s not something we will develop in
tandem with C* releases, and we will want improvements to be applied across
all branches.

So it seems a natural fit for submodules to me.

On 24 May 2023, at 21:09, Caleb Rackliffe wrote:

> Submodules do have their own overhead and edge cases, so I am mostly in
> favor of using for cases where the code must live outside of tree (such as
> jvm-dtest that lives out of tree as all branches need the same interfaces)

Agreed. Basically where I've ended up on this topic.

> We could go over some interesting examples such as testing 2i (SAI)

+100

On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:

> I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own
> overlapping fuzzing came in). In the process, I'll see if I can't distill
> something really simple along the lines of how React approaches it
> (https://react.dev/learn).

We can pick that up as an example, sure.

On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:

> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,

I'm about to need to harry test for the paging across tombstone work for
https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own
overlapping fuzzing came in). In the process, I'll see if I can't distill
something really simple along the lines of how React approaches it
(https://react.dev/learn).

Ideally we'd be able to get something together that's a high level "In the
next 15 minutes, you will know and understand A-G and have access to N% of the
power of harry" kind of offer.

Honestly, there's a lot in our ecosystem where we could benefit from taking a
page from their book in terms of onboarding and getting started IMO.

On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:

> I wonder if a mini-onboarding session would be good as a community session -
> go over Harry, how to run it, how to add a test?  Would that be the right
> venue?  I just would like to see how we can not only plug it in to regular CI
> but get everyone that wants to add a test be able to know how to get started
> with it.

I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,
but unfortunately it got declined. Goes without saying, we can still do it
online, time and resources permitting. But again, I do not think it should be
barring us from making Harry a part of the codebase, as it already is. In fact,
we can be iterating on the development quicker having it in-tree.

We could go over some interesting examples such as testing 2i (SAI), modelling
Group By tests, or testing repair. If there is enough appetite and
collaboration in the community, I will see if we can pull something like that
together. Input on _what_ you would like to see / hear / tested is also
appreciated. Harry was developed out of a strong need for large-scale testing,
which also has informed many of its APIs, but we can make it easier to access
for interactive testing / unit tests. We have been doing a lot of that with
Transactional Metadata, too.

> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts
> here?

Yes, sorry for not responding on this thread earlier. I cannot overstate how
excited I am about this, and how important I think this is. Time constraints
are somehow hard to overcome, but I hope the results brought by TCM will make
it all worth it.

On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:

I think pulling Harry into the tree will make adoption easier for the folks. I
have been a bit swamped with Transactional Metadata work, but I wanted to make
some of the things we were using for testing TCM available outside of TCM
branch. This includes a bunch of helper methods to perform operations on the
clusters, data generation, and more useful stuff. Of course, the question
always remains about how much time I want to spend porting it all to Gossip,
but I think we can find a reasonable compromise.

I would not set this improvement as a prerequisite to pulling Harry into the
main branch, but rather interpret it as a commitment from myself to take
community input and make it more approachable by the day.

On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:

> importantly it’s a million times better than the dtest-api process - which
> stymies development due to the friction.

This is my major concern.

What prompted this thread was harry being external to the core codebase and
the lack of adoption and usage of it having led to atrophy of certain aspects
of it, which then led to redundant implementation of some fuzz testing and
lost time.

We'd all be better served to have this closer to the main codebase as a
forcing function to smooth out the rough edges, integrate it, and make it a
collective artifact and first class citizen IMO.

I have similar opinions about the dtest-api.

On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:

It’s not without hiccups, and I’m sure we have more

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Caleb Rackliffe
> Submodules do have their own overhead and edge cases, so I am mostly in
favor of using for cases where the code must live outside of tree (such as
jvm-dtest that lives out of tree as all branches need the same interfaces)

Agreed. Basically where I've ended up on this topic.

> We could go over some interesting examples such as testing 2i (SAI)

+100


On Wed, May 24, 2023 at 1:40 PM Alex Petrov  wrote:

> > I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> own overlapping fuzzing came in). In the process, I'll see if I can't
> distill something really simple along the lines of how React approaches it (
> https://react.dev/learn).
>
> We can pick that up as an example, sure.
>
> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop,
>
> I'm about to need to harry test for the paging across tombstone work for
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my
> own overlapping fuzzing came in). In the process, I'll see if I can't
> distill something really simple along the lines of how React approaches it (
> https://react.dev/learn).
>
> Ideally we'd be able to get something together that's a high level "In the
> next 15 minutes, you will know and understand A-G and have access to N% of
> the power of harry" kind of offer.
>
> Honestly, there's a *lot* in our ecosystem where we could benefit from
> taking a page from their book in terms of onboarding and getting started
> IMO.
>
> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>
> > I wonder if a mini-onboarding session would be good as a community
> session - go over Harry, how to run it, how to add a test?  Would that be
> the right venue?  I just would like to see how we can not only plug it in
> to regular CI but get everyone that wants to add a test be able to know how
> to get started with it.
>
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
> workshop, but unfortunately it got declined. Goes without saying, we can
> still do it online, time and resources permitting. But again, I do not
> think it should be barring us from making Harry a part of the codebase, as
> it already is. In fact, we can be iterating on the development quicker
> having it in-tree.
>
> We could go over some interesting examples such as testing 2i (SAI),
> modelling Group By tests, or testing repair. If there is enough appetite
> and collaboration in the community, I will see if we can pull something
> like that together. Input on _what_ you would like to see / hear / tested
> is also appreciated. Harry was developed out of a strong need for
> large-scale testing, which also has informed many of its APIs, but we can
> make it easier to access for interactive testing / unit tests. We have been
> doing a lot of that with Transactional Metadata, too.
>
> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
> thoughts here?
>
> Yes, sorry for not responding on this thread earlier. I cannot overstate
> how excited I am about this, and how important I think this is. Time
> constraints are somehow hard to overcome, but I hope the results brought by
> TCM will make it all worth it.
>
> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>
> I think pulling Harry into the tree will make adoption easier for the
> folks. I have been a bit swamped with Transactional Metadata work, but I
> wanted to make some of the things we were using for testing TCM available
> outside of TCM branch. This includes a bunch of helper methods to perform
> operations on the clusters, data generation, and more useful stuff. Of
> course, the question always remains about how much time I want to spend
> porting it all to Gossip, but I think we can find a reasonable compromise.
>
> I would not set this improvement as a prerequisite to pulling Harry into
> the main branch, but rather interpret it as a commitment from myself to
> take community input and make it more approachable by the day.
>
> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>
> importantly it’s a million times better than the dtest-api process - which
> stymies development due to the friction.
>
> This is my major concern.
>
> What prompted this thread was harry being external to the core codebase
> and the lack of adoption and usage of it having led to atrophy of certain
> aspects of it, which then led to redundant implementation of some fuzz
> testing and lost time.
>
> We'd all be better served to have this closer to the main codebase as a
> forcing function to smooth out the rough edges, integrate it, and make it a
> collective artifact and first class citizen IMO.
>
> I have similar opinions about the dtest-api.
>
>
> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>
>
> It’s not without hiccups, and I’m sure we have more to learn. But it
> mostly just works, and importantly it’s a 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Alex Petrov
> I'm about to need to harry test for the paging across tombstone work for 
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own 
> overlapping fuzzing came in). In the process, I'll see if I can't distill 
> something really simple along the lines of how React approaches it 
> (https://react.dev/learn).

We can pick that up as an example, sure. 

On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,
> I'm about to need to harry test for the paging across tombstone work for 
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own 
> overlapping fuzzing came in). In the process, I'll see if I can't distill 
> something really simple along the lines of how React approaches it 
> (https://react.dev/learn).
> 
> Ideally we'd be able to get something together that's a high level "In the 
> next 15 minutes, you will know and understand A-G and have access to N% of 
> the power of harry" kind of offer.
> 
> Honestly, there's a *lot* in our ecosystem where we could benefit from taking 
> a page from their book in terms of onboarding and getting started IMO.
> 
> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>> > I wonder if a mini-onboarding session would be good as a community session 
>> > - go over Harry, how to run it, how to add a test?  Would that be the 
>> > right venue?  I just would like to see how we can not only plug it in to 
>> > regular CI but get everyone that wants to add a test be able to know how 
>> > to get started with it.
>> 
>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, 
>> but unfortunately it got declined. Goes without saying, we can still do it 
>> online, time and resources permitting. But again, I do not think it should 
>> be barring us from making Harry a part of the codebase, as it already is. In 
>> fact, we can be iterating on the development quicker having it in-tree. 
>> 
>> We could go over some interesting examples such as testing 2i (SAI), 
>> modelling Group By tests, or testing repair. If there is enough appetite and 
>> collaboration in the community, I will see if we can pull something like 
>> that together. Input on _what_ you would like to see / hear / tested is also 
>> appreciated. Harry was developed out of a strong need for large-scale 
>> testing, which also has informed many of its APIs, but we can make it easier 
>> to access for interactive testing / unit tests. We have been doing a lot of 
>> that with Transactional Metadata, too. 
>> 
>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>> > thoughts here?
>> 
>> Yes, sorry for not responding on this thread earlier. I cannot overstate 
>> how excited I am about this, and how important I think this is. Time 
>> constraints are somehow hard to overcome, but I hope the results brought by 
>> TCM will make it all worth it.
>> 
>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>> I think pulling Harry into the tree will make adoption easier for the 
>>> folks. I have been a bit swamped with Transactional Metadata work, but I 
>>> wanted to make some of the things we were using for testing TCM available 
>>> outside of TCM branch. This includes a bunch of helper methods to perform 
>>> operations on the clusters, data generation, and more useful stuff. Of 
>>> course, the question always remains about how much time I want to spend 
>>> porting it all to Gossip, but I think we can find a reasonable compromise. 
>>> 
>>> I would not set this improvement as a prerequisite to pulling Harry into 
>>> the main branch, but rather interpret it as a commitment from myself to 
>>> take community input and make it more approachable by the day. 
>>> 
>>> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>>>> importantly it’s a million times better than the dtest-api process -
>>>>> which stymies development due to the friction.
>>>> This is my major concern.
>>>>
>>>> What prompted this thread was harry being external to the core codebase
>>>> and the lack of adoption and usage of it having led to atrophy of certain
>>>> aspects of it, which then led to redundant implementation of some fuzz
>>>> testing and lost time.
>>>>
>>>> We'd all be better served to have this closer to the main codebase as a
>>>> forcing function to smooth out the rough edges, integrate it, and make it
>>>> a collective artifact and first class citizen IMO.
>>>>
>>>> I have similar opinions about the dtest-api.
>>>>
>>>>
>>>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>>>>
>>>>> It’s not without hiccups, and I’m sure we have more to learn. But it
>>>>> mostly just works, and importantly it’s a million times better than the
>>>>> dtest-api process - which stymies development due to the friction.
>>>>>
>>>>>> On 24 May 2023, at 08:39, Mick Semb Wever wrote:
>>>>>>
>>>>>>
>>>>>> WRT git submodules and CASSANDRA-18204, are we

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Mick Semb Wever
>
> So looking at accord trunk, we needed 12 votes for a release, and each
> vote is a min of 3 days, so 36 days of overhead vs 5 hours of work?
>


That's apples and oranges (wait time vs effort time).  I was most
interested in (and supportive of) your qualified opinion :-)




> One thing that can be annoying is for people who don’t use work trees and
> switch between trunk and cassandra-4.x in the same directory… I am not sure
> if the issues here are my scripts, or git getting confused…. If you use
> work trees (I strongly recommend regardless of submodules or not) you don’t
> have these issues (my disk layout is below [1]).
>


I'm wondering if we should wait until our first git submodule is in trunk,
so the process gets more exposure, before adding our second ?
It's a bit of a pita to get rid of warnings/errors from bad hooks.


> Submodules do have their own overhead and edge cases, so I am mostly in
> favor of using for cases where the code must live outside of tree (such as
> jvm-dtest that lives out of tree as all branches need the same interfaces)
>
>

Agree. If it makes sense it would be better to just bring the code in.


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread David Capwell
> The time spent on getting that running has been a fair few hours, where we 
> could have cut many manual module releases in that time. 

We spent a few hours getting submodules working, and we no longer need to 
release for every single commit…

$ git log b9025e59395f47535e4ed1fec20b1186cdb07db8..HEAD | grep 'commit ' | wc -l
  12

So looking at accord trunk, we needed 12 votes for a release, and each vote is 
a min of 3 days, so 36 days of overhead vs 5 hours of work?
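
A rough sketch of the arithmetic above, assuming (as the message does) one
release vote per commit and a three-day minimum per vote. The variable names
are made up for illustration, and the result is wait time rather than hands-on
effort:

```shell
#!/bin/sh
# Hypothetical estimate: vote wait time if every commit needed its own release.
commits=12          # commit count reported above for accord trunk
min_vote_days=3     # minimum length of a single release vote, per the message
setup_hours=5       # rough one-off submodule setup effort quoted above

# Total minimum wait if the votes were run one after another.
overhead_days=$((commits * min_vote_days))

echo "release-per-commit wait: ${overhead_days} days"
echo "one-off submodule setup: ${setup_hours} hours"
```

In a real checkout, `git rev-list --count <base>..HEAD` reports the commit
count directly and is not thrown off by commit messages that happen to contain
the word "commit".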

There are some hiccups, but this is mostly in the “never did this before, how 
do I set up” case, so something that prob can be improved… once you do your 
first patch the issues kinda go away.  

One thing that can be annoying is for people who don’t use work trees and 
switch between trunk and cassandra-4.x in the same directory… I am not sure if 
the issues here are my scripts, or git getting confused…. If you use work trees 
(I strongly recommend regardless of submodules or not) you don’t have these 
issues (my disk layout is below [1]).


> I'd like to discuss bringing cassandra-harry in-tree as a submodule

For accord, the main reason to keep it out of tree was to allow other projects 
to use the library (similar to RAFT libraries that exist for projects to use), 
but my mental model for Harry is that most of the code is Cassandra specific 
(models, converting timestamps to Cassandra data, etc.), so wondering if it 
makes sense in its own repo vs being in trunk directly?  Submodules do have 
their own overhead and edge cases, so I am mostly in favor of using for cases 
where the code must live outside of tree (such as jvm-dtest that lives out of 
tree as all branches need the same interfaces)



[1] I have a single git repo and use git worktrees to keep each branch in an 
isolated directory (this avoids the .git overhead in every directory)… my 
layout is

$ ls
3.0  3.11  4.0  4.1  cep-15-accord  prs  trunk
$ ls prs
prs:
4.1 cep-15-accord   trunk
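
A minimal sketch of how a layout like this could be built with `git worktree`,
assuming a throwaway repository. The temporary directory, the `trunk` and `4.1`
directory names, and the `cassandra-4.1` branch are illustrative and not taken
from the actual setup:

```shell
#!/bin/sh
set -e

# Throwaway demo repository; all names below are illustrative assumptions.
root=$(mktemp -d)
git init -q "$root/trunk"
cd "$root/trunk"

# An empty commit so there is something to branch from.
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
git branch cassandra-4.1

# Each worktree is a sibling directory that shares the single object store,
# so switching branches becomes a `cd` instead of a checkout.
git worktree add ../4.1 cassandra-4.1

git worktree list
```

With submodules in play, each worktree would presumably still need its own
`git submodule update --init --recursive`.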


> On May 24, 2023, at 7:53 AM, Josh McKenzie  wrote:
> 
>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,
> I'm about to need to harry test for the paging across tombstone work for 
> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own 
> overlapping fuzzing came in). In the process, I'll see if I can't distill 
> something really simple along the lines of how React approaches it 
> (https://react.dev/learn).
> 
> Ideally we'd be able to get something together that's a high level "In the 
> next 15 minutes, you will know and understand A-G and have access to N% of 
> the power of harry" kind of offer.
> 
> Honestly, there's a lot in our ecosystem where we could benefit from taking a 
> page from their book in terms of onboarding and getting started IMO.
> 
> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>> > I wonder if a mini-onboarding session would be good as a community session 
>> > - go over Harry, how to run it, how to add a test?  Would that be the 
>> > right venue?  I just would like to see how we can not only plug it in to 
>> > regular CI but get everyone that wants to add a test be able to know how 
>> > to get started with it.
>> 
>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, 
>> but unfortunately it got declined. Goes without saying, we can still do it 
>> online, time and resources permitting. But again, I do not think it should 
>> be barring us from making Harry a part of the codebase, as it already is. In 
>> fact, we can be iterating on the development quicker having it in-tree. 
>> 
>> We could go over some interesting examples such as testing 2i (SAI), 
>> modelling Group By tests, or testing repair. If there is enough appetite and 
>> collaboration in the community, I will see if we can pull something like 
>> that together. Input on _what_ you would like to see / hear / tested is also 
>> appreciated. Harry was developed out of a strong need for large-scale 
>> testing, which also has informed many of its APIs, but we can make it easier 
>> to access for interactive testing / unit tests. We have been doing a lot of 
>> that with Transactional Metadata, too. 
>> 
>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>> > thoughts here?
>> 
>> Yes, sorry for not responding on this thread earlier. I cannot overstate 
>> how excited I am about this, and how important I think this is. Time 
>> constraints are somehow hard to overcome, but I hope the results brought by 
>> TCM will make it all worth it.
>> 
>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>> I think pulling Harry into the tree will make adoption easier for the 
>>> folks. I have been a bit swamped with Transactional Metadata work, but I 
>>> wanted to make some of the things we were using for testing TCM available 
>>> outside of TCM branch. This includes a bunch of helper methods to perform 
>>> operations on 

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Josh McKenzie
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,
I'm about to need to harry test for the paging across tombstone work for 
https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own 
overlapping fuzzing came in). In the process, I'll see if I can't distill 
something really simple along the lines of how React approaches it 
(https://react.dev/learn).

Ideally we'd be able to get something together that's a high level "In the next 
15 minutes, you will know and understand A-G and have access to N% of the power 
of harry" kind of offer.

Honestly, there's a *lot* in our ecosystem where we could benefit from taking a 
page from their book in terms of onboarding and getting started IMO.

On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
> > I wonder if a mini-onboarding session would be good as a community session 
> > - go over Harry, how to run it, how to add a test?  Would that be the right 
> > venue?  I just would like to see how we can not only plug it in to regular 
> > CI but get everyone that wants to add a test be able to know how to get 
> > started with it.
> 
> I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, 
> but unfortunately it got declined. Goes without saying, we can still do it 
> online, time and resources permitting. But again, I do not think it should be 
> barring us from making Harry a part of the codebase, as it already is. In 
> fact, we can be iterating on the development quicker having it in-tree. 
> 
> We could go over some interesting examples such as testing 2i (SAI), 
> modelling Group By tests, or testing repair. If there is enough appetite and 
> collaboration in the community, I will see if we can pull something like that 
> together. Input on _what_ you would like to see / hear / tested is also 
> appreciated. Harry was developed out of a strong need for large-scale 
> testing, which also has informed many of its APIs, but we can make it easier 
> to access for interactive testing / unit tests. We have been doing a lot of 
> that with Transactional Metadata, too. 
> 
> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
> > thoughts here?
> 
> Yes, sorry for not responding on this thread earlier. I cannot overstate 
> how excited I am about this, and how important I think this is. Time 
> constraints are somehow hard to overcome, but I hope the results brought by 
> TCM will make it all worth it.
> 
> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>> I think pulling Harry into the tree will make adoption easier for the folks. 
>> I have been a bit swamped with Transactional Metadata work, but I wanted to 
>> make some of the things we were using for testing TCM available outside of 
>> TCM branch. This includes a bunch of helper methods to perform operations on 
>> the clusters, data generation, and more useful stuff. Of course, the 
>> question always remains about how much time I want to spend porting it all 
>> to Gossip, but I think we can find a reasonable compromise. 
>> 
>> I would not set this improvement as a prerequisite to pulling Harry into the 
>> main branch, but rather interpret it as a commitment from myself to take 
>> community input and make it more approachable by the day. 
>> 
>> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>>> importantly it’s a million times better than the dtest-api process - which
>>>> stymies development due to the friction.
>>> This is my major concern.
>>> 
>>> What prompted this thread was harry being external to the core codebase and 
>>> the lack of adoption and usage of it having led to atrophy of certain 
>>> aspects of it, which then led to redundant implementation of some fuzz 
>>> testing and lost time.
>>> 
>>> We'd all be better served to have this closer to the main codebase as a 
>>> forcing function to smooth out the rough edges, integrate it, and make it a 
>>> collective artifact and first class citizen IMO.
>>> 
>>> I have similar opinions about the dtest-api.
>>> 
>>> 
>>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
 
>>>> It’s not without hiccups, and I’m sure we have more to learn. But it
>>>> mostly just works, and importantly it’s a million times better than the
>>>> dtest-api process - which stymies development due to the friction.
>>>>
>>>>> On 24 May 2023, at 08:39, Mick Semb Wever wrote:
>>>>>
>>>>>
>>>>> WRT git submodules and CASSANDRA-18204, are we happy with how it is
>>>>> working for accord ?
>>>>>
>>>>> The time spent on getting that running has been a fair few hours, where
>>>>> we could have cut many manual module releases in that time.
>>>>>
>>>>> David and folks working on accord ?
>>>>>
>>>>>
>>>>>
>>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie wrote:
>>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
>>>>>> thoughts here?
>>>>>>
>>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>>> I think it

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Alex Petrov
> I wonder if a mini-onboarding session would be good as a community session - 
> go over Harry, how to run it, how to add a test?  Would that be the right 
> venue?  I just would like to see how we can not only plug it in to regular CI 
> but get everyone that wants to add a test be able to know how to get started 
> with it.

I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, 
but unfortunately it got declined. Goes without saying, we can still do it 
online, time and resources permitting. But again, I do not think it should be 
barring us from making Harry a part of the codebase, as it already is. In fact, 
we can be iterating on the development quicker having it in-tree. 

We could go over some interesting examples such as testing 2i (SAI), modelling 
Group By tests, or testing repair. If there is enough appetite and 
collaboration in the community, I will see if we can pull something like that 
together. Input on _what_ you would like to see / hear / tested is also 
appreciated. Harry was developed out of a strong need for large-scale testing, 
which also has informed many of its APIs, but we can make it easier to access 
for interactive testing / unit tests. We have been doing a lot of that with 
Transactional Metadata, too. 

> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts 
> here?

Yes, sorry for not responding on this thread earlier. I cannot overstate how 
excited I am about this, and how important I think this is. Time constraints 
are somehow hard to overcome, but I hope the results brought by TCM will make 
it all worth it.

On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
> I think pulling Harry into the tree will make adoption easier for the folks. 
> I have been a bit swamped with Transactional Metadata work, but I wanted to 
> make some of the things we were using for testing TCM available outside of 
> TCM branch. This includes a bunch of helper methods to perform operations on 
> the clusters, data generation, and more useful stuff. Of course, the question 
> always remains about how much time I want to spend porting it all to Gossip, 
> but I think we can find a reasonable compromise. 
> 
> I would not set this improvement as a prerequisite to pulling Harry into the 
> main branch, but rather interpret it as a commitment from myself to take 
> community input and make it more approachable by the day. 
> 
> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>> importantly it’s a million times better than the dtest-api process - which 
>>> stymies development due to the friction.
>> This is my major concern.
>> 
>> What prompted this thread was harry being external to the core codebase and 
>> the lack of adoption and usage of it having led to atrophy of certain 
>> aspects of it, which then led to redundant implementation of some fuzz 
>> testing and lost time.
>> 
>> We'd all be better served to have this closer to the main codebase as a 
>> forcing function to smooth out the rough edges, integrate it, and make it a 
>> collective artifact and first class citizen IMO.
>> 
>> I have similar opinions about the dtest-api.
>> 
>> 
>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>> 
>>> It’s not without hiccups, and I’m sure we have more to learn. But it mostly 
>>> just works, and importantly it’s a million times better than the dtest-api 
>>> process - which stymies development due to the friction.
>>> 
>>>> On 24 May 2023, at 08:39, Mick Semb Wever wrote:
>>>>
>>>>
>>>> WRT git submodules and CASSANDRA-18204, are we happy with how it is
>>>> working for accord ?
>>>>
>>>> The time spent on getting that running has been a fair few hours, where we
>>>> could have cut many manual module releases in that time.
>>>>
>>>> David and folks working on accord ?
>>>>
>>>>
>>>>
>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie wrote:
>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
>>>>> thoughts here?
>>>>>
>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>> I think it would be great to onboard Harry more officially into the
>>>>>> project.  However it would be nice to perhaps do some sanity checking
>>>>>> outside of Apple folks to see how approachable it is.  That is, can
>>>>>> someone take it and just run it with the current readme without any
>>>>>> additional context?
>>>>>>
>>>>>> I wonder if a mini-onboarding session would be good as a community
>>>>>> session - go over Harry, how to run it, how to add a test?  Would that
>>>>>> be the right venue?  I just would like to see how we can not only plug
>>>>>> it in to regular CI but get everyone that wants to add a test be able to
>>>>>> know how to get started with it.
>>>>>>
>>>>>> Jeremy
>>>>>>
>>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky wrote:
>>>>>>>
>>>>>>> Just to make sure I'm understanding the details, this would mean
>>>>>>> apache/cassandra-harry maintains its status as a
Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Brandon Williams
On Wed, May 24, 2023 at 7:45 AM Josh McKenzie  wrote:
> What prompted this thread was harry being external to the core codebase and 
> the lack of adoption and usage of it having led to atrophy of certain aspects 
> of it, which then led to redundant implementation of some fuzz testing and 
> lost time.
>
> We'd all be better served to have this closer to the main codebase as a 
> forcing function to smooth out the rough edges, integrate it, and make it a 
> collective artifact and first class citizen IMO.

This is a convincing argument for me, let's pull Harry in.


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Alex Petrov
I think pulling Harry into the tree will make adoption easier for folks. I
have been a bit swamped with Transactional Metadata (TCM) work, but I wanted
to make some of the things we were using for testing TCM available outside of
the TCM branch. This includes a bunch of helper methods for performing
operations on clusters, data generation, and other useful utilities. Of
course, the question remains of how much time I want to spend porting it all
to Gossip, but I think we can find a reasonable compromise.

I would not set this improvement as a prerequisite to pulling Harry into the 
main branch, but rather interpret it as a commitment from myself to take 
community input and make it more approachable by the day. 

On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>> importantly it’s a million times better than the dtest-api process - which 
>> stymies development due to the friction.
> This is my major concern.
> 
> What prompted this thread was harry being external to the core codebase and 
> the lack of adoption and usage of it having led to atrophy of certain aspects 
> of it, which then led to redundant implementation of some fuzz testing and 
> lost time.
> 
> We'd all be better served to have this closer to the main codebase as a 
> forcing function to smooth out the rough edges, integrate it, and make it a 
> collective artifact and first class citizen IMO.
> 
> I have similar opinions about the dtest-api.
> 
> 
> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>> 
>> It’s not without hiccups, and I’m sure we have more to learn. But it mostly 
>> just works, and importantly it’s a million times better than the dtest-api 
>> process - which stymies development due to the friction.
>> 
>>> On 24 May 2023, at 08:39, Mick Semb Wever  wrote:
>>> 
>>> 
>>> WRT git submodules and CASSANDRA-18204, are we happy with how it is working 
>>> for accord ? 
>>> 
>>> The time spent on getting that running has been a fair few hours, where we 
>>> could have cut many manual module releases in that time. 
>>> 
>>> David and folks working on accord ? 
>>> 
>>> 
>>> 
>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie  wrote:
>>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
>>>> thoughts here?
>>>>
>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>> I think it would be great to onboard Harry more officially into the
>>>>> project.  However it would be nice to perhaps do some sanity checking
>>>>> outside of Apple folks to see how approachable it is.  That is, can
>>>>> someone take it and just run it with the current readme without any
>>>>> additional context?
>>>>>
>>>>> I wonder if a mini-onboarding session would be good as a community
>>>>> session - go over Harry, how to run it, how to add a test?  Would that be
>>>>> the right venue?  I just would like to see how we can not only plug it in
>>>>> to regular CI but get everyone that wants to add a test be able to know
>>>>> how to get started with it.
>>>>>
>>>>> Jeremy
>>>>>
>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky  wrote:
>>>>>>
>>>>>> Just to make sure I'm understanding the details, this would mean
>>>>>> apache/cassandra-harry maintains its status as a separate repository,
>>>>>> apache/cassandra references it as a submodule, and clones and builds
>>>>>> Harry locally, rather than pulling a released JAR. We can then reference
>>>>>> Harry as a library without maintaining public artifacts for it. Is that
>>>>>> in line with what you're thinking?
>>>>>>
>>>>>> > I'd also like to see us get a Harry run integrated as part of our
>>>>>> > pre-commit CI
>>>>>>
>>>>>> I'm a strong supporter of this, of course.
>>>>>>
>>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
>>>>>>>
>>>>>>> Similar to what we've done with accord in
>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to
>>>>>>> discuss bringing cassandra-harry in-tree as a submodule. repo link:
>>>>>>> https://github.com/apache/cassandra-harry
>>>>>>>
>>>>>>> Given the value it's brought to the project's stabilization efforts and
>>>>>>> the movement of other things in the ecosystem to being more integrated
>>>>>>> (accord, build-scripts
>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I think having
>>>>>>> the testing framework better localized and integrated would be a net
>>>>>>> benefit for adoption, awareness, maintenance, and tighter workflows as
>>>>>>> we troubleshoot future failures it surfaces.
>>>>>>>
>>>>>>> I'd also like to see us get a Harry run integrated as part of our
>>>>>>> pre-commit CI (a 5 minute simple soak test for instance) and having
>>>>>>> that local in this fashion should make that a cleaner integration as
>>>>>>> well.
>>>>>>>
>>>>>>> Thoughts?
>>>>
> 


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Josh McKenzie
> importantly it’s a million times better than the dtest-api process - which 
> stymies development due to the friction.
This is my major concern.

What prompted this thread was harry being external to the core codebase and the 
lack of adoption and usage of it having led to atrophy of certain aspects of 
it, which then led to redundant implementation of some fuzz testing and lost 
time.

We'd all be better served to have this closer to the main codebase as a forcing 
function to smooth out the rough edges, integrate it, and make it a collective 
artifact and first class citizen IMO.

I have similar opinions about the dtest-api.


On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
> 
> It’s not without hiccups, and I’m sure we have more to learn. But it mostly 
> just works, and importantly it’s a million times better than the dtest-api 
> process - which stymies development due to the friction.
> 
>> On 24 May 2023, at 08:39, Mick Semb Wever  wrote:
>> 
>> 
>> WRT git submodules and CASSANDRA-18204, are we happy with how it is working 
>> for accord ? 
>> 
>> The time spent on getting that running has been a fair few hours, where we 
>> could have cut many manual module releases in that time. 
>> 
>> David and folks working on accord ? 
>> 
>> 
>> 
>> On Tue, 23 May 2023 at 20:09, Josh McKenzie  wrote:
>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>>> thoughts here?
>>> 
>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>> I think it would be great to onboard Harry more officially into the
>>>> project.  However it would be nice to perhaps do some sanity checking
>>>> outside of Apple folks to see how approachable it is.  That is, can
>>>> someone take it and just run it with the current readme without any
>>>> additional context?
>>>>
>>>> I wonder if a mini-onboarding session would be good as a community session
>>>> - go over Harry, how to run it, how to add a test?  Would that be the
>>>> right venue?  I just would like to see how we can not only plug it in to
>>>> regular CI but get everyone that wants to add a test be able to know how
>>>> to get started with it.
>>>>
>>>> Jeremy
>>>>
>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky  wrote:
>>>>>
>>>>> Just to make sure I'm understanding the details, this would mean
>>>>> apache/cassandra-harry maintains its status as a separate repository,
>>>>> apache/cassandra references it as a submodule, and clones and builds
>>>>> Harry locally, rather than pulling a released JAR. We can then reference
>>>>> Harry as a library without maintaining public artifacts for it. Is that
>>>>> in line with what you're thinking?
>>>>>
>>>>> > I'd also like to see us get a Harry run integrated as part of our
>>>>> > pre-commit CI
>>>>>
>>>>> I'm a strong supporter of this, of course.
>>>>>
>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
>>>>>>
>>>>>> Similar to what we've done with accord in
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to
>>>>>> discuss bringing cassandra-harry in-tree as a submodule. repo link:
>>>>>> https://github.com/apache/cassandra-harry
>>>>>>
>>>>>> Given the value it's brought to the project's stabilization efforts and
>>>>>> the movement of other things in the ecosystem to being more integrated
>>>>>> (accord, build-scripts
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I think having
>>>>>> the testing framework better localized and integrated would be a net
>>>>>> benefit for adoption, awareness, maintenance, and tighter workflows as
>>>>>> we troubleshoot future failures it surfaces.
>>>>>>
>>>>>> I'd also like to see us get a Harry run integrated as part of our
>>>>>> pre-commit CI (a 5 minute simple soak test for instance) and having that
>>>>>> local in this fashion should make that a cleaner integration as well.
>>>>>>
>>>>>> Thoughts?
>>> 


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Benedict
It’s not without hiccups, and I’m sure we have more to learn. But it mostly
just works, and importantly it’s a million times better than the dtest-api
process - which stymies development due to the friction.

> On 24 May 2023, at 08:39, Mick Semb Wever  wrote:
>
> WRT git submodules and CASSANDRA-18204, are we happy with how it is working
> for accord ?
>
> The time spent on getting that running has been a fair few hours, where we
> could have cut many manual module releases in that time.
>
> David and folks working on accord ?
>
> On Tue, 23 May 2023 at 20:09, Josh McKenzie  wrote:
>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
>> thoughts here?
>>
>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>> I think it would be great to onboard Harry more officially into the
>>> project.  However it would be nice to perhaps do some sanity checking
>>> outside of Apple folks to see how approachable it is.  That is, can
>>> someone take it and just run it with the current readme without any
>>> additional context?
>>>
>>> I wonder if a mini-onboarding session would be good as a community session
>>> - go over Harry, how to run it, how to add a test?  Would that be the
>>> right venue?  I just would like to see how we can not only plug it in to
>>> regular CI but get everyone that wants to add a test be able to know how
>>> to get started with it.
>>>
>>> Jeremy
>>>
>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky  wrote:
>>>>
>>>> Just to make sure I'm understanding the details, this would mean
>>>> apache/cassandra-harry maintains its status as a separate repository,
>>>> apache/cassandra references it as a submodule, and clones and builds
>>>> Harry locally, rather than pulling a released JAR. We can then reference
>>>> Harry as a library without maintaining public artifacts for it. Is that
>>>> in line with what you're thinking?
>>>>
>>>> > I'd also like to see us get a Harry run integrated as part of our
>>>> > pre-commit CI
>>>>
>>>> I'm a strong supporter of this, of course.
>>>>
>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
>>>>>
>>>>> Similar to what we've done with accord in
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to
>>>>> discuss bringing cassandra-harry in-tree as a submodule. repo link:
>>>>> https://github.com/apache/cassandra-harry
>>>>>
>>>>> Given the value it's brought to the project's stabilization efforts and
>>>>> the movement of other things in the ecosystem to being more integrated
>>>>> (accord, build-scripts
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I think having
>>>>> the testing framework better localized and integrated would be a net
>>>>> benefit for adoption, awareness, maintenance, and tighter workflows as
>>>>> we troubleshoot future failures it surfaces.
>>>>>
>>>>> I'd also like to see us get a Harry run integrated as part of our
>>>>> pre-commit CI (a 5 minute simple soak test for instance) and having that
>>>>> local in this fashion should make that a cleaner integration as well.
>>>>>
>>>>> Thoughts?


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Mick Semb Wever
WRT git submodules and CASSANDRA-18204, are we happy with how it is working
for accord?

The time spent on getting that running has been a fair few hours, where we
could have cut many manual module releases in that time.

David and folks working on accord?



On Tue, 23 May 2023 at 20:09, Josh McKenzie  wrote:

> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any
> thoughts here?
>
> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>
> I think it would be great to onboard Harry more officially into the
> project.  However it would be nice to perhaps do some sanity checking
> outside of Apple folks to see how approachable it is.  That is, can someone
> take it and just run it with the current readme without any additional
> context?
>
> I wonder if a mini-onboarding session would be good as a community session
> - go over Harry, how to run it, how to add a test?  Would that be the right
> venue?  I just would like to see how we can not only plug it in to regular
> CI but get everyone that wants to add a test be able to know how to get
> started with it.
>
> Jeremy
>
> On May 16, 2023, at 1:34 PM, Abe Ratnofsky  wrote:
>
> Just to make sure I'm understanding the details, this would mean
> apache/cassandra-harry maintains its status as a separate repository,
> apache/cassandra references it as a submodule, and clones and builds Harry
> locally, rather than pulling a released JAR. We can then reference Harry as
> a library without maintaining public artifacts for it. Is that in line with
> what you're thinking?
>
> > I'd also like to see us get a Harry run integrated as part of our
> > pre-commit CI
>
> I'm a strong supporter of this, of course.
>
> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
>
> Similar to what we've done with accord in
> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to
> discuss bringing cassandra-harry in-tree as a submodule. repo link:
> https://github.com/apache/cassandra-harry
>
> Given the value it's brought to the project's stabilization efforts and
> the movement of other things in the ecosystem to being more integrated
> (accord, build-scripts
> https://issues.apache.org/jira/browse/CASSANDRA-18133), I think having
> the testing framework better localized and integrated would be a net
> benefit for adoption, awareness, maintenance, and tighter workflows as we
> troubleshoot future failures it surfaces.
>
> I'd also like to see us get a Harry run integrated as part of our
> pre-commit CI (a 5 minute simple soak test for instance) and having that
> local in this fashion should make that a cleaner integration as well.
>
> Thoughts?
>
>
>


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-23 Thread Josh McKenzie
I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts 
here?

On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
> I think it would be great to onboard Harry more officially into the project.  
> However it would be nice to perhaps do some sanity checking outside of Apple 
> folks to see how approachable it is.  That is, can someone take it and just 
> run it with the current readme without any additional context?
> 
> I wonder if a mini-onboarding session would be good as a community session - 
> go over Harry, how to run it, how to add a test?  Would that be the right 
> venue?  I just would like to see how we can not only plug it in to regular CI 
> but get everyone that wants to add a test be able to know how to get started 
> with it.
> 
> Jeremy
> 
>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky  wrote:
>> 
>> Just to make sure I'm understanding the details, this would mean 
>> apache/cassandra-harry maintains its status as a separate repository, 
>> apache/cassandra references it as a submodule, and clones and builds Harry 
>> locally, rather than pulling a released JAR. We can then reference Harry as 
>> a library without maintaining public artifacts for it. Is that in line with 
>> what you're thinking?
>> 
>> > I'd also like to see us get a Harry run integrated as part of our 
>> > pre-commit CI
>> 
>> I'm a strong supporter of this, of course.
>> 
>>> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
>>> 
>>> Similar to what we've done with accord in 
>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to discuss 
>>> bringing cassandra-harry in-tree as a submodule. repo link: 
>>> https://github.com/apache/cassandra-harry
>>> 
>>> Given the value it's brought to the project's stabilization efforts and the 
>>> movement of other things in the ecosystem to being more integrated (accord, 
>>> build-scripts https://issues.apache.org/jira/browse/CASSANDRA-18133), I 
>>> think having the testing framework better localized and integrated would be 
>>> a net benefit for adoption, awareness, maintenance, and tighter workflows 
>>> as we troubleshoot future failures it surfaces.
>>> 
>>> I'd also like to see us get a Harry run integrated as part of our 
>>> pre-commit CI (a 5 minute simple soak test for instance) and having that 
>>> local in this fashion should make that a cleaner integration as well.
>>> 
>>> Thoughts?


Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-16 Thread Jeremy Hanna
I think it would be great to onboard Harry more officially into the project.  
However it would be nice to perhaps do some sanity checking outside of Apple 
folks to see how approachable it is.  That is, can someone take it and just run 
it with the current readme without any additional context?

I wonder if a mini-onboarding session would be good as a community session: go
over Harry, how to run it, and how to add a test. Would that be the right
venue? I would just like to see how we can not only plug it into regular CI
but also make sure everyone who wants to add a test knows how to get started
with it.

Jeremy

> On May 16, 2023, at 1:34 PM, Abe Ratnofsky  wrote:
> 
> Just to make sure I'm understanding the details, this would mean 
> apache/cassandra-harry maintains its status as a separate repository, 
> apache/cassandra references it as a submodule, and clones and builds Harry 
> locally, rather than pulling a released JAR. We can then reference Harry as a 
> library without maintaining public artifacts for it. Is that in line with 
> what you're thinking?
> 
> > I'd also like to see us get a Harry run integrated as part of our 
> > pre-commit CI
> 
> I'm a strong supporter of this, of course.
> 
>> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
>> 
>> Similar to what we've done with accord in 
>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to discuss 
>> bringing cassandra-harry in-tree as a submodule. repo link: 
>> https://github.com/apache/cassandra-harry
>> 
>> Given the value it's brought to the project's stabilization efforts and the 
>> movement of other things in the ecosystem to being more integrated (accord, 
>> build-scripts https://issues.apache.org/jira/browse/CASSANDRA-18133), I 
>> think having the testing framework better localized and integrated would be 
>> a net benefit for adoption, awareness, maintenance, and tighter workflows as 
>> we troubleshoot future failures it surfaces.
>> 
>> I'd also like to see us get a Harry run integrated as part of our pre-commit 
>> CI (a 5 minute simple soak test for instance) and having that local in this 
>> fashion should make that a cleaner integration as well.
>> 
>> Thoughts?
> 



Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-16 Thread Abe Ratnofsky
Just to make sure I'm understanding the details, this would mean 
apache/cassandra-harry maintains its status as a separate repository, 
apache/cassandra references it as a submodule, and clones and builds Harry 
locally, rather than pulling a released JAR. We can then reference Harry as a 
library without maintaining public artifacts for it. Is that in line with what 
you're thinking?
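
As a rough illustration of the arrangement described above (a sketch, not
taken from any actual patch), the submodule reference in apache/cassandra
would be recorded in a .gitmodules entry along these lines. The URL is the
repo link given in this thread; the modules/harry path is an assumption,
modeled on the modules/accord path mentioned elsewhere in the thread.

```ini
# Hypothetical .gitmodules entry; the path is an assumption, not a project decision.
[submodule "modules/harry"]
	path = modules/harry
	url = https://github.com/apache/cassandra-harry
```

With an entry like this, the parent repository pins Harry to a specific
commit, and a fresh checkout populates it with
`git submodule update --init --recursive`, so the build can compile Harry
from source instead of resolving a published JAR.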

> I'd also like to see us get a Harry run integrated as part of our pre-commit 
> CI

I'm a strong supporter of this, of course.

> On May 16, 2023, at 11:03 AM, Josh McKenzie  wrote:
> 
> Similar to what we've done with accord in 
> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to discuss 
> bringing cassandra-harry in-tree as a submodule. repo link: 
> https://github.com/apache/cassandra-harry
> 
> Given the value it's brought to the project's stabilization efforts and the 
> movement of other things in the ecosystem to being more integrated (accord, 
> build-scripts https://issues.apache.org/jira/browse/CASSANDRA-18133), I think 
> having the testing framework better localized and integrated would be a net 
> benefit for adoption, awareness, maintenance, and tighter workflows as we 
> troubleshoot future failures it surfaces.
> 
> I'd also like to see us get a Harry run integrated as part of our pre-commit 
> CI (a 5 minute simple soak test for instance) and having that local in this 
> fashion should make that a cleaner integration as well.
> 
> Thoughts?