Bumping into worktree + submodule pain on some Harry-related work; it looks like "git worktree" support for submodules is not yet complete:
https://git-scm.com/docs/git-worktree#_bugs

BUGS
Multiple checkout in general is still experimental, and the support for
submodules is incomplete. It is NOT recommended to make multiple checkouts
of a superproject.

I rely pretty heavily on worktrees and I know a lot of other folks who do
too. This is a dealbreaker for me in terms of adding anything else as a
submodule and I'd like to know if the accord folks have been running into
any worktree related woes w/the accord integration.

On Sun, May 28, 2023, at 10:14 AM, Alex Petrov wrote:
> Regarding approachability, one of the things I thought is worth adding is a
> DSL. I feel like there's enough functionality in Harry and there's enough
> information for anyone who needs to write even an involved test out there,
> but adoption doesn't usually start with complex use-cases, so it could be
> that making it extremely simple to generate the data and validating that
> written data is where it's supposed to be, should help adoption a lot.
> Unfortunately, more complex use-cases such as group-by support, or SAI
> testing will require a bit more knowledge and writing an involved model, so I
> do not see any shortcuts we can take here.
>
> > I do think that moving Harry in-tree would improve approachability
>
> I think it's similar as it is with in-jvm dtest api. I feel like we would
> evolve it more actively if we didn't have to cut a release before every
> commit. In other words, I think that changing Harry code and extending
> functionality will be easier, which I think will eventually lead to quicker
> adoption. But of course the act of moving itself does not increase adoption,
> it just comes from better ergonomics.
>
> On Thu, May 25, 2023, at 8:03 PM, Abe Ratnofsky wrote:
>> I'm seeing a few distinct topics here:
>>
>> 1. Harry's adoption and approachability
>>
>> I agree that approachability is one of Harry's main improvement areas right
>> now.
>> If our goal is to produce a fuzz testing framework for the Cassandra
>> project, then adoption by contributors and usage for new feature development
>> are reasonable indicators for whether we're achieving that goal. If Harry is
>> not getting adopted by contributors outside of Apple, and is not getting
>> used for new feature development, then we should make an effort to
>> understand why. I don't think that a several-hour seminar is the best point
>> of leverage to achieve those goals.
>>
>> Here's what I think we do need:
>>
>> - The README should be understandable by anyone interested in writing a fuzz
>>   test
>> - Example tests should be runnable from a fresh clone of Cassandra, in an
>>   IDE or on the command line
>> - Examples of how we would test new features (like CEP-7, CEP-29, etc) with
>>   the fuzz testing framework
>>
>> I find the JVM dtest framework accomplishes similar goals, and one reason is
>> because there are plenty of examples, and it's relatively easy to copy and
>> paste one example and have it do what you'd like. I believe the same
>> approach would work for a fuzz testing framework.
>>
>> Some of these tasks above are already done for Harry, such as better IDE
>> support for samples. This will be available in OSS Harry shortly.
>>
>> 2. Moving Harry in-tree vs. in submodule
>>
>> As I understand it, making Harry a submodule of Cassandra would make it
>> easier to deal with versioning, since we wouldn't have to do the entire
>> release dance we need to do for dtest-api, but I don't see this as a big
>> improvement to approachability.
>>
>> I do think that moving Harry in-tree would improve approachability, for the
>> same reason as the JVM dtests. It's nice to write a feature or fix, find a
>> similar JVM dtest, copy, paste, and edit, and have something useful.
>>
>> 3. General subdivision of Cassandra projects
>>
>> This topic has come up quite a few times recently - around shared utilities
>> (CEP-10 concurrency primitives, etc), dtest-api, query parser, etc. The
>> project has tried out a few different approaches on composition of separate
>> projects. Hopefully in the near future we find the one that works best and
>> can start this process of splitting out libraries.
>>
>> --
>> Abe
>>
>>> On May 25, 2023, at 6:36 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>
>>>> I would really like us to split out utilities into a common project
>>> +1 to the sentiment.
>>>
>>> Would also advocate strongly for it being more tightly integrated with the
>>> base project than what we've been doing with our ecosystem (i.e. completely
>>> separate projects, not submodules), mostly from a discoverability and
>>> workflow standpoint.
>>>
>>> I'm definitely salty about having to have 4 IDE's / projects open just to
>>> work on the entire stack.
>>>
>>> On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
>>>> This was not a talk, but rather an interactive workshop, unfortunately
>>>> will not work in a recorded way, but I am trying to work out ways to
>>>> preserve this.
>>>>
>>>> On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
>>>>> Since the talk was not accepted for Cassandra Summit, would it be
>>>>> possible to record it as a simple youtube video and publish it so that
>>>>> the detailed information about how to use Harry is not lost?
>>>>>
>>>>> On Thu, May 25, 2023 at 7:36 AM Alex Petrov <al...@coffeenco.de> wrote:
>>>>>> While we are at it, we may also want to pull the in-jvm dtest API as a
>>>>>> submodule, and actually move some tests that are common between the
>>>>>> branches there.
>>>>>>
>>>>>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>>>>>> Isn’t the other reason Accord works well as a submodule that it has no
>>>>>>> dependencies on C* proper? Harry does at the moment, right?
>>>>>>> (Not that we couldn’t address that…just trying to think this through…)
>>>>>>>
>>>>>>>> On May 24, 2023, at 6:54 PM, Benedict <bened...@apache.org> wrote:
>>>>>>>>
>>>>>>>> In this case Harry is a testing module - it’s not something we will
>>>>>>>> develop in tandem with C* releases, and we will want improvements to
>>>>>>>> be applied across all branches.
>>>>>>>>
>>>>>>>> So it seems a natural fit for submodules to me.
>>>>>>>>
>>>>>>>>> On 24 May 2023, at 21:09, Caleb Rackliffe <calebrackli...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> > Submodules do have their own overhead and edge cases, so I am
>>>>>>>>> > mostly in favor of using for cases where the code must live outside
>>>>>>>>> > of tree (such as jvm-dtest that lives out of tree as all branches
>>>>>>>>> > need the same interfaces)
>>>>>>>>>
>>>>>>>>> Agreed. Basically where I've ended up on this topic.
>>>>>>>>>
>>>>>>>>> > We could go over some interesting examples such as testing 2i (SAI)
>>>>>>>>>
>>>>>>>>> +100
>>>>>>>>>
>>>>>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov <al...@coffeenco.de>
>>>>>>>>> wrote:
>>>>>>>>>> > I'm about to need to harry test for the paging across tombstone
>>>>>>>>>> > work for https://issues.apache.org/jira/browse/CASSANDRA-18424
>>>>>>>>>> > (that's where my own overlapping fuzzing came in). In the process,
>>>>>>>>>> > I'll see if I can't distill something really simple along the
>>>>>>>>>> > lines of how React approaches it (https://react.dev/learn).
>>>>>>>>>>
>>>>>>>>>> We can pick that up as an example, sure.
>>>>>>>>>>
>>>>>>>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
>>>>>>>>>>>> workshop,
>>>>>>>>>>> I'm about to need to harry test for the paging across tombstone
>>>>>>>>>>> work for https://issues.apache.org/jira/browse/CASSANDRA-18424
>>>>>>>>>>> (that's where my own overlapping fuzzing came in).
>>>>>>>>>>> In the process, I'll see if I can't distill something really simple
>>>>>>>>>>> along the lines of how React approaches it
>>>>>>>>>>> (https://react.dev/learn).
>>>>>>>>>>>
>>>>>>>>>>> Ideally we'd be able to get something together that's a high level
>>>>>>>>>>> "In the next 15 minutes, you will know and understand A-G and have
>>>>>>>>>>> access to N% of the power of harry" kind of offer.
>>>>>>>>>>>
>>>>>>>>>>> Honestly, there's a *lot* in our ecosystem where we could benefit
>>>>>>>>>>> from taking a page from their book in terms of onboarding and
>>>>>>>>>>> getting started IMO.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>>>>>>>> > I wonder if a mini-onboarding session would be good as a
>>>>>>>>>>>> > community session - go over Harry, how to run it, how to add a
>>>>>>>>>>>> > test? Would that be the right venue? I just would like to see
>>>>>>>>>>>> > how we can not only plug it in to regular CI but get everyone
>>>>>>>>>>>> > that wants to add a test be able to know how to get started with
>>>>>>>>>>>> > it.
>>>>>>>>>>>>
>>>>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
>>>>>>>>>>>> workshop, but unfortunately it got declined. Goes without saying,
>>>>>>>>>>>> we can still do it online, time and resources permitting. But
>>>>>>>>>>>> again, I do not think it should be barring us from making Harry a
>>>>>>>>>>>> part of the codebase, as it already is. In fact, we can be
>>>>>>>>>>>> iterating on the development quicker having it in-tree.
>>>>>>>>>>>>
>>>>>>>>>>>> We could go over some interesting examples such as testing 2i
>>>>>>>>>>>> (SAI), modelling Group By tests, or testing repair. If there is
>>>>>>>>>>>> enough appetite and collaboration in the community, I will see if
>>>>>>>>>>>> we can pull something like that together. Input on _what_ you
>>>>>>>>>>>> would like to see / hear / tested is also appreciated.
>>>>>>>>>>>> Harry was developed out of a strong need for large-scale testing,
>>>>>>>>>>>> which also has informed many of its APIs, but we can make it
>>>>>>>>>>>> easier to access for interactive testing / unit tests. We have
>>>>>>>>>>>> been doing a lot of that with Transactional Metadata, too.
>>>>>>>>>>>>
>>>>>>>>>>>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got
>>>>>>>>>>>> > any thoughts here?
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, sorry for not responding on this thread earlier. I cannot
>>>>>>>>>>>> overstate how excited I am about this, and how important I think
>>>>>>>>>>>> this is. Time constraints are somehow hard to overcome, but I hope
>>>>>>>>>>>> the results brought by TCM will make it all worth it.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>>>>>>>>>>>> I think pulling Harry into the tree will make adoption easier for
>>>>>>>>>>>>> the folks. I have been a bit swamped with Transactional Metadata
>>>>>>>>>>>>> work, but I wanted to make some of the things we were using for
>>>>>>>>>>>>> testing TCM available outside of TCM branch. This includes a
>>>>>>>>>>>>> bunch of helper methods to perform operations on the clusters,
>>>>>>>>>>>>> data generation, and more useful stuff. Of course, the question
>>>>>>>>>>>>> always remains about how much time I want to spend porting it all
>>>>>>>>>>>>> to Gossip, but I think we can find a reasonable compromise.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would not set this improvement as a prerequisite to pulling
>>>>>>>>>>>>> Harry into the main branch, but rather interpret it as a
>>>>>>>>>>>>> commitment from myself to take community input and make it more
>>>>>>>>>>>>> approachable by the day.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>>>>>>>>>>>>>> importantly it’s a million times better than the dtest-api
>>>>>>>>>>>>>>> process - which stymies development due to the friction.
>>>>>>>>>>>>>> This is my major concern.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What prompted this thread was harry being external to the core
>>>>>>>>>>>>>> codebase and the lack of adoption and usage of it having led to
>>>>>>>>>>>>>> atrophy of certain aspects of it, which then led to redundant
>>>>>>>>>>>>>> implementation of some fuzz testing and lost time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We'd all be better served to have this closer to the main
>>>>>>>>>>>>>> codebase as a forcing function to smooth out the rough edges,
>>>>>>>>>>>>>> integrate it, and make it a collective artifact and first class
>>>>>>>>>>>>>> citizen IMO.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have similar opinions about the dtest-api.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It’s not without hiccups, and I’m sure we have more to learn.
>>>>>>>>>>>>>>> But it mostly just works, and importantly it’s a million times
>>>>>>>>>>>>>>> better than the dtest-api process - which stymies development
>>>>>>>>>>>>>>> due to the friction.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 24 May 2023, at 08:39, Mick Semb Wever <m...@apache.org>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> WRT git submodules and CASSANDRA-18204, are we happy with how
>>>>>>>>>>>>>>>> it is working for accord ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The time spent on getting that running has been a fair few
>>>>>>>>>>>>>>>> hours, where we could have cut many manual module releases in
>>>>>>>>>>>>>>>> that time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David and folks working on accord ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie
>>>>>>>>>>>>>>>> <jmcken...@apache.org> wrote:
>>>>>>>>>>>>>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex ->
>>>>>>>>>>>>>>>>> got any thoughts here?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>>>>>>>>>>>>>> I think it would be great to onboard Harry more officially
>>>>>>>>>>>>>>>>>> into the project. However it would be nice to perhaps do
>>>>>>>>>>>>>>>>>> some sanity checking outside of Apple folks to see how
>>>>>>>>>>>>>>>>>> approachable it is. That is, can someone take it and just
>>>>>>>>>>>>>>>>>> run it with the current readme without any additional
>>>>>>>>>>>>>>>>>> context?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I wonder if a mini-onboarding session would be good as a
>>>>>>>>>>>>>>>>>> community session - go over Harry, how to run it, how to add
>>>>>>>>>>>>>>>>>> a test? Would that be the right venue? I just would like
>>>>>>>>>>>>>>>>>> to see how we can not only plug it in to regular CI but get
>>>>>>>>>>>>>>>>>> everyone that wants to add a test be able to know how to get
>>>>>>>>>>>>>>>>>> started with it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jeremy
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky <a...@aber.io>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Just to make sure I'm understanding the details, this would
>>>>>>>>>>>>>>>>>>> mean apache/cassandra-harry maintains its status as a
>>>>>>>>>>>>>>>>>>> separate repository, apache/cassandra references it as a
>>>>>>>>>>>>>>>>>>> submodule, and clones and builds Harry locally, rather than
>>>>>>>>>>>>>>>>>>> pulling a released JAR. We can then reference Harry as a
>>>>>>>>>>>>>>>>>>> library without maintaining public artifacts for it. Is
>>>>>>>>>>>>>>>>>>> that in line with what you're thinking?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> > I'd also like to see us get a Harry run integrated as
>>>>>>>>>>>>>>>>>>> > part of our pre-commit CI
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm a strong supporter of this, of course.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie
>>>>>>>>>>>>>>>>>>>> <jmcken...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Similar to what we've done with accord in
>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd
>>>>>>>>>>>>>>>>>>>> like to discuss bringing cassandra-harry in-tree as a
>>>>>>>>>>>>>>>>>>>> submodule. repo link:
>>>>>>>>>>>>>>>>>>>> https://github.com/apache/cassandra-harry
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Given the value it's brought to the project's
>>>>>>>>>>>>>>>>>>>> stabilization efforts and the movement of other things in
>>>>>>>>>>>>>>>>>>>> the ecosystem to being more integrated (accord,
>>>>>>>>>>>>>>>>>>>> build-scripts
>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I
>>>>>>>>>>>>>>>>>>>> think having the testing framework better localized and
>>>>>>>>>>>>>>>>>>>> integrated would be a net benefit for adoption, awareness,
>>>>>>>>>>>>>>>>>>>> maintenance, and tighter workflows as we troubleshoot
>>>>>>>>>>>>>>>>>>>> future failures it surfaces.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'd also like to see us get a Harry run integrated as part
>>>>>>>>>>>>>>>>>>>> of our pre-commit CI (a 5 minute simple soak test for
>>>>>>>>>>>>>>>>>>>> instance) and having that local in this fashion should
>>>>>>>>>>>>>>>>>>>> make that a cleaner integration as well.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thoughts?