> Josh, do you see any reports on what isn’t working? I think most people
> don’t touch 1% of what git can do… so it might be that 10% is broken but that
> no one in our domain actually touches that path?

Was changing .gitmodules in harry to point to a branch and git just straight up went out to lunch when I tried to "git submodule update --init --recursive --remote" or any derivation thereof. Reproducing today in a worktree with GIT_TRACE, and it looks like the submodule command is hanging on:
> 16:00:48.253406 git.c:460 trace: built-in: git index-pack
> --stdin --fix-thin '--keep=fetch-pack 32955 on Joshuas-MacBook-Pro.local'
> --check-self-contained-and-connected

On a whim I just let it run and it finally got unstuck after probably 5+ minutes; this might just be down to me being impatient and the default logging on git being... completely silent. =/

Looks like subsequent runs aren't hanging on that and are hopping right through, so perhaps this is a "first run tax" for submodule + worktree.

On Thu, Jun 1, 2023, at 2:05 PM, David Capwell wrote:
> To be clear, we only use the relative syntax during development and not long
> lived feature branches like cep-15-accord; we use https address there. So
> when you create a PR you switch to relative paths (if-and-only-if you change
> the submodule), then on merge you switch back to https pointing to apache.
> So the main issue has been when 2 authors try to work together (such as
> during review of a PR)
>
>> On Jun 1, 2023, at 10:15 AM, David Capwell <dcapw...@apple.com> wrote:
>>
>> Most edge cases we have seen in Accord are working with feature branches
>> from other authors where we use relative paths to make sure the git@ vs
>> https:// doesn’t become a problem for CI (submodule points to https:// to
>> work in CI, but if you do that during feature development it gets annoying
>> to push to GitHub… so we do ../cassandra-accord.git so git respects w/e
>> protocol you are using). In 1-2 people's environments, when they checked out
>> another author's logic the C* remote was correct, but the Accord one was
>> still pointing to Apache (which doesn’t have the feature branch)…. This is
>> trivial to fix, and might be a bug with our git hooks…. But still calling
>> out as it has been an issue.
>>
>> Josh, do you see any reports on what isn’t working? I think most people
>> don’t touch 1% of what git can do… so it might be that 10% is broken but
>> that no one in our domain actually touches that path?
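For anyone following along, the relative-path trick David describes (a ../cassandra-accord.git submodule URL that follows whatever protocol the superproject remote uses) can be sketched roughly as below. This is a simplified illustration in Python, not git's actual implementation: the function name resolve_submodule_url is made up for this sketch, "someone" is a hypothetical fork owner, and real git handles more edge cases than this does.

```python
def resolve_submodule_url(superproject_remote: str, submodule_url: str) -> str:
    """Resolve a relative .gitmodules URL against the superproject's remote.

    Simplified sketch (an assumption, not git's real code): each leading
    "../" drops the last path component of the remote URL.
    """
    if not submodule_url.startswith(("./", "../")):
        return submodule_url  # absolute URL: used as-is

    base = superproject_remote.rstrip("/")
    separator = "/"
    rest = submodule_url
    while True:
        if rest.startswith("./"):
            rest = rest[2:]
        elif rest.startswith("../"):
            rest = rest[3:]
            cut = base.rfind("/")
            if cut == -1:
                # scp-style remote with no slash, e.g. git@host:repo.git
                cut = base.rfind(":")
                if cut == -1:
                    raise ValueError(f"cannot go above {superproject_remote!r}")
                separator = ":"
            base = base[:cut]
        else:
            break
    return base + separator + rest


if __name__ == "__main__":
    # "someone" is a hypothetical fork owner used only for illustration.
    for remote in ("https://github.com/someone/cassandra.git",
                   "git@github.com:someone/cassandra.git"):
        print(resolve_submodule_url(remote, "../cassandra-accord.git"))
```

Under these assumptions the same .gitmodules entry resolves to an https URL for an https clone and to a git@ URL for an SSH clone, which is why the relative form avoids the protocol mismatch during feature development.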
>>
>>> On May 31, 2023, at 12:36 PM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>
>>> Bumping into worktree + submodule pain on some harry related work; it looks
>>> like "git worktree" and submodules are not currently fully implemented:
>>>
>>> https://git-scm.com/docs/git-worktree#_bugs
>>> BUGS
>>>
>>> Multiple checkout in general is still experimental, and the support for
>>> submodules is incomplete. It is NOT recommended to make multiple checkouts
>>> of a superproject.
>>>
>>> I rely pretty heavily on worktrees and I know a lot of other folks who do
>>> too. This is a dealbreaker for me in terms of adding anything else as a
>>> submodule and I'd like to know if the accord folks have been running into
>>> any worktree related woes w/the accord integration.
>>>
>>>
>>> On Sun, May 28, 2023, at 10:14 AM, Alex Petrov wrote:
>>>> Regarding approachability, one of the things I thought is worth adding is
>>>> a DSL. I feel like there's enough functionality in Harry and there's
>>>> enough information for anyone who needs to write even an involved test out
>>>> there, but adoption doesn't usually start with complex use-cases, so it
>>>> could be that making it extremely simple to generate the data and
>>>> validating that written data is where it's supposed to be, should help
>>>> adoption a lot. Unfortunately, more complex use-cases such as group-by
>>>> support, or SAI testing will require a bit more knowledge and writing an
>>>> involved model, so I do not see any shortcuts we can take here.
>>>>
>>>> > I do think that moving Harry in-tree would improve approachability
>>>>
>>>> I think it's similar as it is with in-jvm dtest api. I feel like we would
>>>> evolve it more actively if we didn't have to cut a release before every
>>>> commit. In other words, I think that changing Harry code and extending
>>>> functionality will be easier, which I think will eventually lead to
>>>> quicker adoption.
>>>> But of course the act of moving itself does not increase
>>>> adoption, it just comes from better ergonomics.
>>>>
>>>>
>>>> On Thu, May 25, 2023, at 8:03 PM, Abe Ratnofsky wrote:
>>>>> I'm seeing a few distinct topics here:
>>>>>
>>>>> 1. Harry's adoption and approachability
>>>>>
>>>>> I agree that approachability is one of Harry's main improvement areas
>>>>> right now. If our goal is to produce a fuzz testing framework for the
>>>>> Cassandra project, then adoption by contributors and usage for new
>>>>> feature development are reasonable indicators for whether we're achieving
>>>>> that goal. If Harry is not getting adopted by contributors outside of
>>>>> Apple, and is not getting used for new feature development, then we
>>>>> should make an effort to understand why. I don't think that a
>>>>> several-hour seminar is the best point of leverage to achieve those goals.
>>>>>
>>>>> Here's what I think we do need:
>>>>>
>>>>> - The README should be understandable by anyone interested in writing a
>>>>> fuzz test
>>>>> - Example tests should be runnable from a fresh clone of Cassandra, in an
>>>>> IDE or on the command line
>>>>> - Examples of how we would test new features (like CEP-7, CEP-29, etc)
>>>>> with the fuzz testing framework
>>>>>
>>>>> I find the JVM dtest framework accomplishes similar goals, and one reason
>>>>> is because there are plenty of examples, and it's relatively easy to copy
>>>>> and paste one example and have it do what you'd like. I believe the same
>>>>> approach would work for a fuzz testing framework.
>>>>>
>>>>> Some of these tasks above are already done for Harry, such as better IDE
>>>>> support for samples. This will be available in OSS Harry shortly.
>>>>>
>>>>> 2. Moving Harry in-tree vs.
>>>>> in submodule
>>>>>
>>>>> As I understand it, making Harry a submodule of Cassandra would make it
>>>>> easier to deal with versioning, since we wouldn't have to do the entire
>>>>> release dance we need to do for dtest-api, but I don't see this as a big
>>>>> improvement to approachability.
>>>>>
>>>>> I do think that moving Harry in-tree would improve approachability, for
>>>>> the same reason as the JVM dtests. It's nice to write a feature or fix,
>>>>> find a similar JVM dtest, copy, paste, and edit, and have something
>>>>> useful.
>>>>>
>>>>> 3. General subdivision of Cassandra projects
>>>>>
>>>>> This topic has come up quite a few times recently - around shared
>>>>> utilities (CEP-10 concurrency primitives, etc), dtest-api, query parser,
>>>>> etc. The project has tried out a few different approaches on composition
>>>>> of separate projects. Hopefully in the near future we find the one that
>>>>> works best and can start this process of splitting out libraries.
>>>>>
>>>>> --
>>>>> Abe
>>>>>
>>>>>> On May 25, 2023, at 6:36 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>
>>>>>>> I would really like us to split out utilities into a common project
>>>>>> +1 to the sentiment.
>>>>>>
>>>>>> Would also advocate strongly for it being more tightly integrated with
>>>>>> the base project than what we've been doing with our ecosystem (i.e.
>>>>>> completely separate projects, not submodules), mostly from a
>>>>>> discoverability and workflow standpoint.
>>>>>>
>>>>>> I'm definitely salty about having to have 4 IDEs / projects open just
>>>>>> to work on the entire stack.
>>>>>>
>>>>>> On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
>>>>>>> This was not a talk, but rather an interactive workshop, unfortunately
>>>>>>> will not work in a recorded way, but I am trying to work out ways to
>>>>>>> preserve this.
>>>>>>>
>>>>>>> On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
>>>>>>>> Since the talk was not accepted for Cassandra Summit, would it be
>>>>>>>> possible to record it as a simple youtube video and publish it so that
>>>>>>>> the detailed information about how to use Harry is not lost?
>>>>>>>>
>>>>>>>> On Thu, May 25, 2023 at 7:36 AM Alex Petrov <al...@coffeenco.de> wrote:
>>>>>>>>> While we are at it, we may also want to pull the in-jvm dtest API as
>>>>>>>>> a submodule, and actually move some tests that are common between the
>>>>>>>>> branches there.
>>>>>>>>>
>>>>>>>>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>>>>>>>>> Isn’t the other reason Accord works well as a submodule that it has
>>>>>>>>>> no dependencies on C* proper? Harry does at the moment, right? (Not
>>>>>>>>>> that we couldn’t address that…just trying to think this through…)
>>>>>>>>>>
>>>>>>>>>>> On May 24, 2023, at 6:54 PM, Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In this case Harry is a testing module - it’s not something we will
>>>>>>>>>>> develop in tandem with C* releases, and we will want improvements
>>>>>>>>>>> to be applied across all branches.
>>>>>>>>>>>
>>>>>>>>>>> So it seems a natural fit for submodules to me.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On 24 May 2023, at 21:09, Caleb Rackliffe
>>>>>>>>>>>> <calebrackli...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> > Submodules do have their own overhead and edge cases, so I am
>>>>>>>>>>>> > mostly in favor of using for cases where the code must live
>>>>>>>>>>>> > outside of tree (such as jvm-dtest that lives out of tree as all
>>>>>>>>>>>> > branches need the same interfaces)
>>>>>>>>>>>>
>>>>>>>>>>>> Agreed. Basically where I've ended up on this topic.
>>>>>>>>>>>>
>>>>>>>>>>>> > We could go over some interesting examples such as testing 2i
>>>>>>>>>>>> > (SAI)
>>>>>>>>>>>>
>>>>>>>>>>>> +100
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov <al...@coffeenco.de>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> > I'm about to need to harry test for the paging across tombstone
>>>>>>>>>>>>> > work for https://issues.apache.org/jira/browse/CASSANDRA-18424
>>>>>>>>>>>>> > (that's where my own overlapping fuzzing came in). In the
>>>>>>>>>>>>> > process, I'll see if I can't distill something really simple
>>>>>>>>>>>>> > along the lines of how React approaches it
>>>>>>>>>>>>> > (https://react.dev/learn).
>>>>>>>>>>>>>
>>>>>>>>>>>>> We can pick that up as an example, sure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>>>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour
>>>>>>>>>>>>>>> Harry workshop,
>>>>>>>>>>>>>> I'm about to need to harry test for the paging across tombstone
>>>>>>>>>>>>>> work for https://issues.apache.org/jira/browse/CASSANDRA-18424
>>>>>>>>>>>>>> (that's where my own overlapping fuzzing came in). In the
>>>>>>>>>>>>>> process, I'll see if I can't distill something really simple
>>>>>>>>>>>>>> along the lines of how React approaches it
>>>>>>>>>>>>>> (https://react.dev/learn).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ideally we'd be able to get something together that's a high
>>>>>>>>>>>>>> level "In the next 15 minutes, you will know and understand A-G
>>>>>>>>>>>>>> and have access to N% of the power of harry" kind of offer.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Honestly, there's a *lot* in our ecosystem where we could
>>>>>>>>>>>>>> benefit from taking a page from their book in terms of
>>>>>>>>>>>>>> onboarding and getting started IMO.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>>>>>>>>>>> > I wonder if a mini-onboarding session would be good as a
>>>>>>>>>>>>>>> > community session - go over Harry, how to run it, how to add
>>>>>>>>>>>>>>> > a test? Would that be the right venue? I just would like to
>>>>>>>>>>>>>>> > see how we can not only plug it in to regular CI but get
>>>>>>>>>>>>>>> > everyone that wants to add a test be able to know how to get
>>>>>>>>>>>>>>> > started with it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour
>>>>>>>>>>>>>>> Harry workshop, but unfortunately it got declined. Goes without
>>>>>>>>>>>>>>> saying, we can still do it online, time and resources
>>>>>>>>>>>>>>> permitting. But again, I do not think it should be barring us
>>>>>>>>>>>>>>> from making Harry a part of the codebase, as it already is. In
>>>>>>>>>>>>>>> fact, we can be iterating on the development quicker having it
>>>>>>>>>>>>>>> in-tree.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We could go over some interesting examples such as testing 2i
>>>>>>>>>>>>>>> (SAI), modelling Group By tests, or testing repair. If there is
>>>>>>>>>>>>>>> enough appetite and collaboration in the community, I will see
>>>>>>>>>>>>>>> if we can pull something like that together. Input on _what_
>>>>>>>>>>>>>>> you would like to see / hear / tested is also appreciated.
>>>>>>>>>>>>>>> Harry was developed out of a strong need for large-scale
>>>>>>>>>>>>>>> testing, which also has informed many of its APIs, but we can
>>>>>>>>>>>>>>> make it easier to access for interactive testing / unit tests.
>>>>>>>>>>>>>>> We have been doing a lot of that with Transactional Metadata,
>>>>>>>>>>>>>>> too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > I'll hold off on this until Alex Petrov chimes in. @Alex ->
>>>>>>>>>>>>>>> > got any thoughts here?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, sorry for not responding on this thread earlier.
>>>>>>>>>>>>>>> I cannot
>>>>>>>>>>>>>>> overstate how excited I am about this, and how important I
>>>>>>>>>>>>>>> think this is. Time constraints are somehow hard to overcome,
>>>>>>>>>>>>>>> but I hope the results brought by TCM will make it all worth it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>>>>>>>>>>>>>>> I think pulling Harry into the tree will make adoption easier
>>>>>>>>>>>>>>>> for the folks. I have been a bit swamped with Transactional
>>>>>>>>>>>>>>>> Metadata work, but I wanted to make some of the things we were
>>>>>>>>>>>>>>>> using for testing TCM available outside of TCM branch. This
>>>>>>>>>>>>>>>> includes a bunch of helper methods to perform operations on
>>>>>>>>>>>>>>>> the clusters, data generation, and more useful stuff. Of
>>>>>>>>>>>>>>>> course, the question always remains about how much time I want
>>>>>>>>>>>>>>>> to spend porting it all to Gossip, but I think we can find a
>>>>>>>>>>>>>>>> reasonable compromise.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would not set this improvement as a prerequisite to pulling
>>>>>>>>>>>>>>>> Harry into the main branch, but rather interpret it as a
>>>>>>>>>>>>>>>> commitment from myself to take community input and make it
>>>>>>>>>>>>>>>> more approachable by the day.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>>>>>>>>>>>>>>>>> importantly it’s a million times better than the dtest-api
>>>>>>>>>>>>>>>>>> process - which stymies development due to the friction.
>>>>>>>>>>>>>>>>> This is my major concern.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What prompted this thread was harry being external to the
>>>>>>>>>>>>>>>>> core codebase and the lack of adoption and usage of it having
>>>>>>>>>>>>>>>>> led to atrophy of certain aspects of it, which then led to
>>>>>>>>>>>>>>>>> redundant implementation of some fuzz testing and lost time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We'd all be better served to have this closer to the main
>>>>>>>>>>>>>>>>> codebase as a forcing function to smooth out the rough edges,
>>>>>>>>>>>>>>>>> integrate it, and make it a collective artifact and first
>>>>>>>>>>>>>>>>> class citizen IMO.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have similar opinions about the dtest-api.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It’s not without hiccups, and I’m sure we have more to
>>>>>>>>>>>>>>>>>> learn. But it mostly just works, and importantly it’s a
>>>>>>>>>>>>>>>>>> million times better than the dtest-api process - which
>>>>>>>>>>>>>>>>>> stymies development due to the friction.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 24 May 2023, at 08:39, Mick Semb Wever <m...@apache.org>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> WRT git submodules and CASSANDRA-18204, are we happy with
>>>>>>>>>>>>>>>>>>> how it is working for accord?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The time spent on getting that running has been a fair few
>>>>>>>>>>>>>>>>>>> hours, where we could have cut many manual module releases
>>>>>>>>>>>>>>>>>>> in that time.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David and folks working on accord?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie
>>>>>>>>>>>>>>>>>>> <jmcken...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex
>>>>>>>>>>>>>>>>>>>> -> got any thoughts here?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>>>>>>>>>>>>>>>>> I think it would be great to onboard Harry more
>>>>>>>>>>>>>>>>>>>>> officially into the project.
>>>>>>>>>>>>>>>>>>>>> However it would be nice to
>>>>>>>>>>>>>>>>>>>>> perhaps do some sanity checking outside of Apple folks to
>>>>>>>>>>>>>>>>>>>>> see how approachable it is. That is, can someone take it
>>>>>>>>>>>>>>>>>>>>> and just run it with the current readme without any
>>>>>>>>>>>>>>>>>>>>> additional context?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I wonder if a mini-onboarding session would be good as a
>>>>>>>>>>>>>>>>>>>>> community session - go over Harry, how to run it, how to
>>>>>>>>>>>>>>>>>>>>> add a test? Would that be the right venue? I just would
>>>>>>>>>>>>>>>>>>>>> like to see how we can not only plug it in to regular CI
>>>>>>>>>>>>>>>>>>>>> but get everyone that wants to add a test be able to know
>>>>>>>>>>>>>>>>>>>>> how to get started with it.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Jeremy
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky
>>>>>>>>>>>>>>>>>>>>>> <a...@aber.io> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Just to make sure I'm understanding the details, this
>>>>>>>>>>>>>>>>>>>>>> would mean apache/cassandra-harry maintains its status
>>>>>>>>>>>>>>>>>>>>>> as a separate repository, apache/cassandra references it
>>>>>>>>>>>>>>>>>>>>>> as a submodule, and clones and builds Harry locally,
>>>>>>>>>>>>>>>>>>>>>> rather than pulling a released JAR. We can then
>>>>>>>>>>>>>>>>>>>>>> reference Harry as a library without maintaining public
>>>>>>>>>>>>>>>>>>>>>> artifacts for it. Is that in line with what you're
>>>>>>>>>>>>>>>>>>>>>> thinking?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> > I'd also like to see us get a Harry run integrated as
>>>>>>>>>>>>>>>>>>>>>> > part of our pre-commit CI
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I'm a strong supporter of this, of course.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie
>>>>>>>>>>>>>>>>>>>>>>> <jmcken...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Similar to what we've done with accord in
>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204,
>>>>>>>>>>>>>>>>>>>>>>> I'd like to discuss bringing cassandra-harry in-tree as
>>>>>>>>>>>>>>>>>>>>>>> a submodule. repo link:
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/cassandra-harry
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Given the value it's brought to the project's
>>>>>>>>>>>>>>>>>>>>>>> stabilization efforts and the movement of other things
>>>>>>>>>>>>>>>>>>>>>>> in the ecosystem to being more integrated (accord,
>>>>>>>>>>>>>>>>>>>>>>> build-scripts
>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133),
>>>>>>>>>>>>>>>>>>>>>>> I think having the testing framework better localized
>>>>>>>>>>>>>>>>>>>>>>> and integrated would be a net benefit for adoption,
>>>>>>>>>>>>>>>>>>>>>>> awareness, maintenance, and tighter workflows as we
>>>>>>>>>>>>>>>>>>>>>>> troubleshoot future failures it surfaces.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'd also like to see us get a Harry run integrated as
>>>>>>>>>>>>>>>>>>>>>>> part of our pre-commit CI (a 5 minute simple soak test
>>>>>>>>>>>>>>>>>>>>>>> for instance) and having that local in this fashion
>>>>>>>>>>>>>>>>>>>>>>> should make that a cleaner integration as well.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thoughts?