Most of the edge cases we have seen in Accord come from working with feature branches from other authors. We use relative paths so that git@ vs https:// doesn’t become a problem for CI (the submodule would have to point to https:// to work in CI, but if you do that during feature development it gets annoying to push to GitHub… so we use ../cassandra-accord.git so git respects whatever protocol you are using). In 1-2 people's environments, when they checked out another author's changes, the C* remote was correct but the Accord one was still pointing to Apache (which doesn’t have the feature branch). This is trivial to fix, and might be a bug in our git hooks, but I am still calling it out as it has been an issue.
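
To make the relative-path point concrete, here is a rough Python sketch of how a relative submodule URL such as ../cassandra-accord.git resolves against whatever remote the superproject was cloned from. This is a simplification for illustration, not git's actual implementation; the function name and the "some-author" fork are made up.

```python
def resolve_submodule_url(superproject_remote: str, submodule_url: str) -> str:
    """Approximate how a relative submodule URL is resolved against the
    superproject's remote URL. Simplified sketch; real git handles more cases."""
    # Absolute URLs (https://..., git@...) are used as-is.
    if not submodule_url.startswith(("./", "../")):
        return submodule_url

    base = superproject_remote.rstrip("/")
    rel = submodule_url
    sep = "/"
    while rel.startswith(("./", "../")):
        if rel.startswith("./"):
            rel = rel[2:]
            continue
        rel = rel[3:]
        slash = base.rfind("/")
        if slash != -1:
            # Drop the last path component, e.g. ".../apache/cassandra.git" -> ".../apache"
            base = base[:slash]
            sep = "/"
        else:
            # scp-style remote with a single path component, e.g. git@host:repo.git
            colon = base.rfind(":")
            if colon == -1:
                raise ValueError("cannot strip a path component from " + superproject_remote)
            base = base[:colon]
            sep = ":"
    return base + sep + rel


if __name__ == "__main__":
    # Hypothetical remotes: the Apache repo over https and ssh, plus a contributor fork.
    for remote in (
        "https://github.com/apache/cassandra.git",
        "git@github.com:apache/cassandra.git",
        "git@github.com:some-author/cassandra.git",
    ):
        print(remote, "->", resolve_submodule_url(remote, "../cassandra-accord.git"))
```

The point is that the protocol and the owner both come from the C* remote, so a fork of C* resolves to the matching fork of Accord; the case we hit is the Accord remote staying on Apache even though the C* remote had changed.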

Josh, do you see any reports on what isn’t working? I think most people don’t touch even 1% of what git can do… so it may be that 10% is broken but no one in our domain actually touches that path?

> On May 31, 2023, at 12:36 PM, Josh McKenzie <jmcken...@apache.org> wrote:
>
> Bumping into worktree + submodule pain on some harry related work; it looks
> like "git worktree" and submodules are not currently fully implemented:
>
> https://git-scm.com/docs/git-worktree#_bugs
> BUGS
> Multiple checkout in general is still experimental, and the support for
> submodules is incomplete. It is NOT recommended to make multiple checkouts of
> a superproject.
>
> I rely pretty heavily on worktrees and I know a lot of other folks who do
> too. This is a dealbreaker for me in terms of adding anything else as a
> submodule and I'd like to know if the accord folks have been running into any
> worktree related woes w/the accord integration.
>
> On Sun, May 28, 2023, at 10:14 AM, Alex Petrov wrote:
>> Regarding approachability, one of the things I thought is worth adding is a
>> DSL. I feel like there's enough functionality in Harry and there's enough
>> information for anyone who needs to write even an involved test out there,
>> but adoption doesn't usually start with complex use-cases, so it could be
>> that making it extremely simple to generate the data and validating that
>> written data is where it's supposed to be, should help adoption a lot.
>> Unfortunately, more complex use-cases such as group-by support, or SAI
>> testing will require a bit more knowledge and writing an involved model, so
>> I do not see any shortcuts we can take here.
>>
>> > I do think that moving Harry in-tree would improve approachability
>>
>> I think it's similar as it is with in-jvm dtest api. I feel like we would
>> evolve it more actively if we didn't have to cut a release before every
>> commit.
>> In other words, I think that changing Harry code and extending
>> functionality will be easier, which I think will eventually lead to quicker
>> adoption. But of course the act of moving itself does not increase adoption,
>> it just comes from better ergonomics.
>>
>> On Thu, May 25, 2023, at 8:03 PM, Abe Ratnofsky wrote:
>>> I'm seeing a few distinct topics here:
>>>
>>> 1. Harry's adoption and approachability
>>>
>>> I agree that approachability is one of Harry's main improvement areas right
>>> now. If our goal is to produce a fuzz testing framework for the Cassandra
>>> project, then adoption by contributors and usage for new feature
>>> development are reasonable indicators for whether we're achieving that
>>> goal. If Harry is not getting adopted by contributors outside of Apple, and
>>> is not getting used for new feature development, then we should make an
>>> effort to understand why. I don't think that a several-hour seminar is the
>>> best point of leverage to achieve those goals.
>>>
>>> Here's what I think we do need:
>>>
>>> - The README should be understandable by anyone interested in writing a
>>> fuzz test
>>> - Example tests should be runnable from a fresh clone of Cassandra, in an
>>> IDE or on the command line
>>> - Examples of how we would test new features (like CEP-7, CEP-29, etc) with
>>> the fuzz testing framework
>>>
>>> I find the JVM dtest framework accomplishes similar goals, and one reason
>>> is because there are plenty of examples, and it's relatively easy to copy
>>> and paste one example and have it do what you'd like. I believe the same
>>> approach would work for a fuzz testing framework.
>>>
>>> Some of these tasks above are already done for Harry, such as better IDE
>>> support for samples. This will be available in OSS Harry shortly.
>>>
>>> 2. Moving Harry in-tree vs. in submodule
>>>
>>> As I understand it, making Harry a submodule of Cassandra would make it
>>> easier to deal with versioning, since we wouldn't have to do the entire
>>> release dance we need to do for dtest-api, but I don't see this as a big
>>> improvement to approachability.
>>>
>>> I do think that moving Harry in-tree would improve approachability, for the
>>> same reason as the JVM dtests. It's nice to write a feature or fix, find a
>>> similar JVM dtest, copy, paste, and edit, and have something useful.
>>>
>>> 3. General subdivision of Cassandra projects
>>>
>>> This topic has come up quite a few times recently - around shared utilities
>>> (CEP-10 concurrency primitives, etc), dtest-api, query parser, etc. The
>>> project has tried out a few different approaches on composition of separate
>>> projects. Hopefully in the near future we find the one that works best and
>>> can start this process of splitting out libraries.
>>>
>>> --
>>> Abe
>>>
>>>> On May 25, 2023, at 6:36 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>
>>>>> I would really like us to split out utilities into a common project
>>>> +1 to the sentiment.
>>>>
>>>> Would also advocate strongly for it being more tightly integrated with the
>>>> base project than what we've been doing with our ecosystem (i.e.
>>>> completely separate projects, not submodules), mostly from a
>>>> discoverability and workflow standpoint.
>>>>
>>>> I'm definitely salty about having to have 4 IDE's / projects open just to
>>>> work on the entire stack.
>>>>
>>>> On Thu, May 25, 2023, at 5:05 AM, Alex Petrov wrote:
>>>>> This was not a talk, but rather an interactive workshop, unfortunately
>>>>> will not work in a recorded way, but I am trying to work out ways to
>>>>> preserve this.
>>>>>
>>>>> On Thu, May 25, 2023, at 10:26 AM, Claude Warren, Jr via dev wrote:
>>>>>> Since the talk was not accepted for Cassandra Summit, would it be
>>>>>> possible to record it as a simple youtube video and publish it so that
>>>>>> the detailed information about how to use Harry is not lost?
>>>>>>
>>>>>> On Thu, May 25, 2023 at 7:36 AM Alex Petrov <al...@coffeenco.de> wrote:
>>>>>>
>>>>>> While we are at it, we may also want to pull the in-jvm dtest API as a
>>>>>> submodule, and actually move some tests that are common between the
>>>>>> branches there.
>>>>>>
>>>>>> On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
>>>>>>> Isn’t the other reason Accord works well as a submodule that it has no
>>>>>>> dependencies on C* proper? Harry does at the moment, right? (Not that
>>>>>>> we couldn’t address that…just trying to think this through…)
>>>>>>>
>>>>>>>> On May 24, 2023, at 6:54 PM, Benedict <bened...@apache.org> wrote:
>>>>>>>>
>>>>>>>> In this case Harry is a testing module - it’s not something we will
>>>>>>>> develop in tandem with C* releases, and we will want improvements to
>>>>>>>> be applied across all branches.
>>>>>>>>
>>>>>>>> So it seems a natural fit for submodules to me.
>>>>>>>>
>>>>>>>>> On 24 May 2023, at 21:09, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> > Submodules do have their own overhead and edge cases, so I am
>>>>>>>>> > mostly in favor of using for cases where the code must live outside
>>>>>>>>> > of tree (such as jvm-dtest that lives out of tree as all branches
>>>>>>>>> > need the same interfaces)
>>>>>>>>>
>>>>>>>>> Agreed. Basically where I've ended up on this topic.
>>>>>>>>>
>>>>>>>>> > We could go over some interesting examples such as testing 2i (SAI)
>>>>>>>>>
>>>>>>>>> +100
>>>>>>>>>
>>>>>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov <al...@coffeenco.de> wrote:
>>>>>>>>>
>>>>>>>>> > I'm about to need to harry test for the paging across tombstone
>>>>>>>>> > work for https://issues.apache.org/jira/browse/CASSANDRA-18424
>>>>>>>>> > (that's where my own overlapping fuzzing came in). In the process,
>>>>>>>>> > I'll see if I can't distill something really simple along the lines
>>>>>>>>> > of how React approaches it (https://react.dev/learn).
>>>>>>>>>
>>>>>>>>> We can pick that up as an example, sure.
>>>>>>>>>
>>>>>>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
>>>>>>>>>>> workshop,
>>>>>>>>>> I'm about to need to harry test for the paging across tombstone work
>>>>>>>>>> for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's
>>>>>>>>>> where my own overlapping fuzzing came in). In the process, I'll see
>>>>>>>>>> if I can't distill something really simple along the lines of how
>>>>>>>>>> React approaches it (https://react.dev/learn).
>>>>>>>>>>
>>>>>>>>>> Ideally we'd be able to get something together that's a high level
>>>>>>>>>> "In the next 15 minutes, you will know and understand A-G and have
>>>>>>>>>> access to N% of the power of harry" kind of offer.
>>>>>>>>>>
>>>>>>>>>> Honestly, there's a lot in our ecosystem where we could benefit from
>>>>>>>>>> taking a page from their book in terms of onboarding and getting
>>>>>>>>>> started IMO.
>>>>>>>>>>
>>>>>>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>>>>>>> > I wonder if a mini-onboarding session would be good as a
>>>>>>>>>>> > community session - go over Harry, how to run it, how to add a
>>>>>>>>>>> > test? Would that be the right venue?
>>>>>>>>>>> > I just would like to see
>>>>>>>>>>> > how we can not only plug it in to regular CI but get everyone
>>>>>>>>>>> > that wants to add a test be able to know how to get started with
>>>>>>>>>>> > it.
>>>>>>>>>>>
>>>>>>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry
>>>>>>>>>>> workshop, but unfortunately it got declined. Goes without saying,
>>>>>>>>>>> we can still do it online, time and resources permitting. But
>>>>>>>>>>> again, I do not think it should be barring us from making Harry a
>>>>>>>>>>> part of the codebase, as it already is. In fact, we can be
>>>>>>>>>>> iterating on the development quicker having it in-tree.
>>>>>>>>>>>
>>>>>>>>>>> We could go over some interesting examples such as testing 2i
>>>>>>>>>>> (SAI), modelling Group By tests, or testing repair. If there is
>>>>>>>>>>> enough appetite and collaboration in the community, I will see if
>>>>>>>>>>> we can pull something like that together. Input on _what_ you would
>>>>>>>>>>> like to see / hear / tested is also appreciated. Harry was
>>>>>>>>>>> developed out of a strong need for large-scale testing, which also
>>>>>>>>>>> has informed many of its APIs, but we can make it easier to access
>>>>>>>>>>> for interactive testing / unit tests. We have been doing a lot of
>>>>>>>>>>> that with Transactional Metadata, too.
>>>>>>>>>>>
>>>>>>>>>>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got
>>>>>>>>>>> > any thoughts here?
>>>>>>>>>>>
>>>>>>>>>>> Yes, sorry for not responding on this thread earlier. I can not
>>>>>>>>>>> understate how excited I am about this, and how important I think
>>>>>>>>>>> this is. Time constraints are somehow hard to overcome, but I hope
>>>>>>>>>>> the results brought by TCM will make it all worth it.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>>>>>>>>>>> I think pulling Harry into the tree will make adoption easier for
>>>>>>>>>>>> the folks.
>>>>>>>>>>>> I have been a bit swamped with Transactional Metadata
>>>>>>>>>>>> work, but I wanted to make some of the things we were using for
>>>>>>>>>>>> testing TCM available outside of TCM branch. This includes a bunch
>>>>>>>>>>>> of helper methods to perform operations on the clusters, data
>>>>>>>>>>>> generation, and more useful stuff. Of course, the question always
>>>>>>>>>>>> remains about how much time I want to spend porting it all to
>>>>>>>>>>>> Gossip, but I think we can find a reasonable compromise.
>>>>>>>>>>>>
>>>>>>>>>>>> I would not set this improvement as a prerequisite to pulling
>>>>>>>>>>>> Harry into the main branch, but rather interpret it as a
>>>>>>>>>>>> commitment from myself to take community input and make it more
>>>>>>>>>>>> approachable by the day.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>>>>>>>>>>>>> importantly it’s a million times better than the dtest-api
>>>>>>>>>>>>>> process - which stymies development due to the friction.
>>>>>>>>>>>>> This is my major concern.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What prompted this thread was harry being external to the core
>>>>>>>>>>>>> codebase and the lack of adoption and usage of it having led to
>>>>>>>>>>>>> atrophy of certain aspects of it, which then led to redundant
>>>>>>>>>>>>> implementation of some fuzz testing and lost time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We'd all be better served to have this closer to the main
>>>>>>>>>>>>> codebase as a forcing function to smooth out the rough edges,
>>>>>>>>>>>>> integrate it, and make it a collective artifact and first class
>>>>>>>>>>>>> citizen IMO.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have similar opinions about the dtest-api.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It’s not without hiccups, and I’m sure we have more to learn.
>>>>>>>>>>>>>> But it mostly just works, and importantly it’s a million times
>>>>>>>>>>>>>> better than the dtest-api process - which stymies development
>>>>>>>>>>>>>> due to the friction.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 24 May 2023, at 08:39, Mick Semb Wever <m...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> WRT git submodules and CASSANDRA-18204, are we happy with how
>>>>>>>>>>>>>>> it is working for accord ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The time spent on getting that running has been a fair few
>>>>>>>>>>>>>>> hours, where we could have cut many manual module releases in
>>>>>>>>>>>>>>> that time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David and folks working on accord ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got
>>>>>>>>>>>>>>> any thoughts here?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>>>>>>>>>>>> I think it would be great to onboard Harry more officially
>>>>>>>>>>>>>>>> into the project. However it would be nice to perhaps do some
>>>>>>>>>>>>>>>> sanity checking outside of Apple folks to see how approachable
>>>>>>>>>>>>>>>> it is. That is, can someone take it and just run it with the
>>>>>>>>>>>>>>>> current readme without any additional context?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I wonder if a mini-onboarding session would be good as a
>>>>>>>>>>>>>>>> community session - go over Harry, how to run it, how to add a
>>>>>>>>>>>>>>>> test? Would that be the right venue? I just would like to
>>>>>>>>>>>>>>>> see how we can not only plug it in to regular CI but get
>>>>>>>>>>>>>>>> everyone that wants to add a test be able to know how to get
>>>>>>>>>>>>>>>> started with it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jeremy
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky <a...@aber.io> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Just to make sure I'm understanding the details, this would
>>>>>>>>>>>>>>>>> mean apache/cassandra-harry maintains its status as a
>>>>>>>>>>>>>>>>> separate repository, apache/cassandra references it as a
>>>>>>>>>>>>>>>>> submodule, and clones and builds Harry locally, rather than
>>>>>>>>>>>>>>>>> pulling a released JAR. We can then reference Harry as a
>>>>>>>>>>>>>>>>> library without maintaining public artifacts for it. Is that
>>>>>>>>>>>>>>>>> in line with what you're thinking?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > I'd also like to see us get a Harry run integrated as part
>>>>>>>>>>>>>>>>> > of our pre-commit CI
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm a strong supporter of this, of course.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Similar to what we've done with accord in
>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd
>>>>>>>>>>>>>>>>>> like to discuss bringing cassandra-harry in-tree as a
>>>>>>>>>>>>>>>>>> submodule.
>>>>>>>>>>>>>>>>>> repo link:
>>>>>>>>>>>>>>>>>> https://github.com/apache/cassandra-harry
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Given the value it's brought to the project's stabilization
>>>>>>>>>>>>>>>>>> efforts and the movement of other things in the ecosystem to
>>>>>>>>>>>>>>>>>> being more integrated (accord, build-scripts
>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I
>>>>>>>>>>>>>>>>>> think having the testing framework better localized and
>>>>>>>>>>>>>>>>>> integrated would be a net benefit for adoption, awareness,
>>>>>>>>>>>>>>>>>> maintenance, and tighter workflows as we troubleshoot future
>>>>>>>>>>>>>>>>>> failures it surfaces.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'd also like to see us get a Harry run integrated as part
>>>>>>>>>>>>>>>>>> of our pre-commit CI (a 5 minute simple soak test for
>>>>>>>>>>>>>>>>>> instance) and having that local in this fashion should make
>>>>>>>>>>>>>>>>>> that a cleaner integration as well.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thoughts?