Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

Alex Petrov Wed, 24 May 2023 23:38:04 -0700

While we are at it, we may also want to pull the in-jvm dtest API as a 
submodule, and actually move some tests that are common between the branches 
there.


On Thu, May 25, 2023, at 6:03 AM, Caleb Rackliffe wrote:
> Isn’t the other reason Accord works well as a submodule that it has no 
> dependencies on C* proper? Harry does at the moment, right? (Not that we 
> couldn’t address that…just trying to think this through…)
> 
>> On May 24, 2023, at 6:54 PM, Benedict <[email protected]> wrote:
>> 
>> 
>> In this case Harry is a testing module - it’s not something we will develop 
>> in tandem with C* releases, and we will want improvements to be applied 
>> across all branches.
>> 
>> So it seems a natural fit for submodules to me.
>> 
>> 
>>> On 24 May 2023, at 21:09, Caleb Rackliffe <[email protected]> wrote:
>>> 
>>> > Submodules do have their own overhead and edge cases, so I am mostly in 
>>> > favor of using for cases where the code must live outside of tree (such 
>>> > as jvm-dtest that lives out of tree as all branches need the same 
>>> > interfaces)
>>> 
>>> Agreed. Basically where I've ended up on this topic.
>>> 
>>> > We could go over some interesting examples such as testing 2i (SAI)
>>> 
>>> +100
>>> 
>>> 
>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov <[email protected]> wrote:
>>>> > I'm about to need to harry test for the paging across tombstone work for 
>>>> > https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my 
>>>> > own overlapping fuzzing came in). In the process, I'll see if I can't 
>>>> > distill something really simple along the lines of how React approaches 
>>>> > it (https://react.dev/learn).
>>>> 
>>>> We can pick that up as an example, sure. 
>>>> 
>>>> On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:
>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>> workshop,
>>>>> I'm about to need to harry test for the paging across tombstone work for 
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my 
>>>>> own overlapping fuzzing came in). In the process, I'll see if I can't 
>>>>> distill something really simple along the lines of how React approaches 
>>>>> it (https://react.dev/learn).
>>>>> 
>>>>> Ideally we'd be able to get something together that's a high level "In 
>>>>> the next 15 minutes, you will know and understand A-G and have access to 
>>>>> N% of the power of harry" kind of offer.
>>>>> 
>>>>> Honestly, there's a *lot* in our ecosystem where we could benefit from 
>>>>> taking a page from their book in terms of onboarding and getting started 
>>>>> IMO.
>>>>> 
>>>>> On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:
>>>>>> > I wonder if a mini-onboarding session would be good as a community 
>>>>>> > session - go over Harry, how to run it, how to add a test?  Would that 
>>>>>> > be the right venue?  I just would like to see how we can not only plug 
>>>>>> > it in to regular CI but get everyone that wants to add a test be able 
>>>>>> > to know how to get started with it.
>>>>>> 
>>>>>> I have submitted a proposal to Cassandra Summit for a 4-hour Harry 
>>>>>> workshop, but unfortunately it got declined. Goes without saying, we can 
>>>>>> still do it online, time and resources permitting. But again, I do not 
>>>>>> think it should be barring us from making Harry a part of the codebase, 
>>>>>> as it already is. In fact, we can be iterating on the development 
>>>>>> quicker having it in-tree. 
>>>>>> 
>>>>>> We could go over some interesting examples such as testing 2i (SAI), 
>>>>>> modelling Group By tests, or testing repair. If there is enough appetite 
>>>>>> and collaboration in the community, I will see if we can pull something 
>>>>>> like that together. Input on _what_ you would like to see / hear / 
>>>>>> tested is also appreciated. Harry was developed out of a strong need for 
>>>>>> large-scale testing, which also has informed many of its APIs, but we 
>>>>>> can make it easier to access for interactive testing / unit tests. We 
>>>>>> have been doing a lot of that with Transactional Metadata, too. 
>>>>>> 
>>>>>> > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>>>>>> > thoughts here?
>>>>>> 
>>>>>> Yes, sorry for not responding on this thread earlier. I cannot 
>>>>>> overstate how excited I am about this, and how important I think this 
>>>>>> is. Time constraints are somehow hard to overcome, but I hope the 
>>>>>> results brought by TCM will make it all worth it.
>>>>>> 
>>>>>> On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
>>>>>>> I think pulling Harry into the tree will make adoption easier for the 
>>>>>>> folks. I have been a bit swamped with Transactional Metadata work, but 
>>>>>>> I wanted to make some of the things we were using for testing TCM 
>>>>>>> available outside of TCM branch. This includes a bunch of helper 
>>>>>>> methods to perform operations on the clusters, data generation, and 
>>>>>>> more useful stuff. Of course, the question always remains about how 
>>>>>>> much time I want to spend porting it all to Gossip, but I think we can 
>>>>>>> find a reasonable compromise. 
>>>>>>> 
>>>>>>> I would not set this improvement as a prerequisite to pulling Harry 
>>>>>>> into the main branch, but rather interpret it as a commitment from 
>>>>>>> myself to take community input and make it more approachable by the 
>>>>>>> day. 
>>>>>>> 
>>>>>>> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>>>>>>>> importantly it’s a million times better than the dtest-api process - 
>>>>>>>>> which stymies development due to the friction.
>>>>>>>> This is my major concern.
>>>>>>>> 
>>>>>>>> What prompted this thread was harry being external to the core 
>>>>>>>> codebase and the lack of adoption and usage of it having led to 
>>>>>>>> atrophy of certain aspects of it, which then led to redundant 
>>>>>>>> implementation of some fuzz testing and lost time.
>>>>>>>> 
>>>>>>>> We'd all be better served to have this closer to the main codebase as 
>>>>>>>> a forcing function to smooth out the rough edges, integrate it, and 
>>>>>>>> make it a collective artifact and first class citizen IMO.
>>>>>>>> 
>>>>>>>> I have similar opinions about the dtest-api.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>>>>>>>> 
>>>>>>>>> It’s not without hiccups, and I’m sure we have more to learn. But it 
>>>>>>>>> mostly just works, and importantly it’s a million times better than 
>>>>>>>>> the dtest-api process - which stymies development due to the friction.
>>>>>>>>> 
>>>>>>>>>> On 24 May 2023, at 08:39, Mick Semb Wever <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> WRT git submodules and CASSANDRA-18204, are we happy with how it is 
>>>>>>>>>> working for accord ? 
>>>>>>>>>> 
>>>>>>>>>> The time spent on getting that running has been a fair few hours, 
>>>>>>>>>> where we could have cut many manual module releases in that time. 
>>>>>>>>>> 
>>>>>>>>>> David and folks working on accord ? 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie <[email protected]> 
>>>>>>>>>> wrote:
>>>>>>>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>>>>>>>>>>> thoughts here?
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>>>>>>>> I think it would be great to onboard Harry more officially into 
>>>>>>>>>>>> the project.  However it would be nice to perhaps do some sanity 
>>>>>>>>>>>> checking outside of Apple folks to see how approachable it is.  
>>>>>>>>>>>> That is, can someone take it and just run it with the current 
>>>>>>>>>>>> readme without any additional context?
>>>>>>>>>>>> 
>>>>>>>>>>>> I wonder if a mini-onboarding session would be good as a community 
>>>>>>>>>>>> session - go over Harry, how to run it, how to add a test?  Would 
>>>>>>>>>>>> that be the right venue?  I just would like to see how we can not 
>>>>>>>>>>>> only plug it in to regular CI but get everyone that wants to add a 
>>>>>>>>>>>> test be able to know how to get started with it.
>>>>>>>>>>>> 
>>>>>>>>>>>> Jeremy
>>>>>>>>>>>> 
>>>>>>>>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky <[email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Just to make sure I'm understanding the details, this would mean 
>>>>>>>>>>>>> apache/cassandra-harry maintains its status as a separate 
>>>>>>>>>>>>> repository, apache/cassandra references it as a submodule, and 
>>>>>>>>>>>>> clones and builds Harry locally, rather than pulling a released 
>>>>>>>>>>>>> JAR. We can then reference Harry as a library without maintaining 
>>>>>>>>>>>>> public artifacts for it. Is that in line with what you're 
>>>>>>>>>>>>> thinking?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> > I'd also like to see us get a Harry run integrated as part of 
>>>>>>>>>>>>> > our pre-commit CI
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm a strong supporter of this, of course.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie 
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Similar to what we've done with accord in 
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like 
>>>>>>>>>>>>>> to discuss bringing cassandra-harry in-tree as a submodule. repo 
>>>>>>>>>>>>>> link: https://github.com/apache/cassandra-harry
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Given the value it's brought to the project's stabilization 
>>>>>>>>>>>>>> efforts and the movement of other things in the ecosystem to 
>>>>>>>>>>>>>> being more integrated (accord, build-scripts 
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I think 
>>>>>>>>>>>>>> having the testing framework better localized and integrated 
>>>>>>>>>>>>>> would be a net benefit for adoption, awareness, maintenance, and 
>>>>>>>>>>>>>> tighter workflows as we troubleshoot future failures it surfaces.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'd also like to see us get a Harry run integrated as part of 
>>>>>>>>>>>>>> our pre-commit CI (a 5 minute simple soak test for instance) and 
>>>>>>>>>>>>>> having that local in this fashion should make that a cleaner 
>>>>>>>>>>>>>> integration as well.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>
