Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

Alex Petrov Wed, 24 May 2023 07:33:10 -0700

> I wonder if a mini-onboarding session would be good as a community session - 
> go over Harry, how to run it, how to add a test?  Would that be the right 
> venue?  I just would like to see how we can not only plug it in to regular CI 
> but get everyone that wants to add a test be able to know how to get started 
> with it.


I submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, 
but unfortunately it was declined. It goes without saying that we can still do 
it online, time and resources permitting. But again, I do not think this should 
bar us from making Harry a part of the codebase, as it already is. In fact, 
we can iterate on its development more quickly with it in-tree. 

We could go over some interesting examples such as testing 2i (SAI), modelling 
Group By tests, or testing repair. If there is enough appetite and 
collaboration in the community, I will see if we can pull something like that 
together. Input on _what_ you would like to see / hear / have tested is also 
appreciated. Harry was developed out of a strong need for large-scale testing, 
which also has informed many of its APIs, but we can make it easier to access 
for interactive testing / unit tests. We have been doing a lot of that with 
Transactional Metadata, too. 

> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts 
> here?

Yes, sorry for not responding on this thread earlier. I cannot overstate how 
excited I am about this, and how important I think this is. Time constraints 
are somewhat hard to overcome, but I hope the results brought by TCM will make 
it all worth it.

On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:
> I think pulling Harry into the tree will make adoption easier for folks. 
> I have been a bit swamped with Transactional Metadata work, but I wanted to 
> make some of the things we were using for testing TCM available outside of 
> the TCM branch. This includes a bunch of helper methods to perform operations on 
> the clusters, data generation, and more useful stuff. Of course, the question 
> always remains about how much time I want to spend porting it all to Gossip, 
> but I think we can find a reasonable compromise. 
> 
> I would not set this improvement as a prerequisite to pulling Harry into the 
> main branch, but rather interpret it as a commitment from myself to take 
> community input and make it more approachable by the day. 
> 
> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:
>>> importantly it’s a million times better than the dtest-api process - which 
>>> stymies development due to the friction.
>> This is my major concern.
>> 
>> What prompted this thread was harry being external to the core codebase and 
>> the lack of adoption and usage of it having led to atrophy of certain 
>> aspects of it, which then led to redundant implementation of some fuzz 
>> testing and lost time.
>> 
>> We'd all be better served to have this closer to the main codebase as a 
>> forcing function to smooth out the rough edges, integrate it, and make it a 
>> collective artifact and first class citizen IMO.
>> 
>> I have similar opinions about the dtest-api.
>> 
>> 
>> On Wed, May 24, 2023, at 4:05 AM, Benedict wrote:
>>> 
>>> It’s not without hiccups, and I’m sure we have more to learn. But it mostly 
>>> just works, and importantly it’s a million times better than the dtest-api 
>>> process - which stymies development due to the friction.
>>> 
>>>> On 24 May 2023, at 08:39, Mick Semb Wever <m...@apache.org> wrote:
>>>> 
>>>> 
>>>> WRT git submodules and CASSANDRA-18204, are we happy with how it is 
>>>> working for accord ? 
>>>> 
>>>> The time spent on getting that running has been a fair few hours, where we 
>>>> could have cut many manual module releases in that time. 
>>>> 
>>>> David and folks working on accord ? 
>>>> 
>>>> 
>>>> 
>>>> On Tue, 23 May 2023 at 20:09, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>> I'll hold off on this until Alex Petrov chimes in. @Alex -> got any 
>>>>> thoughts here?
>>>>> 
>>>>> On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:
>>>>>> I think it would be great to onboard Harry more officially into the 
>>>>>> project.  However it would be nice to perhaps do some sanity checking 
>>>>>> outside of Apple folks to see how approachable it is.  That is, can 
>>>>>> someone take it and just run it with the current readme without any 
>>>>>> additional context?
>>>>>> 
>>>>>> I wonder if a mini-onboarding session would be good as a community 
>>>>>> session - go over Harry, how to run it, how to add a test?  Would that 
>>>>>> be the right venue?  I just would like to see how we can not only plug 
>>>>>> it in to regular CI but get everyone that wants to add a test be able to 
>>>>>> know how to get started with it.
>>>>>> 
>>>>>> Jeremy
>>>>>> 
>>>>>>> On May 16, 2023, at 1:34 PM, Abe Ratnofsky <a...@aber.io> wrote:
>>>>>>> 
>>>>>>> Just to make sure I'm understanding the details, this would mean 
>>>>>>> apache/cassandra-harry maintains its status as a separate repository, 
>>>>>>> apache/cassandra references it as a submodule, and clones and builds 
>>>>>>> Harry locally, rather than pulling a released JAR. We can then 
>>>>>>> reference Harry as a library without maintaining public artifacts for 
>>>>>>> it. Is that in line with what you're thinking?
>>>>>>> 
>>>>>>> > I'd also like to see us get a Harry run integrated as part of our 
>>>>>>> > pre-commit CI
>>>>>>> 
>>>>>>> I'm a strong supporter of this, of course.
>>>>>>> 
>>>>>>>> On May 16, 2023, at 11:03 AM, Josh McKenzie <jmcken...@apache.org> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Similar to what we've done with accord in 
>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to 
>>>>>>>> discuss bringing cassandra-harry in-tree as a submodule. repo link: 
>>>>>>>> https://github.com/apache/cassandra-harry
>>>>>>>> 
>>>>>>>> Given the value it's brought to the project's stabilization efforts 
>>>>>>>> and the movement of other things in the ecosystem to being more 
>>>>>>>> integrated (accord, build-scripts 
>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-18133), I think having 
>>>>>>>> the testing framework better localized and integrated would be a net 
>>>>>>>> benefit for adoption, awareness, maintenance, and tighter workflows as 
>>>>>>>> we troubleshoot future failures it surfaces.
>>>>>>>> 
>>>>>>>> I'd also like to see us get a Harry run integrated as part of our 
>>>>>>>> pre-commit CI (a 5 minute simple soak test for instance) and having 
>>>>>>>> that local in this fashion should make that a cleaner integration as 
>>>>>>>> well.
>>>>>>>> 
>>>>>>>> Thoughts?
>>>>> 
>> 
>
