Strong +1. One concern: how do we define 'unnecessary branches', i.e. how do we distinguish the branches someone created accidentally? I looked through some of them but couldn't find an obvious rule. Thanks.
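
As a possible starting point for such a rule, here is a minimal sketch that buckets branch names by pattern. The patterns and bucket names are assumptions for illustration (`dependabot/*`, `branch-<number>` for release lines, JIRA-style prefixes such as `MR-279`), not an agreed policy; anything unmatched falls into `review` so a human can look at it.

```shell
# Hypothetical helper: print a bucket name for one branch name.
# The patterns below are illustrative assumptions, not project policy.
classify_branch() {
  case "$1" in
    trunk)          echo mainline ;;
    dependabot/*)   echo dependabot ;;
    branch-[0-9]*)  echo release ;;
    HADOOP-[0-9]*|HDFS-[0-9]*|YARN-[0-9]*|MAPREDUCE-[0-9]*|MR-[0-9]*)
                    echo jira ;;
    *)              echo review ;;
  esac
}

# Possible usage against the remote-tracking branches of a remote named
# "apache" (left commented out so the snippet has no side effects):
#
#   git for-each-ref --format='%(refname:lstrip=3)' refs/remotes/apache |
#   while IFS= read -r b; do
#     printf '%s\t%s\n' "$(classify_branch "$b")" "$b"
#   done | sort
```

The `review` bucket is where accidentally created branches would most likely show up, so that list would still need a manual pass.
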
Best Regards,
- He Xiaoqiao

On Wed, May 29, 2024 at 10:11 PM Ayush Saxena <ayush...@gmail.com> wrote:

> +1 for the proposal, the first thing might be to just drop the
> unnecessary branches which are either from dependabot or someone
> accidentally created a branch in the main repo rather than in their
> fork, there are many, I don't think we need them in the archived repo
> either
>
> Regarding (3) If you mean just branches, then should be ok, maybe lets
> not touch the release tags for now IMO
>
> Regarding the regex, I tried this locally & it works
> ```
> ayushsaxena@ayushsaxena hadoop % git tag --delete `git tag --list 'ozone*'`
> Deleted tag 'ozone-0.2.1-alpha-RC0' (was 90b070452bc7)
> Deleted tag 'ozone-0.3.0-alpha' (was e9921ebf7e8d)
> Deleted tag 'ozone-0.3.0-alpha-RC0' (was 3fbd1f15b894)
> Deleted tag 'ozone-0.3.0-alpha-RC1' (was cdad29240e52)
> Deleted tag 'ozone-0.4.0-alpha-RC0' (was 07fd26ef6d8c)
> Deleted tag 'ozone-0.4.0-alpha-RC1' (was c4f9a20bbe55)
> Deleted tag 'ozone-0.4.0-alpha-RC2' (was 6860c595ed19)
> Deleted tag 'ozone-0.4.1-alpha' (was 687173ff4be4)
> Deleted tag 'ozone-0.4.1-alpha-RC0' (was 9062dac447c8)
> ayushsaxena@ayushsaxena hadoop %
> ```
>
> -Ayush
>
> On Wed, 29 May 2024 at 18:07, Steve Loughran <ste...@cloudera.com.invalid> wrote:
> >
> > I'm waiting for a git update of trunk to complete, not having done it since
> > last week. The 1.8 GB download is taking a long time over a VPN.
> >
> > Updating files: 100% (8518/8518), done.
> > Switched to branch 'trunk'
> > Your branch is up to date with 'apache/trunk'.
> > remote: Enumerating objects: 4142992, done.
> > remote: Counting objects: 100% (4142972/4142972), done.
> > remote: Compressing objects: 100% (503038/503038), done.
> > ^Receiving objects: 11% (483073/4142404), 204.18 MiB | 7.05 MiB/s
> > remote: Total 4142404 (delta 3583765), reused 4140936 (delta 3582453)
> > Receiving objects: 100% (4142404/4142404), 1.80 GiB | 6.36 MiB/s, done.
> > Resolving deltas: 42% (1505182/3583765)
> > ...
> >
> > We have too many branches and too many tags, which makes for big downloads
> > and slow clones, as well as complaints from git whenever I manually push
> > things to gitbox.apache.org.
> >
> > I think we can/should clean up, which can be done as
> >
> > 1. Create a hadoop-source-archive repository,
> > 2. into which we add all of the current hadoop-repository. This ensures
> >    all the history is preserved.
> > 3. Delete all the old release branches, where old is defined as, maybe <
> >    2.6?
> > 4. feature branches which are merged/abandoned
> > 5. all the single JIRA branches which are the same thing, "MR-279" being
> >    a key example. ozone-* probably too.
> > 6. Do some tag pruning too. (Is there a way to do this with wildcards? I
> >    could use it locally...)
> >
> > With an archive repo, all the old development history for branches off the
> > current release chain + tags are still available, but the core repo is
> > much, much smaller.
> >
> > What do people think?
> >
> > If others are interested, I'll need some help carefully getting the
> > hadoop-source-archive repo up. We'd need to somehow get all of hadoop trunk
> > into it.
> >
> > Meanwhile, I will cull some merged feature branches.
> >
> > Steve
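
On pruning tags with wildcards beyond the local clone: the sketch below, offered as an assumption rather than a tested procedure for this repo, turns tag names into delete refspecs of the form `:refs/tags/<name>`, which `git push` accepts for removing refs on a remote. The helper name `tags_to_delete_refspecs` is hypothetical, and `apache` is simply the remote name that appears in the log output quoted above.

```shell
# Hypothetical helper: read tag names on stdin, one per line, and print
# a delete refspec (":refs/tags/<name>") for each non-empty line.
tags_to_delete_refspecs() {
  while IFS= read -r tag; do
    if [ -n "$tag" ]; then
      printf ':refs/tags/%s\n' "$tag"
    fi
  done
}

# Possible dry run (commented out so the snippet has no side effects).
# The "echo" only prints the push command instead of running it:
#
#   git tag --list 'ozone*' | tags_to_delete_refspecs |
#     xargs echo git push apache
```

Dropping the `echo` would perform the deletion on the remote, so it seems safer to do that only after the archive repo holds a full copy of the tags.
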