Strong +1. One concern is how we define 'unnecessary branches', i.e.
how to distinguish the branches someone accidentally created. I tried
to go through some of them but didn't find an obvious rule. Thanks.
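
As a possible starting point (a rough sketch, not a rule), git can at
least list two groups of candidates: branches under a bot prefix, and
branches already fully merged into trunk. The remote name `apache` is
taken from the 'apache/trunk' in Steve's output below; the `dependabot`
prefix and the function names are my assumptions. Nothing here deletes
anything.

```shell
#!/bin/sh
# Sketch only. Function names are hypothetical; the "dependabot" prefix is an
# assumption about branch naming and should be checked against the real repo.

# Print remote-tracking branches on remote $1 that sit under prefix $2
# (or are named exactly $2).
list_prefixed_branches() {
    git for-each-ref --format='%(refname:short)' "refs/remotes/$1/$2"
}

# Print remote-tracking branches on remote $1 that are already fully merged
# into mainline $2, excluding the mainline itself and the remote's HEAD pointer.
list_merged_branches() {
    git for-each-ref --format='%(refname:short)' \
        --merged="refs/remotes/$1/$2" "refs/remotes/$1" |
        grep -v -x -F -e "$1/$2" -e "$1/HEAD" -e "$1"
}
```

For example, `list_prefixed_branches apache dependabot` and
`list_merged_branches apache trunk` would print names to review by hand.
Branches that were squash-merged or cherry-picked into trunk will not
show up as merged, so this only narrows the list; accidentally created
branches still need a manual look.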

Best Regards,
- He Xiaoqiao



On Wed, May 29, 2024 at 10:11 PM Ayush Saxena <ayush...@gmail.com> wrote:

> +1 for the proposal. The first step might be to just drop the
> unnecessary branches, which are either from dependabot or cases where
> someone accidentally created a branch in the main repo rather than in
> their fork. There are many, and I don't think we need them in the
> archived repo either.
>
> Regarding (3): if you mean just branches, then that should be OK. Maybe
> let's not touch the release tags for now, IMO.
>
> Regarding the wildcards, I tried this locally and it works:
> ```
> ayushsaxena@ayushsaxena hadoop % git tag --delete `git tag --list 'ozone*'`
> Deleted tag 'ozone-0.2.1-alpha-RC0' (was 90b070452bc7)
> Deleted tag 'ozone-0.3.0-alpha' (was e9921ebf7e8d)
> Deleted tag 'ozone-0.3.0-alpha-RC0' (was 3fbd1f15b894)
> Deleted tag 'ozone-0.3.0-alpha-RC1' (was cdad29240e52)
> Deleted tag 'ozone-0.4.0-alpha-RC0' (was 07fd26ef6d8c)
> Deleted tag 'ozone-0.4.0-alpha-RC1' (was c4f9a20bbe55)
> Deleted tag 'ozone-0.4.0-alpha-RC2' (was 6860c595ed19)
> Deleted tag 'ozone-0.4.1-alpha' (was 687173ff4be4)
> Deleted tag 'ozone-0.4.1-alpha-RC0' (was 9062dac447c8)
> ayushsaxena@ayushsaxena hadoop %
>
> ```
>
> -Ayush
>
> On Wed, 29 May 2024 at 18:07, Steve Loughran
> <ste...@cloudera.com.invalid> wrote:
> >
> > I'm waiting for a git update of trunk to complete, not having done it since
> > last week. The 1.8 GB download is taking a long time over a VPN.
> >
> > Updating files: 100% (8518/8518), done.
> > Switched to branch 'trunk'
> > Your branch is up to date with 'apache/trunk'.
> > remote: Enumerating objects: 4142992, done.
> > remote: Counting objects: 100% (4142972/4142972), done.
> > remote: Compressing objects: 100% (503038/503038), done.
> > ^Receiving objects:  11% (483073/4142404), 204.18 MiB | 7.05 MiB/s
> > remote: Total 4142404 (delta 3583765), reused 4140936 (delta 3582453)
> > Receiving objects: 100% (4142404/4142404), 1.80 GiB | 6.36 MiB/s, done.
> > Resolving deltas:  42% (1505182/3583765)
> > ...
> >
> >
> > We have too many branches and too many tags, which makes for big downloads
> > and slow clones, as well as complaints from git whenever I manually push
> > things to gitbox.apache.org.
> >
> > I think we can/should clean up, which can be done as
> >
> >
> >    1. Create a hadoop-source-archive repository,
> >    2. into which we add all of the current hadoop-repository. This ensures
> >    all the history is preserved.
> >    3. Delete all the old release branches, where old is defined as, maybe <
> >    2.6?
> >    4. feature branches which are merged/abandoned
> >    5. all the single JIRA branches which are the same thing, "MR-279" being
> >    a key example. ozone-* probably too.
> >    6. Do some tag pruning too. (Is there a way to do this with wildcards? I
> >    could use it locally...)
> >
> > With an archive repo, all the old development history for branches off the
> > current release chain + tags are still available, but the core repo is
> > much, much smaller.
> >
> > What do people think?
> >
> > If others are interested, I'll need some help carefully getting the
> > hadoop-source-archive repo up. We'd need to somehow get all of hadoop trunk
> > into it.
> >
> > Meanwhile, I will cull some merged feature branches.
> >
> > Steve
>
