Re: distexpand for autogenerated upstream distfile resources (was: standardize and simplify GitHub submodule handling in ports?)

2023-08-09 Thread Marc Espie
On Wed, Aug 09, 2023 at 12:54:12AM -0400, Thomas Frohwein wrote:
> - It includes logic that finds the first MASTER_SITESn that isn't
>   otherwise used, and throws an ERROR if it overruns past
>   MASTER_SITES9.

That logic will hopefully soon be 100% obsolete.

I need some okays on the .VARIABLES make patch.

I have code to be able to use more or less arbitrary
suffixes in MASTER_SITES in bsd.port.mk, and the corresponding
stuff for sqlports and dpb ought to be fairly trivial.



distexpand for autogenerated upstream distfile resources (was: standardize and simplify GitHub submodule handling in ports?)

2023-08-08 Thread Thomas Frohwein
On Mon, Aug 07, 2023 at 09:17:05PM +0100, Stuart Henderson wrote:

[...]

> I think maybe I'd prefer to have some variable that could be used
> *instead* of the existing GH_* variables rather than in conjunction with
> them (so they can be used for all GH archive ports, rather than have
> them a special case for multi-distfile ports). If that's the standard
> way to do things we can have a sweep of the tree converting other
> ports (or at least the ones that don't use go.port.mk ;)
> 
> It would be kind-of helpful if it could support more than just github
> too (gitlab.com, sr.ht, ..). While that could be done with different
> variables (GH_xx, GL_xx, SRHT_xx etc) they're all a similar enough
> layout to each other that making the site part of the variable itself
> rather than the name would be simpler and easier to add more sites
> (plus it covers the case where you have some port using one file from
> github and one from gitlab, etc).
> 
> Playing with syntax ideas, maybe something like this would be easy to
> use for ports not needing a rename -
> 
> SOMEVAR+= github vim vim refs/tags/v9.0.1677
> SOMEVAR+= github vim colorschemes 22986fa2a3d2f7229efd4019fcbca411caa6afbb
> 
> or with some auto-renaming (and specifying more of the path to avoid the
> extra GH_WRKSRC which I think might not be enough in some cases anyway -
> a port may have several distfiles that need to go into different base
> dirs) -
> 
> SOMEVAR+= github fortran-lang fpm refs/tags/v0.7.0
> OTHERVAR+=github toml-f toml-f e49f5523e4ee67db6628618864504448fb8c8939 vendor/toml-f
> OTHERVAR+=github urbanjost M_CLI2 90a1a146e19c8ad37b0469b8cbd04bc28eb67a50 vendor/M_CLI2
> 
> (no idea what to use as real names instead of SOMEVAR/OTHERVAR though!)
> 
> How does that sort of thing seem to you? (i.e. using the same basic idea as
> you have for submodules, but making it the standard for all gh distfiles)?

I ran with your suggestion and came up with a solution that I've named
distexpand. The idea is to use templates for commonly used,
automatically generated and therefore predictably named, stored, and
packaged dist files. 2 variables take different arguments/parts that
are 'expanded' with the template to working MASTER_SITESn and
DISTFILES.

The current configuration in the ports Makefile is done like this,
after putting distexpand.port.mk into /usr/ports/infrastructure/mk/:

MODULES += distexpand
DISTEXPAND += template account1 project1 id1(commithash/tag)
DISTEXPANDX += template account2 project2 id2(commithash/tag) targetdir

'template' is currently set up for github, gitlab, and sourcehut. You
can use multiple DISTEXPAND and DISTEXPANDX as needed. This will _not_
use up more MASTER_SITESn, as long as the template stays the same.
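
As a concrete illustration, the fpm example from the quoted mail might be
written like this with the proposed variables. This is only a sketch: it
assumes the tag is given as plain 'v0.7.0' (see the note on tags further
down) and takes the commit hashes and target directories straight from the
quoted mail.

```make
MODULES +=	distexpand

# main distfile: template, account, project, tag
DISTEXPAND +=	github fortran-lang fpm v0.7.0

# vendored sources: the same four fields plus a target directory
DISTEXPANDX +=	github toml-f toml-f \
		e49f5523e4ee67db6628618864504448fb8c8939 vendor/toml-f
DISTEXPANDX +=	github urbanjost M_CLI2 \
		90a1a146e19c8ad37b0469b8cbd04bc28eb67a50 vendor/M_CLI2
```

All three entries use the github template, so by the rule above they would
share a single MASTER_SITESn.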

Regarding the naming, I'm definitely open to discuss other suggestions.
DISTEXPAND is what I've been able to think of that most clearly conveys
the use of the fragments that are expanded to a full address for
fetching the distfile. DISTEXPANDX - the last 'X' is meant to stand for
'extended' as this is the version that relocates the extracted files to
a target dir. I'm slightly partial to naming the variables
'DISTEXPAND4' and 'DISTEXPAND5' instead, which would remind the porter
of the number of components for each version.

For the templating, I used %account, %project, %id, %subdir as the
placeholders. Those are substituted later with :S. I'm open to
suggestions if there may be a more established pattern for placeholders
in strings in Makefile context.
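
For readers unfamiliar with :S, the substitution could look roughly like
the following standalone Makefile. The template string and the variable
names _TEMPLATE and _URL are hypothetical; the real templates live in
distexpand.port.mk and may differ.

```make
# Hypothetical template with the placeholders described above.
_TEMPLATE =	https://github.com/%account/%project/archive/%id.tar.gz

# Each :S modifier replaces one placeholder with its value.
_URL =	${_TEMPLATE:S/%account/vim/:S/%project/colorschemes/:S/%id/22986fa2a3d2f7229efd4019fcbca411caa6afbb/}

# Print the expanded address.
show-url:
	@echo ${_URL}
```

Running `make show-url` on this file prints the fully substituted address.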

This can replace GH_{ACCOUNT,PROJECT,TAGNAME,COMMIT}. Tags are detected
as such, and in that case a DISTNAME will be set to $project-$tag if
not otherwise set. In other scenarios, a DISTNAME or PKGNAME may need
to be set.

A couple of other things to note compared to before:
- GH_WRKSRC is gone without replacement. Its usefulness was
  questionable.
- It includes logic that finds the first MASTER_SITESn that isn't
  otherwise used, and throws an ERROR if it overruns past
  MASTER_SITES9.
- Using tags now just requires providing '0.1.0' or 'v0.11.2' or another
  non-commithash string (the heuristic checks for length to determine
  if this is a tag or a commit hash).
- It currently uses 2 longer for-loops that are almost identical, but
  one for DISTEXPAND, and the other one for DISTEXPANDX. Given the
  limitations in Makefiles, I couldn't think of a way to reuse more
  code there.
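
A length-based tag/commit check of the kind mentioned in the list above
could be sketched as follows. This is an assumption about how such a
heuristic might be written, not the code from distexpand.port.mk: it treats
an id of exactly 40 characters as a commit hash, using a :M glob made of 40
'?' wildcards. The names _ID and _ID_KIND are hypothetical.

```make
# Hypothetical sketch: an id of exactly 40 characters counts as a commit hash.
_ID =	v0.11.2

# :M with 40 '?' wildcards only matches words that are 40 characters long.
.if empty(_ID:M????????????????????????????????????????)
_ID_KIND =	tag
.else
_ID_KIND =	commit
.endif

# Print the id together with the detected kind.
show-kind:
	@echo ${_ID}: ${_ID_KIND}
```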

This doesn't need to be in a module, but this way it's easy to plug in
and experiment with.

I'm attaching the distexpand.port.mk, as well as the patch for using it
with neovim as an example. I've tested this with about 3 dozen ports
that use combinations of mostly github sites, but also a gitlab and a
sourcehut dist source [1].

[1] https://thfr.info/tmp/distexpand-ports.txt
Index: Makefile
===
RCS file: /cvs/ports/editors/neovim/Makefile,v
retrieving revision 1.37
diff -u -p -r1.37 Makefile
--- Make

Re: standardize and simplify GitHub submodule handling in ports?

2023-08-07 Thread Stuart Henderson
On 2023/08/07 14:53, Thomas Frohwein wrote:
> On Mon, Aug 07, 2023 at 06:59:15PM +0100, Stuart Henderson wrote:
> [...]
> 
> > I haven't looked at other ports, but asterisk, vim and vmm-firmware do
> > not use git submodules.
> 
> With vim, it's the way colorschemes are pulled in that *could* be
> reworked using GH_SUBMODULES syntax. The old way continues to work, so
> for any of the ports listed, there is no need to change anything.

It does feel like a bit of a hack to use something named after submodules
for something else. (And submodules themselves are a total hack which I
don't think should be used at all, especially when projects expect to
build from a checkout rather than a proper distfile, but I digress ;)

I think maybe I'd prefer to have some variable that could be used
*instead* of the existing GH_* variables rather than in conjunction with
them (so they can be used for all GH archive ports, rather than have
them a special case for multi-distfile ports). If that's the standard
way to do things we can have a sweep of the tree converting other
ports (or at least the ones that don't use go.port.mk ;)

It would be kind-of helpful if it could support more than just github
too (gitlab.com, sr.ht, ..). While that could be done with different
variables (GH_xx, GL_xx, SRHT_xx etc) they're all a similar enough
layout to each other that making the site part of the variable itself
rather than the name would be simpler and easier to add more sites
(plus it covers the case where you have some port using one file from
github and one from gitlab, etc).

Playing with syntax ideas, maybe something like this would be easy to
use for ports not needing a rename -

SOMEVAR+=   github vim vim refs/tags/v9.0.1677
SOMEVAR+=   github vim colorschemes 22986fa2a3d2f7229efd4019fcbca411caa6afbb

or with some auto-renaming (and specifying more of the path to avoid the
extra GH_WRKSRC which I think might not be enough in some cases anyway -
a port may have several distfiles that need to go into different base
dirs) -

SOMEVAR+=   github fortran-lang fpm refs/tags/v0.7.0
OTHERVAR+=  github toml-f toml-f e49f5523e4ee67db6628618864504448fb8c8939 vendor/toml-f
OTHERVAR+=  github urbanjost M_CLI2 90a1a146e19c8ad37b0469b8cbd04bc28eb67a50 vendor/M_CLI2

(no idea what to use as real names instead of SOMEVAR/OTHERVAR though!)

How does that sort of thing seem to you? (i.e. using the same basic idea as
you have for submodules, but making it the standard for all gh distfiles)?



Re: standardize and simplify GitHub submodule handling in ports?

2023-08-07 Thread Thomas Frohwein
On Mon, Aug 07, 2023 at 06:59:15PM +0100, Stuart Henderson wrote:
[...]

> I haven't looked at other ports, but asterisk, vim and vmm-firmware do
> not use git submodules.

With vim, it's the way colorschemes are pulled in that *could* be
reworked using GH_SUBMODULES syntax. The old way continues to work, so
for any of the ports listed, there is no need to change anything.



Re: standardize and simplify GitHub submodule handling in ports?

2023-08-07 Thread Brian Callahan
On 8/7/2023 1:59 PM, Stuart Henderson wrote:
> On 2023/08/07 12:44, Thomas Frohwein wrote:
>> I tested this with the roughly 30 ports I could identify that use GitHub
>> submodules, by adjusting the Makefile to use GH_SUBMODULES. Here are a few
>> points from what I've observed:
> ..
>>
>> The full table of what I tested, and whether each port still
>> packages, is here: https://thfr.info/tmp/github-submodule-ports.txt
> 
> I haven't looked at other ports, but asterisk, vim and vmm-firmware do
> not use git submodules.
> 

I don't want to change the DMD build process. It matches what upstream
does, and has been helpful for debugging the process with them in the past.

~Brian



Re: standardize and simplify GitHub submodule handling in ports?

2023-08-07 Thread Stuart Henderson
On 2023/08/07 12:44, Thomas Frohwein wrote:
> I tested this with the roughly 30 ports I could identify that use GitHub
> submodules, by adjusting the Makefile to use GH_SUBMODULES. Here are a few
> points from what I've observed:
..
> 
> The full table of what I tested, and whether each port still
> packages, is here: https://thfr.info/tmp/github-submodule-ports.txt

I haven't looked at other ports, but asterisk, vim and vmm-firmware do
not use git submodules.



Re: standardize and simplify GitHub submodule handling in ports?

2023-08-07 Thread Thomas Frohwein
On Sun, Aug 06, 2023 at 07:00:49PM +0200, Marc Espie wrote:

[...]

> > > I'm also wondering if keeping the main GH_* stuff in bsd.port.mk makes a lot
> > > of sense, instead of grouping everything in github.port.mk
> > 
> > I'm for it, maybe as a second step after the module for just the
> > submodule handling is done because there would be a lot of ports churn
> > with moving the main GH_* stuff out of bsd.port.mk.
> 
> Probably not. We have a few "big" modules that don't depend on explicitly
> adding to modules, but on some variable triggering it (historic imake stuff
> or configure), having github.port.mk brought in from one of the GH_* variables
> (probably don't need to test them all) would be acceptable.

I'm sharing a new version after some testing, this time as a diff
because it turns out the handling of GH_*-related DISTNAME juggling
needs to happen before setting up the DISTFILES in the module. So this
diff now effectively moves all the GH_* bits from bsd.port.mk into
github.port.mk. The module is hooked up by defining GH_ACCOUNT or
GH_SUBMODULES.

Some rework was needed while testing to handle the different situations
for the target directory of the submodule. This ended up with a test if
the directory exists, and depending on the result either rmdir'ing the
(presumed empty) directory, or mkdir'ing the parent directory.
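
For one submodule, the directory handling described above might look
roughly like this. It is a sketch only: _DST and the toml-f example are
hypothetical, and the name of the extracted directory follows the
project-commit pattern used elsewhere in the module.

```make
# Hypothetical: where one submodule should end up.
_DST =	${GH_WRKSRC}/toml-f

# If the (presumed empty) placeholder directory exists, remove it;
# otherwise make sure its parent directory exists. Then move the
# extracted sources into place.
MODGITHUB_post-extract += \
	if test -d ${_DST}; then rmdir ${_DST}; \
	else mkdir -p ${_DST:H}; fi; \
	mv ${WRKDIR}/toml-f-e49f5523e4ee67db6628618864504448fb8c8939 ${_DST};
```

Using rmdir rather than rm -rf means the step fails loudly if the
placeholder directory turns out not to be empty.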

I tested this with the roughly 30 ports I could identify that use GitHub
submodules, by adjusting the Makefile to use GH_SUBMODULES. Here are a few
points from what I've observed:

1. It's well possible to mix and match with the old way of defining
MASTER_SITES0 etc and DISTFILES=filename.ext:0. An example is
devel/fpm, where I kept this line:
MASTER_SITES0 = https://github.com/fortran-lang/fpm/releases/download/v${V}/
while replacing MASTER_SITES{1,2} with the new GH_SUBMODULES way.

2. Setting GH_WRKSRC allows keeping the diff from the current Makefile
small, for example in devel/fpm:
+GH_WRKSRC =${WRKSRC}/vendor
and in editors/neovim:
+GH_WRKSRC =${STATIC_DEPS_WRKSRC}

3. It's possible to either use the commit hash or a tagname, like here
for lang/jruby:
+GH_SUBMODULES+=jnr jffi refs/tags/jffi-1.3.10 jffi

4. When using EXTRACT_ONLY, the name to use is more complicated now,
but can be identified with `$ make show=CHECKSUMFILES`. Here is what I
had to change it to with print/lilypond:
-EXTRACT_ONLY=  ${DISTNAME}.tar.gz urw-base35-fonts-${URW_V}.tar.gz
+EXTRACT_ONLY=  ${DISTNAME}.tar.gz \
+   gh-submodules/ArtifexSoftware-urw-base35-fonts-${URW_V}.tar.gz

The full table of what I tested, and whether each port still
packages, is here: https://thfr.info/tmp/github-submodule-ports.txt

I also tested megaglest as a port that uses GH_ACCOUNT etc without using any
submodules. Tested also with dxx-rebirth as an example of a port not
using any GH_*.

This draft hijacks MASTER_SITES8 for the module's purposes - not
MASTER_SITES9, as I noticed that is used by some modules like cargo and
go. I grep'd through the ports tree and couldn't identify any explicit
use of MASTER_SITES8.

I think this is ready for more general testing:
$ cd /usr/ports/infrastructure/mk; patch < /path/to/github-submodules.diff
Index: bsd.port.mk
===
RCS file: /cvs/ports/infrastructure/mk/bsd.port.mk,v
retrieving revision 1.1595
diff -u -p -r1.1595 bsd.port.mk
--- bsd.port.mk 8 Jul 2023 10:20:16 -   1.1595
+++ bsd.port.mk 7 Aug 2023 16:42:28 -
@@ -136,8 +136,8 @@ _ALL_VARIABLES += BROKEN COMES_WITH \
CONFIGURE_STYLE USE_LIBTOOL SEPARATE_BUILD \
SHARED_LIBS TARGETS PSEUDO_FLAVOR \
AUTOCONF_VERSION AUTOMAKE_VERSION CONFIGURE_ARGS \
-   GH_ACCOUNT GH_COMMIT GH_PROJECT GH_TAGNAME MAKEFILE_LIST \
-   USE_LLD USE_NOEXECONLY USE_WXNEEDED USE_NOBTCFI \
+   GH_ACCOUNT GH_COMMIT GH_PROJECT GH_SUBMODULES GH_TAGNAME \
+   MAKEFILE_LIST USE_LLD USE_NOEXECONLY USE_WXNEEDED USE_NOBTCFI \
COMPILER COMPILER_LANGS COMPILER_LINKS \
SUBST_VARS UPDATE_PLIST_ARGS \
PKGPATHS DEBUG_PACKAGES DEBUG_CONFIGURE_ARGS \
@@ -313,6 +313,10 @@ MODULES += ${_i}
 .  endif
 .endfor
 
+.if defined(GH_ACCOUNT) || defined(GH_SUBMODULES)
+MODULES += github
+.endif
+
 MODULES ?=
 .if !empty(MODULES) || !empty(COMPILER)
 _MODULES_DONE =
@@ -611,18 +615,6 @@ BUILD_DEPENDS += textproc/groff>=1.21
 _PKG_ARGS += -DUSE_GROFF=1
 .endif
 
-# github related variables
-GH_TAGNAME ?=
-GH_COMMIT ?=
-GH_ACCOUNT ?=
-GH_PROJECT ?=
-
-.if !empty(GH_PROJECT) && !empty(GH_TAGNAME)
-_GH_TAG_DIST = ${GH_TAGNAME:C/^(v|V|ver|[Rr]el|[Rr]elease)[-._]?([0-9])/\2/}
-DISTNAME ?=   ${GH_PROJECT}-${_GH_TAG_DIST}
-GH_DISTFILE = ${GH_PROJECT}-${_GH_TAG_DIST}${EXTRACT_SUFX}
-.endif
-
 PKGNAME ?= ${DISTNAME}
 FULLPKGNAME ?= ${PKGNAME}${FLAVOR_EXT}
 _MASTER ?=
@@ -871,11 +863,7 @@ _WRKDIRS = ${WRKOBJDIR_${PKGPATH}}/${_WR
 _WRKDIRS += ${WRKOBJDIR}/${_WRKDIR_STEM}
 _WRKDIRS += ${WRKOBJ

Re: standardize and simplify GitHub submodule handling in ports?

2023-08-06 Thread Marc Espie
On Sat, Aug 05, 2023 at 09:50:57PM -0400, Thomas Frohwein wrote:
> On Sat, Aug 05, 2023 at 11:27:24PM +0200, Marc Espie wrote:
> > Some comments already. I haven't looked very closely.
> 
> > On Sat, Aug 05, 2023 at 03:12:18PM -0400, Thomas Frohwein wrote:
> > > The current draft hijacks post-extract target, but it would be easy to
> > > add this to _post-extract-finalize in bsd.port.mk similar to how the
> > > post-extract commands from modules are handled, if this is of interest.
> > 
> > Please do that.
> 
> Thanks, I updated the draft. Realized that including it with
> MODULES=github is easiest and then this can just use
> MODGITHUB_post-extract and doesn't need any custom code in bsd.port.mk.
> I had a thinko in post-extract (needs to be '||', not '&&') which is
> also corrected.
> 
> > > # where submodule distfiles will be stored
> > > GHSM_DIST_SUBDIR ?=   gh-submodules
> > 
> > Please keep to the GH_* subspace.
> > 
> > Already, modules usually use MOD* variable names, but in the GH case, that
> > would be a bit long.
> 
> I renamed GHSM_* to GH_*. I wouldn't mind using MODGH_* if that's an
> option, but MODGITHUB_* would be pretty unwieldy, especially if we were
> to make the existing GH_{ACCOUNT,PROJECT,TAGNAME} etc. part of this.
> 
> [...]
> 
> > Please do a single loop.  That's slightly more readable for me.
> 
> yes, done.
> 
> [...]
> 
> > Also please draft a diff for port-modules(5)
> 
> I'm attaching a diff for port-modules.5, along with the updated
> github.port.mk.
> 
> > I'm also wondering if keeping the main GH_* stuff in bsd.port.mk makes a lot
> > of sense, instead of grouping everything in github.port.mk
> 
> I'm for it, maybe as a second step after the module for just the
> submodule handling is done because there would be a lot of ports churn
> with moving the main GH_* stuff out of bsd.port.mk.

Probably not. We have a few "big" modules that don't depend on explicitly
adding to modules, but on some variable triggering it (historic imake stuff
or configure), having github.port.mk brought in from one of the GH_* variables
(probably don't need to test them all) would be acceptable.



Re: standardize and simplify GitHub submodule handling in ports?

2023-08-05 Thread Thomas Frohwein
On Sat, Aug 05, 2023 at 11:27:24PM +0200, Marc Espie wrote:
> Some comments already. I haven't looked very closely.

> On Sat, Aug 05, 2023 at 03:12:18PM -0400, Thomas Frohwein wrote:
> > The current draft hijacks post-extract target, but it would be easy to
> > add this to _post-extract-finalize in bsd.port.mk similar to how the
> > post-extract commands from modules are handled, if this is of interest.
> 
> Please do that.

Thanks, I updated the draft. Realized that including it with
MODULES=github is easiest and then this can just use
MODGITHUB_post-extract and doesn't need any custom code in bsd.port.mk.
I had a thinko in post-extract (needs to be '||', not '&&') which is
also corrected.

> > # where submodule distfiles will be stored
> > GHSM_DIST_SUBDIR ?= gh-submodules
> 
> Please keep to the GH_* subspace.
> 
> Already, modules usually use MOD* variable names, but in the GH case, that
> would be a bit long.

I renamed GHSM_* to GH_*. I wouldn't mind using MODGH_* if that's an
option, but MODGITHUB_* would be pretty unwieldy, especially if we were
to make the existing GH_{ACCOUNT,PROJECT,TAGNAME} etc. part of this.

[...]

> Please do a single loop.  That's slightly more readable for me.

yes, done.

[...]

> Also please draft a diff for port-modules(5)

I'm attaching a diff for port-modules.5, along with the updated
github.port.mk.

> I'm also wondering if keeping the main GH_* stuff in bsd.port.mk makes a lot
> of sense, instead of grouping everything in github.port.mk

I'm for it, maybe as a second step after the module for just the
submodule handling is done because there would be a lot of ports churn
with moving the main GH_* stuff out of bsd.port.mk.
# List of static dependencies. The format is:
# account project tag_or_commit target_dir # license
# Example:
# GH_SUBMODULES +=  moonlight-stream moonlight-common-c \
#   c9426a6a71c4162e65dde8c0c71a25f1dbca46ba \
#   third-party/moonlight-common-c # GPL-v3.0+
GH_SUBMODULES ?=

# Master site for github tarballs
GH_MASTER_SITES ?=  https://github.com/

# where submodule distfiles will be stored
GH_DIST_SUBDIR ?=   gh-submodules

# where submodules will be extracted to
GH_WRKSRC ?=	${WRKSRC}

# Grab submodules by default with MASTER_SITES8. (Don't use 9 to avoid collision
# with language-specific mechanisms, like devel/cargo or lang/go.)
GH_MASTER_SITESN ?= 8
MASTER_SITES${GH_MASTER_SITESN} ?=  ${GH_MASTER_SITES}

# Default GitHub distfile suffix
GH_SUFX ?=  .tar.gz

.if defined(DISTNAME)
DISTFILES ?= ${DISTNAME}${EXTRACT_SUFX}
.elif !empty(GH_ACCOUNT) && !empty(GH_PROJECT)
DISTFILES ?= ${GH_DISTFILE}
.endif

# post-extract target for moving the submodules to GH_WRKSRC
MODGITHUB_post-extract = \
@${ECHO_MSG} "moving GitHub submodules to ${GH_WRKSRC}" ; \
mkdir -p ${GH_WRKSRC} ;

.for _ghaccount _ghproject _ghtagcommit _targetdir in ${GH_SUBMODULES}
DISTFILES +=	${GH_DIST_SUBDIR}/{}${_ghaccount}/${_ghproject}/archive/${_ghtagcommit}${GH_SUFX}:${GH_MASTER_SITESN}
MODGITHUB_post-extract += \
test -d ${GH_WRKSRC}/${_targetdir} && \
rm -rf ${GH_WRKSRC}/${_targetdir} ; \
mv ${WRKDIR}/${_ghproject}-${_ghtagcommit} ${GH_WRKSRC}/${_targetdir} ;
.endfor
Index: port-modules.5
===
RCS file: /cvs/src/share/man/man5/port-modules.5,v
retrieving revision 1.264
diff -u -p -r1.264 port-modules.5
--- port-modules.5  9 May 2023 19:44:06 -   1.264
+++ port-modules.5  6 Aug 2023 01:48:22 -
@@ -744,6 +744,15 @@ contains c++, this module provides
 .Ev MODGCC4_CPPLIBDEP
 and
 .Ev MODGCC4_CPPWANTLIB .
+.It github
+Set
+.Li GH_SUBMODULES += w x y z
+for each GitHub submodule to be used in the port, specifying GitHub account, project, commit hash, and the target directory relative to
+.Ev GH_WRKSRC
+.Po
+defaults to
+.Ev WRKSRC
+.Pc .
 .It gnu
 This module is documented in the main
 .Xr bsd.port.mk 5


Re: standardize and simplify GitHub submodule handling in ports?

2023-08-05 Thread Marc Espie
Some comments already. I haven't looked very closely.

On Sat, Aug 05, 2023 at 03:12:18PM -0400, Thomas Frohwein wrote:
> The current draft hijacks post-extract target, but it would be easy to
> add this to _post-extract-finalize in bsd.port.mk similar to how the
> post-extract commands from modules are handled, if this is of interest.

Please do that.

> # where submodule distfiles will be stored
> GHSM_DIST_SUBDIR ?=   gh-submodules

Please keep to the GH_* subspace.

Already, modules usually use MOD* variable names, but in the GH case, that
would be a bit long.

> .for _ghaccount _ghproject _ghtagcommit _targetdir in ${GH_SUBMODULES}
> DISTFILES +=  ${GHSM_DIST_SUBDIR}/{}${_ghaccount}/${_ghproject}/archive/${_ghtagcommit}${GH_SUFX}:${GHSM_MASTER_SITESN}
> .endfor
> 
> # post-extract target for moving the submodules to the target directories
> GHSM_post-extract =
> .for _ghaccount _ghproject _ghtagcommit _targetdir in ${GH_SUBMODULES}
> GHSM_post-extract += \
>   test -d ${GHSM_WRKSR}/${_targetdir} || rm -rf ${GHSM_WRKSRC}/${_targetdir} ; \
That line is weird.
>   mv ${WRKDIR}/${_ghproject}-${_ghtagcommit} ${GHSM_WRKSRC}/${_targetdir} ;
> .endfor

Please do a single loop.  That's slightly more readable for me.

> # XXX: would best belong in _post-extract-finalize in bsd.port.mk rather than
> #  hijacking post-extract here
> post-extract:
>   @${ECHO_MSG} "moving GitHub submodules to ${GHSM_WRKSRC}" ;
>   mkdir -p ${GHSM_WRKSRC} ;
>   ${GHSM_post-extract}


Also please draft a diff for port-modules(5)


I'm also wondering if keeping the main GH_* stuff in bsd.port.mk makes a lot
of sense, instead of grouping everything in github.port.mk



standardize and simplify GitHub submodule handling in ports?

2023-08-05 Thread Thomas Frohwein
Hi,

GitHub projects using submodules seem to be a common enough case that
I'm wondering if it would be helpful to simplify and standardize it. I
got the idea when I saw how the modules for go and cargo pull in their
modules or crates, respectively.

The current state is that projects with submodules are either manually
packaged and hosted, or a combination of MASTER_SITES0 through 9,
manual DISTFILES setting, and post-extract movement is used. What both
have in common is at least some maintenance burden and a significant
barrier, especially for those newer to the ports system.

The diff below proposes a module github.port.mk that aims to simplify
this process. At its core, the main way to manage submodules is to add
a line like the following:

GH_SUBMODULES+= luvit luv 093a977b82077591baefe1e880d37dfa2730bd54 \
luv # Apache-2.0

This way the GitHub tarball from the commit 093a977b from the project
luv by account luvit is added to the distfiles, and then the extracted
files are moved to ${GHSM_WRKSRC}/luv (GHSM_WRKSRC defaults to
${WRKSRC}). I saw license comments used with MODCARGO_CRATES and think
this would also be a good location for the submodule licenses.

This way, the multi-step setup and maintenance in MASTER_SITESX,
DISTFILES, and post-extract is reduced to essentially one location.
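
To make that concrete: with the defaults in the attached github.port.mk
(GHSM_DIST_SUBDIR = gh-submodules, GHSM_MASTER_SITESN = 8, GH_SUFX =
.tar.gz), the luv line above should expand to roughly the following. This
was written out by hand from the loops in the module and is not generated
output.

```make
# The :8 suffix selects MASTER_SITES8, which defaults to https://github.com/
DISTFILES +=	gh-submodules/{}luvit/luv/archive/093a977b82077591baefe1e880d37dfa2730bd54.tar.gz:8

# After extraction, the tarball's top-level directory is moved into place.
post-extract:
	mv ${WRKDIR}/luv-093a977b82077591baefe1e880d37dfa2730bd54 ${GHSM_WRKSRC}/luv
```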

The current draft hijacks post-extract target, but it would be easy to
add this to _post-extract-finalize in bsd.port.mk similar to how the
post-extract commands from modules are handled, if this is of interest.

I'm attaching the github.port.mk file, as well as 3 diffs to show how
the simplified Makefiles look with this. A quick grep through the ports
tree shows there are at least a couple of dozen ports that could
benefit from a rework...
# List of static dependencies. The format is:
# account project tag_or_commit target_dir # license
# Example:
# GH_SUBMODULES +=  moonlight-stream moonlight-common-c \
#   c9426a6a71c4162e65dde8c0c71a25f1dbca46ba \
#   third-party/moonlight-common-c # GPL-v3.0+
GH_SUBMODULES ?=

# Master site for github tarballs
GH_MASTER_SITES ?=  https://github.com/

# where submodule distfiles will be stored
GHSM_DIST_SUBDIR ?= gh-submodules

# where submodules will be extracted to
GHSM_WRKSRC ?=  ${WRKSRC}

# Grab submodules by default with MASTER_SITES8. (Don't use 9 to avoid collision
# with language-specific mechanisms, like devel/cargo or lang/go.)
GHSM_MASTER_SITESN ?=   8
MASTER_SITES${GHSM_MASTER_SITESN} ?=	${GH_MASTER_SITES}

# Default GitHub distfile suffix
GH_SUFX ?=  .tar.gz

.if defined(DISTNAME)
DISTFILES ?= ${DISTNAME}${EXTRACT_SUFX}
.elif !empty(GH_ACCOUNT) && !empty(GH_PROJECT)
DISTFILES ?= ${GH_DISTFILE}
.endif

.for _ghaccount _ghproject _ghtagcommit _targetdir in ${GH_SUBMODULES}
DISTFILES +=	${GHSM_DIST_SUBDIR}/{}${_ghaccount}/${_ghproject}/archive/${_ghtagcommit}${GH_SUFX}:${GHSM_MASTER_SITESN}
.endfor

# post-extract target for moving the submodules to the target directories
GHSM_post-extract =
.for _ghaccount _ghproject _ghtagcommit _targetdir in ${GH_SUBMODULES}
GHSM_post-extract += \
test -d ${GHSM_WRKSR}/${_targetdir} || rm -rf ${GHSM_WRKSRC}/${_targetdir} ; \
mv ${WRKDIR}/${_ghproject}-${_ghtagcommit} ${GHSM_WRKSRC}/${_targetdir} ;
.endfor

# XXX: would best belong in _post-extract-finalize in bsd.port.mk rather than
#  hijacking post-extract here
post-extract:
@${ECHO_MSG} "moving GitHub submodules to ${GHSM_WRKSRC}" ;
mkdir -p ${GHSM_WRKSRC} ;
${GHSM_post-extract}
Index: Makefile
===
RCS file: /cvs/ports/editors/neovim/Makefile,v
retrieving revision 1.37
diff -u -p -r1.37 Makefile
--- Makefile	14 Jun 2023 07:47:57 -	1.37
+++ Makefile	5 Aug 2023 19:06:09 -
@@ -23,22 +23,20 @@ CATEGORIES =editors devel
 HOMEPAGE = https://neovim.io
 MAINTAINER =   Edd Barrett 
 
+# Move static deps source code under WRKDIST so that they can be patched.
+STATIC_DEPS_WRKSRC =   ${WRKDIST}/static-deps
+GHSM_WRKSRC =  ${STATIC_DEPS_WRKSRC}
+
 # The versions listed here must match those in cmake.deps/CMakeLists.txt.
-LUV_VER =  093a977b82077591baefe1e880d37dfa2730bd54
-LUAJIT_VER =   505e2c03de35e2718eef0d2d3660712e06dadf1f
-LUACOMPAT_VER =v0.9
-
-MASTER_SITES0 =https://github.com/luvit/luv/archive/
-MASTER_SITES1 = https://github.com/LuaJIT/LuaJIT/archive/
-MASTER_SITES2 = https://github.com/keplerproject/lua-compat-5.3/archive/
-DISTFILES =${DISTNAME}${EXTRACT_SUFX} \
-   luv-{}${LUV_VER}${EXTRACT_SUFX}:0 \
-   luajit-{}${LUAJIT_VER}${EXTRACT_SUFX}:1 \
-   lua-compat-5.3-{}${LUACOMPAT_VER}${EXTRACT_SUFX}:2
-
-# Neovim: Apache 2.0 + Vim License
-# LuaJIT: MIT + public domain
-# libluv: Apache 2.0
+GH_SUBMODULES+=luvit luv 093a977b82077591baefe1e880d37dfa2730bd54 \
+   luv # Apa