Example: on http://datasets.datalad.org we have a few hundred datasets
organized into a hierarchy as git submodules. Each git submodule carries its
own .git/ directory, so each one is "self-sufficient": we can readily assess
their sizes and "cut the tree" at any level without looking for the
superproject somewhere high up in the tree.
.gitmodules typically has relative values for the url and path of the
submodules, a form which I think we chose because it used to work with
git clone --recursive (I could be utterly wrong, but I think it was done in
an informed fashion):
$> curl http://datasets.datalad.org/labs/gobbini/famface/.gitmodules
[submodule "data"]
path = data
url = ./data
and possibly urls pointing outside:
$> curl http://datasets.datalad.org/labs/gobbini/famface/data/.gitmodules
[submodule "scripts/mridefacer"]
path = scripts/mridefacer
url = https://github.com/yarikoptic/mridefacer
But unfortunately git doesn't seem to consider this (valid AFAIK) situation
while cloning: the clone url has to have the .git suffix, but the repository is
not bare, and the relative "data" path (or "./data" url) refers to the worktree.
$> git clone --recursive http://datasets.datalad.org/labs/gobbini/famface/.git
Cloning into 'famface'...
remote: Counting objects: 61, done.
remote: Compressing objects: 100% (54/54), done.
remote: Total 61 (delta 14), reused 0 (delta 0)
Unpacking objects: 100% (61/61), done.
Submodule 'data' (http://datasets.datalad.org/labs/gobbini/famface/.git/data) registered for path 'data'
Cloning into '/tmp/famface/data'...
fatal: repository 'http://datasets.datalad.org/labs/gobbini/famface/.git/data/' not found
fatal: clone of 'http://datasets.datalad.org/labs/gobbini/famface/.git/data' into submodule path '/tmp/famface/data' failed
Failed to clone 'data'. Retry scheduled
Cloning into '/tmp/famface/data'...
fatal: repository 'http://datasets.datalad.org/labs/gobbini/famface/.git/data/' not found
fatal: clone of 'http://datasets.datalad.org/labs/gobbini/famface/.git/data' into submodule path '/tmp/famface/data' failed
Failed to clone 'data' a second time, aborting
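
The submodule url in the output above is consistent with the relative
"./data" being appended to the url the superproject was cloned from. Here is a
minimal sketch of that resolution, only to illustrate the observed behavior.
It is a simplified approximation and not git's actual code; the function name
resolve_relative_url is hypothetical.

```python
def resolve_relative_url(remote_url: str, sub_url: str) -> str:
    """Approximate how a relative submodule url is joined to the remote url.

    Hypothetical helper: a simplified model of the behavior seen in the
    clone output, not a copy of git's implementation.
    """
    # Anything that does not start with ./ or ../ is treated as absolute.
    if not sub_url.startswith(("./", "../")):
        return sub_url

    base = remote_url.rstrip("/")
    rest = sub_url
    while True:
        if rest.startswith("./"):
            # "./" keeps the current base
            rest = rest[2:]
        elif rest.startswith("../"):
            # "../" drops the last component of the base
            base = base.rsplit("/", 1)[0]
            rest = rest[3:]
        else:
            break
    return f"{base}/{rest}"


remote = "http://datasets.datalad.org/labs/gobbini/famface/.git"
print(resolve_relative_url(remote, "./data"))
print(resolve_relative_url(remote, "https://github.com/yarikoptic/mridefacer"))
```

With the remote url ending in /.git, "./data" ends up under .git/, which is
the url git reports as not found. Under the same simplified model "../data"
would resolve to .../famface/data, but I have not verified that this form
works against this server.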
On the server I use the "smart HTTP" git backend, but I am not sure that it is
the one to blame, since I do not see in the logs any attempt to fetch data/
from anywhere other than under .git/:
10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "GET /labs/gobbini/famface/.git/info/refs?service=git-upload-pack HTTP/1.1" 200 681 "-" "git/2.20.0.rc2.403.gdbc3b29805"
10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "POST /labs/gobbini/famface/.git/git-upload-pack HTTP/1.1" 200 69276 "-" "git/2.20.0.rc2.403.gdbc3b29805"
==> datasets.datalad.org-error.log <==
[Thu Dec 13 12:18:38.673447 2018] [core:info] [pid 7570:tid 140683541153536] [client 10.31.188.88:32794] AH00128: File does not exist: /srv/datasets.datalad.org/www/labs/gobbini/famface/.git/data/info/refs
==> datasets.datalad.org-access-comb.log <==
10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "GET /labs/gobbini/famface/.git/data/info/refs?service=git-upload-pack HTTP/1.1" 404 485 "-" "git/2.20.0.rc2.403.gdbc3b29805"
==> datasets.datalad.org-error.log <==
[Thu Dec 13 12:18:38.689277 2018] [core:info] [pid 7572:tid 140683574724352] [client 10.31.188.88:32796] AH00128: File does not exist: /srv/datasets.datalad.org/www/labs/gobbini/famface/.git/data/info/refs
==> datasets.datalad.org-access-comb.log <==
10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "GET /labs/gobbini/famface/.git/data/info/refs?service=git-upload-pack HTTP/1.1" 404 485 "-" "git/2.20.0.rc2.403.gdbc3b29805"
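
To double check the claim that every request stays under .git/, here is a
small sketch that pulls the method, path and status out of the access log
entries quoted above. The regular expression and the names LOG_LINE,
SAMPLE_LOG and requested_paths are my own assumptions for illustration; the
sample contains only the distinct access log entries from this message, each
joined onto one line.

```python
import re

# Matches the request part of a combined-format access log entry,
# e.g. "GET /some/path?query HTTP/1.1" 200
LOG_LINE = re.compile(
    r'"(?P<method>GET|POST) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)

SAMPLE_LOG = "\n".join([
    '10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "GET '
    '/labs/gobbini/famface/.git/info/refs?service=git-upload-pack HTTP/1.1" '
    '200 681 "-" "git/2.20.0.rc2.403.gdbc3b29805"',
    '10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "POST '
    '/labs/gobbini/famface/.git/git-upload-pack HTTP/1.1" '
    '200 69276 "-" "git/2.20.0.rc2.403.gdbc3b29805"',
    '10.31.188.88 - - [13/Dec/2018:12:18:38 -0500] "GET '
    '/labs/gobbini/famface/.git/data/info/refs?service=git-upload-pack HTTP/1.1" '
    '404 485 "-" "git/2.20.0.rc2.403.gdbc3b29805"',
])


def requested_paths(log_text):
    """Return (method, path without query string, status) per request."""
    return [
        (m["method"], m["path"].split("?")[0], int(m["status"]))
        for m in LOG_LINE.finditer(log_text)
    ]


for method, path, status in requested_paths(SAMPLE_LOG):
    print(status, method, path, "(under .git/)" if "/.git/" in path else "")
```

Every path it reports contains /.git/, and the only failing request is the
one for .git/data/info/refs.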
--
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik