Re: Submodule, subtree, or something else?
On Sun, Aug 23, 2015 at 7:11 AM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
> On Fri, 2015-08-21 at 17:07 -0700, Stefan Beller wrote:
>> On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
>>> A major drawback of submodules in my opinion is the inability to make
>>> a full clone from an existing one without having access to the central
>>> repository, which is something I have to do from time to time.
>>
>> Can you elaborate on that a bit more?
>> git clone --recurse-submodules should do that no matter which remote
>> you contact?
>
> I mean that if I have cloned a repository with submodules, cloning that
> repository with --recurse-submodules will either access the central
> server if absolute URLs are used, or requires additional clones for
> each submodule. For example
>
>     git clone --recursive http://somewhere/projectA.git
>     git clone --recursive file://$(pwd)/projectA projectA.tmp
>
> The second command will cause the submodules to be downloaded again, or
> expect them to be found in $(pwd).

IIUC, the second command will look up the submodules in $(pwd), but if
they are not there they are skipped, so all of the existing submodules
are cloned. Why do you need more submodules in the tmp clone than in
$(pwd)/projectA would be my next question. But I see your point now.

> Or am I mistaken, or doing something wrong?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Submodule, subtree, or something else?
On Mon, 2015-08-24 at 09:51 -0700, Stefan Beller wrote:
> IIUC, the second command will look up the submodules in $(pwd), but if
> they are not there they are skipped, so all of the existing submodules
> are cloned. Why do you need more submodules in the tmp clone than in
> $(pwd)/projectA would be my next question. But I see your point now.

The $(pwd) was just an example to illustrate my point. The actual use
case is that I would be hacking on something at work, notice that it is
already late and I have to catch the last bus home, yet I don't want to
postpone whatever I was working on until the next day. So I would do

    git commit -a -m "[WIP] Stuff, finish at home"

to save my work so far, go home, and clone / fetch it over ssh.

Another important factor is that a lot of our code can be meaningfully
tested only on the actual hardware, and is built in a VM. Quite often
getting things right involves many iterations of hack hack hack,
git commit --amend, fetch and reset --hard in the VM, build, test,
repeat. Being able to clone / fetch directly from the copy I am working
on makes it a lot easier.

As I wrote in the other e-mail, I managed to achieve the desired result
by using ./submodule (without the .git suffix) as the submodule URL, and
creating a file named submodule in the bare repo with
'gitdir: ../submodule.git' as its contents, but I'm not sure whether it
is a good idea or not.

Jānis
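For readers who want to picture the workaround, here is a minimal sketch of the two files it relies on. It only lays the files out and does not run git; the temporary directory and the names main.git, submodule.git and gitmodules.example are hypothetical stand-ins, not taken from the thread.

```shell
#!/bin/sh
# Sketch only: create the two files the workaround depends on.
# All paths are illustrative assumptions; no git command is executed.
set -e
root=$(mktemp -d)
mkdir -p "$root/main.git" "$root/submodule.git"   # stand-ins for the bare repos

# 1. The entry as it would appear in the superproject's .gitmodules:
#    a relative URL with no .git suffix.
cat > "$root/gitmodules.example" <<'EOF'
[submodule "submodule"]
    path = submodule
    url = ./submodule
EOF

# 2. A plain file named "submodule" inside the bare superproject repo,
#    pointing at the real bare submodule repo that sits next to it.
printf 'gitdir: ../submodule.git\n' > "$root/main.git/submodule"

cat "$root/main.git/submodule"
```

Whether relying on such a pointer file inside a bare repository is advisable is exactly the open question raised in the message above.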
Re: Submodule, subtree, or something else?
On Sun, 2015-08-23 at 17:13 -0600, Cox, Michael wrote:
> You might want to take a look at how the Boost (boost.org) project uses
> submodules. They use submodules for each library. I know they use
> relative paths in their .gitmodules file to avoid the problem you're
> referring to regarding git clone --recurse-submodules.

Thanks! I had a look at their setup, and they are using ../libx.git for
submodules, which unfortunately breaks when cloning from another working
copy:

    $ git clone --recursive file:///tmp/gittest/repo.a/main.git main.work
    Cloning into 'main.work'...
    <snip>
    Submodule 'liba' (file:///tmp/gittest/repo.a/liba.git) registered for path 'liba'
    Cloning into 'liba'...
    <snip>
    Submodule path 'liba': checked out '6a0ef37c03a7068328956dcb8a08bc39f280edfc'

    $ git clone --recursive file://$(pwd)/main.work main.home
    Cloning into 'main.home'...
    <snip>
    Submodule 'liba' (file:///tmp/gittest/work/liba.git) registered for path 'liba'
    Cloning into 'liba'...
    fatal: '/tmp/gittest/work/liba.git' does not appear to be a git repository
    fatal: Could not read from remote repository.

    Please make sure you have the correct access rights
    and the repository exists.
    Clone of 'file:///tmp/gittest/work/liba.git' into submodule path 'liba' failed

After some trial and error I managed to get what I wanted to achieve by
using ./liba as the submodule URL (no .git suffix!), and creating a file
named liba in /tmp/gittest/repo.a/main.git (i.e. the bare origin repo)
with a single line in it:

    gitdir: ../liba.git

However, I'm not sure it is the right thing, or even advisable to do so.
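The failure above follows from how a relative submodule URL is resolved: it is taken relative to the URL of the superproject's default remote, treating that URL like a directory. The following is a simplified sketch of that rule, not git's actual implementation; resolve_submodule_url is a hypothetical helper written only for illustration.

```shell
#!/bin/sh
# Simplified sketch of relative submodule URL resolution.
# resolve_submodule_url is a hypothetical helper, not part of git.
resolve_submodule_url() {
    base=${1%/}   # URL of the superproject's remote, trailing slash dropped
    rel=$2        # relative URL as written in .gitmodules
    while :; do
        case $rel in
            ../*) base=${base%/*}; rel=${rel#../} ;;  # go up one path component
            ./*)  rel=${rel#./} ;;                    # stay inside the remote URL
            *)    break ;;
        esac
    done
    printf '%s/%s\n' "$base" "$rel"
}

# Cloned from the bare origin: ../liba.git lands next to main.git.
resolve_submodule_url file:///tmp/gittest/repo.a/main.git ../liba.git
# Cloned from the working copy: ../liba.git points at a path that does not exist.
resolve_submodule_url file:///tmp/gittest/work/main.work ../liba.git
# With ./liba the URL stays inside the clone, where the submodule checkout lives.
resolve_submodule_url file:///tmp/gittest/work/main.work ./liba
```

Under this rule the ./liba form resolves to main.work/liba when cloning from a working copy, and to main.git/liba when cloning from the bare origin, which is why the extra liba pointer file is needed there.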
Re: Submodule, subtree, or something else?
On Fri, 2015-08-21 at 17:07 -0700, Stefan Beller wrote:
> On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
>> A major drawback of submodules in my opinion is the inability to make
>> a full clone from an existing one without having access to the central
>> repository, which is something I have to do from time to time.
>
> Can you elaborate on that a bit more?
> git clone --recurse-submodules should do that no matter which remote
> you contact?

I mean that if I have cloned a repository with submodules, cloning that
repository with --recurse-submodules will either access the central
server if absolute URLs are used, or requires additional clones for each
submodule. For example

    git clone --recursive http://somewhere/projectA.git
    git clone --recursive file://$(pwd)/projectA projectA.tmp

The second command will cause the submodules to be downloaded again, or
expect them to be found in $(pwd).

Or am I mistaken, or doing something wrong?
Submodule, subtree, or something else?
Hello,

First of all, I apologise for the wall of text that follows; obviously I
am bad at this.

My $DAYJOB is switching from Subversion to Git, primarily because of its
distributed nature (we are scattered all across the globe), and the ease
of branching and merging. One issue that has popped up is how to manage
code shared between multiple projects.

Our SVN setup used a shared repository for all projects, either using
externals for shared code, or, more often than not, simply merging the
code between projects as needed. Ignoring the fact that merging with SVN
is somewhat cumbersome, overall it has worked quite well for us,
especially when combined with git-svn.

For external libraries that rarely change, submodules appear to be the
obvious choice when using Git. On the other hand, I've found them
somewhat cumbersome to use, and subtree merging (either using git
subtree, or directly with git merge -s subtree) is closer to what we
were doing in SVN. A major drawback of submodules in my opinion is the
inability to make a full clone from an existing one without having
access to the central repository, which is something I have to do from
time to time.

For internal libraries, the situation is even less clear. For many of
these libraries, most of the development happens within the context of a
single project, with commits to the main project being interleaved with
commits to the subproject(s), resulting in histories resembling:

(using git submodule)

    A---B---S1---S2---C---S3
    ,´ ,´ ,´
    N---OPQ---R

(using git subtree with --rejoin)

    A---B---N---O---M1---M2---Q---C---R---M3
    ///
    N'--O'---PQ'--R'

(using merge -s subtree)

    A---B---M1---M2---C---M3
    ///
    N---OPQ---R

where A, B and C are changes to the main project, N, O, P, Q and R are
changes to library code, and Sn and Mn are submodule updates and merge
commits, respectively.

From what I have gathered, submodules have issues with branching and
merging, therefore, unless I'm mistaken, submodules are kind of out of
the question. Of the remaining two options, merging directly results in
a nicer history, but requires making all changes to the library repo
first (although I am quite sure that a similar effect can be achieved
with plumbing, similarly to how git subtree split works), and is harder
to use than git subtree. Also, all three options can result in the main
project history being cluttered with extra commits.

Lastly, there is a particularly painful 3rd party library that has:

  - an enormous amount of local modifications that are never going to
    make it upstream, essentially making it a fork,
  - project specific changes that are required for one project, but
    would break others,
  - separate language bindings that access the internals (often
    requiring bug fixes to be made simultaneously to both),
  - and, if that wasn't enough, it *requires* several source files to be
    modified for each individual project that uses it.

It's a complete mess, but we're stuck with it for the existing projects,
as switching to an alternative would be too time consuming.

To sum up, I'm looking for something that would let us share code
between multiple projects, and allow for:

  1) separate histories with relatively easy branching and merging
  2) distributed workflow without having to set up multiple repositories
     everywhere (e.g. work - home - laptop)
  3) working on the shared code within a project using it
  4) inspection of the complete history
  5) modifications that are not shared with other projects

and would not result in lots of clutter in the history. Repository size
is somewhat less of an issue, because each submodule has to be checked
out anyway.

Submodules let you have #3, and #1, #2 and #5 to a point, after which it
becomes a pain. git subtree allows #1, #2, #3 and #4, and #5 with some
pain (?), but results in duplicate commits. Using the subtree merge
strategy directly gives everything except #3, but is harder to use than
submodules or subtree.

Are there any other options besides these three for sharing (or in some
cases, not sharing) common code between projects using Git, that would
address the above points better? Or, alternatively, ways to work around
the drawbacks of the existing tools?

Lastly, I will be grateful for any suggestions about how to handle the
messy case described above better.

Thanks,
Jānis
Re: Submodule, subtree, or something else?
On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
> Hello,
>
> First of all, I apologise for the wall of text that follows; obviously
> I am bad at this.
>
> My $DAYJOB is switching from Subversion to Git, primarily because of
> its distributed nature (we are scattered all across the globe), and the
> ease of branching and merging. One issue that has popped up is how to
> manage code shared between multiple projects.
>
> Our SVN setup used a shared repository for all projects, either using
> externals for shared code, or, more often than not, simply merging the
> code between projects as needed. Ignoring the fact that merging with
> SVN is somewhat cumbersome, overall it has worked quite well for us,
> especially when combined with git-svn.
>
> For external libraries that rarely change, submodules appear to be the
> obvious choice when using Git. On the other hand, I've found them
> somewhat cumbersome to use, and subtree merging (either using git
> subtree, or directly with git merge -s subtree) is closer to what we
> were doing in SVN. A major drawback of submodules in my opinion is the
> inability to make a full clone from an existing one without having
> access to the central repository, which is something I have to do from
> time to time.

Can you elaborate on that a bit more?
git clone --recurse-submodules should do that no matter which remote
you contact?

> For internal libraries, the situation is even less clear.