Re: Git GSoC 2014
On Sat, Feb 15, 2014 at 7:17 PM, Thomas Rast t...@thomasrast.ch wrote:

> It also includes an ok from Nicolas Pitre, who has been the driving force behind packv5 development. The only thing that makes me uneasy is

Nit: pack v4. You are probably confused with index v5, which is also cooking for a while now.

> that Duy is not in the list (Duy, have you been asked by libgit2 about possible relicensing?).

I don't remember. But for the record I'm OK with relicensing my contributions in git.git for inclusion in libgit2.

-- Duy
-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git GSoC 2014
Thomas Rast t...@thomasrast.ch writes:

> David Kastrup d...@gnu.org writes:
>
>> This definitely should not be "we'll think about it if and when that project is finished" material.
>
> Yes, all of this is true. However, you are painting a big devil on the wall.
> [...]
> Your scenario above mostly applies if and when we really go the way of my dream and scrap the code that is in git.

So it's not a big devil I am painting but rather the consequences if everything goes according to plan.

> (I have similar long-term dreams for other git components like ref storage and diffs, but that's just me.)
>
> Second, how many contributions would actually have been prevented by GPLv2+LE licensing?

That's not as much the question as to how many will be prevented in future by such a step. The libgit2 community is different from that of Git and with a different focus. If you take a look at its front page, you'll see statements like "Link with open and proprietary software. No strings attached." and "Trusted and used in production by GitHub, Microsoft, [...]". A fusion with this project and its aims and licensing will have consequences regarding which developers and users are attracted to Git.

The "act first, think later" approach is not really doing anybody favors, and I don't consider it fair to GSoC students to employ them for making this proposal gain leverage: one should first think this through on behalf of the project before putting students to work on this so that they know reasonably well what will happen with their work.

The kind of "Now that we made $x do $y, we are obliged to do $z." scenario is easy to avoid by _first_ contemplating whether or not $z is where one wants to go. That's not painting a devil but common sense.

I'm not saying that the answer is in any way self-evident. Merely that the best time to answer it is _before_ getting invested.

> The only data I have on this is libgit2/git.git-authors, which records who has consented for their _existing_ code to be relicensed. I consider this to be a higher barrier than contributing new code, since there's no clear gain for the author in the relicensing.

I consider it a lower barrier since the work is already done, and the authors did not when doing it think about proprietary spinoffs. But that's a minor point. All I am saying is that there are different opinions possible, and picking a particular path for future development will in either way influence who wants to be part of the respective communities.

-- David Kastrup
Re: Git GSoC 2014
On Sat, Feb 15, 2014 at 4:17 AM, Thomas Rast t...@thomasrast.ch wrote:

> David Kastrup d...@gnu.org writes:
>
>> Thomas Rast t...@thomasrast.ch writes:
>>
>>> Motivation: I believe that migrating to libgit2 is the better approach, medium term, than rewriting everything ourselves to be nice, clean and thread-safe. I took a shot a while ago at making the pack reading code thread-safe, but it's adding mess when we could simply replace it all by the already thread-safe libgit2 calls. It also helps shake out incompatibilities in libgit2.
>>
>> That would either require forking libgit2 for Git use or stop dead any contributions to that rather central part of the git codebase from contributors not wanting their contributions to get reused in binary proprietary software.
>>
>> It would also mean that no serious forward-going work (like developing new packing formats or network protocols) can be done on a pure GPLv2 codebase any more. So anybody insisting on contributing work under the current Git license only would be locked out from working on significant parts of Git and could no longer propose changes in central parts.
>>
>> Now this can all be repealed by the "developing the atomic bomb does not mean that one has to use it" argument but even if one does not use it, the world with and without it are different worlds and occupy mindshare and suggest solutions and diplomacy involving it.
>>
>> So this is definitely a large step towards a situation where erosion of the existing license and related parts of the community becomes more attractive. There is the rationale "we can always say no at the end". How do you explain this "no" to the student who invested significant amounts of work into this, in a project proposed by the Git developers?
>>
>> This definitely should not be "we'll think about it if and when that project is finished" material.
>
> Yes, all of this is true. However, you are painting a big devil on the wall.
> ...
> Second, how many contributions would actually have been prevented by GPLv2+LE licensing?

Interesting data point, I helped get libgit2 started in the first few days of its existence and discussed the license on the mailing list. I eventually stopped contributing, partly because of the GPLv2+LE license it uses. :-)

I am not as interested in using the GPL for my work as David Kastrup is, but I wasn't really thrilled with GPLv2+LE.
Re: Git GSoC 2014
Thomas Rast t...@thomasrast.ch writes:

> Here's my moonshot:
>
> --- 8 ---
> Replace object loading/writing layer by libgit2
>
> Git reads objects from storage (loose and packed) through functions in sha1_file.c. Most commands only require very simple, opaque read and write access to the object storage. As a weatherballoon, show that it is feasible to use libgit2 git_odb_* routines for these simple callers.
>
> Aim for passing the git test suite using git_odb_* object storage access, except for tests that verify behavior in the face of storage corruption, replacement objects, alternate storage locations, and similar quirks. Of course it is even better if you pass the test suite without exception.
> [...]
>
> That absolutely requires a co-mentor from the libgit2 side to do, however. Perhaps you could talk someone into it? ;-)
>
> Motivation: I believe that migrating to libgit2 is the better approach, medium term, than rewriting everything ourselves to be nice, clean and thread-safe. I took a shot a while ago at making the pack reading code thread-safe, but it's adding mess when we could simply replace it all by the already thread-safe libgit2 calls. It also helps shake out incompatibilities in libgit2.

That would either require forking libgit2 for Git use or stop dead any contributions to that rather central part of the git codebase from contributors not wanting their contributions to get reused in binary proprietary software.

It would also mean that no serious forward-going work (like developing new packing formats or network protocols) can be done on a pure GPLv2 codebase any more. So anybody insisting on contributing work under the current Git license only would be locked out from working on significant parts of Git and could no longer propose changes in central parts.

Now this can all be repealed by the "developing the atomic bomb does not mean that one has to use it" argument but even if one does not use it, the world with and without it are different worlds and occupy mindshare and suggest solutions and diplomacy involving it.

So this is definitely a large step towards a situation where erosion of the existing license and related parts of the community becomes more attractive. There is the rationale "we can always say no at the end". How do you explain this "no" to the student who invested significant amounts of work into this, in a project proposed by the Git developers?

This definitely should not be "we'll think about it if and when that project is finished" material.

-- David Kastrup
Re: Git GSoC 2014
On Thu, Feb 13, 2014 at 06:17:17PM -0500, Ramkumar Ramachandra wrote:

> I'll throw in a few ideas from half-finished work.

Thanks. A few comments:

> 1. Speed up git-rebase--am.sh
>
> Currently, git-rebase--am.sh is really slow because it dumps each patch to a file using git-format-patch, and picks it up to apply subsequently using git-am. Find a way to speed this up, without sacrificing safety. You can use the continuation features of cherry-pick, and dump to file only to persist state in the case of a failure.

Isn't the merge backend faster? I thought that was the point of it.

> 3. Rewrite git-branch to use git-for-each-ref
>
> For higher flexibility in command-line options and output format, use git for-each-ref to re-implement git-branch. The first task is to grow features that are in branch but not fer into fer (like --column, --merged, --contains). The second task is to refactor fer so that an external program can call into it.

I actually have this about 95% done, waiting for the patches to be polished. So I don't think it makes a good GSoC project (it would be stupid to start from scratch, and polishing my patches is a lame project).

> 4. Implement @{publish}
> (I just can't find the time to finish this)
>
> @{publish} is a feature like @{upstream}, showing the state of the publish-point in the case of triangular workflows. Implement this while sharing code with git-push, and polish it until the prompt shows publish-state.

I think this could be a good GSoC-sized topic. I'm going to adjust the title to be "better support for triangular workflows". I think they may want to examine other issues in the area, rather than drilling down on @{publish} in particular (but ultimately, it is up to the student to propose what they want to do).

-Peff
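The ref-selection options named in item 3 above (--merged, --contains) come down to reachability questions on the commit graph. The following is only a minimal sketch of that idea in Python, not the code in git-branch or git-for-each-ref. The function names, the `parents` mapping (commit to list of parent commits) and the `branches` mapping (branch name to tip commit) are hypothetical structures invented for illustration.

```python
def ancestors(parents, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def branches_containing(parents, branches, commit):
    """Branches whose tip can reach commit (the --contains question)."""
    return sorted(name for name, tip in branches.items()
                  if commit in ancestors(parents, tip))


def branches_merged_into(parents, branches, commit):
    """Branches whose tip is reachable from commit (the --merged question)."""
    reachable = ancestors(parents, commit)
    return sorted(name for name, tip in branches.items() if tip in reachable)


# Toy history: A <- B <- C (master), with D (topic) forked from B.
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
BRANCHES = {"master": "C", "topic": "D", "old": "A"}
```

With this toy history, `branches_containing(PARENTS, BRANCHES, "B")` selects master and topic, while `branches_merged_into(PARENTS, BRANCHES, "C")` selects master and old. A real implementation would walk the graph once rather than once per branch.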
Re: Git GSoC 2014
Jeff King wrote:

>> 1. Speed up git-rebase--am.sh
>
> Isn't the merge backend faster? I thought that was the point of it.

I suppose, but I thought git-rebase--am.sh (the default flavor) could be improved by leveraging relatively new cherry-pick features; I assumed that the reason it was using format-patch/am was because it was written before cherry-pick matured. Alternatively, can you think of a project that involves working on the sequencer?

>> 3. Rewrite git-branch to use git-for-each-ref
>
> I actually have this about 95% done, waiting for the patches to be polished. So I don't think it makes a good GSoC project (it would be stupid to start from scratch, and polishing my patches is a lame project).

Oh. I look forward to using a nicer git-branch soon.

>> 4. Implement @{publish}
>
> I think this could be a good GSoC-sized topic. I'm going to adjust the title to be "better support for triangular workflows". I think they may want to examine other issues in the area, rather than drilling down on @{publish} in particular (but ultimately, it is up to the student to propose what they want to do).

That makes the project a little more open-ended then. I like it.

I was hoping you'd have more comments on "3. Invent new conflict style". Although I'm not sure the conflict style I proposed would be terribly useful in the general case, it'll give the student an opportunity to look at older/lesser-known portions of the codebase.
Re: Git GSoC 2014
On Fri, Feb 14, 2014 at 10:30:28AM -0500, Ramkumar Ramachandra wrote:

>> Isn't the merge backend faster? I thought that was the point of it.
>
> I suppose, but I thought git-rebase--am.sh (the default flavor) could be improved by leveraging relatively new cherry-pick features; I assumed that the reason it was using format-patch/am was because it was written before cherry-pick matured.

I think that's somewhat the case. But the am technique also knows a lot of tricks that cherry-pick doesn't. For example, there is currently a bug where "git rebase --keep-empty --whitespace=fix" silently ignores the latter option, because the former causes it to follow a cherry-pick code path. So I am a little hesitant in pushing more code paths down the cherry-pick route (though it would be OK if we correctly identified _when_ we could use cherry-pick for performance, and only kick in then).

> Alternatively, can you think of a project that involves working on the sequencer?

So yeah, obviously this is all tied up with the sequencer. In the spirit of "let's not re-propose old projects", I shied away from suggesting "finish up the sequencer". I know that the past projects did make progress, and that is a good thing. But I also think it doesn't make for a good bite-sized chunk.

>>> 3. Rewrite git-branch to use git-for-each-ref
>>
>> I actually have this about 95% done, waiting for the patches to be polished. So I don't think it makes a good GSoC project (it would be stupid to start from scratch, and polishing my patches is a lame project).
>
> Oh. I look forward to using a nicer git-branch soon.

Actually, it is mostly about making a nicer git-for-each-ref, as I am pulling out the ref selection code (which is more advanced in git tag and git branch) and using it everywhere. So in that sense, maybe I am not shooting for what you want. I think you want the follow-on to that, which is to pull out the formatting code (which is more advanced in for-each-ref), and let it be used everywhere.

I added this into the ideas page, but noting that there were two sides to it, and that one would need to examine and build on existing work (I know there was some discussion and experiments on the formatting side, too).

> I was hoping you'd have more comments on "3. Invent new conflict style". Although I'm not sure the conflict style I proposed would be terribly useful in the general case, it'll give the student an opportunity to look at older/lesser-known portions of the codebase.

I almost said more. :)

I am not sure I have in my mind what a useful new format would look like, and I would worry that we are leading the student into a bit of a trap, as they need to both code, but also invent a new and useful format.

But one thing I was really hoping for with these project descriptions (and I think we got) is that they are not completed project proposals. They are the kernels of ideas that the student will need to develop into full proposals. I would much rather have a student who reads that and says "I have a brilliant idea for a format" and proposes it, rather than one who blindly says "OK, I'll implement your idea". Getting the former is much less likely, but if we do, I think it will lead to a much higher quality project.

So I included it as-is, and I am curious to see what proposals we get. :)

Thanks again for your list. I marked you as a potential mentor for the conflict-style project; given the right proposal, I'd be happy to mentor it, too (and without the right proposal, I do not think we should do it at all). I also listed both you and me as potential mentors for @{publish}, since we have both looked at the problem space. If you can't make the time commitment, that's fine; I can do it (and we don't need to decide until later anyway).

-Peff
Re: Git GSoC 2014
On Thu, Feb 13, 2014 at 10:45 PM, Thomas Rast t...@thomasrast.ch wrote:

> Replace object loading/writing layer by libgit2
>
> Git reads objects from storage (loose and packed) through functions in sha1_file.c. Most commands only require very simple, opaque read and write access to the object storage. As a weatherballoon, show that it is feasible to use libgit2 git_odb_* routines for these simple callers.
>
> Aim for passing the git test suite using git_odb_* object storage access, except for tests that verify behavior in the face of storage corruption, replacement objects, alternate storage locations, and similar quirks. Of course it is even better if you pass the test suite without exception.
>
> Language: C
> Difficulty: hard
> Possible mentors: Thomas Rast and fill in libgit2 expert

Note that we have several people in the libgit2 team that are willing (and excited) to mentor or co-mentor this project or any of the other libgit2 related projects that have been proposed. Prospective list is

- Vicent Marti
- Russell Belfer
- Ed Thomson
- Carlos Martin (past GSoC student)

So there shouldn't be any mentor shortage.

Cheers,
vmg
Re: Git GSoC 2014
On Thu, Feb 13, 2014 at 04:10:37AM -0500, Jeff King wrote:

> The Google Summer of Code application process is upon us. We have about 34 hours until the deadline (2014-02-14T19:00 UTC). That's not very much time, but I know some people have been thinking about projects for a while, so I have hope that we can put together an ideas page.

Just to let everybody know, the application is submitted. For reference, I've updated the submitted application text here:

http://git.github.io/SoC-2014-Org-Application.html

I've collected the discussion on the list on the ideas page:

http://git.github.io/SoC-2014-Ideas.html

Google folks will be looking at it over the next week, but prospective students probably won't see it until we are accepted to the program, which would happen Feb 24th.

Please feel free to continue discussing or updating the ideas page in the meantime. I think there is enough there already to show Google what we are thinking about, but ultimately the students are the ones whom the page is meant to serve. Anything we can do to improve it before they read it is a good thing.

-Peff
Re: Git GSoC 2014
Jeff King p...@peff.net writes:

> - somebody to be the backup admin (I am assuming I'll be the primary admin, but as always, if somebody else wants to...)

I can be backup, if Shawn doesn't want it.

> - ideas ideas ideas

Here's my moonshot:

--- 8 ---
Replace object loading/writing layer by libgit2

Git reads objects from storage (loose and packed) through functions in sha1_file.c. Most commands only require very simple, opaque read and write access to the object storage. As a weatherballoon, show that it is feasible to use libgit2 git_odb_* routines for these simple callers.

Aim for passing the git test suite using git_odb_* object storage access, except for tests that verify behavior in the face of storage corruption, replacement objects, alternate storage locations, and similar quirks. Of course it is even better if you pass the test suite without exception.

Language: C
Difficulty: hard
Possible mentors: Thomas Rast and fill in libgit2 expert
--- 8 ---

That absolutely requires a co-mentor from the libgit2 side to do, however. Perhaps you could talk someone into it? ;-)

Motivation: I believe that migrating to libgit2 is the better approach, medium term, than rewriting everything ourselves to be nice, clean and thread-safe. I took a shot a while ago at making the pack reading code thread-safe, but it's adding mess when we could simply replace it all by the already thread-safe libgit2 calls. It also helps shake out incompatibilities in libgit2.

Downside: not listing "code merged" as a goal may not make the project as shiny, neither for Git nor for the student.

-- Thomas Rast t...@thomasrast.ch
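To make "very simple, opaque read and write access to the object storage" concrete, here is a minimal sketch in Python of what such an interface does for loose objects only. It is neither the code in sha1_file.c nor libgit2's git_odb_* API; the function names `write_loose_object` and `read_loose_object` are hypothetical. It assumes only the loose-object layout: a "<type> <size>" header, a NUL byte and the content, named by the SHA-1 of those bytes and stored zlib-compressed under a two-character fan-out directory. Packed objects, alternates and replacement objects are deliberately ignored.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(objects_dir, obj_type, data):
    """Store data as a loose object and return its hex object name."""
    # The object name is the SHA-1 of "<type> <size>\0" followed by the content.
    header = obj_type.encode("ascii") + b" " + str(len(data)).encode("ascii")
    store = header + b"\0" + data
    name = hashlib.sha1(store).hexdigest()
    path = os.path.join(objects_dir, name[:2], name[2:])
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(store))
    return name


def read_loose_object(objects_dir, name):
    """Return (type, content) for a loose object, checking size and hash."""
    path = os.path.join(objects_dir, name[:2], name[2:])
    with open(path, "rb") as f:
        store = zlib.decompress(f.read())
    header, _, data = store.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(data) or hashlib.sha1(store).hexdigest() != name:
        raise ValueError("corrupt loose object " + name)
    return obj_type, data


def demo():
    """Round-trip one blob through a throwaway objects directory."""
    with tempfile.TemporaryDirectory() as objects_dir:
        name = write_loose_object(objects_dir, "blob", b"hello\n")
        return name, read_loose_object(objects_dir, name)
```

A caller that only needs this pair of operations does not care how the bytes are stored, which is why such callers are the natural first candidates for the experiment described above.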
Re: Git GSoC 2014
Thomas Rast t...@thomasrast.ch writes:

> Downside: not listing "code merged" as a goal may not make the project as shiny, neither for Git nor for the student.

I'd actually view that as an upside. This sounds like a good first step for a feasibility study that is really necessary.

I wonder why the handling of storage corruption and replacement could be left broken, though. Is that because libgit2 has known breakages in these areas, or is there some other reason?
Re: Git GSoC 2014
Jeff King wrote:

> - ideas ideas ideas

I'll throw in a few ideas from half-finished work.

1. Speed up git-rebase--am.sh

Currently, git-rebase--am.sh is really slow because it dumps each patch to a file using git-format-patch, and picks it up to apply subsequently using git-am. Find a way to speed this up, without sacrificing safety. You can use the continuation features of cherry-pick, and dump to file only to persist state in the case of a failure.

Language: Shell script, C
Difficulty: Most of the difficulty lies in "what to do", not so much "how to do it". Might require modifying cherry-pick to do additional work on failure.

2. Invent new conflict style

As an alternative to the diff3 conflict style, invent a conflict style that shows the original unpatched segment along with the raw patch text. The user can then apply the patch by hand.

Language: C
Difficulty: Since it was first written, very few people have touched the xdiff portion of the code. Since the area is very core to git, the series will have to go through a ton of iterations.

3. Rewrite git-branch to use git-for-each-ref

For higher flexibility in command-line options and output format, use git for-each-ref to re-implement git-branch. The first task is to grow features that are in branch but not fer into fer (like --column, --merged, --contains). The second task is to refactor fer so that an external program can call into it.

Language: C
Difficulty: fer was never written with the idea of being reusable; it therefore persists a lot of global state, and even leaks memory in some places. Refactoring it to be more modern is definitely a challenge.

4. Implement @{publish}
(I just can't find the time to finish this)

@{publish} is a feature like @{upstream}, showing the state of the publish-point in the case of triangular workflows. Implement this while sharing code with git-push, and polish it until the prompt shows publish-state.

Language: C, Shell script
Difficulty: Once you figure out how to share code with git-push, this task should be relatively straightforward.
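As an illustration of how a publish-point differs from @{upstream} in a triangular workflow, here is a rough sketch in Python. It is not the proposed implementation and shares no code with git-push. The helper names and the flat `config` dictionary with lowercased keys are hypothetical, and the sketch assumes the default refs/remotes/<remote>/* tracking layout and a push to a same-named branch. The remote lookup order (branch.<name>.pushremote, then remote.pushdefault, then branch.<name>.remote, then origin) follows the git-config documentation.

```python
def upstream_of(config, branch):
    """Return the remote-tracking ref that @{upstream} would name, or None."""
    remote = config.get("branch.%s.remote" % branch)
    merge = config.get("branch.%s.merge" % branch)
    if not remote or not merge:
        return None
    prefix = "refs/heads/"
    short = merge[len(prefix):] if merge.startswith(prefix) else merge
    return "refs/remotes/%s/%s" % (remote, short)


def publish_of(config, branch):
    """Return a guess at the publish-point (assumes a same-named remote branch)."""
    remote = (config.get("branch.%s.pushremote" % branch)
              or config.get("remote.pushdefault")
              or config.get("branch.%s.remote" % branch)
              or "origin")
    return "refs/remotes/%s/%s" % (remote, branch)


# Triangular setup: fetch from "origin", publish to a personal fork "myfork".
CONFIG = {
    "branch.topic.remote": "origin",
    "branch.topic.merge": "refs/heads/master",
    "remote.pushdefault": "myfork",
}
```

For the topic branch in this configuration, `upstream_of` points at origin's master while `publish_of` points at the topic branch on myfork. A prompt would presumably compare the local branch against the latter to show publish-state.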
Re: Git GSoC 2014
Junio C Hamano gits...@pobox.com writes:

> Thomas Rast t...@thomasrast.ch writes:
>
>> Downside: not listing "code merged" as a goal may not make the project as shiny, neither for Git nor for the student.
>
> I'd actually view that as an upside. This sounds like a good first step for a feasibility study that is really necessary.
>
> I wonder why the handling of storage corruption and replacement could be left broken, though. Is that because libgit2 has known breakages in these areas, or is there some other reason?

It's because I don't know enough about what libgit2's state is, and I wanted to keep the scope limited.

Naturally, the next step would then be to implement the lacking functionality (if any) in libgit2 so that the test suite passes. I just don't know if that's trivial, or something for the "if we have time" section of the project, or too much work.

(I did do a quick "can we reasonably link against libgit2" test where I gave git-cat-file a --libgit2 option that loads blobs with libgit2. There are some name collisions in the git_config* identifiers that need to be resolved, but otherwise it seems entirely possible.)

-- Thomas Rast t...@thomasrast.ch