Re: Git GSoC 2014

2014-02-15 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 7:17 PM, Thomas Rast t...@thomasrast.ch wrote:
 It also includes an ok from Nicolas Pitre, who has been the driving
 force behind packv5 development.  The only thing that makes me uneasy is

Nit: pack v4. You are probably confusing it with index v5, which has also
been cooking for a while now.

 that Duy is not in the list (Duy, have you been asked by libgit2 about
 possible relicensing?).

I don't remember. But for the record I'm OK with relicensing my
contributions in git.git for inclusion in libgit2.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-15 Thread David Kastrup
Thomas Rast t...@thomasrast.ch writes:

 David Kastrup d...@gnu.org writes:

 This definitely should not be "we'll think about it if and when that
 project is finished" material.

 Yes, all of this is true.  However, you are painting a big devil on
 the wall.

[...]

 Your scenario above mostly applies if and when we really go the way of
 my dream and scrap the code that is in git.

So it's not a big devil I am painting but rather the consequences if
everything goes according to plan.

 (I have similar long-term dreams for other git components like ref
 storage and diffs, but that's just me.)

 Second, how many contributions would actually have been prevented by
 GPLv2+LE licensing?

That's not as much the question as to how many will be prevented in
future by such a step.

The libgit2 community is different from that of Git and with a different
focus.  If you take a look at its front page, you'll see statements like
"Link with open and proprietary software.  No strings attached." and
"Trusted and used in production by GitHub, Microsoft, [...]".

A fusion with this project and its aims and licensing will have
consequences regarding which developers and users are attracted to Git.

The "act first, think later" approach is not really doing anybody
favors, and I don't consider it fair to GSoC students to employ them for
making this proposal gain leverage: one should first think this through
on behalf of the project before putting students to work on this so that
they know reasonably well what will happen with their work.

The kind of "Now that we made $x do $y, we are obliged to do $z."
scenario is easy to avoid by _first_ contemplating whether or not $z is
where one wants to go.

That's not painting a devil but common sense.  I'm not saying that the
answer is in any way self-evident.  Merely that the best time to answer
it is _before_ getting invested.

 The only data I have on this is libgit2/git.git-authors, which records
 who has consented for their _existing_ code to be relicensed.  I
 consider this to be a higher barrier than contributing new code, since
 there's no clear gain for the author in the relicensing.

I consider it a lower barrier since the work is already done, and the
authors were not thinking about proprietary spinoffs when they did it.

But that's a minor point.  All I am saying is that there are different
opinions possible, and picking a particular path for future development
will in either way influence who wants to be part of the respective
communities.

-- 
David Kastrup


Re: Git GSoC 2014

2014-02-15 Thread Shawn Pearce
On Sat, Feb 15, 2014 at 4:17 AM, Thomas Rast t...@thomasrast.ch wrote:
 David Kastrup d...@gnu.org writes:

 Thomas Rast t...@thomasrast.ch writes:

 Motivation: I believe that migrating to libgit2 is the better approach,
 medium term, than rewriting everything ourselves to be nice, clean and
 thread-safe.  I took a shot a while ago at making the pack reading code
 thread-safe, but it's adding mess when we could simply replace it all by
 the already thread-safe libgit2 calls.  It also helps shake out
 incompatibilities in libgit2.

 That would either require forking libgit2 for Git use or stop dead any
 contributions to that rather central part of the git codebase from
 contributors not wanting their contributions to get reused in binary
 proprietary software.

 It would also mean that no serious forward-going work (like developing
 new packing formats or network protocols) can be done on a pure GPLv2
 codebase any more.  So anybody insisting on contributing work under the
 current Git license only would be locked out from working on significant
 parts of Git and could no longer propose changes in central parts.

 Now this can all be repealed by the "developing the atomic bomb does not
 mean that one has to use it" argument but even if one does not use it,
 the world with and without it are different worlds and occupy mindshare
 and suggest solutions and diplomacy involving it.

 So this is definitely a large step towards a situation where erosion of
 the existing license and related parts of the community becomes more
 attractive.

 There is the rationale "we can always say no at the end".  How do you
 explain this "no" to the student who invested significant amounts of
 work into this, in a project proposed by the Git developers?

 This definitely should not be "we'll think about it if and when that
 project is finished" material.

 Yes, all of this is true.  However, you are painting a big devil on the
 wall.
...
 Second, how many contributions would actually have been prevented by
 GPLv2+LE licensing?

Interesting data point: I helped get libgit2 started in the first few
days of its existence and discussed the license on the mailing list. I
eventually stopped contributing, partly because of the GPLv2+LE
license it uses.

:-)

I am not as interested in using the GPL for my work as David Kastrup
is, but I wasn't really thrilled with GPLv2+LE.


Re: Git GSoC 2014

2014-02-14 Thread David Kastrup
Thomas Rast t...@thomasrast.ch writes:

 Here's my moonshot:

 --- 8< ---
 Replace object loading/writing layer by libgit2

 Git reads objects from storage (loose and packed) through functions in
 sha1_file.c.  Most commands only require very simple, opaque read and
 write access to the object storage.  As a weatherballoon, show that it
 is feasible to use libgit2 git_odb_* routines for these simple callers.

 Aim for passing the git test suite using git_odb_* object storage
 access, except for tests that verify behavior in the face of storage
 corruption, replacement objects, alternate storage locations, and
 similar quirks.  Of course it is even better if you pass the test suite
 without exception.

[...]

 That absolutely requires a co-mentor from the libgit2 side to do,
 however.  Perhaps you could talk someone into it? ;-)

 Motivation: I believe that migrating to libgit2 is the better approach,
 medium term, than rewriting everything ourselves to be nice, clean and
 thread-safe.  I took a shot a while ago at making the pack reading code
 thread-safe, but it's adding mess when we could simply replace it all by
 the already thread-safe libgit2 calls.  It also helps shake out
 incompatibilities in libgit2.

That would either require forking libgit2 for Git use or stop dead any
contributions to that rather central part of the git codebase from
contributors not wanting their contributions to get reused in binary
proprietary software.

It would also mean that no serious forward-going work (like developing
new packing formats or network protocols) can be done on a pure GPLv2
codebase any more.  So anybody insisting on contributing work under the
current Git license only would be locked out from working on significant
parts of Git and could no longer propose changes in central parts.

Now this can all be repealed by the "developing the atomic bomb does not
mean that one has to use it" argument but even if one does not use it,
the world with and without it are different worlds and occupy mindshare
and suggest solutions and diplomacy involving it.

So this is definitely a large step towards a situation where erosion of
the existing license and related parts of the community becomes more
attractive.

There is the rationale "we can always say no at the end".  How do you
explain this "no" to the student who invested significant amounts of
work into this, in a project proposed by the Git developers?

This definitely should not be "we'll think about it if and when that
project is finished" material.

-- 
David Kastrup


Re: Git GSoC 2014

2014-02-14 Thread Jeff King
On Thu, Feb 13, 2014 at 06:17:17PM -0500, Ramkumar Ramachandra wrote:

 I'll throw in a few ideas from half-finished work.

Thanks. A few comments:

 1. Speed up git-rebase--am.sh
 
 Currently, git-rebase--am.sh is really slow because it dumps each
 patch to a file using git-format-patch, and picks it up to apply
 subsequently using git-am. Find a way to speed this up, without
 sacrificing safety. You can use the continuation features of
 cherry-pick, and dump to file only to persist state in the case of a
 failure.

Isn't the merge backend faster? I thought that was the point of it.

 3. Rewrite git-branch to use git-for-each-ref
 
 For higher flexibility in command-line options and output format, use
 git for-each-ref to re-implement git-branch. The first task is to grow
 features that are in branch but not fer into fer (like --column,
 --merged, --contains). The second task is to refactor fer so that an
 external program can call into it.

I actually have this about 95% done, waiting for the patches to be
polished. So I don't think it makes a good GSoC project (it would be
stupid to start from scratch, and polishing my patches is a lame
project).

 4. Implement @{publish}
 (I just can't find the time to finish this)
 
 @{publish} is a feature like @{upstream}, showing the state of the
 publish-point in the case of triangular workflows. Implement this
 while sharing code with git-push, and polish it until the prompt shows
 publish-state.

I think this could be a good GSoC-sized topic. I'm going to adjust the
title to be "better support for triangular workflows". I think they may
want to examine other issues in the area, rather than drilling down on
@{publish} in particular (but ultimately, it is up to the student to
propose what they want to do).
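
For readers new to the term: in a triangular workflow you fetch from one remote and push to another, so the upstream and the publish-point of a branch can differ. The following is a minimal Python sketch of how the two could be resolved from configuration, not git's implementation. It relies on the existing config keys `branch.<name>.remote`, `branch.<name>.merge`, `branch.<name>.pushremote` and `remote.pushdefault`; the function names are hypothetical, and it assumes the branch is published under its own name on the push remote.

```python
def _short(ref):
    """Strip the refs/heads/ prefix if present."""
    prefix = "refs/heads/"
    return ref[len(prefix):] if ref.startswith(prefix) else ref


def upstream_of(config, branch):
    """Ref that @{upstream} would name, e.g. 'origin/master', or None."""
    remote = config.get(f"branch.{branch}.remote")
    merge = config.get(f"branch.{branch}.merge")
    if not remote or not merge:
        return None
    return f"{remote}/{_short(merge)}"


def publish_point_of(config, branch):
    """Hypothetical @{publish}: where a plain push of this branch would land.

    Push remote precedence: branch.<name>.pushremote, then
    remote.pushdefault, then branch.<name>.remote.
    Assumption: the branch keeps its own name on the push remote.
    """
    remote = (config.get(f"branch.{branch}.pushremote")
              or config.get("remote.pushdefault")
              or config.get(f"branch.{branch}.remote"))
    if not remote:
        return None
    return f"{remote}/{branch}"


config = {
    "branch.topic.remote": "origin",
    "branch.topic.merge": "refs/heads/master",
    "remote.pushdefault": "myfork",
}
print(upstream_of(config, "topic"))       # fetch side of the triangle
print(publish_point_of(config, "topic"))  # push side of the triangle
```

With only `branch.<name>.remote` set, both functions point at the same remote, which is the non-triangular case.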

-Peff


Re: Git GSoC 2014

2014-02-14 Thread Ramkumar Ramachandra
Jeff King wrote:
 1. Speed up git-rebase--am.sh

 Isn't the merge backend faster? I thought that was the point of it.

I suppose, but I thought git-rebase--am.sh (the default flavor) could
be improved by leveraging relatively new cherry-pick features; I
assumed that the reason it was using format-patch/am was because it
was written before cherry-pick matured. Alternatively, can you think
of a project that involves working on the sequencer?

 3. Rewrite git-branch to use git-for-each-ref

 I actually have this about 95% done, waiting for the patches to be
 polished. So I don't think it makes a good GSoC project (it would be
 stupid to start from scratch, and polishing my patches is a lame
 project).

Oh. I look forward to using a nicer git-branch soon.

 4. Implement @{publish}

 I think this could be a good GSoC-sized topic. I'm going to adjust the
 title to be "better support for triangular workflows". I think they may
 want to examine other issues in the area, rather than drilling down on
 @{publish} in particular (but ultimately, it is up to the student to
 propose what they want to do).

That makes the project a little more open-ended then. I like it.

I was hoping you'd have more comments on "3. Invent new conflict
style". Although I'm not sure the conflict style I proposed would be
terribly useful in the general case, it'll give the student an
opportunity to look at older/lesser-known portions of the codebase.


Re: Git GSoC 2014

2014-02-14 Thread Jeff King
On Fri, Feb 14, 2014 at 10:30:28AM -0500, Ramkumar Ramachandra wrote:

  Isn't the merge backend faster? I thought that was the point of it.
 
 I suppose, but I thought git-rebase--am.sh (the default flavor) could
 be improved by leveraging relatively new cherry-pick features; I
 assumed that the reason it was using format-patch/am was because it
 was written before cherry-pick matured.

I think that's somewhat the case. But the am technique also knows a lot
of tricks that cherry-pick doesn't. For example, there is currently a
bug where "git rebase --keep-empty --whitespace=fix" silently ignores
the latter option, because the former causes it to follow a cherry-pick
code path.

So I am a little hesitant in pushing more code paths down the
cherry-pick route (though it would be OK if we correctly identified
_when_ we could use cherry-pick for performance, and only kick in then).
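
The "only kick in then" idea amounts to a guard in front of the fast path. Here is a minimal Python sketch of such a guard, under loud assumptions: the function and the option set are hypothetical, and only `--whitespace` is taken from this thread. A real list of am-only options would need a careful audit.

```python
# Options that only the format-patch/am path is known to honor.
# Assumption: only "whitespace" comes from the discussion above;
# anything else would have to be added after auditing rebase.
AM_ONLY_OPTIONS = {"whitespace"}


def choose_backend(options):
    """Pick 'cherry-pick' only when no requested option depends on am.

    `options` maps option names to their values; unset options are
    None or False.
    """
    requested = {name for name, value in options.items() if value}
    if requested & AM_ONLY_OPTIONS:
        return "am"
    return "cherry-pick"


# The buggy combination from the thread stays on the am path:
print(choose_backend({"keep_empty": True, "whitespace": "fix"}))
# Without am-only options the faster path is allowed:
print(choose_backend({"keep_empty": True}))
```

The point of the design is that the default is conservative: an option has to be proven safe for cherry-pick by being left out of the set, rather than silently dropped.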

 Alternatively, can you think of a project that involves working on the
 sequencer?

So yeah, obviously this is all tied up with the sequencer. In the spirit
of "let's not re-propose old projects", I shied away from suggesting
"finish up the sequencer". I know that the past projects did make
progress, and that is a good thing. But I also think it doesn't make for
a good bite-sized chunk.

  3. Rewrite git-branch to use git-for-each-ref
 
  I actually have this about 95% done, waiting for the patches to be
  polished. So I don't think it makes a good GSoC project (it would be
  stupid to start from scratch, and polishing my patches is a lame
  project).
 
 Oh. I look forward to using a nicer git-branch soon.

Actually, it is mostly about making a nicer git-for-each-ref, as I am
pulling out the ref selection code (which is more advanced in git tag
and git branch) and using it everywhere. So in that sense, maybe I am
not shooting for what you want. I think you want the follow-on to that,
which is to pull out the formatting code (which is more advanced in
for-each-ref), and let it be used everywhere.

I added this into the ideas page, but noting that there were two sides
to it, and that one would need to examine and build on existing work (I
know there was some discussion and experiments on the formatting side,
too).
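
The two sides can be pictured as two independent functions: one that selects refs, one that formats a selected ref. Below is a minimal Python sketch of that split, not git's code. The `%(refname)` and `%(objectname)` atoms are real for-each-ref atoms; the function names, the dict layout, and the `reachable` field (a toy stand-in for what `--contains` would compute) are hypothetical.

```python
import fnmatch
import re


def select_refs(refs, pattern="*", contains=None):
    """Selection side: filter by glob and, optionally, by a contained commit."""
    for ref in refs:
        if not fnmatch.fnmatchcase(ref["refname"], pattern):
            continue
        # Toy stand-in for --contains: a precomputed set of reachable commits.
        if contains is not None and contains not in ref["reachable"]:
            continue
        yield ref


def format_ref(ref, fmt):
    """Formatting side: expand %(atom) placeholders; unknown atoms become ''."""
    return re.sub(r"%\((\w+)\)", lambda m: str(ref.get(m.group(1), "")), fmt)


refs = [
    {"refname": "refs/heads/master", "objectname": "a1", "reachable": {"a1"}},
    {"refname": "refs/heads/topic", "objectname": "b2", "reachable": {"a1", "b2"}},
    {"refname": "refs/tags/v1.0", "objectname": "a1", "reachable": {"a1"}},
]
for ref in select_refs(refs, "refs/heads/*", contains="b2"):
    print(format_ref(ref, "%(objectname) %(refname)"))
```

Once both halves are callable on their own, a branch, tag, or for-each-ref front end could combine any selection with any format.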

 I was hoping you'd have more comments on "3. Invent new conflict
 style". Although I'm not sure the conflict style I proposed would be
 terribly useful in the general case, it'll give the student an
 opportunity to look at older/lesser-known portions of the codebase.

I almost said more. :) I am not sure I have in my mind what a useful
new format would look like, and I would worry that we are leading the
student into a bit of a trap, as they need not only to code but also to invent
a new and useful format.

But one thing I was really hoping for with these project descriptions
(and I think we got) is that they are not completed project proposals.
They are the kernels of ideas that the student will need to develop into
full proposals. I would much rather have a student who reads that and
says "I have a brilliant idea for a format" and proposes it, rather than
one who blindly says "OK, I'll implement your idea". Getting the former
is much less likely, but if we do, I think it will lead to a much higher
quality project.

So I included it as-is, and I am curious to see what proposals we get.
:)

Thanks again for your list. I marked you as a potential mentor for the
conflict-style project; given the right proposal, I'd be happy to mentor
it, too (and without the right proposal, I do not think we should do it
at all). I also listed both you and me as potential mentors for
@{publish}, since we have both looked at the problem space. If you can't
make the time commitment, that's fine; I can do it (and we don't need to
decide until later anyway).

-Peff


Re: Git GSoC 2014

2014-02-14 Thread Vicent Martí
On Thu, Feb 13, 2014 at 10:45 PM, Thomas Rast t...@thomasrast.ch wrote:
 Replace object loading/writing layer by libgit2

 Git reads objects from storage (loose and packed) through functions in
 sha1_file.c.  Most commands only require very simple, opaque read and
 write access to the object storage.  As a weatherballoon, show that it
 is feasible to use libgit2 git_odb_* routines for these simple callers.

 Aim for passing the git test suite using git_odb_* object storage
 access, except for tests that verify behavior in the face of storage
 corruption, replacement objects, alternate storage locations, and
 similar quirks.  Of course it is even better if you pass the test suite
 without exception.

 Language: C
 Difficulty: hard
 Possible mentors: Thomas Rast and <fill in libgit2 expert>

Note that we have several people in the libgit2 team that are willing
(and excited) to mentor or co-mentor this project or any of the other
libgit2 related projects that have been proposed.

Prospective list is

- Vicent Marti
- Russell Belfer
- Ed Thomson
- Carlos Martin (past GSoC student)

So there shouldn't be any mentor shortage.

Cheers,
vmg


Re: Git GSoC 2014

2014-02-14 Thread Jeff King
On Thu, Feb 13, 2014 at 04:10:37AM -0500, Jeff King wrote:

 The Google Summer of Code application process is upon us. We have about
 34 hours until the deadline (2014-02-14T19:00 UTC). That's not very
 much time, but I know some people have been thinking about projects for
 a while, so I have hope that we can put together an ideas page.

Just to let everybody know, the application is submitted. For reference,
I've updated the submitted application text here:

  http://git.github.io/SoC-2014-Org-Application.html

I've collected the discussion on the list on the ideas page:

  http://git.github.io/SoC-2014-Ideas.html

Google folks will be looking at it over the next week, but prospective
students probably won't see it until we are accepted to the program,
which would happen Feb 24th.

Please feel free to continue discussing or updating the ideas page in
the meantime. I think there is enough there already to show Google what
we are thinking about, but ultimately the students are the ones whom the
page is meant to serve.  Anything we can do to improve it before they
read it is a good thing.

-Peff


Re: Git GSoC 2014

2014-02-13 Thread Thomas Rast
Jeff King p...@peff.net writes:

   - somebody to be the backup admin (I am assuming I'll be the primary
 admin, but as always, if somebody else wants to...)

I can be backup, if Shawn doesn't want it.

   - ideas ideas ideas

Here's my moonshot:

--- 8< ---
Replace object loading/writing layer by libgit2

Git reads objects from storage (loose and packed) through functions in
sha1_file.c.  Most commands only require very simple, opaque read and
write access to the object storage.  As a weatherballoon, show that it
is feasible to use libgit2 git_odb_* routines for these simple callers.

Aim for passing the git test suite using git_odb_* object storage
access, except for tests that verify behavior in the face of storage
corruption, replacement objects, alternate storage locations, and
similar quirks.  Of course it is even better if you pass the test suite
without exception.

Language: C
Difficulty: hard
Possible mentors: Thomas Rast and <fill in libgit2 expert>
--- 8< ---

That absolutely requires a co-mentor from the libgit2 side to do,
however.  Perhaps you could talk someone into it? ;-)

Motivation: I believe that migrating to libgit2 is the better approach,
medium term, than rewriting everything ourselves to be nice, clean and
thread-safe.  I took a shot a while ago at making the pack reading code
thread-safe, but it's adding mess when we could simply replace it all by
the already thread-safe libgit2 calls.  It also helps shake out
incompatibilities in libgit2.

Downside: not listing "code merged" as a goal may not make the project
as shiny, neither for Git nor for the student.
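
To make "very simple, opaque read and write access" concrete, here is a minimal Python sketch of what such a caller needs from a loose object store: hand over bytes and get an id back, hand over an id and get the type and bytes back. This is neither libgit2's `git_odb_*` API nor the code in sha1_file.c. The class `LooseStore` and its methods are hypothetical names; the on-disk layout (a zlib-deflated `<type> <size>\0<payload>` named by its SHA-1) is the loose object format. Packs, alternates, replacement objects and corruption handling are deliberately left out, mirroring the exclusions in the proposal.

```python
import hashlib
import os
import tempfile
import zlib


class LooseStore:
    """Toy loose-object store; hypothetical stand-in for an odb backend."""

    def __init__(self, objects_dir):
        self.objects_dir = objects_dir

    def _path(self, oid):
        # Loose objects live at <objects>/<first 2 hex digits>/<remaining 38>.
        return os.path.join(self.objects_dir, oid[:2], oid[2:])

    def write(self, obj_type, payload):
        # The object id is the SHA-1 of "<type> <size>\0" plus the payload.
        raw = f"{obj_type} {len(payload)}".encode() + b"\0" + payload
        oid = hashlib.sha1(raw).hexdigest()
        path = self._path(oid)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(zlib.compress(raw))
        return oid

    def read(self, oid):
        with open(self._path(oid), "rb") as fh:
            raw = zlib.decompress(fh.read())
        # Split at the first NUL only; the payload may contain NUL bytes.
        header, _, payload = raw.partition(b"\0")
        obj_type, size = header.decode().split(" ")
        if int(size) != len(payload):
            raise ValueError("size mismatch in object " + oid)
        return obj_type, payload


def demo():
    """Write one blob into a temporary store and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        store = LooseStore(os.path.join(tmp, "objects"))
        oid = store.write("blob", b"hello\n")
        return oid, store.read(oid)


print(demo())
```

A caller that only ever uses `write` and `read` in this way does not care which implementation sits behind them, which is what makes such callers candidates for the experiment.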

-- 
Thomas Rast
t...@thomasrast.ch


Re: Git GSoC 2014

2014-02-13 Thread Junio C Hamano
Thomas Rast t...@thomasrast.ch writes:

 Downside: not listing "code merged" as a goal may not make the project
 as shiny, neither for Git nor for the student.

I'd actually view that as an upside. This sounds like a good first
step for a feasibility study that is really necessary.

I wonder why the handling of storage corruption and replacement
could be left broken, though. Is that because libgit2 has known
breakages in these areas, or is there some other reason?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-13 Thread Ramkumar Ramachandra
Jeff King wrote:
   - ideas ideas ideas

I'll throw in a few ideas from half-finished work.

1. Speed up git-rebase--am.sh

Currently, git-rebase--am.sh is really slow because it dumps each
patch to a file using git-format-patch, and picks it up to apply
subsequently using git-am. Find a way to speed this up, without
sacrificing safety. You can use the continuation features of
cherry-pick, and dump to file only to persist state in the case of a
failure.

Language: Shell script, C
Difficulty: Most of the difficulty lies in what to do, not so much
how to do it. Might require modifying cherry-pick to do additional
work on failure.

2. Invent new conflict style

As an alternative to the diff3 conflict style, invent a conflict style
that shows the original unpatched segment along with the raw patch
text. The user can then apply the patch by hand.

Language: C
Difficulty: Since it was first written, very few people have touched
the xdiff portion of the code. Since the area is very core to git, the
series will have to go through a ton of iterations.

3. Rewrite git-branch to use git-for-each-ref

For higher flexibility in command-line options and output format, use
git for-each-ref to re-implement git-branch. The first task is to grow
features that are in branch but not fer into fer (like --column,
--merged, --contains). The second task is to refactor fer so that an
external program can call into it.

Language: C
Difficulty: fer was never written with the idea of being reusable; it
therefore persists a lot of global state, and even leaks memory in
some places. Refactoring it to be more modern is definitely a
challenge.

4. Implement @{publish}
(I just can't find the time to finish this)

@{publish} is a feature like @{upstream}, showing the state of the
publish-point in the case of triangular workflows. Implement this
while sharing code with git-push, and polish it until the prompt shows
publish-state.

Language: C, Shell script
Difficulty: Once you figure out how to share code with git-push, this
task should be relatively straightforward.
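
As a rough illustration of idea 2 (the conflict style), the following Python sketch renders one conflicted region as the unpatched segment followed by the raw patch text. It is one possible reading of the idea, not a proposed format: the marker lines are made up, and it assumes "original unpatched segment" means our side and "raw patch text" means the diff from the common base to their side. It uses Python's `difflib`, not xdiff.

```python
import difflib


def render_conflict(base, ours, theirs):
    """Render a conflict as our segment plus the patch that failed to apply.

    All three arguments are lists of lines without trailing newlines.
    The marker lines below are hypothetical.
    """
    patch = difflib.unified_diff(base, theirs, "base", "theirs", lineterm="")
    out = ["<<<<<<< ours (unpatched)"]
    out += ours
    out.append("======= patch (base -> theirs)")
    out += list(patch)
    out.append(">>>>>>> theirs")
    return "\n".join(out)


print(render_conflict(
    base=["int x = 1;"],
    ours=["long x = 1;"],
    theirs=["int x = 2;"],
))
```

The user would then apply the `-`/`+` lines to the unpatched segment by hand, as the idea describes.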


Re: Git GSoC 2014

2014-02-13 Thread Thomas Rast
Junio C Hamano gits...@pobox.com writes:

 Thomas Rast t...@thomasrast.ch writes:

 Downside: not listing "code merged" as a goal may not make the project
 as shiny, neither for Git nor for the student.

 I'd actually view that as an upside. This sounds like a good first
 step for a feasibility study that is really necessary.

 I wonder why the handling of storage corruption and replacement
 could be left broken, though. Is that because libgit2 has known
 breakages in these areas, or is there some other reason?

It's because I don't know enough about what libgit2's state is, and I
wanted to keep the scope limited.  Naturally, the next step would then
be to implement the lacking functionality (if any) in libgit2 so that
the test suite passes.  I just don't know if that's trivial, or
something for the "if we have time" section of the project, or too much
work.

(I did do a quick "can we reasonably link against libgit2" test where I
gave git-cat-file a --libgit2 option that loads blobs with libgit2.
There are some name collisions in the git_config* identifiers that need
to be resolved, but otherwise it seems entirely possible.)

-- 
Thomas Rast
t...@thomasrast.ch