Re: Git for games working group

2018-09-24 Thread John Austin
On Mon, Sep 24, 2018 at 12:58 PM Taylor Blau wrote:
> I'm replying to this part of the email to note that this would cause Git
> LFS to have to do some extra work, since running 'git lfs install'
> already writes to .git/hooks/post-commit (ironically, to detect and
> unlock locks that we should have released).

Right, that should have been another bullet point. The fact that there
can only be one git hook is... frustrating.

Perhaps, if LFS has an option to bundle global-graph, LFS could merge
the hooks when installing?

If you instead install global-graph after LFS, I think it should
probably attempt something like:
  -- first move the existing hook to a folder: post-commit.d/
  -- install the global-graph hook to post-commit.d/
  -- install a new hook at post-commit that simply calls all
executables in post-commit.d/

Not sure if this is something that's been discussed, since I know LFS
has a similar issue with existing hooks, but might be sensible.
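
A rough sketch of the three steps above, written in Python purely for
illustration. The function name install_dispatcher, the file name
"00-existing" for the moved hook, and the shell dispatcher body are all
my own assumptions, not anything LFS or git ships today:

```python
import stat
from pathlib import Path

# Dispatcher body (assumption: hooks are run by a POSIX shell).
DISPATCHER = """#!/bin/sh
# Run every executable in post-commit.d/ in lexical order.
for hook in "$(dirname "$0")"/post-commit.d/*; do
    [ -x "$hook" ] && "$hook" "$@"
done
exit 0
"""


def install_dispatcher(hooks_dir, new_hook_name, new_hook_body):
    """Move an existing post-commit hook into post-commit.d/, add a new
    hook next to it, and write a dispatcher at post-commit."""
    hooks = Path(hooks_dir)
    hook = hooks / "post-commit"
    hook_d = hooks / "post-commit.d"
    hook_d.mkdir(parents=True, exist_ok=True)

    # Step 1: keep whatever hook was already there (e.g. the one LFS
    # wrote), unless it is our own dispatcher from an earlier run.
    if hook.exists() and hook.read_text() != DISPATCHER:
        hook.replace(hook_d / "00-existing")

    # Step 2: install the new hook into post-commit.d/.
    _write_executable(hook_d / new_hook_name, new_hook_body)

    # Step 3: post-commit itself only calls the scripts in post-commit.d/.
    _write_executable(hook, DISPATCHER)


def _write_executable(path, body):
    """Write a script and set the executable bits on it."""
    path.write_text(body)
    mode = path.stat().st_mode
    path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Running it a second time leaves the previously moved hook alone, so
either tool could install itself after the other.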



Re: Git for games working group

2018-09-24 Thread John Austin
Perhaps git-global-graph is a decent name. GGG? G3? :). The structure
right now in my head looks a bit like:

Global Graph:
 client - post-commit git hooks to push changes up to the GG
 git server - just the standard git server configuration
 query server - replies with information about the current state of the GG

Locks Pre-Commit:
 client - pre-commit hook that makes requests to the GG query server

For cross-platform compatibility, the Global Graph client and the
Locks/Conflicts client are the pieces that need to be usable on all
platforms. My goal is to keep these pieces as simple as possible. I'd
like to at least start prototyping these in Rust, hopefully in a way
that can either be easily ported or easily re-implemented in C later
on, once things are feature-frozen.

For LFS, the main points of integration I see are:
-- bundling of packages (optionally install this package with a
normal LFS installation)
-- `git lfs locks` integration, i.e. integration with the read-only
control of LFS

If we push more of the functionality into the gg query server, the
integration with `lfs locks` could be simple enough to be a couple of
web requests. That might help avoid integration issues.

> we strictly avoid using CGo
What's the main reason for this? Build system complexity?

On Mon, Sep 24, 2018 at 7:37 AM Taylor Blau wrote:
>
> On Sun, Sep 23, 2018 at 12:53:58PM -0700, John Austin wrote:
> > On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
> >  wrote:
> > >  I would even like to help with your effort and have non-unixy platforms 
> > > I'd like to do this on.
> > > Having this separate from git LFS is an even better idea IMO, and I would 
> > > suggest implementing this using the same set of build tools that git uses 
> > > so that it is broadly portable, unlike git LFS. Glad to help there too.
> >
> > Great to hear -- once the code is in a bit better shape I can open it
> > up on github. Cross platform is definitely one of my focuses. I'm
> > currently implementing in Rust because it targets the same space as C
> > and has great, near trivial, cross-platform support. What sorts of
> > platforms are you interested in? Windows is my first target because
> > that's where many game developers live.
>
> This would likely mean that Git LFS will have to reimplement it, since
> we strictly avoid using CGo (Go's mechanism to issue function calls to
> other languages).
>
> The upshot is that it likely shouldn't be too much effort for anybody,
> and the open-source community would get a Go implementation of the API,
> too.
>
> Thanks,
> Taylor
>



Re: Git for games working group

2018-09-23 Thread John Austin
Regarding integration into LFS, I'd like to build the library in such
a way that it would be easy to bundle with LFS (so they could share the
same git hooks), but also make it flexible enough to work for other
workflows.

On Sun, Sep 23, 2018 at 12:53 PM John Austin wrote:
>
> On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
>  wrote:
> >  I would even like to help with your effort and have non-unixy platforms 
> > I'd like to do this on.
> > Having this separate from git LFS is an even better idea IMO, and I would 
> > suggest implementing this using the same set of build tools that git uses 
> > so that it is broadly portable, unlike git LFS. Glad to help there too.
>
> Great to hear -- once the code is in a bit better shape I can open it
> up on github. Cross platform is definitely one of my focuses. I'm
> currently implementing in Rust because it targets the same space as C
> and has great, near trivial, cross-platform support. What sorts of
> platforms are you interested in? Windows is my first target because
> that's where many game developers live.
>
> > I would suggest that a higher-level grouping mechanism of resource groups
> > might be helpful - as in "I need this directory" rather than "I need this
> > file". Better still, I could see "I need all objects in this commit-ish",
> > which would allow a revert operation to succeed or fail atomically while
> > adhering to a lock requirement.
> > One bit that traditional lock-brokering systems implement involves forcing
> > security attribute changes - so an unlocked file is stored as chmod a-w to
> > prevent accidental modification of lockables, then changing that to chmod
> > ?+w when a lock is acquired. It's not perfect, but does catch a lot of
> > errors.
>
> Agreed -- I think this is all up to how the query endpoint and client
> are designed. A couple of different types of clients could be
> implemented, depending on the policies you want in place. One could
> have strict security that stored unlocked files with a-w, as
> mentioned. Another could be a weaker client, and simply warn
> developers when their current branch is in conflict.



Re: Git for games working group

2018-09-23 Thread John Austin
On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker wrote:
>  I would even like to help with your effort and have non-unixy platforms I'd 
> like to do this on.
> Having this separate from git LFS is an even better idea IMO, and I would 
> suggest implementing this using the same set of build tools that git uses so 
> that it is broadly portable, unlike git LFS. Glad to help there too.

Great to hear -- once the code is in a bit better shape I can open it
up on github. Cross platform is definitely one of my focuses. I'm
currently implementing in Rust because it targets the same space as C
and has great, near trivial, cross-platform support. What sorts of
platforms are you interested in? Windows is my first target because
that's where many game developers live.

> I would suggest that a higher-level grouping mechanism of resource groups
> might be helpful - as in "I need this directory" rather than "I need this
> file". Better still, I could see "I need all objects in this commit-ish",
> which would allow a revert operation to succeed or fail atomically while
> adhering to a lock requirement.
> One bit that traditional lock-brokering systems implement involves forcing
> security attribute changes - so an unlocked file is stored as chmod a-w to
> prevent accidental modification of lockables, then changing that to chmod ?+w
> when a lock is acquired. It's not perfect, but does catch a lot of errors.

Agreed -- I think this is all up to how the query endpoint and client
are designed. A couple of different types of clients could be
implemented, depending on the policies you want in place. One could
have strict security that stored unlocked files with a-w, as
mentioned. Another could be a weaker client, and simply warn
developers when their current branch is in conflict.
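
For the strict variant, the read-only marking could look roughly like
the sketch below (Python, for illustration only). The function names
are hypothetical, and since "?+w" above leaves the permission class
open, I'm assuming u+w for the lock holder:

```python
import os
import stat

# All three write bits: what "chmod a-w" clears.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def mark_unlocked(path):
    """Equivalent of `chmod a-w`: nobody may write an unlocked file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def mark_locked(path):
    """Equivalent of `chmod u+w` (assumed): the lock holder may write."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IWUSR)
```

A weaker client would skip both calls and only print a warning. Note
that on Windows os.chmod can only toggle the read-only flag, so the
group/other bits are ignored there.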



Re: Git for games working group

2018-09-23 Thread John Austin
I've been putting together a prototype file-locking implementation for
a system that plays better with git. What are everyone's thoughts on
something like the following? I'm tentatively labeling this system
git-sync or sync-server. There are two pieces:

1. A centralized repository called the Global Graph that contains the
union of the git commit graphs of all local developer repos. When Developer A
makes a local commit on branch 'feature', git-sync will automatically
push that new commit up to the global server, under a name-spaced
branch: 'developera_repoabcdef/feature'. This can be done silently as
a force push, and shouldn't ever interrupt the developer's workflow.
Simple http queries can be made to the Global Graph, such as "Which
commits descend from commit abcdefgh?"

2. A client-side tool that queries the Global Graph to determine when
your current changes are in conflict with another developer. It might
ask "Are there any commits I don't have locally that modify
lockable_file.bin?". This could either be on pre-commit, or for more
security, be part of a read-only marking system à la Git LFS. There
wouldn't be any "lock" per se; rather, the client could refuse to
modify a file if it found other commits for that file in the global
graph.

The key here is the separation of concerns. The Global Graph is fairly
dimwitted -- it doesn't know anything about file locking. But it
provides a layer of information from which we can implement file
locking on the client side (or perhaps other interesting systems).
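
To make the two pieces a bit more concrete, here is a small sketch
(Python, illustration only, not the prototype itself). The helper names
are hypothetical, the commit graph is just an in-memory dict mapping
each commit to its parents, and the exact '<developer>_<repo>/<branch>'
naming follows the example in point 1:

```python
def namespaced_refspec(developer, repo_id, branch):
    """Refspec that force-pushes a local branch to its name-spaced copy."""
    remote_branch = f"{developer}_{repo_id}/{branch}"
    # The leading '+' asks git to force-update the destination ref.
    return f"+refs/heads/{branch}:refs/heads/{remote_branch}"


def push_command(remote, developer, repo_id, branch):
    """Command line a post-commit hook could run to update the Global Graph."""
    refspec = namespaced_refspec(developer, repo_id, branch)
    return ["git", "push", "--quiet", remote, refspec]


def descendants(parents, commit):
    """Answer "which commits descend from `commit`?".

    `parents` maps each commit id to a list of its parent commit ids.
    """
    # Invert the parent links so we can walk from a commit to its children.
    children = {}
    for child, child_parents in parents.items():
        for parent in child_parents:
            children.setdefault(parent, set()).add(child)

    seen, stack = set(), [commit]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen
```

The query server would answer the same kind of question over HTTP; the
dict here only stands in for whatever storage it ends up using.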

Thoughts?

On Mon, Sep 17, 2018 at 10:23 AM Ævar Arnfjörð Bjarmason wrote:
>
>
> On Mon, Sep 17 2018, Joey Hess wrote:
>
> > Ævar Arnfjörð Bjarmason wrote:
> >> There's surely other aspects of that square peg of large file tracking
> >> not fitting the round hole of file locking, the point of my write-up was
> >> not that *that* solution is perfect, but there's prior art here that's
> >> very easily adopted to distributed locking if someone wanted to scratch
> >> that itch, since the notion of keeping a log of who has/hasn't gotten a
> >> file is very similar to a log of who has/hasn't locked some file(s) in
> >> the tree.
> >
> > Actually they are fundamentally very different. git-annex's tracking of
> > locations of files is eventually consistent, which of course means that
> > at any given point in time it may be currently inconsistent. That is
> > fine for tracking locations of files, but not for locking.
> >
> > When git-annex needs to do an operation that relies on someone else's
> > copy of a file actually being present, it uses real locking. That
> > locking is not centralized, instead it relies on the connections between
> > git repositories. That turns out to be sufficient for git-annex's own
> > locking needs, but it would not be sufficient to avoid file edit
> > conflict problems in eg a split brain situation.
>
> Right, all of that's true. I forgot to explicitly say what I meant by
> "locking" in this context. Clearly it's not suitable for something like
> actual file locking (in the sense of flock() et al), but rather just
> advisory locking in the loosest sense of the word, i.e. some git-ish way
> of someone writing on the office whiteboard "unless you're Bob, don't
> touch main.c today Tuesday Sep 17th, he's hacking on it".
>
> So just a way to have some eventually consistent side channel to pass
> such a message through git. Something similar to what git-annex does
> with its "git-annex" branch would work for that, as long as everyone who
> wanted to get such messages ran some equivalent of "git annex sync" in a
> timely manner (or checked the office whiteboard every day...).
>
> Such a schema is never going to be 100% reliable even in centralized
> source control systems, e.g. even with cvs/perforce you might pull the
> latest changes, then go on a plane and edit the locked main.c. Then the
> lock has "failed" in the sense of "the message didn't get there in time,
> and two people who could have just picked different areas to work on
> made conflicting edits".
>
> As noted upthread this isn't my use-case, I just wanted to point out the
> git-annex method of distributing metadata as a bolt-on to git as
> interesting prior art. If someone wants "truly distributed, but with
> file locking like cvs/perforce" something like what git-annex is doing
> would probably work for them.
>



Re: Git for games working group

2018-09-16 Thread John Austin
Thanks for all the thoughts so far -- I'm going to try to collate some
of my responses to avoid this getting too lengthy.

## Regarding Merging / Diffing
A couple of folks have suggested that we could improve merging /
diffing of binary files in general. I think this is useful, but can
only ever result in minor improvements, for the following reasons:

1. Game developers use an incredible number of proprietary file
formats: Maya, Houdini, Photoshop, Wwise, Unreal UAssets, etc. At the
end of the day, it's fairly unlikely that we can build visual merge
tools for these asset types without an enormous amount of corporate
support.

2. Merging doesn't have a meaning for many types of files. I think git
has trained us that everything is merge-able, but that's not always
the case. If you gave an audio designer two voice-over audio files and
asked them to merge them, they'd give you a pretty strange look. You
have to re-record it from scratch. Content files can be highly
intertwined and highly subjective: as a textual metaphor, every line
of content conflicts with every other line. Even if you had a perfect
merge tool, it just doesn't make much sense to try to merge changes,
unless it's an incredibly simple change.

## Regarding File Locking:
File locking works well enough in Perforce, but there are a couple of
issues I've found using file locking in LFS or in Gitolite (hadn't
seen this before, thanks!).

1. File Locking is an 'active' system. File Locking adds extra
operations that must be taken, both before writing to a file and then
after finishing your changes. Artists either must drop down to a
terminal (unlikely), or we must integrate our file-locking system with
existing artist tools (a large amount of work). Either way it adds a
lot of extra grunt-work. Imagine having to manually mark which files
you modify rather than just using git status. One of git's biggest
benefits is removing this type of manual labor.

2. File Locking doesn't extend well across branches. Acquiring a lock
usually blocks modifications to this file across all branches. This
cuts off basic branching models and features (like having release
branches) that are a large part of why git is so successful.

3. It's not entirely sound. Developer A can modify 'binary.bin', and
push the changes to master. Developer B, who is behind master by a
couple of days, can then unknowingly acquire the lock and make further
changes ignoring A's new commit. When B attempts to push, they will
get conflicts. If you look closely, this is a symptom of issue 2:
locking doesn't understand branches.

## "Implicit" Locking

Instead, I think it's better to think about how we can use the
structure of the git graph to solve the issue. Imagine the following
pre-commit hook for a developer attempting to commit 'binary.bin':

If there exists any commit touching binary.bin on a different branch
that is not integrated into this branch, block the commit.

In this case, making a commit with a file blocks others from touching
it, until they pull in that commit. To make the parallel, making a
commit acquires a 'lock' on the file, but there's no release. The only
requirement is that you always modify the latest version of the file.
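
As a sketch of that rule (Python, illustration only; the function names
and the in-memory graph are hypothetical, and a real hook would get this
information from git or from a server rather than from dicts):

```python
def ancestors(parents, tip):
    """All commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def blocking_commits(parents, touched, head, path):
    """Commits that modify `path` and are not integrated into `head`.

    `parents` maps commit id -> list of parent ids.
    `touched` maps commit id -> set of paths that commit modifies.
    """
    integrated = ancestors(parents, head)
    return sorted(
        commit
        for commit, files in touched.items()
        if path in files and commit not in integrated
    )


def may_commit(parents, touched, head, path):
    """The pre-commit rule: allow the commit only if nothing blocks it."""
    return not blocking_commits(parents, touched, head, path)


# Developer A committed binary.bin in a1; Developer B branched from base.
parents = {"base": [], "a1": ["base"], "b1": ["base"], "b2": ["b1", "a1"]}
touched = {"a1": {"binary.bin"}, "b1": {"readme.txt"}}

# B at b1 has not pulled in a1 yet, so B's commit to binary.bin is blocked.
assert not may_commit(parents, touched, "b1", "binary.bin")
# After merging a1 (commit b2), B is on the latest version and may commit.
assert may_commit(parents, touched, "b2", "binary.bin")
```
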

This has issues of its own, and it's a simplification of the system I
have in mind. It means Developer A needs to have information about the
commit graph local to Developer B's machine (but notably not the
files). However I think it is a better starting place for thinking
about these sorts of systems. The locks fall implicitly from the
commit graph structure, so it plays well with all of your normal git
commands. You can branch, cherry-pick, rebase, etc without any extra
support or aliases. I'll write up something a bit more detailed in a
bit.

- JA

On Sun, Sep 16, 2018 at 7:55 AM Ævar Arnfjörð Bjarmason wrote:
>
>
> On Sat, Sep 15 2018, Taylor Blau wrote:
>
> > On Fri, Sep 14, 2018 at 02:09:12PM -0700, John Austin wrote:
> >> I've been working myself on strategies for handling binary conflicts,
> >> and particularly how to do it in a git-friendly way (i.e. avoiding as
> >> much centralization as possible and playing into the commit/branching
> >> model of git).
> >
> > Git LFS handles conflict resolution and merging over binary files with
> > two primary mechanisms: (1) file locking, and (2) use of a merge-tool.
> >
> >   1. is the most "non-Git-friendly" solution, since it requires the use
> >  of a centralized Git LFS server (to be run alongside your remote
> >  repository) and that every clone phones home to make sure that they
> >  are OK to acquire a lock.
> >
> >  The workflow that we expect is that users will run 'git lfs lock
> >  /path/to/file' any time they want to make a change to an
> >  unmergeable file, and that this call first checks to make sure
> >  that they ar

Re: Git for games working group

2018-09-16 Thread John Austin
> Right, though this still subjects the remote copy to all of the
> difficulty of packing large objects (though Christian's work to support
> other object database implementations would go a long way to help this).

Ah, interesting -- I didn't realize this step was part of the
bottleneck. I presumed git didn't do much more than perhaps gzip'ing
binary files when it packed them up. Or do you mean the growing cost
of storing the objects locally as you work? Perhaps that could be
solved by allowing the client more control (i.e. delete the oldest
blobs that exist on the server).



Re: Git for games working group

2018-09-14 Thread John Austin
> There's also the nascent "don't fetch all the blobs" work-in-progress
> clone mode which might be of interest to you:
> https://blog.github.com/2018-09-10-highlights-from-git-2-19/#partial-clones

Yes! I've been pretty excited about this functionality. It drives a
lot of GVFS/VFS for Git under the hood. I think it's a great solution
to the repo-size issue.

> Is this just a reference to the advisory locking mode perforce/cvs
> etc. have or is there something else at play here?

Good catch. I actually phrased this precisely to avoid calling it
"File Locking".

An essential example would be a team of 5 audio designers working
together on the SFX for a game. If one designer wants to add a layer
of ambience to 40% of the .wav files, they have to coordinate with
everyone else on the project manually. Without coordination, this
designer will clobber any changes others made to these files in the
meantime. File Locking is the way that Perforce manages this, where a
developer can exclusively block modifications on a set of files across
the entire team.

File locking is just one solution to the problem. It's also one that
doesn't play well with git's decentralized structure and branching
model. I would state the problem more generally:
Developers need some way to know, as early as possible, if modifying a
file will cause conflicts upstream.

Optionally this knowledge can block modifying the file directly (if
we're certain there's already a conflicting version of the file on a
different branch).

JA



Re: Git for games working group

2018-09-14 Thread John Austin
Hey Taylor,

Great to have your support! I think LFS has done a great job so far
solving the large file issue. I've been working myself on strategies
for handling binary conflicts, and particularly how to do it in a
git-friendly way (i.e. avoiding as much centralization as possible and
playing into the commit/branching model of git). I've got to a loose
design that I like, but it'd be good to get some feedback, as well as
hearing what other game devs would want in a binary conflict system.

- John

On Fri, Sep 14, 2018 at 12:00 PM Taylor Blau wrote:
>
> Hi John,
>
> On Fri, Sep 14, 2018 at 10:55:39AM -0700, John Austin wrote:
> > Is anyone interested in contributing/offering insights? I suspect most
> > folks here are git users as is, but if you know someone stuck on
> > Perforce, I'd love to chat with them!
>
> I'm thrilled that other folks are interested in this, too. I'm not a
> video game developer myself, but I am the maintainer of Git LFS. If
> there's a capacity in which I could be useful to this group, I'd be more
> than happy to offer myself in that capacity.
>
> I'm cc-ing in brian carlson, Lars Schneider, and Preben Ingvaldsen on
> this email, too, since they all serve on the core team of the project.
>
> Thanks,
> Taylor
>



Git for games working group

2018-09-14 Thread John Austin
Hey all,

I've been putting together a working group for game studios wanting to
use Git. There are a couple of blockers that keep most game and media
companies on Perforce or others, but most would love to use git if it
were feasible.

The biggest tasks I'd like to tackle are:
 - improvements to large file management (mostly solved by LFS, GVFS)
 - avoiding excessive binary file conflicts (this is one of the big
reasons most studios are on Perforce)

Is anyone interested in contributing/offering insights? I suspect most
folks here are git users as is, but if you know someone stuck on
Perforce, I'd love to chat with them!

Happy to field thoughts in this thread or answer other questions about
why git doesn't work for games at the moment.

Cheers,
JA