Re: git-p4 Question
On 04/20/2015 09:41 AM, FusionX86 wrote:
> Hopefully this is an appropriate place to ask questions about git-p4. I
> started at a company that wants to migrate from Perforce to Git. I'm new
> to Perforce and have been trying to learn just enough about it to get
> through this migration.

You might also like to check out my git-p4raw project, which imports directly from the raw repository files into a git repo using git fast-import:

  http://github.com/samv/git-p4raw

Apparently it's my most popular github project :-). YMMV.

Sam.
--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: weaning distributions off tarballs: extended verification of git tags
On 03/02/2015 12:08 PM, Junio C Hamano wrote:
>> I have a hazy recollection of what it would take to replace SHA-1 in
>> git with something else; it should be possible (though tricky) to do it
>> lazily, where a tree entry has bits (eg, some of the currently unused
>> file mode bits) to denote which hash algorithm is in use for the entry.
>> However I don't think that got past the idea stage...
>
> I think one reason why it didn't was because it would not work well.
> That bit that tells "this is a new object or old" would mean that a
> single tree can have many different object names, depending on which of
> its component entries are using that bit and which aren't. There goes
> the "we know two trees with the same object name are identical without
> recursing into them" optimization out the window. Also it would make it
> impossible to do what you suggest to Joey, i.e. "exactly the same way
> that git does", once you start saying that a tree object can be encoded
> in more than one different way, wouldn't it?

I was reasoning that people would rather not have to rewrite their whole history in order to switch checksum algorithms, and that allowing trees to be lazily converted would make things more efficient. However, I see your point that this doesn't work.

If, instead, the hash algorithm were recorded as a per-commit header, then only the first commit which changes the hashing algorithm would have to re-checksum each of the files: but just those in the current tree, not all the way back to the beginning of history. The delta logic should not have to care, and these objects with the same content but different object IDs should pack perfectly, so long as git-pack-objects knows to re-checksum objects with the available hash algorithms and spot matches. Other operations like diff which span commit hashing algorithms might be able to get away with their existing object ranking algorithms, caching alternate object IDs for content as they operate to facilitate exact matching across hash algorithm changes.
But actually, for the original problem (just producing a signature with a different hashing algorithm), it would probably be sufficient to just re-hash the current commit and the current tree recursively, and the mixed hash-algorithm case does not need to exist. But I'm just thinking it might not be too hard to make git nicely generic, to be well prepared for when a second pre-image attack on SHA-1 becomes practical.

Sam
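The "re-hash the current tree recursively under a new algorithm" idea can be made concrete with a toy model. The sketch below is Python using only hashlib; the `obj_id` helper follows git's real "<type> <size>\0<content>" header convention, but the tree serialization, file names, and `hash_tree` layout are invented for illustration (real git trees store file modes and binary ids, not this text form):

```python
import hashlib

def obj_id(kind, payload, algo):
    # git-style object name: hash of "<type> <size>\0<content>"
    data = b"%s %d\x00" % (kind.encode(), len(payload)) + payload
    return hashlib.new(algo, data).hexdigest()

def hash_tree(tree, algo):
    # tree: mapping of name -> bytes (a blob) or dict (a subtree).
    # Every entry is re-hashed bottom-up when the algorithm changes,
    # but only once for the current tree, not for all of history.
    entries = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            entries.append(b"tree %s %s" % (name.encode(), hash_tree(node, algo).encode()))
        else:
            entries.append(b"blob %s %s" % (name.encode(), obj_id("blob", node, algo).encode()))
    return obj_id("tree", b"\n".join(entries), algo)

tree = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}
print(hash_tree(tree, "sha1"))    # 40-hex id under the old algorithm
print(hash_tree(tree, "sha256"))  # 64-hex id under the new one
```

The point of the sketch is that the two ids name the same content, so a signature could be made over the new-algorithm id without rewriting any stored history.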
Re: weaning distributions off tarballs: extended verification of git tags
On 03/02/2015 10:12 AM, Joey Hess wrote:
> I support this proposal, as someone who no longer releases tarballs of
> my software when I can possibly avoid it. I have worried about signed
> tags / commits only being a SHA-1 break away from useless.

As to the implementation, checksumming the collection of raw objects is certainly superior to tar. Colin had suggested sorting the objects by checksum, but I don't think that is necessary. Just stream the commit object, then its tree object, followed by the content of each object listed in the tree, recursing into subtrees as necessary. That will be a stable stream for a given commit or tree.

I would really just do it exactly the same way that git does: checksum the objects, including their headers, with the new hashes.

I have a hazy recollection of what it would take to replace SHA-1 in git with something else; it should be possible (though tricky) to do it lazily, where a tree entry has bits (eg, some of the currently unused file mode bits) to denote which hash algorithm is in use for the entry. However I don't think that got past the idea stage...

Sam
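Concretely, git already names every object by hashing a small header plus the content, so "doing it the same way with new hashes" only means swapping the hash function. A minimal Python sketch (the header format here is git's real one; the SHA-256 name is shown for illustration):

```python
import hashlib

def git_object_id(kind, content, algo="sha1"):
    # git names objects by hashing "<type> <size>\0<content>";
    # switching algorithms changes only the hash function, not the layout.
    header = b"%s %d\x00" % (kind.encode(), len(content))
    return hashlib.new(algo, header + content).hexdigest()

blob = b"hello\n"
print(git_object_id("blob", blob))            # ce013625030ba8dba906f756967f9e9ca394464a
print(git_object_id("blob", blob, "sha256"))  # the same content under a new hash
```

The SHA-1 result matches what `git hash-object` prints for a file containing "hello\n".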
Re: cherry picking and merge
On 08/01/2014 10:48 AM, Mike Stump wrote:
>> There is also git-imerge, a third-party tool that is intended to help
>> merge changes (and make it possible to do it in an incremental way).
>
> Then remove git merge and replace it with git-imerge. :-) Anyway, I
> read that, and I can see some beauty of that that might be nice in
> complex merges. The problem is, I want git merge to work.

Git merge has a notion of discrete merge strategies. The default, recursive merge strategy isn't completely oblivious to history; in the event that the two branches don't have a single merge base, it performs 3-way merges (strangely enough) recursively, with the merge bases of the branches you're trying to merge, until it completes. In general, this works pretty well. Some systems even simpler than that (eg, github's green merge button) work acceptably as well.

There's no particular reason that you couldn't implement a merge strategy which works more like SVN's approach, which essentially does an internal rebase and then commits the result. The advantage of a rebase in this situation is that you get to eliminate changes which don't need to be applied, either because (in SVN's case) it had some metadata/hearsay information that told it that it could skip that change, or (in git's case) because it found content/facts showing the change was already applied on one side.

However, there are corresponding disadvantages to this strategy. It's just as easy to contrive a situation where this internal rebasing doesn't do the right thing, even without cheating by getting the metadata wrong. And besides, there's already a way to do this: do an actual rebase. You could also do a rebase, and then if, say, the original branch you're rebasing is published and you don't want to rewrite it, you can easily enough use squash merging, merge -s ours, etc. to make it look like the strategy you wanted was a built-in git merge strategy.
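To make the "recursive" part of the default strategy concrete: when two branches have more than one merge base (the classic criss-cross case), the bases themselves are merged into a virtual ancestor before the final 3-way merge. Here is a toy Python sketch of just the merge-base computation on a hypothetical commit graph; it illustrates the idea and is not git's actual implementation:

```python
def ancestors(parents, c):
    # all commits reachable from c (including c) in the parents DAG
    seen, stack = set(), [c]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(parents.get(n, []))
    return seen

def merge_bases(parents, a, b):
    # "best" common ancestors: common ancestors that are not strict
    # ancestors of any other common ancestor
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c in ancestors(parents, o) - {o} for o in common)}

# Hypothetical criss-cross history: B and C were merged into each other on
# two branches, so L and R end up with *two* merge bases.
parents = {"A": [], "B": ["A"], "C": ["A"],
           "L": ["B", "C"], "R": ["C", "B"]}
print(sorted(merge_bases(parents, "L", "R")))  # ['B', 'C']
```

With a single merge base the set has one element and a plain 3-way merge suffices; with two or more, the recursive strategy first merges the bases with each other and uses the result as the ancestor.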
Or, in the spirit of open source, you could contribute the code required to make 'imerge' a built-in strategy.

> I was curious if svn handles this better, the same, or worse, and it
> did it just fine. I know that a while ago, svn could not handle this;
> it would do what git does currently. Apparently they figured out it
> was a bug and fixed it. Have you guys figured out it is a bug yet? The
> first step in solving a problem is admitting you have a problem.

So, I have to chuckle when I read this indignant comment. There's a funny story to the "while ago" you refer to. This refers to the time period during which SVN was relevant: about versions 1.4 and earlier (being generous). Back in those days, SVN projects for the most part avoided merging, because it was so problematic and not tracked at all. As one core SVN developer said to me, they found teams collaborate more closely if they're all working on the same branch. Sure, you could do it, and I even know of a few communities who did, but by and large it was avoided.

Then the new wave of version control systems, including Git, bzr and Mercurial, were cropping up, and their merges were actually good enough that you could practically use them. The SVN core team had to keep pace to match. So, in 1.5 the merge tracking system, previously only supplied as a contrib script, became core.

This is ironic, because the version control system which SVN imitated poorly, Perforce, had a very sophisticated, if over-complicated, merge tracking system which was also based on metadata: per-branch, per-patch, per-file entries for whether or not a patch had been integrated into the target branch. I can only guess that the reason they didn't implement this in the original SVN version was that it was something of a pain point for users in Perforce.
Possibly it had something to do with the way that Perforce would store double entries for each merge (yes: two rows in a relational store, one representing the mirror image of the other), and differentiated between many different forms of "integrated" (ie, 2 rows and 4 states instead of, say, a single bit). So the underlying data model wasn't as simple as it could have been, and this was reflected in the difficult-to-use command-line tools. Plus, they were using BerkeleyDB for metadata instead of the relational ISAM library, and debugging a rabbit's nest of merge records such as Perforce used would have been a nightmare. They didn't go there. And besides, they found that, often, detecting patches as already applied based on content, like 'patch' did, worked.

Prior to 1.5, the Perl community developed SVK, an offline version of SVN, and this had a far simpler model for merge tracking, more similar to git's: just tracking whole-branch merges rather than individual files, patches, and branches. SVN eventually added two separate ways of tracking merges: either a per-file, per-branch, per-commit model or a per-branch, per-commit model.

Anyway, I'm not sure where I'm going with this, but I guess a little extra perspective would be useful!

Sam
--
Re: [git-users] worlds slowest git repo- what to do?
On 05/15/2014 12:06 PM, Philip Oakley wrote:
> From: John Fisher fishook2...@gmail.com
>> I assert, based on one piece of evidence (a post from a facebook dev),
>> that I now have the world's biggest and slowest git repository, and I
>> am not a happy guy. I used to have the world's biggest CVS repository,
>> but CVS can't handle multi-G sized files. So I moved the repo to git,
>> because we are using that for our new projects.
>> goal: keep 150 G of files (mostly binary), from tiny sized to over 8G,
>> in a version-control system.
>> problem: git is absurdly slow, think hours, on fast hardware.
>> question: any suggestions beyond these?
>>   http://git-annex.branchable.com/
>>   https://github.com/jedbrown/git-fat
>>   https://github.com/schacon/git-media
>>   http://code.google.com/p/boar/
>>   subversion

You could shard: break the problem up into smaller repositories, eg via submodules. Try ~128 shards, and I'd expect that 129 small clones should complete faster than a single 150G clone, as well as being resumable etc.

The first challenge will be figuring out what to shard on, and how to lay out the repository. You could have all of the large files in their own directory, and then the main repository just has symlinks into the sharded area. In that case, I would recommend sharding by date of the introduced blob, so that there's a good chance you won't need to clone everything forever, as shards with not many files in the current version could in theory be retired. Or, if the directory structure already suits it, you could directly use submodules.

The second challenge will be writing the filter-branch script for this :-)

Good luck,
Sam
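The date-sharded symlink layout above could be planned with a short script. This is a sketch only: the paths, the per-year shard naming, and the `introduced` dates are all hypothetical, and the real work of extracting the blobs into shards would still be a filter-branch job:

```python
import datetime
import pathlib

SHARDS = pathlib.Path("shards")

def shard_for(introduced: datetime.date) -> pathlib.Path:
    # one shard repository per year the blob first appeared, so old
    # shards stop growing and can eventually be skipped by clones
    return SHARDS / ("blobs-%d" % introduced.year)

def plan_moves(files):
    # files: {relative/path: date the blob was introduced}
    # returns {relative/path: destination inside its shard}; the main
    # repository would keep a symlink at the original path
    return {path: shard_for(d) / path.replace("/", "_")
            for path, d in files.items()}

files = {"assets/big.iso": datetime.date(2009, 4, 1),
         "assets/huge.bin": datetime.date(2013, 7, 9)}
for src, dst in sorted(plan_moves(files).items()):
    print(src, "->", dst)
```

Flattening the path with `_` keeps each shard a single directory; any layout works as long as the main repo's symlinks point at it consistently.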
Re: [PATCH v2] diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make rename visible.
On 10/11/2013 06:07 AM, Yoshioka Tsuneo wrote:
> +	prefix_len = ((prefix_len >= 0) ? prefix_len : 0);
> +	strncpy(pre_arrow, arrow - prefix_len, prefix_len);
> +	pre_arrow[prefix_len] = '¥0';

This last line looks like an encoding mistake: was '¥0' supposed to be an ASCII backslash, ie the NUL terminator '\0'? (The Yen sign is what the backslash renders as in some Japanese codepages.)

Sam
Re: [PATCH v3 2/2] git-svn.perl: keep processing all commits in parents_exclude
On 08/11/2012 10:14 AM, Steven Walter wrote:
> This fixes a bug where git finds the incorrect merge parent. Consider a
> repository with trunk, branch1 of trunk, and branch2 of branch1.
> Without this change, git interprets a merge of branch2 into trunk as a
> merge of branch1 into trunk.
>
> Signed-off-by: Steven Walter stevenrwal...@gmail.com
> ---
>  git-svn.perl                                     |  1 -
>  t/t9164-git-svn-fetch-merge-branch-of-branch2.sh | 53 ++
>  2 files changed, 53 insertions(+), 1 deletion(-)
>  create mode 100755 t/t9164-git-svn-fetch-merge-branch-of-branch2.sh
>
> diff --git a/git-svn.perl b/git-svn.perl
> index abcec11..c4678c1 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -3623,7 +3623,6 @@ sub parents_exclude {
>  			if ( $commit eq $excluded ) {
>  				push @excluded, $commit;
>  				$found++;
> -				last;
>  			}

I could believe that, too. I like this change: one line of code, 53 lines of test, and a paragraph of explanation :-).

Cheers,
Sam.
Re: [PATCH v3 2/2] git-svn.perl: keep processing all commits in parents_exclude
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 08/18/2012 01:43 PM, Steven Walter wrote:
> How about a Signed-Off-By?

Signed-Off-By: Sam Vilain s...@vilain.net

Sam
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBCgAGBQJQMCcnAAoJEBdtaL3wGtIoJ1UIAIJ6Xz5OEMmMk1tq546eggHg
I+sJIFjqg+mo53VqT0/bKhqg8sLx8F/Gda15nwOUMcslKJdA+sCc+QhAtgSWJ1WK
Idw59jtZHbabfopBHNgneSqVBhXSKpNw3e3EvlRVkK1wobO0+c0X6YkBG0eBCZl2
6RYXIAb6jX04k1hSrnxcPn+REkoyl31aEuFBPNz0wRWHjju+G6bPY/x7D/gO1YOc
/uRQXveQngJOLwawDR+dGS+0aWPseX/sbZqsVFo0hVQYqoHt+s4uVuriBfHSRKd+
R1eUoY0ikW4UvEwZX74Zf3SeoVLLFnkCW8B5XsGb10IojbvY3uyYevATXI79j1Y=
=Lb7H
-----END PGP SIGNATURE-----