Re: easy way to demonstrate length of colliding SHA-1 prefixes?
On Mon, Dec 03, 2018 at 02:30:44PM -0800, Matthew DeVore wrote:

> Here is a one-liner to do it. It is Perl line noise, so it's not very cute,
> though that is subjective. The output shown below is for the Git project
> (not Linux) repository as I've currently synced it:
>
> $ git rev-list --objects HEAD | sort | perl -anE 'BEGIN { $prev = ""; $long
> = "" } $n = $F[0]; for my $i (reverse 1..40) {last if $i < length($long); if
> (substr($prev, 0, $i) eq substr($n, 0, $i)) {$long = substr($prev, 0, $i);
> last} } $prev = $n; END {say $long}'

Ooh, object-collision golf.

Try:

  git cat-file --batch-all-objects --batch-check='%(objectname)'

instead of "rev-list | sort". It's _much_ faster, because it doesn't
have to actually open the objects and walk the graph.

Some versions of uniq have "-w" (including GNU, but it's definitely not
in POSIX), which lets you do:

  git cat-file --batch-all-objects --batch-check='%(objectname)' |
  uniq -cdw 7

to list all collisions of length 7 (it will show just the first item
from each group, but you can use -D to see them all).

> > You'll always need to list them all. It's inherently an operation where
> > for each SHA-1 you need to search for other ones with that prefix up to
> > a given length.
> >
> > Perhaps you've missed that you can use --abbrev=N for this, and just
> > grep for things that are longer than that N, e.g. for linux.git:
> >
> > git log --oneline --abbrev=10 --pretty=format:%h |
> > grep -E -v '^.{10}$' |
> > perl -pe 's/^(.{10}).*/$1/'
>
> I think the goal was to search all object hashes, not just commits. And git
> rev-list --objects will do that.

You can add "-t --raw" to see the abbreviated tree and blob names,
though it gets tricky around handling merges.

-Peff
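
Where uniq has no "-w", a portable sketch of the same check is to truncate each name with cut before handing it to "uniq -d", which needs only POSIX options. The helper name colliding_prefixes is made up for illustration, and the input is assumed to be one full object name per line:

```shell
# Hypothetical helper (not part of git): print each N-character prefix
# that is shared by two or more of the object names read from stdin.
# Usage: colliding_prefixes N
colliding_prefixes() {
    cut -c "1-$1" |   # keep only the first N characters of each name
    sort |            # group identical prefixes together
    uniq -d           # print one copy of each prefix that repeats
}
```

It could be fed from the command above, e.g. "git cat-file --batch-all-objects --batch-check='%(objectname)' | colliding_prefixes 7". Unlike the "uniq -w" form, it prints only the shared prefix, not a full object name from each group.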
Re: easy way to demonstrate length of colliding SHA-1 prefixes?
On 12/02/2018 05:23 AM, Ævar Arnfjörð Bjarmason wrote:
> On Sun, Dec 02 2018, Robert P. J. Day wrote:
>
>> as part of an upcoming git class i'm delivering, i thought it would
>> be amusing to demonstrate the maximum length of colliding SHA-1
>> prefixes in a repository (in my case, i use the linux kernel git repo
>> for most of my examples).
>>
>> is there a way to display the objects in the object database that
>> clash in the longest object name SHA-1 prefix; i mean, short of
>> manually listing all object names, running that through cut and sort
>> and uniq and ... you get the idea.
>>
>> is there a cute way to do that? thanks.

Here is a one-liner to do it. It is Perl line noise, so it's not very
cute, though that is subjective. The output shown below is for the Git
project (not Linux) repository as I've currently synced it:

$ git rev-list --objects HEAD | sort | perl -anE 'BEGIN { $prev = ""; $long = "" } $n = $F[0]; for my $i (reverse 1..40) {last if $i < length($long); if (substr($prev, 0, $i) eq substr($n, 0, $i)) {$long = substr($prev, 0, $i); last} } $prev = $n; END {say $long}'
c68038ef

$ git cat-file -t c68038ef
error: short SHA1 c68038ef is ambiguous
hint: The candidates are:
hint:   c68038effe commit 2012-06-01 - vcs-svn: suppress a signed/unsigned comparison warning
hint:   c68038ef00 blob
fatal: Not a valid object name c68038ef

> You'll always need to list them all. It's inherently an operation where
> for each SHA-1 you need to search for other ones with that prefix up to
> a given length.
>
> Perhaps you've missed that you can use --abbrev=N for this, and just
> grep for things that are longer than that N, e.g. for linux.git:
>
>     git log --oneline --abbrev=10 --pretty=format:%h |
>     grep -E -v '^.{10}$' |
>     perl -pe 's/^(.{10}).*/$1/'

I think the goal was to search all object hashes, not just commits. And
git rev-list --objects will do that.
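
For readers who find the Perl hard to follow, the same idea can be sketched with sort and awk. Once the names are sorted, the longest shared prefix always occurs between two adjacent names, so a single pass comparing each name with its predecessor is enough. The helper name longest_shared_prefix is hypothetical, and the input is assumed to carry the object name in the first field of each line, as "git rev-list --objects" prints it:

```shell
# Hypothetical helper (not part of git): read object names from stdin
# (first whitespace-separated field of each line) and print the longest
# prefix shared by any two of them.
longest_shared_prefix() {
    LC_ALL=C sort | awk '
        NR > 1 {
            # Count leading characters this name shares with the previous one.
            n = 0
            while (n < length($1) && substr($1, n + 1, 1) == substr(prev, n + 1, 1))
                n++
            if (n > best) { best = n; prefix = substr($1, 1, n) }
        }
        { prev = $1 }
        END { print prefix }
    '
}
```

It would be used as "git rev-list --objects HEAD | longest_shared_prefix". It prints an empty line when no two names share even their first character.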
Re: easy way to demonstrate length of colliding SHA-1 prefixes?
On Sun, 2 Dec 2018, Ævar Arnfjörð Bjarmason wrote:

> On Sun, Dec 02 2018, Robert P. J. Day wrote:
>
> > as part of an upcoming git class i'm delivering, i thought it
> > would be amusing to demonstrate the maximum length of colliding
> > SHA-1 prefixes in a repository (in my case, i use the linux kernel
> > git repo for most of my examples).
> >
> > is there a way to display the objects in the object database
> > that clash in the longest object name SHA-1 prefix; i mean, short
> > of manually listing all object names, running that through cut and
> > sort and uniq and ... you get the idea.
> >
> > is there a cute way to do that? thanks.
>
> You'll always need to list them all. It's inherently an operation
> where for each SHA-1 you need to search for other ones with that
> prefix up to a given length.

i assumed as much, just wasn't sure about the esoteric dark corners
of git i've never gotten to yet.

> Perhaps you've missed that you can use --abbrev=N for this, and just
> grep for things that are longer than that N, e.g. for linux.git:
>
>     git log --oneline --abbrev=10 --pretty=format:%h |
>     grep -E -v '^.{10}$' |
>     perl -pe 's/^(.{10}).*/$1/'
>
> This will list the 4 objects that need more than 10 characters to be
> shown unambiguously. If you then "git cat-file -t" them you'll get
> the disambiguation help.

that's pretty close to what i came up with, thanks.

rday

--
Robert P. J. Day                         Ottawa, Ontario, CANADA
http://crashcourse.ca/dokuwiki
Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
Re: easy way to demonstrate length of colliding SHA-1 prefixes?
On Sun, Dec 02 2018, Robert P. J. Day wrote:

> as part of an upcoming git class i'm delivering, i thought it would
> be amusing to demonstrate the maximum length of colliding SHA-1
> prefixes in a repository (in my case, i use the linux kernel git repo
> for most of my examples).
>
> is there a way to display the objects in the object database that
> clash in the longest object name SHA-1 prefix; i mean, short of
> manually listing all object names, running that through cut and sort
> and uniq and ... you get the idea.
>
> is there a cute way to do that? thanks.

You'll always need to list them all. It's inherently an operation where
for each SHA-1 you need to search for other ones with that prefix up to
a given length.

Perhaps you've missed that you can use --abbrev=N for this, and just
grep for things that are longer than that N, e.g. for linux.git:

    git log --oneline --abbrev=10 --pretty=format:%h |
    grep -E -v '^.{10}$' |
    perl -pe 's/^(.{10}).*/$1/'

This will list the 4 objects that need more than 10 characters to be
shown unambiguously. If you then "git cat-file -t" them you'll get the
disambiguation help.
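
The trick works because --abbrev=N only sets a minimum length: git lengthens an abbreviation until it is unambiguous, so any name printed with more than N characters collides with another object in its first N. A sketch of the filter with the length as a parameter, using awk in place of grep and perl, might look like this (the helper name ambiguous_at is made up for illustration):

```shell
# Hypothetical helper (not part of git): from abbreviated names on stdin,
# print the first N characters of every name that git had to lengthen
# beyond N to keep it unambiguous.
# Usage: ambiguous_at N
ambiguous_at() {
    awk -v n="$1" 'length($0) > n { print substr($0, 1, n) }'
}
```

For example: "git log --abbrev=10 --pretty=format:%h | ambiguous_at 10". Fed from "git log", it only sees commit names.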
easy way to demonstrate length of colliding SHA-1 prefixes?
as part of an upcoming git class i'm delivering, i thought it would
be amusing to demonstrate the maximum length of colliding SHA-1
prefixes in a repository (in my case, i use the linux kernel git repo
for most of my examples).

is there a way to display the objects in the object database that
clash in the longest object name SHA-1 prefix; i mean, short of
manually listing all object names, running that through cut and sort
and uniq and ... you get the idea.

is there a cute way to do that? thanks.

rday