Re: SHA-1 collision in repository?

2018-03-07 Thread Myria
The fulltext whose checksum is 80a10d37de91cadc604ba30e379651b3 I found out is the first 16384 bytes of the file (see other parts of this thread). 16384 is SVN__STREAM_CHUNK_SIZE. On Fri, Mar 2, 2018 at 3:07 PM, Daniel Shahaf wrote: > Daniel Shahaf wrote on Fri, Mar 02, 2018 at 22:57:51 +: >
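The observation above (a checksum that covers only the first 16384 bytes of the file) can be checked locally with a short script. This is only a sketch, not Subversion code: it assumes the 32-hex-digit checksum is an MD5, and the function names are made up for illustration.

```python
import hashlib

# 16384 is the value of SVN__STREAM_CHUNK_SIZE quoted in the message above.
CHUNK_SIZE = 16384


def md5_of_prefix(path, length=CHUNK_SIZE):
    """Return the hex MD5 of the first `length` bytes of the file at `path`."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read(length)).hexdigest()


def md5_of_file(path):
    """Return the hex MD5 of the whole file, read in CHUNK_SIZE pieces."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If md5_of_prefix() on the problem file reproduces the reported checksum while md5_of_file() does not, that would be consistent with only the first chunk having been hashed.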

Re: SHA-1 collision in repository?

2018-03-07 Thread Myria
During rep_write_contents_close, there is a call to get_shared_rep. get_shared_rep calls svn_fs_fs__get_contents_from_file, which has the code pasted below. /* Build the representation list (delta chain). */ if (rh->type == svn_fs_fs__rep_plain) { rb->rs_list = apr_array_make(pool,

Re: SHA-1 collision in repository?

2018-03-07 Thread Nathan Hartman
On Mar 5, 2018, at 10:54 PM, Myria wrote: > > Final email for the night >.< > > What's clobbering the expanded_size is this in build_rep_list: > > /* The value as stored in the data struct. > 0 is either for unknown length or actually zero length. */ > *expanded_size = first_rep->expanded

Re: SHA-1 collision in repository?

2018-03-05 Thread Myria
anually >> add the users list back now. >> >> Below is the thread I sent. >> >> >> -- Forwarded message -- >> From: Myria >> Date: Mon, Mar 5, 2018 at 6:37 PM >> Subject: Re: SHA-1 collision in repository? >> To: Philip Mar

Re: SHA-1 collision in repository?

2018-03-05 Thread Myria
d I sent. > > > -- Forwarded message ------ > From: Myria > Date: Mon, Mar 5, 2018 at 6:37 PM > Subject: Re: SHA-1 collision in repository? > To: Philip Martin > > > I now know where the checksum error happens, but not why. > > svn: E

Fwd: SHA-1 collision in repository?

2018-03-05 Thread Myria
GMail keeps doing reply instead of reply all. I'm having to manually add the users list back now. Below is the thread I sent. -- Forwarded message -- From: Myria Date: Mon, Mar 5, 2018 at 6:37 PM Subject: Re: SHA-1 collision in repository? To: Philip Martin I now know

Re: SHA-1 collision in repository?

2018-03-04 Thread Nathan Hartman
On Mar 4, 2018, at 5:28 AM, Stefan Sperling wrote: > >> On Sat, Mar 03, 2018 at 09:08:41PM -0500, Nathan Hartman wrote: >> Does this mean that content being committed to the repository is never elided >> based on the SHA hash alone but only after a fulltext verification that the >> content actual

Re: SHA-1 collision in repository?

2018-03-04 Thread Stefan Sperling
On Sun, Mar 04, 2018 at 11:12:00AM +, Philip Martin wrote: > Stefan Sperling writes: > > > Yes. And if the content differs, it must be rejected, because an FSFS > > repository can only store one content per SHA1 checksum. > > To be accurate the server-side code can handle the files perfectly

Re: SHA-1 collision in repository?

2018-03-04 Thread Philip Martin
Myria writes: > How can I dump out the two things that Subversion thinks have the same > SHA-1 checksum but don't match? This seems to be rather difficult to do. On the server side: svnlook cat repository path-in-repository svnlook cat -r N repository path-in-repository svnlook cat -t TX

Re: SHA-1 collision in repository?

2018-03-04 Thread Philip Martin
Stefan Sperling writes: > Yes. And if the content differs, it must be rejected, because an FSFS > repository can only store one content per SHA1 checksum. To be accurate the server-side code can handle the files perfectly well if rep-caching is disabled. One can retrieve either file, dump/load

Re: SHA-1 collision in repository?

2018-03-04 Thread Myria
How can I dump out the two things that Subversion thinks have the same SHA-1 checksum but don't match? This seems to be rather difficult to do. That said, it's far more likely that there's a bug in Subversion than that we randomly collided SHA-1. On Sun, Mar 4, 2018 at 02:29 Philip Martin wrote

Re: SHA-1 collision in repository?

2018-03-04 Thread Philip Martin
Nathan Hartman writes: > Does this mean that content being committed to the repository is never > elided based on the SHA hash alone but only after a fulltext > verification that the content actually already exists in the > repository? That's correct. Fulltext matching was added in 1.9.6 and 1.
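The behaviour described in this part of the thread (share a representation only when the SHA-1 matches and the fulltext also matches, otherwise reject) can be modelled with a toy store. This is an illustrative sketch only, not how FSFS is implemented: the class, its method names, and the injectable digest function are all hypothetical.

```python
import hashlib


class ToyRepCache:
    """Toy model: one stored fulltext per digest, verified before sharing."""

    def __init__(self, digest=None):
        # `digest` is injectable only so the mismatch path can be exercised
        # without a real SHA-1 collision; the default is plain SHA-1.
        self._digest = digest or (lambda data: hashlib.sha1(data).hexdigest())
        self._by_digest = {}

    def store(self, fulltext):
        """Return 'stored' for new content, 'shared' for verified duplicates."""
        key = self._digest(fulltext)
        existing = self._by_digest.get(key)
        if existing is None:
            self._by_digest[key] = fulltext
            return "stored"
        if existing == fulltext:
            # Digest matches and the fulltext comparison confirms it.
            return "shared"
        # Same digest, different content: the toy store rejects the write.
        raise ValueError("digest %s matches but contents differ" % key)
```

With the default digest, storing the same bytes twice returns "stored" and then "shared". Passing a constant digest function forces the mismatch branch, which mirrors the error reported in this thread.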

Re: SHA-1 collision in repository?

2018-03-04 Thread Stefan Sperling
On Sat, Mar 03, 2018 at 09:08:41PM -0500, Nathan Hartman wrote: > On Mar 2, 2018, at 6:16 PM, Philip Martin wrote: > > > > Since the file being committed matched a SHA1 in the rep-cache the > > commit process will attempt to remove this delta but will first verify > > that the fulltext obtained b

Re: SHA-1 collision in repository?

2018-03-03 Thread Nathan Hartman
On Mar 2, 2018, at 6:16 PM, Philip Martin wrote: > > Since the file being committed matched a SHA1 in the rep-cache the > commit process will attempt to remove this delta but will first verify > that the fulltext obtained by expanding the delta in the protorev file > matches the fulltext in the r

Re: SHA-1 collision in repository?

2018-03-02 Thread Philip Martin
Myria writes: > I just found out that the file causing the error from the large commit > is not the large file - it's one of the smaller files, about 55 KB. > If I commit that single smaller file from the large commit, it errors > the same way as the original 227185 would. This is exactly like t

Re: SHA-1 collision in repository?

2018-03-02 Thread Daniel Shahaf
Daniel Shahaf wrote on Fri, Mar 02, 2018 at 22:57:51 +: > Myria wrote on Mon, Feb 26, 2018 at 13:41:05 -0800: > > In other news, unknown whether related to the current problem, my > > attempt to clone the repository to my local computer is failing: > > > > D:\>svnsync sync file:///d:/svnclone

Re: SHA-1 collision in repository?

2018-03-02 Thread Daniel Shahaf
Myria wrote on Thu, Mar 01, 2018 at 18:45:38 -0800: > Also, I have no control over what was in the repository five years > ago. The huge files were compiled versions of WebKit libraries. Note that in 2017, WebKit intentionally committed a SHA-1 collision into their repository. If you have the We

Re: SHA-1 collision in repository?

2018-03-02 Thread Daniel Shahaf
Myria wrote on Mon, Feb 26, 2018 at 13:41:05 -0800: > In other news, unknown whether related to the current problem, my > attempt to clone the repository to my local computer is failing: > > D:\>svnsync sync file:///d:/svnclone > Transmitting file data > ...

Re: SHA-1 collision in repository?

2018-03-02 Thread Myria
The problem is identical on Windows command line, Windows TortoiseSVN, Ubuntu-Linux, Ubuntu-Linux on Windows, and macOS. I'm just bad at GDB. On Thu, Mar 1, 2018 at 9:09 PM, Nico Kadel-Garcia wrote: > On Thu, Mar 1, 2018 at 10:25 PM, Myria wrote: >> I just found out that the file causing the er

Re: SHA-1 collision in repository?

2018-03-01 Thread Nico Kadel-Garcia
On Thu, Mar 1, 2018 at 10:25 PM, Myria wrote: > I just found out that the file causing the error from the large commit > is not the large file - it's one of the smaller files, about 55 KB. > If I commit that single smaller file from the large commit, it errors > the same way as the original 227185

Re: SHA-1 collision in repository?

2018-03-01 Thread Myria
I just found out that the file causing the error from the large commit is not the large file - it's one of the smaller files, about 55 KB. If I commit that single smaller file from the large commit, it errors the same way as the original 227185 would. This is exactly like the original problem with

Re: SHA-1 collision in repository?

2018-03-01 Thread Myria
On Wed, Feb 28, 2018 at 6:17 AM, Nico Kadel-Garcia wrote: > On Tue, Feb 27, 2018 at 4:09 PM, Myria wrote: > >> Not to mention that the two revisions complained about are unrelated, and >> 2/3 the repository history apart. >> >> One thing that's interesting is that the commit the svnsync failed on

Re: SHA-1 collision in repository?

2018-02-28 Thread Nico Kadel-Garcia
On Tue, Feb 27, 2018 at 4:09 PM, Myria wrote: > Not to mention that the two revisions complained about are unrelated, and > 2/3 the repository history apart. > > One thing that's interesting is that the commit the svnsync failed on is a > gigantic commit. It's 1.8 GB. Maybe that svnsync is fail

Re: SHA-1 collision in repository?

2018-02-28 Thread Philip Martin
Johan Corveleyn writes: > I'm wondering whether this is related to the bug that was fixed for > 1.8.x here: > > http://svn.apache.org/viewvc?view=revision&revision=1803435 > > ... or a similar problem. > I'm actually not sure whether that bugfix was released already (it's > not mentioned in CHANG

Re: SHA-1 collision in repository?

2018-02-27 Thread Johan Corveleyn
[ Please keep the users list in cc. ] On Tue, Feb 27, 2018 at 11:38 PM, Myria wrote: > On Tue, Feb 27, 2018 at 2:22 PM, Johan Corveleyn wrote: >> On Tue, Feb 27, 2018 at 10:09 PM, Myria wrote: >>> >>> On Tue, Feb 27, 2018 at 05:54 Philip Martin >>> wrote: Myria writes: > -

Re: SHA-1 collision in repository?

2018-02-27 Thread Johan Corveleyn
On Tue, Feb 27, 2018 at 10:09 PM, Myria wrote: > > On Tue, Feb 27, 2018 at 05:54 Philip Martin > wrote: >> >> Myria writes: >> >> > -bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where >> > hash='db11617ef1454332336e00abc311d44bc698f3b3'" >> > db11617ef1454332336e00abc311d44bc698f3b3|6

Re: SHA-1 collision in repository?

2018-02-27 Thread Myria
On Tue, Feb 27, 2018 at 05:54 Philip Martin wrote: > Myria writes: > > > -bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where > > hash='db11617ef1454332336e00abc311d44bc698f3b3'" > > db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680 > > > > The line from the grep -a comm

Re: SHA-1 collision in repository?

2018-02-27 Thread Philip Martin
Myria writes: > -bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where > hash='db11617ef1454332336e00abc311d44bc698f3b3'" > db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680 > > The line from the grep -a command containing that hash is below. They > all match. > text: 6044

Re: SHA-1 collision in repository?

2018-02-26 Thread Branko Čibej
On 26.02.2018 22:41, Myria wrote: > -bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where > hash='db11617ef1454332336e00abc311d44bc698f3b3'" > db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680 > > The line from the grep -a command containing that hash is below. They > all m

Re: SHA-1 collision in repository?

2018-02-26 Thread Myria
-bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where hash='db11617ef1454332336e00abc311d44bc698f3b3'" db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680 The line from the grep -a command containing that hash is below. They all match. text: 604440 34 134255 136680 c9f4fabc4
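The "they all match" comparison above can be scripted. The sketch below parses the pipe-separated sqlite3 output and the leading numeric fields of the "text:" line, then compares them. The column names (hash, revision, offset, size, expanded_size) are an assumption about the rep_cache table layout and the node-revision line, and the function names are hypothetical.

```python
# Assumed column order of a rep_cache row as printed by the sqlite3 CLI.
REP_CACHE_COLUMNS = ("hash", "revision", "offset", "size", "expanded_size")


def parse_rep_cache_row(row):
    """Parse 'hash|revision|offset|size|expanded_size' into a dict."""
    fields = row.strip().split("|")
    if len(fields) != len(REP_CACHE_COLUMNS):
        raise ValueError("unexpected number of fields: %r" % row)
    parsed = {"hash": fields[0]}
    for name, value in zip(REP_CACHE_COLUMNS[1:], fields[1:]):
        parsed[name] = int(value)
    return parsed


def parse_text_line(line):
    """Parse the first four numbers after 'text:' into a dict."""
    tokens = line.split()
    if len(tokens) < 5 or tokens[0] != "text:":
        raise ValueError("not a text: line: %r" % line)
    return dict(zip(REP_CACHE_COLUMNS[1:], (int(t) for t in tokens[1:5])))


def same_location(row, text_line):
    """True if both records give the same revision, offset and sizes."""
    cached = parse_rep_cache_row(row)
    noderev = parse_text_line(text_line)
    return all(cached[name] == noderev[name] for name in noderev)
```

For example, same_location() can be called with the row printed by sqlite3 above and with the numeric part of the grep output, "text: 604440 34 134255 136680". Any checksum fields that follow the four numbers are ignored by this sketch.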

Re: SHA-1 collision in repository?

2018-02-23 Thread Branko Čibej
On 24.02.2018 01:09, Myria wrote: > Once it's on my local machine, I'll be able to compile TortoiseSVN and > debug it while pointing to a file:// repository. (TortoiseSVN instead > of command-line svn because TortoiseSVN is compiled with Visual C++ > and is therefore many times easier to debug.)

Re: SHA-1 collision in repository?

2018-02-23 Thread Philip Martin
Philip Martin writes: > There are a couple of options: > > A) disable rep-caching by editing fsfs.conf inside the repository > > B) reset the mapping by deleting/renaming the file db/rep-cache.db > inside the repository (but please rename rather than delete if you > want to help us

Re: SHA-1 collision in repository?

2018-02-23 Thread Philip Martin
Myria writes: > I was able to branch (svn copy) the affected branch to a new branch, > and committing the same file to the new branch has the same error. > Checking out that revision works fine; only that commit is affected. I suspect the problem is that the repository revision files are OK but

Re: SHA-1 collision in repository?

2018-02-23 Thread Myria
On Fri, Feb 23, 2018 at 2:50 PM, Philip Martin wrote: > Stefan Sperling writes: > > I think this might be the case since you mentioned earlier that you > could not find a file with the given checksum. The checksums apply to > the repository format, i.e. before keyword/eol transformation, and if

Re: SHA-1 collision in repository?

2018-02-23 Thread Philip Martin
Stefan Sperling writes: > On Fri, Feb 23, 2018 at 01:06:36PM -0800, Myria wrote: >> The revision 605556 is simply the current revision number of the >> repository at the time of the attempted commit, and is unrelated to >> the problem. If I attempt the commit now, it's a higher number, but >> ot

Re: SHA-1 collision in repository?

2018-02-23 Thread Stefan Sperling
On Fri, Feb 23, 2018 at 01:06:36PM -0800, Myria wrote: > I'm not subscribed to this mailing list, so I have no standard way to > reply to Philip's email. I don't even know his email address. > > > That pattern, all of MD5, SHA1 and size matching, is exactly what > > happens if a SHA1 collision is

Re: SHA-1 collision in repository?

2018-02-23 Thread Myria
I'm not subscribed to this mailing list, so I have no standard way to reply to Philip's email. I don't even know his email address. > That pattern, all of MD5, SHA1 and size matching, is exactly what > happens if a SHA1 collision is committed using an old version of > Subversion where the rep-cac

Re: SHA-1 collision in repository?

2018-02-22 Thread Philip Martin
Branko Čibej writes: > On 22.02.2018 21:30, Myria wrote: >> When we try to commit a very specific version of a very specific >> binary file, we get a SHA-1 collision error from the Subversion >> repository: >> >> D:\confidential>svn commit secret.bin -m "Testing broken commit" >> Sendings

Re: SHA-1 collision in repository?

2018-02-22 Thread Matt Simmons
I would get more advice from people here before you invest that time. I'm a relative amateur and would listen to people with more experience than myself. --Matt On Thu, Feb 22, 2018 at 2:29 PM, Myria wrote: > That was one document we ran into when searching, yes. > > We can do an svnsync, but t

Re: SHA-1 collision in repository?

2018-02-22 Thread Myria
That was one document we ran into when searching, yes. We can do an svnsync, but this will take about a week to run--the repository is 43 GB with 600,000 commits. I guess we'll start it now. On Thu, Feb 22, 2018 at 2:04 PM, Matt Simmons wrote: > Hi Melissa, > > That definitely is interesting. >

Re: SHA-1 collision in repository?

2018-02-22 Thread Branko Čibej
On 22.02.2018 21:30, Myria wrote: > When we try to commit a very specific version of a very specific > binary file, we get a SHA-1 collision error from the Subversion > repository: > > D:\confidential>svn commit secret.bin -m "Testing broken commit" > Sending secret.bin > Transmitting file d

Re: SHA-1 collision in repository?

2018-02-22 Thread Matt Simmons
Hi Melissa, That definitely is interesting. I assume you have read http://blogs.collab.net/subversion/subversion-sha1-collision-problem-statement-prevention-remediation-options If you do an svnsync to another location and attempt the commit there, does the problem replicate itself? --Matt On

SHA-1 collision in repository?

2018-02-22 Thread Myria
When we try to commit a very specific version of a very specific binary file, we get a SHA-1 collision error from the Subversion repository: D:\confidential>svn commit secret.bin -m "Testing broken commit" Sending secret.bin Transmitting file data .svn: E16: Commit failed (details follo