Re: Let's switch to time-based releases

2017-12-01 Thread Jacek Materna
6 month
2 year LTS

On Fri, Dec 1, 2017 at 7:38 AM, Branko Čibej  wrote:

> On 01.12.2017 13:16, Johan Corveleyn wrote:
> > Great! I gather from the reactions in this thread that we have
> > consensus on the principle, we "just" need to do it :-). And figure
> > out some details.
>
> Will that be Subversion 18.04 Anticlimactic Albatross?
>
>
> -- Brane
>



-- 

Jacek Materna
Chief Technology Officer

Assembla
+1 210 410 7661


Re: How to manage PR's (was: [GitHub] subversion pull request #6: VC should be 14.1 and not 15.0)

2017-10-10 Thread Jacek Materna
There are webhooks in place, yes. We could set up a middleman like Zapier to
send "records" via an exchange for archive purposes. More than happy to
set up or help set up. This is a typical use case, Johan, for those wanting a
"log" of GitHub activity.

Thoughts?

-jacek

On Tue, Oct 10, 2017 at 4:44 PM, Stefan Sperling  wrote:

> On Tue, Oct 10, 2017 at 05:32:21PM -0400, Paul Hammant wrote:
> > >
> > >
> > > Anybody knows why asfgit closed the PR?
> > >
> > > I've clicked on the above hyperlink, and saw some comments made by
> > > Julian (11 hours ago) and by the submitter (8 hours ago), but these
> > > comments are not visible on dev@. Shouldn't those be synced back here
> > > somehow? Or how should we handle PR's, preferably with a process
> > > that's still centered around the dev@ mailinglist?
> > >
> > > If we can't really handle PR's well, maybe we shouldn't allow them
> > > (i.e. indicate this on github somehow, that it's just a mirror and we
> > > won't accept PR's, but expect patch submissions on dev@ instead)?
> > >
> > >
> > If you ask people comfortable with donating code via PR to email patches
> > (or similar) to dev@ instead, you'll get a 1/100th contribution rate. Or
> > worse.
> >
> > There's no easy answer.
>
> Couldn't some automated process mail PRs from github as (--git) diffs to
> dev@?
>
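
As a rough illustration of the "middleman" idea discussed above, here is a
minimal sketch that turns a GitHub pull_request webhook payload into a
plain-text mail for a list. The function name, the sender address and the
list address are hypothetical; only the payload fields (action, number,
pull_request.title, html_url, diff_url, user.login, body) are taken from
GitHub's webhook format. Actually sending the mail and receiving the webhook
are left out.

```python
from email.message import EmailMessage


def pr_event_to_mail(payload, list_addr="dev@example.invalid"):
    """Build a plain-text mail from a GitHub 'pull_request' webhook payload.

    The addresses are placeholders (assumptions), not real project addresses.
    """
    pr = payload["pull_request"]
    msg = EmailMessage()
    msg["To"] = list_addr
    msg["From"] = "pr-relay@example.invalid"
    # Subject carries the PR number, title and the webhook action
    # (opened, closed, reopened, ...).
    msg["Subject"] = "[GitHub] PR #{}: {} ({})".format(
        payload["number"], pr["title"], payload["action"])
    # Body points readers at the PR page and at the raw diff.
    msg.set_content(
        "Author: {}\nURL: {}\nDiff: {}\n\n{}\n".format(
            pr["user"]["login"],
            pr["html_url"],
            pr["diff_url"],
            pr.get("body") or "(no description)"))
    return msg
```

A relay built this way would keep dev@ as the archive of record while still
letting contributors open PRs; whether comments should be mirrored too is a
separate question.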





Re: [RFC] Using LZ4 compression by default

2017-08-18 Thread Jacek Materna
d the projected
> size of the repositories will increase accordingly.
>
> One additional point to consider here is that such change may be
> going a bit against the policy of adding a new optional feature and
> switching the default in the next minor release.
>
>  (3) Compress with LZ4 by default, but only in new format 8 repositories.
>
> This option is similar to (2), but with a more limited scope where
> LZ4 compression is only used for the new repositories created with
> Subversion 1.10 binaries.
>
>
> Personally, I find the significant speed improvement for both read and
> write operations from LZ4 compression quite important, and I think that
> the actual reduction in the compression ratio is acceptable, considering
> the gained benefits.  I also think that the risks associated with switching
> the default on-disk format are low in this particular case, considering
> that the LZ4 library is stable.  (It has been available for a long time and
> is used by projects like Linux Kernel and ZFS).
>
> In other words, I think that we would benefit from using LZ4 compression
> by default.
>
> Among the options (2) and (3) that make LZ4 the new default compression
> algorithm, I think that option (2) is better.  The reasoning here is that
> using LZ4 compression would improve the performance even for existing
> repositories by making new commits faster and by speeding up read
> operations for the new committed files.  Apart from this, option (3)
> needs implementation and is probably going to have a couple of related
> challenges, which can be otherwise avoided.
>
> With all that in mind, I propose that we do (2).  Any objections?
>
>
> Thanks,
> Evgeny Kotkov
>





Re: Giving 1.10 alpha more exposure to gather feedback

2017-08-17 Thread Jacek Materna
I mentioned earlier in a past thread, and still stand by the statement,
that setting up a collaborative space beyond email is key to
attracting, enabling and growing an active beta testing user base that
is committed to genuinely testing, breaking and improving new
features. Access to code, commenting, and tickets to track improvements
with real-time collaboration are table stakes. It may be too
late for this alpha3 but something for the future to consider.

I am actively running content marketing to rally excitement behind
current and future releases like alpha3, but more effort across the spectrum
would increase visibility and 'fill the funnel' with new engaged
beta users wanting to push the alphas over the line. Super key.

-jacek

On Thu, Aug 17, 2017 at 9:58 AM, Johan Corveleyn  wrote:
> On Tue, Aug 15, 2017 at 6:35 PM, Daniel Shahaf wrote:
>> Johan Corveleyn wrote on Tue, 15 Aug 2017 11:57 +0200:
>>> Can we try to give 1.10 alpha3 a bit more exposure, in order to gather
>>> more feedback? I'm thinking of:
>>>
>>> - A post to users@, […]
>>>
>>> - Reaching out to producers of binaries to create a 1.10-alpha3 […]
>>>
>>> - Announce / ask feedback more prominently on our own website.
>>>
>>> - Reach out to software that integrates SVN (e.g. IDE's) to start
>>> working on 1.10 support, […]
>>
>> Just one: ideally, by the time we post to users@, alpha3 binaries will
>> already be available for most libsvn_client consumers.
>
> Hm, yes. I guess that would be best. But maybe the binary packagers
> also would like a short write-up to accompany their publishing of the
> 1.10 alpha3.
>
> So I'm imagining that:
>
> 1. We write some short explanation / call for testing, that we can
> send to interested binary packagers (a text they can reuse when
> publishing for their users / on their website).
>
> 2. We contact binary packagers and ask them if they'd like to help out
> by publishing 1.10 alpha3 somewhere (clearly marked as "alpha /
> unstable / cutting edge"), where they can reuse our text from 1 if
> they want. We ask them to confirm it and when they'll do it.
>
> 3. We publish details on our own website, linking to the various
> binary packagers that have confirmed.
>
> 4. We mail users@ with some extended version of text 1., with a link
> to our own webpage and links to the various binaries that have become
> available.
>
> 5. Where possible / known, we contact other software vendors that
> integrate SVN (IDE's etc) if they'd like to join the effort, and
> prepare their software to handle tree conflicts using the new
> functionality.
>
>
> On Tue, Aug 15, 2017 at 10:00 PM, Nathan Hartman
>  wrote:
>> I assume alpha status means not for production use.
>> The question then is how to test effectively, in a manner that
>> is representative of real use
>> without excessive risk to data.
>>
>> It may be helpful to include suggestions in the
>> aforementioned exposure.
>
> For the tree conflict resolution feature we're mainly interested in
> client-side testing. So end-users trying the 1.10 alpha3 on a working
> copy, either during their daily work (with all disclaimers about it
> being only an alpha) or to try out typical scenarios where they
> encounter tree conflicts. They can do a significant part of
> experimentation / testing without even committing the result to their
> repositories.
>
> I guess you're right: we should write up a couple of suggestions on
> how users can test the alpha.
>
>
> On Tue, Aug 15, 2017 at 11:37 PM, Andreas Stieger
>  wrote:
>> The openSUSE project has 1.10 alpha3 binaries ready for all current
>> openSUSE and SUSE Linux Enterprise distributions.
>>
>> https://software.opensuse.org//download.html?project=home%3AAndreasStieger%3Abranches%3Adevel%3Atools%3Ascm%3Asvn&package=subversion
>>
>> Andreas
>
> Great, thanks Andreas!
>
> --
> Johan





Re: [RFC] Using LZ4 compression by default

2017-08-04 Thread Jacek Materna
Based on what we are seeing, disk storage is at least 10x less concerning
across a huge swath of enterprise SVN users than UX (performance and speed
of developer workflows), and it is becoming less of a concern each quarter.
Unless the disk storage increase impacts server-side performance in
aggregate, I don't understand why the priority (biased defaults) would not
be to increase the speed of every operation.

?

On Fri, Aug 4, 2017 at 9:30 AM, Nathan Hartman 
wrote:

> On Aug 2, 2017, at 2:59 PM, Evgeny Kotkov 
> wrote:
> > With the recently added support for LZ4 compression (r1801940 et al),
> > we now have an option of using it by default for the on-disk data and
> > over the wire.
> > [...]
> > The amount of the required additional space is proportional
> > to the difference between the compression ratio of LZ4 and zlib-5,
> > which can be roughly estimated as around 30-35% for compressible
> > binary and text files, although that may vary depending on the
> > actual data.
> >
> > To illustrate how these changes will affect the speed of some of the
> > operations, the 'svn import' of a 2 GB file over HTTP on LAN in my
> > environment takes 18 seconds instead of 63 seconds.
>
> Regarding on disk storage, for small repos a 30% size increase is probably
> not material, but it may be significant for large repos. Is it feasible to
> get the best of both worlds by using LZ4 for fast commits and then
> recompress using zlib in svnadmin pack?
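
To put the numbers quoted above side by side, here is a small
back-of-the-envelope sketch. The 18 s / 63 s import timings and the 30-35%
estimate come from the quoted mail; the 100 GB repository size is purely
hypothetical, and treating 30-35% as a straight size increase is a
simplifying assumption (the original wording ties it to the difference in
compression ratio and notes it varies with the data).

```python
def speedup(seconds_before, seconds_after):
    """How many times faster the operation became."""
    return seconds_before / seconds_after


def projected_size(size_gb, growth_fraction):
    """Repository size after growing by the given fraction."""
    return size_gb * (1.0 + growth_fraction)


# 'svn import' of a 2 GB file over HTTP on LAN, zlib vs LZ4 (quoted timings).
import_speedup = speedup(63, 18)

# Hypothetical 100 GB repository, grown by the quoted 30-35% estimate.
low = projected_size(100, 0.30)
high = projected_size(100, 0.35)

print("import speedup: {:.1f}x".format(import_speedup))
print("100 GB repository grows to roughly {:.0f}-{:.0f} GB".format(low, high))
```

Under those assumptions the trade is about a third more disk for an import
that runs several times faster.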






Re: svn commit: r1801940 - in /subversion/trunk: ./ notes/ subversion/include/ subversion/include/private/ subversion/libsvn_delta/ subversion/libsvn_fs_fs/ subversion/libsvn_subr/ subversion/tests/li

2017-07-25 Thread Jacek Materna
On Tue, Jul 25, 2017 at 6:39 AM, Daniel Shahaf  wrote:
> Good morning Evgeny,
>
> First of all, thanks for the well written response.
>
> Evgeny Kotkov wrote on Mon, 24 Jul 2017 19:19 +0300:
>> Daniel Shahaf  writes:
>> > I'm a bit uncomfortable with this logic.
>> >
>> > 1. It violates the principle of least surprise: compression-level=9
>> > means 'gzip -9', compression-level=5 means 'gzip -5', but
>> > compression-level=1 means LZ4 (with the default acceleration_factor)
>> > rather than 'gzip -1'.
>> >
>> > 2. It leaves no way to use zlib level 1 in f8 filesystems.  This seems
>> > like a decision that should be left to the admin, rather than hardcoded
>> > into the library.
>> >
>> > 3. What if somebody wanted to add a backend with, say, xz compression.
>> > (xz compression also takes levels like gzip.)  Would it make sense to have
>> > two tunables:
>> > .
>> > compression-backend = { lz4 | zlib }
>> > compression-level = {1..N for lz4, 1..9 for zlib}
>> > .
>> > and then other compression backends could be easily added?
>> >
>> > This would also allow admins to set the 'acceleration_factor' of lz4.
>>
>> First of all, I agree that this logic has drawbacks if observed from the
>> idealistic point of view:
>>   - the choice of the compression algorithm happens implicitly, and
>>   - it doesn't allow users to stop using LZ4 for any reason with
>> the compression level set to 1.
>>
>> In the meanwhile, I think that the current approach is quite pragmatic,
>> as LZ4 is a suitable alternative for zlib1, and considering the big picture
>> with a similar configuration knob in mod_dav_svn, where the choice of the
>> compression algorithm is tied to the negotiation of the wire format for
>> older clients (see below).
>
> I'm afraid I disagree with both of these points.
>
> You keep stating "lz4 is better than zlib(level=1)" as an absolute fact.
> I do not see it this way.  Yes, there is a benchmark you cited that
> scored lz4 higher than zlib, but that does not imply that lz4 is
> universally preferable to zlib (and that it will remain so for the
> lifetime of the 1.10.x series).
>
> Secondly, the compatibility requirements of mod_dav_svn and FSFS are
> different.  mod_dav_svn is supposed to support multiple client minor
> versions simultaneously; FSFS did a format bump explicitly because
> supporting 1.9 and 1.10 simultaneously was not possible.
>
>> When I've been thinking about allowing an explicit choice of the algorithm,
>> I had a slightly different line of thought, opposed to "compression-backend
>> + compression-level", with a single option:
>>
>>   compression = none | lz4 | zlib | zlib-1 ... zlib-9
>>
>>   (The rationale is to avoid having two dependent options; as well as that,
>>    currently, I don't have the data showing that being able to tune the
>>    acceleration factor of LZ4 can noticeably improve performance.)
>
> +1

+1

>
>> However, there are a couple of difficulties with porting this approach to
>> mod_dav_svn, i.e., if we introduce the SVNCompression directive.  There
>> are clients that don't use LZ4, so, presumably, this option would require
>> specifying all formats that a server can use, in the preferred order:
>>
>>   SVNCompression "lz4, zlib"
>>
>> While such approach is explicit, it also has a couple of drawbacks, as it:
>>   - leaves a window for mistakes (say, if the user sets "SVNCompression lz4"
>> and inadvertently disables compression for older clients),
>>   - is not forward-compatible, as new compression algorithms require the
>> server to be reconfigured, and
>>   - adds complexity.
>>
>
> - We can detect the configuration "SVNCompression lz4" and error out on it.
>
> - Forward compatibility: I'm not convinced that we even _need_ forward
>   compatibility for this; we can just tell people who set this knob that
>   they may miss out on new compression algorithms if they don't review
>   the setting of that knob when they upgrade to a new minor version.
>   (That's exactly how the client-side config:auth:password-stores
>   works.)  There are other options to solve the compatibility issue, but
>   as you say, they would add complexity.

+1

>
> - I don't see how the fact that «SVNCompression "lz4, zlib"» might be
>   considered too complex to add, affects my arguments about fsfs.conf.
>   As I said earlier, FSFS f8 does not need to support 1.9 and 1.10
>   clients in parallel, so it has no need for a compression negotiation
>   configuration.  Perhaps you could clarify the fsfs part of your
>   argument?
>
>> As we are lucky that LZ4 is a suitable alternative for zlib1, and that our
>> current configuration knobs are not tightly bound to zlib, I propose that we
>> keep the current logic for now and postpone the generic solution up to the
>> moment when we add another compression algorithm that does not fit this
>> scheme or requires additional configuration.
>>
>> In other words, we can always do this separately, when it's absolutely needed
>> 
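
For reference, the single-option form proposed in the quoted text
("compression = none | lz4 | zlib | zlib-1 ... zlib-9") could be parsed
along these lines. This is only a sketch of the proposal as written in this
thread, not Subversion's actual fsfs.conf handling; the function name is
hypothetical, and mapping a bare "zlib" to level 5 is an assumption.

```python
def parse_compression(value, default_zlib_level=5):
    """Parse the proposed single 'compression' option.

    Returns (algorithm, level), where level is None for 'none' and 'lz4'.
    The default zlib level of 5 is an assumption for this sketch.
    """
    value = value.strip().lower()
    if value in ("none", "lz4"):
        # No tunable level: the proposal deliberately leaves out the
        # LZ4 acceleration factor.
        return (value, None)
    if value == "zlib":
        return ("zlib", default_zlib_level)
    if value.startswith("zlib-"):
        level = value[len("zlib-"):]
        # Accept exactly one digit in the range 1..9.
        if len(level) == 1 and level in "123456789":
            return ("zlib", int(level))
    raise ValueError("unsupported compression setting: {!r}".format(value))
```

One option with explicit values keeps the algorithm choice visible, which is
the point raised about least surprise, without introducing two dependent
knobs.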

Re: Big files PUT into Subversion - encountering

2017-07-15 Thread Jacek Materna
Which kernel version are you running?

On Thu, Jul 13, 2017 at 12:44 PM, Paul Hammant  wrote:
> Markus - you may be right on hopes for perf improvements.
>
> I'm reevaluating what I said a couple of days ago in this thread. The best
> case PUT times for that 15GB random resource are at 7 mins, but about 1/5 of
> them are at 15 mins. I'm going to try to undo the TMPDIR change and see if
> it goes back to 15 mins consistently.
>
> The drive is mounted as 'async' in Linux. Is that what you meant by no-sync
> ?  Sync kills USB drive performance by 90% on Linux in my experience.
>
> The 4TB drive's Svn server size is now up at 3TB, if anyone is interested.
>
> - Paul
>
> On Wed, Jul 12, 2017 at 2:22 AM, Markus Schaber 
> wrote:
>>
>> Hi, Paul,
>>
>>
>>
>> If at all, I’d expect a speed boost if the temp folder is on a fast drive
>> (e.g. SSD or RAM Disk) separate from the backend storage, so storage and
>> temp file I/O won’t compete for I/O. (Size of RAM useable for OS caches also
>> makes a difference, and mount options like “no-sync” which can be acceptable
>> for temp folders – but never for backend storage, of course).
>>
>>
>>
>> And I also guess the speed difference is more significant if there are
>> several concurrent accesses, so the I/O operations overlap. A single SVN
>> backend process is pretty much “serialized” in what it does, no concurrent /
>> async I/O yet.
>>
>>



-- 

Jacek Materna
CTO

Assembla
+1 210 410 7661
+48 578 296 708


Re: Proposal: new fsfs.conf properties

2017-07-13 Thread Jacek Materna
On Thu, Jul 13, 2017 at 9:36 AM, Johan Corveleyn  wrote:
>
> On Thu, Jul 13, 2017 at 8:27 AM, Branko Čibej  wrote:
> > On 13.07.2017 04:07, Paul Hammant wrote:
> >> I flipped _back_ to Python's requests.put(..) in my solution - from a
> >> regular Svn client. That relies on 'autoversioning=on' for it to work
> >> over DAV, I mean. In that configuration it functions like curl, of
> >> course.
> >>
> >> _Commit Messages_
> >>
> >> I'd love a --header "svn:message: my message" too.  I raised it before
> >> in https://issues.apache.org/jira/browse/SVN-4454
> >>  but the team was
> >> split back in 2013 on yes vs no. I'm thinking I'd like it again,
> >> having reviewed all the comments. Can I ask for that issue/request to
> >> be reopened, please ?
> >
> >
> > I'm wondering what you gain with curl and autoversioning over, e.g.,
> > svnmucc or using our bindings (or even our libraries)? Other than that
> > you can't set the log message or any other properties.
> >
> > FWIW, I strongly disagree with the idea of adding this feature, given
> > that there are already _two_ ways of doing this without having a working
> > copy.
>
> As I said in my previous response here, I think the reason for Paul to
> go for curl+autoversioning is speed, because it eliminates client-side
> deltification. It was suggested and demonstrated by Philip here:
>
> https://svn.haxx.se/dev/archive-2017-07/0040.shtml
>
> But I'm wondering if that curl advantage won't disappear if we develop
> a solution for a normal svn client to skip deltification.

Keen to understand this - speeding up commits in a supported way (for
specific workflows) would be a major win.

>
> --
> Johan


Re: [PATCH] Tweak the SHA-1 FAQ entry

2017-07-07 Thread Jacek Materna
Fair enough!

Rest looks great. 

--
Jacek Materna

Assembla
CTO
+1 210 410 7661
+48 578 296 708
Sent from my Mobile Device

> On Jul 7, 2017, at 5:47 PM, Evgeny Kotkov  wrote:
> 
> Jacek Materna  writes:
> 
>> Shouldn't vulnerability / shattered remain a keyword? I believe it was
>> a comment in the patch I submitted - just wondering about google-ness
>> of it... shattered has a lot of google juice.
> 
> I was thinking about turning this FAQ entry into a more or less generic
> statement about Subversion and SHA-1 collisions, rather than making it
> centered around one particular kind of collision (https://shattered.io/).
> 
> 
> Regards,
> Evgeny Kotkov


Re: [PATCH] Tweak the SHA-1 FAQ entry

2017-07-07 Thread Jacek Materna
Shouldn't vulnerability / shattered remain a keyword? I believe it was
a comment in the patch I submitted - just wondering about google-ness
of it... shattered has a lot of google juice.

collision vs. 'shattered / vulnerability'

Otherwise more complete verbiage is always good.

On Fri, Jul 7, 2017 at 4:46 PM, Evgeny Kotkov
 wrote:
>
> Hi all,
>
> I made an attempt to tweak the SHA-1 FAQ entry (which is available at
> https://subversion.apache.org/faq#shattered-sha1) to make it a bit more
> user-friendly.
>
> Please see the attached patch.  What do you think about making a change
> like this?
>
> For convenience, here is the final result as plain text:
>
> [[[
> How is Subversion affected by SHA-1 hash collisions?
>
> Publication of the first known SHA-1 collision by Google and CWI unveiled a
> couple of related issues in the Subversion's use of SHA-1. The Subversion's
> core does not rely on SHA-1 for content indexing, but it was being used for
> such purposes in the following supplementary features:
>
> - repository data deduplication feature, and
> - content deduplication feature in the working copy.
>
> Speaking of the repository data deduplication feature, this can result in
> inability to access files with colliding SHA-1 values or cause data loss for
> such files. To prevent different content with identical SHA-1 from being
> stored in a repository, upgrade to 1.9.6 or 1.8.18 which, by default,
> prevent storing data with such collisions. See our SHA-1 advisory for
> details.
>
> Until the upgrade to these new releases is available, Unix-based servers can
> use the pre-commit hook found here. As an aside, we welcome Windows developers
> to submit a pre-commit script for the Windows platform. More information on
> submission can be found here.
>
> The working copy uses SHA-1 for deduplication of the stored content, and for
> performance reasons a client will avoid fetching content with the same SHA-1
> checksum. The workaround for this issue is to prevent storage of the colliding
> objects in the first place, via upgrade to 1.9.6 or installation of the
> aforementioned pre-commit script.
>
> Storing content with SHA-1 collisions is not a supported use case. If you
> have content with colliding SHA-1 hash values, we suggest you transform it
> via gzip before committing it to avoid the collision altogether. Moreover,
> an upgrade to 1.9.6 to prevent future insertion of duplicates is highly
> recommended.
> ]]]
>
>
> Regards,
> Evgeny Kotkov
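
The gzip suggestion in the FAQ text above boils down to storing different
bytes, so the stored object gets a different SHA-1 while the original
content stays recoverable. A minimal sketch of that idea follows; the helper
names are hypothetical and the sample bytes are ordinary data, not an actual
colliding pair.

```python
import gzip
import hashlib


def sha1_hex(data):
    """Hex SHA-1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


def gzip_transform(data):
    """Compress the content; mtime=0 keeps the output reproducible."""
    return gzip.compress(data, mtime=0)


original = b"example content standing in for a problematic file\n"
packed = gzip_transform(original)

# The stored bytes, and therefore their SHA-1, are different ...
print(sha1_hex(original))
print(sha1_hex(packed))
# ... but the original content is fully recoverable.
print(gzip.decompress(packed) == original)
```

This only demonstrates that the digest of the stored bytes changes; it does
not replace upgrading or installing the pre-commit hook mentioned above.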






Re: Expected speed of commit over HTTP?

2017-07-06 Thread Jacek Materna
Paul,

Got back from Ops. We don't have anything "special" set up other than what has
already been mentioned. Overall, the defining metrics for commit performance
are: ssh vs https, RTT on the network path, and "randomness" of the blobs going
up.

No magic params on our end. We have it tuned to the max via the metrics above.

Most of the real-world use cases require the default settings, so curl, etc.
isn't seen a lot in the wild.

Overall, DAV/HTTP is a poor transport in general for SVN.

Hope it helps.


> On Jul 7, 2017, at 1:10 AM, Paul Hammant  wrote:
> 
> With autoversioning set to 'on' and curl:
> 
> Reference speed for boot drive to USB3 spinning platter 4TB thing:
> 
> paul@paul-HiBox:~$ time cp /home/paul/clientDir/seven /media/paul/sg4t/sevenb
> real  0m1.539s
> 
> Create a new 501MB file on Svn Server:
> 
> paul@paul-HiBox:~$ time curl -u paul:myPassword http://192.168.1.178/svn/svnRepo1/twoB --upload-file /home/paul/clientDir/seven
> 201 Created
> real  0m49.442s
> 
> I ran that a couple more times and it was up there at 50s
> 
> Needlessly overwrite 501MB file (file is unchanged) on Svn Server:
> 
> paul@paul-HiBox:~$ time curl -u paul:myPassword http://192.168.1.178/svn/svnRepo1/twoB --upload-file /home/paul/clientDir/seven
> real  0m13.426s
> 
> Change the compression-level=0
> 
> paul@paul-HiBox:~$ sudo nano /media/paul/sg4t/svnParent/svnRepo1/db/fsfs.conf 
> 
> Create a new 501MB file on Svn Server:
> 
> paul@paul-HiBox:~$ time curl -u paul:myPassword http://192.168.1.178/svn/svnRepo1/twoC --upload-file /home/paul/clientDir/seven
> 201 Created
> real  0m15.312s
> 
> Yay - a modest speed boost!!!
> 
> Restart Apache - which I didn't do before:
> 
> paul@paul-HiBox:~$ systemctl restart apache2
> 
> Create a new 501MB file on Svn Server:
> 
> paul@paul-HiBox:~$ time curl -u paul:myPassword http://192.168.1.178/svn/svnRepo1/twoD --upload-file /home/paul/clientDir/seven
> 201 Created
> real  0m14.925s
> 
> Conclusion:
> 
> With compression-level=5 (default), there is a 1:33 cp to curl-PUT ratio.
> With compression-level=0, there is a 1:10 cp to curl-PUT ratio.
> 
> There are other alluring settings, such as...
> enable-rep-sharing = false
> enable-dir-deltification = false
> 
> ... but they didn't yield an improvement.
> 
> Thanks for all the replies, gang.
> 
> -  Paul
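
The cp to curl-PUT ratios in the quoted conclusion can be re-derived from
the timings in the same mail. A minimal sketch, using only the "real" times
shown above (the variable and function names are made up for the example):

```python
# 'real' times in seconds, copied from the quoted runs.
CP_SECONDS = 1.539
PUT_SECONDS = {
    "compression-level=5 (default)": 49.442,
    "compression-level=0": 15.312,
    "compression-level=0, after Apache restart": 14.925,
}


def slowdown(put_seconds, cp_seconds=CP_SECONDS):
    """How many times longer the PUT took than the plain copy."""
    return put_seconds / cp_seconds


for label, seconds in PUT_SECONDS.items():
    print("{}: 1:{:.0f}".format(label, slowdown(seconds)))
```

These are single runs, so the results are only roughly in line with the 1:33
and 1:10 figures quoted; the repeated runs "up there at 50s" would nudge the
first ratio slightly higher.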


Re: Expected speed of commit over HTTP?

2017-07-06 Thread Jacek Materna
Let me go do some digging with the Ops team-

On Thu, Jul 6, 2017 at 2:31 PM, Paul Hammant  wrote:

> Jacek, Can you share any config settings for mod_dav that might be areas
> to experiment with ?
>
> And, yes, just because the file commit has no discernible deltas at all,
> doesn't mean that Svn doesn't attempt to make its own determination.  I
> wasn't using .bin suffixes as it happens but I'll bet there's something in
> amongst mimetypes and suffixes somewhere that allows skipping.
>
> -ph
>
> On Thu, Jul 6, 2017 at 8:07 AM, Jacek Materna  wrote:
>
>> We've seen a 2-3x speed decrease using HTTP vs SSH in our cloud. With SSH
>> and some tuning we are getting close to the IO rate on the end server of
>> any size. HTTP/mod_dav and RTT are usually the culprits in our cases; on a
>> LAN that issue collapses. Likely server-side processing of the deltas?
>>
>> -jacek
>>
>> On Thu, Jul 6, 2017 at 1:54 PM, Paul Hammant  wrote:
>>
>>> For something that's 500MB in size (random binary data) I'm experiencing
>>> commits taking 10x longer than a straight copy to the drive the Svn repo
>>> is on.
>>>
>>> Both timings are on the same Ubuntu 17.04 machine, with the boot drive
>>> being the starting position of the 512MB file and a USB3 mounted 4TB
>>> seagate hard drive being the destination.
>>>
>>> My goal is to fill the 4TB drive with commits for the simple experience
>>> of that.
>>>
>>> How many places in the Apache2 --> mod_dav --> mod_dav_svn handoff does
>>> the 512MB temporarily manifest itself in a file system on the way to its
>>> ultimate destination?  Is 1/10th speed the expectation?  Sure, I get that
>>> 7bit/8bit shenanigans are a factor, but not that much right?
>>>
>>> - Paul
>>>
>>>
>>
>>
>> --
>>
>> Jacek Materna
>> CTO
>>
>> Assembla
>> +1 210 410 7661
>> +48 578 296 708
>>
>
>




Re: Expected speed of commit over HTTP?

2017-07-06 Thread Jacek Materna
We've seen a 2-3x speed decrease using HTTP vs SSH in our cloud. With SSH and
some tuning we are getting close to the IO rate on the end server of any
size. HTTP/mod_dav and RTT are usually the culprits in our cases; on a LAN
that issue collapses. Likely server-side processing of the deltas?

-jacek

On Thu, Jul 6, 2017 at 1:54 PM, Paul Hammant  wrote:

> For something that's 500MB in size (random binary data) I'm experiencing
> commits taking 10x longer than a straight copy to the drive the Svn repo
> is on.
>
> Both timings are on the same Ubuntu 17.04 machine, with the boot drive
> being the starting position of the 512MB file and a USB3 mounted 4TB
> seagate hard drive being the destination.
>
> My goal is to fill the 4TB drive with commits for the simple experience of
> that.
>
> How many places in the Apache2 --> mod_dav --> mod_dav_svn handoff does
> the 512MB temporarily manifest itself in a file system on the way to its
> ultimate destination?  Is 1/10th speed the expectation?  Sure, I get that
> 7bit/8bit shenanigans are a factor, but not that much right?
>
> - Paul
>
>




Dates for 1.10

2017-05-23 Thread Jacek Materna
Is there a general GA or BETA timeframe/date on the roadmap for the
1.10 rls? Trying to understand timeline.

thanks-


Re: Check SHA vs Content (was: RE: svn commit: r1759233 - /subversion/trunk/subversion/libsvn_wc/questions.c)

2017-05-23 Thread Jacek Materna
On Mon, May 22, 2017 at 10:56 PM, Johan Corveleyn  wrote:
> On Tue, May 9, 2017 at 1:21 PM, Johan Corveleyn  wrote:
>> On Tue, Apr 4, 2017 at 11:33 AM, Stefan Sperling  wrote:
>>> On Mon, Feb 20, 2017 at 09:05:25AM +0100, Bert Huijben wrote:
>>>> This code is still in trunk without any of the discussed improvements, so
>>>> this change is currently part of 1.10.0-alpha1.
>>>>
>>>> If we don't implement the improvements I think we should check if we want
>>>> to revert to the 1.0-1.9 behavior before we really look at releasing 1.10.
>>>>
>>>> See discussion below
>>>>
>>>> Bert
>>>
>>> I think the proposed approach as implemented on trunk can no longer be
>>> considered viable, unfortunately, because of this step:
>>>
 > >>> 4. Calculate SHA-1 checksum of detranslated contents of working file
 > >>>and compare it with pristine's checksum stored in wc.db.
>>>
>>> Given that the SHA1 collision problem is real, we are now trying to stop
>>> relying on hashes to compare content. So it does not make sense to add
>>> new code which relies on hashes in this way, in my opinion.
>>>
>>> It seems that using SHA1 to compare content is key to the proposed approach.
>>> If that is correct, then I don't agree with releasing 1.10 with this feature
>>> and I would be in favour of reverting this change.
>>>
>>> Ivan, do you have any further comments on this thread? You have remained
>>> silent for quite some time now :(
>>
>> Where are we with this? Seems the consensus is to revert r1759233 to
>> not further increase our reliance on sha1? Or is there still a way to
>> keep r1759233 in some way, and improve it to make the sha1 test
>> "sensitive but not specific", like danielsh proposed?
>>
>> Ivan?
>
> Hmmm, I'm wondering, now we decided to disallow sha1 collisions to
> enter the repository, how does this reflect on this discussion?
>
> Okay, it's another "collision-sensitive-dependency" on sha1, but there
> are others in the working copy (pristines) and ra_serf, and since
> we're assuming a non-sha1-collision world (by blocking them from the
> repository), it makes little sense to me to simply revert this, and
> not fix other sha1-dependencies. OTOH, if we want to gradually reduce
> our dependence on sha1, this is an easy one to remove, since it's not
> in production yet ... Dunno.
>
> However, there were also other, performance-related, objections to the
> new implementation. Ivan proposed improvements [1] but they were never
> implemented.
>
> So, what to do?
>
> Unless someone objects in the coming couple of days, I'd say we revert 
> r1759233.
>
> [1] https://svn.haxx.se/dev/archive-2016-09/0016.shtml
>
> --
> Johan

I agree with not adding more logs to the fire. I also agree we
should attack the already-in-prod dependencies on SHA-1 as well, in the
background.


Re: [PATCH] 1.10 Release notes and FAQ around SHA-1

2017-05-16 Thread Jacek Materna
On Sun, May 14, 2017 at 1:59 PM, Stefan Fuhrmann
 wrote:
> On 09.05.2017 20:43, Stefan Sperling wrote:
>>
>> On Mon, May 08, 2017 at 10:46:39AM +0200, Jacek Materna wrote:
>>>
>>> Team,
>>>
>>> I wanted to start a discussion around the FAQ (and 1.10 rls. notes) as it
>>> pertains to the SHA-1 issue affecting all versions of SVN RE: "Continue
>>> the
>>> 1.10 alphas?" thread.
>>
>> I have added a small advisory-style writeup we could mail out along
>> with a 1.9.6 release announcement: http://svn.apache.org/r1794624
>> Does this look OK?
>>
>> Of course, the FAQ and such could still be updated.
>>
> Looks good!
>
> The only thing I'm not sure about is whether to
> stress the fact that the user will also lose data.
> It's there, implicitly, but the wording feels a bit
> too focussed on the "errors and inconvenience"
> side of things.
>
> -- Stefan^2.
>
>

I have not changed the reference to the trunk version of the hook
script as I have not seen a stable "release" branch/tag version which
has it in place yet. I assume this will come after release.

[[[
Add to website FAQ around SHA-1 vulnerability
]]]
--- faq.html 2017-03-16 17:18:18.0 +0100
+++ faq.new.html 2017-05-16 14:47:08.0 +0200
@@ -61,6 +61,8 @@
 list?
 How is Subversion affected by changes
 in Daylight Savings Time (DST)?
+How do I protect my repository from the SHA-1
+Shattered vulnerability?
 
 
 How-to:
@@ -743,6 +745,47 @@
 
 
 
+
+How do I protect my repository from the SHA-1 Shattered vulnerability?
+  ¶
+
+
+Subversion's use of SHA-1 in how it processes content is subject to hashing
+collisions as identified by Google (https://shattered.io/). One of
+Subversion's key assumptions in processing content is that SHA-1 is unique for
+all objects.
+Subversion has two main areas of vulnerability.
+
+
+The FS backend (repository) uses SHA-1.
+The Working Copy/RA layers use SHA-1.
+
+
+The FS layer uses SHA-1 when identifying objects to store in the repository. To
+prevent non-duplicate content that has an identical SHA-1 from being stored,
+upgrade to 1.9.6 (which would prevent storage of duplicates) or install the
+pre-commit hook found here:
+https://svn.apache.org/repos/asf/subversion/trunk/tools/hook-scripts/reject-known-sha1-collisions.sh
+As an aside, we welcome Windows developers to submit a pre-commit
+script for the Windows platform. More information on submission can be found here:
+https://subversion.apache.org/docs/community-guide/general.html#patches
+
+
+The working copy/RA layer uses SHA-1 for de-duplication of content stored in 
+the working copy, and for performance reasons clients using the HTTP protocol 
+will avoid fetching content with a SHA-1 checksum which has been fetched 
+previously. There is no known workaround for this vector except to prevent 
+storage of the colliding objects in the first place, via upgrade to 1.9.6 or 
+installation of the aforementioned pre-commit script.
+
+
+Storing content with SHA-1 collisions is not a supported use case. If you have
+repositories with colliding SHA-1 content we suggest you transform it via gzip
+before storage to avoid the collision altogether. Moreover, an upgrade to 1.9.6
+to prevent future insertion of duplicates is highly recommended.
+
+
+
 
 
 


Re: beta@ feedback mailing list?

2017-05-15 Thread Jacek Materna
Great to hear. I agree - it's a fairly wide topic. Nothing to jump on quickly.

As an aside, I wanted to augment my original response which had an
omission of a reference.

> Lets consider the five main work flows:
> - reviewing a patch submission;
> - reviewing a (typically recent) commit;
> - reviewing a back-port nomination (from trunk to branches/1.9.x);
> - reviewing a patch to a vulnerability (this is done on private@).
> - beta/feedback release

The top four points came from a private discussion I had with Daniel
Shahaf, who correctly suggested a strategy to me on attacking this
topic around work flow preservation. Thanks Daniel.

On Mon, May 15, 2017 at 2:37 PM, Johan Corveleyn  wrote:
> On Thu, May 11, 2017 at 4:03 PM, Jacek Materna  wrote:
>> On Thu, May 11, 2017 at 10:14 AM, Stefan Sperling  wrote:
>>> On Thu, May 11, 2017 at 01:04:01AM +0200, Johan Corveleyn wrote:
>>>> How do other ASF projects do this actually? Forums, presence in other
>>>> online places, more modern website look and feel, ...?
>>>
>>> They use github :)
>>
>>
>> On Thu, May 11, 2017 at 1:04 AM, Johan Corveleyn  wrote:
>>> On Tue, May 9, 2017 at 3:32 PM, Jacek Materna  wrote:
>>>> Just observing from afar, in my opinion the root of what you are
>>>> trying to achieve here ties more to a lack of 'modern' collaboration.
>>>> If we want to engage the community/users more (expand the
>>>> IB/participation sphere - new - users) I would also explore
>>>> alternative mediums (versus email). One of the reasons Github has been
>>>> so successful in making git an overwhelming force has little to do
>>>> with git itself. They made the process easy, rewarding and exciting to
>>>> contribute as a user.
>>>>
>>>> An approachable UX leads to more engagement - every time. I think it
>>>> would be great if we had an army of excited users wanting to test new
>>>> features. The product would benefit. Users in SaaS for example always
>>>> enjoy being [volunteering] part of a "beta" program - there is
>>>> something satisfying for users in it. On the flip-side "beta" program
>>>> for on-premise "enterprise" products are rarely so.
>>>>
>>>> Adding ontop the beta@ ... If we can make the "beta" collaborative,
>>>> more engaging for users I think its a real step forward towards an
>>>> army.
>>>
>>> I think you've got a point here, Jacek. I can see that our general
>>> UX-impression as a project / community comes across as dated. It would
>>> be great if we could improve that UX, and make it more modern, if that
>>> helps reaching a broader group of users to help us beta-testing etc
>>> ... and increase enthousiasm for our upcoming release.
>>>
>>> Do you (or anyone) have any concrete suggestions (within reach of our
>>> very limited resources, especially regarding volunteer time to spend
>>> on it)? People that want to help with this?
>>>
>>> How do other ASF projects do this actually? Forums, presence in other
>>> online places, more modern website look and feel, ...?
>>>
>>> --
>>> Johan
>>
>> Thinking out loud here ...
>>
>> Idea here is to change incrementally, so we can: change tools, cannot
>> impact work flow, limit effort and amplify our capabilities as a team.
>>
>> Lets consider the five main work flows:
>> - reviewing a patch submission;
>> - reviewing a (typically recent) commit;
>> - reviewing a back-port nomination (from trunk to branches/1.9.x);
>> - reviewing a patch to a vulnerability (this is done on private@).
>> - beta/feedback release
>>
>> Focusing on #5 for this thread and knowing that apache projects cannot
>> have mandatory infrastructure dependencies on third parties, in order
>> to ensure the projects' long-term independence; projects may use
>> third-party-hosted tools, but they may not rely on such tools - the
>> projects always have to have a Plan B for in case the third party
>> service goes down.
>>
>> If we wanted to try the "github" model - Assembla is more than happy
>> to support the community with native SVN support for "collab".
>>
>> For the case of beta@ we've done this successfully before where we
>> create a public area for users to discuss, comment on features, code,
>> ideas for an upcoming release. It would be extremely simple to put
>> 1.10 into a repo with blame/compare/pull reque

Re: beta@ feedback mailing list?

2017-05-11 Thread Jacek Materna
On Thu, May 11, 2017 at 10:14 AM, Stefan Sperling  wrote:
> On Thu, May 11, 2017 at 01:04:01AM +0200, Johan Corveleyn wrote:
>> How do other ASF projects do this actually? Forums, presence in other
>> online places, more modern website look and feel, ...?
>
> They use github :)


On Thu, May 11, 2017 at 1:04 AM, Johan Corveleyn  wrote:
> On Tue, May 9, 2017 at 3:32 PM, Jacek Materna  wrote:
>> Just observing from afar, in my opinion the root of what you are
>> trying to achieve here ties more to a lack of 'modern' collaboration.
>> If we want to engage the community/users more (expand the
>> IB/participation sphere - new - users) I would also explore
>> alternative mediums (versus email). One of the reasons Github has been
>> so successful in making git an overwhelming force has little to do
>> with git itself. They made the process easy, rewarding and exciting to
>> contribute as a user.
>>
>> An approachable UX leads to more engagement - every time. I think it
>> would be great if we had an army of excited users wanting to test new
>> features. The product would benefit. Users in SaaS for example always
>> enjoy being [volunteering] part of a "beta" program - there is
>> something satisfying for users in it. On the flip-side "beta" program
>> for on-premise "enterprise" products are rarely so.
>>
>> Adding ontop the beta@ ... If we can make the "beta" collaborative,
>> more engaging for users I think its a real step forward towards an
>> army.
>
> I think you've got a point here, Jacek. I can see that our general
> UX-impression as a project / community comes across as dated. It would
> be great if we could improve that UX, and make it more modern, if that
> helps reaching a broader group of users to help us beta-testing etc
> ... and increase enthousiasm for our upcoming release.
>
> Do you (or anyone) have any concrete suggestions (within reach of our
> very limited resources, especially regarding volunteer time to spend
> on it)? People that want to help with this?
>
> How do other ASF projects do this actually? Forums, presence in other
> online places, more modern website look and feel, ...?
>
> --
> Johan

Thinking out loud here ...

The idea here is to change incrementally: we can change tools, but we
cannot impact work flow, and we should limit effort and amplify our
capabilities as a team.

Let's consider the five main work flows:
- reviewing a patch submission;
- reviewing a (typically recent) commit;
- reviewing a back-port nomination (from trunk to branches/1.9.x);
- reviewing a patch to a vulnerability (this is done on private@);
- beta/feedback release.

Focusing on #5 for this thread: Apache projects cannot have mandatory
infrastructure dependencies on third parties, in order to ensure the
projects' long-term independence. Projects may use third-party-hosted
tools, but they may not rely on such tools - the projects always have
to have a Plan B in case the third-party service goes down.

If we wanted to try the "github" model - Assembla is more than happy
to support the community with native SVN support for "collab".

For the case of beta@, we've done this successfully before: we create
a public area for users to discuss and comment on features, code and
ideas for an upcoming release. It would be extremely simple to put
1.10 into a repo with blame/compare/pull request/protected directories
capabilities alongside ticket tracking for feedback.

If the test is successful and we improve quality/feedback, it's a great win.

I can also help get resources to move other channels, such as the
forums and public discussion around 1.10 - once we close on a date.
Getting good engagement is not as easy as setting up a forum -
marketing is a very important axis to get results, especially if we
want to reach audiences typically not involved, such as game artists
who use SVN every day. There are plenty of personas out there who use
SVN for its power, are non-technical, but would love the opportunity
to help shape the "latest" SVN release with feedback.

I think a modern subversion website is a great idea. I could look at
getting resources to help with that as well. Even a simple
re-surfacing may be a great step.

If nobody is allergic to it I could set up a hosted 1.10-beta and see
what everyone has to say with a concrete dartboard to throw darts at -
worst case we burn it down and/or get the idea train going.


Re: [PATCH] 1.10 Release notes and FAQ around SHA-1

2017-05-11 Thread Jacek Materna
For review.

best.
-j

On Wed, May 10, 2017 at 10:52 AM, Jacek Materna  wrote:
> Correct. #2 is moot given the rejection strategy moving forward.
>
> On Tue, May 9, 2017 at 6:12 PM, Daniel Shahaf 
> wrote:
>>
>> Jacek Materna wrote on Tue, May 09, 2017 at 14:39:51 +0200:
>> > On Tue, May 9, 2017 at 2:12 PM, Daniel Shahaf 
>> > wrote:
>> > > Jacek Materna wrote on Mon, May 08, 2017 at 10:46:39 +0200:
>> > >> Team,
>> > >>
>> > >> I wanted to start a discussion around the FAQ (and 1.10 rls. notes)
>> > >> as it
>> > >> pertains to the SHA-1 issue affecting all versions of SVN RE:
>> > >> "Continue the
>> > >> 1.10 alphas?" thread.
>> > >>
>> > >> 1) We should bias towards pro-active mitigation of this issue in
>> > >> docs/code
>> > >> as we know a real solution will likely NOT come with 1.10 after all.
>> > >
>> > > Agreed: a solution in code would be preferable, but whichever cases
>> > > are
>> > > not working as we want them to, should be documented.
>> > >
>> > >> 2) Consider patching 1.10 with de-duplication off by default ?
>> > >
>> > > What's the rationale behind this?  (honest question)
>> > >
>> > > I can see that this would, for one, allow sha1 collisions to be
>> > > committed over RA, but I'm not sure what benefit you have in mind.
>> > >
>> >
>> > Apologize for the ambiguity. I had the representation sharing feature
>> > in-mind (fsfs.conf).
>> > Ideally we know we want to fix it so that it recognizes this scenario as
>> > two different files and does not try to share the content.
>>
>> Current trunk behaves this way since r1785734/r1785754.  Moreover, given
>> the status of the "[PATCH] reject SHA1 collisions" thread, it seems
>> likely that 1.10.0 will refuse to admit collisions into the repository
>> using a FSFS-level check that's active whenever rep-sharing is.
>> I assume these changes address the concern underlying your point #2?
>>
>> Looking forward to the revised patch.
>>
>> Cheers,
>>
>> Daniel
>
>
>
>
> --
>
> Jacek Materna
> CTO
>
> Assembla
> 210-410-7661



-- 

Jacek Materna
CTO

Assembla
210-410-7661


faq.patch
Description: Binary data


Re: [PATCH] 1.10 Release notes and FAQ around SHA-1

2017-05-10 Thread Jacek Materna
Correct. #2 is moot given the rejection strategy moving forward.

On Tue, May 9, 2017 at 6:12 PM, Daniel Shahaf 
wrote:

> Jacek Materna wrote on Tue, May 09, 2017 at 14:39:51 +0200:
> > On Tue, May 9, 2017 at 2:12 PM, Daniel Shahaf 
> wrote:
> > > Jacek Materna wrote on Mon, May 08, 2017 at 10:46:39 +0200:
> > >> Team,
> > >>
> > >> I wanted to start a discussion around the FAQ (and 1.10 rls. notes)
> as it
> > >> pertains to the SHA-1 issue affecting all versions of SVN RE:
> "Continue the
> > >> 1.10 alphas?" thread.
> > >>
> > >> 1) We should bias towards pro-active mitigation of this issue in
> docs/code
> > >> as we know a real solution will likely NOT come with 1.10 after all.
> > >
> > > Agreed: a solution in code would be preferable, but whichever cases are
> > > not working as we want them to, should be documented.
> > >
> > >> 2) Consider patching 1.10 with de-duplication off by default ?
> > >
> > > What's the rationale behind this?  (honest question)
> > >
> > > I can see that this would, for one, allow sha1 collisions to be
> > > committed over RA, but I'm not sure what benefit you have in mind.
> > >
> >
> > Apologize for the ambiguity. I had the representation sharing feature
> > in-mind (fsfs.conf).
> > Ideally we know we want to fix it so that it recognizes this scenario as
> > two different files and does not try to share the content.
>
> Current trunk behaves this way since r1785734/r1785754.  Moreover, given
> the status of the "[PATCH] reject SHA1 collisions" thread, it seems
> likely that 1.10.0 will refuse to admit collisions into the repository
> using a FSFS-level check that's active whenever rep-sharing is.
> I assume these changes address the concern underlying your point #2?
>
> Looking forward to the revised patch.
>
> Cheers,
>
> Daniel
>



-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: [PATCH] 1.10 Release notes and FAQ around SHA-1

2017-05-10 Thread Jacek Materna
Looks great Stefan - will review and work on the FAQ patch this week.

On Tue, May 9, 2017 at 8:43 PM, Stefan Sperling  wrote:

> On Mon, May 08, 2017 at 10:46:39AM +0200, Jacek Materna wrote:
> > Team,
> >
> > I wanted to start a discussion around the FAQ (and 1.10 rls. notes) as it
> > pertains to the SHA-1 issue affecting all versions of SVN RE: "Continue
> the
> > 1.10 alphas?" thread.
>
> I have added a small advisory-style writeup we could mail out along
> with a 1.9.6 release announcement: http://svn.apache.org/r1794624
> Does this look OK?
>
> Of course, the FAQ and such could still be updated.
>



-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

2017-05-09 Thread Jacek Materna
+1 on rejection.

On Tue, May 9, 2017 at 3:37 PM, Mark Phippard  wrote:
> On Tue, May 9, 2017 at 9:25 AM, Stefan Sperling  wrote:
>>
>> On IRC, Branko and Johan raised concerns about the proposed backport.
>>
>> The proposed backport allows files with SHA1 collisions into the
>> repository
>> and avoids de-duplication of such content by the rep-cache. It fixes the
>> integrity problem with the rep-cache but other problems remain.
>
>
> This approach makes a lot of sense to me.
>
>
>
> --
> Thanks
>
> Mark Phippard
> http://markphip.blogspot.com/



-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: beta@ feedback mailing list?

2017-05-09 Thread Jacek Materna
Just observing from afar, in my opinion the root of what you are
trying to achieve here ties more to a lack of 'modern' collaboration.
If we want to engage the community/users more (expand the
IB/participation sphere - new - users) I would also explore
alternative mediums (versus email). One of the reasons Github has been
so successful in making git an overwhelming force has little to do
with git itself. They made the process easy, rewarding and exciting to
contribute as a user.

An approachable UX leads to more engagement - every time. I think it
would be great if we had an army of excited users wanting to test new
features. The product would benefit. Users in SaaS, for example, always
enjoy being [volunteering] part of a "beta" program - there is
something satisfying for users in it. On the flip side, "beta" programs
for on-premise "enterprise" products are rarely so.

Adding on top of the beta@ idea: if we can make the "beta" collaborative
and more engaging for users, I think it's a real step forward towards an
army.

On Tue, May 9, 2017 at 1:28 PM, Johan Corveleyn  wrote:
> On Tue, May 9, 2017 at 1:06 PM, Daniel Shahaf  wrote:
>> Andreas Stieger wrote on Tue, May 09, 2017 at 12:55:31 +0200:
>>> Daniel Shahaf wrote:
>>> > One of the ideas that came up was to establish a dedicated mailing list
>>> > for beta / pre-release feedback.  The thinking is that having a channel
>>> > for advanced users to discuss 1.10-dev issues in — without noise from
>>> > support requests or design discussions — might encourage more such
>>> > discussion to happen.
>>>
>>> I don't know that given the current volume on either dev@ or users@
>>> would give way to a noise issue that would require a dedicated list
>>> for beta.
>>
>> The point wasn't that the current lists are too busy.  The point was
>> whether creating a dedicated list would encourage or enable users to
>> give feedback.
>>
>> I.e., would we receive more feedback with a dedicated list than with our
>> "post to users@" policy (the one we repeat in every release announcement)?
>
> Yes, I think that's a valid argument. It might help in creating some
> more traction focused around beta-testing 1.10 (in that case, maybe we
> should call the next 1.10 pre-release a "beta" instead of an "alpha"
> -- otherwise the announcement on our website might be a bit weird:
> "Subversion 1.10 alpha 3 has been released. Please test and report
> your feedback on beta-users@s.a.o" :-))
>
> --
> Johan



-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: [PATCH] 1.10 Release notes and FAQ around SHA-1

2017-05-09 Thread Jacek Materna
Hi,

On Tue, May 9, 2017 at 2:12 PM, Daniel Shahaf  wrote:
> Jacek Materna wrote on Mon, May 08, 2017 at 10:46:39 +0200:
>> Team,
>>
>> I wanted to start a discussion around the FAQ (and 1.10 rls. notes) as it
>> pertains to the SHA-1 issue affecting all versions of SVN RE: "Continue the
>> 1.10 alphas?" thread.
>>
>> 1) We should bias towards pro-active mitigation of this issue in docs/code
>> as we know a real solution will likely NOT come with 1.10 after all.
>
> Agreed: a solution in code would be preferable, but whichever cases are
> not working as we want them to, should be documented.
>
>> 2) Consider patching 1.10 with de-duplication off by default ?
>
> What's the rationale behind this?  (honest question)
>
> I can see that this would, for one, allow sha1 collisions to be
> committed over RA, but I'm not sure what benefit you have in mind.
>

Apologies for the ambiguity. I had the representation sharing feature in
mind (fsfs.conf). Ideally we want to fix it so that it recognizes this
scenario as two different files and does not try to share the content. The
only cost of disabling this feature by default is that you will not benefit
from the disk space savings it provides. You can disable this feature at any
time (prior to the problem happening). Disabling the feature has no impact on
the existing data in the repository that is already using it. Only future
commits are impacted, in that they will not bother looking for the space
savings. Repos that predate 1.6 and have not been upgraded are not affected.
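
To make the rep-sharing point concrete, here is a minimal sketch in Python
(not part of Subversion) that reports whether representation sharing is
switched on in a repository's db/fsfs.conf. It assumes the option is
enable-rep-sharing in the [rep-sharing] section and that it counts as
enabled when unset; the helper name rep_sharing_enabled and the sample
fragment are made up for illustration.

```python
import configparser


def rep_sharing_enabled(fsfs_conf_text: str) -> bool:
    """Return True if rep-sharing is on in the given fsfs.conf contents.

    Assumption: the option lives at [rep-sharing] enable-rep-sharing and
    is treated as enabled when the section or option is absent.
    """
    # Interpolation is disabled so a stray '%' in the file is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(fsfs_conf_text)
    return parser.getboolean("rep-sharing", "enable-rep-sharing", fallback=True)


# Hypothetical fsfs.conf fragment with the feature switched off.
SAMPLE = """
[rep-sharing]
### Turn off sharing of identical representations for future commits.
enable-rep-sharing = false
"""

if __name__ == "__main__":
    print(rep_sharing_enabled(SAMPLE))
```

Flipping the option only changes what future commits do, which matches the
point above that existing data in the repository is left alone.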

>> 3) Remediation of the issue (if affected) should be a different topic? -
>> how to get out of the weeds guide. Published by the group - authoritative,
>> trusted, final. A number of providers of SVN hosting have done their own
>> workarounds and written their own KB's on the topic - I think having a
>> master guide is important.
>
> Agreed.  Moreover, it'd be nice to draw on the knowledge accumulated in
> our downstreams.  I tried to provide such a guide in [1], but it's
> incomplete: it doesn't cover the dump/load issue.  (Hopefully we'll
> backport that issue's fix to 1.9.6.)
>
> [1] 
> https://mail-archives.apache.org/mod_mbox/subversion-dev/201702.mbox/%3C20170224213628.GA21715@fujitsu.shahaf.local2%3E
>
> Incidentally, that email is from late February, so the "90 days to
> publishing the exploit code" will be over soon.

OK. Let's see what they put out first. Mark over at CN wrote a great verbose
starting point at
http://blogs.collab.net/subversion/subversion-sha1-collision-problem-statement-prevention-remediation-options#.WRG2JLyGM6g

>
>> >>>>>>>>>>>>>>
>> General Questions:
>>  - How do I protect my repository against the SHA-1 vulnerability found by
>> Google?
>
> I see this is a patch for the FAQ.  For future reference, we prefer
> patches to be formatted in unidiff against the site's HTML source
> (https://svn.apache.org/repos/asf/subversion/site/publish/faq.en.html),
> however, I agree it's easier to first iterate on the wording and only
> later add the HTML markup.
>
> I suggest to say "shattered" somewhere in the question's title, to
> unambiguously identify the attack.

Fair.

>
>> Subversion's use of SHA-1 in how it processes content is subject to hashing
>> collisions as identified by Google (https://shattered.io/). Preventing
>> suspect object commits is the simplest and best way today to protect your
>> repository. Disabling repository sharing is not enough to solve the issue
>> alone as Subversion also uses SHA-1 to de-duplicate retransmission of
>> content to clients for a pristine working copy.
>
> This paragraph tries to say two things:
>
> 1. The FS layer (repository) uses sha1.  Workaround: use this hook
> script.  (Or upgrade to 1.10.0 / 1.9.TBD ?)
>
> 2. The WC/RA layers use sha1.  Workaround: none yet.
>
> I would suggest to make this division explicit.  E.g., we could say:
>
> "Subversion uses sha1 in X and Y.
>
> X uses sha1 for ...  The new failure modes / attacks are ...  The
> workaround / fix is...
>
> Y uses sha1 for ...  The new failure modes / attacks are ...  The
> workaround / fix is...
> "
>
> Basically, each paragraph would follow the same structure as our
> advisories: design description, problem description, fixes and
> workarounds.
>
> WDYT?

Great. Concise. Follows precedent set.

>
>> Prevention:
>>
>> Install a pre-commit hook that will reject new instances against known
>> collisions. While this will not guarantee protection from new collisions,

Re: stricter text conflicts in 1.10

2017-05-09 Thread Jacek Materna
I know for a fact that UX is already a major decision point around choosing
Subversion over modern alternatives.

What have we done in the past? A staggered +1 release model seems
worthwhile: we announce the change in version A [with it disabled] and
allow users to "opt-in". If the value is there, users will jump on it.
We can measure it via feedback.

At some point down the line, opt-in folds into "default" and strict is
the new sheriff in town. This is a proven, very successful way of
introducing incremental customer-facing changes in the SaaS world.

On Tue, May 9, 2017 at 12:14 PM, Stefan Sperling  wrote:
> I have seen several instances of proposals in our STATUS file where I
> cannot merge without text conflicts because I am using a trunk client.
>
> I suppose most of us use 1.9.x clients to do such merges, because
> otherwise there would be a lot more backport branches in STATUS when
> nominations get added, and before I run into such a conflict.
>
> This is probably due to the stricter text conflict checks added in r1731699.
> If so, are we really sure that we want to make the new behaviour the default?
> I can imagine that in organizations with a diverse SVN client install base
> this change will cause a lot of misunderstandings and confusion among users.
>
> And with the conflict resolver we are trying to make tree conflicts less
> painful. Now, at the same time text conflicts have become a lot more painful
> than they used to be. I don't think this is going to be a good sell.



-- 

Jacek Materna
CTO

Assembla
210-410-7661


[PATCH] 1.10 Release notes and FAQ around SHA-1

2017-05-08 Thread Jacek Materna
Team,

I wanted to start a discussion around the FAQ (and 1.10 rls. notes) as it
pertains to the SHA-1 issue affecting all versions of SVN RE: "Continue the
1.10 alphas?" thread.

1) We should bias towards pro-active mitigation of this issue in docs/code
as we know a real solution will likely NOT come with 1.10 after all.

2) Consider patching 1.10 with de-duplication off by default?

3) Remediation of the issue (if affected) should be a different topic? -
how to get out of the weeds guide. Published by the group - authoritative,
trusted, final. A number of providers of SVN hosting have done their own
workarounds and written their own KB's on the topic - I think having a
master guide is important.

4) I am sure there are a number of other items this group can append to
this dialog from previous discussions on the topic.

>>>>>>>>>>>>>>
General Questions:
 - How do I protect my repository against the SHA-1 vulnerability found by
Google?


Subversion's use of SHA-1 in how it processes content is subject to hashing
collisions as identified by Google (https://shattered.io/). Preventing
suspect object commits is the simplest and best way today to protect your
repository. Disabling representation sharing alone is not enough to solve
the issue, as Subversion also uses SHA-1 to de-duplicate retransmission of
content to clients for a pristine working copy.

Prevention:

Install a pre-commit hook that will reject new instances of known
collisions. While this will not guarantee protection from new collisions,
we will keep the hook up to date as new collisions are publicly released.

The hook can be found here:
https://svn.apache.org/repos/asf/subversion/trunk/tools/hook-scripts/reject-known-sha1-collisions.sh
<<<<<<<<
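
For anyone who wants to see the shape of the check, here is a minimal
sketch in Python of the idea behind such a hook. It is not the shell script
linked above. The function names are hypothetical, and the digest below is
an assumption (believed to be the SHA-1 shared by the two shattered.io PDFs)
that should be verified before use. A real hook would read the changed files
from the pending transaction (for example with svnlook changed and
svnlook cat); the sketch takes them in memory instead.

```python
import hashlib
from typing import FrozenSet, Iterable, List, Tuple

# Assumption: SHA-1 digest shared by the two shattered.io proof-of-concept
# PDFs. Verify against the published files before relying on it.
KNOWN_COLLISION_SHA1: FrozenSet[str] = frozenset(
    {"38762cf7f55934b34d179ae6a4c80cadccbb7f0a"}
)


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def find_rejected(
    files: Iterable[Tuple[str, bytes]],
    known_bad: FrozenSet[str] = KNOWN_COLLISION_SHA1,
) -> List[str]:
    """Return the paths whose content hashes to a known colliding SHA-1."""
    return [path for path, data in files if sha1_hex(data) in known_bad]


def precommit_exit_code(
    files: Iterable[Tuple[str, bytes]],
    known_bad: FrozenSet[str] = KNOWN_COLLISION_SHA1,
) -> int:
    """Return 0 to allow the commit, 1 to reject it (hook convention)."""
    return 1 if find_rejected(files, known_bad) else 0


if __name__ == "__main__":
    # Hypothetical transaction contents; neither file is a known collision.
    changed = [("trunk/README", b"hello"), ("trunk/notes.txt", b"world")]
    print(precommit_exit_code(changed))
```

As noted in the draft, a list-based check like this only catches collisions
that are already known; it cannot detect new ones.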

Best.
-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: Continue the 1.10 alphas?

2017-05-04 Thread Jacek Materna
Great! Will do - thanks for the guidance-

-j

On Thu, May 4, 2017 at 1:53 PM, Johan Corveleyn  wrote:

> On Thu, May 4, 2017 at 12:34 PM, Jacek Materna  wrote:
> > Agreed - let me looking in Stefan's mods - I can take a look at the
> > client-side after that to see if I have a slot in the short-term.
>
> Okay. Concerning the working copy: IIUC if "fixing" means "making it
> possible to store sha1 collisions in a working copy", it's more or
> less impossible to fix this without a "format bump" of the working
> copy format (which means the fix can't be backported to 1.9 or 1.8 --
> and even for trunk / 1.10, a format bump is currently not planned).
> But "fixing" can also mean "rejecting the collision in a graceful
> way". That's probably much more realistic, and perhaps backportable.
> Though I believe there are big questions about the performance impact
> of any solution ...
>
> Anyway, if you want to look into this, please start a new thread to
> discuss your ideas first (we need to come to a consensus first about
> *what behaviour we want*, and how this could be achieved).
>
> > What's a reasonable / agreed way of "giving something more visibility -
> re:
> > hook" ?
>
> I guess the 1.10 release notes are an option. And our FAQ. Maybe a FAQ
> should be the first priority, as this issue applies to all older
> releases. Are you willing to draft something (either as a patch
> against [1], or just as a written suggestion)? If so, please send it
> in another thread too, so we can keep this thread focused on getting
> 1.10 alphas rolling again :-).
>
> [1] http://svn.apache.org/repos/asf/subversion/site/publish/faq.html
>
> Thanks,
> --
> Johan
>



-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: Continue the 1.10 alphas?

2017-05-04 Thread Jacek Materna
Agreed - let me look into Stefan's mods - I can take a look at the
client-side after that to see if I have a slot in the short-term.

What's a reasonable / agreed way of "giving something more visibility - re:
hook" ?

-j

On Tue, May 2, 2017 at 11:29 PM, Johan Corveleyn  wrote:

> On Tue, May 2, 2017 at 3:21 PM, Jacek Materna  wrote:
> > Great to hear on 1.10 move along.
> >
> > On SHA1 I can help if you feel it may move things along in parallel - we
> > ended up having to use the pre-commit hook for our customer base as per
> > https://svn.apache.org/viewvc/subversion/trunk/tools/hook-
> scripts/reject-known-sha1-collisions.sh?view=markup&pathrev=1784336
>
> Yes, that pre-commit hook is certainly a good thing at the moment.
> Maybe we can give it more visibility and make it more accessible when
> releasing 1.10.
>
> IIUC, after Stefan Fuhrman's recent commits trunk can now handle sha1
> collisions in the back end, i.e. the repository can store both
> colliding files correctly. So with 1.10 it would no longer be
> necessary to protect the back-end with this hook. But the client-side
> and 'svnadmin dump / load' still have problems. And even if those were
> fixed, the hook would still be useful to support older clients from
> your 1.10 server.
>
> --
> Johan
>



-- 

Jacek Materna
CTO

Assembla
210-410-7661


Re: Continue the 1.10 alphas?

2017-05-02 Thread Jacek Materna
Great to hear that 1.10 is moving along.

On SHA1 I can help if you feel it may move things along in *parallel* - we
ended up having to use the pre-commit hook for our customer base as per
https://svn.apache.org/viewvc/subversion/trunk/tools/hook-scripts/reject-known-sha1-collisions.sh?view=markup&pathrev=1784336

best.
-jacek

On Tue, May 2, 2017 at 3:32 AM, Stefan Sperling  wrote:

> On Mon, May 01, 2017 at 11:57:54PM +0200, Johan Corveleyn wrote:
> > On Mon, May 1, 2017 at 10:54 PM, Julian Foad 
> wrote:
> > > Just asking...
> > >
> > > As I understand it, we paused the issuing of 1.10 alpha releases
> because we
> > > considered that the final 1.10 release will need to address the SHA1
> > > collision issue otherwise it won't be considered a viable release.
> > >
> > > It seemed reasonable to pause for a bit while the SHA1 issue was
> worked on,
> > > and Stefan2 has done some work on that. But currently it seems that
> there is
> > > nobody doing any further work on it.
> > >
> > > We could continue waiting, or maybe now we should resume the alpha
> testing
> > > of the new features (conflict resolution), and let the SHA1 work be
> fixed as
> > > and when someone is motivated to do so (before or after 1.10). It
> seems to
> > > me that sometimes in open source we need to get on with doing what we
> can
> > > do, and just trust that someone else will do the rest.
> > >
> > > Thoughts?
> >
> > +1.
> >
> > I think this "pause-for-sha1-fixes" has now taken more than long
> > enough. We should try gathering our focus again on releasing 1.10, and
> > get the improvements it brings in the hands of users.
>
> I agree!
>
> I was one of the people pushing for more SHA1 fixes but I did not find
> time to do any of that work myself. I will not object if we decide that
> these changes will have to happen later on. We do not seem to have enough
> resources to push more SHA1 fixes through right now. So let's do whatever
> else we can get done instead.
>



-- 

Jacek Materna
CTO

Assembla
210-410-7661