Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-10 Thread Daniel Shahaf
Branko Čibej wrote on Mon, Dec 10, 2012 at 00:26:20 +0100:
 On 10.12.2012 00:08, Johan Corveleyn wrote:
  On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
  wrote:
  Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
  2) Am I the only one who wants to protect his repository against this
  corruption? Judging from [1], I don't think so. It doesn't make sense
  that everyone starts writing this pre-commit hook, for something that
  IMHO is an obvious anti-corruption protection. I think every
  repository out there deserves to be protected against this.
 
  FWIW, I suggested a hook because you could implement that in about
  5 minutes, whereas doing a C code change would take at least 10 times
  that (and several weeks if you have to wait for it to appear in a 1.7.x
  release that you can install at $WORK).  I won't object to C code
  verifying the svn:eol-style invariant.
  Thanks. And your pre-commit hook example is much appreciated.
 
  For the moment I get the impression that it's not really doable /
  desirable to implement this in the repository. At least until now
  no-one has suggested how it could be done, and I don't know enough
  myself about the server-side / back-end to figure it out :-).
 
 The first obstacle is that the server does not interpret properties.[1]
 Therefore, you'd have to implement this check at transaction-commit
 time, because there's no earlier point where you're guaranteed to have
 all node properties at their final values. This implies that the time to
 reject a commit would be proportional to the size of the commit (which
 isn't the case when it comes to conflict detection).

Can't you do what stefan2 suggested, but in libsvn_repos?  The repos
layer commit editor would remember for each file whether its textdelta
has added a new 0x0D byte, then at close_edit() it would iterate all
files in the commit --- and for each file only compare its svn:eol-style
property to the by-now-precomputed did it it contain a 0x0D bit.

I'm not sure how efficient it would be to parse for 0x0D's in
libsvn_repos, though.  Maybe we should make this optional.

 
 All properties are interpreted by clients. A buggy client will cause
 cause problems for other users, so the best course of action is to
 report the bug (to SvnKit in this case?).
 
 Also note that you're using a much too strong term when you talk about
 corrupted files in this case. There's nothing corrupted about them.
 

An invariant failed to maintain.  Granted it's not an FS_level
invariant, though.


 
 -- Brane
 
 [1] Not strictly true since mod_dav_svn will look at svn:mime-type when
 serving the content at its default, unversioned URL; but it doesn't
 actually interpret the value.
 
 -- 
 Branko Čibej
 Director of Subversion | WANdisco | www.wandisco.com
 


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-10 Thread Branko Čibej
On 10.12.2012 11:31, Daniel Shahaf wrote:
 Branko Čibej wrote on Mon, Dec 10, 2012 at 00:26:20 +0100:
 On 10.12.2012 00:08, Johan Corveleyn wrote:
 On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.

 FWIW, I suggested a hook because you could implement that in about
 5 minutes, whereas doing a C code change would take at least 10 times
 that (and several weeks if you have to wait for it to appear in a 1.7.x
 release that you can install at $WORK).  I won't object to C code
 verifying the svn:eol-style invariant.
 Thanks. And your pre-commit hook example is much appreciated.

 For the moment I get the impression that it's not really doable /
 desirable to implement this in the repository. At least until now
 no-one has suggested how it could be done, and I don't know enough
 myself about the server-side / back-end to figure it out :-).
 The first obstacle is that the server does not interpret properties.[1]
 Therefore, you'd have to implement this check at transaction-commit
 time, because there's no earlier point where you're guaranteed to have
 all node properties at their final values. This implies that the time to
 reject a commit would be proportional to the size of the commit (which
 isn't the case when it comes to conflict detection).
 Can't you do what stefan2 suggested, but in libsvn_repos?  The repos
 layer commit editor would remember for each file whether its textdelta
 has added a new 0x0D byte, then at close_edit() it would iterate all
 files in the commit --- and for each file only compare its svn:eol-style
 property to the by-now-precomputed did it it contain a 0x0D bit.

 I'm not sure how efficient it would be to parse for 0x0D's in
 libsvn_repos, though.  Maybe we should make this optional.


It's not enough to just look for \r in the delta stream. Certainly
wouldn't help with historically broken files.

Moreover, if libsvn_repos started looking at svn:eol-style, that'd only
make sense if you verified all normalizations, not just native. And
why should it just be svn:eol-style, when svn:keywords has potentially
the same problem? Eventually you start verifying everything that affects
file contents.

Maybe that's a good idea, but /why/ put it in libsvn_repos when it's
much easier and cleaner to just provide some standard hooks, written in
C and distributed with releases, that the admin plug in if she feels
like it? Surely that's what hooks are for.


-- Brane


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-10 Thread Ivan Zhakov
On Mon, Dec 10, 2012 at 2:51 PM, Branko Čibej br...@wandisco.com wrote:
 On 10.12.2012 11:31, Daniel Shahaf wrote:
 Branko Čibej wrote on Mon, Dec 10, 2012 at 00:26:20 +0100:
 On 10.12.2012 00:08, Johan Corveleyn wrote:
 On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.

 FWIW, I suggested a hook because you could implement that in about
 5 minutes, whereas doing a C code change would take at least 10 times
 that (and several weeks if you have to wait for it to appear in a 1.7.x
 release that you can install at $WORK).  I won't object to C code
 verifying the svn:eol-style invariant.
 Thanks. And your pre-commit hook example is much appreciated.

 For the moment I get the impression that it's not really doable /
 desirable to implement this in the repository. At least until now
 no-one has suggested how it could be done, and I don't know enough
 myself about the server-side / back-end to figure it out :-).
 The first obstacle is that the server does not interpret properties.[1]
 Therefore, you'd have to implement this check at transaction-commit
 time, because there's no earlier point where you're guaranteed to have
 all node properties at their final values. This implies that the time to
 reject a commit would be proportional to the size of the commit (which
 isn't the case when it comes to conflict detection).
 Can't you do what stefan2 suggested, but in libsvn_repos?  The repos
 layer commit editor would remember for each file whether its textdelta
 has added a new 0x0D byte, then at close_edit() it would iterate all
 files in the commit --- and for each file only compare its svn:eol-style
 property to the by-now-precomputed did it it contain a 0x0D bit.

 I'm not sure how efficient it would be to parse for 0x0D's in
 libsvn_repos, though.  Maybe we should make this optional.


 It's not enough to just look for \r in the delta stream. Certainly
 wouldn't help with historically broken files.

 Moreover, if libsvn_repos started looking at svn:eol-style, that'd only
 make sense if you verified all normalizations, not just native. And
 why should it just be svn:eol-style, when svn:keywords has potentially
 the same problem? Eventually you start verifying everything that affects
 file contents.

 Maybe that's a good idea, but /why/ put it in libsvn_repos when it's
 much easier and cleaner to just provide some standard hooks, written in
 C and distributed with releases, that the admin plug in if she feels
 like it? Surely that's what hooks are for.


New program 'svnhooks' with set of hooks for standard recommended
polices would be great. svn:eol-style check is good start.


-- 
Ivan Zhakov


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-10 Thread Daniel Shahaf
Branko Čibej wrote on Mon, Dec 10, 2012 at 11:51:08 +0100:
 On 10.12.2012 11:31, Daniel Shahaf wrote:
  Branko Čibej wrote on Mon, Dec 10, 2012 at 00:26:20 +0100:
  On 10.12.2012 00:08, Johan Corveleyn wrote:
  On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
  wrote:
  Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
  2) Am I the only one who wants to protect his repository against this
  corruption? Judging from [1], I don't think so. It doesn't make sense
  that everyone starts writing this pre-commit hook, for something that
  IMHO is an obvious anti-corruption protection. I think every
  repository out there deserves to be protected against this.
 
  FWIW, I suggested a hook because you could implement that in about
  5 minutes, whereas doing a C code change would take at least 10 times
  that (and several weeks if you have to wait for it to appear in a 1.7.x
  release that you can install at $WORK).  I won't object to C code
  verifying the svn:eol-style invariant.
  Thanks. And your pre-commit hook example is much appreciated.
 
  For the moment I get the impression that it's not really doable /
  desirable to implement this in the repository. At least until now
  no-one has suggested how it could be done, and I don't know enough
  myself about the server-side / back-end to figure it out :-).
  The first obstacle is that the server does not interpret properties.[1]
  Therefore, you'd have to implement this check at transaction-commit
  time, because there's no earlier point where you're guaranteed to have
  all node properties at their final values. This implies that the time to
  reject a commit would be proportional to the size of the commit (which
  isn't the case when it comes to conflict detection).
  Can't you do what stefan2 suggested, but in libsvn_repos?  The repos
  layer commit editor would remember for each file whether its textdelta
  has added a new 0x0D byte, then at close_edit() it would iterate all
  files in the commit --- and for each file only compare its svn:eol-style
  property to the by-now-precomputed did it it contain a 0x0D bit.
 
  I'm not sure how efficient it would be to parse for 0x0D's in
  libsvn_repos, though.  Maybe we should make this optional.
 
 
 It's not enough to just look for \r in the delta stream. Certainly
 wouldn't help with historically broken files.

I wasn't trying to fix historically broken files.  (That would require
a fulltext scan --- that's not a cheap computation.)  Do you see another
problem?

 Maybe that's a good idea, but /why/ put it in libsvn_repos when it's
 much easier and cleaner to just provide some standard hooks, written in
 C and distributed with releases, that the admin plug in if she feels
 like it? Surely that's what hooks are for.

The argument is that a Subversion server should be enforcing
Subversion's invariants.

That said, I'm not opposed to doing it via standard hooks.  It's a good
way to introduce the feature in a way that allows more easily changing
it before it hits the APIs and their strict compatibility rules.

 
 
 -- Brane
 
 
 -- 
 Branko Čibej
 Director of Subversion | WANdisco | www.wandisco.com
 


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-10 Thread Johan Corveleyn
On Mon, Dec 10, 2012 at 12:47 PM, Daniel Shahaf d...@daniel.shahaf.name wrote:
 Branko Čibej wrote on Mon, Dec 10, 2012 at 11:51:08 +0100:
 On 10.12.2012 11:31, Daniel Shahaf wrote:
  Branko Čibej wrote on Mon, Dec 10, 2012 at 00:26:20 +0100:
  On 10.12.2012 00:08, Johan Corveleyn wrote:
  On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
  wrote:
  Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
  2) Am I the only one who wants to protect his repository against this
  corruption? Judging from [1], I don't think so. It doesn't make sense
  that everyone starts writing this pre-commit hook, for something that
  IMHO is an obvious anti-corruption protection. I think every
  repository out there deserves to be protected against this.
 
  FWIW, I suggested a hook because you could implement that in about
  5 minutes, whereas doing a C code change would take at least 10 times
  that (and several weeks if you have to wait for it to appear in a 1.7.x
  release that you can install at $WORK).  I won't object to C code
  verifying the svn:eol-style invariant.
  Thanks. And your pre-commit hook example is much appreciated.
 
  For the moment I get the impression that it's not really doable /
  desirable to implement this in the repository. At least until now
  no-one has suggested how it could be done, and I don't know enough
  myself about the server-side / back-end to figure it out :-).
  The first obstacle is that the server does not interpret properties.[1]
  Therefore, you'd have to implement this check at transaction-commit
  time, because there's no earlier point where you're guaranteed to have
  all node properties at their final values. This implies that the time to
  reject a commit would be proportional to the size of the commit (which
  isn't the case when it comes to conflict detection).
  Can't you do what stefan2 suggested, but in libsvn_repos?  The repos
  layer commit editor would remember for each file whether its textdelta
  has added a new 0x0D byte, then at close_edit() it would iterate all
  files in the commit --- and for each file only compare its svn:eol-style
  property to the by-now-precomputed did it it contain a 0x0D bit.
 
  I'm not sure how efficient it would be to parse for 0x0D's in
  libsvn_repos, though.  Maybe we should make this optional.


 It's not enough to just look for \r in the delta stream. Certainly
 wouldn't help with historically broken files.

 I wasn't trying to fix historically broken files.  (That would require
 a fulltext scan --- that's not a cheap computation.)  Do you see another
 problem?

Indeed, historically broken files are not the focus here. They're
broken in the history of the repository anyway, so that's not really
fixable anymore. But if we can prevent new broken files from entering
the repository, that would be great.

 Maybe that's a good idea, but /why/ put it in libsvn_repos when it's
 much easier and cleaner to just provide some standard hooks, written in
 C and distributed with releases, that the admin plug in if she feels
 like it? Surely that's what hooks are for.

 The argument is that a Subversion server should be enforcing
 Subversion's invariants.

 That said, I'm not opposed to doing it via standard hooks.  It's a good
 way to introduce the feature in a way that allows more easily changing
 it before it hits the APIs and their strict compatibility rules.

Yes, that would be fine I think. I like Ivan's suggestion of creating
a new standard 'svnhooks' program or something like that, which can
then be part of standard distributions.

-- 
Johan


RE: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-10 Thread Bert Huijben


 -Original Message-
 From: Branko Čibej [mailto:br...@wandisco.com]
 Sent: maandag 10 december 2012 8:04
 To: dev@subversion.apache.org
 Subject: Re: enforcing LF-normalization for svn:eol-style=native files (issue
 #4065)
 
 On 10.12.2012 07:35, Bert Huijben wrote:
  I don’t think you have to wait until commit time: You could verify the
  commit base revision’s properties + changes. In the cases where the
  properties change before commit, the commit would fail for being out of
  date.
 
 That would imply you can't change properties and contents in the same
 commit, wouldn't it. Which I suspect is not the case? :)
 
 The problem is worse, you can theoretically reopen and modify a
 transaction any number of times before committing it, so you don't
 really know until txn commit time which properties actually apply to
 which file contents.

On the fs layer this is probably true, but in a normal editor drive through the 
ra layers this should never happen.

On the ra layer the final set of properties is valid in the file_close() 
callback. I'm just not sure if this is really an event that the server sees for 
ra_dav, or if this is implied by the committing.

Bert 




svnhooks program (was Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065))

2012-12-10 Thread Ivan Zhakov
On Mon, Dec 10, 2012 at 3:53 PM, Johan Corveleyn jcor...@gmail.com wrote:
 On Mon, Dec 10, 2012 at 12:47 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
[...]

 The argument is that a Subversion server should be enforcing
 Subversion's invariants.

 That said, I'm not opposed to doing it via standard hooks.  It's a good
 way to introduce the feature in a way that allows more easily changing
 it before it hits the APIs and their strict compatibility rules.

 Yes, that would be fine I think. I like Ivan's suggestion of creating
 a new standard 'svnhooks' program or something like that, which can
 then be part of standard distributions.

I'd like to share my thoughts about 'svnhooks' if someone is going to
implement it.

Introduce svnhooks program with the separate subcommand for each hook
kind: 'pre-commit', 'post-comit', 'pre-lock', 'pre-unlock and etc.
The polices for svnhooks program defined in conf\svnhooks.conf file.
This makes very easy for user to use svnhooks: just add line 'svnhooks
HOOK-TYPE $*' in each hook and tweak svnhooks.conf configuration file.

svnhooks configuration file has separate section for each policy. For example:
[[[
[check-eol-normalization]
enable = yes

[rev-propchange]
enable = yes
users = svnsync

[edit-log-message]
enable = yes
users = admin, @author
]]]

-- 
Ivan Zhakov


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Daniel Shahaf
Johan Corveleyn wrote on Sun, Dec 09, 2012 at 01:02:55 +0100:
 Last week a colleague managed to commit a non-LF-normalized
 svn:eol-style=native file in our repository again. As explained in
 issue #4065 [1], this causes all kinds of problems.
 
 I suspect there might be a bug in SVNKit, some edge-case where it
 doesn't correctly translate the to-be-committed files to
 LF-termination. But regardless, I'd like to protect my repository. I
 don't fully control the clients that people use, and those clients can
 always have bugs, ...
 
 So ... how hard would it be to fix issue #4065, making the server
 enforce the right eol-ness for correct operation? It's a genuine
 question, I have no idea :-).
 

for i in $(svnlook changed -t $TXN $REPOS); do
  if propget = native || propget = LF ; then 
svnlook cat -t $TXN $REPOS $i | xxd -c1 -p | grep -i 0d
  fi
done

And writing that made me realise... contains CR is such a simple
condition, that you don't need the fulltext for it --- you would be able
to implement it by looking at the literal new text segments within
svndiff streams directly.  However, I'm not quite sure whether this
observation is the key to a simple implementation or to an
unnecessarily-complicated one.

 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4065 - server
 should enforce LF normalization for svn:eol-style=native files
 
 -- 
 Johan


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Branko Čibej
On 09.12.2012 09:14, Daniel Shahaf wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 01:02:55 +0100:
 Last week a colleague managed to commit a non-LF-normalized
 svn:eol-style=native file in our repository again. As explained in
 issue #4065 [1], this causes all kinds of problems.

 I suspect there might be a bug in SVNKit, some edge-case where it
 doesn't correctly translate the to-be-committed files to
 LF-termination. But regardless, I'd like to protect my repository. I
 don't fully control the clients that people use, and those clients can
 always have bugs, ...

 So ... how hard would it be to fix issue #4065, making the server
 enforce the right eol-ness for correct operation? It's a genuine
 question, I have no idea :-).

 for i in $(svnlook changed -t $TXN $REPOS); do
   if propget = native || propget = LF ; then 
 svnlook cat -t $TXN $REPOS $i | xxd -c1 -p | grep -i 0d
   fi
 done

 And writing that made me realise... contains CR is such a simple
 condition, that you don't need the fulltext for it --- you would be able
 to implement it by looking at the literal new text segments within
 svndiff streams directly.  However, I'm not quite sure whether this
 observation is the key to a simple implementation or to an
 unnecessarily-complicated one.

 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4065 - server
 should enforce LF normalization for svn:eol-style=native files

The server does not look at the contents of files, or the value of
properties. It does not even know that properties /have/ semantics. The
only reasonable place to do this would be in a pre-commit hook. Anything
else would require a major change in the design of the server and/or
libsvn_repos.

If the client encounters non-normalized files with svn:eol-style=native
and does /not/ properly normalize them regardless, that's a client bug.
Though of course the file would immediately show up as modified as soon
as it was updated.

-- Brane


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Johan Corveleyn
On Sun, Dec 9, 2012 at 11:42 AM, Branko Čibej br...@wandisco.com wrote:
 On 09.12.2012 09:14, Daniel Shahaf wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 01:02:55 +0100:
 Last week a colleague managed to commit a non-LF-normalized
 svn:eol-style=native file in our repository again. As explained in
 issue #4065 [1], this causes all kinds of problems.

 I suspect there might be a bug in SVNKit, some edge-case where it
 doesn't correctly translate the to-be-committed files to
 LF-termination. But regardless, I'd like to protect my repository. I
 don't fully control the clients that people use, and those clients can
 always have bugs, ...

 So ... how hard would it be to fix issue #4065, making the server
 enforce the right eol-ness for correct operation? It's a genuine
 question, I have no idea :-).

 for i in $(svnlook changed -t $TXN $REPOS); do
   if propget = native || propget = LF ; then
 svnlook cat -t $TXN $REPOS $i | xxd -c1 -p | grep -i 0d
   fi
 done

 And writing that made me realise... contains CR is such a simple
 condition, that you don't need the fulltext for it --- you would be able
 to implement it by looking at the literal new text segments within
 svndiff streams directly.  However, I'm not quite sure whether this
 observation is the key to a simple implementation or to an
 unnecessarily-complicated one.

 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4065 - server
 should enforce LF normalization for svn:eol-style=native files

 The server does not look at the contents of files, or the value of
 properties. It does not even know that properties /have/ semantics. The
 only reasonable place to do this would be in a pre-commit hook. Anything
 else would require a major change in the design of the server and/or
 libsvn_repos.

 If the client encounters non-normalized files with svn:eol-style=native
 and does /not/ properly normalize them regardless, that's a client bug.
 Though of course the file would immediately show up as modified as soon
 as it was updated.

Bah, that just sounds like the client (any client) should paper over
these corrupt files. I don't think that's a good solution.

These files are present in the repository in a corrupt form, not
following the expected invariant that they should have LF line
endings. If you 'svnlook cat' such a file, it will have CRLF line
endings, while normal svn:eol-style=native files will always have LF
line endings when 'svnlook cat'ed.

Any client that diffs that change ('svnlook diff', 'svn diff -c $REV
$REPOS', ...), will see a full file diff. They would all have to
normalize that corrupt file on the fly to give a good diff like it
should have been if it weren't corrupted.

Note that the Apache repository itself already contains such
corruptions (see [1], the user thread started by Konstantin Kolinko,
which prompted me to file the issue). Any large repository will have
them, because of the diversity of svn clients and bugs they may
contain. These buggy clients ruin the experience for everyone else.
I'd like there to be some better protection against this problem,
centrally.

A pre-commit hook is an option, and I'll probably have to go that
route if there is no other way. But I see two problems with it:

1) It's dog slow. Doing an svnlook pg and then potentially svnlook
cat for every changed file can easily make the pre-commit hook take
longer than the standard client-side timeout of 30 seconds (for neon,
dunno about the serf timeout). Just imagine doing a catch-up merge
touching 1000 files. Parsing both the properties and the content from
one svnlook diff call might be a workaround, but yuck.

2) Am I the only one who wants to protect his repository against this
corruption? Judging from [1], I don't think so. It doesn't make sense
that everyone starts writing this pre-commit hook, for something that
IMHO is an obvious anti-corruption protection. I think every
repository out there deserves to be protected against this.

[1] http://svn.haxx.se/users/archive-2011-10/0287.shtml

-- 
Johan


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Daniel Shahaf
Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.
 

FWIW, I suggested a hook because you could implement that in about
5 minutes, whereas doing a C code change would take at least 10 times
that (and several weeks if you have to wait for it to appear in a 1.7.x
release that you can install at $WORK).  I won't object to C code
verifying the svn:eol-style invariant.

 [1] http://svn.haxx.se/users/archive-2011-10/0287.shtml
 
 -- 
 Johan


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Johan Corveleyn
On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.


 FWIW, I suggested a hook because you could implement that in about
 5 minutes, whereas doing a C code change would take at least 10 times
 that (and several weeks if you have to wait for it to appear in a 1.7.x
 release that you can install at $WORK).  I won't object to C code
 verifying the svn:eol-style invariant.

Thanks. And your pre-commit hook example is much appreciated.

For the moment I get the impression that it's not really doable /
desirable to implement this in the repository. At least until now
no-one has suggested how it could be done, and I don't know enough
myself about the server-side / back-end to figure it out :-).

(side-note: I don't want to rush anything and get it backported to a
1.7.x release. For the moment I can manage (I watch the problem from
an asynchronous post-commit hook, and fix it as soon as it happens,
which is only a couple of times a year in our repos). But I'm
interested in a long-term solution to this, so this wart can be
eliminated :-)).

As for my 1st concern, about the slowness of the pre-commit hook
validation, it might be possible to make it faster if both 'svnlook
pg' and 'svnlook cat' would support multiple targets.

-- 
Johan


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Daniel Shahaf
Johan Corveleyn wrote on Mon, Dec 10, 2012 at 00:08:26 +0100:
 On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
  Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
  2) Am I the only one who wants to protect his repository against this
  corruption? Judging from [1], I don't think so. It doesn't make sense
  that everyone starts writing this pre-commit hook, for something that
  IMHO is an obvious anti-corruption protection. I think every
  repository out there deserves to be protected against this.
 
 
  FWIW, I suggested a hook because you could implement that in about
  5 minutes, whereas doing a C code change would take at least 10 times
  that (and several weeks if you have to wait for it to appear in a 1.7.x
  release that you can install at $WORK).  I won't object to C code
  verifying the svn:eol-style invariant.
 
 Thanks. And your pre-commit hook example is much appreciated.
 
 For the moment I get the impression that it's not really doable /
 desirable to implement this in the repository. At least until now
 no-one has suggested how it could be done, and I don't know enough
 myself about the server-side / back-end to figure it out :-).
 

Well, as Brane said the logic should live in libsvn_repos.  I would
guess the code path of interest is the commit editor.  (The property
validations live in svn_repos__validate_prop(), in fs-wrap.c.)

 (side-note: I don't want to rush anything and get it backported to a
 1.7.x release. For the moment I can manage (I watch the problem from
 an asynchronous post-commit hook, and fix it as soon as it happens,
 which is only a couple of times a year in our repos). But I'm
 interested in a long-term solution to this, so this wart can be
 eliminated :-)).
 

Well, just extend the wart to also fix it as soon as it happens:

% svn cat $FILE | dos2unix | svnmucc -mm put /dev/stdin $FILE
(BTW, there's a race condition in this line --- it's not guaranteed that
'cat' and 'put' open the RA session to the same base revision)

 As for my 1st concern, about the slowness of the pre-commit hook
 validation, it might be possible to make it faster if both 'svnlook
 pg' and 'svnlook cat' would support multiple targets.

Or if you wrote your hook script using the Python bindings.  (only one
svn_fs_open call, intra-process caches, ...)

 
 -- 
 Johan


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Branko Čibej
On 10.12.2012 00:08, Johan Corveleyn wrote:
 On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.

 FWIW, I suggested a hook because you could implement that in about
 5 minutes, whereas doing a C code change would take at least 10 times
 that (and several weeks if you have to wait for it to appear in a 1.7.x
 release that you can install at $WORK).  I won't object to C code
 verifying the svn:eol-style invariant.
 Thanks. And your pre-commit hook example is much appreciated.

 For the moment I get the impression that it's not really doable /
 desirable to implement this in the repository. At least until now
 no-one has suggested how it could be done, and I don't know enough
 myself about the server-side / back-end to figure it out :-).

The first obstacle is that the server does not interpret properties.[1]
Therefore, you'd have to implement this check at transaction-commit
time, because there's no earlier point where you're guaranteed to have
all node properties at their final values. This implies that the time to
reject a commit would be proportional to the size of the commit (which
isn't the case when it comes to conflict detection).

All properties are interpreted by clients. A buggy client will cause
cause problems for other users, so the best course of action is to
report the bug (to SvnKit in this case?).

Also note that you're using a much too strong term when you talk about
corrupted files in this case. There's nothing corrupted about them.


-- Brane

[1] Not strictly true since mod_dav_svn will look at svn:mime-type when
serving the content at its default, unversioned URL; but it doesn't
actually interpret the value.

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Branko Čibej
On 10.12.2012 00:21, Daniel Shahaf wrote:
 As for my 1st concern, about the slowness of the pre-commit hook
 validation, it might be possible to make it faster if both 'svnlook
 pg' and 'svnlook cat' would support multiple targets.
 Or if you wrote your hook script using the Python bindings.  (only one
 svn_fs_open call, intra-process caches, ...)

Or you can write that pre-commit hook in C, after all, using our very
own libraries. It'll take a bit longer than doing it in Python, but much
less than trying to shoehorn something into libsvn_repos that wasn't
designed to be there.

(N.B.: Even if it /was/ in libsvn_repos, it'd just get called at the
same time as the pre-commit hook; so you'd save a fork+exec and a few
open calls, but not that much more.)

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Stefan Fuhrmann
On Mon, Dec 10, 2012 at 12:26 AM, Branko Čibej br...@wandisco.com wrote:

 On 10.12.2012 00:08, Johan Corveleyn wrote:
  On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name
 wrote:
  Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
  2) Am I the only one who wants to protect his repository against this
  corruption? Judging from [1], I don't think so. It doesn't make sense
  that everyone starts writing this pre-commit hook, for something that
  IMHO is an obvious anti-corruption protection. I think every
  repository out there deserves to be protected against this.
 
  FWIW, I suggested a hook because you could implement that in about
  5 minutes, whereas doing a C code change would take at least 10 times
  that (and several weeks if you have to wait for it to appear in a 1.7.x
  release that you can install at $WORK).  I won't object to C code
  verifying the svn:eol-style invariant.
  Thanks. And your pre-commit hook example is much appreciated.
 
  For the moment I get the impression that it's not really doable /
  desirable to implement this in the repository. At least until now
  no-one has suggested how it could be done, and I don't know enough
  myself about the server-side / back-end to figure it out :-).

 The first obstacle is that the server does not interpret properties.[1]
 Therefore, you'd have to implement this check at transaction-commit
 time, because there's no earlier point where you're guaranteed to have
 all node properties at their final values. This implies that the time to
 reject a commit would be proportional to the size of the commit (which
 isn't the case when it comes to conflict detection).


Well, if we really want to implement such a feature,
we may as well be a bit more clever about it: While
writing incoming file contents to the protrev file, we
may determine its NL style as we go and store the
result. The latter could then be compared to the props
setting during commit time.

-- Stefan^2.

-- 
Certified  Supported Apache Subversion Downloads:
*

http://www.wandisco.com/subversion/download
*


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Johan Corveleyn
On Mon, Dec 10, 2012 at 12:26 AM, Branko Čibej br...@wandisco.com wrote:
 On 10.12.2012 00:08, Johan Corveleyn wrote:
 On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.

 FWIW, I suggested a hook because you could implement that in about
 5 minutes, whereas doing a C code change would take at least 10 times
 that (and several weeks if you have to wait for it to appear in a 1.7.x
 release that you can install at $WORK).  I won't object to C code
 verifying the svn:eol-style invariant.
 Thanks. And your pre-commit hook example is much appreciated.

 For the moment I get the impression that it's not really doable /
 desirable to implement this in the repository. At least until now
 no-one has suggested how it could be done, and I don't know enough
 myself about the server-side / back-end to figure it out :-).

 The first obstacle is that the server does not interpret properties.[1]
 Therefore, you'd have to implement this check at transaction-commit
 time, because there's no earlier point where you're guaranteed to have
 all node properties at their final values. This implies that the time to
 reject a commit would be proportional to the size of the commit (which
 isn't the case when it comes to conflict detection).

 All properties are interpreted by clients. A buggy client will cause
 cause problems for other users, so the best course of action is to
 report the bug (to SvnKit in this case?).

That's actually pretty hard, because I can't reproduce the issue. I
only suspect svnkit because I've only seen it (at our company) with
commits coming from IntelliJ IDEA, which uses svnkit under the hood.

Another known offender is git-svn, because they don't care about
translating line termination. So if you use git-svn on Windows, you
might end up committing an eol-style=native file with CRLF's in to the
repository.

 Also note that you're using a much too strong term when you talk about
 corrupted files in this case. There's nothing corrupted about them.

Well, it does mess up all diffs (client-side, server side for
post-commit mails, ...) made against that revision.

Plus it causes these files to be marked as modified in working copies
of other users ... sometimes (when their timestamps mismatch with
their timestamp-in-wc.db, so the content will be checksummed -- if the
timestamps are still ok, it will not show up as modified). This in
turn can cause totally confusing tree conflicts when a parent dir is
moved (confusing because there's a local edit that the user knows
nothing about).

But okay, corrupt may be exaggerating a bit :-) ...

 -- Brane

 [1] Not strictly true since mod_dav_svn will look at svn:mime-type when
 serving the content at its default, unversioned URL; but it doesn't
 actually interpret the value.

 --
 Branko Čibej
 Director of Subversion | WANdisco | www.wandisco.com




-- 
Johan


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Branko Čibej
On 10.12.2012 01:24, Stefan Fuhrmann wrote:
 Well, if we really want to implement such a feature,
 we may as well be a bit more clever about it: While
 writing incoming file contents to the protrev file, we
 may determine its NL style as we go and store the
 result. The latter could then be compared to the props
 setting during commit time.

Premature optimization. We haven't even decided we want this in
libsvn_repos, let alone libsvn_fs_fs, which is the only place where
protorev file is even a meaningful term.

-- Brane


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Branko Čibej
On 10.12.2012 01:25, Johan Corveleyn wrote:
 Another known offender is git-svn, because they don't care about
 translating line termination. So if you use git-svn on Windows, you
 might end up committing an eol-style=native file with CRLF's in to the
 repository.

Yuck. We should shout *really* loudly out to the world about broken
git-svn then, if it doesn't honour our wire protocol. This might even
get me to lean towards rejecting these kinds of commits by default.

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



RE: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Bert Huijben
I don’t think you have to wait until commit time: You could verify the
commit base revision’s properties + changes. In the cases where the
properties change before commit, the commit would fail for being out of
date.

Bert

 *From:* Branko Čibej
*Sent:* ‎December‎ ‎10‎, ‎2012 ‎12‎:‎26‎ ‎AM
*To:* dev@subversion.apache.org
*Subject:* Re: enforcing LF-normalization for svn:eol-style=native files
(issue #4065)

On 10.12.2012 00:08, Johan Corveleyn wrote:
 On Sun, Dec 9, 2012 at 11:43 PM, Daniel Shahaf d...@daniel.shahaf.name
wrote:
 Johan Corveleyn wrote on Sun, Dec 09, 2012 at 21:15:24 +0100:
 2) Am I the only one who wants to protect his repository against this
 corruption? Judging from [1], I don't think so. It doesn't make sense
 that everyone starts writing this pre-commit hook, for something that
 IMHO is an obvious anti-corruption protection. I think every
 repository out there deserves to be protected against this.

 FWIW, I suggested a hook because you could implement that in about
 5 minutes, whereas doing a C code change would take at least 10 times
 that (and several weeks if you have to wait for it to appear in a 1.7.x
 release that you can install at $WORK).  I won't object to C code
 verifying the svn:eol-style invariant.
 Thanks. And your pre-commit hook example is much appreciated.

 For the moment I get the impression that it's not really doable /
 desirable to implement this in the repository. At least until now
 no-one has suggested how it could be done, and I don't know enough
 myself about the server-side / back-end to figure it out :-).

The first obstacle is that the server does not interpret properties.[1]
Therefore, you'd have to implement this check at transaction-commit
time, because there's no earlier point where you're guaranteed to have
all node properties at their final values. This implies that the time to
reject a commit would be proportional to the size of the commit (which
isn't the case when it comes to conflict detection).

All properties are interpreted by clients. A buggy client will cause
cause problems for other users, so the best course of action is to
report the bug (to SvnKit in this case?).

Also note that you're using a much too strong term when you talk about
corrupted files in this case. There's nothing corrupted about them.


-- Brane

[1] Not strictly true since mod_dav_svn will look at svn:mime-type when
serving the content at its default, unversioned URL; but it doesn't
actually interpret the value.

--
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com


Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-09 Thread Branko Čibej
On 10.12.2012 07:35, Bert Huijben wrote:
 I don’t think you have to wait until commit time: You could verify the
 commit base revision’s properties + changes. In the cases where the
 properties change before commit, the commit would fail for being out of
 date.

That would imply you can't change properties and contents in the same
commit, wouldn't it. Which I suspect is not the case? :)

The problem is worse, you can theoretically reopen and modify a
transaction any number of times before committing it, so you don't
really know until txn commit time which properties actually apply to
which file contents.

-- Brane


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com



enforcing LF-normalization for svn:eol-style=native files (issue #4065)

2012-12-08 Thread Johan Corveleyn
Last week a colleague managed to commit a non-LF-normalized
svn:eol-style=native file in our repository again. As explained in
issue #4065 [1], this causes all kinds of problems.

I suspect there might be a bug in SVNKit, some edge-case where it
doesn't correctly translate the to-be-committed files to
LF-termination. But regardless, I'd like to protect my repository. I
don't fully control the clients that people use, and those clients can
always have bugs, ...

So ... how hard would it be to fix issue #4065, making the server
enforce the right eol-ness for correct operation? It's a genuine
question, I have no idea :-).

[1] http://subversion.tigris.org/issues/show_bug.cgi?id=4065 - server
should enforce LF normalization for svn:eol-style=native files

-- 
Johan