Detecting CR eol

2010-09-08 Thread Giulio Troccoli
I am writing a pre-commit hook script in perl. One of the requirements is that 
all files (luckily they are all text files) have the svn:eol-style property set 
to LF and the actual eol is indeed LF. If that's not the case I will reject the 
commit and direct the user to a page on our intranet to explain what to do to 
fix it.

My problem is how to detect whether the eol is LF and nothing else. I'm 
developing on Linux (Centos 5) and Perl 5.10. Subversion is 1.6.9, if it 
matters.

I thought about using the dos2unix utility (we only use Windows or Linux) and 
then check that the file hasn't changed, but it seems a lot of processing.

My second idea was to use a regular expression to check each line of each file. 
This way at least I would stop as soon as I find an eol that is not LF, saving 
some processing. I still need to svn cat each file into an array I think.

I know this is a common requirement but I don't know whether anyone has already 
done it in Perl. I would be grateful for any comment or suggestions of course.

Giulio


Linedata Limited
Registered Office: 85 Gracechurch St., London, EC3V 0AA
Registered in England and Wales No 3475006 VAT Reg No 710 3140 03






Re: Detecting CR eol

2010-09-08 Thread Csaba Raduly
Hi Giulio,

On Wed, Sep 8, 2010 at 10:25 AM, Giulio Troccoli  wrote:
> I am writing a pre-commit hook script in perl. One of the requirement is that 
> all files (luckily they are all text files) have the svn:eol-style property 
> set to LF and the actual eol is indeed LF. If that's not the case I will 
> reject the commit and direct the user to a page on our intranet to explain 
> what to do to fix it.
>
> My problem is how to detect whether the eol is LF and nothing else. I'm 
> developing on Linux (Centos 5) and Perl 5.10. Subversion is 1.6.9, if it 
> matters.
>
> I thought about using the dos2unix utility (we only use Windows or Linux) and 
> then check that the file hasn't changed, but it seems a lot of processing.
>
> My second idea was to use a regular expression to check each line of each 
> file. This way at least I would stop as soon as I find an eol that is not LF, 
> saving some processing. I still need to svn cat each file into an array I 
> think.
>

You need to use svnlook cat, but there is no need to read all its
output into memory. You can process it line-by-line.
Here's an outline (completely untested)

#!/usr/bin/perl -w
use strict;

my ($REPOS, $TXN) = @ARGV;

my $crlf = 0;
my $bad_file = ''; # name of the first file found with a CR line end

# determine the list of files
my @files = `svnlook changed -t $TXN $REPOS`;
chomp @files; # remove the newline at the end
s/^U\s+// for @files; # remove the leading U

FILE:
foreach my $file (@files) {
  open (SVN, "svnlook cat -t $TXN $REPOS $file |") or die "open pipe failed: $!";
  while (<SVN>) # read from the pipe, one line at a time
  {
    # cut the platform-specific line end. On Unix, this drops
    # the \n but keeps the \r
    chomp;
    if ( /\r$/ ) { # last character is a \r (a.k.a. Control-M)
      $crlf = 1; $bad_file = $file; last FILE;
    }
  }
  # it is very important to check the close on pipes
  close(SVN) or die "close pipe failed: $!";
}

if ($crlf)
{
  die "$bad_file contains DOS line endings";
}
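
The line-by-line check in the outline above could also be factored into a
small function that works on any filehandle, which makes it testable without
a repository. This is only a sketch under assumptions: first_non_lf_line is a
hypothetical helper name, and it treats any CR anywhere in a line (CRLF or a
bare CR) as a violation, which is slightly stricter than the outline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 1-based number of the first line that contains a CR
# (CRLF or a bare CR), or 0 when every line uses plain LF.
# first_non_lf_line is a hypothetical helper, not part of Subversion.
sub first_non_lf_line {
    my ($fh) = @_;
    binmode $fh;          # look at raw bytes, no CRLF translation layer
    local $/ = "\n";      # split on LF only, whatever the caller has set
    my $line_no = 0;
    while (my $line = <$fh>) {
        $line_no++;
        return $line_no if $line =~ /\r/;   # stop at the first offender
    }
    return 0;
}

# Demonstration on in-memory data instead of a real svnlook pipe.
open my $good, '<', \"one\ntwo\n"          or die "in-memory open failed: $!";
open my $bad,  '<', \"one\ntwo\r\nthree\n" or die "in-memory open failed: $!";
print "good: ", first_non_lf_line($good), "\n";
print "bad:  ", first_non_lf_line($bad),  "\n";
```

In a hook the filehandle would presumably come from a list-form pipe such as
open(my $fh, '-|', 'svnlook', 'cat', '-t', $TXN, $REPOS, $file), which avoids
the shell and therefore copes with file names that contain spaces.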


-- 
Life is complex, with real and imaginary parts.
"Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds
"People disagree with me. I just ignore them." -- Linus Torvalds


Re: Detecting CR eol

2010-09-08 Thread Campbell Allan

On Wednesday 08 Sep 2010, Csaba Raduly wrote:
> Hi Giulio,
>
> On Wed, Sep 8, 2010 at 10:25 AM, Giulio Troccoli  wrote:
> > I am writing a pre-commit hook script in perl. One of the requirement is
> > that all files (luckily they are all text files) have the svn:eol-style
> > property set to LF and the actual eol is indeed LF. If that's not the
> > case I will reject the commit and direct the user to a page on our
> > intranet to explain what to do to fix it.
> >
> > My problem is how to detect whether the eol is LF and nothing else. I'm
> > developing on Linux (Centos 5) and Perl 5.10. Subversion is 1.6.9, if it
> > matters.
> >
> > I thought about using the dos2unix utility (we only use Windows or Linux)
> > and then check that the file hasn't changed, but it seems a lot of
> > processing.
> >
> > My second idea was to use a regular expression to check each line of each
> > file. This way at least I would stop as soon as I find an eol that is not
> > LF, saving some processing. I still need to svn cat each file into an
> > array I think.
>
> You need to use svnlook cat, but there is no need to read all its
> output into memory. You can process it line-by-line.
> Here's an outline (completely untested)
>
> #!/usr/bin/perl -w
> use strict;
>
> my ($REPOS, $TXN) = @ARGV;
>
> my $crlf = 0;
>
> ... determine the list of files
> my @files = `svnlook changed -t $TXN $REPOS`;
> chomp @files; # remove the newline at the end
> s/^U\s+// for @files; # remove the leading U
>
> FILE:
> foreach my $file (@files) {
>   open (SVN, "svnlook cat $file |") or die "open pipe failed: $!"
>   while (<SVN>) # read from the pipe, one line at a time
>   {
> chomp; # cut the platform-specific line end. On Unix, this drops
> the \n but keeps the \r
> if ( /^M$/ ) { # last character is a \r (a.k.a. Control-M)
>   $crlf = 1; last FILE;
> }
>   }
>   close(SVN) or die "close pipe failed: $!" # it is very important to
> check the close on pipes
> }
>
> if ($crlf)
> {
>   die "$file contains DOS line endings";
> }

I don't believe you have to go to so much trouble in the pre-commit hook. If 
you have set the svn:eol-style property then Subversion will ensure the file 
has those line endings on checkout and update them when committing into the 
repository. So all the hook needs to do is check for the property. See the 
book for more details:

http://svnbook.red-bean.com/nightly/en/svn-book.html#svn.advanced.props.special.eol-style
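
A property-only check along these lines might look like the sketch below. It
is hedged and not tested against a live repository: it assumes that
"svnlook propget -t TXN REPOS svn:eol-style PATH" prints the value and exits
non-zero when the property is absent, and eol_style_is_lf and get_eol_style
are hypothetical helper names.

```perl
use strict;
use warnings;

# Decide whether a property value read from svnlook is the required one.
# $value is undef when the property is not set at all.
sub eol_style_is_lf {
    my ($value) = @_;
    return 0 unless defined $value;
    $value =~ s/\s+\z//;          # tolerate a trailing newline
    return $value eq 'LF' ? 1 : 0;
}

# Read svn:eol-style for one path in the transaction being committed.
# Returns undef if svnlook cannot be run or exits with an error
# (assumed here to mean "property not set").
sub get_eol_style {
    my ($repos, $txn, $path) = @_;
    # List-form pipe open: no shell is involved, so odd file names are safe.
    open(my $fh, '-|', 'svnlook', 'propget', '-t', $txn, $repos,
         'svn:eol-style', $path)
        or return;
    local $/;                     # slurp mode, the value is short
    my $value = <$fh>;
    close($fh) or return;         # non-zero exit status
    return $value;
}
```

A hook would then reject the commit whenever
eol_style_is_lf(scalar get_eol_style($REPOS, $TXN, $file)) is false.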

I'd also normally expect the line ending style to be set to native so Windows 
and Unix users don't trample the existing incompatible line endings. The only 
reason perhaps for checking each file explicitly would be if there was 
something else needing the files to be in a particular format, i.e. releases to 
customers from a developer machine rather than an official build server that 
would check out a clean copy each time.

-- 

__
Sword Ciboodle is the trading name of ciboodle Limited (a company 
registered in Scotland with registered number SC143434 and whose 
registered office is at India of Inchinnan, Renfrewshire, UK, 
PA4 9LH) which is part of the Sword Group of companies.

This email (and any attachments) is intended for the named
recipient(s) and is private and confidential. If it is not for you, 
please inform us and then delete it. If you are not the intended 
recipient(s), the use, disclosure, copying or distribution of any 
information contained within this email is prohibited. Messages to 
and from us may be monitored. If the content is not about the 
business of the Sword Group then the message is neither from nor 
sanctioned by us.

Internet communications are not secure. You should scan this
message and any attachments for viruses. Under no circumstances
do we accept liability for any loss or damage which may result from
your receipt of this email or any attachment.
__



RE: Detecting CR eol

2010-09-08 Thread Giulio Troccoli

> I don't believe you have to go to so much trouble in the
> pre-commit hook. If you have set the svn:eol-style property
> then subversion will ensure the file has those line endings
> on checkout and update them when committing into the
> repository. So all the hook needs to do is check for the
> property. See the book for more details
>
> http://svnbook.red-bean.com/nightly/en/svn-book.html#svn.advanced.props.special.eol-style

I'm not sure. Are you saying that if I set the svn:eol-style property to LF, 
for example, and my file has at least one line ending with CRLF, then 
Subversion will reject the commit? The book doesn't quite say that, and that 
wasn't my understanding of how the property works.

> I'd also normally expect the line ending style to be set to
> native so windows and unix users don't trample the existing
> incompatible line endings. The only reason perhaps for
> checking each file explicitly would be if there was something
> else needing the files to be in a particular format, ie
> releases to customers from a developer machine rather than an
> official build server that would check out a clean copy each time.

The requirement to have LF came a long time ago. I remember having problems 
with svn:eol-style set to native. I think Subversion did not checkout the files 
with the correct EOL based on the platform, but maybe that was because the 
files were actually being committed with mixed EOLs.

G





Re: Detecting CR eol

2010-09-08 Thread Ulrich Eckhardt
On Wednesday 08 September 2010, Giulio Troccoli wrote:
> I remeber having problems with svn:eol-style set to native. I think
> Subversion did not checkout the files with the correct EOL based on
> the platform, but maybe that was because the files were actually
> being committed with mixed EOLs. 

It works for me.

That said, typical errors seem to be sharing working copies between different 
OSs, like Windows and Cygwin or Linux.

Uli

-- 
ML: http://subversion.tigris.org/mailing-list-guidelines.html
FAQ: http://subversion.tigris.org/faq.html
Docs: http://svnbook.red-bean.com/

Sator Laser GmbH, Fangdieckstraße 75a, 22547 Hamburg, Deutschland
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

This e-mail, including all attachments, is intended only for the addressee 
and may contain confidential information. Please notify the sender 
immediately if you are not the intended recipient. In that case the e-mail 
must be deleted and may not be read, forwarded, published or otherwise used.
E-mails can be read by third parties and may contain viruses and unauthorized 
changes. Sator Laser GmbH is not responsible for these consequences.
**



RE: Detecting CR eol

2010-09-08 Thread Giulio Troccoli
> On Wednesday 08 September 2010, Giulio Troccoli wrote:
> > I remeber having problems with svn:eol-style set to native. I think
> > Subversion did not checkout the files with the correct EOL based on
> > the platform, but maybe that was because the files were actually
> > being committed with mixed EOLs.
>
> It works for me.
>
> That said, typical errors seem to be sharing working copies
> between different OSs, like Windows and Cygwin or Linux.

We don't share working copies, of that I'm sure.

I'll do some testing; after all, we were using SVN 1.3, if not 1.2, at that 
time.

G





Re: Detecting CR eol

2010-09-08 Thread Campbell Allan

On Wednesday 08 Sep 2010, Giulio Troccoli wrote:
> > I don't believe you have to go to so much trouble in the
> > pre-commit hook. If you have set the svn:eol-style property
> > then subversion will ensure the file has those line endings
> > on checkout and update them when committing into the
> > repository. So all the hook needs to do is check for the
> > property. See the book for more details
> >
> > http://svnbook.red-bean.com/nightly/en/svn-book.html#svn.advanced.props.special.eol-style
>
> I'm not sure. Are you saying that if I set the svn:eol-style property to
> LF, for example, and my file has at least one line ending with CRLF, then
> Subversion will reject the commit? The book doesn't quite say that, and
> that wasn't my understanding on how the property works.
>

Before sending my previous reply I had tested it with a file changed using 
unix2dos. Prior to the commit svn diff only shows the text changes ignoring 
the line endings. I haven't explicitly tested changing a single line ending 
within the file but have done a quick concatenation test with half the file 
with LF and the other half CRLF. When committed the entire file in the 
working copy is changed to LF.

The part of the book that I felt was relevant is when the line ending is set 
to native subversion will store the file in the repository with LF's only. 
The client is then changing this to reflect the preferences of the client OS.

> > I'd also normally expect the line ending style to be set to
> > native so windows and unix users don't trample the existing
> > incompatible line endings. The only reason perhaps for
> > checking each file explicitly would be if there was something
> > else needing the files to be in a particular format, ie
> > releases to customers from a developer machine rather than an
> > official build server that would check out a clean copy each time.
>
> The requirement, to have LF, came a long time ago. I remeber having
> problems with svn:eol-style set to native. I think Subversion did not
> checkout the files with the correct EOL based on the platform, but maybe
> that was because the files were actually being committed with mixed EOLs.
>
> G

This may have occurred if the property was set after the files had been 
versioned with the mixed line endings, but I'm only guessing. Whenever I've 
set the property I've also run the dos2unix command on the files. We've not 
noticed any problems with the value being set to native and we're running a 
fairly old server (1.4.6) with mainly 1.5 clients.




Re: Detecting CR eol

2010-09-08 Thread Ryan Schmidt
On Sep 8, 2010, at 07:45, Campbell Allan wrote:
> On Wednesday 08 Sep 2010, Csaba Raduly wrote:
> I don't believe you have to go to so much trouble in the pre-commit hook. If 
> you have set the svn:eol-style property then subversion will ensure the file 
> has those line endings on checkout and update them when committing into the 
> repository. So all the hook needs to do is check for the property. See the 
> book for more details

I understood this was the client's responsibility. So while the official 
Subversion client does this, and presumably the reputable other clients that 
use the Subversion libraries do this, there is nothing on the server side that 
enforces that this is the case. I know it is possible to get files with the 
wrong eol style into the repository by loading a dump file; maybe it is also 
possible using language bindings. So checking in a pre-commit hook script that 
line endings of files with svn:eol-style set are indeed LF seems like a smart 
idea to me.



Re: Detecting CR eol

2010-09-08 Thread Ryan Schmidt

On Sep 8, 2010, at 10:27, Campbell Allan wrote:

> Before sending my previous reply I had tested it with a file changed using 
> unix2dos. Prior to the commit svn diff only shows the text changes ignoring 
> the line endings. I haven't explicitly tested changing a single line ending 
> within the file but have done a quick concatenation test with half the file 
> with LF and the other half CRLF. When committed the entire file in the 
> working copy is changed to LF.

As I recall, if a file with svn:eol-style set has inconsistent line endings 
(e.g. some LF, some CRLF), Subversion will reject the commit and require the 
user to make the file's line endings consistent before proceeding. Though I 
don't know whether this is happening on the client or on the server.


> The part of the book that I felt was relevant is when the line ending is set 
> to native subversion will store the file in the repository with LF's only. 
> The client is then changing this to reflect the preferences of the client OS.

My understanding is that if svn:eol-style is set to *any value* then the 
repository stores the file with LF line endings and the client does eol 
translation to your desired style.




Re: Detecting CR eol

2010-09-09 Thread Campbell Allan

On Wednesday 08 Sep 2010, Ryan Schmidt wrote:
> On Sep 8, 2010, at 10:27, Campbell Allan wrote:
> > Before sending my previous reply I had tested it with a file changed
> > using unix2dos. Prior to the commit svn diff only shows the text changes
> > ignoring the line endings. I haven't explicitly tested changing a single
> > line ending within the file but have done a quick concatenation test with
> > half the file with LF and the other half CRLF. When committed the entire
> > file in the working copy is changed to LF.
>
> As I recall, if a file with svn:eol-style set has inconsistent line endings
> (e.g. some LF, some CRLF), Subversion will reject the commit and require
> the user to make the file's line endings consistent before proceeding.
> Though I don't know whether this is happening on the client or on the
> server.
>

Originally I thought the same, which is why I tested it, but Subversion only 
complains if the svn:eol-style property is not set. If the property is set then 
the official client (1.6.12 with a 1.6.11 server; I've not tested others) 
converts the files in the working copy on commit. Diffs show only the text 
changes, ignoring the line endings. The only question I can't answer is whether 
the server would reject the commit if the client does not do the conversion. 
This would almost seem like a bug, though, unless the svn:eol-style property is 
only meant as a hint to the client, in which case the documentation should be 
updated.

I've got a test script for repeating this quickly that I can post. I did 
notice there appears to be an inconsistency, but I don't believe this is a bug 
in Subversion as unix2dos also exhibits the same problem. The test script 
concatenates three smaller files together to create a larger file. When 
Subversion or unix2dos converts this file to have CRLF endings, the resulting 
file only contains the first of the three smaller files.


> > The part of the book that I felt was relevant is when the line ending is
> > set to native subversion will store the file in the repository with LF's
> > only. The client is then changing this to reflect the preferences of the
> > client OS.
>
> My understanding is that if svn:eol-style is set to *any value* then the
> repository stores the file with LF line endings and the client does eol
> translation to your desired style.





Re: Detecting CR eol

2010-09-09 Thread Campbell Allan

On Wednesday 08 Sep 2010, Csaba Raduly wrote:
> Hi Giulio,
>
> On Wed, Sep 8, 2010 at 10:25 AM, Giulio Troccoli  wrote:
> > I am writing a pre-commit hook script in perl. One of the requirement is
> > that all files (luckily they are all text files) have the svn:eol-style
> > property set to LF and the actual eol is indeed LF. If that's not the
> > case I will reject the commit and direct the user to a page on our
> > intranet to explain what to do to fix it.
> >
> > My problem is how to detect whether the eol is LF and nothing else. I'm
> > developing on Linux (Centos 5) and Perl 5.10. Subversion is 1.6.9, if it
> > matters.
> >
> > I thought about using the dos2unix utility (we only use Windows or Linux)
> > and then check that the file hasn't changed, but it seems a lot of
> > processing.
> >
> > My second idea was to use a regular expression to check each line of each
> > file. This way at least I would stop as soon as I find an eol that is not
> > LF, saving some processing. I still need to svn cat each file into an
> > array I think.
>
> You need to use svnlook cat, but there is no need to read all its
> output into memory. You can process it line-by-line.
> Here's an outline (completely untested)
>

I had written something similar for someone else on here for checking that 
properties are set, but I like this approach better. My only comment is that 
it assumes only updates are occurring. It will fail on any adds, removals or 
property changes, as the filename will not be stripped properly. My Perl is 
too rusty to do it as succinctly as this, though.

> #!/usr/bin/perl -w
> use strict;
>
> my ($REPOS, $TXN) = @ARGV;
>
> my $crlf = 0;
>
> ... determine the list of files
> my @files = `svnlook changed -t $TXN $REPOS`;

perhaps this to filter out removed files?

my @files = `svnlook changed -t $TXN $REPOS | grep -E '^[AU]'`;

> chomp @files; # remove the newline at the end
> s/^U\s+// for @files; # remove the leading U

I do know this bit should be changed for including added files.

s/^[AU]\s+// for @files; # remove the leading A or U
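
Both adjustments could be folded into one parsing step in Perl itself. The
following is a sketch rather than a tested hook fragment: it assumes the usual
"svnlook changed" layout of a text-status column, a property-status column and
then the path, with directories shown with a trailing slash. changed_files is
a hypothetical helper name.

```perl
use strict;
use warnings;

# Turn "svnlook changed" output lines into the list of file paths whose
# text was added or updated. Deleted paths, property-only changes and
# directories are skipped. changed_files is a hypothetical helper.
sub changed_files {
    my @files;
    for my $line (@_) {
        chomp(my $entry = $line);
        # Column 1: text status (A, U, D or _); column 2: property status.
        my ($text_status, $path) = $entry =~ /^([AUD_])[U ]\s+(.+)$/
            or next;
        next unless $text_status eq 'A' || $text_status eq 'U';
        next if $path =~ m{/$};       # directories end with a slash
        push @files, $path;
    }
    return @files;
}
```

Usage would be something like
my @files = changed_files(`svnlook changed -t $TXN $REPOS`);
which also removes the need for the separate chomp and grep.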

>
> FILE:
> foreach my $file (@files) {
>   open (SVN, "svnlook cat $file |") or die "open pipe failed: $!"
>   while (<SVN>) # read from the pipe, one line at a time
>   {
> chomp; # cut the platform-specific line end. On Unix, this drops
> the \n but keeps the \r
> if ( /^M$/ ) { # last character is a \r (a.k.a. Control-M)
>   $crlf = 1; last FILE;
> }
>   }
>   close(SVN) or die "close pipe failed: $!" # it is very important to
> check the close on pipes
> }
>
> if ($crlf)
> {
>   die "$file contains DOS line endings";
> }





Re: Detecting CR eol

2010-09-09 Thread Nico Kadel-Garcia
On Wed, Sep 8, 2010 at 11:27 AM, Campbell Allan wrote:

> The part of the book that I felt was relevant is when the line ending is set
> to native subversion will store the file in the repository with LF's only.
> The client is then changing this to reflect the preferences of the client OS.

Yeah, this can be nasty to stuff in a pre-commit hook. The use of the
"native" EOL setting is almost always a mistake, especially with CIFS
or NFS shared working directories. These cross-platform working copies
are actually quite common for Java developers, especially because
they often prefer the TortoiseSVN tool for managing their working
copies.

If you're publishing content for multiple operating systems, I'd
question trying to outsmart people with pre-commit hooks. Sometimes,
for example when publishing text documents like README.txt, you need
to publish and store documents with the EOL stored for the other OS.