Re: [Monotone-devel] line endings as project policy

2006-11-28 Thread Daniel Lakeland
On Sat, Nov 25, 2006 at 08:44:59AM +0100, Richard Levitte - VMS Whacker wrote:
 In message [EMAIL PROTECTED] on Thu, 23 Nov 2006 12:11:06 -0800, Daniel Lakeland [EMAIL PROTECTED] said:
 
 dlakelan On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
 dlakelan 
 dlakelan   - We need to convert line endings to the local standard on anything
 dlakelan that's assumed to be text on checkout.  This I regard as a fact.
 dlakelan (see the problem that some Unixly programs have with embedded \r)
 dlakelan 
 dlakelan Consider languages like Python that have the ability to
 dlakelan create multiline strings, now the \r or \n characters are
 dlakelan part of the string. Converting them changes the behavior and
 dlakelan meaning of the program. This is very tricky.
 
 Does it really?  So, if I write that little example in a python
 program in Windows, using notepad, I should expect my program to
 behave differently on Windows than if I wrote that in emacs on a Unix
 box and ran it on Unix?  If that is to be *expected*, then I'm
 immediately throwing away python for any future plans.

I'm not quite sure what you mean, but basically the string continues
over several lines and the newline characters become part of the
string. Therefore, for example, if your string contains some data that
is exactly what you're looking for in the output of another program,
you'll be surprised when monotone alters the line endings in your
string and your python program doesn't match the output of the other
program it interfaces with.

Yes, as someone said, this is dodgy code. But nevertheless I don't
know why monotone should be altering the content of any text that it
checks in. It WILL cause problems eventually. 


-- 
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] line endings as project policy

2006-11-28 Thread Nicolas Ruiz
Sorry for prolonging this thread.

Line ending conversion affects my use case. I keep bash scripts in mtn
that I copy by hand to other computers, and I generally use sha1sum to
make sure that said computers have a certain version of the scripts
installed (comparing their SHA1 against the log). The fact that mtn
identifies file versions by their SHA1 is very elegant and probably the
feature that caught my eye when I was shopping for an SCM.

I think that having each version identified by the SHA1 of what YOU
commit is inherently better than having each version identified by the
SHA1 of the UTF-8 (or whatever format is chosen) form of what you commit.

The fact that I have an external tool that agrees with monotone's idea
of a file's contents gives me confidence in monotone.
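
To make the checksum point concrete, here is a minimal Python sketch. The
script contents and the helper name sha1_hex are made up for illustration;
only hashlib.sha1 is a real library call. It shows that the same script
hashes differently once its LF line endings have been turned into CR LF.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 of the exact bytes given, as sha1sum would for a file."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical script contents, as committed with LF line endings.
script_lf = b"#!/bin/bash\necho hello\n"

# The same script after an LF -> CR LF conversion on checkout.
script_crlf = script_lf.replace(b"\n", b"\r\n")

# The digests differ, so a checksum taken from the converted copy no
# longer matches the one recorded for the committed bytes.
print(sha1_hex(script_lf))
print(sha1_hex(script_crlf))
print(sha1_hex(script_lf) == sha1_hex(script_crlf))
```

Any conversion step between commit and checkout therefore breaks this kind
of external verification, whatever the tool that performs it.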

cheers
nicolás

Ulf Ochsenfahrt wrote:
 This line ending thing is getting far too much attention, IMHO. My last
 word on this issue is:
 
 - Whatever I check in, I want checked out
 
 - What I'd like to see is a setting where monotone checks on commit if
 the files obey a particular line ending convention/charset and gives a
 warning if they don't
 
 
 I don't want any automatic conversion of line endings or charsets. IMHO,
 charsets are much too fragile and dangerous to be handled by monotone.
 And line ending conversion cannot really be separated from charset
 handling in the face of non-8-bit encoded charsets.
 
 That said, I am not opposed to an opt-in mechanism for line
 ending/charset handling, as long as it's not on by default.
 
 The CVS way to do it was really, really bad. It messed up my files
 several times, with duplicate line endings and with treating binary
 files as text.
 
 Now, there's also another thing, which is a better merge ui, which is
 much overdue now...
 
 Cheers,
 
 -- Ulf
 
 
 
 


Re: [Monotone-devel] line endings as project policy

2006-11-26 Thread Richard Levitte - VMS Whacker
So, what I hear you say is that we should remove all the conversion
stuff that exists today.  Is that correct?

Cheers,
Richard

In message [EMAIL PROTECTED] on Sat, 25 Nov 2006 13:52:48 +0100, Ulf 
Ochsenfahrt [EMAIL PROTECTED] said:

ulf Hi all,
ulf 
ulf This line ending thing is getting far too much attention, IMHO. My last 
ulf word on this issue is:
ulf 
ulf - Whatever I check in, I want checked out
ulf 
ulf - What I'd like to see is a setting where monotone checks on commit if 
ulf the files obey a particular line ending convention/charset and gives a 
ulf warning if they don't
ulf 
ulf 
ulf I don't want any automatic conversion of line endings or charsets. IMHO, 
ulf charsets are much too fragile and dangerous to be handled by monotone. 
ulf And line ending conversion cannot really be separated from charset 
ulf handling in the face of non-8-bit encoded charsets.
ulf 
ulf That said, I am not opposed to an opt-in mechanism for line 
ulf ending/charset handling, as long as it's not on by default.
ulf 
ulf The CVS way to do it was really, really bad. It messed up my files 
ulf several times, with duplicate line endings and with treating binary 
ulf files as text.
ulf 
ulf Now, there's also another thing, which is a better merge ui, which is 
ulf much overdue now...

-
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

When I became a man I put away childish things, including
 the fear of childishness and the desire to be very grown up.
-- C.S. Lewis




Re: [Monotone-devel] line endings as project policy

2006-11-26 Thread Brian May
 Richard == Richard Levitte - VMS Whacker [EMAIL PROTECTED] writes:

Richard So, what I hear you say is that we should remove all the conversion
Richard stuff that exists today.  Is that correct?

What conversion stuff exists today?

I thought there was none?
-- 
Brian May [EMAIL PROTECTED]




Re: [Monotone-devel] line endings as project policy

2006-11-25 Thread Ulf Ochsenfahrt

Larry Hastings wrote:


Ulf Ochsenfahrt wrote:

Yes, but UTF-8 is a _multi-byte_ encoding.
If you see an LF byte, you don't know whether this is a single-byte LF 
or part of a multi-byte sequence.
Yes you do, because all multi-byte character sequences in UTF-8 have the 
high-bit set.  If you see 0x0A in a UTF-8 stream you can be certain it 
/is/ an LF and /not/ part of a multi-byte sequence.


I suspected that, which is why I put in the part about NON-8-bit 
encodings, which you conveniently cut out.
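
A small Python sketch of the non-8-bit case, assuming UTF-16LE as the
example encoding and a deliberately naive byte-level converter (the
function name is hypothetical). The character U+010A contains the byte
0x0A when encoded, so the converter damages text that has no line feed
in it at all.

```python
def naive_lf_to_crlf(raw: bytes) -> bytes:
    """Byte-level conversion that assumes an 8-bit, ASCII-compatible encoding."""
    return raw.replace(b"\n", b"\r\n")


# U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) encodes to the two
# bytes 0x0A 0x01 in UTF-16LE, although the text contains no line feed.
text = "\u010a"
raw = text.encode("utf-16-le")
print(raw)

converted = naive_lf_to_crlf(raw)

# The converted data now has an odd number of bytes and no longer
# decodes as UTF-16LE.
try:
    converted.decode("utf-16-le")
    damaged = False
except UnicodeDecodeError:
    damaged = True
print(damaged)
```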


-- Ulf




Re: [Monotone-devel] line endings as project policy

2006-11-25 Thread Ulf Ochsenfahrt

Hi all,

This line ending thing is getting far too much attention, IMHO. My last 
word on this issue is:


- Whatever I check in, I want checked out

- What I'd like to see is a setting where monotone checks on commit if 
the files obey a particular line ending convention/charset and gives a 
warning if they don't
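
A check of this kind could look roughly like the following Python sketch.
It is not monotone code; the function names and the warning text are
invented here, and it assumes an 8-bit, ASCII-compatible encoding. It only
reports, it never changes the data.

```python
def line_ending_styles(data: bytes) -> set:
    """Return which of 'CRLF', 'LF' and 'CR' occur in data."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf   # LF not preceded by CR
    lone_cr = data.count(b"\r") - crlf   # CR not followed by LF
    styles = set()
    if crlf:
        styles.add("CRLF")
    if lone_lf:
        styles.add("LF")
    if lone_cr:
        styles.add("CR")
    return styles


def check_on_commit(name: str, data: bytes, expected: str = "LF"):
    """Return a warning string if data uses other line endings than expected."""
    unexpected = line_ending_styles(data) - {expected}
    if unexpected:
        found = ", ".join(sorted(unexpected))
        return f"warning: {name}: found {found} line endings, expected {expected}"
    return None  # nothing to report; the file is committed untouched


print(check_on_commit("build.sh", b"echo one\necho two\n"))
print(check_on_commit("build.sh", b"echo one\r\necho two\n"))
```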



I don't want any automatic conversion of line endings or charsets. IMHO, 
charsets are much too fragile and dangerous to be handled by monotone. 
And line ending conversion cannot really be separated from charset 
handling in the face of non-8-bit encoded charsets.


That said, I am not opposed to an opt-in mechanism for line 
ending/charset handling, as long as it's not on by default.


The CVS way to do it was really, really bad. It messed up my files 
several times, with duplicate line endings and with treating binary 
files as text.


Now, there's also another thing, which is a better merge ui, which is 
much overdue now...


Cheers,

-- Ulf




Re: [Monotone-devel] line endings as project policy

2006-11-25 Thread Rob Schoening

Here here.

RS


On 11/25/06, Ulf Ochsenfahrt [EMAIL PROTECTED] wrote:


Hi all,

This line ending thing is getting far too much attention, IMHO. My last
word on this issue is:

- Whatever I check in, I want checked out

- What I'd like to see is a setting where monotone checks on commit if
the files obey a particular line ending convention/charset and gives a
warning if they don't


I don't want any automatic conversion of line endings or charsets. IMHO,
charsets are much too fragile and dangerous to be handled by monotone.
And line ending conversion cannot really be separated from charset
handling in the face of non-8-bit encoded charsets.

That said, I am not opposed to an opt-in mechanism for line
ending/charset handling, as long as it's not on by default.

The CVS way to do it was really, really bad. It messed up my files
several times, with duplicate line endings and with treating binary
files as text.

Now, there's also another thing, which is a better merge ui, which is
much overdue now...

Cheers,

-- Ulf




Re: [Monotone-devel] line endings as project policy

2006-11-25 Thread Rob Schoening

or is it hear hear ?  ;-)

RS


On 11/25/06, Rob Schoening [EMAIL PROTECTED] wrote:


Here here.

RS


 On 11/25/06, Ulf Ochsenfahrt [EMAIL PROTECTED] wrote:

 Hi all,

 This line ending thing is getting far too much attention, IMHO. My last
 word on this issue is:

 - Whatever I check in, I want checked out

 - What I'd like to see is a setting where monotone checks on commit if
 the files obey a particular line ending convention/charset and gives a
 warning if they don't


 I don't want any automatic conversion of line endings or charsets. IMHO,
 charsets are much too fragile and dangerous to be handled by monotone.
 And line ending conversion cannot really be separated from charset
 handling in the face of non-8-bit encoded charsets.

 That said, I am not opposed to an opt-in mechanism for line
 ending/charset handling, as long as it's not on by default.

 The CVS way to do it was really, really bad. It messed up my files
 several times, with duplicate line endings and with treating binary
 files as text.

 Now, there's also another thing, which is a better merge ui, which is
 much overdue now...

 Cheers,

 -- Ulf




Re: [Monotone-devel] line endings as project policy

2006-11-24 Thread Joel Crisp

Just a general comment on this thread. Please don't forget that text files
are not the only ones which require special handling. There can be other
file formats such as XML which need special merge handling.

Clearcase handles this with the file type manager, which allows you to
associate a type with each file then specify the behaviour for things like
diff, merge and format conversion.

Joel

On 11/22/06, Nuno Lucas [EMAIL PROTECTED] wrote:


On 11/22/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Does that mean that you have C code in ASCII with comments embedded in a
 completely different character set?

What I called IBM-860 is just a variation of ASCII. It's the same as
having a UTF-8 C source file with comments in a foreign language
(e.g., with accents). IBM-437 (or CP-437) is the US-English code page,
and IBM-850 (or CP-850) is the multi-lingual code page (it contains most
of the graphics symbols 437 has, but lacks things like uppercase
accented vowels).

 Just curious -- is IBM 860 some variety of EBCDIC?  And is the file
 record-structured so that all 256 character codes are available (in
 principle) for text other than newlines?  So that as far as character
 coding is concerned, end-of-line is handled by a form of out-of-band
 signalling?

To clarify a bit, IBM-860 is the same as ASCII for the ASCII part
(0..127) and has characters for the Portuguese locale in the codes
above 127 (mostly accented vowels).

If you have an old matrix printer around (or something like a receipt
printer), take a look at the appendix pages of its manual, where all
these code pages are usually listed.

  For example, I can have a directory with many different translations
  of a document (in text, of course), each one with its own encoding.
  While I would be happy if checkout handles line endings automatically
  for me, I would be very surprised if it decides to handle the text
  encoding.

 Do we have a situation in which each file has its own encoding?  Or one
 in which different parts of a file have different encodings?

I was talking of the case of different file encodings, but I forgot
the case of different encodings in the same file, which is not as rare
as you may think. In my case there are 3 languages the program
talks, and all strings are defined in a single C source file
(remember this was an old DOS application). They all use ISO-8859-1
(actually they were converted from IBM-860 too), but that is just
because the characters needed are all there (all western European
languages, namely Portuguese, French, Spanish and an incomplete,
unused English one).


Regards,
~Nuno Lucas




Re: [Monotone-devel] line endings as project policy

2006-11-24 Thread Brian May
 Daniel == Daniel Lakeland [EMAIL PROTECTED] writes:

Daniel Consider languages like Python that have the ability to
Daniel create multiline strings, now the \r or \n characters are
Daniel part of the string. Converting them changes the behavior
Daniel and meaning of the program. This is very tricky.

Any code that relies on this behaviour is very dodgy IMHO.
-- 
Brian May [EMAIL PROTECTED]




Re: [Monotone-devel] line endings as project policy

2006-11-24 Thread Justin Patrin

On 11/23/06, Daniel Lakeland [EMAIL PROTECTED] wrote:

On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:

  - We need to convert line endings to the local standard on anything
that's assumed to be text on checkout.  This I regard as a fact.
(see the problem that some Unixly programs have with embedded \r)

Consider languages like Python that have the ability to create
multiline strings, now the \r or \n characters are part of the
string. Converting them changes the behavior and meaning of the
program. This is very tricky.

Example:

mystring = """This string
Has several
New line characters
embedded in it
suppose the contents were executable code
embedded in this string
can we safely convert the newlines?
No"""




PHP has the same thing, but if I require a certain type of newline I
always use escaped chars (\r \n) as you never know what editors/RCS
are going to do to your newlines.
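
The same habit in Python, as a minimal sketch. The helper name and the
request text are made up for illustration. Because the terminators are
written as escapes, the resulting string is the same no matter which line
endings the source file itself is stored with.

```python
def crlf_separated(lines):
    """Join lines with an explicit CR LF, independent of how this file is stored."""
    return "\r\n".join(lines) + "\r\n"


# Hypothetical payload for a protocol that requires CR LF terminators.
request = crlf_separated(["GET / HTTP/1.0", "Host: example.org", ""])
print(request.encode("ascii"))
```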

--
Justin Patrin




Re: [Monotone-devel] line endings as project policy

2006-11-24 Thread Larry Hastings


Ulf Ochsenfahrt wrote:

Yes, but UTF-8 is a _multi-byte_ encoding.
If you see an LF byte, you don't know whether this is a single-byte LF 
or part of a multi-byte sequence.
Yes you do, because all multi-byte character sequences in UTF-8 have the 
high-bit set.  If you see 0x0A in a UTF-8 stream you can be certain it 
/is/ an LF and /not/ part of a multi-byte sequence.


http://en.wikipedia.org/wiki/Utf-8#Description
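
This property is easy to check by brute force. The Python sketch below
(the function name is hypothetical) encodes every code point from U+0080
upwards and confirms that no byte of any multi-byte sequence falls in the
ASCII range, which is where 0x0A lives.

```python
def all_bytes_have_high_bit(first: int = 0x80, last: int = 0x10FFFF) -> bool:
    """Return True if every UTF-8 byte of every code point in range is >= 0x80."""
    for cp in range(first, last + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded in UTF-8
        if any(b < 0x80 for b in chr(cp).encode("utf-8")):
            return False
    return True


# Checks all of Unicode above ASCII; this takes a moment to run.
print(all_bytes_have_high_bit())
```

The guarantee is specific to UTF-8 and other ASCII-compatible encodings;
it says nothing about UTF-16 or UTF-32.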


Brian May wrote:

Daniel == Daniel Lakeland [EMAIL PROTECTED] writes:
Daniel Consider languages like Python that have the ability to
Daniel create multiline strings, now the \r or \n characters are
Daniel part of the string. Converting them changes the behavior
Daniel and meaning of the program. This is very tricky.

Any code that relies on this behaviour is very dodgy IMHO.
Well I'd certainly agree it isn't platform-independent code.  But where 
is it written that monotone should not support checking in dodgy code?



/larry/


Re: [Monotone-devel] line endings as project policy

2006-11-24 Thread Brian May
 Larry == Larry Hastings [EMAIL PROTECTED] writes:

Larry Well I'd certainly agree it isn't platform-independent
Larry code.  But where is it written that monotone should not
Larry support checking in dodgy code?

Store the files as binary. Such users obviously don't need end-of-line
conversion anyway.
-- 
Brian May [EMAIL PROTECTED]




Re: [Monotone-devel] line endings as project policy

2006-11-24 Thread Richard Levitte - VMS Whacker
In message [EMAIL PROTECTED] on Thu, 23 Nov 2006 12:11:06 -0800, Daniel 
Lakeland [EMAIL PROTECTED] said:

dlakelan On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
dlakelan 
dlakelan   - We need to convert line endings to the local standard on anything
dlakelan that's assumed to be text on checkout.  This I regard as a fact.
dlakelan (see the problem that some Unixly programs have with embedded \r)
dlakelan 
dlakelan Consider languages like Python that have the ability to
dlakelan create multiline strings, now the \r or \n characters are
dlakelan part of the string. Converting them changes the behavior and
dlakelan meaning of the program. This is very tricky.

Does it really?  So, if I write that little example in a python
program in Windows, using notepad, I should expect my program to
behave differently on Windows than if I wrote that in emacs on a Unix
box and ran it on Unix?  If that is to be *expected*, then I'm
immediately throwing away python for any future plans.

Now, if I have some code elsewhere in the program that expects a
certain type of line ending, then I'm a programmer who only knows that
particular platform, and I need to learn something about line ending
formats and true portability.

Consider doing the natural thing and using FTP to transfer the code,
in ASCII mode (well, it's source, so it's text, right?).  Then I can
watch my program go *bamf* until I fix it.

dlakelan Example:
dlakelan 
dlakelan mystring = """This string
dlakelan Has several
dlakelan New line characters
dlakelan embedded in it
dlakelan suppose the contents were executable code
dlakelan embedded in this string
dlakelan can we safely convert the newlines?
dlakelan No"""
dlakelan 

Cheers,
Richard



Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Nathaniel Smith
On Tue, Nov 21, 2006 at 11:14:45PM -0700, Derek Scherger wrote:
 So, would this be better as a (shared and versioned) project policy 
 entry with line ending styles specified by file name patterns. It seems 
 like it would handle the case of added files, that match some policy 
 pattern, better. I'm not sure how policy *changes* would work though. 
 i.e. suppose the eol policy for .sh files is set to LF after a few .sh 
 files have already been added incorrectly. How do the pre-existing files 
 get fixed up?
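
To make the question concrete, here is a rough Python sketch of such a
pattern-based policy. It is not monotone's format or behaviour; the table
layout, the names and the "first match wins" rule are all assumptions made
for illustration. The last function addresses the policy-change case by
listing files that were added before the policy existed and do not
conform to it.

```python
from fnmatch import fnmatchcase

# Hypothetical policy table; the first matching pattern wins.
EOL_POLICY = [
    ("*.sh", "LF"),
    ("*.bat", "CRLF"),
    ("*.png", None),  # None means binary: never checked, never converted
]


def eol_for(path, policy=EOL_POLICY):
    """Return the line ending required for path, or None if it is binary."""
    for pattern, eol in policy:
        if fnmatchcase(path, pattern):
            return eol
    return None  # files without a matching pattern are treated as binary


def conforms(data: bytes, eol) -> bool:
    """Return True if data uses only the given line ending style."""
    if eol is None:
        return True
    if eol == "LF":
        return b"\r" not in data
    if eol == "CRLF":
        rest = data.replace(b"\r\n", b"")
        return b"\r" not in rest and b"\n" not in rest
    raise ValueError(f"unknown line ending style: {eol!r}")


def nonconforming(files: dict, policy=EOL_POLICY) -> list:
    """List paths whose contents violate the policy that applies to them."""
    return sorted(p for p, d in files.items() if not conforms(d, eol_for(p, policy)))


tree = {"run.sh": b"echo hi\r\n", "ok.sh": b"echo hi\n", "logo.png": b"\x89PNG\r\n"}
print(nonconforming(tree))
```

A list like this could be shown as warnings when the policy changes,
leaving the actual fix-up as an explicit, reviewable commit.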

Door C: .mtn-autoprops?

-- Nathaniel

-- 
- Don't let your informants burn anything.
- Don't grow old.
- Be good grad students.
  -- advice of Murray B. Emeneau on the occasion of his 100th birthday




Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Richard Levitte - VMS Whacker
In message [EMAIL PROTECTED] on Tue, 21 Nov 2006 23:59:41 -0800, Justin 
Patrin [EMAIL PROTECTED] said:

papercrane I haven't read the line endings with 0.31 thread yet
papercrane but...ugh. Is it really necessary to mangle line endings
papercrane when checking out files? I mean really... shouldn't people
papercrane just use a capable text editor if they're contributing to
papercrane a project?

If it was as easy as the editor.  Trouble is, different systems have
different standards, and a lot of programmers know only one of the
systems with no understanding of the rest of the world (this goes for
Windows, Unix and VMS programmers alike, and I think this discussion
shows it).  So far, I've seen editors make a mess (think notepad.exe),
at least one shell (/bin/sh on Solaris) barf all over the place when
it sniffs the presence of a CR, and at least one C compiler (don't
recall which, but it was fairly recent) do the same.

As soon as you're dealing with software that transfers files between
different platforms, this becomes the eternal problem to deal with.
FTP had to.  Editors are typically NOT the kind of software that
should need to deal with this kind of problem, because editors do NOT
typically transfer files between different platforms.  Same goes for C
compilers, shells and so on.  You can't blame them for being fed
something that's completely unexpected for the system they live in.
Sticking our heads in the sand doesn't change this.

My point is, it's really up to monotone to do something that's at
least sensible in most of the cases.  Right now, as soon as you start
dealing with line endings (which is what you do as soon as you hack
the lua function get_linesep_conv()), you take a shot at screwing up,
royally.

There are a few proposals I actually liked, and most of all, the
fella' that suggested monotone could check that line endings are
consistent for anything it suspects being text.

Basically, it comes down to a few items, some of them I regard as
fact, others I regard as questions:

 - We need to treat files as binary unless told otherwise.  This I
   regard as a fact.  (see the problem with screwed up files without
   the user knowing about it)

 - We need to mark text files as such.  This I regard as fact, and it
   seems to me like this is almost consensus.

 - We need to convert line endings to the local standard on anything
   that's assumed to be text on checkout.  This I regard as a fact.
   (see the problem that some Unixly programs have with embedded \r)

 - We need to make a choice, either we treat all files as binary and
   only mark them as text and what line ending they seem to go by, or
   we need to convert to some internal line ending standard.  It seems
   to me this is still a question, although most seem to lean toward
   an internal line ending standard, which is what monotone does now.

 - IF we go for an internal line ending standard, we need to CHOOSE
   one and stick with it, not have the user choose one for us.  I
   don't currently recall if it is already this way today or if we're
   relying on the first element returned by get_linesep_conv().  If
   it's the latter, we need to stop that.  This I regard as fact.
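
As a rough illustration of how these items fit together, here is a Python
sketch. It is not how monotone or its get_linesep_conv() hook works; the
function names, the is_text flag and the choice of LF as the internal
standard are assumptions made for the example.

```python
import os

INTERNAL_EOL = b"\n"  # assumed fixed internal standard, not user-selectable


def to_internal(data: bytes, is_text: bool) -> bytes:
    """On commit: normalize text files to the internal line ending."""
    if not is_text:
        return data  # binary unless told otherwise: stored byte for byte
    # CR LF first, then any lone CR that is left over.
    return data.replace(b"\r\n", INTERNAL_EOL).replace(b"\r", INTERNAL_EOL)


def to_local(data: bytes, is_text: bool,
             local_eol: bytes = os.linesep.encode("ascii")) -> bytes:
    """On checkout: convert text files to the local line ending."""
    if not is_text:
        return data
    return data.replace(INTERNAL_EOL, local_eol)


stored = to_internal(b"one\r\ntwo\n", is_text=True)
print(stored)
print(to_local(stored, is_text=True, local_eol=b"\r\n"))
print(to_local(b"\x00\r\n\x01", is_text=False))
```

Note that to_internal is lossy for files with mixed line endings: after
the round trip they come back uniform, which is exactly the case the
"whatever I check in, I want checked out" camp objects to.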

The rest, such as merge problems to deal with, will come and will have
to be treated when they do.  But first, we need to make decisions and
stick by them.  The discussion on line endings has popped up a little
now and then, and been left off with a few question marks and nothing
else happening, just to come up again a few months later.  It's time
things get decided upon so we can actually get the work done, and I
don't believe in someone just doing it and that being the winning thing,
because months later, there's gonna be a whiner who says we f*cked up
royally.  Let's get it right and reach consensus instead, well
grounded in our minds and our wills.

So, anything I forgot?

Cheers,
Richard



Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Richard Levitte - VMS Whacker
In message [EMAIL PROTECTED] on Wed, 22 Nov 2006 01:31:41 -0800, Nathaniel 
Smith [EMAIL PROTECTED] said:

njs Door C: .mtn-autoprops?

Sounds like a good idea to me!  (well, until there's something better,
of course)

Cheers,
Richard



Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread hendrik
On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
 In message [EMAIL PROTECTED] on Tue, 21 Nov 2006 23:59:41 -0800, Justin 
 Patrin [EMAIL PROTECTED] said:
 
 papercrane I haven't read the line endings with 0.31 thread yet
 papercrane but...ugh. Is it really necessary to mangle line endings
 papercrane when checking out files? I mean really... shouldn't people
 papercrane just use a capable text editor if they're contributing to
 papercrane a project?
 
 If it was as easy as the editor.  Trouble is, different systems have
 different standards, and a lot of programmers know only one of the
 systems with no understanding of the rest of the world (this goes for
 Windows, Unix and VMS programmers alike, and I think this discussion
 shows it).  So far, I've seen editors make a mess (think notepad.exe),
 at least one shell (/bin/sh on Solaris) barf all over the place when
 it sniffs the presence of a CR, and at least one C compiler (don't
 recall which, but it was fairly recent) do the same.
 
 As soon as you're dealing with software that transfers files between
 different platforms, this becomes the eternal problem to deal with.
 FTP had to.  Editors are typically NOT the kind of software that
 should need to deal with this kind of problem, because editors do NOT
 typically transfer files between different platforms.  Same goes for C
 compilers, shells and so on.  You can't blame them for being fed
 something that's completely unexpected for the system they live in.
 Sticking our heads in the sand doesn't change this.
 
 My point is, it's really up to monotone to do something that's at
 least sensible in most of the cases.  Right now, as soon as you start
 dealing with line endings (which is what you do as soon as you hack
 the lua function get_linesep_conv()), you take a shot at screwing up,
 royally.
 
 There are a few proposals I actually liked, and most of all, the
 fella' that suggested monotone could check that line endings are
 consistent for anything it suspects being text.
 
 Basically, it comes down to a few items, some of them I regard as
 fact, others I regard as questions:
 
  - We need to treat files as binary unless told otherwise.  This I
regard as a fact.  (see the problem with screwed up files without
the user knowing about it)

Agree.  This is an essential safety constraint.

 
  - We need to mark text files as such.  This I regard as fact, and it
seems to me like this is almost consensus.

Agree.

 
  - We need to convert line endings to the local standard on anything
that's assumed to be text on checkout.  This I regard as a fact.
(see the problem that some Unixly programs have with embedded \r)

This seems obvious, but I have some discomfort with the idea.  Perhaps 
because I'm thinking of the wider issues involved in character set 
incompatibility.  In any case, conversion on checkout should be 
overridable in some way. 

  - We need to make a choice, either we treat all files as binary and
only mark them as text and what line ending they seem to go by, or
we need to convert to some internal line ending standard.  It seems
to me this is still a question, although most seem to lean toward
an internal line ending standard, which is what monotone does now.

  - IF we go for an internal line ending standard, we need to CHOOSE
one and stick with it, not have the user choose one for us.  I
don't currently recall if it is already this way today or if we're
relying on the first element returned by get_linesep_conv().  If
it's the latter, we need to stop that.  This I regard as fact.

If we use an internal line ending standard, we should consider the 
possibility of using the standard newline character NEL, Next Line, 
0x85, unicode U+0085.
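
A few properties of NEL that bear on this choice, shown as a Python
sketch. Only standard str and bytes methods and the built-in latin-1,
utf-8 and cp1252 codecs are used; the sample text is made up.

```python
NEL = "\u0085"

# One byte in Latin-1, but two bytes in UTF-8.
print(NEL.encode("latin-1"))
print(NEL.encode("utf-8"))

# The same byte 0x85 is a printable character in Windows-1252.
print(repr(b"\x85".decode("cp1252")))

text = "first" + NEL + "second"

# Python's str.splitlines() treats NEL as a line boundary ...
print(text.splitlines())

# ... but bytes.splitlines() only knows about CR and LF.
print(text.encode("utf-8").splitlines())
```

So NEL is only recognizable once the encoding of the file is known, which
ties this proposal to the character set question rather than keeping the
two separate.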

 The rest, such as merge problems to deal with, will come and will have
 to be treated when they do.  But first, we need to make decisions and
 stick by them.  The discussion on line endings has popped up a little
 now and then, and been left off with a few question marks and nothing
 else happening, just to come up again a few months later.  It's time
 things get decided upon so we can actually get the work done, and I
 don't believe in someone just doing it and that being the winning thing,
 because months later, there's gonna be a whiner who says we f*cked up
 royally.

whiner is almost an anagram of winner :-)

 Let's get it right and reach consensus instead, well
 grounded in our minds and our wills.

To get it really well-grounded, we might also consider it in the context 
of character set conversion.  Points that are easy to overlook with 
respect to line endings may be glaringly obvious in this larger context.  
Even if we don't solve the larger context, it may make decisions clear 
with the smaller one.

 
 So, anything I forgot?

Just how do we mark files as being text in the data base?  Will it 
conceptually be part of the checked-in revision, and editable and 
mergeable like anything else?

Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Richard Levitte - VMS Whacker
In message [EMAIL PROTECTED] on Wed, 22 Nov 2006 10:05:06 -0500, [EMAIL PROTECTED] said:

hendrik   - We need to treat files as binary unless told otherwise.
hendrik This I regard as a fact.  (see the problem with screwed
hendrik up files without the user knowing about it)
hendrik 
hendrik Agree.  This is an essential safety constraint.

Good.

hendrik   - We need to mark text files as such.  This I regard as
hendrik fact, and it seems to me like this is almost consensus.
hendrik 
hendrik Agree.

Good.

hendrik   - We need to convert line endings to the local standard on
hendrik anything that's assumed to be text on checkout.  This I
hendrik regard as a fact.  (see the problem that some Unixly
hendrik programs have with embedded \r)
hendrik 
hendrik This seems obvious, but I have some discomfort with the idea.
hendrik Perhaps because I'm thinking of the wider issues involved in
hendrik character set incompatibility.  In any case, conversion on
hendrik checkout should be overridable in some way.

Aye, I hear ya.  Character set incompatibility (and conversion) is a
bigger can of worms.  I believe we can handle that separately.  We do
need to handle that as well at some point.  But for now, I'd like to
keep it small and manageable and focus on line ends, if for nothing
else then *so we get something done*.

hendrik If we use an internal line ending standard, we should
hendrik consider the possibility of using the standard newline
hendrik character NEL, Next Line, 0x85, unicode U+0085.

Now, there's an idea...  Just be ready for getting pounced over that
one, that would mean some rather big changes, methinks.

hendrik  Let's get it right and reach consensus instead, well
hendrik  grounded into our minds and our wills.
hendrik 
hendrik To get it really well-grounded, we might also consider it in
hendrik the context of character set conversion.  Points that are
hendrik easy to overlook with respect to line endings may be
hendrik glaringly obvious in this larger context.  Even if we don't
hendrik solve the larger context, it may make decisions clear with
hendrik the smaller one.

I'm all for hearing about things that will help resolve the smaller
issue first.

hendrik  So, anything I forgot?
hendrik 
hendrik Just how do we mark files as being text in the data base?
hendrik Will it conceptually be part of the checked-in revision, and
hendrik editable and mergeable like anything else?

I was imagining an attribute.  They can be set, fetched and dropped.
Attributes are conceptually part of the checked-in revision.  I guess
the biggest thing to resolve is what happens if that attribute changes
for some reason, and that includes merging where the attribute value
doesn't match in both parents.

Come to think of it, how does merging of non-matching attributes work
today?  For example, when a file is marked as executable in one
revision but not in the other, and a merge is performed down the line?
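
For reference, the generic three-way rule for a single attribute value can
be sketched as below.  This is only an illustration of the question being
asked, not a description of what monotone does today; the names merge_attr
and AttrConflict are made up for the sketch.

```python
from typing import Optional


class AttrConflict(Exception):
    """Both sides changed the attribute, and to different values."""


def merge_attr(ancestor: Optional[str],
               left: Optional[str],
               right: Optional[str]) -> Optional[str]:
    """Three-way merge of one attribute value (None means 'not set')."""
    if left == right:
        return left        # both parents agree
    if left == ancestor:
        return right       # only the right side changed it
    if right == ancestor:
        return left        # only the left side changed it
    raise AttrConflict(f"{left!r} vs {right!r} (ancestor {ancestor!r})")
```

With this rule, a file marked executable on one side only merges cleanly,
while two different new values need a human decision.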

hendrik Just how does the user mark files as being text?  A specific
hendrik parameter on initial checkin, to be changed later on checkin?
hendrik A default for new files based on the last few letters of the
hendrik name?  A sanity check whether the file is really of the type
hendrik claimed?

Answered in order:  1,2) I propose an attribute, 3) a default of some
sort helps as well, and doesn't stop the user from making changes (njs
proposed .mtn-autoprops, and we already have something that detects
binary...  sorry, manual_merge files), 4) well, a sanity check might
be a good thing, if we think it's needed.  We don't have a sanity
check of that sort today, though...

hendrik Can we uncompress compressed files so as to better
hendrik diff/merge the contents and recompress on checkout?  This
hendrik might be very helpful for openoffice files.

Uhmm, I seriously think this is way out of monotone's scope.

hendrik How do we handle the transition between the current
hendrik conventions and the new ones?

We currently do not convert by default.  When the lua function
get_linesep_conv() is undefined, monotone works as if it returned
{'LF','LF'}, and when the two elements are exactly the same, monotone
does no conversion whatsoever.  None at all, as far as I remember (I
can't be arsed to look at the code for the moment, so find out on your
own or wait a day or two for me to confirm or take back that statement
:-)).
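
The behaviour described above can be sketched roughly as follows.  This is
Python rather than the real Lua hook, and reading the pair as (internal,
external) is an assumption of the sketch; convert and SEPS are made-up
names.

```python
# Line-ending names as used in the thread, mapped to their byte sequences.
SEPS = {"LF": b"\n", "CRLF": b"\r\n", "CR": b"\r"}


def default_linesep_conv() -> tuple:
    """Stand-in for an undefined hook: behaves as if it returned LF, LF."""
    return ("LF", "LF")


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert line endings; identical names mean 'do nothing at all'."""
    if src == dst:
        return data  # bytes pass through untouched, stray CRs included
    return data.replace(SEPS[src], SEPS[dst])
```

Note that the plain replace is deliberately naive: converting LF to CRLF on
data that already holds CRLF would yield CR CR LF, which is one reason the
"same pair means no conversion" default is the safe one.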

hendrik Are we currently storing files as unicode or UTF-8?  (I think
hendrik only admin information such as file names)  Should we store
hendrik text files as UTF-8?

Uh, I'm under the impression that we store as UTF-8.  Damn, it's
been too long since I looked at the code, my DRAM apparently needs a
refresh...

Cheers,
Richard

-
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

When I became a 

Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Nuno Lucas

On 11/22/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

If we use an internal line ending standard, we should consider the
possibility of using the standard newline character NEL, Next Line,
0x85, unicode U+0085.


You are forgetting that I can be (and actually am) versioning C files
with text comments using some code page other than ASCII (in my case
IBM-860, because it's a port from an MS-DOS program, and the original
programmer was Portuguese).

So, I have lots of comments with '\x85'. If your idea goes ahead,
suddenly the project will become corrupt, because C++ style comments
suddenly wrap to the next line.
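
A small sketch of the failure, assuming a tool that treats the single byte
0x85 as a line break regardless of encoding (the sample comment text is made
up):

```python
# A C++-style comment stored in code page 860, where byte 0x85 is a letter.
source = b"// volta \x85 casa\nint x;\n"

# What the byte means in the file's real encoding.
letter = b"\x85".decode("cp860")

# A tool that treats 0x85 as NEL would see these "lines":
lines_if_nel = source.replace(b"\x85", b"\n").split(b"\n")

# In UTF-8, by contrast, U+0085 is two bytes, not the single byte 0x85.
nel_utf8 = "\u0085".encode("utf-8")

print(letter, lines_if_nel, nel_utf8)
```

The second "line" no longer starts with //, so the tail of the comment would
be compiled as code.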


Are we currently storing files as unicode or UTF-8?  (I think only admin
information such as file names)  Should we store text files as
UTF-8?


Don't mix character encoding problems with the end-of-line issue. They
are very different beasts.

For example, I can have a directory with many different translations
of a document (in text, of course), each one with its own encoding.
While I would be happy if checkout handles line endings automatically
for me, I would  be very surprised if it decides to handle the text
encoding.

My current project uses ISO-8859-15 (because it's an embedded device),
but I develop in a UTF-8 environment (a standard desktop linux
distro), so all text on the source must be ISO-8859-15, not UTF-8.

In my opinion, it should be up to the user to know how to handle the text
encoding, not monotone.

I mostly agree with the rest of your points, though.


Best regards,
~Nuno Lucas




Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Thomas Moschny
On Wednesday 22 November 2006 22:33, Nuno Lucas wrote:
 Don't mix character encoding problems with the end-of-line issue. They
 are very different beasts.

But in order to know what you are doing when converting different types of eol 
into each other, you have to know what the encoding of the file is, no?

Or, at least, you have to know whether it is encoded in one of the many 
encodings that extend ascii in the one way or the other.
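
A minimal sketch of why the encoding matters: a byte-level LF to CRLF
replacement is harmless for encodings that extend ASCII, but breaks one
that does not (UTF-16 is used here purely as an example).

```python
TEXT = "a\nb"


def naive_lf_to_crlf(data: bytes) -> bytes:
    """Byte-level conversion that knows nothing about the encoding."""
    return data.replace(b"\n", b"\r\n")


def survives(encoding: str) -> bool:
    """True if the converted bytes still decode to the expected text."""
    converted = naive_lf_to_crlf(TEXT.encode(encoding))
    try:
        return converted.decode(encoding) == "a\r\nb"
    except UnicodeDecodeError:
        return False


for enc in ("utf-8", "iso8859-15", "cp860", "utf-16-le"):
    print(enc, survives(enc))
```

In UTF-16 the line feed is the two bytes 0A 00, so inserting a single 0D
byte shifts everything after it out of alignment.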

- Thomas





Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread hendrik
On Wed, Nov 22, 2006 at 09:33:33PM +, Nuno Lucas wrote:
 On 11/22/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 If we use an internal line ending standard, we should consider the
 possibility of using the standard newline character NEL, Next Line,
 0x85, unicode U+0085.
 
 You are forgetting I can (and actually I am) versioning C files with
 text comments using some code page other than ASCII (in my case
 IBM-860, because it's a port from a MS-DOS program, and the original
 programmer was Portuguese).

Does that mean that you have C code in ASCII with comments embedded in a 
completely different character set?

 
 So, I have lots of comments with '\x85'. If your idea goes ahead,
 suddenly the project will become corrupt, because C++ style comments
 suddenly wrap to the next line.

Just curious -- is IBM 860 some variety of EBCDIC?  And is the file 
record-structured so that all 256 character codes are available (in 
principle) for text other than newlines?  So that as far as character 
coding is concerned, end-of-line is handled by a form of out-of-band 
signalling?

 Are we currently storing files as unicode or UTF-8?  (I think only admin
 information such as file names)  Should we store text files as
 UTF-8?
 
 Don't mix character encoding problems with the end-of-line issue. They
 are very different beasts.

I think that end-of-line coding is one of the simplest character-coding 
issues.

 
 For example, I can have a directory with many different translations
 of a document (in text, of course), each one with its own encoding.
 While I would be happy if checkout handles line endings automatically
 for me, I would  be very surprised if it decides to handle the text
 encoding.

Do we have a situation in which each file has its own encoding?  Or one 
in which different parts of a file have different encodings?

 
 My current project uses ISO-8859-15 (because it's an embedded device),
 but I develop in a UTF-8 environment (a standard desktop linux
 distro), so all text on the source must be ISO-8859-15, not UTF-8.
 
 In my opinion, it should be up to the user to know how to handle the text
 encoding, not monotone.
 
 I mostly agree with the rest of your points, though.
 
 
 Best regards,
 ~Nuno Lucas
 
 




Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread hendrik
On Wed, Nov 22, 2006 at 08:05:31PM +0100, Richard Levitte - VMS Whacker wrote:
 
 hendrik Can we uncompress compressed files so as to better
 hendrik diff/merge the contents and recompress on checkout?  This
 hendrik might be very helpful for openoffice files.
 
 Uhmm, I seriously think this is way out of monotone's scope.

This week, anyway ...

But it is a real problem.

Last time I looked at an OpenOffice file, it was a compressed zip archive 
containing several files:
  - the actual text coded as XML
  - the style sheet for presenting said XML
  - a few other things, maybe the DTD, I forget.
But we won't be able to properly merge these files unless we do 
unarchive and rearchive them.

I suspect with the absurd bulkiness of XML notation, this won't be the 
last case we encounter where compression interferes with merging.
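
As a rough illustration of the unarchive step, here is a sketch that unpacks
two zip archives in memory and diffs one member as text.  The member name
and contents are illustrative only, and nothing here is monotone code.

```python
import difflib
import io
import zipfile


def make_archive(members: dict) -> bytes:
    """Build a small compressed zip archive in memory (for the demo)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in members.items():
            zf.writestr(name, text)
    return buf.getvalue()


def unpack(archive: bytes) -> dict:
    """Return {member name: decoded text} for every file in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {n: zf.read(n).decode("utf-8") for n in zf.namelist()}


def member_diff(old: bytes, new: bytes, name: str) -> list:
    """Line-based unified diff of one member across two archives."""
    a = unpack(old)[name].splitlines()
    b = unpack(new)[name].splitlines()
    return list(difflib.unified_diff(a, b, lineterm=""))
```

Diffing the compressed bytes directly would show the whole file as changed;
diffing the unpacked member shows only the edited line.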

-- hendrik





Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Nuno Lucas

On 11/22/06, Thomas Moschny [EMAIL PROTECTED] wrote:

On Wednesday 22 November 2006 22:33, Nuno Lucas wrote:
 Don't mix character encoding problems with the end-of-line issue. They
 are very different beasts.

But in order to know what you are doing when converting different types of eol
into each other, you have to know what the encoding of the file is, no?

Or, at least, you have to know whether it is encoded in one of the many
encodings that extend ascii in the one way or the other.


Line endings don't have a direct relation to character encoding. It's
true that in theory you would need to know the character encoding to
know what a line ending is (like the mentioned line ending Unicode
character), but in practice there are only 3 standard line endings
(LF, CR-LF and CR) and if some file uses any other you would need to
use a special program for it, so it's better to treat the file as
binary.

An ASCII text can use any of the 3 line-endings. Same with a UTF-8
text, ISO-8859-1, or any other. No way to know the line ending by the
character encoding.
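
To make that concrete, a small sketch that counts the three line-ending
styles on raw bytes, without knowing the encoding (it assumes only that the
encoding extends ASCII; eol_styles is a made-up name):

```python
import re

# CRLF is listed first so a CR followed by LF is counted as one ending.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def eol_styles(data: bytes) -> dict:
    """Count how often each of the three standard line endings occurs."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    for match in _EOL.finditer(data):
        counts[_NAMES[match.group()]] += 1
    return counts
```

The same accented text encoded as UTF-8 or ISO-8859-1 gives identical
counts, and a file with mixed counts is a hint that it should be left alone.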


Regards,
~Nuno Lucas




Re: [Monotone-devel] line endings as project policy

2006-11-22 Thread Nuno Lucas

On 11/22/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Does that mean that you have C code in ASCII with comments embedded in a
completely different character set?


What I called IBM-860 is just a variation of ASCII. It's the same as
having a UTF-8 C source file with comments in a foreign language
(e.g., with accents). IBM-437 (or CP-437) is the US-English code page,
and IBM-850 (or CP-850) is the multi-lingual code-page (contains most
of the graphics symbols 437 has, but lacks things like uppercase
accented vowels).


Just curious -- is IBM 860 some variety of EBCDIC?  And is the file
record-structured so that all 256 character codes are available (in
principle) for text other than newlines?  So that as far as character
coding is concerned, end-of-line is handled by a form of out-of-band
signalling?


To clarify a bit, IBM-860 is the same as ASCII for the ASCII part
(0..127) and has characters for the Portuguese locale in the upper 128
code points (mostly accented vowels).

If you have an old matrix printer around (or something like a receipt
printer), take a look at the appendix pages, where all these code pages
are usually listed.


 For example, I can have a directory with many different translations
 of a document (in text, of course), each one with its own encoding.
 While I would be happy if checkout handles line endings automatically
 for me, I would  be very surprised if it decides to handle the text
 encoding.

Do we have a situation in which each file has its own encoding?  Or one
in which different parts of a file have different encodings?


I was talking of the case of different file encodings, but I forgot
the case of different encodings on the same file, which is not as rare
as you may think. In my case there are 3 languages the program
talks, and all strings are defined on a single C source file
(remember this was an old DOS application). They all use ISO-8859-1
(actually they were converted from IBM-860 too), but that is just
because the characters needed are all there (all western european
languages, namely portuguese, french, spanish and an incomplete -
unused - english one).


Regards,
~Nuno Lucas




Re: [Monotone-devel] line endings as project policy

2006-11-21 Thread Justin Patrin

On 11/21/06, Derek Scherger [EMAIL PROTECTED] wrote:

Having read the line endings with 0.31 thread and having used the
svn:eol-style property I have a vague feeling that line endings may be
better specified using some aspect of the (yet to be defined) project
policy stuff.

In an svn project that I work on, we generally set svn:eol-style to LF
for .sh files (otherwise "#! /bin/bash" ends up being "#! /bin/bash\r"
on Windows and the script doesn't run on Cygwin). The problem with this is
that every time someone adds a new shell script they have to remember to
set the eol-style, or some other unlucky person gets to find this bug and
fix it. This happens over and over again.

Similarly, for .c files it can be painful to watch people fighting over
line endings. One commit is CRLF, the next is LF, back and forth it
goes. This must greatly reduce repository compression and does render
diff(1) more or less useless, as often every line in the file has changed.

Subversion has something called autoprops which allows you to specify
which properties to set based on file name patterns, so that when new
files are added they can be assigned the correct eol-style property.
However, this is an unversioned client side setting that every committer
must have set correctly and must keep up to date. In practice this
doesn't scale to even moderately sized teams. In monotone this would be
like having the setting in a lua hook and expecting everyone on the team
to have the same hook.

So, would this be better as a (shared and versioned) project policy
entry with line ending styles specified by file name patterns? It seems
like it would handle the case of added files, that match some policy
pattern, better. I'm not sure how policy *changes* would work though.
i.e. suppose the eol policy for .sh files is set to LF after a few .sh
files have already been added incorrectly. How do the pre-existing files
get fixed up?
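
A sketch of what such a pattern-based policy lookup might look like.  The
policy format, the style names and the binary default are assumptions made
for illustration; no such monotone feature is implied.

```python
from fnmatch import fnmatchcase

# Hypothetical versioned policy: first matching pattern wins.
POLICY = [
    ("*.sh", "LF"),      # shell scripts must keep LF or the #! line breaks
    ("*.c", "native"),   # convert to the local convention on checkout
    ("*.h", "native"),
]


def eol_policy(path: str, policy=POLICY, default: str = "binary") -> str:
    """Return the line-ending style for a path, or the safe default."""
    name = path.rsplit("/", 1)[-1]
    for pattern, style in policy:
        if fnmatchcase(name, pattern):
            return style
    return default  # unknown files are left untouched
```

Because the table would be versioned with the project, a newly added
scripts/build.sh gets LF without anyone having to remember a property.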



Ugh.

I haven't read the line endings with 0.31 thread yet but...ugh. Is it
really necessary to mangle line endings when checking out files? I
mean, really... shouldn't people just use a capable text editor if
they're contributing to a project?

--
Justin Patrin




Re: [Monotone-devel] line endings

2006-02-07 Thread Nathaniel Smith
On Thu, Feb 02, 2006 at 08:10:36AM -0500, Yury Polyanskiy wrote:
 And of course it'd be MUCH better to have get_linesep_conv() NOT called
 for binary files (or at least for manual_merge files as a quick hack).

For a bit of history... when manual_merge was added, the argument was
to explicitly _avoid_ trying to come up with the Ultimate Design For
All Things Content Sensitive.  The argument was that manual merging
was one particular thing one might want (even on files that are
textual!  E.g., if you have a special-purpose merger for some XML
documents).  So manual_merge and line ending conversion should perhaps
be kept orthogonal.

-- Nathaniel

-- 
.i dei jitfa fanmo xatra




Re: [Monotone-devel] line endings

2006-02-02 Thread Yury Polyanskiy
On Thu, 2006-02-02 at 07:42 +0100, Richard Levitte - VMS Whacker wrote:
 I must say that I feel nervous handing the decision to a lua hook.
 There's the strong possibility that a user somewhere will get
 creative and that chaos will follow.  If there's an attribute saying
 that a file shouldn't be transformed, the internals of monotone should
 be able to detect that and avoid doing anything with that file.

Agree-agree! Because everything about the project should be committable
and should be stored in the db. 

And of course it'd be MUCH better to have get_linesep_conv() NOT called
for binary files (or at least for manual_merge files as a quick hack).

Cheers,
-up





Re: [Monotone-devel] line endings

2006-02-02 Thread Glen Ditchfield
On Wednesday 01 February 2006 21:41, Ethan Blanton wrote:
 With this condition, a correct definition of the end-of-line sequence
 for the current platform,
 
I don't think Monotone should count on files obeying the platform's 
end-of-line sequence.  I mostly work in Windows, but even so I have to work 
with LF-only files fairly often.  Today I had to debug an Adobe FDF file that 
had LF line ends, even though the web site that generated it and the web 
browser that saved it were both on Windows machines.  If I were to save that 
into Monotone, I think I would want it treated as text, not binary.




Re: [Monotone-devel] line endings

2006-02-01 Thread Ethan Blanton
Howard Spindel spake unto us the following wisdom:
 It seems to me that to be completely general monotone needs a 
 per-file certificate specifying the permissible 
 transformations.  Examples of the settings available in that certificate:
 
 1.  Never transform anything
 2.  Database file is LF line terminated, transform LFs to platform 
 specific in/out
 3.  Database file is CR/LF line terminated, transform CR/LFs to 
 platform specific in/out
 4.  Whatever additional combinations are possible (does monotone have 
 to worry about big-endian vs little-endian ever?)

I think rules like this are not necessarily an improvement.  It is
obvious that no one conversion rule solves everything -- it seems
clear to me that the only 100% Reversible and Safe method of storing
line endings would be to use, e.g., byte stuffing, so that every bare
internal-line-ending marker in a stored text file corresponds to a real
end-of-line marker (since canonicalized) on the platform on which the
file was created.

With this condition, a correct definition of the end-of-line sequence
for the current platform, and a way to say this file has no
end-of-line markers (is binary), the problem is solved.  Without this
condition, there are a number of heuristics which can be used (and are
used in many systems) to approximate the correct behavior which work
in Most Cases.

Muddying the waters by arbitrarily complicating database
representation (e.g., allowing various end-of-line representations in
the database) does not seem profitable to me.
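
One way to read the byte-stuffing idea is sketched below: platform line
endings become a single internal LF, and any LF that was not a real line
ending is escaped so the stored form can always be reversed.  The escape
byte and function names are assumptions of this sketch.

```python
ESC = b"\x1b"  # escape byte chosen for this sketch (an assumption)
LF = b"\n"     # the single internal end-of-line marker


def _escape(segment: bytes) -> bytes:
    # Double the escape byte first, then hide any LF that was not an EOL.
    return segment.replace(ESC, ESC + ESC).replace(LF, ESC + b"n")


def _unescape(segment: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(segment):
        if segment[i:i + 1] == ESC and i + 1 < len(segment):
            nxt = segment[i + 1:i + 2]
            out += LF if nxt == b"n" else nxt  # ESC n -> LF, ESC ESC -> ESC
            i += 2
        else:
            out += segment[i:i + 1]
            i += 1
    return bytes(out)


def to_internal(data: bytes, platform_eol: bytes) -> bytes:
    """Canonicalize platform line endings to LF, stuffing stray LFs."""
    return LF.join(_escape(s) for s in data.split(platform_eol))


def from_internal(stored: bytes, platform_eol: bytes) -> bytes:
    """Inverse of to_internal for the same platform_eol."""
    return platform_eol.join(_unescape(s) for s in stored.split(LF))
```

After stuffing, every raw LF in the stored form is a genuine line ending,
so checking out with the same platform_eol reproduces the input byte for
byte, and checking out elsewhere swaps only the real line endings.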

Ethan

-- 
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
-- Cesare Beccaria, On Crimes and Punishments, 1764




Re: [Monotone-devel] line endings

2006-02-01 Thread Justin Patrin
On 2/1/06, Ethan Blanton [EMAIL PROTECTED] wrote:

 Muddying the waters by arbitrarily complicating database
 representation (e.g., allowing various end-of-line representations in
 the database) does not seem profitable to me.


I agree. Especially since he missed a third option, which is CR
terminated lines (mac).

--
Justin Patrin




Re: [Monotone-devel] line endings

2006-02-01 Thread Yury Polyanskiy
Well I think per-file certificate is something that doesn't fit in
monotone's model (certificates are attached to revisions not files).

In fact I think current implementation with get_linesep_conv() includes
every other model IF only I could check for manual_merge attribute
from inside the hook. As nobody answers I assume monotone doesn't have
internal mapping filename-attributes which can be checked from hook.

BTW, I don't understand all the buzz about bad binary files handling.
Maybe monotone COULD improve heuristics about detecting binary files but
in any case user has complete control over the process by setting
manual_merge attribute.

Overall I see three problems with line endings by now:
(a) conversion should not touch CR's if I want to convert LF's only.
(b) it should be possible to check for manual_merge attribute from hook
(c) default get_linesep_conv() should return {LF, SYSTEM_SEP } for all
files which are not marked by manual_merge attr.

That I think would satisfy everyone and does not (except for b) imply a
great change in current code. 
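
Points (a) and (c) can be sketched as follows, in Python rather than Lua.
The attrs mapping stands in for the attribute lookup that (b) asks for, and
the "true" value is an assumption of the sketch.

```python
import os
import re

_SYSTEM_SEP = {"\n": "LF", "\r\n": "CRLF", "\r": "CR"}[os.linesep]


def default_linesep_conv(attrs: dict) -> tuple:
    """(c): convert LF to the system separator unless manual_merge is set."""
    if attrs.get("manual_merge") == "true":
        return ("LF", "LF")  # identical pair: no conversion
    return ("LF", _SYSTEM_SEP)


def lf_to_crlf(data: bytes) -> bytes:
    """(a): rewrite only bare LFs, leaving existing CRs and CRLFs alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)
```

The lookbehind keeps an existing CRLF from turning into CR CR LF, and a lone
CR is never touched.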

BTW can anyone knowing rosters stuff say if better attributes handling
may now help with (b) in the above?

Yury.



On Wed, 2006-02-01 at 18:59 -0800, Howard Spindel wrote:
 It seems to me that to be completely general monotone needs a 
 per-file certificate specifying the permissible 
 transformations.  Examples of the settings available in that certificate:
 
 1.  Never transform anything
 2.  Database file is LF line terminated, transform LFs to platform 
 specific in/out
 3.  Database file is CR/LF line terminated, transform CR/LFs to 
 platform specific in/out
 4.  Whatever additional combinations are possible (does monotone have 
 to worry about big-endian vs little-endian ever?)
 
 Platform specific could be determined by system call, overrideable by hook.
 
 Monotone would request the correct setting on initial file 
 check-in.  For backwards compatibility, if the certificate does not 
 exist monotone does whatever it does now but requests the correct 
 setting on next file check-in.
 
 The problem I see in the discussions so far is not in knowing what to 
 do on the platform you're on, but in knowing what's stored in the 
 database.  Monotone could use heuristics to determine that, but if 
 the heuristics fail the user gets a big, maybe nasty surprise.
 
 In a project development environment where everyone uses the same OS, 
 never transform anything should work well for all files and perhaps 
 there should be a way to tell monotone that this case exists.
 
 Since I know nothing of monotone internals, I acknowledge that I 
 could be way off-base on something here.
 
 Howard
 
 
 
 
 
 




Re: [Monotone-devel] line endings

2006-02-01 Thread Richard Levitte - VMS Whacker
In message [EMAIL PROTECTED] on Wed, 01 Feb 2006 18:59:12 -0800, Howard 
Spindel [EMAIL PROTECTED] said:

howard 2.  Database file is LF line terminated, transform LFs to
howard platform specific in/out
howard 3.  Database file is CR/LF line terminated, transform CR/LFs
howard to platform specific in/out

I think we established a while ago that there should only be one
possible line ending in the database.  Consider what happens when two
databases with different line ending standards are synchronised.

Cheers,
Richard

-
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

When I became a man I put away childish things, including
 the fear of childishness and the desire to be very grown up.
-- C.S. Lewis




Re: [Monotone-devel] line endings and binary files revisited...

2005-05-21 Thread Nathaniel Smith
On Sun, May 22, 2005 at 01:56:35AM +0200, Richard Levitte - VMS Whacker wrote:
 I have the feeling that however good, there's always a possibility
 that a non-text file will be interpreted as a text file.  As long as you're
 playing entirely in Unix, it's not a problem, since the line ending is
 a \n, so there will be no conversion.  On Windows or on Mac, where the
 line ending is \r\n and \r respectively, the matter is different.
 
 I'm thinking that files should be stored entirely unchanged in the
 database, and *possibly* be interpreted on output.  It has the benefit
 of not being destructive, which I think would be a good thing...

The current policy is:
  -- by default, don't touch line endings at all.  In fact, don't
 touch data at all, we store uninterpreted bytestreams.
  -- if the user requests, they can have line endings converted one
 way going out of their working copy, and back again when going
 into their working copy.  I don't know if there's a way to make
 this apply only to specific files, though.

-- Nathaniel

-- 
Eternity is very long, especially towards the end.
  -- Woody Allen

