Re: [Monotone-devel] line endings as project policy
Sorry for prolonging this thread.

Line ending conversion affects my "use case". I keep bash scripts in mtn that I copy by hand to other computers, and I generally use sha1sum to make sure that said computers have a certain version of the scripts installed (comparing their SHA1 against the log).

The fact that mtn identifies file versions by their SHA1 is very elegant and probably the feature that caught my eye when I was shopping for an SCM. I think that having "each version is identified by the SHA1 of what YOU commit" is inherently better than "each version is identified by the SHA1 of the UTF-8 (or whatever format chosen) of what you commit".

The fact that I have an external tool that agrees with monotone's idea of a file's contents gives me confidence in monotone.

cheers
nicolás

Ulf Ochsenfahrt wrote:
> This line ending thing is getting far too much attention, IMHO. My last
> word on this issue is:
>
> - Whatever I check in, I want checked out
>
> - What I'd like to see is a setting where monotone checks on commit if
>   the files obey a particular line ending convention/charset and gives a
>   warning if they don't
>
> I don't want any automatic conversion of line endings or charsets. IMHO,
> charsets are much too fragile and dangerous to be handled by monotone.
> And line ending conversion cannot really be separated from charset
> handling in the face of non-8-bit encoded charsets.
>
> That said, I am not opposed to an opt-in mechanism for line
> ending/charset handling, as long as its not on by default.
>
> The CVS way to do it was really, really bad. It messed up my files
> several times, with duplicate line endings and with treating binary
> files as text.
>
> Now, there's also another thing, which is a better merge ui, which is
> much overdue now...
>
> Cheers,
>
> -- Ulf

___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
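The sha1sum workflow described in this message can be illustrated with a minimal sketch. It assumes only that a file is identified by the SHA-1 of its raw bytes, as the message says; the script contents and the helper name `sha1_hex` are made up for illustration.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA-1 digest of raw bytes (what sha1sum prints for a file)."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical script contents, as committed with LF line endings.
committed = b"#!/bin/bash\necho hello\n"

# The same script after a checkout that rewrote LF as CRLF.
converted = committed.replace(b"\n", b"\r\n")

print(sha1_hex(committed))
print(sha1_hex(converted))
# The digests differ, so an external sha1sum no longer agrees with the stored ID.
print(sha1_hex(committed) == sha1_hex(converted))  # False
```

Any conversion between commit and checkout, however well meant, breaks this kind of byte-for-byte verification.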
Re: [Monotone-devel] line endings as project policy
On Sat, Nov 25, 2006 at 08:44:59AM +0100, Richard Levitte - VMS Whacker wrote:
> In message <[EMAIL PROTECTED]> on Thu, 23 Nov 2006 12:11:06 -0800, Daniel Lakeland <[EMAIL PROTECTED]> said:
>
> dlakelan> On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
> dlakelan>
> dlakelan> > - We need to convert line endings to the local standard on anything
> dlakelan> >   that's assumed to be text on checkout. This I regard as a fact.
> dlakelan> >   (see the problem that some Unixly programs have with embedded \r)
> dlakelan>
> dlakelan> Consider languages like Python that have the ability to
> dlakelan> create multiline strings, now the \r or \n characters are
> dlakelan> part of the string. Converting them changes the behavior and
> dlakelan> meaning of the program. This is very tricky.
>
> Does it really? So, if I write that little example in a python
> program in Windows, using notepad, I should expect my program to
> behave differently on Windows than if I wrote that in emacs on a Unix
> box and ran it on Unix? If that is to be *expected*, then I'm
> immediately throwing away python for any future plans.

I'm not quite sure what you mean, but basically the string continues over several lines and the newline characters become part of the string. Therefore, for example, if your string contains some data that is exactly what you're looking for in the output of another program, you'll be surprised when monotone alters the line endings in your string and your python program doesn't match the output of the other program it interfaces with.

Yes, as someone said, this is dodgy code. But nevertheless I don't know why monotone should be altering the content of any text that it checks in. It WILL cause problems eventually.

--
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan
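The mismatch described here can be sketched at the byte level. The example below is an illustration only: it models the "expected" text as bytes stored in a versioned file and makes no claim about how any particular interpreter reads string literals. The tool output and the helper `normalize_newlines` are hypothetical.

```python
# Output captured from a hypothetical external tool that writes LF line endings.
tool_output = b"status: ok\ncount: 3\n"

# Expected text as stored in a file that was committed with LF endings.
expected_as_committed = b"status: ok\ncount: 3\n"

# The same expected text after a checkout that rewrote LF as CRLF.
expected_after_checkout = expected_as_committed.replace(b"\n", b"\r\n")

print(tool_output == expected_as_committed)    # True
print(tool_output == expected_after_checkout)  # False


def normalize_newlines(data: bytes) -> bytes:
    """Map CRLF and lone CR to LF so comparisons ignore line-ending style."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# A defensive program compares normalized forms instead of raw bytes.
print(normalize_newlines(tool_output) == normalize_newlines(expected_after_checkout))  # True
```

Code that normalizes before comparing survives a conversion. Code that compares raw bytes is the "dodgy" but legitimate case under discussion.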
Re: [Monotone-devel] line endings as project policy
> "Richard" == Richard Levitte - VMS Whacker <[EMAIL PROTECTED]> writes:

Richard> So, what I hear you say is that we should remove all the conversion
Richard> stuff that exists today. Is that correct?

What conversion stuff exists today? I thought there was none?

--
Brian May <[EMAIL PROTECTED]>
Re: [Monotone-devel] line endings as project policy
So, what I hear you say is that we should remove all the conversion stuff that exists today. Is that correct?

Cheers,
Richard

In message <[EMAIL PROTECTED]> on Sat, 25 Nov 2006 13:52:48 +0100, Ulf Ochsenfahrt <[EMAIL PROTECTED]> said:

ulf> Hi all,
ulf>
ulf> This line ending thing is getting far too much attention, IMHO. My last
ulf> word on this issue is:
ulf>
ulf> - Whatever I check in, I want checked out
ulf>
ulf> - What I'd like to see is a setting where monotone checks on commit if
ulf>   the files obey a particular line ending convention/charset and gives a
ulf>   warning if they don't
ulf>
ulf> I don't want any automatic conversion of line endings or charsets. IMHO,
ulf> charsets are much too fragile and dangerous to be handled by monotone.
ulf> And line ending conversion cannot really be separated from charset
ulf> handling in the face of non-8-bit encoded charsets.
ulf>
ulf> That said, I am not opposed to an opt-in mechanism for line
ulf> ending/charset handling, as long as its not on by default.
ulf>
ulf> The CVS way to do it was really, really bad. It messed up my files
ulf> several times, with duplicate line endings and with treating binary
ulf> files as text.
ulf>
ulf> Now, there's also another thing, which is a better merge ui, which is
ulf> much overdue now...

-----
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

--
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up." -- C.S. Lewis
Re: [Monotone-devel] line endings as project policy
or is it "hear hear" ? ;-)

RS

On 11/25/06, Rob Schoening <[EMAIL PROTECTED]> wrote:
> Here here.
>
> RS
>
> On 11/25/06, Ulf Ochsenfahrt <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > This line ending thing is getting far too much attention, IMHO. My last
> > word on this issue is:
> >
> > - Whatever I check in, I want checked out
> >
> > - What I'd like to see is a setting where monotone checks on commit if
> >   the files obey a particular line ending convention/charset and gives a
> >   warning if they don't
> >
> > I don't want any automatic conversion of line endings or charsets. IMHO,
> > charsets are much too fragile and dangerous to be handled by monotone.
> > And line ending conversion cannot really be separated from charset
> > handling in the face of non-8-bit encoded charsets.
> >
> > That said, I am not opposed to an opt-in mechanism for line
> > ending/charset handling, as long as its not on by default.
> >
> > The CVS way to do it was really, really bad. It messed up my files
> > several times, with duplicate line endings and with treating binary
> > files as text.
> >
> > Now, there's also another thing, which is a better merge ui, which is
> > much overdue now...
> >
> > Cheers,
> >
> > -- Ulf
Re: [Monotone-devel] line endings as project policy
Here here.

RS

On 11/25/06, Ulf Ochsenfahrt <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> This line ending thing is getting far too much attention, IMHO. My last
> word on this issue is:
>
> - Whatever I check in, I want checked out
>
> - What I'd like to see is a setting where monotone checks on commit if
>   the files obey a particular line ending convention/charset and gives a
>   warning if they don't
>
> I don't want any automatic conversion of line endings or charsets. IMHO,
> charsets are much too fragile and dangerous to be handled by monotone.
> And line ending conversion cannot really be separated from charset
> handling in the face of non-8-bit encoded charsets.
>
> That said, I am not opposed to an opt-in mechanism for line
> ending/charset handling, as long as its not on by default.
>
> The CVS way to do it was really, really bad. It messed up my files
> several times, with duplicate line endings and with treating binary
> files as text.
>
> Now, there's also another thing, which is a better merge ui, which is
> much overdue now...
>
> Cheers,
>
> -- Ulf
Re: [Monotone-devel] line endings as project policy
Hi all,

This line ending thing is getting far too much attention, IMHO. My last word on this issue is:

- Whatever I check in, I want checked out

- What I'd like to see is a setting where monotone checks on commit if the files obey a particular line ending convention/charset and gives a warning if they don't

I don't want any automatic conversion of line endings or charsets. IMHO, charsets are much too fragile and dangerous to be handled by monotone. And line ending conversion cannot really be separated from charset handling in the face of non-8-bit encoded charsets.

That said, I am not opposed to an opt-in mechanism for line ending/charset handling, as long as it's not on by default.

The CVS way to do it was really, really bad. It messed up my files several times, with duplicate line endings and with treating binary files as text.

Now, there's also another thing, which is a better merge ui, which is much overdue now...

Cheers,

-- Ulf
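The "check on commit and warn, never convert" idea from this message could look roughly like the sketch below. It is not monotone code: the function names and the policy strings "LF", "CRLF" and "CR" are assumptions made for illustration, and the check only counts bytes, so it applies to ASCII-compatible encodings.

```python
def line_ending_counts(data: bytes) -> dict:
    """Count CRLF pairs, lone CR bytes and lone LF bytes in raw file contents."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CR bytes that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # LF bytes that are not part of a CRLF pair
    return {"CRLF": crlf, "CR": cr, "LF": lf}


def check_policy(data: bytes, policy: str) -> list:
    """Return warnings for line endings that differ from the policy.

    The data itself is never modified; the caller decides what to do.
    """
    counts = line_ending_counts(data)
    return [
        f"found {n} {style} line ending(s), project policy is {policy}"
        for style, n in counts.items()
        if style != policy and n > 0
    ]


for warning in check_policy(b"one\r\ntwo\nthree\n", policy="LF"):
    print("warning:", warning)
```

Because the check only reports, what is checked in is still exactly what is checked out.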
Re: [Monotone-devel] line endings as project policy
Larry Hastings wrote:
> Ulf Ochsenfahrt wrote:
> > Yes, but UTF-8 is a _multi-byte_ encoding. If you see an LF byte, you
> > don't know whether this is a single-byte LF or part of a multi-byte
> > sequence.
>
> Yes you do, because all multi-byte character sequences in UTF-8 have the
> high-bit set. If you see 0x0A in a UTF-8 stream you can be certain it
> /is/ an LF and /not/ part of a multi-byte sequence.

I suspected that, which is why I put in the part about NON-8-bit encodings, which you conveniently cut out.

-- Ulf
Re: [Monotone-devel] line endings as project policy
In message <[EMAIL PROTECTED]> on Thu, 23 Nov 2006 12:11:06 -0800, Daniel Lakeland <[EMAIL PROTECTED]> said:

dlakelan> On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
dlakelan>
dlakelan> > - We need to convert line endings to the local standard on anything
dlakelan> >   that's assumed to be text on checkout. This I regard as a fact.
dlakelan> >   (see the problem that some Unixly programs have with embedded \r)
dlakelan>
dlakelan> Consider languages like Python that have the ability to
dlakelan> create multiline strings, now the \r or \n characters are
dlakelan> part of the string. Converting them changes the behavior and
dlakelan> meaning of the program. This is very tricky.

Does it really? So, if I write that little example in a python program in Windows, using notepad, I should expect my program to behave differently on Windows than if I wrote that in emacs on a Unix box and ran it on Unix? If that is to be *expected*, then I'm immediately throwing away python for any future plans.

Now, if I have some code elsewhere in the program that expects a certain type of line ending, then I'm a programmer who only knows that particular platform, and I need to learn something about line ending formats and true portability. Consider doing the natural thing and using FTP to transfer the code, in ASCII mode (well, it's source, so it's text, right?). Then I can watch my program go *bamf* until I fix it.

dlakelan> Example:
dlakelan>
dlakelan> mystring = """This string
dlakelan> Has several
dlakelan> New line characters
dlakelan> embedded in it
dlakelan> suppose the contents were executable code
dlakelan> embedded in this string
dlakelan> can we safely convert the newlines?
dlakelan> No
dlakelan> """

Cheers,
Richard

-----
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

--
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up." -- C.S. Lewis
Re: [Monotone-devel] line endings as project policy
> "Larry" == Larry Hastings <[EMAIL PROTECTED]> writes:

Larry> Well I'd certainly agree it isn't platform-independent
Larry> code. But where is it written that monotone should not
Larry> support checking in "dodgy" code?

Store the files as binary. Such users obviously don't need end-of-line conversion anyway.

--
Brian May <[EMAIL PROTECTED]>
Re: [Monotone-devel] line endings as project policy
Ulf Ochsenfahrt wrote:
> Yes, but UTF-8 is a _multi-byte_ encoding. If you see an LF byte, you
> don't know whether this is a single-byte LF or part of a multi-byte
> sequence.

Yes you do, because all multi-byte character sequences in UTF-8 have the high-bit set. If you see 0x0A in a UTF-8 stream you can be certain it /is/ an LF and /not/ part of a multi-byte sequence.

http://en.wikipedia.org/wiki/Utf-8#Description

Brian May wrote:
> "Daniel" == Daniel Lakeland <[EMAIL PROTECTED]> writes:
>
> Daniel> Consider languages like Python that have the ability to
> Daniel> create multiline strings, now the \r or \n characters are
> Daniel> part of the string. Converting them changes the behavior
> Daniel> and meaning of the program. This is very tricky.
>
> Any code that relies on this behaviour is very dodgy IMHO.

Well I'd certainly agree it isn't platform-independent code. But where is it written that monotone should not support checking in "dodgy" code?

/larry/
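The UTF-8 property stated in this message can be checked by brute force. The sketch below encodes every Unicode code point that needs more than one byte and looks for a 0x0A or 0x0D byte in the result. The function name is made up for illustration.

```python
def utf8_multibyte_has_cr_or_lf() -> bool:
    """Return True if any multi-byte UTF-8 sequence contains byte 0x0A or 0x0D."""
    for code_point in range(0x80, 0x110000):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # surrogates cannot be encoded in UTF-8
        encoded = chr(code_point).encode("utf-8")
        if 0x0A in encoded or 0x0D in encoded:
            return True
    return False


# Every byte of a multi-byte sequence has the high bit set, so this prints False.
print(utf8_multibyte_has_cr_or_lf())  # False
```

So a byte-level scan for LF or CR is safe on UTF-8 data, which is the point being made. It says nothing about UTF-16 or UTF-32.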
Re: [Monotone-devel] line endings as project policy
Nuno Lucas wrote:
> Line endings don't have a direct relation to character encoding.
>
> It's true that in theory you would need to know the character encoding
> to know what a line ending is (like the mentioned line ending Unicode
> character), but in practice there are only 3 "standard" line endings
> (LF, CR-LF and CR) and if some file uses any other you would need to
> use a special program for it, so it's better to treat the file as binary.
>
> An ASCII text can use any of the 3 line-endings. Same with an UTF-8
> text, ISO-8859-1, or any other. No way to know the line ending by the
> character encoding.

Yes, but UTF-8 is a _multi-byte_ encoding. If you see an LF byte, you don't know whether this is a single-byte LF or part of a multi-byte sequence.

(I'm not sure if this is a problem with UTF-8 in particular, but it certainly is with 16 or 32-bit encodings, such as UTF-16 and UTF-32.)

-- Ulf
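The concern about 16-bit encodings can be made concrete. In the sketch below, the character U+010A is chosen because its UTF-16-LE encoding happens to contain the byte 0x0A. A converter that rewrites LF bytes without knowing the encoding then damages the text. The variable names are illustrative.

```python
# U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) followed by a real newline.
text = "\u010a\n"
encoded = text.encode("utf-16-le")
print(encoded)  # b'\n\x01\n\x00': the first 0x0A byte is half of U+010A

# A byte-level LF -> CRLF conversion that ignores the encoding.
naive = encoded.replace(b"\n", b"\r\n")

# The conversion done correctly, on characters rather than bytes.
correct = text.replace("\n", "\r\n").encode("utf-16-le")

print(naive == correct)  # False
# The naive result still decodes, but no longer to the intended text.
print(naive.decode("utf-16-le") == "\u010a\r\n")  # False
```

This is why line-ending conversion on non-8-bit encodings cannot be separated from knowing the charset.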
Re: [Monotone-devel] line endings as project policy
On 11/23/06, Daniel Lakeland <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
>
> > - We need to convert line endings to the local standard on anything
> >   that's assumed to be text on checkout. This I regard as a fact.
> >   (see the problem that some Unixly programs have with embedded \r)
>
> Consider languages like Python that have the ability to create
> multiline strings, now the \r or \n characters are part of the
> string. Converting them changes the behavior and meaning of the
> program. This is very tricky.
>
> Example:
>
> mystring = """This string
> Has several
> New line characters
> embedded in it
> suppose the contents were executable code
> embedded in this string
> can we safely convert the newlines?
> No
> """

PHP has the same thing, but if I require a certain type of newline I always use escaped chars (\r \n) as you never know what editors/RCS are going to do to your newlines.

--
Justin Patrin
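The escaped-character approach mentioned here carries over to Python. In the sketch below, the required separator is spelled out with escapes, so the source file contains no raw newline inside the string and is unaffected by whatever an editor or version control system does to the file's own line endings. The sample lines are made up.

```python
LINES = ["This string", "has several", "lines"]

# The separator is written with escapes, not with raw line breaks in the source.
with_lf = "\n".join(LINES) + "\n"
with_crlf = "\r\n".join(LINES) + "\r\n"

print(with_lf.encode("ascii"))
print(with_crlf.encode("ascii"))
```

The trade-off is readability: a triple-quoted block is easier to read, while the escaped form states exactly which bytes are wanted.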
Re: [Monotone-devel] line endings as project policy
> "Daniel" == Daniel Lakeland <[EMAIL PROTECTED]> writes:

Daniel> Consider languages like Python that have the ability to
Daniel> create multiline strings, now the \r or \n characters are
Daniel> part of the string. Converting them changes the behavior
Daniel> and meaning of the program. This is very tricky.

Any code that relies on this behaviour is very dodgy IMHO.

--
Brian May <[EMAIL PROTECTED]>
Re: [Monotone-devel] line endings as project policy
On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:

> - We need to convert line endings to the local standard on anything
>   that's assumed to be text on checkout. This I regard as a fact.
>   (see the problem that some Unixly programs have with embedded \r)

Consider languages like Python that have the ability to create multiline strings, now the \r or \n characters are part of the string. Converting them changes the behavior and meaning of the program. This is very tricky.

Example:

mystring = """This string
Has several
New line characters
embedded in it
suppose the contents were executable code
embedded in this string
can we safely convert the newlines?
No
"""

--
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan
Re: [Monotone-devel] line endings as project policy
Just a general comment on this thread. Please don't forget that text files are not the only ones which require special handling. There can be other file formats such as XML which need special merge handling.

Clearcase handles this with the file type manager, which allows you to associate a type with each file and then specify the behaviour for things like diff, merge and format conversion.

Joel

On 11/22/06, Nuno Lucas <[EMAIL PROTECTED]> wrote:
> On 11/22/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > Does that mean that you have C code in ASCII with comments embedded in a
> > completely different character set?
>
> What I called IBM-860 is just a variation of ASCII. It's the same as
> having an UTF-8 C source file with comments in a foreign language
> (e.g., with accents).
>
> IBM-437 (or CP-437) is the US-English code page, and IBM-850 (or
> CP-850) is the multi-lingual code page (contains most of the graphics
> symbols 437 has, but lacks things like uppercase accented vowels).
>
> > Just curious -- is IBM 860 some variety of EBCDIC? And is the file
> > record-structured so that all 256 character codes are available (in
> > principle) for text other than newlines? So that as far as character
> > coding is concerned, end-of-line is handled by a form of out-of-band
> > signalling?
>
> To clarify a bit, IBM-860 is the same as ASCII for the ASCII part
> (0..127) and has characters for the Portuguese locale in the positions
> above 127 (mostly accented vowels).
>
> If you have an old matrix printer around (or something like a receipt
> printer), take a look at the appendix pages, where all these code pages
> usually are.
>
> > > For example, I can have a directory with many different translations
> > > of a document (in text, of course), each one with its own encoding.
> > > While I would be happy if checkout handles line endings automatically
> > > for me, I would be very surprised if it decides to handle the text
> > > encoding.
> >
> > Do we have a situation in which each file has its own encoding? Or one
> > in which different parts of a file have different encodings?
>
> I was talking of the case of different file encodings, but I forgot
> the case of different encodings in the same file, which is not as rare
> as you may think.
>
> In my case there are 3 languages the program "talks", and all strings
> are defined in a single C source file (remember this was an old DOS
> application). They all use ISO-8859-1 (actually they were converted
> from IBM-860 too), but that is just because the characters needed are
> all there (all western European languages, namely Portuguese, French,
> Spanish and an incomplete - unused - English one).
>
> Regards,
> ~Nuno Lucas
Re: [Monotone-devel] line endings as project policy
On 11/22/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Does that mean that you have C code in ASCII with comments embedded in a
> completely different character set?

What I called IBM-860 is just a variation of ASCII. It's the same as having an UTF-8 C source file with comments in a foreign language (e.g., with accents).

IBM-437 (or CP-437) is the US-English code page, and IBM-850 (or CP-850) is the multi-lingual code page (contains most of the graphics symbols 437 has, but lacks things like uppercase accented vowels).

> Just curious -- is IBM 860 some variety of EBCDIC? And is the file
> record-structured so that all 256 character codes are available (in
> principle) for text other than newlines? So that as far as character
> coding is concerned, end-of-line is handled by a form of out-of-band
> signalling?

To clarify a bit, IBM-860 is the same as ASCII for the ASCII part (0..127) and has characters for the Portuguese locale in the positions above 127 (mostly accented vowels).

If you have an old matrix printer around (or something like a receipt printer), take a look at the appendix pages, where all these code pages usually are.

> > For example, I can have a directory with many different translations
> > of a document (in text, of course), each one with its own encoding.
> > While I would be happy if checkout handles line endings automatically
> > for me, I would be very surprised if it decides to handle the text
> > encoding.
>
> Do we have a situation in which each file has its own encoding? Or one
> in which different parts of a file have different encodings?

I was talking of the case of different file encodings, but I forgot the case of different encodings in the same file, which is not as rare as you may think.

In my case there are 3 languages the program "talks", and all strings are defined in a single C source file (remember this was an old DOS application). They all use ISO-8859-1 (actually they were converted from IBM-860 too), but that is just because the characters needed are all there (all western European languages, namely Portuguese, French, Spanish and an incomplete - unused - English one).

Regards,
~Nuno Lucas
Re: [Monotone-devel] line endings as project policy
On 11/22/06, Thomas Moschny <[EMAIL PROTECTED]> wrote:
> On Wednesday 22 November 2006 22:33, Nuno Lucas wrote:
> > Don't mix character encoding problems with the end-of-line issue. They
> > are very different beasts.
>
> But in order to know what you are doing when converting different types of
> eol into each other, you have to know what the encoding of the file is, not?
> Or, at least, you have to know whether it is encoded in one of the many
> encodings that extend ascii in the one way or the other.

Line endings don't have a direct relation to character encoding.

It's true that in theory you would need to know the character encoding to know what a line ending is (like the mentioned line ending Unicode character), but in practice there are only 3 "standard" line endings (LF, CR-LF and CR), and if some file uses any other you would need to use a special program for it, so it's better to treat the file as binary.

An ASCII text can use any of the 3 line endings. The same goes for a UTF-8 text, ISO-8859-1, or any other. There is no way to know the line ending from the character encoding.

Regards,
~Nuno Lucas
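The claim that the three standard line endings look the same across ASCII-extending encodings can be checked with a short sketch. The list of encodings below is an illustrative selection taken from those named in this thread. UTF-16 is included only as a counter-example of an encoding that does not extend ASCII.

```python
ASCII_COMPATIBLE = ["ascii", "utf-8", "iso-8859-1", "iso-8859-15", "cp860"]

# CR and LF encode to the same single bytes in every ASCII-extending encoding.
for name in ASCII_COMPATIBLE:
    print(name, "\r\n".encode(name))

same_everywhere = all("\r\n".encode(name) == b"\r\n" for name in ASCII_COMPATIBLE)
print(same_everywhere)  # True

# In UTF-16 the same two characters occupy four bytes.
print("\r\n".encode("utf-16-le"))  # b'\r\x00\n\x00'
```

For the ASCII-extending family, byte-level handling of LF, CR-LF and CR works without knowing the exact encoding. Outside that family it does not.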
Re: [Monotone-devel] line endings as project policy
On Wed, Nov 22, 2006 at 08:05:31PM +0100, Richard Levitte - VMS Whacker wrote:
>
> hendrik> Can we uncompress compressed files so as to better
> hendrik> diff/merge the contents and recompress on checkout? This
> hendrik> might be very helpful for openoffice files.
>
> Uhmm, I seriously think this is way out of monotone's scope.

This week, anyway ... But it is a real problem. Last time I looked at an OpenOffice file, it was a compressed zip archive containing several files:

   the actual text coded as XML
   the style sheet for presenting said XML
   a few other things, maybe the DTD, I forget.

But we won't be able to properly merge these files unless we do unarchive and rearchive them.

I suspect with the absurd bulkiness of XML notation, this won't be the last case we encounter where compression interferes with merging.

-- hendrik
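The unarchive-then-merge idea can be sketched with Python's standard zipfile module. This is an illustration of the principle only, not anything monotone does. The member names and XML snippets are invented stand-ins for the parts of an office document, and the helper names are hypothetical.

```python
import io
import zipfile


def build_archive(members: dict) -> bytes:
    """Pack a mapping of member name -> bytes into a compressed zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unpack_archive(blob: bytes) -> dict:
    """Unpack a zip archive into a mapping of member name -> bytes."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Hypothetical document: text and style sheet stored as separate XML members.
document = build_archive({
    "content.xml": b"<text>Hello</text>\n",
    "styles.xml": b"<style name='default'/>\n",
})

# The unpacked members are plain XML that a text diff or merge can work on.
for name, data in unpack_archive(document).items():
    print(name, data)
```

Diffing the unpacked members is meaningful, whereas diffing two compressed blobs is not. Rebuilding the archive after a merge is the step that would have to reproduce a file the application still accepts.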
Re: [Monotone-devel] line endings as project policy
In message <[EMAIL PROTECTED]> on Wed, 22 Nov 2006 21:33:33 +, "Nuno Lucas" <[EMAIL PROTECTED]> said:

ntlucas> In my opinion, should be up to the user to know how to handle
ntlucas> the text encoding, not monotone.

As long as all users in the same project agree...

Cheers,
Richard

-----
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

--
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up." -- C.S. Lewis
Re: [Monotone-devel] line endings as project policy
On Wed, Nov 22, 2006 at 09:33:33PM +, Nuno Lucas wrote:
> On 11/22/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >If we use an internal line ending standard, we should consider the
> >possibility of using the standard newline character NEL, "Next Line",
> >0x85, unicode U+0085.
>
> You are forgetting I can (and actually I am) versioning C files with
> text comments using some code page other than ASCII (in my case
> IBM-860, because it's a port from a MS-DOS program, and the original
> programmer was Portuguese).

Does that mean that you have C code in ASCII with comments embedded in a completely different character set?

> So, I have lots of comments with '\x85'. If your idea goes ahead,
> suddenly the project will become corrupt, because C++ style comments
> suddenly wrap to the next line.

Just curious -- is IBM 860 some variety of EBCDIC? And is the file record-structured so that all 256 character codes are available (in principle) for text other than newlines? So that as far as character coding is concerned, end-of-line is handled by a form of out-of-band signalling?

> >Are we currently storing files as unicode or UTF-8? (I think only admin
> >information such as file names) Should we store text files as
> >UTF-8?
>
> Don't mix character encoding problems with the end-of-line issue. They
> are very different beasts.

I think that end-of-line coding is one of the simplest character-coding issues.

> For example, I can have a directory with many different translations
> of a document (in text, of course), each one with its own encoding.
> While I would be happy if checkout handles line endings automatically
> for me, I would be very surprised if it decides to handle the text
> encoding.

Do we have a situation in which each file has its own encoding? Or one in which different parts of a file have different encodings?

> My current project uses ISO-8859-15 (because it's an embedded device),
> but I develop in a UTF-8 environment (a standard desktop linux
> distro), so all text on the source must be ISO-8859-15, not UTF-8.
>
> In my opinion, should be up to the user to know how to handle the text
> encoding, not monotone.
>
> I mostly agree with the rest of your points, though.
>
> Best regards,
> ~Nuno Lucas
Re: [Monotone-devel] line endings as project policy
On Wednesday 22 November 2006 22:33, Nuno Lucas wrote:
> Don't mix character encoding problems with the end-of-line issue. They
> are very different beasts.

But in order to know what you are doing when converting different types of eol into each other, you have to know what the encoding of the file is, don't you? Or, at least, you have to know whether it is encoded in one of the many encodings that extend ASCII in one way or another.

- Thomas
Re: [Monotone-devel] line endings as project policy
On 11/22/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> If we use an internal line ending standard, we should consider the
> possibility of using the standard newline character NEL, "Next Line",
> 0x85, unicode U+0085.

You are forgetting I can (and actually I am) versioning C files with text comments using some code page other than ASCII (in my case IBM-860, because it's a port from a MS-DOS program, and the original programmer was Portuguese).

So, I have lots of comments with '\x85'. If your idea goes ahead, suddenly the project will become corrupt, because C++ style comments suddenly wrap to the next line.

> Are we currently storing files as unicode or UTF-8? (I think only admin
> information such as file names) Should we store text files as
> UTF-8?

Don't mix character encoding problems with the end-of-line issue. They are very different beasts.

For example, I can have a directory with many different translations of a document (in text, of course), each one with its own encoding. While I would be happy if checkout handles line endings automatically for me, I would be very surprised if it decides to handle the text encoding.

My current project uses ISO-8859-15 (because it's an embedded device), but I develop in a UTF-8 environment (a standard desktop linux distro), so all text in the source must be ISO-8859-15, not UTF-8.

In my opinion, it should be up to the user to know how to handle the text encoding, not monotone.

I mostly agree with the rest of your points, though.

Best regards,
~Nuno Lucas
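The clash between byte 0x85 as a code-page letter and as the NEL control can be shown with Python's built-in cp860 codec. The comment text is an invented example. Decoding as latin-1 stands in for a tool that treats 0x85 as U+0085, and Python's str.splitlines() happens to treat U+0085 as a line boundary.

```python
# A C++ style comment stored in IBM-860 (cp860); byte 0x85 is the letter 'à'.
raw = b"// volta \x85 origem\n"

as_cp860 = raw.decode("cp860")
print(as_cp860)  # // volta à origem

# Read through latin-1, the same byte becomes U+0085 (NEL, "Next Line").
as_latin1 = raw.decode("latin-1")

# str.splitlines() treats U+0085 as a line break, so the comment is split in two.
print(as_cp860.splitlines())   # one line
print(as_latin1.splitlines())  # two lines
```

The second line of the split comment would no longer start with `//`, which is exactly the corruption described above.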
Re: [Monotone-devel] line endings as project policy
In message <[EMAIL PROTECTED]> on Wed, 22 Nov 2006 10:05:06 -0500, [EMAIL PROTECTED] said:

hendrik> > - We need to treat files as binary unless told otherwise. This I regard as a fact. (see the problem with screwed up files without the user knowing about it)
hendrik>
hendrik> Agree. This is an essential safety constraint.

Good.

hendrik> > - We need to mark text files as such. This I regard as fact, and it seems to me like this is almost consensus.
hendrik>
hendrik> Agree.

Good.

hendrik> > - We need to convert line endings to the local standard on anything that's assumed to be text on checkout. This I regard as a fact. (see the problem that some Unixly programs have with embedded \r)
hendrik>
hendrik> This seems obvious, but I have some discomfort with the idea. Perhaps because I'm thinking of the wider issues involved in character set incompatibility. In any case, conversion on checkout should be overridable in some way.

Aye, I hear ya. Character set incompatibility (and conversion) is a bigger can of worms. I believe we can handle that separately. We do need to handle that as well at some point. But for now, I'd like to keep it small and manageable and focus on line ends, if for nothing else then *so we get something done*.

hendrik> If we use an internal line ending standard, we should consider the possibility of using the standard newline character NEL, "Next Line", 0x85, unicode U+0085.

Now, there's an idea... Just be ready to get pounced on over that one; that would mean some rather big changes, methinks.

hendrik> > Let's get it right and reach consensus instead, well grounded in our minds and our wills.
hendrik>
hendrik> To get it really well-grounded, we might also consider it in the context of character set conversion. Points that are easy to overlook with respect to line endings may be glaringly obvious in this larger context. Even if we don't solve the larger context, it may make decisions clear with the smaller one.

I'm all for hearing about things that will help resolve the smaller issue first.

hendrik> > So, anything I forgot?
hendrik>
hendrik> Just how do we mark files as being text in the data base? Will it conceptually be part of the checked-in revision, and editable and mergeable like anything else?

I was imagining an attribute. They can be set, fetched and dropped. Attributes are conceptually part of the checked-in revision. I guess the biggest thing to resolve is what happens if that attribute changes for some reason, and that includes merging where the attribute value doesn't match in both parents.

Come to think of it, how does merging of non-matching attributes work today? For example, when a file is marked as executable in one revision but not in the other, and a merge is performed down the line?

hendrik> Just how does the user mark files as being text? A specific parameter on initial checkin, to be changed later on checkin? A default for new files based on the last few letters of the name? A sanity check whether the file is really of the type claimed?

Answered in order:

1,2) I propose an attribute,
3) a default of some sort helps as well, and doesn't stop the user from making changes (njs proposed .mtn-autoprops, and we already have something that detects binary... sorry, manual_merge files),
4) well, a sanity check might be a good thing, if we think it's needed. We don't have a sanity check of that sort today, though...

hendrik> Can we uncompress compressed files so as to better diff/merge the contents and recompress on checkout? This might be very helpful for openoffice files.

Uhmm, I seriously think this is way out of monotone's scope.

hendrik> How do we handle the transition between the current conventions and the new ones?

We currently do not convert by default. When the lua function get_linesep_conv() is undefined, monotone works as if it returned {'LF','LF'}, and when the two elements are exactly the same, monotone does no conversion whatsoever. None at all, as far as I remember (I can't be arsed to look at the code for the moment, so find out on your own or wait a day or two for me to confirm or take back that statement :-)).

hendrik> Are we currently storing files as unicode or UTF-8? (I think only admin information such as file names) Should we store text files as UTF-8?

Uh, I'm under the impression that we store as UTF-8. Damn, it's been too long since I looked at the code, my DRAM apparently needs a refresh...

Cheers,
Richard

-
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.
--
Richard Levitte [EMAIL PROTECTED]
Re: [Monotone-devel] line endings as project policy
On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote: > In message <[EMAIL PROTECTED]> on Tue, 21 Nov 2006 23:59:41 -0800, "Justin > Patrin" <[EMAIL PROTECTED]> said: > > papercrane> I haven't read the line endings with 0.31 thread yet > papercrane> but...ugh. Is it really necessary to mangle line endings > papercrane> when checking out files? I mean reallyshouldn't people > papercrane> just use a capable text editor if they're contributing to > papercrane> a project? > > If it was as easy as the editor. Trouble is, different systems have > different standards, and a lot of programmers know only one of the > systems with no understanding of the rest of the world (this goes for > Windows, Unix and VMS programmers alike, and I think this discussion > shows it). So far, I've seen editors make a mess (think notepad.exe), > at least one shell (/bin/sh on Solaris) barf all ovre the place when > it sniffs the presence of a CR, and at least one C compiler (don't > recall which, but it was fairly recent) do the same. > > As soon as you're dealing with software that transfers files between > different platforms, this becomes the eternal problem to deal with. > FTP had to. Editors are typically NOT the kind of software that > should need to deal with this kind of problem, because editors do NOT > typically transfer files between different platforms. Same goes for C > compilers, shells and so on. You can't blame them for being fed > something that completely unexpected for the system they live in. > Sticking our heads in the sand doesn't change this. > > My point is, it's really up to monotone to do something that's at > least sensible in most of the cases. Right now, as soon as you start > dealing with line endings (which is what you do as soon as you hack > the lua function get_linesep_conv()), you take a shot at screwing up, > royally. 
> > There are a few proposals I actually liked, and most of all, the > fella' that suggested monotone could check that line endings are > consistent for anything it suspects being text. > > Basically, it comes down to a few itams, some of them I regard as > fact, others I regard as questions: > > - We need to treat files as binary unless told otherwise. This I >regard as a fact. (see the problem with screwed up files without >the user knowing about it) Agree. This is an essential safety constraint. > > - We need to mark text files as such. This I regard as fact, and it >seemt to me like this is almost concensus. Agree. > > - We need to convert line endings to the local standard on anything >that's assumed to be text on checkout. This I regard as a fact. >(see the problem that some Unixly programs have with embedded \r) This seems obvious, but I have some discomfort with the idea. Perhaps because I'm thinking of the wider issues involved in character set incompatibility. IN any case, conversion on checkout should be overridable in some way. > - We need to make a choice, either we treat all files as binary and >only mark them as text and what line ending they seem to go by, or >we need to convert to some internal line ending standard. It seems >to me this is still a question, although most seem to lean toward >an internal line ending standard, which is what monotone does now. > > - IF we go for an internal line ending standard, we need to CHOOSE >one and stick with it, not have the user choose one for us. I >don't currently recall if it is already this way today or if we're >relying on the first element returned by get_linesep_conv(). If >it's the latter, we need to stop that. This I regard as fact. If we use an internal line ending standard, we should consider the possibility of using the standard newline character NEL, "Next Line", 0x85, unicode U+0085. > The rest, such as merge problems to deal with, will come and will have > to be treated when they do. 
But first, we need to make decisions and > stick by them. The discussion on line endings has popped up a little > now and then, and been left off with a few question marks and nothing > else happening, just to come up again a few months later. It's time > things get decided upon so we can actually get the work done, and I > don't believe in someone just doing and that be the winning thing, > because months later, there's gonna be a whiner who says we f*cked up > royally. whiner is almost an anagram of winner :-) > Let's get it right and reach consensus instead, well > grounded into are minds and our wills. To get it really well-grounded, we might also consider it in the context of character set conversion. Points that are easy to overlook with respect to line endings may be glaringly obvious in this larger context. Even if we don't solve the larger context, it may make decisions clear with the smaller one. > > So, anything I forgot? Just how do we mark files as being text in the data base? Wil
Re: [Monotone-devel] line endings as project policy
In message <[EMAIL PROTECTED]> on Wed, 22 Nov 2006 01:31:41 -0800, Nathaniel Smith <[EMAIL PROTECTED]> said:

njs> Door C: .mtn-autoprops?

Sounds like a good idea to me! (well, until there's something better, of course)

Cheers,
Richard
Re: [Monotone-devel] line endings as project policy
In message <[EMAIL PROTECTED]> on Tue, 21 Nov 2006 23:59:41 -0800, "Justin Patrin" <[EMAIL PROTECTED]> said:

papercrane> I haven't read the line endings with 0.31 thread yet but...ugh. Is it really necessary to mangle line endings when checking out files? I mean really, shouldn't people just use a capable text editor if they're contributing to a project?

If only it were as easy as the editor. Trouble is, different systems have different standards, and a lot of programmers know only one of the systems with no understanding of the rest of the world (this goes for Windows, Unix and VMS programmers alike, and I think this discussion shows it). So far, I've seen editors make a mess (think notepad.exe), at least one shell (/bin/sh on Solaris) barf all over the place when it sniffs the presence of a CR, and at least one C compiler (don't recall which, but it was fairly recent) do the same.

As soon as you're dealing with software that transfers files between different platforms, this becomes the eternal problem to deal with. FTP had to. Editors are typically NOT the kind of software that should need to deal with this kind of problem, because editors do NOT typically transfer files between different platforms. Same goes for C compilers, shells and so on. You can't blame them for being fed something that's completely unexpected for the system they live in. Sticking our heads in the sand doesn't change this.

My point is, it's really up to monotone to do something that's at least sensible in most of the cases. Right now, as soon as you start dealing with line endings (which is what you do as soon as you hack the lua function get_linesep_conv()), you take a shot at screwing up, royally.

There are a few proposals I actually liked, most of all the one from the fella who suggested monotone could check that line endings are consistent for anything it suspects of being text.

Basically, it comes down to a few items, some of which I regard as fact, others as questions:

- We need to treat files as binary unless told otherwise. This I regard as a fact. (see the problem with screwed up files without the user knowing about it)

- We need to mark text files as such. This I regard as fact, and it seems to me like this is almost consensus.

- We need to convert line endings to the local standard on anything that's assumed to be text on checkout. This I regard as a fact. (see the problem that some Unixly programs have with embedded \r)

- We need to make a choice: either we treat all files as binary and only mark them as text and what line ending they seem to go by, or we convert to some internal line ending standard. It seems to me this is still a question, although most seem to lean toward an internal line ending standard, which is what monotone does now.

- IF we go for an internal line ending standard, we need to CHOOSE one and stick with it, not have the user choose one for us. I don't currently recall if it is already this way today or if we're relying on the first element returned by get_linesep_conv(). If it's the latter, we need to stop that. This I regard as fact.

The rest, such as merge problems to deal with, will come and will have to be treated when they do. But first, we need to make decisions and stick by them. The discussion on line endings has popped up every now and then, and been left off with a few question marks and nothing else happening, just to come up again a few months later. It's time things get decided upon so we can actually get the work done, and I don't believe in someone just doing it and that being the winning thing, because months later, there's gonna be a whiner who says we f*cked up royally. Let's get it right and reach consensus instead, well grounded in our minds and our wills.

So, anything I forgot?

Cheers,
Richard
Re: [Monotone-devel] line endings as project policy
On Tue, Nov 21, 2006 at 11:14:45PM -0700, Derek Scherger wrote:

> So, would this be better as a (shared and versioned) project policy entry with line ending styles specified by file name patterns. It seems like it would handle the case of added files, that match some policy pattern, better. I'm not sure how policy *changes* would work though. i.e. suppose the eol policy for .sh files is set to LF after a few .sh files have already been added incorrectly. How do the pre-existing files get fixed up?

Door C: .mtn-autoprops?

-- Nathaniel

--
- Don't let your informants burn anything.
- Don't grow old.
- Be good grad students.
-- advice of Murray B. Emeneau on the occasion of his 100th birthday
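A pattern-based policy of the kind discussed here could look roughly like the sketch below. Nothing in the thread specifies a format for .mtn-autoprops, so everything in it is hypothetical: the table layout, the attribute name `eol`, its values, and the helper `attrs_for`. The only idea taken from the thread is "file name pattern decides the line-ending attribute when a file is added, and the table is versioned with the project".

```python
from fnmatch import fnmatchcase

# Hypothetical policy table: (file name pattern, attributes to set on add).
# The attribute name "eol" and its values are assumptions for this sketch.
AUTOPROPS = [
    ("*.sh", {"eol": "LF"}),      # shell scripts must keep LF everywhere
    ("*.c", {"eol": "native"}),   # convert to the platform's convention
    ("*.png", {}),                # no eol attribute: treated as binary
]


def attrs_for(path: str) -> dict:
    """Return attributes for a newly added file; the first match wins."""
    name = path.rsplit("/", 1)[-1]
    for pattern, attrs in AUTOPROPS:
        if fnmatchcase(name, pattern):
            return dict(attrs)
    return {}  # unknown files stay binary, i.e. untouched


print(attrs_for("scripts/build.sh"))
```

A table like this only covers files at the moment they are added. It does not answer Derek's question about files that were added before the policy existed.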
Re: [Monotone-devel] line endings as project policy
On 11/21/06, Derek Scherger <[EMAIL PROTECTED]> wrote:

> Having read the "line endings with 0.31" thread and having used the svn:eol-style property I have a vague feeling that line endings may be better specified using some aspect of the (yet to be defined) project policy stuff.
>
> In an svn project that I work on, we generally set svn:eol-style to LF for .sh files (otherwise "#! /bin/bash" ends up being "#! /bin/bash\r" on windows and the script doesn't run on cygwin). The problem with this is that every time someone adds a new shell script they have to remember to set the eol-style, or some other unlucky person gets to find this bug and fix it. This happens over and over again.
>
> Similarly, for .c files it can be painful to watch people fighting over line endings. One commit is CRLF, the next is LF, back and forth it goes. This must greatly reduce repository compression and does render diff(1) more or less useless, as often every line in the file has changed.
>
> Subversion has something called "autoprops" which allows you to specify which properties to set based on file name patterns, so that when new files are added they can be assigned the correct eol-style property. However, this is an unversioned client side setting that every committer must have set correctly and must keep up to date. In practice this doesn't scale to even moderately sized teams. In monotone this would be like having the setting in a lua hook and expecting everyone on the team to have the same hook.
>
> So, would this be better as a (shared and versioned) project policy entry with line ending styles specified by file name patterns. It seems like it would handle the case of added files, that match some policy pattern, better. I'm not sure how policy *changes* would work though. i.e. suppose the eol policy for .sh files is set to LF after a few .sh files have already been added incorrectly. How do the pre-existing files get fixed up?

Ugh. I haven't read the line endings with 0.31 thread yet but...ugh. Is it really necessary to mangle line endings when checking out files? I mean really, shouldn't people just use a capable text editor if they're contributing to a project?

--
Justin Patrin
Re: [Monotone-devel] line endings
On Thu, Feb 02, 2006 at 08:10:36AM -0500, Yury Polyanskiy wrote:

> And of course it'd be MUCH better to have get_linesep_conv() NOT called for binary files (or at least for manual_merge files as a quick hack).

For a bit of history... when manual_merge was added, the argument was to explicitly _avoid_ trying to come up with the Ultimate Design For All Things Content Sensitive. The argument was that manual merging was one particular thing one might want (even on files that are textual! E.g., if you have a special-purpose merger for some XML documents).

So manual_merge and line ending conversion should perhaps be kept orthogonal.

-- Nathaniel

--
.i dei jitfa fanmo xatra
Re: [Monotone-devel] line endings
On Wednesday 01 February 2006 21:41, Ethan Blanton wrote:

> With this condition, a correct definition of the end-of-line sequence for the current platform,

I don't think Monotone should count on files obeying the platform's end-of-line sequence. I mostly work in Windows, but even so I have to work with LF-only files fairly often. Today I had to debug an Adobe FDF file that had LF line ends, even though the web site that generated it and the web browser that saved it were both on Windows machines. If I were to save that into Monotone, I think I would want it treated as text, not binary.
Re: [Monotone-devel] line endings
On Thu, 2006-02-02 at 07:42 +0100, Richard Levitte - VMS Whacker wrote:

> I must say that I feel nervous handing the decision to a lua hook. There's the strong possibility that a user somewhere will get "creative" and that chaos will follow. If there's an attribute saying that a file shouldn't be transformed, the internals of monotone should be able to detect that and avoid doing anything with that file.

Agree-agree! Everything about the project should be committable and stored in the db.

And of course it'd be MUCH better to have get_linesep_conv() NOT called for binary files (or at least for manual_merge files as a quick hack).

Cheers,
-up
Re: [Monotone-devel] line endings
In message <[EMAIL PROTECTED]> on Thu, 02 Feb 2006 00:11:51 -0500, Yury Polyanskiy <[EMAIL PROTECTED]> said:

ypolyans> Well I think per-file certificate is something that doesn't fit in monotone's model (certificates are attached to revisions not files).

I think they're called "attributes" :-). Howard did say he is new at this...

ypolyans> In fact I think current implementation with get_linesep_conv() includes every other model. IF only I could check for "manual_merge" attribute from inside the hook. As nobody answers I assume monotone doesn't have internal mapping filename->attributes which can be checked from hook.

I think your guess is correct, and that's a shame.

ypolyans> BTW, I don't understand all the buzz about bad binary files handling. Maybe monotone COULD improve heuristics about detecting binary files but in any case user has complete control over the process by setting manual_merge attribute.

As it is right now, it does the wrong thing. When you commit, read_localized_data() is used to read the contents of each file. That function checks the values returned from get_linesep_conv(), and if they differ, it will have the lines split and then joined using the first of the two values returned by get_linesep_conv(). As you just said yourself, there's no way to check if the "manual_merge" attribute is set for that file (as far as I understand).

ypolyans> Overall I see two problems with line endings by now:
ypolyans> (a) conversion should not touch CR's if I want to convert LF's only.

I'm playing with that, even though I disagree with the principle.

ypolyans> (b) it should be possible to check for manual_merge attribute from hook

Agreed, big time. (I only disagree with the attribute name, but that's a minor issue)

ypolyans> (c) default get_linesep_conv() should return {"LF", SYSTEM_SEP } for all files which are not marked by manual_merge attr.

I disagree with the return value. No hook should be able to say what the internal line ending should be. In my opinion, it should only return SYSTEM_SEP. And for completeness, it should be able to return "" to say that this file shouldn't be touched.

I must say that I feel nervous handing the decision to a lua hook. There's the strong possibility that a user somewhere will get "creative" and that chaos will follow. If there's an attribute saying that a file shouldn't be transformed, the internals of monotone should be able to detect that and avoid doing anything with that file.

ypolyans> That I think would satisfy everyone and does not (except for b) imply a great change in current code.

OBTW, you mentioned a while ago that files with no line ending at the end of the file would get one added because of the way split_into_lines() and join_lines() work. I've been playing with that, and it's a lot tougher to change than you might think. The diff code depends on this behavior, and possibly the merge code in diff-patch.cc as well. I'm currently playing around with the diff code, and frankly, I cringe at the thought of touching the merge code for this...

Cheers,
Richard
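The commit-time behaviour described in this message can be modelled in a few lines. This is a simplified sketch based only on the description in the thread, not a transcription of monotone's C++ code: `convert_on_commit` is a hypothetical name, and the assumption that splitting happens on bare CR as well as LF and CRLF is inferred from complaint (a) rather than checked against the source.

```python
# Simplified model of the conversion described in the thread; not
# monotone's actual code. convert_on_commit is a hypothetical name.

SEPARATORS = {"LF": b"\n", "CRLF": b"\r\n", "CR": b"\r"}


def convert_on_commit(data: bytes, internal: str = "LF",
                      external: str = "LF") -> bytes:
    """Mimic split-then-join using the first (internal) separator."""
    if internal == external:
        # Identical pair, e.g. the default {'LF','LF'}: bytes pass through.
        return data
    # Assumption: lines are split on \n, \r and \r\n alike.
    lines = data.splitlines()
    sep = SEPARATORS[internal]
    # Every line gets a terminator, including the last one.
    return b"".join(line + sep for line in lines)


# Default configuration: nothing is touched.
print(convert_on_commit(b"a\r\nb"))
# Conversion enabled: CRLF becomes LF and a final newline is added.
print(convert_on_commit(b"a\r\nb", "LF", "CRLF"))
# A binary file (here the 8-byte PNG signature) is silently altered.
print(convert_on_commit(b"\x89PNG\r\n\x1a\n", "LF", "CRLF"))
```

The last call is the case the thread worries about: without a way to consult a per-file attribute, the conversion cannot tell a PNG header from a text line.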
Re: [Monotone-devel] line endings
In message <[EMAIL PROTECTED]> on Wed, 01 Feb 2006 18:59:12 -0800, Howard Spindel <[EMAIL PROTECTED]> said:

howard> 2. Database file is LF line terminated, transform LFs to platform specific in/out
howard> 3. Database file is CR/LF line terminated, transform CR/LFs to platform specific in/out

I think we established a while ago that there should only be one possible line ending in the database. Consider what happens when two databases with different line ending standards are synchronised.

Cheers,
Richard
Re: [Monotone-devel] line endings
Well, I think a per-file certificate is something that doesn't fit in monotone's model (certificates are attached to revisions, not files).

In fact I think the current implementation with get_linesep_conv() includes every other model IF only I could check for the "manual_merge" attribute from inside the hook. As nobody answers, I assume monotone doesn't have an internal filename->attributes mapping which can be checked from a hook.

BTW, I don't understand all the buzz about bad handling of binary files. Maybe monotone COULD improve its heuristics for detecting binary files, but in any case the user has complete control over the process by setting the manual_merge attribute.

Overall I see two problems with line endings by now:

(a) conversion should not touch CR's if I want to convert LF's only.
(b) it should be possible to check for the manual_merge attribute from a hook
(c) the default get_linesep_conv() should return {"LF", SYSTEM_SEP } for all files which are not marked by the manual_merge attr.

That, I think, would satisfy everyone and does not (except for b) imply a great change in the current code.

BTW, can anyone who knows the rosters stuff say if "better attributes handling" may now help with (b) in the above?

Yury.

On Wed, 2006-02-01 at 18:59 -0800, Howard Spindel wrote:

> It seems to me that to be completely general monotone needs a per-file certificate specifying the permissible transformations. Examples of the settings available in that certificate:
>
> 1. Never transform anything
> 2. Database file is LF line terminated, transform LFs to platform specific in/out
> 3. Database file is CR/LF line terminated, transform CR/LFs to platform specific in/out
> 4. Whatever additional combinations are possible (does monotone have to worry about big-endian vs little-endian ever?)
>
> "Platform specific" could be determined by system call, overrideable by hook.
>
> Monotone would request the correct setting on initial file check-in. For backwards compatibility, if the certificate does not exist monotone does whatever it does now but requests the correct setting on next file check-in.
>
> The problem I see in the discussions so far is not in knowing what to do on the platform you're on, but in knowing what's stored in the database. Monotone could use heuristics to determine that, but if the heuristics fail the user gets a big, maybe nasty surprise.
>
> In a project development environment where everyone uses the same OS, "never transform anything" should work well for all files and perhaps there should be a way to tell monotone that this case exists.
>
> Since I know nothing of monotone internals, I acknowledge that I could be way off-base on something here.
>
> Howard
Re: [Monotone-devel] line endings
On 2/1/06, Ethan Blanton <[EMAIL PROTECTED]> wrote:

> Muddying the waters by arbitrarily complicating database representation (e.g., allowing various end-of-line representations in the database) does not seem profitable to me.

I agree. Especially since he missed a third option, which is CR terminated lines (mac).

--
Justin Patrin
Re: [Monotone-devel] line endings
Howard Spindel spake unto us the following wisdom:

> It seems to me that to be completely general monotone needs a per-file certificate specifying the permissible transformations. Examples of the settings available in that certificate:
>
> 1. Never transform anything
> 2. Database file is LF line terminated, transform LFs to platform specific in/out
> 3. Database file is CR/LF line terminated, transform CR/LFs to platform specific in/out
> 4. Whatever additional combinations are possible (does monotone have to worry about big-endian vs little-endian ever?)

I think rules like this are not necessarily an improvement. It is obvious that no one conversion rule solves everything. It seems clear to me that the only 100% Reversible and Safe method of storing line endings would be to use something like byte stuffing, so that after canonicalization no bare internal-line-ending marker appears in a text file unless it was an actual end-of-line marker on the platform where the file was created.

With this condition, a correct definition of the end-of-line sequence for the current platform, and a way to say "this file has no end-of-line markers (is binary)", the problem is solved. Without this condition, there are a number of heuristics which can be used (and are used in many systems) to approximate the correct behavior, and which work in Most Cases.

Muddying the waters by arbitrarily complicating database representation (e.g., allowing various end-of-line representations in the database) does not seem profitable to me.

Ethan

--
The laws that forbid the carrying of arms are laws [that have no remedy for evils]. They disarm only those who are neither inclined nor determined to commit crimes.
-- Cesare Beccaria, "On Crimes and Punishments", 1764
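The byte-stuffing idea can be made concrete with a small sketch. It is one possible reading of the proposal, not a design anyone in the thread agreed on: the escape byte 0x1B, the two escape codes, and the function names `to_internal` and `from_internal` are all arbitrary choices made for illustration, and LF is simply assumed to be the internal marker.

```python
# One possible reading of the byte-stuffing proposal; the escape byte,
# the escape codes and the function names are assumptions of this sketch.

ESC = b"\x1b"  # escape byte, chosen arbitrarily
LF = b"\n"     # assumed internal line-ending marker


def to_internal(data: bytes, platform_eol: bytes) -> bytes:
    """Canonicalize real line ends to LF, escaping any bare LF or ESC."""
    chunks = data.split(platform_eol)
    stuffed = [
        # Escape ESC first so the ESC bytes added for LF are not re-escaped.
        chunk.replace(ESC, ESC + b"0").replace(LF, ESC + b"1")
        for chunk in chunks
    ]
    return LF.join(stuffed)


def from_internal(data: bytes, platform_eol: bytes) -> bytes:
    """Exact inverse of to_internal for the same platform_eol."""
    chunks = data.split(LF)
    plain = [
        chunk.replace(ESC + b"1", LF).replace(ESC + b"0", ESC)
        for chunk in chunks
    ]
    return platform_eol.join(plain)


# A CRLF file that also contains a bare LF and a literal ESC byte.
original = b"a\r\nb\nc\x1bd\r\n"
stored = to_internal(original, b"\r\n")
print(stored)
print(from_internal(stored, b"\r\n") == original)
```

The price of this reversibility is that the stored bytes are no longer the bytes the user committed, so a hash of the stored form would not match a hash of the working file.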
Re: [Monotone-devel] line endings and binary files revisited...
That's handy to know -- I might need to use my repositories on all 3 of these platforms, soon.

On Sat, 2005-05-21 at 18:19 -0700, Nathaniel Smith wrote:
[...]
> The current policy is:
> -- by default, don't touch line endings at all. In fact, don't touch data at all, we store uninterpreted bytestreams.
> -- if the user requests, they can have line endings converted one way going out of their working copy, and back again when going into their working copy. I don't know if there's a way to make this apply only to specific files, though.
>
> -- Nathaniel
Re: [Monotone-devel] line endings and binary files revisited...
In message <[EMAIL PROTECTED]> on Sat, 21 May 2005 18:19:36 -0700, Nathaniel Smith <[EMAIL PROTECTED]> said:

njs> The current policy is:
njs> -- by default, don't touch line endings at all. In fact, don't touch data at all, we store uninterpreted bytestreams.

Oh, I guess I had misunderstood it...

njs> -- if the user requests, they can have line endings converted one way going out of their working copy, and back again when going into their working copy. I don't know if there's a way to make this apply only to specific files, though.

Thanks for clarifying.

--
Richard Levitte [EMAIL PROTECTED]
Re: [Monotone-devel] line endings and binary files revisited...
On Sun, May 22, 2005 at 01:56:35AM +0200, Richard Levitte - VMS Whacker wrote:

> I have the feeling that however good the detection is, there's always a possibility that a non-text file will be interpreted as a text file. As long as you're playing entirely in Unix, it's not a problem, since the line ending is a \n, so there will be no conversion. On Windows or on Mac, where the line ending is \r\n and \r respectively, the matter is different.
>
> I'm thinking that files should be stored entirely unchanged in the database, and *possibly* be interpreted on output. It has the benefit of not being destructive, which I think would be a good thing...

The current policy is:

-- by default, don't touch line endings at all. In fact, don't touch data at all, we store uninterpreted bytestreams.
-- if the user requests, they can have line endings converted one way going out of their working copy, and back again when going into their working copy. I don't know if there's a way to make this apply only to specific files, though.

-- Nathaniel

--
Eternity is very long, especially towards the end.
-- Woody Allen