Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
[ On Thursday, July 1, 2004 at 14:33:11 (-0700), Paul Sander wrote: ]
> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>
> What are you talking about? I can think of only two ways that CVS
> "uses" the deltas:

Well, as usual you got off on the wrong track right from the start again. What part of "RCS Compatible" have you misunderstood this time?

> CVS has a notoriously poor diff and merge capability.

Well, that seems to depend entirely on your point of view and your propensity to try to use the wrong (or at least the least suitable) tool for the job.

I and no doubt millions of other people have been incredibly satisfied with the extremely wide applicability of the unix diff and merge algorithms. It's almost a miracle that they work so well for such a variety of different kinds of text files -- or maybe it's just an indication of how well designed they are for dealing with the vast majority of forms of representation of human-readable information.

Of course you don't have to like them, but you do have to accept that they are integral to RCS and thus integral to CVS. Go away and go play with xdelta and friends if you want.

--
Greg A. Woods
+1 416 218-0098  VE3TCP  RoboHack <[EMAIL PROTECTED]>
Planix, Inc. <[EMAIL PROTECTED]>  Secrets of the Weird <[EMAIL PROTECTED]>

___
Info-cvs mailing list
[EMAIL PROTECTED]
http://lists.gnu.org/mailman/listinfo/info-cvs
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>--- Forwarded mail from [EMAIL PROTECTED]

>[ On Monday, June 28, 2004 at 14:58:03 (-0700), Mark D. Baushke wrote: ]
>> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>>
>> Yes, but diff is not diff3. diff is used for the
>> delta format. diff3 is used by rcsmerge, not for
>> fundamental version deltas.

>I think you're confused -- the differencing algorithms used are
>fundamentally intertwined (and fundamentally based on units of lines of
>text).

This is true insofar as they are needed to maintain the integrity of the RCS files and to reconstruct complete versions.

>Pretending you can do merges using some other algorithm while still
>trying to store your deltas in unix diff format is just leading everyone
>down the garden path to a dark dank corner no-one really wants to be in.

What do we care what format the versions are stored in, as long as we can recover the complete files and apply any tool we want to them?

Although I can imagine such a thing, I don't know of any merge tool that reads the ed-like scripts produced by the diff program and presents a user interface to apply or omit specific deltas to an input file. It's an interesting idea, and it might even be useful, but its utility is limited.

On the other hand, reconstructing entire versions and applying content-specific tools is far more useful. For example, there is research on hierarchical differencing algorithms that compare tree-like structures like the ones produced by parsers of programming languages. I foresee that this will lead to a new wave of merge tools that provide a much higher level of utility than line-based tools like diff3. This kind of work just isn't possible with line-based deltas produced by the diff program.

(It's also possible that they could lead to a new wave of archivers that provide RCS-like capability but use the hierarchical diffs in the deltatext records, which will be interesting. But nobody's suggesting a possible replacement of the RCS layer just yet.)
>The uniform use of differencing algorithms and their corresponding merge
>algorithms (which are of course just "editing" scripts), is what makes
>it worthwhile to use something like RCS as the foundation for CVS in the
>first place.

It's what makes it possible for systems like RCS to exploit the similarity of sequential versions for efficient storage, to be sure. But applying a delta to reconstruct a version is very different from doing a content merge of two or three fully reconstructed files.

>I.e. it is not sufficient to just use the RCS delta format as a means of
>archive compression -- that format is integral to the whole idea of
>detecting, reporting, and merging, changes in any RCS-compatible tool.

Once again, no one is suggesting changing the way that RCS works.

>> Are there really utilities out there that try to
>> to read RCS formats directly and do not allow for
>> rcsfile(5) syntax to be used? If so, could you
>> name any of them?

>Humans, for one. :-)

>(I know some folks can do manual merges of SCCS files, and though the
>same techniques won't work quite so well on RCS files because of the
>reverse delta thing, there are still a great many other valid reasons to
>read and even repair RCS files by hand.)

>There are a number of commercial software packages which are "GNU RCS
>compatible", apparently without using RCS source code, with the most
>"popular" perhaps being CS-RCS (though I've not confirmed 100% that it
>does not use RCS source code). SourceCodeManager is apparently another,
>and P4D yet another.

>Perforce also uses RCS compatible files as its archive format, but I'm
>not sure if its core RCS handling was derived from RCS source code or not.

>I think I've just scratched the surface too, if any of the rumours I've
>heard are close to true.

Well, if these tools are truly "RCS compatible" then they should be able to ignore the newphrases we've been talking about. And since there is no proposal to change the format of the deltatext phrases, or any of the other standard components of an RCS file, those tools should continue to work.

BTW, I have also written a couple of tools that parse the RCS file syntax. They conform to the rcsfile format and should tolerate extensions made as newphrases as specified. I have also seen commercial tools derived from RCS (specifically, the MKS variety) that have made proprietary extensions and are no longer compatible with the GNU standard.

>--- End of forwarded message from [EMAIL PROTECTED]
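To illustrate what "tolerating newphrases" can mean for a tool that reads RCS files directly, here is a minimal Python sketch. It assumes a simplified tokenizer (strings delimited by @ with @@ as the escape, `;` and `:` as separators) rather than the full rcsfile(5) grammar, and it only walks the admin section. The function names and the `admin-ext` keyword in the sample are hypothetical, chosen for illustration.

```python
import re

# Simplified token classes: @-delimited strings (with @@ as an escaped @),
# the separators ';' and ':', and everything else as a bare word.
TOKEN = re.compile(r"@(?:[^@]|@@)*@|[;:]|[^\s;:@]+")

# Admin keywords listed in rcsfile(5); anything else is treated as a newphrase.
KNOWN_ADMIN = {"head", "branch", "access", "symbols", "locks",
               "strict", "comment", "expand"}


def admin_phrases(text):
    """Return [(keyword, [words...])] for each phrase in the admin section.

    The admin section is assumed to end at the first revision number
    (or at 'desc') that appears where a keyword is expected.
    """
    tokens = TOKEN.findall(text)
    phrases = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "desc" or re.fullmatch(r"[0-9.]+", tokens[i]):
            break
        end = tokens.index(";", i)  # a ';' inside an @string@ is not a token
        phrases.append((tokens[i], tokens[i + 1:end]))
        i = end + 1
    return phrases


def unknown_phrases(text):
    """Keywords a tolerant reader would skip instead of rejecting the file."""
    return [kw for kw, _ in admin_phrases(text) if kw not in KNOWN_ADMIN]


SAMPLE = """head 1.4;
access;
symbols;
locks; strict;
comment @# @;
admin-ext @this is an admin; extension.@;

1.4
date 2004.06.29.09.08.54; author paul; state Exp;
"""
```

Calling `unknown_phrases(SAMPLE)` reports the extension keyword without disturbing the standard phrases. A reader built this way keeps working when a file gains new phrases, which is the behaviour being claimed for RCS-compatible tools.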
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>--- Forwarded mail from [EMAIL PROTECTED]

>[ On Monday, June 28, 2004 at 19:02:19 (-0700), Paul Sander wrote: ]
>> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>>
>> I have never, ever advocated changing the format of an RCS file in a
>> way that would break the ci, co, rcs, or rlog programs. And although
>> I strongly advocate the replacement of user-exposed diff and merge
>> tools, I have never, ever advocated the replacement of the diff tool
>> that computes the deltas stored in an RCS file.

>Indeed -- instead you would rather use different algorithms for storing
>deltas and for using them.

>That would be just plain stupid, if indeed not eventually dangerous to
>the integrity of a repository.

What are you talking about? I can think of only two ways that CVS "uses" the deltas: Reconstructing complete versions, and annotating version history. For the purposes of this thread, which started out with diffing and merging files, the tools require reconstructed versions.

Of course, the algorithms that produce the deltas and reconstruct the original data must agree. But that's all below the RCS API and is completely invisible to the user. Once the user has two or three complete files, he can apply any diff or merge algorithm he wants to those files.

Recall the following sequence of operations:

    co -pancestor file,v > a
    co -pcontributor file,v > c
    diff3 -E file a c

Once again, the algorithms and data formats that maintain the integrity of the RCS files are hidden away from the user by the co and ci programs. The user can replace the invocation of diff3 with any tool that he chooses to perform the content merge. Once done, the user uses ci to produce a new delta in the RCS files, using the very algorithm that produces the correct data for subsequent invocations of co.

There's absolutely no danger to the integrity of the RCS file, unless someone mucks with the innards of co or ci. And nobody is even hinting that making such changes is desirable, at least with respect to the deltatext phrases in the RCS files. (There have been several recommendations to exploit the areas of the rcsfile format that explicitly permit extensions, but extensions of this nature have absolutely no effect on RCS' ability to store and reconstruct versions, which I have demonstrated in a separate message.)

>The tools we now have for calculating and handling deltas are all
>designed to work _together_, not in isolation of each other, and that
>uniformity is as valuable to CVS as it is to RCS alone, if not more so.

What tools, specifically (and I mean, you need to name them and include pointers to them so that the rest of us can look), are you talking about? The RCS programs and CVS in its current implementation are the obvious ones, and my comments withstand scrutiny on those. What else are you referring to?

>How about you go off and spend the next, say, two years or so
>intensively using such a scheme as you propose on a massively huge
>variety of projects. That should give you about 10% of the experience
>the rest of the world has with using diff and diff3 and rcsmerge
>uniformly for both purposes.

>Then if you still think it's wise to use disparate techniques for
>storing deltas and for using deltas then you can show your results and
>raise your proposal here again.

>In the mean time please keep in mind that there are not just a plethora
>of tools for using diff-style deltas, but there's also an enormous
>amount of human experience with them too.

I look forward to seeing your list of references, so that we can debate the relative value of interpreting ed-like scripts for a least-common-denominator level of functionality, versus parsing the entire content of a reconstructed file and applying domain-specific algorithms that understand the type of data stored there.
>You (and a few others) seem to want to throw the baby out with the bath
>water, and all just so that a few harebrained and lame mis-uses of CVS
>will work "better". In the mean time if you (and others) had learned to
>use the best tool for the job in the first place then you'd never have
>had to dream up such a half-baked idea.

CVS has a notoriously poor diff and merge capability. Integrating the user-exposed features with better tools is a very good example of using the best tool for the job. And it's not a half-baked idea; the whole idea of plug-ins is well established in the industry, and its feasibility in CVS is proven.

>--- End of forwarded message from [EMAIL PROTECTED]
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
[ On Monday, June 28, 2004 at 14:58:03 (-0700), Mark D. Baushke wrote: ]
> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>
> Yes, but diff is not diff3. diff is used for the
> delta format. diff3 is used by rcsmerge, not for
> fundamental version deltas.

I think you're confused -- the differencing algorithms used are fundamentally intertwined (and fundamentally based on units of lines of text).

Pretending you can do merges using some other algorithm while still trying to store your deltas in unix diff format is just leading everyone down the garden path to a dark dank corner no-one really wants to be in.

The uniform use of differencing algorithms and their corresponding merge algorithms (which are of course just "editing" scripts), is what makes it worthwhile to use something like RCS as the foundation for CVS in the first place.

I.e. it is not sufficient to just use the RCS delta format as a means of archive compression -- that format is integral to the whole idea of detecting, reporting, and merging, changes in any RCS-compatible tool.

> Are there really utilities out there that try to
> to read RCS formats directly and do not allow for
> rcsfile(5) syntax to be used? If so, could you
> name any of them?

Humans, for one. :-)

(I know some folks can do manual merges of SCCS files, and though the same techniques won't work quite so well on RCS files because of the reverse delta thing, there are still a great many other valid reasons to read and even repair RCS files by hand.)

There are a number of commercial software packages which are "GNU RCS compatible", apparently without using RCS source code, with the most "popular" perhaps being CS-RCS (though I've not confirmed 100% that it does not use RCS source code). SourceCodeManager is apparently another, and P4D yet another.

Perforce also uses RCS compatible files as its archive format, but I'm not sure if its core RCS handling was derived from RCS source code or not.

I think I've just scratched the surface too, if any of the rumours I've heard are close to true.

--
Greg A. Woods
+1 416 218-0098  VE3TCP  RoboHack <[EMAIL PROTECTED]>
Planix, Inc. <[EMAIL PROTECTED]>  Secrets of the Weird <[EMAIL PROTECTED]>
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
[ On Tuesday, June 29, 2004 at 02:18:26 (-0700), Paul Sander wrote: ]
> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>
> >I.e. How do you propose to make it possible for the standard RCS tools
> >alone to re-create _every_ revision from all files created by this
> >hacked system?
>
> Simple: The delta text would not change. See above.

It would be extremely short-sighted, if not downright stupid, to not keep the delta format compatible with that used by the new delta tools.

You seem to have no appreciation whatsoever for the depth and breadth to which this format (and its easily computed variants) is used and understood.

--
Greg A. Woods
+1 416 218-0098  VE3TCP  RoboHack <[EMAIL PROTECTED]>
Planix, Inc. <[EMAIL PROTECTED]>  Secrets of the Weird <[EMAIL PROTECTED]>
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
[ On Monday, June 28, 2004 at 19:02:19 (-0700), Paul Sander wrote: ]
> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>
> I have never, ever advocated changing the format of an RCS file in a
> way that would break the ci, co, rcs, or rlog programs. And although
> I strongly advocate the replacement of user-exposed diff and merge
> tools, I have never, ever advocated the replacement of the diff tool
> that computes the deltas stored in an RCS file.

Indeed -- instead you would rather use different algorithms for storing deltas and for using them.

That would be just plain stupid, if indeed not eventually dangerous to the integrity of a repository.

The tools we now have for calculating and handling deltas are all designed to work _together_, not in isolation of each other, and that uniformity is as valuable to CVS as it is to RCS alone, if not more so.

How about you go off and spend the next, say, two years or so intensively using such a scheme as you propose on a massively huge variety of projects. That should give you about 10% of the experience the rest of the world has with using diff and diff3 and rcsmerge uniformly for both purposes.

Then if you still think it's wise to use disparate techniques for storing deltas and for using deltas then you can show your results and raise your proposal here again.

In the mean time please keep in mind that there are not just a plethora of tools for using diff-style deltas, but there's also an enormous amount of human experience with them too.

You (and a few others) seem to want to throw the baby out with the bath water, and all just so that a few harebrained and lame mis-uses of CVS will work "better". In the mean time if you (and others) had learned to use the best tool for the job in the first place then you'd never have had to dream up such a half-baked idea.

--
Greg A. Woods
+1 416 218-0098  VE3TCP  RoboHack <[EMAIL PROTECTED]>
Planix, Inc. <[EMAIL PROTECTED]>  Secrets of the Weird <[EMAIL PROTECTED]>
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>--- Forwarded mail from [EMAIL PROTECTED]

>Paul Sander <[EMAIL PROTECTED]> writes:

>> >--- Forwarded mail from [EMAIL PROTECTED]
>>
>> Rather than use a hint to expose an
>> implementation detail, I suggest recording a
>> data type instead. Maybe even a MIME type. Then
>> provide a suitable mechanism to map data types
>> to tools that are appropriate to the
>> environment.

>I have no fundamental objection to saving the MIME
>type. I suggest that it may need to be inside of a
>string to pass the syntax of rcsfile(5). I would
>actually suggest that it might be useful to just
>borrow both of the MIME media-type and charset
>concepts. That might allow for a

>  "media-type text/plain;"
>  "charset ks_c_5601-1987;"

>on a given file... the defaults should probably
>be "text/plain" and iso-8859-1 or utf-8

Do you propose that the media-type be valid on its own, for data types where charsets have no meaning? Or put another way, is the charset solely to provide additional processing hints to supplement the media-type, or is the charset also required?

>> >Given that this would appear to be the desire of
>> >at least a few folks out there who might want to
>> >make CVS do a better job at merging structured
>> >ASCII files such as XML or HTML format. And
>> >further, that you seem to have objections to this
>> >approach. And while I have known you to bring up
>> >points I have overlooked in the past...
>>
>> Not just structured ASCII files as you describe,
>> but any file containing structured data for
>> which a merge tool is available.

>Ahh, but I am not really trying to suggest that
>"binary files" are suitable in the general case
>for CVS control. That is a separate argument.

Fair enough, but the practice is more common than anyone wants to admit. The issue must be faced at some point.
>That said, I suppose that a merge utility that
>understands how to merge a file containing lines
>in a non-ISO-LATIN character set might also fall
>into the category of a diff3 replacement and that
>such files might be considered 'binary' by some
>programs.

Indeed.

>> >This time around I just do not see anything that
>> >would preclude such an approach of using an
>> >external diff3 hint 'replacement' program for
>> >doing a 'cvs update -jtag1 -jtag2' operation.
>>
>> >I will stipulate that such a program will likely
>> >need to live on the server and furthermore that it
>> >would not be interactive. In the absence of
>> >finding such a program, CVS would likely resort to
>> >using diff3 as a fallback, so its arguments would
>> >likely need to match those of the diff3 program
>> >itself... at least to the extent that cvs currently
>> >uses various arguments to diff3.
>>
>> I don't believe that such a program MUST live on
>> the server.

>The changes needed to allow the client-side to do
>a merge are very large. I am not willing to
>stipulate an implementation that would allow CVS
>to deal with an interactive merge operation for a
>random 'cvs update' command. The repository would
>have a lock open for too long in that case.

Yes, to avoid long-lived locks, the necessary files must be copied to the client before the merge begins. This would involve a significant change to the client, but I'm not convinced that it would be a significant change to the server. The server already has the ability to send whole revisions to the client, and it need not be involved with the merge once it starts.

>> Merge tools, like editors, have a way of
>> becoming religious icons, in situations where
>> users have a choice. Under such circumstances,
>> it becomes important to have client side
>> mappings between data types and merge tools.

>Your arguments almost help to make a case in
>Greg's favor against allowing a diff3 replacement.

Horrors! I sure hope not!
:-) >The kind of flexibility you desire is not >something that I think makes sense to bolt into >the 'diff3' slot. Then bolt in a wrapper that reads the user's environment and invokes a suitable merge tool based on preferences that are found there. And provide a default, like diff3, if such information is missing. >What you propose would potentially best be handled >with an entirely new kind of update paradigm. >Possibly the use of a CVS/Base/file file and a >'patch' that would bring CVS/Base/file up to the >latest version would be 'better' in this case... Whatever's most efficient to get the other contributor and common ancestor to the client. Clean-up needs to be considered as well. >> Additionally, I don't believe that merge tools >> necessarily need to be fully automated. >Here we do not agree. Without such automation, >lock contention on directories could get very >intense. Again, running the merge after relevant data have been copied out and freeing the locks would remove this issue. Actually, the ancestor and contributor are checked-in versions, and they're known in advance either by version number or branch/timestamp. Correct me if I'm wrong her
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
Paul Sander <[EMAIL PROTECTED]> writes:

> >--- Forwarded mail from [EMAIL PROTECTED]
>
> Rather than use a hint to expose an
> implementation detail, I suggest recording a
> data type instead. Maybe even a MIME type. Then
> provide a suitable mechanism to map data types
> to tools that are appropriate to the
> environment.

I have no fundamental objection to saving the MIME type. I suggest that it may need to be inside of a string to pass the syntax of rcsfile(5). I would actually suggest that it might be useful to just borrow both of the MIME media-type and charset concepts. That might allow for a

  "media-type text/plain;"
  "charset ks_c_5601-1987;"

on a given file... the defaults should probably be "text/plain" and iso-8859-1 or utf-8

> BTW, CVS no longer uses rcsmerge; it co's the
> necessary versions and runs diff3 directly. So
> in a CVS context, pushing this capability down
> to RCS isn't really a requirement. However, I
> recognize the usefulness of doing so, and would
> not oppose such a feature. On the other hand,
> doing so will likely be a duplication of effort
> because CVS has client/server concerns that RCS
> does not, and that may necessitate a different
> implementation.

Yes, I am aware that CVS no longer uses rcsmerge. However, Greg was suggesting that RCS compatibility would be broken by an extension such as the one outlined in the thought experiment I provided, so I felt it reasonable to mention how RCS itself used diff3 in the past.

> >Given that this would appear to be the desire of
> >at least a few folks out there who might want to
> >make CVS do a better job at merging structured
> >ASCII files such as XML or HTML format. And
> >further, that you seem to have objections to this
> >approach. And while I have known you to bring up
> >points I have overlooked in the past...
>
> Not just structured ASCII files as you describe,
> but any file containing structured data for
> which a merge tool is available.
Ahh, but I am not really trying to suggest that "binary files" are suitable in the general case for CVS control. That is a separate argument. That said, I suppose that a merge utility that understands how to merge a file containing lines in a non-ISO-LATIN character set might also fall into the category of a diff3 replacement and that such files might be considered 'binary' by some programs. > >This time around I just do not see anything that > >would preclude such an approach of using an > >external diff3 hint 'replacement' program for > >doing a 'cvs update -jtag1 -jtag2' operation. > > >I will stipulate that such a program will likely > >need to live on the server and furthermore that it > >would not be interactive. In the absense of > >finding such a program, CVS would likely resort to > >using diff3 as a fallback, so its arguments would > >likely need to match those of the diff3 program > >itself... at least to the extent that cvs currently > >uses various arguments to diff3. > > I don't believe that such a program MUST live on > the server. The changes needed to allow the client-side to do a merge are very large. I am not willing to stipulate an implementation that would allow CVS to deal with an interactive merge operation for a random 'cvs update' command. The repository would have a lock open for too long in that case. > Merge tools, like editors, have a way of > becoming religious icons, in situations where > users have a choice. Under such circumstances, > it becomes important to have client side > mappings between data types and merge tools. Your arguments almost help to make a case in Greg's favor against allowing a diff3 replacement. The kind of flexibility you desire is not something that I think makes sense to bolt into the 'diff3' slot. What you propose would potentially best be handled with an entirely new kind of update paradigm. 
Possibly the use of a CVS/Base/file file and a 'patch' that would bring CVS/Base/file up to the latest version would be 'better' in this case... > Additionally, I don't believe that merge tools > necessarily need to be fully automated. Here we do not agree. Without such automation, lock contention on directories could get very intense. > After the relevant versions have been downloaded > to the client (and the repository locks have > been cleared), the merge tools can run > interactively. However, I believe that CVS > current intersperses merges with downloads, and > that would need to change before interactive > merges can be supported. The current CVS operations all occur on the server side prior to downloading patches to the client. What you are suggesting is a fairly major overhaul to the cvs client/server protocol and as such there is probably a 'better' way to deal with this than a 'simple' alternative table of diff3-style programs to do alternative merger algorithms. > Also, CVS currently relies on diff3-style > mark-ups to warn the user when merge conflicts >
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>--- Forwarded mail from [EMAIL PROTECTED]

>[ On Monday, June 28, 2004 at 01:44:36 (-0700), Mark D. Baushke wrote: ]
>> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>>
>> The RCS format is very extensible and in fact the
>> CVSNT folks have extended it already and I have had
>> no problems using CVSNT repositories in conjunction
>> with either CVS or RCS.

>"very" is an over-statement of the first order! ;-)

>Sure, it's an extensible format, but not in the way that's been
>suggested. You can't get rid of the _exclusive_ use of diff et al
>without entirely losing compatibility with RCS.

Nobody has suggested abandoning diff for computing RCS deltas. All discussion relating to replacement of diff and merge tools has revolved around the user interface. That's completely different.

>> I do not see support for your assertion that
>> compatibility is "far more" than just the
>> adherence to the syntax defined in rcsfile(5).

>Sadly rcsfile(5) only describes the meta-syntax, not the nuts&bolts of
>how RCS files work and how they're actually used by the RCS package.

>> So, I believe that adding a
>>
>> 'diff3hint someprogram;'
>>
>> line to the RCS file should not be a problem for
>> "co" to still be able to checkout each and every
>> version of the file.

>"diff3hint" in the way you're hinting it might be used is insufficient.

>RCS directly interprets the content of the delta text information,
>e.g. the likes of:

>    @d5 1
>    a5 1
>    some new line of text
>    d256 1
>    @

>See, for example, lib/rcsedit.c from the RCS source distribution.

You are obviously missing something here. We're talking about adding a newphrase in the admin, delta, or deltatext productions. Using the deltatext production and your diff output as an example:

    1.1
    log
    @this is a log message
    @
    diff3hint use-this-tool;
    text
    @d5 1
    a5 1
    some new line of text
    d256 1
    @

This obviously extends the RCS file format in a way that does not break compatibility with the existing RCS software.
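For readers unfamiliar with the delta text being argued over, the following is a minimal Python sketch of how an RCS-style edit script can be applied to a list of lines. It is an illustration written for this thread, not the code in lib/rcsedit.c. It assumes the usual convention that `dL N` deletes N lines starting at line L, `aL N` inserts the next N script lines after line L, and all line numbers refer to the text being edited as it was before any command ran. The function name `apply_rcs_delta` is hypothetical.

```python
import re


def apply_rcs_delta(base_lines, delta_lines):
    """Apply an RCS-style edit script and return the resulting list of lines.

    base_lines  -- the version the script edits
    delta_lines -- script lines: 'dL N' / 'aL N' commands, with each 'a'
                   command followed by the N lines to insert
    """
    out = []
    pos = 0  # how many base lines have been consumed (copied or deleted)
    i = 0
    while i < len(delta_lines):
        m = re.fullmatch(r"([ad])(\d+) (\d+)", delta_lines[i])
        if m is None:
            raise ValueError(f"bad edit command: {delta_lines[i]!r}")
        cmd, line, count = m.group(1), int(m.group(2)), int(m.group(3))
        i += 1
        if cmd == "d":
            # Copy the untouched lines before `line`, then skip `count` lines.
            out.extend(base_lines[pos:line - 1])
            pos = line - 1 + count
        else:
            # Copy through `line`, then take the next `count` script lines.
            out.extend(base_lines[pos:line])
            pos = max(pos, line)
            out.extend(delta_lines[i:i + count])
            i += count
    out.extend(base_lines[pos:])  # whatever follows the last edit
    return out


# The first two commands of the quoted example, applied to a six-line file
# (the 'd256 1' command is left out because the sample file is short).
base = ["one", "two", "three", "four", "five", "six"]
script = ["d5 1", "a5 1", "some new line of text"]
result = apply_rcs_delta(base, script)
```

The point relevant to the thread is that this step needs only the delta text itself. A newphrase such as `diff3hint` sits outside the `text` string and never reaches a routine like this one.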
Following is a complete RCS file that contains not one but three extensions, but they're done in a way that is supported by the RCS file format. And none of the RCS programmatic interfaces break.

    head    1.4;
    access;
    symbols;
    locks; strict;
    comment @# @;
    admin-ext       @this is an admin extension.@;


    1.4
    date    2004.06.29.09.08.54;    author paul;    state Exp;
    branches;
    next    1.3;

    1.3
    date    2004.06.29.09.05.20;    author paul;    state Exp;
    branches;
    next    1.2;
    delta-ext       @this is a delta extension.@;

    1.2
    date    2004.06.29.09.04.53;    author paul;    state Exp;
    branches;
    next    1.1;

    1.1
    date    2004.06.29.09.04.24;    author paul;    state Exp;
    branches;
    next    ;


    desc
    @Test file.
    @


    1.4
    log
    @Added the beep!
    @
    text
    @This is a test.
    This is only a test.
    If this had been an actual emergency, it would have been too late.
    BEP!
    @


    1.3
    log
    @Done!
    @
    deltatext-ext   @this is a deltatext extension.@;
    text
    @d4 1
    @


    1.2
    log
    @First change.  Needs more work.
    @
    text
    @d3 1
    @


    1.1
    log
    @Initial revision
    @
    text
    @d2 1
    @

>Any modification of the diff algorithm would almost certainly require
>changes to the syntax of this delta text.

Actually, this isn't true. The diff program itself implements multiple algorithms. But that's neither here nor there because nobody is recommending that the format of the differences be changed.

>As far as I can tell the extensibility of the RCS,v syntax does not go
>so far as to provide for callouts to add-on programs and I'm arguing
>that it's _far_ too late to try to modify this widely used standard file
>format now.

It's never too late to update a standard. In any case, RCS file extensibility has been in the standard for a very long time now.

>So, how _exactly_ do you propose to convince the standard "co" program
>(or the equivalent in any other RCS-compatible tool suite, including the
>current CVS implementations) to actually make use of the new delta
>text syntax that such a hack would create?

>I.e. How do you propose to make it possible for the standard RCS tools
>alone to re-create _every_ revision from all files created by this
>hacked system?

Simple: The delta text would not change. See above.

>It's simply not possible. Like I said, only the bare surface of RCS
>compatibility is scratched by the meta-syntax described in rcsfile(5).

Absolutely untrue, as demonstrated by the RCS file above.

>The RCS file format is intricately intertwined with the unix diff
>algorithm, which is itself tightly dependent on the "normal" use of
>lines of text to represent elements of the source files being managed

This much is t
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>--- Forwarded mail from [EMAIL PROTECTED]

Mark, I agree with your response to Greg's claims about RCS compatibility, or the lack thereof.

>In particular, I am not aware of any fundamental
>problems rcs 5.7 will have if someone were to
>introduce a new keyword which would name a program
>other than diff3 to be used in rcsmerge
>operations. At most, I would expect a warning
>message via the warnignore() function which would
>specify

>co: file,v: warning: Unknown phrases like `diff3hint ...;' are present.

>and even so, a 'co -q file,v' would not generate
>such a message.

>So, I believe that adding a

>'diff3hint someprogram;'

>line to the RCS file should not be a problem for
>"co" to still be able to checkout each and every
>version of the file.

Rather than use a hint to expose an implementation detail, I suggest recording a data type instead. Maybe even a MIME type. Then provide a suitable mechanism to map data types to tools that are appropriate to the environment.

BTW, CVS no longer uses rcsmerge; it co's the necessary versions and runs diff3 directly. So in a CVS context, pushing this capability down to RCS isn't really a requirement. However, I recognize the usefulness of doing so, and would not oppose such a feature. On the other hand, doing so will likely be a duplication of effort because CVS has client/server concerns that RCS does not, and that may necessitate a different implementation.

>Given that this would appear to be the desire of
>at least a few folks out there who might want to
>make CVS do a better job at merging structured
>ASCII files such as XML or HTML format. And
>further, that you seem to have objections to this
>approach. And while I have known you to bring up
>points I have overlooked in the past...

Not just structured ASCII files as you describe, but any file containing structured data for which a merge tool is available.
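The data-type mapping suggested above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the table contents, the tool names `xml-merge-tool` and `html-merge-tool`, and the function `merge_command` are all hypothetical, and the fallback is the plain `diff3 -E` invocation mentioned elsewhere in this thread. The replacement tool is assumed to accept the same three file arguments as diff3.

```python
import shutil

# Fallback used when no tool is registered for a type, or the registered
# tool is not installed.
DEFAULT_MERGE = ["diff3", "-E"]

# Hypothetical mapping from media type to merge tool; the tool names are
# placeholders, not real programs.
EXAMPLE_TOOLS = {
    "text/xml": ["xml-merge-tool"],
    "text/html": ["html-merge-tool"],
}


def merge_command(media_type, mine, ancestor, theirs,
                  tools=EXAMPLE_TOOLS, which=shutil.which):
    """Build the argv for merging three files of the given media type.

    `which` is injectable so the lookup can be exercised without the
    tools actually being installed.
    """
    argv = tools.get(media_type)
    if not argv or which(argv[0]) is None:
        argv = DEFAULT_MERGE
    return list(argv) + [mine, ancestor, theirs]
```

Keeping the table separate from the recorded data type is the design point: the RCS file would say only what kind of data it holds, and each site or user decides which tool handles that kind.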
>This time around I just do not see anything that
>would preclude such an approach of using an
>external diff3 hint 'replacement' program for
>doing a 'cvs update -jtag1 -jtag2' operation.

>I will stipulate that such a program will likely
>need to live on the server and furthermore that it
>would not be interactive. In the absence of
>finding such a program, CVS would likely resort to
>using diff3 as a fallback, so its arguments would
>likely need to match those of the diff3 program
>itself... at least to the extent that cvs currently
>uses various arguments to diff3.

I don't believe that such a program MUST live on the server. Merge
tools, like editors, have a way of becoming religious icons in
situations where users have a choice. Under such circumstances, it
becomes important to have client-side mappings between data types and
merge tools.

Additionally, I don't believe that merge tools necessarily need to be
fully automated. After the relevant versions have been downloaded to
the client (and the repository locks have been cleared), the merge
tools can run interactively. However, I believe that CVS currently
intersperses merges with downloads, and that would need to change
before interactive merges can be supported.

Also, CVS currently relies on diff3-style mark-ups to warn the user
when merge conflicts remain present at commit time. Though strictly
speaking such warnings are not necessary, they are incredibly useful.
And they'll be lost unless merge conflicts are recorded another way.
One way is to list conflicts in a file stored in the CVS directory.
At commit time, skip the scan for diff3 mark-ups and instead read the
conflict list and compare mod times of the relevant files. If they
have changed, assume the conflicts have been resolved.
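The conflict-list idea in the last paragraph can be sketched roughly as
follows, in Python. This is only an illustration under stated
assumptions: the file name CVS/Conflicts, the JSON layout, and both
function names are hypothetical, and none of this is existing CVS
behaviour.

```python
import json
import os

# Hypothetical location of the conflict list inside the sandbox's CVS directory.
CONFLICT_FILE = os.path.join("CVS", "Conflicts")


def _load(sandbox):
    """Read the conflict list, or return an empty one if none exists yet."""
    path = os.path.join(sandbox, CONFLICT_FILE)
    if not os.path.exists(path):
        return {}
    with open(path) as fh:
        return json.load(fh)


def record_conflict(sandbox, name):
    """After a merge leaves conflicts in `name`, remember its mod time."""
    entries = _load(sandbox)
    entries[name] = os.stat(os.path.join(sandbox, name)).st_mtime_ns
    path = os.path.join(sandbox, CONFLICT_FILE)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        json.dump(entries, fh)


def unresolved(sandbox):
    """At commit time: list files whose mod time has not changed since the merge."""
    stale = []
    for name, mtime in _load(sandbox).items():
        full = os.path.join(sandbox, name)
        if os.path.exists(full) and os.stat(full).st_mtime_ns == mtime:
            stale.append(name)
    return stale
```

A commit would then refuse (or warn) when unresolved() returns a
non-empty list. Note that a changed mod time only shows the file was
touched, not that the conflict was actually resolved, which is the same
assumption the paragraph above makes.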
>Let me state the scope of the thought experiment:

>Goal: Provide a means whereby a cvs administrator
>may cause a program other than diff3 to be used
>when doing merge operations as a part of a
>three-way merge of files in a sandbox. This
>program might be defined as a keyword used as the
>value of a 'diff3hint' followed by an 'id' which
>could be looked up in a table that cvs could keep
>to determine which executable and any additional
>arguments above the diff3 form arguments might be
>required.

Again, I think that recording a data type is a more straightforward
(or at least more easily understood) implementation.

>Assertion: The diff3 replacement must handle
>all of the args that cvs normally passes to diff3.

Yes.

>Assertion: The diff3 replacement must not be
>interactive in nature for client/server repository
>uses.

Well, okay for the first implementation. :-)

>Assertion: The diff3 replacement must be able to
>run just given the three versions of the file
>without any other state.

Yes, but it would be nice to be able to pass in the version numbers
for column headings or the like, if the tool permits.

>Assertion: That cvs continue to write new RCS files
>in adherence to the syntax defined in rcsfile(5), but
>allowing the introduction of one or more new phrases
>and associated
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>--- Forwarded mail from Greg Woods:

>[ On Thursday, June 17, 2004 at 16:49:42 (-0700), Paul Sander wrote: ]
>> Subject: Smoke, FUD (was Re: CVS corrupts binary files ...)
>>
>> If this is true, then we're in violent agreement. But to date, you have
>> argued that making the necessary changes to CVS to give better support
>> for data types not handled well specifically by the diff and diff3 programs
>> would break compatibility with RCS, which is demonstrably false.

>Have you not looked at the content of an RCS file lately, Paul?

>RCS compatibility is far more than just the adherence to the syntax
>defined in rcsfile(5). If the generic "co" program from the RCS package
>cannot extract any and every revision of a file from a file claiming to
>be an RCS file then that file is clearly not RCS compatible.

I have never, ever advocated changing the format of an RCS file in a
way that would break the ci, co, rcs, or rlog programs. And although I
strongly advocate the replacement of user-exposed diff and merge tools,
I have never, ever advocated the replacement of the diff tool that
computes the deltas stored in an RCS file. (That is not to say that I
have never suggested making incompatible changes, but in context such
suggestions have always carried caveats and recognized the lack of
desirability of losing a valuable feature.)

I don't know where you seem to be getting the idea that I'm
recommending doing a global search and replace of "diff" with some
other tool. That is clearly not the case. The RCS file format must be
retained, unless we as a group decide to abandon it after weighing the
consequences.

However, I do advocate extending the RCS file format in ways that the
RCS API can accommodate. The rcsfile(5) manual specifically allows for
extensions in the admin and delta sections of the file. For example, I
do recommend using a newphrase in the admin section to identify the
type of data stored in the file, but not until the rename problem is
solved.

>> How am I spreading Fear, Uncertainty, or Doubt?

>Maybe hypocrisy would be a better description of your approach to CVS.

I don't believe I'm misrepresenting any of my beliefs about CVS or SCM
in general. I've tried very hard to explain them clearly, and I've
tried especially hard to drill them into that rock that you carry on
your shoulders, but I'm obviously using the wrong screwdriver.

>--- End of forwarded message from [EMAIL PROTECTED]
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
Greg A. Woods <[EMAIL PROTECTED]> writes:

> [ On Monday, June 28, 2004 at 01:44:36 (-0700), Mark D. Baushke wrote: ]
> > Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
> >
> > The RCS format is very extensible and in fact the
> > CVSNT folks have extended it already and I have had
> > no problems using CVSNT repositories in conjunction
> > with either CVS or RCS.
>
> "very" is an over-statement of the first order! ;-)

Agreed. :-)

> Sure, it's an extensible format, but not in the
> way that's been suggested. You can't get rid of
> the _exclusive_ use of diff et al without
> entirely losing compatibility with RCS.

Yes, but diff is not diff3. diff is used for the
delta format. diff3 is used by rcsmerge, not for
fundamental version deltas.

> > I do not see support for your assertion that
> > compatibility is "far more" than just the
> > adherence to the syntax defined in rcsfile(5).
>
> Sadly rcsfile(5) only describes the meta-syntax,
> not the nuts&bolts of how RCS files work and how
> they're actually used by the RCS package.

True, but examination of the rcs sources (or cvs
sources) can help you a lot.

> > So, I believe that adding a
> >
> > 'diff3hint someprogram;'
> >
> > line to the RCS file should not be a problem
> > for "co" to still be able to checkout each and
> > every version of the file.
>
> "diff3hint" in the way you're hinting it might
> be used is insufficient.

Why?

> RCS directly interprets the content of the delta
> text information, e.g. the likes of:
>
> @d5 1
> a5 1
> some new line of text
> d256 1
> @
>
> See, for example, lib/rcsedit.c from the RCS
> source distribution.

Yes, and that is the concern of 'diff' NOT
'diff3'. My assumptions explicitly did NOT address
any requirements other than that a 'diff3'
replacement be used. Where did your assertion that
this requires 'diff' to be changed arise?
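For readers who have not looked inside a ,v file, here is a minimal
Python sketch of how delta text like the quoted example is applied to
reconstruct a revision. It assumes the usual reading of the edit
script: "dN M" deletes M lines starting at line N, "aN M" appends the
next M lines of the script after line N, and all line numbers refer to
the unmodified input. The function name is made up for illustration,
the sketch ignores the @@ quoting used inside RCS strings, and it is
not the code in the RCS sources.

```python
def apply_rcs_delta(base, delta):
    """Apply an RCS-style edit script to a list of lines.

    base  -- lines of the revision the delta is applied to
    delta -- lines of the delta text, without the enclosing @ delimiters
    """
    out = []
    pos = 0  # number of base lines already copied or skipped
    i = 0
    while i < len(delta):
        cmd = delta[i]
        i += 1
        start, count = (int(field) for field in cmd[1:].split())
        if cmd[0] == "d":
            # Copy untouched lines before `start`, then skip `count` lines.
            out.extend(base[pos:start - 1])
            pos = start - 1 + count
        elif cmd[0] == "a":
            # Copy untouched lines through `start`, then insert new text.
            out.extend(base[pos:start])
            pos = max(pos, start)
            out.extend(delta[i:i + count])
            i += count
        else:
            raise ValueError("unknown delta command: %r" % cmd)
    out.extend(base[pos:])  # whatever follows the last command
    return out
```

The point relevant to the thread is that nothing in this step involves
diff3 or any merge tool: reconstruction only needs the stored edit
script and the base text.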
> Any modification of the diff algorithm would
> almost certainly require changes to the syntax
> of this delta text.

I did not suggest modification of the diff format.
I suggested modification of the diff3 program to
be used.

> As far as I can tell the extensibility of the
> RCS,v syntax does not go so far as to provide
> for callouts to add-on programs and I'm arguing
> that it's _far_ too late to try to modify this
> widely used standard file format now.

With existing RCS, you may compile it to use
DIFF3_BIN as any path you wish. There is nothing
to guarantee that the diff3 does what the GNU
diff3 program did...

> So, how _exactly_ do you propose to convince the
> standard "co" program (or the equivalent in any
> other RCS-compatible tool suite, including the
> current CVS implementations) to actually make
> use of the new delta text syntax that such a
> hack would create?

I propose that "co" use "diff" just as it has
always done. I am not proposing any change to the
delta structure at all. The thought experiment is
proposing a change in the function called to do
three-way diff and merge operations.

> I.e. How do you propose to make it possible for
> the standard RCS tools alone to re-create
> _every_ revision from all files created by this
> hacked system?

What I suggested does not require this.

> It's simply not possible.

You say this, but are assuming facts that were not
supported. Why does a change to 'diff3' for merge
operations imply or require a change to 'diff' for
everything else?

> Like I said, only the bare surface of RCS
> compatibility is scratched by the meta-syntax
> described in rcsfile(5).

Why or how would a change in diff3 impact delta
formats for RCS? The DIFF3 binary is used only in
rcs-5.7/src/merger.c and plays no direct role in
checkout or commit of RCS files.

> The RCS file format is intricately intertwined
> with the unix diff algorithm,

Actually, I suspect this to be false. I believe
the RCS delta section format is intertwined with
the ed(1) command format.

> which is itself tightly dependent on the
> "normal" use of lines of text to represent
> elements of the source files being managed (at
> least when it comes to automated merging for
> concurrent editing).

And all of that is not material to the current
thought experiment.

> Meanwhile there are other change delta file
> formats and other version tracking tools that
> use those other formats, and often there are
> also tools that will convert RCS/CVS
> repositories into those other formats. I.e.
> there&
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
[ On Monday, June 28, 2004 at 01:44:36 (-0700), Mark D. Baushke wrote: ]
> Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
>
> The RCS format is very extensible and in fact the
> CVSNT folks have extended it already and I have had
> no problems using CVSNT repositories in conjunction
> with either CVS or RCS.

"very" is an over-statement of the first order! ;-)

Sure, it's an extensible format, but not in the way that's been
suggested. You can't get rid of the _exclusive_ use of diff et al
without entirely losing compatibility with RCS.

> I do not see support for your assertion that
> compatibility is "far more" than just the
> adherence to the syntax defined in rcsfile(5).

Sadly rcsfile(5) only describes the meta-syntax, not the nuts&bolts of
how RCS files work and how they're actually used by the RCS package.

> So, I believe that adding a
>
> 'diff3hint someprogram;'
>
> line to the RCS file should not be a problem for
> "co" to still be able to checkout each and every
> version of the file.

"diff3hint" in the way you're hinting it might be used is insufficient.

RCS directly interprets the content of the delta text information,
e.g. the likes of:

	@d5 1
	a5 1
	some new line of text
	d256 1
	@

See, for example, lib/rcsedit.c from the RCS source distribution.

Any modification of the diff algorithm would almost certainly require
changes to the syntax of this delta text.

As far as I can tell the extensibility of the RCS,v syntax does not go
so far as to provide for callouts to add-on programs and I'm arguing
that it's _far_ too late to try to modify this widely used standard
file format now.

So, how _exactly_ do you propose to convince the standard "co" program
(or the equivalent in any other RCS-compatible tool suite, including
the current CVS implementations) to actually make use of the new delta
text syntax that such a hack would create?

I.e. How do you propose to make it possible for the standard RCS tools
alone to re-create _every_ revision from all files created by this
hacked system?

It's simply not possible. Like I said, only the bare surface of RCS
compatibility is scratched by the meta-syntax described in rcsfile(5).

The RCS file format is intricately intertwined with the unix diff
algorithm, which is itself tightly dependent on the "normal" use of
lines of text to represent elements of the source files being managed
(at least when it comes to automated merging for concurrent editing).

Meanwhile there are other change delta file formats and other version
tracking tools that use those other formats, and often there are also
tools that will convert RCS/CVS repositories into those other formats.
I.e. there's no _real_ fundamental need to hack on RCS,v syntax.

-- 
Greg A. Woods

+1 416 218-0098  VE3TCP  RoboHack <[EMAIL PROTECTED]>
Planix, Inc. <[EMAIL PROTECTED]>  Secrets of the Weird <[EMAIL PROTECTED]>
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
Hi Greg,

Greg A. Woods <[EMAIL PROTECTED]> writes:

> RCS compatibility is far more than just the
> adherence to the syntax defined in rcsfile(5).

Is it? I think I must be missing something here.

The RCS format is very extensible and in fact the
CVSNT folks have extended it already and I have had
no problems using CVSNT repositories in conjunction
with either CVS or RCS.

> If the generic "co" program from the RCS package
> cannot extract any and every revision of a file
> from a file claiming to be an RCS file then that
> file is clearly not RCS compatible.

Sure, I agree with your statement. If a generic
"co" program is not able to extract any and every
revision of a file from its file,v form, then it
is not RCS compatible.

I do not see support for your assertion that
compatibility is "far more" than just the
adherence to the syntax defined in rcsfile(5).

In particular, I am not aware of any fundamental
problems rcs 5.7 will have if someone were to
introduce a new keyword which would name a program
other than diff3 to be used in rcsmerge
operations. At most, I would expect a warning
message via the warnignore() function which would
specify

co: file,v: warning: Unknown phrases like `diff3hint ...;' are present.

and even so, a 'co -q file,v' would not generate
such a message.

So, I believe that adding a

'diff3hint someprogram;'

line to the RCS file should not be a problem for
"co" to still be able to checkout each and every
version of the file.

Given that this would appear to be the desire of
at least a few folks out there who might want to
make CVS do a better job at merging structured
ASCII files such as XML or HTML format. And
further, that you seem to have objections to this
approach. And while I have known you to bring up
points I have overlooked in the past...

This time around I just do not see anything that
would preclude such an approach of using an
external diff3 hint 'replacement' program for
doing a 'cvs update -jtag1 -jtag2' operation.

I will stipulate that such a program will likely
need to live on the server and furthermore that it
would not be interactive. In the absence of
finding such a program, CVS would likely resort to
using diff3 as a fallback, so its arguments would
likely need to match those of the diff3 program
itself... at least to the extent that cvs currently
uses various arguments to diff3.

So, as I trust that you have an example in mind
that is a conflicting case, I must clearly be
missing something here. I would take it as a favor
if you could elaborate in concrete terms why
adding a new keyword and value to existing RCS
format files to support an alternative to diff3 is
not a viable path for a hook that users may wish
to exercise.

If there is some other communication error that
has entered into the thread, I must have missed
it. Feel free to point it out, but I would still
be interested in your view of the following
thought experiment.

Let me state the scope of the thought experiment:

Goal: Provide a means whereby a cvs administrator
may cause a program other than diff3 to be used
when doing merge operations as a part of a
three-way merge of files in a sandbox. This
program might be defined as a keyword used as the
value of a 'diff3hint' followed by an 'id' which
could be looked up in a table that cvs could keep
to determine which executable and any additional
arguments above the diff3 form arguments might be
required.

Assertion: The diff3 replacement must handle
all of the args that cvs normally passes to diff3.

Assertion: The diff3 replacement must not be
interactive in nature for client/server repository
uses.

Assertion: The diff3 replacement must be able to
run just given the three versions of the file
without any other state.

Assertion: That cvs continue to write new RCS files
in adherence to the syntax defined in rcsfile(5), but
allowing the introduction of one or more new phrases
and associated id word values as allowed for by the
RCS format syntax.

It would be left to the extension designer to
determine the method whereby such a new RCS phrase
would be written into the CVS repository versions
of the files.

	Thanks,
	-- Mark
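The table lookup and diff3 fallback described in the thought experiment
could look roughly like this in Python. It is a sketch under stated
assumptions only: the table contents, the tool name "xmlmerge3", its
"--strict" flag, and the function name are all hypothetical, and
nothing like this exists in CVS or RCS today.

```python
import shutil

# Hypothetical table kept by cvs: diff3hint id -> executable plus any
# additional arguments that precede the diff3-form arguments.
HINT_TABLE = {
    "xml": ["xmlmerge3", "--strict"],  # made-up tool and flag
}


def merge_command(hint_id, diff3_args, table=HINT_TABLE, which=shutil.which):
    """Build the argv for a three-way merge.

    hint_id    -- the id read from the 'diff3hint' phrase, or None
    diff3_args -- the arguments cvs would normally pass to diff3
    which      -- executable lookup, injectable so the sketch is testable
    """
    entry = table.get(hint_id)
    if entry and which(entry[0]):
        # Replacement found: its extra args, then the usual diff3 args.
        return list(entry) + list(diff3_args)
    # Unknown id or missing program: fall back to plain diff3.
    return ["diff3"] + list(diff3_args)
```

Because the replacement receives exactly the arguments diff3 would have
received, the fallback path needs no special handling, which is the
reason for the first assertion above.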
Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
[ On Thursday, June 17, 2004 at 16:49:42 (-0700), Paul Sander wrote: ]
> Subject: Smoke, FUD (was Re: CVS corrupts binary files ...)
>
> If this is true, then we're in violent agreement. But to date, you have
> argued that making the necessary changes to CVS to give better support
> for data types not handled well specifically by the diff and diff3 programs
> would break compatibility with RCS, which is demonstrably false.

Have you not looked at the content of an RCS file lately, Paul?

RCS compatibility is far more than just the adherence to the syntax
defined in rcsfile(5). If the generic "co" program from the RCS package
cannot extract any and every revision of a file from a file claiming to
be an RCS file then that file is clearly not RCS compatible.

> How am I spreading Fear, Uncertainty, or Doubt?

Maybe hypocrisy would be a better description of your approach to CVS.

-- 
Greg A. Woods

+1 416 218-0098  VE3TCP  RoboHack <[EMAIL PROTECTED]>
Planix, Inc. <[EMAIL PROTECTED]>  Secrets of the Weird <[EMAIL PROTECTED]>
Smoke, FUD (was Re: CVS corrupts binary files ...)
Whew, the smoke's getting thick in here!

>From: [EMAIL PROTECTED]

>[ On Thursday, June 17, 2004 at 13:06:44 (-0700), Paul Sander wrote: ]
>> Subject: Re: CVS corrupts binary files ...
>>
>> Current releases of CVS do the latter. (Don't believe me? Look at
>> the function named RCS_merge in the rcscmds.c source file.) It's a
>> simple matter to replace the invocation of diff3 with a different tool.

>Huh!?!?!?

>Since when does the phrase "diff and diff3 algorithms" identify any
>particular program that might implement those algorithms?

Then you don't object to swapping the diff and diff3 programs out for
others that might apply other 2-way and 3-way differencing algorithms
that are more appropriate to the data at hand, for purposes other than
maintaining the integrity of the RCS file format?

If this is true, then we're in violent agreement. But to date, you have
argued that making the necessary changes to CVS to give better support
for data types not handled well specifically by the diff and diff3
programs would break compatibility with RCS, which is demonstrably
false. The maintenance of version history is sufficiently insulated
from the user interfaces of the content merge features that there is
simply no credible argument on that basis.

>Paul, _you_ are the one spreading FUD here.

How am I spreading Fear, Uncertainty, or Doubt? I'm claiming that CVS
is capable of doing more than it does, with only minor changes (i.e.
none that have significant impact on its architecture). There's no FUD
here, other than what's in your head.

The world won't end if CVS changes its merge tool, Greg. Get over it.

>In case you have forgotten I am intimately familiar with exactly how the
>GNU diffutils code and the GNU patch code is integrated into the CVS
>source.

Not so intimate that you fully understand how the CVS design constrains
the effects of certain kinds of changes, apparently.