RE: Patch for making CommitID configurable

2005-04-28 Thread Torsten Martinsen
Mark D. Baushke wrote:

 There are a number of tools and utilities that use RCS which may have
 problems: 
 
   ViewCVS
   CVSWeb
   SmartCVS
   TortoiseCVS
 
 but I suspect that all of them have already run across the CVSNT
 commitid and are handling it correctly.

Indeed, TortoiseCVS is designed specifically to work with a CVSNT
client.
And the configuration mechanism for ViewCVS allows for using the RCS
wrappers that are part of CVSNT (actually, I believe that it defaults
to this if it detects that CVSNT is installed).

-Torsten


___
Bug-cvs mailing list
Bug-cvs@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-cvs


Re: Patch for making CommitID configurable

2005-04-27 Thread Mark D. Baushke
Hi Peter,

Jim has already mentioned some things about why the commitid code may be
useful.

It may be worth noting that CVSNT has had this feature for a long time
and moving to adopt it satisfies a minor goal of trying to reduce the
separation and entropy between the major CVSNT fork of CVS and the CVS
that cvshome offers.

I honestly do not think this feature is the problem you seem to believe.

If you can provide more consumers of the ,v files that have problems
using the addition to the format, it would be good to have that list.

Thanks,
-- Mark




RE: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hello,

On 26 Apr 2005 at 16:02, Jim.Hyslop wrote:

 Peter Backes wrote:
  Just think about what *is* the big advantage of CVS besides working 
  on RCS files instead of a strange ever-changing file format?
 ever-changing? I think you're exaggerating here. When was the last
 time the RCS file format changed? 

I was not referring to RCS files concretely, but to a hypothetical
file format which changes with almost every version of the software,
something that might happen, perhaps in a less dramatic way, if more
extensions like CommitID are going to be added in the future.

 What's the point in having the rcsfile(5) specification have the
 newphrases spec, if you aren't going to use it?
 (http://www.daemon-systems.org/man/rcsfile.5.html)

I guess to keep upward compatibility with new versions of rcs in the 
past.  If it was there for arbitrary use, there would surely be some 
interface to specify them.

 Incrementally adding a new feature is a lot less of a change, and a lot less
 drastic, than switching to an entirely new system.

A program whose file formats keep changing incrementally (and thus
all the time) causes a feeling of uncertainty.  If I make a drastic
change, on the other hand, I at least know what I get and I can try
it beforehand.  If I am forced to update CVS because of a security
problem and suddenly notice it has some unexpected 'incrementally
added feature', this is anything but least astonishing.

 The way you're talking, it sounds as if you are saying that, once a
 program is released, it should never change, and if you want new
 features you should write a whole new, different program to add those
 features. Is that really what you're proposing? 

It is something different if a feature is added within the existing 
specification for the interface and the file formats, or if the 
specification is being changed itself.

But yes, I am saying a program should stick to its interface and file
formats if at all possible.  Today's programs unfortunately change
these much too often, causing major headaches for a lot of users and
wasting very many hours of their time.

Just see TeX.  Without doubt, and you will surely agree, one of the 
best programs, perhaps the best program, ever written.  But a big 
part of its success is that features were added carefully and it has 
now come to a point where it is not going to be changed anymore 
except for very crucial bug fixes--it is a safe basis to do work
with.

  Having said this, it is obvious that it should also be a question of 
  whether CommitID should be kept as a feature *at all*.
 No, it is not obvious at all. It is only obvious if one is intent on keeping
 the status quo.

I didn't say it should definitely go.  But it must be questioned
and discussed!

  It is much better to use the loginfo feature [...]
 For what definition of better? Better for _you_, perhaps, but not for the
 dozens or hundreds of users (like me) who _want_ this feature.

It is perhaps not better for these hundreds of users who want such a
feature quickly, without any effort beyond updating cvs, and no
matter how it is implemented, I agree.

But it is better with regard to wasted time and network bandwidth.  I
do not want to scan through all files in my project just to find out
which files were changed in a commit.  And it is better in that it is
entirely independent of any change to CVS itself.

 Using commitinfo requires 
 - each and every installation to make the same changes to their existing
 commitinfo scripts; this requires hundreds of hours of wasted, duplicated
 effort. 

I agree this is not an option, but I never imagined it should work 
like this.

 Sure, you could make a generic commitinfo script available - but if
 anyone already *has* a commitinfo script, then they won't be able to
 use the canned one. 

It's easy to write a script which saves the input and invokes all 
loginfo scripts one wants to execute on this input in the desired 
order.
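
A minimal sketch of such a wrapper, assuming Python 3. The name
`dispatch` and the handler paths are hypothetical and not part of CVS;
the idea is only that the loginfo input is read once and replayed to
each handler in the desired order.

```python
import subprocess


def dispatch(log_input, handlers):
    """Replay the saved loginfo input to each handler, in the given order.

    log_input: bytes that CVS wrote to the wrapper's standard input.
    handlers:  list of argv lists, e.g. [["/usr/local/bin/commitlog"]]
               (hypothetical paths).
    Returns the exit status of every handler so the caller can decide
    how to react to failures.
    """
    statuses = []
    for argv in handlers:
        # Each handler sees exactly the same bytes on its stdin.
        result = subprocess.run(argv, input=log_input, check=False)
        statuses.append(result.returncode)
    return statuses
```

A wrapper script registered in loginfo would read its standard input
once (for example with `sys.stdin.buffer.read()`) and pass the result
to `dispatch` together with the list of scripts to run.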

 - tracking the commit ID in a separate database. 
 Separating the commit information (i.e. the ID and the log) is not a
 good idea. 

Only one file has to be read if one wants to retrieve the info, not 
all of them.  The log can be saved together with commitlog.  This 
seems like a good idea to me.  Why exactly do you think it is not a 
good idea?

 Assuming that each and every commit is tagged, perhaps. But that's not
 necessarily the case, and it's certainly not a practise I would encourage.

Regardless of whether this should be encouraged or not, it was the
solution used by rcsfreeze.  Why exactly wouldn't you encourage it?

  Please don't ...  RCS is stable, and the files it writes have been 
  the same for years
 And your point would be... what, exactly? Does RCS not have any kind of a
 test suite to check for problems?

I was referring to 'stable' not with regard to bugs, but in the sense
that the interfaces and file formats stay the same.

 So, let me get this straight. Just because *you* don't see any 

Re: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hi,

On 27 Apr 2005 at 3:21, Mark D. Baushke wrote:

 It may be worth noting that CVSNT has had this feature for a long time
 and moving to adopt it satisfies a minor goal of trying to reduce the
 separation and entropy between the major CVSNT fork of CVS and the CVS
 that cvshome offers.

I agree that a commit ID might be a handy feature, but the way CVSNT 
did it was merely a quick and dirty hack IMO.  It relies on the 
concept of seconds since epoch, which is not portable.  Further the 
commit ID can only be assumed to be unique for a certain repository, 
so the whole thing cannot be used if somebody wants to build a 
distributed SCM on top of CVS.

I stick to my opinion that currently loginfo provides a much better 
way to achieve what Jim sees CommitID useful for.

 I honestly do not think this feature is the problem you seem to believe.

Even if it isn't, I don't see why it shouldn't be possible to apply 
my patch to let the user decide what he wants.  What is currently 
being done is that users are forced to have CommitIDs even if they 
don't want them (for whatever reason).  This cannot be right.

 If you can provide more consumers of the ,v files that have problems
 using the addition to the format, it would be good to have that list.

Besides rcs, I only remember cvsup as a program that might access the 
files in a CVS directory directly.  However, I don't know if it has 
any problems with the addition to the format, as I don't use it.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





RE: Patch for making CommitID configurable

2005-04-27 Thread Jim.Hyslop
Peter Backes wrote:
[a lot of stuff]

 Just see TeX.  Without doubt, and you will surely agree, one of the 
 best programs, perhaps the best program, ever written.
I've used it. Can't stand it. I don't have the time, nor the patience, to
learn all the arcane commands you need to know in order to make things
happen. And I shouldn't *have* to learn all that stuff. To me, the whole
point of having a computer is that it takes care of all that complexity, and
lets me focus on what's important - the document's contents and, to a minor
extent, its formatting.

Peter, it is clear that you and I have fundamentally different, and probably
irreconcilable, approaches and philosophies on software development. These
differences are extremely unlikely to be resolved in this forum.

-- 
Jim Hyslop
Senior Software Designer
Leitch Technology International Inc. ( http://www.leitch.com )
Columnist, C/C++ Users Journal ( http://www.cuj.com/experts )




RE: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hello,

On 27 Apr 2005 at 11:23, Jim.Hyslop wrote:

 Peter, it is clear that you and I have fundamentally different, and
 probably irreconcilable, approaches and philosophies on software
 development. These differences are extremely unlikely to be resolved in
 this forum. 

I know, and this is the reason why I made the patch so everyone can
decide if he wants to use the current CommitID implementation or not.
I have yet to see an argument which shows me why my patch is not
appropriate.  All arguments so far were for CommitID, not against an
option by which it can be turned on or off in CVSROOT/config.

So let me repeat again, in case you got it wrong.  The patch I sent 
does not remove CommitID, it adds an option by which it can be 
disabled.

At least you seem to be luckier than me with your philosophy, as you
must be very happy with all the software that is being written today
in exactly that spirit.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





Re: Patch for making CommitID configurable

2005-04-27 Thread Mark D. Baushke
Peter Backes [EMAIL PROTECTED] writes:

 On 27 Apr 2005 at 3:21, Mark D. Baushke wrote:
 
  It may be worth noting that CVSNT has had this feature for a long time
  and moving to adopt it satisfies a minor goal of trying to reduce the
  separation and entropy between the major CVSNT fork of CVS and the CVS
  that cvshome offers.
 
 I agree that a commit ID might be a handy feature, but the way CVSNT 
 did it was merely a quick and dirty hack IMO.  It relies on the 
 concept of seconds since epoch, which is not portable.  Further the 
 commit ID can only be assumed to be unique for a certain repository, 
 so the whole thing cannot be used if somebody wants to build a 
 distributed SCM on top of CVS.

The currently generated global_session_id is indeed ((2*sizeof(int)) + 12)
bytes in size and on most platforms that will be 16 bytes. However, the
CVSNT specification says that it may be

global_session_id = Xasprintf ("%x%08lx%04x", (int)getpid(),
  (long)time (NULL), rand()&0xFFFF);

Concerning the CVSNT sources...
One 'bug' is that src/RepositoryDatabase.cpp uses
'commitid CHAR(16) not null,' and the CVSNT sources use

  src/cvs.h:#define GLOBAL_SESSION_ID_LENGTH 64
  src/cvs.h:extern char global_session_id[GLOBAL_SESSION_ID_LENGTH];
  doc/cvs.dbk:
<primary>COMMITID, internal variable</primary>
  </indexterm>Unique Session ID of cvsnt process. This is a random
string of printable characters that may be up to 256 characters
  src/main.cpp
static void make_session_id()
{

sprintf(global_session_id,"%x%08lx%04x",(int)getpid(),(long)time(NULL),rand()&0xFFFF);
}

which means that it is possible for CVSNT to handle 63 bytes in the
global_session_id plus the '\0' byte or for display purposes they
document a 256 character limit.

Of course, at present both CVS and CVSNT are only using [a-fA-F0-9] for
the characters in the string, although CVS allows for [a-zA-Z0-9] in our
sanity.sh testing. CVS does not use a fixed buffer length, so we
could easily increase the size of the formulation if there is a need.
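
As a rough illustration of the format string quoted above, here is a
Python sketch (not the CVS or CVSNT source). It assumes the random part
is masked to 16 bits, which is what the `%04x` field suggests; the
function name and its parameters are made up for this example.

```python
import os
import random
import time


def make_session_id(pid=None, now=None, rnd=None):
    """Format an ID the way the quoted "%x%08lx%04x" format string does."""
    if pid is None:
        pid = os.getpid()
    if now is None:
        now = int(time.time())
    if rnd is None:
        rnd = random.getrandbits(16)
    # Assumption: the random part is masked to 16 bits, matching %04x.
    return "%x%08x%04x" % (pid, now, rnd & 0xFFFF)
```

With a PID below 0x10000 and a timestamp that fits in 32 bits this
gives 4 + 8 + 4 = 16 hexadecimal digits. A larger PID makes the string
longer, because the leading `%x` field has no fixed width.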

We are restricting the use of a '.' in the commitid, but we could
probably relax the encoding to allow a '%' sign and use a %2e escape
if you wanted to add in a FQDN for the hostname.
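
One possible shape of such an escape, sketched in Python under the
assumption that only '%' and '.' need encoding; the function names are
hypothetical and nothing like this exists in CVS today.

```python
def escape_commitid_part(text):
    """Escape '%' and '.' so the result contains no literal dot."""
    # Escape '%' first so that existing percent signs stay unambiguous.
    return text.replace("%", "%25").replace(".", "%2e")


def unescape_commitid_part(text):
    """Reverse escape_commitid_part()."""
    # Undo in the opposite order: dots first, then percent signs.
    return text.replace("%2e", ".").replace("%25", "%")
```

A hostname such as cvs.example.org would then appear in the commitid
as cvs%2eexample%2eorg and could be recovered unchanged.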

If there are any other suggestions for how the global_session_id should
be modified, I would like to see it discussed.

 I stick to my opinion that currently loginfo provides a much better 
 way to achieve what Jim sees CommitID useful for.

Are there any other folks that wish to chime in on this discussion?

  I honestly do not think this feature is the problem you seem to believe.
 
 Even if it isn't, I don't see why it shouldn't be possible to apply 
 my patch to let the user decide what he wants.  What is currently 
 being done is that users are forced to have CommitIDs even if they 
 don't want them (for whatever reason).  This cannot be right.

Hmmm... cvs 1.12.x is where we are doing new features that we consider
to be reasonable as the future direction of CVS. It is not yet the
stable version. If possible, when this version is blessed as STABLE to
replace the cvs 1.11.x series, we would rather have a standard version
that interoperates well with all other clients and servers.

If other people believe strongly that this feature needs to be a
compile-time or per-repository config option, we can consider it and
bring it to a vote among the developers based on user input.

For example, I am waiting to hear what Greg A. Woods has to say on this
subject. He has been fairly vocal in the past about retaining the older
RCS ,v format without extension.

  If you can provide more consumers of the ,v files that have problems
  using the addition to the format, it would be good to have that list.
 
 Besides rcs, I only remember cvsup as a program that might access the 
 files in a CVS directory directly.  However, I don't know if it has 
 any problems with the addition to the format, as I don't use it.

Yes. I know about CVSup and CVSupd. I believe that it handles
newphrases in the tree section of the delta without any problems. See
RCSDelta.m3 IterateTextPhrases and IterateTreePhrases. It just walks
them in order and preserves them.

So, to be clear, the following are potential consumers of additions:

  RCS - the format allows for newphrases and RCS has a -q switch to
inhibit complaints about it not understanding how to use the
newphrase it does not recognize.

  CVS - cvs 1.12.x has introduced commitid into the delta structure

  CVSNT - The original home of the commitid allowed for a 64byte
  buffer on generation and documents in cvs.dbk that it may be
  up to 256 characters long.

  CVSup - seems to handle RCS extensions without caring what they do.
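
To illustrate what "handling extensions without caring what they do"
can look like, here is a simplified Python sketch. It is not how CVSup
or RCS parse ,v files: it takes the text of one delta after the
revision number and assumes no ';' occurs inside a value, so @-quoted
strings are not handled. The names are made up for this example.

```python
KNOWN_DELTA_KEYWORDS = ("date", "author", "state", "branches", "next")


def split_delta_phrases(delta_text):
    """Split a delta's "keyword value;" phrases into known and unknown ones.

    Simplification: assumes no ';' appears inside a value, so @-quoted
    strings are not handled.
    """
    known, unknown = [], []
    for phrase in delta_text.split(";"):
        words = phrase.split(None, 1)
        if not words:
            continue
        keyword = words[0]
        value = words[1].strip() if len(words) > 1 else ""
        target = known if keyword in KNOWN_DELTA_KEYWORDS else unknown
        # Keep every phrase in its original order instead of dropping it.
        target.append((keyword, value))
    return known, unknown
```

A consumer written this way can carry a commitid phrase along, or
ignore it, without knowing what it means.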

There are a number of tools and utilities that use RCS which may have
problems:

  ViewCVS
  CVSWeb
  SmartCVS
  TortoiseCVS

but I 

Re: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hello,

On 27 Apr 2005 at 11:08, Mark D. Baushke wrote:

 global_session_id = Xasprintf ("%x%08lx%04x", (int)getpid(),
   (long)time (NULL), rand()&0xFFFF);

(long)time(NULL), getpid(): not portable.
rand()&0xFFFF: makes commit IDs probabilistic, not unique.

Why not simply keep a counter in a file which is being increased on 
each commit and used as the commit ID?  This avoids the probabilistic 
aspect and is entirely portable.  It was also the solution used for 
rcsfreeze.  The location could be a config file option.

 We are restricting the use of a '.' in the commitid, but we could
 probably relax the encoding to allow a '%' sign and use a %2e escape
 if you wanted to add in a FQDN for the hostname.

What is the reason for '.' being disallowed?

 CVS has added stuff to the RCS format in the past, even though those
 options are usually disabled: permissions and hardlinks.

I would love to see any new feature added to the RCS file format
being handled in exactly the same way.

 CVSNT has added commitid and mergepoint newphrases. It is entirely
 possible that CVS could add support for a mergepoint newphrase in some
 future release.

 CVSNT has also extended the -k flags to allow for UTF-8 characters.

I guess -k and mergepoint are only being written on user request.  
(for example if he sets the new keyword or if he requests a 
mergepoint to be written explicitly.)  This is entirely okay.  The
difference is that commitid is being written on each commit, while it 
was not like that in the past, and currently in a way that does not 
allow the user to prevent that.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





Re: Patch for making CommitID configurable

2005-04-27 Thread Derek Price
Mark D. Baushke wrote:

 Even if it isn't, I don't see why it shouldn't be possible to apply
 my patch to let the user decide what he wants.  What is currently
 being done is that users are forced to have CommitIDs even if they
 don't want them (for whatever reason).  This cannot be right.

 Hmmm... cvs 1.12.x is where we are doing new features that we consider
 to be reasonable as the future direction of CVS. It is not yet the
 stable version. If possible, when this version is blessed as STABLE to
 replace the cvs 1.11.x series, we would rather have a standard version
 that interoperates well with all other clients and servers.


I agree.  The feature series was intended to give us more freedom to
change the file formats and whatnot without upsetting the luddites
among us.  ;)
We're still going to fairly great lengths to avoid breaking backwards
compatibility with various applications, but I think a few warning
messages from RCS are an acceptable trade for some of these new features.

If you want stable file formats and APIs, Peter, you should be sticking
with the stable series (1.11.x) of releases.

 If other people believe strongly that this feature needs to be a
 compile-time or per-repository config option, we can consider it and
 bring it to a vote among the developers based on user input.


Just a note, I already agreed to commit Peter's patch some days ago in
this thread, with the change that I was going to make the default
behavior leaving commitID enabled, with his config key able to turn it
off, pending lack of objections from the other developers and general
agreement from users.

Cheers,

Derek



Re: Patch for making CommitID configurable

2005-04-27 Thread Mark D. Baushke
Derek Price [EMAIL PROTECTED] writes:

 Just a note, I already agreed to commit Peter's patch some days ago in
 this thread, with the change that I was going to make the default
 behavior leaving commitID enabled, with his config key able to turn it
 off, pending lack of objections from the other developers and general
 agreement from users.

Hmmm... I don't think that would avoid having some commitid deltas in
the repository if the user wants to make use of the switch.

I was under the impression that Peter's patch adds a new UseCommitID
keyword into the CVSROOT/config file. This file has an initial creation
via 'cvs init' and will not have a commitid in the delta right now
(I have been meaning to ask if that is a bug or a feature).

If an administrator goes ahead and modifies a checked-out copy of
CVSROOT/config to add UseCommitID=no and commits it, then the delta for
that change will have a commitid in it.

If the intent of the administrator was to avoid having ANY deltas in the
repository with the commitid newphrase, then this one file will be an
exception.

So, if you wish to default to UseCommitID=yes for CVS, then you probably
also need to provide a 'cvs admin' switch that will remove commitid
phrases for given revisions of files.

Is avoiding commitid really worth all of this trouble? If so, then
allowing the administrator to rip out any uses of it after the fact also
seems needed.

Comments on the patch... if UseCommitID=no I would have expected that to
just deal with the generation of new commitid keywords, not the display
of log messages or versions that have it. So, I would have expected it
to control import.c and rcs.c output, but I would NOT have expected it
to be quiet if it finds a commitid field present in the delta. I would
also expect that the new .commitid tag processing would work if there
were delta records with a commitid in them regardless of the UseCommitID
value.
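
The split described here (the option governs writing only, while
reading stays unconditional) can be sketched in Python as follows. This
is not the patch under discussion: the accepted values ("yes", "true",
"on"), the default, and both function names are assumptions made for
illustration.

```python
def use_commit_id(config_text, default=True):
    """Return whether new deltas should get a commitid.

    config_text is the text of a CVSROOT/config file.
    """
    enabled = default
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "UseCommitID":
            # Assumed spellings of "enabled"; anything else disables.
            enabled = value.strip().lower() in ("yes", "true", "on")
    return enabled


def log_lines(delta):
    """Show an existing commitid no matter what UseCommitID says."""
    lines = ["revision " + delta["revision"]]
    if "commitid" in delta:
        # Display is unconditional; only generation is configurable.
        lines.append("commitid: " + delta["commitid"])
    return lines
```

Only the code paths that create new deltas would consult
`use_commit_id`; log output would never look at it.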

Summary: I can see the (marginal) utility of adding a way to avoid
creating new commitid tags in the RCS files of the CVS repository. I cannot
see any benefit in suppressing new CVS functionality for revisions of
files that use them.

Therefore, I object to Peter's patch as provided.

-- Mark




Re: Patch for making CommitID configurable

2005-04-27 Thread Mark D. Baushke
Peter Backes [EMAIL PROTECTED] writes:

 On 27 Apr 2005 at 11:08, Mark D. Baushke wrote:
 
  global_session_id = Xasprintf ("%x%08lx%04x", (int)getpid(),
    (long)time (NULL), rand()&0xFFFF);
 
 (long)time(NULL), getpid(): not portable.

That one requires supporting documentation.

Which platforms do not provide them? CVSNT and CVS both use them
extensively across all of our supported platforms.

I will grant that time() may eventually (in 2038, on systems with a
signed 32-bit time_t) need to be revisited when the UNIX epoch wraps.
I would support 64bit time if there were an easy way to specify it. On
some systems, truncating getpid() to an int may be less useful if
sizeof(pid_t) is larger than sizeof(int). If you know of any such
systems, we could consider going to a larger type for the Xasprintf()
call.

 rand()&0xFFFF: makes commit IDs probabilistic, not unique.

True enough.

 
 Why not simply keep a counter in a file which is being increased on 
 each commit and used as the commit ID?  This avoids the probabilistic 
 aspect and is entirely portable.  It was also the solution used for 
 rcsfreeze.  The location could be a config file option.

This would create a hot-spot for contention of all cvs commits for the
repository. In very large and very busy repositories this would be a
nightmare.

If you want a 'better' global_session_id, then perhaps doing a SHA256
hash of all of the files being committed in this session would be more
unique... of course, that is problematic for other reasons.

  We are restricting the use of a '.' in the commitid, but we could
  probably relax the encoding to allow a '%' sign and use a %2e escape
  if you wanted to add in a FQDN for the hostname.
 
 What is the reason for '.' being disallowed?

See the discussion on being able to use a commitid in a CVS tag in the
archives.

  CVS has added stuff to the RCS format in the past, even though those
  options are usually disabled: permissions and hardlinks.
 
 I would love to see any new feature added to the RCS file format
 being handled in exactly the same way.

The same way as your optional CVSROOT/config addition with it disabled
by default?

  CVSNT has added commitid and mergepoint newphrases. It is entirely
  possible that CVS could add support for a mergepoint newphrase in some
  future release.
 
  CVSNT has also extended the -k flags to allow for UTF-8 characters.
 
 I guess -k and mergepoint are only being written on user request.  
 (for example if he sets the new keyword or if he requests a 
 mergepoint to be written explicitly.)  This is entirely okay.  The
 difference is that commitid is being written on each commit, while it 
 was not like that in the past, and currently in a way that does not 
 allow the user to prevent that.

It happens when users do a 'cvs update -j branch-tag' command. See
http://www.cvsnt.org/wiki/MergePoint for details. So, it is not really
very explicit on the part of the user in some sense.

Enjoy!
-- Mark




Re: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hello,

On 27 Apr 2005 at 12:31, Mark D. Baushke wrote:

 Summary: I can see the (marginal) utility of adding a way to avoid
 creating new commitid tags in the RCS files of the CVS repository. I cannot
 see any benefit in suppressing new CVS functionality for revisions of
 files that use them.

I agree with your arguments.  The cvs init thing is a nasty chicken-
and-egg problem.  A command line option is necessary for cvs init to 
specify the CommitID value to create the CVSROOT with and to write 
into CVSROOT/config.  I'd imagine something like 

cvs -d ... init -o CommitID=yes

You are also right in that if some revision has a commit ID, it 
should be displayed regardless of the CommitID option.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





Re: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hello,

On 27 Apr 2005 at 12:48, Mark D. Baushke wrote:

  (long)time(NULL), getpid(): not portable.
 That one requires supporting documentation.
 
 Which platforms do not provide them? CVSNT and CVS both use them
 extensively across all of our supported platforms.

I can only say that Standard C doesn't specify any type for time_t, 
it is entirely opaque and can be implemented as a struct.  I think I 
have read some systems choose double.  getpid() is POSIX.

But it's not about CVS, it's about the file format.  A portable file 
format should not contain any information which cannot be 
created/processed in a portable way.  

  Why not simply keep a counter 
 This would create a hot-spot for contention of all cvs commits for the
 repository. In very large and very busy repositories this would be a
 nightmare.

Locking one file as small as about 6 bytes, reading, writing and 
unlocking it?  I guess it wouldn't be a bottleneck, but it can only 
be told by measuring.
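
For anyone who wants to measure it, a counter of this kind could look
roughly like the following Python sketch. It is not what rcsfreeze
does internally: it assumes a POSIX system where `fcntl.flock` works
on the repository file system, and the function name and counter path
are hypothetical.

```python
import fcntl
import os


def next_commit_id(counter_path):
    """Atomically increment the counter in counter_path and return it."""
    fd = os.open(counter_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Block until we hold an exclusive advisory lock on the file.
        fcntl.flock(fd, fcntl.LOCK_EX)
        raw = os.read(fd, 64).strip()
        value = int(raw.decode("ascii")) + 1 if raw else 1
        # Rewrite the file in place with the new value.
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, str(value).encode("ascii"))
        return value
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

Timing a loop of such calls from several concurrent processes would
show whether the lock really becomes a bottleneck on a busy
repository.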

 If you want a 'better' global_session_id, then perhaps doing a SHA256
 hash of all of the files being committed in this session would be more
 unique... of course, that is problematic for other reasons.

Doesn't work.  If you undo a change and commit the result, you'd get 
duplicate commit IDs.  It would also be extremely slow.
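
The duplicate is easy to reproduce with a small Python sketch. It
assumes the ID would be nothing but a SHA-256 digest over the committed
file names and contents; the function name and the sample file are made
up.

```python
import hashlib


def content_session_id(files):
    """Hash the committed file names and contents into one hex digest.

    files: dict mapping file name (str) to file contents (bytes).
    """
    digest = hashlib.sha256()
    for name in sorted(files):
        data = files[name]
        # Length-prefix each field so different splits cannot collide.
        for field in (name.encode("utf-8"), data):
            digest.update(str(len(field)).encode("ascii") + b":")
            digest.update(field)
    return digest.hexdigest()


first = content_session_id({"foo.c": b"int x = 1;\n"})
second = content_session_id({"foo.c": b"int x = 2;\n"})
reverted = content_session_id({"foo.c": b"int x = 1;\n"})
```

Here `first` and `reverted` are equal although they belong to two
different commits, while `second` differs from both.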

Also, compare-by-hash is not an option for a general SCM, because if
researchers working on hash collisions broke some hash function and
had their related files under revision control, the effects could be
disastrous.  I already noticed such effects working with common file
distribution tools (which use compare-by-hash) on files in my crypto
folder, some of which are recently discovered hash collisions.

  I guess -k and mergepoint are only being written on user request.  
 It happens when users do a 'cvs update -j branch-tag' command. See
 http://www.cvsnt.org/wiki/MergePoint for details. So, it is not really
 very explicit on the part of the user in some sense.

Then if it should be implemented in CVS, IMO it should be done in a
way that makes its creation more explicit.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





Re: Patch for making CommitID configurable

2005-04-27 Thread Peter Backes
Hello,

On 27 Apr 2005 at 18:11, Derek Price wrote:

 This information doesn't need to be processed in any non-opaque way
 once created, but uniqueness is an argument.  Once created, it can be
 passed, basically, as a tag to CVS, at which point only uniqueness
 matters, not where the unique value came from.

In real world conditions, uniqueness of the current solution is 
questionable, as often computer clocks are wrong and are adjusted on 
the fly by several minutes, for example with ntpdate.  This is not 
recommended, but it happens.  Also repositories are sometimes copied 
to a different machine with a differing clock.  If a repository is 
under heavy load (several commits per second) under situations where 
the clock might be changed into the past, PID and randomness are the 
only protection against collisions.
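
To put a rough number on the randomness part, here is a Python sketch
of the usual birthday-style estimate. It assumes the random field is 16
bits wide and uniformly distributed, which may be optimistic for
rand(), and it only looks at IDs that already share the same PID and
timestamp fields.

```python
def collision_probability(ids_in_same_slot, random_values=0x10000):
    """Chance that at least two IDs repeat the random field.

    ids_in_same_slot: number of IDs that share both the PID field and
    the timestamp second (for example after the clock was stepped back).
    """
    no_collision = 1.0
    for i in range(ids_in_same_slot):
        # The (i+1)-th ID must avoid the i values already taken.
        no_collision *= max(0.0, (random_values - i) / random_values)
    return 1.0 - no_collision
```

Under these assumptions two such IDs collide with probability 1/65536,
and the chance grows quickly as more IDs fall into the same slot.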

Why not measure speed of an rcsfreeze-like solution where commit IDs 
are guaranteed to be unique?  I'm interested if there's really a 
bottleneck.

 I really dislike the idea of making the user request it be enabled,
 unless there is a darn good reason.  I do not yet consider a few
 warning messages from RCS a darn good reason. 

I see.  If all persuasion fails, the only solution is another fork of
CVS; this seems unavoidable to me.  The time spent discussing is
probably better spent programming.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





Re: Patch for making CommitID configurable

2005-04-26 Thread Mark D. Baushke
Derek Price [EMAIL PROTECTED] writes:

 Peter Backes wrote:
 On 21 Apr 2005 at 10:02, Derek Price wrote:
 I really want to make this the default behavior, even after an
 upgrade.  Would this do the trick for you?
 
 What do you mean exactly?  new->use_commit_id = true instead of
 false, i.e. have commit IDs written even for old repositories after
 an update?  I'd not recommend this.  It violates the principle of
 least astonishment (like the current solution).  However, at least it
 was better than the current situation.
 
 I'm not sure I believe that.  I think most people would prefer to not
 have to jump through any hoops to see new features enabled, especially
 when dealing with feature releases, and I don't think all that many
 people run RCS inside their CVS repositories anymore.  Any dissenting
 opinions other than Peter's out there?

I am given to understand that WebCVS and ViewCVS can run 'rcs' commands
like rlog as a way to glean information out of the local repository
without running 'cvs' commands on the ,v files.

 Why is RCS whining anyhow?  The spec leaves room for newphrases.  Is it
 requesting an upgrade?

If you pass the -q switch to the RCS 5.7 commands, they will not whine
about unknown phrases. The warning message spews from the
rcs-5.7/src/rcslex.c::warnignore() function which is called from
rcs-5.7/src/rcslex.c::getphrases() when NextString is not one of the
known keywords.

A solution would be to ensure that the '-q' switch is being passed to
any rcs command that might run into a newphrase like 'commitid'

% rlog foo,v
rlog: foo,v: warning: Unknown phrases like `commitid ...;' are present.
...
% rcsdiff -r1.1 -r1.2 foo,v
rcsdiff: foo,v: warning: Unknown phrases like `newphrase ...;' are present.
===
RCS file: foo,v
retrieving revision 1.1
retrieving revision 1.2
diff -r1.1 -r1.2
26a27
> #
%

See also the threads here:

  http://www.cvsnt.org/pipermail/cvsnt/2003-February/005206.html
  http://mailman.lyra.org/pipermail/viewcvs/2003q1/001713.html
  http://www.akhphd.au.dk/~bertho/cvsgraph/viewcvs.cgi/cvsgraph/rcsl.l?rev=1.6

One solution might be to modify lib/vclib/bincvs/__init__.py to
use commands like 'co', 'rlog' with different command-line args than

fp = self.rcs_popen('co', (rev_flag, full_name), 'rb')
fp = self.rcs_popen('rlog', args, 'rt', 0)

or the others that exist.

Another alternative might be to provide a patch to the RCS maintainer
to support any of the newphrase formats that CVS has introduced...

Enjoy!
-- Mark
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFCbezU3x41pRYZE/gRAq2KAKCg9gtNeRyUVhgA1aSRMlpBL/m1QQCfSHPH
o1NmVR1kyrB/uucNfTBFeTA=
=pUMj
-END PGP SIGNATURE-


___
Bug-cvs mailing list
Bug-cvs@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-cvs


Re: Patch for making CommitID configurable

2005-04-26 Thread Peter Backes
On 26 Apr 2005 at 0:25, Mark D. Baushke wrote:

 Derek Price [EMAIL PROTECTED] writes:
  I'm not sure I believe that.  I think most people would prefer to not
  have to jump through any hoops to see new features enabled, especially
  when dealing with feature releases, and I don't think all that many
  people run RCS inside their CVS repositories anymore.  

But if in doubt, in just these situations the principle of least 
astonishment gives us the answer!

Just think about what *is* the big advantage of CVS besides working 
on RCS files instead of a strange ever-changing file format?  I don't 
see any.  If you do not need this core feature, there are much better 
alternatives to pick from, like Subversion, and they provide much 
more than just writing a commit ID without ever using it!

Having said this, it is obvious that it should also be a question of 
whether CommitID should be kept as a feature *at all*.  It is much 
better to use the loginfo feature with the already present commitlog 
and then pick the changeset from this one (and that is already 
possible entirely without any change to cvs itself!) instead of 
relying on writing some hard-to-retrieve info into rcs files.  With a 
little bit of thinking, you can even find a way to accept the 
commitlog remotely entirely using existing cvs features!  So please 
think a little bit before using CommitID blindly.

When does a commit ID make sense?  Only in a distributed environment, 
where a revision must be identified uniquely independent of its 
number.  Is the current CommitID usable for this? No, because a 
collision is not entirely unlikely (commit needs to happen at the 
same time with the same PID and the same random number, but I only 
see this to be safe enough to ensure unique IDs for the same 
machine.)  Some well-known commercial SCM for example includes the 
host name in the ID.  Besides that, CVS is currently not distributed 
at all.  And, if I have a certain CommitID and a file that didn't 
change in this commit, it is impossible to say which revision of the 
file was actually current at the time of the commit.
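
To make the collision argument concrete, here is a small sketch. The ID
layout (hex PID, hex time, hex random number) and the host prefix are
assumptions made for illustration only; this is not claimed to be the
exact format CVS or CVSNT writes, and the host names are made up.

```python
def make_commit_id(pid, timestamp, rand, host=None):
    """Build an illustrative commit ID from PID, time and a random number.

    If 'host' is given, it is prefixed so that IDs from different
    machines cannot be equal even when the other inputs match.
    """
    base = "%x%08x%04x" % (pid, timestamp, rand)
    return base if host is None else "%s-%s" % (host, base)


# Same PID, same second, same random number on two different machines.
inputs = (1234, 1114500000, 42)
id_a = make_commit_id(*inputs)
id_b = make_commit_id(*inputs)
id_a_host = make_commit_id(*inputs, host="hostA")
id_b_host = make_commit_id(*inputs, host="hostB")
```

Without the host part the two IDs are identical; with it they differ.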

And isn't the commit time already unique per commit? (If the commit 
takes longer than a second?  Can there be more than one commit per 
second?) I don't know, but you should check this--it might be just 
the feature that you now try to implement a second time.

How about tags?  Don't they also provide just the solution to the 
problems commit ID was added for?

But I don't really know the intentions of the commit ID.  What is 
its purpose?  Maybe I understand it better then.

 I am given to understand that WebCVS and ViewCVS can run 'rcs' commands
 like rlog as a way to glean information out of the local repository
 without running 'cvs' commands on the ,v files.

I didn't know about these problems, but it just shows the principle 
of least astonishment is not misplaced here.  If I am in a hurry, 
and need to update cvs because of the security bug and suddenly my 
webcvs stops working, this is not how it is supposed to be, is it?

 A solution would be to ensure that the '-q' switch is being passed to
 any rcs command that might run into a newphrase like 'commitid'

That's not a solution; it actually makes things worse!  Assuming there 
is a *real* problem rcs was about to tell me (like a corrupted ,v 
file for whatever reason), I'd possibly not notice because of the 
'-q'.  This is like adding casts in C programs just to shut off the 
compiler.

 Another alternative might be to provide a patch to the RCS maintainer
 to support any of the newphrase formats that CVS has introduced...

Please don't ...  RCS is stable, and the files it writes have been 
the same for years and it operates on *one* file at a time (it can 
actually process several, but only in a way that is more or less 
equivalent to a for loop, and this is certainly not to suggest any 
relationship at all between the files being processed.)  I cannot think 
of any feature worth adding to rcs that would not be a step backwards.  
So please don't change rcs anymore except for bug fixes!  

CVS is to be built upon the things given by RCS, not the other way 
around.  Subversion is the better solution to all problems that might 
make a CommitID reasonable.

BTW, what was actually the reason for merging all the tools (rcs, 
diff3, etc.) into cvs?  I think I have read there was something with 
branches that could not be done by invoking rcs, can you please tell 
me what this was exactly?

Independent from this, I think my patch is a good idea.  It leaves 
the choice to the user.

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]





Re: Patch for making CommitID configurable

2005-04-26 Thread Mark D. Baushke
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Derek Price [EMAIL PROTECTED] writes:

 Mark D. Baushke wrote:
  I am given to understand that WebCVS and ViewCVS can run 'rcs' commands
  like rlog as a way to glean information out of the local repository
  without running 'cvs' commands on the ,v files.
 
 Okay, but the threads you cite seem to indicate that they have dealt
 with the problems by passing -q into RCS or tweaking their parsers.  I
 still don't see any reason not to enable commitids by default on feature.

Okay. I suppose it may be desirable to make note of the possible
problems folks might see if they are using older RCS tools in our CVS
documentation, but this thread may be useful enough for that purpose.

-- Mark
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFCbpW73x41pRYZE/gRAv85AKCxmFJru81qiaSDC29xpQV766+AyACZAcpB
6e3Bpt0FAjp/z0nKbnL/+/4=
=p81z
-END PGP SIGNATURE-




RE: Patch for making CommitID configurable

2005-04-26 Thread Jim.Hyslop
Peter Backes wrote:
 Just think about what *is* the big advantage of CVS besides working 
 on RCS files instead of a strange ever-changing file format?
ever-changing? I think you're exaggerating here. When was the last time
the RCS file format changed?

What's the point in having the rcsfile(5) specification have the
newphrases spec, if you aren't going to use it?
(http://www.daemon-systems.org/man/rcsfile.5.html)

 I don't 
 see any.  If you do not need this core feature, there are much better 
 alternatives to pick from, like Subversion, and they provide much 
 more than just writing a commit ID without ever using it!
Incrementally adding a new feature is a lot less of a change, and a lot less
drastic, than switching to an entirely new system.

The way you're talking, it sounds as if you are saying that, once a program
is released, it should never change, and if you want new features you should
write a whole new, different program to add those features. Is that really
what you're proposing?

 Having said this, it is obvious that it should also be a question of 
 whether CommitID should be kept as a feature *at all*.
No, it is not obvious at all. It is only obvious if one is intent on keeping
the status quo.

 It is much 
 better to use the loginfo feature with the already present commitlog 
 and then pick the changeset from this one (and that is already 
 possible entirely without any change to cvs itself!) instead of 
 relying on writing some hard-to-retrieve info into rcs files.
For what definition of better? Better for _you_, perhaps, but not for the
dozens or hundreds of users (like me) who _want_ this feature.

Using commitinfo requires 
- each and every installation to make the same changes to their existing
commitinfo scripts; this requires hundreds of hours of wasted, duplicated
effort. Sure, you could make a generic commitinfo script available - but if
anyone already *has* a commitinfo script, then they won't be able to use the
canned one.

- tracking the commit ID in a separate database. Separating the commit
information (i.e. the ID and the log) is not a good idea.

 When does a commit ID make sense?  Only in a distributed environment, 
 where a revision must be identified uniquely independent of its 
 number.
Huh? Not! The commit ID allows me to determine which files were committed in
a single operation. It has no bearing on distributed environments. I'm
looking at a file, and the log says "Fixed such-and-such a bug." I want to
know what other files were committed at the same time. Currently, I have to
rely on narrowing it down by timestamp, using a difficult-to-remember, not
very obvious technique for entering timestamps. Or, I could just type in the
commitID and there it is.
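
As a rough sketch of that lookup: given (file, revision, commitid) records
pulled from the log output by some means, grouping by commit ID is a
one-pass operation. The record layout, the function name and the sample
data are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def files_by_commit(records):
    """Group (path, revision, commit_id) records by commit ID."""
    groups = defaultdict(list)
    for path, revision, commit_id in records:
        groups[commit_id].append((path, revision))
    return dict(groups)


# Made-up records: two files committed together, one committed separately.
records = [
    ("src/main.c", "1.5", "abc123"),
    ("src/util.c", "1.9", "abc123"),
    ("doc/README", "1.2", "def456"),
]
groups = files_by_commit(records)
same_commit = groups["abc123"]
```

Looking up one commit ID then returns every file and revision that was
part of that commit, with no timestamp arithmetic involved.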

 How about tags?  Don't they also provide just the solution to the 
 problems commit ID was added for?
Assuming that each and every commit is tagged, perhaps. But that's not
necessarily the case, and it's certainly not a practise I would encourage.

  Another alternative might be to provide a patch to the RCS 
 maintainer
  to support any of the newphrase formats that CVS has introduced...
 
 Please don't ...  RCS is stable, and the files it writes have been 
 the same for years
And your point would be... what, exactly? Does RCS not have any kind of a
test suite to check for problems?

 So please don't change rcs anymore except for bug fixes!  
So, let me get this straight. Just because *you* don't see any value in a
new feature, you want to prevent everyone else from using that feature?
(that was a rhetorical question, by the way - I'm sure that is not what you
intended).

Can you provide some objective reasons for not changing RCS? So far, all
you've given us is an impassioned plea for keeping the status quo, without
really giving any substantial reasons.

 CVS is to be built upon the things given by RCS, not the other way 
 around.
CVS *was* originally built on RCS. Why should it remain that way forever?

-- 
Jim Hyslop
Senior Software Designer
Leitch Technology International Inc. ( http://www.leitch.com )
Columnist, C/C++ Users Journal ( http://www.cuj.com/experts )




Re: Patch for making CommitID configurable

2005-04-25 Thread Peter Backes
Hi,

On 21 Apr 2005 at 10:02, Derek Price wrote:

 I really want to make this the default behavior, even after an
 upgrade.  Would this do the trick for you?

What do you mean exactly?  new->use_commit_id = true instead of 
false, i.e. have commit IDs written even for old repositories after 
an update?  I'd not recommend this.  It violates the principle of 
least astonishment (like the current solution).  However, it would at 
least be better than the current situation.  

BTW, I forgot to mention the patch also reverts the change in format 
of cvs log (it erroneously appended ';' sometimes) back to how it was 
before.  This was introduced silently in 1.12.12 and was not noticed 
because sanity.sh was changed accordingly.  

Also, the ChangeLog entry is probably missing some files I changed.  I 
was in a hurry, sorry ...  

There's a suspected bug in cvs: it fails to read 'config' (or to 
initialize the defaults if no config file is found) for :fork: 
repositories, see the XXX comment.  It should be triggered if preserve 
permissions is turned on, in init-1 in remotecheck, but I didn't verify 
that. 

-- 
Peter 'Rattacresh' Backes, [EMAIL PROTECTED]


