Re: Does CVS work with unicode files?

2000-07-11 Thread John Macdonald

Pavel Roskin wrote :
|| Hello!
|| 
|| On Tue, 11 Jul 2000, Guus Leeuw wrote:
|| 
||From: ccyf [mailto:[EMAIL PROTECTED]]
||If files contain non-ascii characters, do cvs commands like diff 
||still work?
||  
||  Nope. There are two possible ways of dealing with these:
||  1. Just check them in as ASCII, and possibly get them corrupted...
|| as CVS doesn't understand UNICODE
||  2. cvs add -kb unicode-file, so that CVS thinks the file is binary
|| and leaves it alone (doesn't try to do *anything* with
|| the contents of the file).
|| 
|| I am by no means an expert in Unicode, but shouldn't UTF be of some help
|| here? I believe that the line endings in UTF are normal UNIX line endings.
|| UTF contains no characters that could confuse CVS.
|| 
|| It is important that CVS understands files as sets of lines, so that it
|| can make diffs and merge sources.

Unicode values are large integers, but they can be represented in a
number of different ways.  The most useful way for this purpose is
UTF8, which uses one or more 8-bit bytes for each Unicode character.
A Unicode character whose value is in the range 0 to 127 is stored
unchanged as a single byte and has the same meaning as the ASCII
character of the same value.  Larger Unicode values are stored in
multiple bytes - and each of those bytes is in the range 128 to 255,
so they will not be confused with normal ASCII values in the 0 to 127
range.
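
A minimal sketch of the byte ranges described above, assuming a POSIX
shell with printf, od, tr and sed available.  The function names
byte_values and all_bytes_high are made up for this illustration, and
the sample character (U+00E9, e with an acute accent) is just one
convenient two-byte case.

```shell
# Print the decimal value of every byte in a string, one per line.
byte_values() {
    printf '%s' "$1" | od -An -v -tu1 | tr -s ' ' '\n' | sed '/^$/d'
}

# Succeed only if no byte of the string is in the ASCII range 0 to 127.
all_bytes_high() {
    for b in $(byte_values "$1"); do
        [ "$b" -ge 128 ] || return 1
    done
    return 0
}

# U+00E9 written out as its two UTF-8 bytes (octal 303 251).
e_acute=$(printf '\303\251')
byte_values "$e_acute"
```

Both bytes printed for the accented character are above 127, while a
plain ASCII letter such as "A" makes all_bytes_high fail, which is the
property that keeps the two ranges from being confused.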

As long as CVS doesn't take any special notice of any values above
127, and as long as the Unicode is stored in UTF8, there should be no
problem.

BUT: I haven't tried this myself.  I'm speaking from theory, not
direct experience.  CAVEAT EMPTOR.

-- 
Sleep should not be used as a substitute| John Macdonald
for high levels of caffeine  -- FPhlyer |   [EMAIL PROTECTED]




Re: .trunk patch refinement

2000-06-20 Thread John Macdonald

Russ Allbery wrote :
|| David Thornley [EMAIL PROTECTED] writes:
|| 
||  Either way, any technique that assumes that all main trunk development
||  is on rev numbers 1.* is useless to me, and probably to quite a few
||  people.
|| 
|| And it's quite possible to get into that state without any misuse of CVS
|| at any point.  It's worth remembering that a lot of us are using CVS
|| repositories formed from imported RCS files, and using different rev
|| numbers with RCS is extremely common.

The sccs2rcs script in contrib retains such version numbers when it
converts - many of my cvs managed files have version numbers greater
than 1 because of this.

With sccs, if you use -r9 (or -r99 or -r, whatever number of
nines is needed to guarantee that it is bigger than the version
number of any file being managed) it is the same as if you had used
the current version for each file.  I recall hearing that this
doesn't work for rcs, but fixing whatever problem there is with this
would make -r9 a workable approach.  But making .trunk work is a
better approach.





Re: question (preference?) about xmalloc

2000-05-04 Thread John Macdonald

Paul Sander wrote :
|| 
|| An "isValid" field isn't valid if the structure isn't initialized properly.
|| When this is a concern, the value of the pointer itself becomes significant
|| (like it is with the NULL pointer, but when its value must also be
|| distinguishable from NULL).  In such cases, a pointer to block of zero bytes
|| makes sense to some people.  (In some such cases, it's just as valid to assign
|| to the pointer the address of a statically allocated structure.  But sometimes
|| it's not known in advance how many such pointers are needed, so the structures
|| must be dynamically allocated.)
|| 
|| As for how to tell it apart from a "good" allocation, the pointer is usually
|| stored in some well-known place at the time the allocation is done, usually
|| when a program or structure is being initialized.  An arbitrary pointer's
|| value is compared with that stored in the well-known place when the special
|| value is of interest.

But if no memory is actually allocated, how do you ensure that the
address of the zero-length chunk is different?  Since it didn't use
any memory, the same memory address is still free and is likely to be
returned as the result of the next allocation request too.  Rather
than trust the alloc routine to carry out an impossible task for
which it was not designed, you are better off allocating a real
memory area for unique flag values to ensure they are unique.





Re: Spaces in filenames?

2000-03-23 Thread John Macdonald

Tony Hoyle wrote :
|| I thought of changing loginfo parameters so they are comma separated... then you
|| have the same problem, but with commas instead
|| of spaces (although commas are far less common in filenames than spaces).

A trailing ",v" on filenames is ubiquitous in CVS repositories.  That
might not actually cause problems since CVS commands always deal with
the original name instead of the RCS name, but it is at least one
place where commas are far more common than spaces in filenames.

-- 
Anyone who can't laugh at himself is not    | John Macdonald
taking life seriously enough -- Larry Wall  |   [EMAIL PROTECTED]




Re: cvs add *

2000-02-17 Thread John Macdonald

Alex Chaffee wrote :
|| I'm afraid to bring up a new thread involving "cvs add." Please don't
|| anyone start saying mean things about my family, OK?
|| 
|| I just had the experience of bringing a large set of Java source files
|| under CVS revision control.  I couldn't use "cvs import" because they
|| had a lot of overlapping subdirectories (Java package hierarchies).  I
|| ended up doing "cvs add *" a lot.  This led to two big problems:
|| 
|| 1. When one of the * arguments was a directory, and that directory
|| already had a "CVS" directory, the add was *aborted*, and the rest of
|| the files on the command line failed to be added.  All other cvs
|| commands that take multiple arguments treat each argument
|| independently; if there's a failure it just spits out a warning and
|| then proceeds to try the rest of the arguments.  The aborting behavior
|| of "add" led to inconsistencies I had to resolve manually -- I had to
|| say "cvs add `find * -type f -maxdepth 0`" or other such nonsense.
|| 
|| Are there any special reasons why "cvs add" aborts for extant
|| directories, but not for extant files?

"cvs add" is not unique here.  "cvs commit", for example, is an all
or nothing command where being unable to process one argument aborts
the whole command.  For commit, this is clearly the right choice.
For the current implementation of add it is also the right choice
(because add currently modifies the repository), but as part of
removing that repository-changing behaviour it could certainly be
reconsidered.

To make add work individually on a list is actually not as hard as
the workaround you discuss with find above:

    for i in *
    do
        # quote the name so that files with spaces are passed intact
        cvs add "$i"
    done

which does a separate "cvs add" for each item, so no failure prevents
the other additions from succeeding.

|| 2. While it seems to deal OK with the basic case of "cvs add CVS", if
|| you have a "find" script running that tries the equivalent of "cvs add
|| CVS/Repository" or whatever, you get a bogus entry in Entries:
|| "D/CVS///". Then on "cvs update" it tries to create this directory,
|| leading to "foo/CVS/CVS/..." and other such nonsense, and many
|| spurious error messages and weirdness.

This sounds like a bug.  "cvs add CVS" should reject the add; and
then "cd CVS; cvs add Repository" should fail because there is no CVS
sub-directory.  Doing "cvs add CVS/Repository" is currently not
permitted.  How did you make it work?

|| Are there plans for cvs to do more strict argument checking to make
|| sure you're not mistakenly messing with the magic CVS directory?

At first glance, the existing checks ought to be enough, but
obviously you've found a way around them.

Any change that will permit "cvs add" to take an argument containing
a slash will have to be careful to preserve the existing check, as
will "cvs add foo" if it is going to add . and .. and ../.. etc as
managed directories (so it not only has to search up to find them but
also has to ensure that none of the names coming back down is 'CVS').

Hmm, here's an off the top of my head way of adding * while
precluding CVS sub-directories while living with the current "no
slash" restricted form of "cvs add":

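    find . -print \
    |   grep -v '/CVS$' \
    |   grep -v '/CVS/' \
    |   while IFS= read -r target
        do
            # quote everything so names containing spaces survive
            dir=`dirname "$target"`
            targ=`basename "$target"`
            (
                # only attempt the add if the cd succeeded
                cd "$dir" && cvs add "$targ"
            )
        done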

It lists everything below the current dir, removes from the list
anything named CVS or within a directory named CVS.  For each
remaining item it uses a sub-shell to cd into the enclosing directory
and add the item.  The enclosing directory might be '.', but there
always is one explicitly there in $target for dirname to find.  The
sub-shell is used so that the cd does not affect the processing of
subsequent items.




Re: iVS (Was: CVS File Locking)

2000-02-16 Thread John Macdonald

Paul Sander wrote :
|| --- Forwarded mail from [EMAIL PROTECTED]
|| 
|| Very yes. Though unix has extensionless files, the web and MIME are defacto
|| using suffixes for file type id.
|| 
|| Well, they do and they don't.  MIME provides a way of supplying the type of
|| some content along with the data itself.  That mechanism in itself does not
|| rely on file extensions.  However, certain software (such as email clients
|| and web servers) use lookup tables to map file extensions to MIME types on
|| those occasions where they must somehow conjure up a type without asking a
|| user for it.  But once a file is encoded with MIME, its original extension
|| becomes meaningless because its type is carried along explicitly.

I think that "mime type" is mostly a side-issue.  It gives a nice set
of names that an admin might want to use, but doesn't help much
more.  New files will have to be classified and the classification
that has been determined will have to be stored in a control file
(not embedded in a wrapper around the actual data the way mime is put
on mail).   The classification would be by the user providing an
explicit type or by an add hook examining the file (both the data
content and the filename/extension to the extent that the platform
provides those) and trying to classify it automatically.  The
classification hook should have a way of giving up, so that the
fallback position of asking the user is used.
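
As a rough sketch of a hook that can give up: the functions classify
and type_for below are hypothetical, not part of CVS, and the
extension list is only an example.  This version looks at the
filename alone (a real hook could also examine the data), and the
string "ask-user" merely stands in for prompting the user.

```shell
# Hypothetical add hook: print a type for the file, or return 1 to give up.
classify() {
    case "$1" in
        *.c|*.h|*.txt)      echo text ;;
        *.jpg|*.jpeg|*.gif) echo binary ;;
        *)                  return 1 ;;   # unknown name: give up
    esac
}

# Use the hook's answer when it has one; otherwise fall back to a
# placeholder that stands in for asking the user.
type_for() {
    classify "$1" || echo "ask-user"
}

type_for icon.jpeg
type_for README
```

The hook signals "don't know" through its exit status rather than by
guessing, so the caller decides what the fallback is.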




Re: CVS File Locking

2000-02-15 Thread John Macdonald

Greg A. Woods wrote :
|| 
|| [ On Tuesday, February 15, 2000 at 10:11:04 (-0500), John Macdonald wrote: ]
||  Subject: Re: CVS File Locking
|| 
||  Things that are infrequent are the ones that most need automated
||  checking to ensure they are done correctly.  Computers are good at
||  automating checks.  Insisting on human manual management is a good
||  way to ensure that it is done poorly and inconsistently.
|| 
|| There's no problem in this case.  CVS conflict detection is highly
|| automated and will catch all the goofs.

Detecting that a goof has happened, while useful, is far less
valuable than preventing it in the first place.  Instead of
committing a conflict that indicates wasted effort, the developer
gets told about it.  This is still too late to prevent the wasted
effort.  Depending upon how much effort has been wasted, there could
be a rather strong objection to your characterization of "no
problem".  Sometimes it may be a negligible problem, but it is never
no problem.




Re: CVS File Locking

2000-02-15 Thread John Macdonald

Greg A. Woods wrote :
|| [ On Monday, February 14, 2000 at 10:58:20 (-0500), John Macdonald wrote: ]
||  Subject: Re: CVS File Locking
|| 
||  File, not modules.  The (god forbid) MS Word documentation that has
||  to accompany each module, or the jpeg icon for the module, etc.  The
||  few binary files that are a part of the module.  Making them a
||  separate module ensures that they are poorly maintained.
|| 
|| If you're only talking about a few files then there's no real need for
|| hard locking at all.  Conflict detection of non-mergable files can be
|| done easily enough if you want to support that feature and for a few
|| files the logical approach is to simply assign responsibility for
|| changing those files to one or two specific developers who will learn to
|| avoid conflicts without locks, and outside of CVS.
|| 
|| CVS is not a substitute for management.  :-)

Things that are infrequent are the ones that most need automated
checking to ensure they are done correctly.  Computers are good at
automating checks.  Insisting on human manual management is a good
way to ensure that it is done poorly and inconsistently.

||  But you refuse to permit 99% concurrent along with 1% locked, forcing
||  100% locked instead.  That disenfranchises the converts, and prevents
||  any further preaching to the skeptics.
|| 
|| Bull.  Requiring locking for 1% of the files is total nonsense.
|| Co-ordinating changes to that small number of files, even in a
|| relatively large project, is very very very easily managed outside of
|| CVS.  I know from first-hand experience this is true.

I know from first-hand experience that this is where problems occur.

||  Or is "go to another tool" really saying "give up on concurrent even
||  though it is vastly better for almost everything you do".
|| 
|| Not exactly.  It's more like "I give up on trying to sell you the
|| benefits of concurrent development.  You're obviously not going to be
|| happy here right now but if you change your mind then please come back
|| then."

If you are unwilling to do things wrong for the last 1%, I won't let
you do them right for the first 99%.
