Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-13 Thread Michael Haggerty
On 01/13/2013 11:20 PM, Junio C Hamano wrote:
> After a quick survey of various distros, I think it is very unlikely
> that we will see "distros move on to newer cvsps, leaving cvsimport
> broken" situation. If anything, it is more like "distros decide to
> ignore the new cvsps, until it is made to work with cvsimport" [*1*].

A better predictor of the distros' decisions is probably which other
packages depend on cvsps.  As one data point: on Debian squeeze and on
Ubuntu precise, only two packages depend on cvsps (git-cvs and
bzr-cvsps-import) and one suggests it (chora2, "a code repository
viewing component for horde framework").  So also by this standard they
are unlikely to feel a lot of pressure to update quickly to cvsps3.

> I think it is probably sensible to [...]
> 
> Agreed?

Yes, I agree that what you propose is a good strategy.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-13 Thread Junio C Hamano
Junio C Hamano  writes:

> Jonathan Nieder  writes:
>
>> Michael Haggerty wrote:
>>
>>> Regarding your claim that "within a few months the Perl git-cvsimport is
>>> going to cease even pretending to work": It might be that the old
>>> git-cvsimport will stop working *for people who upgrade to cvsps 3.x*.
>>> But it is not realistic to expect people to synchronize their git and
>>> cvsps version upgrades.  It is even quite possible that this or that
>>> Linux distribution will package incompatible versions of the two packages.
>>
>> Moreover, I feel an obligation to point the following out:
>>
>> In a hypothetical world where cvsps 3.x simply breaks "git cvsimport"
>> it is likely that some distributions would just stick to the existing
>> cvsps and not upgrade to 3.x.  Maybe that's a wrong choice, but that's
>> a choice some would make.  An even more likely outcome in that
>> hypothetical world is that they would ship it renamed to something
>> like "cvsps3" alongside the existing cvsps.  Or they could rename the
>> old version to "cvsps2".  If we were the last holdout, we could even
>> bundle it as compat/cvsps.
>>
>> So please do not act as though the cvsps upgrade is a crisis that we
>> need to break ourselves for at threat of no longer working at all.
>> The threat doesn't hold water.
>>
>> Luckily you have already written patches to make "git cvsimport" work
>> with cvsps 3.x, and through your work you are making a better
>> argument: "The new cvsimport + cvsps will work better, at least for
>> some users, than the old tool."
>>
>> Just don't pretend you have the power to force a change for a less
>> sensible reason than that!
>
> After a quick survey of various distros, I think it is very unlikely
> that we will see "distros move on to newer cvsps, leaving cvsimport
> broken" situation. If anything, it is more like "distros decide to
> ignore the new cvsps, until it is made to work with cvsimport" [*1*].
>
> I think it is probably sensible to rename the current cvsimport to
> cvsimport-2, write a small wrapper git-cvsimport.sh which is
> something like this:
>
> - >8 -
> #!/bin/sh
>
> if test -z "$GIT_CVSPS_VERSION"
> then
>   case "$(cvsps -h 2>&1 | grep 'cvsps version')" in
> 2.*)
>   GIT_CVSPS_VERSION=2
> ;;
>   3.*)
>   GIT_CVSPS_VERSION=3
> ;;
>   esac
> fi
>
> if test -z "$GIT_CVSPS_VERSION" 
> then
>   echo >&2 "No supported cvsps available"
>   exit 1
> fi
>
> exec git cvsimport-$GIT_CVSPS_VERSION "$@"
> - 8< -
>
> and put Eric's one as git-cvsimport-3 (after ripping out the code to
> fallback to the old cvsimport).  The longer term trend will be to
> move away from cvsimport-2, as it is unlikely cvsps-2.x will gain
> improvements, if any; keeping fallback code outside cvsimport-3 will
> be a better first step in the healthier long term code evolution.
>
> We will keep the current t96xx series of tests, and have them export
> GIT_CVSPS_VERSION=2 at the beginning, protect them with test prereq
> that requires presence of cvsps 2.x; this will still make sure that
> the current cvsimport users will not see any regressions.
>
> Eric's one should be polished enough to produce good results on the

s/should be polished enough/should be in a polished enough state/
that is.  Also "if not right now" may better convey what I meant if
written "if not already".

> simpler sample CVS histories t96xx deal with soonish if not right
> now, so we can use a method similar to how we shared tests between
> blame and annotate while both were _different_ implementations to
> make sure the newer blame did not inroduce regression by running the
> same set of tests.  Where the result _ought_ to differ, we should
> also add tests that work only with the new cvsimport, of course.
>
> I could help getting the ball rolling on this, if everybody agrees
> that the above is a sensible direction to go, given the real world
> constraints of distro inertia.
>
> Agreed?
>
>
> [References]
>
> *1* Fedora, Debian and Ubuntu do not even have cvsps 3.x in their
> bleeding edges, OpenBSD and NetBSD do not seem to have it either,
> and Gentoo and ArchLinux have the cvsps 3.x blocked due to
> incompatiblity.
>
> http://pkgs.fedoraproject.org/cgit/cvsps.git/
> http://packages.debian.org/search?keywords=cvsps
> http://packages.ubuntu.com/search?keywords=cvsps
>
> http://packages.gentoo.org/package/dev-vcs/cvsps
> https://bugs.gentoo.org/show_bug.cgi?id=450424
>
> https://bugs.archlinux.org/task/33363?project=1&cat%5B0%5D=2&string=cvsps
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-13 Thread Junio C Hamano
Jonathan Nieder  writes:

> Michael Haggerty wrote:
>
>> Regarding your claim that "within a few months the Perl git-cvsimport is
>> going to cease even pretending to work": It might be that the old
>> git-cvsimport will stop working *for people who upgrade to cvsps 3.x*.
>> But it is not realistic to expect people to synchronize their git and
>> cvsps version upgrades.  It is even quite possible that this or that
>> Linux distribution will package incompatible versions of the two packages.
>
> Moreover, I feel an obligation to point the following out:
>
> In a hypothetical world where cvsps 3.x simply breaks "git cvsimport"
> it is likely that some distributions would just stick to the existing
> cvsps and not upgrade to 3.x.  Maybe that's a wrong choice, but that's
> a choice some would make.  An even more likely outcome in that
> hypothetical world is that they would ship it renamed to something
> like "cvsps3" alongside the existing cvsps.  Or they could rename the
> old version to "cvsps2".  If we were the last holdout, we could even
> bundle it as compat/cvsps.
>
> So please do not act as though the cvsps upgrade is a crisis that we
> need to break ourselves for at threat of no longer working at all.
> The threat doesn't hold water.
>
> Luckily you have already written patches to make "git cvsimport" work
> with cvsps 3.x, and through your work you are making a better
> argument: "The new cvsimport + cvsps will work better, at least for
> some users, than the old tool."
>
> Just don't pretend you have the power to force a change for a less
> sensible reason than that!

After a quick survey of various distros, I think it is very unlikely
that we will see "distros move on to newer cvsps, leaving cvsimport
broken" situation. If anything, it is more like "distros decide to
ignore the new cvsps, until it is made to work with cvsimport" [*1*].

I think it is probably sensible to rename the current cvsimport to
cvsimport-2, write a small wrapper git-cvsimport.sh which is
something like this:

- >8 -
#!/bin/sh

if test -z "$GIT_CVSPS_VERSION"
then
case "$(cvsps -h 2>&1 | grep 'cvsps version')" in
2.*)
GIT_CVSPS_VERSION=2
;;
3.*)
GIT_CVSPS_VERSION=3
;;
esac
fi

if test -z "$GIT_CVSPS_VERSION" 
then
echo >&2 "No supported cvsps available"
exit 1
fi

exec git cvsimport-$GIT_CVSPS_VERSION "$@"
- 8< -

and put Eric's one as git-cvsimport-3 (after ripping out the code to
fallback to the old cvsimport).  The longer term trend will be to
move away from cvsimport-2, as it is unlikely cvsps-2.x will gain
improvements, if any; keeping fallback code outside cvsimport-3 will
be a better first step in the healthier long term code evolution.

We will keep the current t96xx series of tests, and have them export
GIT_CVSPS_VERSION=2 at the beginning, protect them with test prereq
that requires presence of cvsps 2.x; this will still make sure that
the current cvsimport users will not see any regressions.

Eric's one should be polished enough to produce good results on the
simpler sample CVS histories t96xx deal with soonish if not right
now, so we can use a method similar to how we shared tests between
blame and annotate while both were _different_ implementations to
make sure the newer blame did not inroduce regression by running the
same set of tests.  Where the result _ought_ to differ, we should
also add tests that work only with the new cvsimport, of course.

I could help getting the ball rolling on this, if everybody agrees
that the above is a sensible direction to go, given the real world
constraints of distro inertia.

Agreed?


[References]

*1* Fedora, Debian and Ubuntu do not even have cvsps 3.x in their
bleeding edges, OpenBSD and NetBSD do not seem to have it either,
and Gentoo and ArchLinux have the cvsps 3.x blocked due to
incompatiblity.

http://pkgs.fedoraproject.org/cgit/cvsps.git/
http://packages.debian.org/search?keywords=cvsps
http://packages.ubuntu.com/search?keywords=cvsps

http://packages.gentoo.org/package/dev-vcs/cvsps
https://bugs.gentoo.org/show_bug.cgi?id=450424

https://bugs.archlinux.org/task/33363?project=1&cat%5B0%5D=2&string=cvsps
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-12 Thread Jonathan Nieder
Michael Haggerty wrote:

> Regarding your claim that "within a few months the Perl git-cvsimport is
> going to cease even pretending to work": It might be that the old
> git-cvsimport will stop working *for people who upgrade to cvsps 3.x*.
> But it is not realistic to expect people to synchronize their git and
> cvsps version upgrades.  It is even quite possible that this or that
> Linux distribution will package incompatible versions of the two packages.

Moreover, I feel an obligation to point the following out:

In a hypothetical world where cvsps 3.x simply breaks "git cvsimport"
it is likely that some distributions would just stick to the existing
cvsps and not upgrade to 3.x.  Maybe that's a wrong choice, but that's
a choice some would make.  An even more likely outcome in that
hypothetical world is that they would ship it renamed to something
like "cvsps3" alongside the existing cvsps.  Or they could rename the
old version to "cvsps2".  If we were the last holdout, we could even
bundle it as compat/cvsps.

So please do not act as though the cvsps upgrade is a crisis that we
need to break ourselves for at threat of no longer working at all.
The threat doesn't hold water.

Luckily you have already written patches to make "git cvsimport" work
with cvsps 3.x, and through your work you are making a better
argument: "The new cvsimport + cvsps will work better, at least for
some users, than the old tool."

Just don't pretend you have the power to force a change for a less
sensible reason than that!

Hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-12 Thread Jonathan Nieder
Hi Eric,

Eric S. Raymond wrote:

> But in practice the git crew was going to lose that
> capability anyway simply because the new wrapper will support three
> engines rather than just one.  It's not practical for the git tests to
> handle that many variant external dependencies.

See the git-blame/git-annotate tests for an example of how the
testsuite handles "variations on a theme".

It works fine.

Hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-12 Thread Eric S. Raymond
Michael Haggerty :
> Otherwise, how do we know that cvsps currently works with git-cvsimport?
> (OK, you claim that it does, but in the next breath you admit that
> there is a new failure in "one pathological tagging case".)  How can we
> understand its strengths/weaknesses?  How can we gain confidence that it
> works on different platforms?  How will we find out if a future versions
> of cvsps stops working (e.g., because of a breakage or a
> non-backwards-compatible change)?

You can't.  But in practice the git crew was going to lose that
capability anyway simply because the new wrapper will support three
engines rather than just one.  It's not practical for the git tests to
handle that many variant external dependencies.

However, there is a solution.

The solution is for git to test that the wrapper is *generating the
expected commands*.  So what the git tree ends up with is conditional
assurance; the wrapper will do the right thing if the engine it calls
is working correctly.  I think that's really all the git-tree tests
can hope for.

Michael, the engines are my problem and yours - it's *our*
responsibility to develop a (hopefully shared) test suite to verify
that they convert repos correctly.  I'm working my end as fast as I can;
I hope to have the test suite factored out of cvsps and ready to check 
multiple engines by around Wednesday.  I still need to convert t9604,
too.

I have parsecvs working since yesterday, so we really are up to three
engines.

I have two minor features I need to merge into parsecvs before 
I can start on splitting out the test suite.
-- 
http://www.catb.org/~esr/";>Eric S. Raymond
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-12 Thread Eric S. Raymond
Junio C Hamano :
> And here is what I got:

Hm. In my version of these tests, I only have one regression from the
old combo (in the pathological tags test, t9602).  You're seeing more
breakage than that, obviously.

> A funny thing was that without cvsps-3.7 on $PATH (which means I am
> getting distro packaged cvsps 2.1), I got identical errors.

That suggests that something in your test setup has gone bad and is
introducing spurious errors. Which would be consistent with the above.

> Looking
> at the log message, it seems that you meant to remove t960[123], so
> perhaps the patch simply forgot to remove 9601 and 9602?

Yes.
 
> As neither test runs "git cvsimport" with -o/-m/-M options, ideally
> we should be able to pass them with and without having cvsps-3.x.
> Not passing them without cvsps-3.x would mean that the fallback mode
> of rewritten cvsimport is not working as expected. Not passing them
> with cvsps-3.x may mean the tests were expecting a wrong conversion
> result, or they uncover bugs in the replacement cvsimport.

That's possible, but seems unlikely.  Because the new cvsimport is
such a thin wrapper around the conversion engine, bugs in it should
lead to obvious crashes or failure to run the engine rather than the 
sort of conversion error the t960* tests are designed to check.  Really
all it does is assemble options to pass to the conversion engines.

My test strategy is aimed at the engine, not the wrapper. I took the
repos from t960*  and wrote a small Python framework to check the same 
assertions as the git-tree tests do, but using the engine.  For example,
here's how my t9602 looks:

import os, cvspstest

cc = cvspstest.ConvertComparison("t9602", "module")
cc.cmp_branch_tree("test of branch", "master", True)
cc.cmp_branch_tree("test of branch", "vendorbranch", True)
cc.cmp_branch_tree("test of branch", "B_FROM_INITIALS", False)
cc.cmp_branch_tree("test of branch", "B_FROM_INITIALS_BUT_ONE", False)
cc.cmp_branch_tree("test of branch", "B_MIXED", False)
cc.cmp_branch_tree("test of branch", "B_SPLIT", True)
cc.cmp_branch_tree("test of tag", "vendortag", False)
# This is the only test new cvsps fails that old git-cvsimport passed.
cc.cmp_branch_tree("test of tag", "T_ALL_INITIAL_FILES", True)
cc.cmp_branch_tree("test of tag", "T_ALL_INITIAL_FILES_BUT_ONE", False)
cc.cmp_branch_tree("test of tag", "T_MIXED", False)
cc.cleanup()
 
> t9600 fails with "-a is no longer supported", even without having
> cvsps-3.x on the $PATH (i.e. attempting to use the fallback).  I
> wonder if this is an option the updated cvsimport would want to
> simply ignore?

Probably.  But I don't think you should keep these tests in the git tree.
That wasn't a great idea even when you were supporting just one engine;
with two (and soon three) it's going to be just silly.  Let sanity-checking
the engines be *my* problem, since I have to do it anyway.

(I'm working towards the generalized test suite as fast as I can.  First
results probably in four days or so.)

> It is a way to tell the old cvsps/cvsimport to disable its
> heuristics to ignore commits made within the last 10 minutes (this
> is done in the hope of waiting for the per-file nature of CVS
> commits to stabilize, IIUC); the user tells the command that he
> knows that the CVS repository is now quiescent and it is safe to
> import the whole thing.

Yes, that's just what -a is supposed to do.  But is should be
irrelevant for testing - in the test framework CVS is running locally, 
so there's no network lag.

> So... does this mean that we now set the minimum required version of
> Python to 2.7?  I dunno.

That would be bad, IMO.  I'll put backporting to 2.6 high on my to-do list.
-- 
http://www.catb.org/~esr/";>Eric S. Raymond
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-12 Thread Michael Haggerty
On 01/11/2013 04:32 AM, Junio C Hamano wrote:
> From: "Eric S. Raymond" 
> 
> The combination of git-cvsimport and cvsps had serious problems.

Agreed.

> [...]
> This patch also removes Michael Haggerty's git-cvsimport tests
> (t960[123]) from the git tree.  These are actually conversion-engine
> tests and have been merged into a larger cvsps test suite, which I
> intend to spin out into a general CVS-lifting test that can also be
> applied to utilities such as cvs2git and parsecvs.  The t9604 test
> will move in a future patch, when I likewise have it integrated
> into the general test suite.
> 
> The following known bug has not been fixed: "If any files were ever
> "cvs import"ed more than once (e.g., import of more than one vendor
> release) the HEAD contains the wrong content." However, cvsps now
> emits a warning in this case. There is also one pathological tagging
> case that was successful in the former t9602 test that now fails
> (with a warning).
> 
> I plan to address these problems. This patch at least gets the
> cvsps-3.x/git-cvsimport combination to a state that is not too
> broken to ship - that is, in all failure cases known to me it
> now emits useful warnings rather than silently botching the
> import.

I don't understand the logic of removing the cvsimport tests, at least
not at this time.  It is true that the tests mostly ensure that the
conversion engine is working correctly, especially with your new version
of cvsps.  But I think the git project, by implicitly endorsing the use
of cvsps, has some responsibility to verify that the combination cvsps +
git-cvsimport continues to work and to document any known breakages via
its test suite.

Otherwise, how do we know that cvsps currently works with git-cvsimport?
 (OK, you claim that it does, but in the next breath you admit that
there is a new failure in "one pathological tagging case".)  How can we
understand its strengths/weaknesses?  How can we gain confidence that it
works on different platforms?  How will we find out if a future versions
of cvsps stops working (e.g., because of a breakage or a
non-backwards-compatible change)?

Normally one would expect an improvement like this to be combined with
patches that turn test expected failures into expected successes, not to
rip out the very tests that establish the correctness of the change that
is being proposed!


Let me describe what I would consider to be the optimum state of the
test suite.  Maybe your idea of "optimum" differs from mine, or maybe
the optimum is unrealistic due to lack of resources or for some other
reason.  But if so, let's explicitly spell out why we are deviating from
whatever optimum we define.

* The old tests should be retained (and possibly new tests added to show
off your improvements).

* There should be a way for users to choose which cvsps executable to
use when running test suite.  (In the future, the selection might be
expanded to cover altogether different conversion engines.)

* The tests should determine which version of cvsps has been selected
(e.g., by running "cvsps --version").

* The individual tests should be marked expected success/expected
failure based on the selected version of cvsps; in other words, some
tests might be marked "expected failure" if cvsps 2.x is being used but
"expected success" if cvsps 3.x is being used.


Regarding your claim that "within a few months the Perl git-cvsimport is
going to cease even pretending to work": It might be that the old
git-cvsimport will stop working *for people who upgrade to cvsps 3.x*.
But it is not realistic to expect people to synchronize their git and
cvsps version upgrades.  It is even quite possible that this or that
Linux distribution will package incompatible versions of the two packages.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-11 Thread Junio C Hamano
I cloned git://gitorious.org/cvsps/cvsps.git and installed cvsps-3.7
at c2ce6cc (More fun with test loads, sigh.  Timezones suck.,
2013-01-09) earlier on my $PATH, and tried to run t96xx series with
this patch applied on top of Git 1.8.1.

The first thing I noticed was that all the tests were skipped.
A patch to t/lib-cvs.sh might be sufficient, 

- >8 -
diff --git a/t/lib-cvs.sh b/t/lib-cvs.sh
index 44263ad..423953f 100644
--- a/t/lib-cvs.sh
+++ b/t/lib-cvs.sh
@@ -17,6 +17,8 @@ cvsps_version=`cvsps -h 2>&1 | sed -ne 's/cvsps version //p'`
 case "$cvsps_version" in
 2.1 | 2.2*)
;;
+3.*)
+   ;;
 '')
skip_all='skipping cvsimport tests, cvsps not found'
test_done
- 8< -

but I didn't check more than "now it seems not to skip them".
And here is what I got:

- >8 -
Test Summary Report
---
t9600-cvsimport.sh  (Wstat: 256 Tests: 15 Failed: 9)
  Failed tests:  4-6, 8-9, 11-13, 15
  Non-zero exit status: 1
t9601-cvsimport-vendor-branch.sh (Wstat: 256 Tests: 9 Failed: 8)
  Failed tests:  1-4, 6-9
  Non-zero exit status: 1
t9602-cvsimport-branches-tags.sh (Wstat: 256 Tests: 11 Failed: 5)
  Failed tests:  1-3, 7, 9
  Non-zero exit status: 1
t9604-cvsimport-timestamps.sh   (Wstat: 256 Tests: 2 Failed: 2)
  Failed tests:  1-2
  Non-zero exit status: 1
Files=5, Tests=38,  5 wallclock secs ( 0.05 usr  0.01 sys +  0.49
  cusr  0.16 csys =  0.71 CPU)
Result: FAIL
- 8< -

A funny thing was that without cvsps-3.7 on $PATH (which means I am
getting distro packaged cvsps 2.1), I got identical errors.  Looking
at the log message, it seems that you meant to remove t960[123], so
perhaps the patch simply forgot to remove 9601 and 9602?

As neither test runs "git cvsimport" with -o/-m/-M options, ideally
we should be able to pass them with and without having cvsps-3.x.
Not passing them without cvsps-3.x would mean that the fallback mode
of rewritten cvsimport is not working as expected. Not passing them
with cvsps-3.x may mean the tests were expecting a wrong conversion
result, or they uncover bugs in the replacement cvsimport.

t9600 fails with "-a is no longer supported", even without having
cvsps-3.x on the $PATH (i.e. attempting to use the fallback).  I
wonder if this is an option the updated cvsimport would want to
simply ignore?

It is a way to tell the old cvsps/cvsimport to disable its
heuristics to ignore commits made within the last 10 minutes (this
is done in the hope of waiting for the per-file nature of CVS
commits to stabilize, IIUC); the user tells the command that he
knows that the CVS repository is now quiescent and it is safe to
import the whole thing.

If the updated cvsps can identify the changeset more reliably and it
no longer needs "-a" option, it may be more helpful to the users to
migrate their script if it allowed, warned and then ignored the
option.  It certainly would help sharing of this test script between
runs that use the old and new cvsps as backends.

t9601 (after resurrecting the t/t9601/cvsroot directory) fails in an
interesting way.

- >8 -
$ sh t9601-cvsimport-vendor-branch.sh -i -v
Initialized empty Git repository in /git/git.build/t/trash 
directory.t9601-cvsimport-vendor-branch/.git/
expecting success:

git cvsimport -C module-git module

Traceback (most recent call last):
  File "/git/git.build/git-cvsimport", line 262, in 
subprocess.check_output("cvsps -V 2>/dev/null", shell=True)
AttributeError: 'module' object has no attribute 'check_output'
not ok - 1 import a module with a vendor branch
- 8< -

Apparently, the copy of "subprocess.py" I have does not give us the
check_output thing:

- >8 -
$ python
Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> dir(subprocess)
['CalledProcessError', 'MAXFD', 'PIPE', 'Popen', 'STDOUT', '__all__', 
'__builtins__', '__doc__', '__file__', '__name__', '__package__', '_active', 
'_cleanup', '_demo_posix', '_demo_windows', '_eintr_retry_call', 'call', 
'check_call', 'errno', 'fcntl', 'gc', 'list2cmdline', 'mswindows', 'os', 
'pickle', 'select', 'signal', 'sys', 'traceback', 'types']
- 8< -

The story is the same for t9602 and t9603 (again after resurrecting
the necessary files).

http://docs.python.org/2/library/subprocess.html tells me that
check_output has first become available in 2.7.

So... does this mean that we now set the minimum required version of
Python to 2.7?  I dunno.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http:/

Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-11 Thread Eric S. Raymond
Junio C Hamano :
> Yeah, it is OK to _discourage_ its use, but to me it looks like that
> the above is a fairly subjective policy decision, not something I
> should let you impose on the users of the old cvsimport, which you
> do not seem to even treat as your users.

Er.  You still don't seem to grasp the fundamentals of this
situation. I'm not imposing any damn thing on the users.  What's
imposing is the fact that cvsps-2.x and the Perl cvsimport are both
individually and collectively *broken right now*, and within a few
months the Perl git-cvsimport is going to cease even pretending to
work.  I'm trying to *fix that problem* as best I can, fixing it
required two radical rewrites, and criticizing me for not emulating
every last detail and misfeature immediately is every bit as pointless
and annoying as arguing about the fabric on the deck chairs while the
ship is sinking.

To put it bluntly, you should be grateful to be getting back any
functionality at all - because the alternative is that the Perl
git-cvsimport will hang out in your tree as a dead piece of cruft.
Your choice is between making it easy for me replace it with minimum
disruption now and hoping for someone else to replace it months from
now after you've had a bunch of unhappy users bitching at you.

So let me be more direct.  I think the -M and -m options are
sufficiently bad ideas that I am *not willing* to put in the quite
large amount of effort that would be required to implement them in cvsps
or parsecvs.  That would be a bad use of my time.

This is not the case with -o; that might be a good idea if I
understood it. This is also not like the 2.x fallback; I thought that
was a bad idea (because it would be better for users that the
combination break in an obvious way than continue breaking in a silent
one), but it was a small enough effort that I was willing to do it
anyway to keep the git maintainer happy. The effort to fix the quoting
bugs is even easier for me to justify; they are actual bugs.

Those are my engineering judgments; go ahead and call them
"subjective" if you like, but neither the facts nor my judgment will
change on that account.

> The "major" in my sentence was from your description (the bugs you
> fixed), and not about the new ones you still have in this draft.  I
> did not mean to say that you are trading fixes to "major" bugs with
> different "major" bugs.

OK, thank you.  In the future I will try to bear in mind that English
is not your primary language when I evaluate statements that seems a bit
offensive.

So what's your next bid? Note that you can't increase my friction and
hassle costs much more before I give up and let you deal with the
consequences without me. I want to do the right thing, but I have
more other projects clamoring for my attention than you could easily
guess.  I need to get git-cvsimport *finished* and off my hands -
I may already have given it more time than I really should have.

So give me your minimum list of deliverables before you'll merge,
please, and then stick to it.  I assume fixes for the quoting bugs
will be on that list.
-- 
http://www.catb.org/~esr/";>Eric S. Raymond
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-11 Thread Junio C Hamano
"Eric S. Raymond"  writes:

> Junio C Hamano :
> ...
> The other is a design-level problem - these options were a bad idea to
> begin with.  In earlier list mail I said
>
> An example of the batchiness mistake close to home is the -m and -M
> options in the old version of cvsimport.  It takes human judgment
> looking at the whole commit DAG in gitspace to decide what merge
> points would best express the (as you say, sometimes ambiguous) CVS
> history - what's needed is a scalpel and sutures in a surgeon's hand,
> not a regexp hammer.
>
> One specific problem with the regexp hammer is false-positive matches
> leading to unintended merges.

Yeah, it is OK to _discourage_ its use, but to me it looks like that
the above is a fairly subjective policy decision, not something I
should let you impose on the users of the old cvsimport, which you
do not seem to even treat as your users.

>> Having the code to die when it sees options the rewritten version
>> does not yet support before it calls the fallback makes the fallback
>> much less effective, no?
>
> Only to the extent that -o/-m/-M are really important, which I doubt.
> But that might be fixable, and I'll put it on the to-do list.
>
>> Not very impressed (yet).  The advertised "fix major bugs" sounds
>> more like "trade major bugs with different ones with a couple of
>> feature removals" at this point.
>
> If you think that, you have failed to understand just how broken and
> dangerous the old combination is.  None of the details you've called
> out are "major" by any stretch of the imagination compared to it
> silently botching the translation of repositories.

The "major" in my sentence was from your description (the bugs you
fixed), and not about the new ones you still have in this draft.  I
did not mean to say that you are trading fixes to "major" bugs with
different "major" bugs.

Insecure quoting of parameters is much easier to fix; it does need
to be addressed, though.

It is just that looking at the state of the patch as submitted left
me with a feeling that this topic needs a lot more time to mature
than I previously was led to believe by your earlier messages, which
made me someaht sad.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-11 Thread Junio C Hamano
"Eric S. Raymond"  writes:

> Junio C Hamano :
>> I think the prevalent style in this script is to write "print"
>> without parentheses:
>> 
>>  print STDERR "msg\n";
>
> That can be easily fixed.
>
>> This looks lazy and unsafe quoting.  Is there anything that makes
>> sure repository path does not contain a single quote?
>
> No. But...wait, checking...the Perl code didn't have the analogous
> check, so there's no increased vulnerability here.  I'll put it on the
> to-do list for after I ship parsecvs.

I checked before I sent that review, and as far as I could tell, it
was fairly consistently avoiding the lazy and insecure forms, e.g.

system("com mand " . $param);
open($fh, "com mand " . $param . " |"); while (<$fh>)   { ... }

but used the more sequre list form, e.g.

system(qw(com mand), $param);
open($fh, "-|", qw(com mand), $param); while (<$fh>){ ... }

But of course there may be some places that were careless that I
didn't spot (and previous reviewers of the current cvsimport
didn't).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-11 Thread Eric S. Raymond
Junio C Hamano :
> I think the prevalent style in this script is to write "print"
> without parentheses:
> 
>   print STDERR "msg\n";

That can be easily fixed.

> This looks lazy and unsafe quoting.  Is there anything that makes
> sure repository path does not contain a single quote?

No. But...wait, checking...the Perl code didn't have the analogous
check, so there's no increased vulnerability here.  I'll put it on the
to-do list for after I ship parsecvs.

> > +def command(self):
> > +"Emit the command implied by all previous options."
> > +return "(cvs2git --username=git-cvsimport --quiet --quiet 
> > --blobfile={0} --dumpfile={1} {2} {3} && cat {0} {1} && rm {0} 
> > {1})".format(tempfile.mkstemp()[1], tempfile.mkstemp()[1], self.opts, 
> > self.modulepath)
> 
> Could we do something better with this overlong source line?

Yes.  The correct fix is to simplify cvs2git's rather baroque command-line 
interface.  Michael Haggerty has accepted that patch.  Soon that line will
look like this:

 return "cvs2git --quiet --quiet {0} {1}".format(self.opts, self.modulepath)

Older versions of cvs2git will terminate cleanly with an error message 
when called this way.

> > +elif opt == '-o':
> > +sys.stderr.write("git cvsimport: -o is no longer supported.\n")
> > +sys.exit(1)
> 
> Isn't this a regression?

It would be if the -o behavior were consistent and decently
documented.  When I tested this option with the Perl version I got no
result.  Possibly my usage was incorrect; if anyone can be found who
actually understands how it's supposed to work in detail and will tell
me, I can probably support it.

> > +elif opt in ('-m', '-M'):
> > +sys.stderr.write("git cvsimport: -m and -M are no longer 
> > supported: use reposurgeon instead.\n")
> > +sys.exit(1)
> 
> I wonder if it is better to ignore these options with a warning but
> still let the command continue; cvsps-3.x was supposed to get merges
> right without the help of these ad-hoc options, no?

Sorry, I don't know where you got that idea. I don't think the general merge
detection that would need is possible even in principle.

> Otherwise it looks like a regression to me.

There are two reasons -m and -M aren't supported.

One is implementation-level.  The wrapper script no longer deals with
individual files or changesets or branches; it relies on the
conversion engine to do all that.  (As it should - the old design was
a mess with lots of stuff being done at the wrong level.)  But the
conversion engines don't implement -m or -M, and aren't ever going to
(see next paragraph).

The other is a design-level problem - these options were a bad idea to
begin with.  In earlier list mail I said

An example of the batchiness mistake close to home is the -m and -M
options in the old version of cvsimport.  It takes human judgment
looking at the whole commit DAG in gitspace to decide what merge
points would best express the (as you say, sometimes ambiguous) CVS
history - what's needed is a scalpel and sutures in a surgeon's hand,
not a regexp hammer.

One specific problem with the regexp hammer is false-positive matches
leading to unintended merges.

That's why I won't implement these in cvsps or parsecvs. Instead I've
just added DAG visualization via graphviz in reposurgeon, so a human
can quickly see candidate merges in the visualization and do them by
hand.  This is better and safer.

> Having the code to die when it sees options the rewritten version
> does not yet support before it calls the fallback makes the fallback
> much less effective, no?

Only to the extent that -o/-m/-M are really important, which I doubt.
But that might be fixable, and I'll put it on the to-do list.

> Not very impressed (yet).  The advertised "fix major bugs" sounds
> more like "trade major bugs with different ones with a couple of
> feature removals" at this point.

If you think that, you have failed to understand just how broken and
dangerous the old combination is.  None of the details you've called
out are "major" by any stretch of the imagination compared to it
silently botching the translation of repositories.

Also bear in mind that leaving the old Perl code in place is not going
to be a viable option for more than a few months out.  As cvsps-3.x
propagates to the distros what you have is going to stop even its
current pretense of working.

Finally...my own purposes are fulfilled by having CVS exporters that can
do a decent job of front-ending for reposurgeon. Rewriting git's
wrapper script was extra work that I took on only because I wanted to
be friendly to the git project, *but*... 

...there is a limit to the amount of what I consider pointless
hoop-jumping that friendliness will buy you, and the 2.x fallback eas
already pushing that limit.  Tread a little more gently, Junio; I've
put in a lot of hard, boring work on git-cvsimport over the last two
weeks when I w

Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs

2013-01-11 Thread Junio C Hamano
> From: "Eric S. Raymond" 
> ...
> diff --git a/git-cvsimport.perl b/git-cvsimport-fallback.perl
> similarity index 98%
> rename from git-cvsimport.perl
> rename to git-cvsimport-fallback.perl
> index 0a31ebd..4bc0717 100755
> --- a/git-cvsimport.perl
> +++ b/git-cvsimport-fallback.perl
> @@ -1,4 +1,8 @@
>  #!/usr/bin/perl
> +# This code became obsolete in January 2013, and is retained only as a
> +# fallback from git-cvsimport.py for users who have only cvsps-2.x.
> +# It (and the code in cvsimport.py that calls it) should be removed
> +# once the 3.x version has had a reasonable time to propagate.
>  
>  # This tool is copyright (c) 2005, Matthias Urlichs.
>  # It is released under the Gnu Public License, version 2.
> @@ -27,6 +31,10 @@
>  use POSIX qw(strftime tzset dup2 ENOENT);
>  use IPC::Open2;
>  
> +print(STDERR "You do not appear to have cvsps 3.x.\n");
> +print(STDERR "Falling back to unmaintained Perl cvsimport for cvsps 2.x.\n");
> +print(STDERR "Upgrade your cvsps for best results.\n");

I think the prevalent style in this script is to write "print"
without parentheses:

print STDERR "msg\n";

> diff --git a/git-cvsimport.py b/git-cvsimport.py
> new file mode 100755
> index 000..129471e
> --- /dev/null
> +++ b/git-cvsimport.py
> @@ -0,0 +1,354 @@
> +#!/usr/bin/env python
> +#
> +# Import CVS history into git
> +#
> +# Intended to be a near-workalike of Matthias Urlichs's Perl implementation.
> ...
> +class cvsps:
> +"Method class for cvsps back end."
> +def __init__(self):
> +self.opts = ""
> +self.revmap = None
> +def set_repo(self, val):
> +"Set the repository root option."
> +if not val.startswith(":"):
> +if not val.startswith(os.sep):
> +val = os.path.abspath(val)
> +val = ":local:" + val
> +self.opts += " --root '%s'" % val

This looks lazy and unsafe quoting.  Is there anything that makes
sure repository path does not contain a single quote?

> +def set_authormap(self, val):
> +"Set the author-map file."
> +self.opts += " -A '%s'" % val
> +def set_fuzz(self, val):
> +"Set the commit-similarity window."
> +self.opts += " -z %s" % val
> +def set_nokeywords(self):
> +"Suppress CVS keyword expansion."
> +self.opts += " -k"
> +def add_opts(self, val):
> +"Add options to the engine command line."
> +self.opts += " " + val

... especially for callers of this method.

The same comment applies to many uses of "val" in the method
implementations of this class and the cvs2git class.

> +def command(self):
> +"Emit the command implied by all previous options."
> +return "(cvs2git --username=git-cvsimport --quiet --quiet 
> --blobfile={0} --dumpfile={1} {2} {3} && cat {0} {1} && rm {0} 
> {1})".format(tempfile.mkstemp()[1], tempfile.mkstemp()[1], self.opts, 
> self.modulepath)

Could we do something better with this overlong source line?

> +if __name__ == '__main__':
> ...
> +for (opt, val) in options:
> +if opt == '-v':
> +verbose += 1
> +elif opt == '-b':
> +bare = True
> +elif opt == '-e':
> +for cls in (cvsps, cvs2git):
> +if cls.__name__ == val:
> +backend = cls()
> +break
> +else:
> +sys.stderr.write("git cvsimport: unknown engine %s.\n" % val)
> +sys.exit(1)
> +elif opt == '-d':
> +backend.set_repo(val)
> +elif opt == '-C':
> +outdir = val
> +elif opt == '-r':
> +remotize = True
> +elif opt == '-o':
> +sys.stderr.write("git cvsimport: -o is no longer supported.\n")
> +sys.exit(1)

Isn't this a regression?

> +elif opt == '-i':
> +import_only = True
> +elif opt == '-k':
> +backend.set_nokeywords()
> +elif opt == '-u':
> +underscore_to_dot = True
> +elif opt == '-s':
> +slashsubst = val
> +elif opt == '-p':
> +backend.add_opts(val.replace(",", " "))
> +elif opt == '-z':
> ...
> +elif opt == '-P':
> +backend = filesource(val)
> +sys.exit(1)

???

> +elif opt in ('-m', '-M'):
> +sys.stderr.write("git cvsimport: -m and -M are no longer 
> supported: use reposurgeon instead.\n")
> +sys.exit(1)

I wonder if it is better to ignore these options with a warning but
still let the command continue; cvsps-3.x was supposed to get merges
right without the help of these ad-hoc options, no?

Otherwise it looks like a regression to me.

> +elif opt == '-S':
> +backend.set_exclusion(val)
> +elif opt == '-a':
> +sys.stderr.write("git cvsimport: -a is no longer supported.\n")
> +sys.exit(1)
> +elif opt == '-L':
> +sys.st