Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
On 01/13/2013 11:20 PM, Junio C Hamano wrote: > After a quick survey of various distros, I think it is very unlikely > that we will see "distros move on to newer cvsps, leaving cvsimport > broken" situation. If anything, it is more like "distros decide to > ignore the new cvsps, until it is made to work with cvsimport" [*1*]. A better predictor of the distros' decisions is probably which other packages depend on cvsps. As one data point: on Debian squeeze and on Ubuntu precise, only two packages depend on cvsps (git-cvs and bzr-cvsps-import) and one suggests it (chora2, "a code repository viewing component for horde framework"). So also by this standard they are unlikely to feel a lot of pressure to update quickly to cvsps3. > I think it is probably sensible to [...] > > Agreed? Yes, I agree that what you propose is a good strategy. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Junio C Hamano writes: > Jonathan Nieder writes: > >> Michael Haggerty wrote: >> >>> Regarding your claim that "within a few months the Perl git-cvsimport is >>> going to cease even pretending to work": It might be that the old >>> git-cvsimport will stop working *for people who upgrade to cvsps 3.x*. >>> But it is not realistic to expect people to synchronize their git and >>> cvsps version upgrades. It is even quite possible that this or that >>> Linux distribution will package incompatible versions of the two packages. >> >> Moreover, I feel an obligation to point the following out: >> >> In a hypothetical world where cvsps 3.x simply breaks "git cvsimport" >> it is likely that some distributions would just stick to the existing >> cvsps and not upgrade to 3.x. Maybe that's a wrong choice, but that's >> a choice some would make. An even more likely outcome in that >> hypothetical world is that they would ship it renamed to something >> like "cvsps3" alongside the existing cvsps. Or they could rename the >> old version to "cvsps2". If we were the last holdout, we could even >> bundle it as compat/cvsps. >> >> So please do not act as though the cvsps upgrade is a crisis that we >> need to break ourselves for at threat of no longer working at all. >> The threat doesn't hold water. >> >> Luckily you have already written patches to make "git cvsimport" work >> with cvsps 3.x, and through your work you are making a better >> argument: "The new cvsimport + cvsps will work better, at least for >> some users, than the old tool." >> >> Just don't pretend you have the power to force a change for a less >> sensible reason than that! > > After a quick survey of various distros, I think it is very unlikely > that we will see "distros move on to newer cvsps, leaving cvsimport > broken" situation. If anything, it is more like "distros decide to > ignore the new cvsps, until it is made to work with cvsimport" [*1*]. > > I think it is probably sensible to rename the current cvsimport to > cvsimport-2, write a small wrapper git-cvsimport.sh which is > something like this: > > - >8 - > #!/bin/sh > > if test -z "$GIT_CVSPS_VERSION" > then > case "$(cvsps -h 2>&1 | grep 'cvsps version')" in > 2.*) > GIT_CVSPS_VERSION=2 > ;; > 3.*) > GIT_CVSPS_VERSION=3 > ;; > esac > fi > > if test -z "$GIT_CVSPS_VERSION" > then > echo >&2 "No supported cvsps available" > exit 1 > fi > > exec git cvsimport-$GIT_CVSPS_VERSION "$@" > - 8< - > > and put Eric's one as git-cvsimport-3 (after ripping out the code to > fallback to the old cvsimport). The longer term trend will be to > move away from cvsimport-2, as it is unlikely cvsps-2.x will gain > improvements, if any; keeping fallback code outside cvsimport-3 will > be a better first step in the healthier long term code evolution. > > We will keep the current t96xx series of tests, and have them export > GIT_CVSPS_VERSION=2 at the beginning, protect them with test prereq > that requires presence of cvsps 2.x; this will still make sure that > the current cvsimport users will not see any regressions. > > Eric's one should be polished enough to produce good results on the s/should be polished enough/should be in a polished enough state/ that is. Also "if not right now" may better convey what I meant if written "if not already". > simpler sample CVS histories t96xx deal with soonish if not right > now, so we can use a method similar to how we shared tests between > blame and annotate while both were _different_ implementations to > make sure the newer blame did not inroduce regression by running the > same set of tests. Where the result _ought_ to differ, we should > also add tests that work only with the new cvsimport, of course. > > I could help getting the ball rolling on this, if everybody agrees > that the above is a sensible direction to go, given the real world > constraints of distro inertia. > > Agreed? > > > [References] > > *1* Fedora, Debian and Ubuntu do not even have cvsps 3.x in their > bleeding edges, OpenBSD and NetBSD do not seem to have it either, > and Gentoo and ArchLinux have the cvsps 3.x blocked due to > incompatiblity. > > http://pkgs.fedoraproject.org/cgit/cvsps.git/ > http://packages.debian.org/search?keywords=cvsps > http://packages.ubuntu.com/search?keywords=cvsps > > http://packages.gentoo.org/package/dev-vcs/cvsps > https://bugs.gentoo.org/show_bug.cgi?id=450424 > > https://bugs.archlinux.org/task/33363?project=1&cat%5B0%5D=2&string=cvsps -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Jonathan Nieder writes: > Michael Haggerty wrote: > >> Regarding your claim that "within a few months the Perl git-cvsimport is >> going to cease even pretending to work": It might be that the old >> git-cvsimport will stop working *for people who upgrade to cvsps 3.x*. >> But it is not realistic to expect people to synchronize their git and >> cvsps version upgrades. It is even quite possible that this or that >> Linux distribution will package incompatible versions of the two packages. > > Moreover, I feel an obligation to point the following out: > > In a hypothetical world where cvsps 3.x simply breaks "git cvsimport" > it is likely that some distributions would just stick to the existing > cvsps and not upgrade to 3.x. Maybe that's a wrong choice, but that's > a choice some would make. An even more likely outcome in that > hypothetical world is that they would ship it renamed to something > like "cvsps3" alongside the existing cvsps. Or they could rename the > old version to "cvsps2". If we were the last holdout, we could even > bundle it as compat/cvsps. > > So please do not act as though the cvsps upgrade is a crisis that we > need to break ourselves for at threat of no longer working at all. > The threat doesn't hold water. > > Luckily you have already written patches to make "git cvsimport" work > with cvsps 3.x, and through your work you are making a better > argument: "The new cvsimport + cvsps will work better, at least for > some users, than the old tool." > > Just don't pretend you have the power to force a change for a less > sensible reason than that! After a quick survey of various distros, I think it is very unlikely that we will see "distros move on to newer cvsps, leaving cvsimport broken" situation. If anything, it is more like "distros decide to ignore the new cvsps, until it is made to work with cvsimport" [*1*]. I think it is probably sensible to rename the current cvsimport to cvsimport-2, write a small wrapper git-cvsimport.sh which is something like this: - >8 - #!/bin/sh if test -z "$GIT_CVSPS_VERSION" then case "$(cvsps -h 2>&1 | grep 'cvsps version')" in 2.*) GIT_CVSPS_VERSION=2 ;; 3.*) GIT_CVSPS_VERSION=3 ;; esac fi if test -z "$GIT_CVSPS_VERSION" then echo >&2 "No supported cvsps available" exit 1 fi exec git cvsimport-$GIT_CVSPS_VERSION "$@" - 8< - and put Eric's one as git-cvsimport-3 (after ripping out the code to fallback to the old cvsimport). The longer term trend will be to move away from cvsimport-2, as it is unlikely cvsps-2.x will gain improvements, if any; keeping fallback code outside cvsimport-3 will be a better first step in the healthier long term code evolution. We will keep the current t96xx series of tests, and have them export GIT_CVSPS_VERSION=2 at the beginning, protect them with test prereq that requires presence of cvsps 2.x; this will still make sure that the current cvsimport users will not see any regressions. Eric's one should be polished enough to produce good results on the simpler sample CVS histories t96xx deal with soonish if not right now, so we can use a method similar to how we shared tests between blame and annotate while both were _different_ implementations to make sure the newer blame did not inroduce regression by running the same set of tests. Where the result _ought_ to differ, we should also add tests that work only with the new cvsimport, of course. I could help getting the ball rolling on this, if everybody agrees that the above is a sensible direction to go, given the real world constraints of distro inertia. Agreed? [References] *1* Fedora, Debian and Ubuntu do not even have cvsps 3.x in their bleeding edges, OpenBSD and NetBSD do not seem to have it either, and Gentoo and ArchLinux have the cvsps 3.x blocked due to incompatiblity. http://pkgs.fedoraproject.org/cgit/cvsps.git/ http://packages.debian.org/search?keywords=cvsps http://packages.ubuntu.com/search?keywords=cvsps http://packages.gentoo.org/package/dev-vcs/cvsps https://bugs.gentoo.org/show_bug.cgi?id=450424 https://bugs.archlinux.org/task/33363?project=1&cat%5B0%5D=2&string=cvsps -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Michael Haggerty wrote: > Regarding your claim that "within a few months the Perl git-cvsimport is > going to cease even pretending to work": It might be that the old > git-cvsimport will stop working *for people who upgrade to cvsps 3.x*. > But it is not realistic to expect people to synchronize their git and > cvsps version upgrades. It is even quite possible that this or that > Linux distribution will package incompatible versions of the two packages. Moreover, I feel an obligation to point the following out: In a hypothetical world where cvsps 3.x simply breaks "git cvsimport" it is likely that some distributions would just stick to the existing cvsps and not upgrade to 3.x. Maybe that's a wrong choice, but that's a choice some would make. An even more likely outcome in that hypothetical world is that they would ship it renamed to something like "cvsps3" alongside the existing cvsps. Or they could rename the old version to "cvsps2". If we were the last holdout, we could even bundle it as compat/cvsps. So please do not act as though the cvsps upgrade is a crisis that we need to break ourselves for at threat of no longer working at all. The threat doesn't hold water. Luckily you have already written patches to make "git cvsimport" work with cvsps 3.x, and through your work you are making a better argument: "The new cvsimport + cvsps will work better, at least for some users, than the old tool." Just don't pretend you have the power to force a change for a less sensible reason than that! Hope that helps, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Hi Eric, Eric S. Raymond wrote: > But in practice the git crew was going to lose that > capability anyway simply because the new wrapper will support three > engines rather than just one. It's not practical for the git tests to > handle that many variant external dependencies. See the git-blame/git-annotate tests for an example of how the testsuite handles "variations on a theme". It works fine. Hope that helps, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Michael Haggerty : > Otherwise, how do we know that cvsps currently works with git-cvsimport? > (OK, you claim that it does, but in the next breath you admit that > there is a new failure in "one pathological tagging case".) How can we > understand its strengths/weaknesses? How can we gain confidence that it > works on different platforms? How will we find out if a future versions > of cvsps stops working (e.g., because of a breakage or a > non-backwards-compatible change)? You can't. But in practice the git crew was going to lose that capability anyway simply because the new wrapper will support three engines rather than just one. It's not practical for the git tests to handle that many variant external dependencies. However, there is a solution. The solution is for git to test that the wrapper is *generating the expected commands*. So what the git tree ends up with is conditional assurance; the wrapper will do the right thing if the engine it calls is working correctly. I think that's really all the git-tree tests can hope for. Michael, the engines are my problem and yours - it's *our* responsibility to develop a (hopefully shared) test suite to verify that they convert repos correctly. I'm working my end as fast as I can; I hope to have the test suite factored out of cvsps and ready to check multiple engines by around Wednesday. I still need to convert t9604, too. I have parsecvs working since yesterday, so we really are up to three engines. I have two minor features I need to merge into parsecvs before I can start on splitting out the test suite. -- http://www.catb.org/~esr/";>Eric S. Raymond -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Junio C Hamano : > And here is what I got: Hm. In my version of these tests, I only have one regression from the old combo (in the pathological tags test, t9602). You're seeing more breakage than that, obviously. > A funny thing was that without cvsps-3.7 on $PATH (which means I am > getting distro packaged cvsps 2.1), I got identical errors. That suggests that something in your test setup has gone bad and is introducing spurious errors. Which would be consistent with the above. > Looking > at the log message, it seems that you meant to remove t960[123], so > perhaps the patch simply forgot to remove 9601 and 9602? Yes. > As neither test runs "git cvsimport" with -o/-m/-M options, ideally > we should be able to pass them with and without having cvsps-3.x. > Not passing them without cvsps-3.x would mean that the fallback mode > of rewritten cvsimport is not working as expected. Not passing them > with cvsps-3.x may mean the tests were expecting a wrong conversion > result, or they uncover bugs in the replacement cvsimport. That's possible, but seems unlikely. Because the new cvsimport is such a thin wrapper around the conversion engine, bugs in it should lead to obvious crashes or failure to run the engine rather than the sort of conversion error the t960* tests are designed to check. Really all it does is assemble options to pass to the conversion engines. My test strategy is aimed at the engine, not the wrapper. I took the repos from t960* and wrote a small Python framework to check the same assertions as the git-tree tests do, but using the engine. For example, here's how my t9602 looks: import os, cvspstest cc = cvspstest.ConvertComparison("t9602", "module") cc.cmp_branch_tree("test of branch", "master", True) cc.cmp_branch_tree("test of branch", "vendorbranch", True) cc.cmp_branch_tree("test of branch", "B_FROM_INITIALS", False) cc.cmp_branch_tree("test of branch", "B_FROM_INITIALS_BUT_ONE", False) cc.cmp_branch_tree("test of branch", "B_MIXED", False) cc.cmp_branch_tree("test of branch", "B_SPLIT", True) cc.cmp_branch_tree("test of tag", "vendortag", False) # This is the only test new cvsps fails that old git-cvsimport passed. cc.cmp_branch_tree("test of tag", "T_ALL_INITIAL_FILES", True) cc.cmp_branch_tree("test of tag", "T_ALL_INITIAL_FILES_BUT_ONE", False) cc.cmp_branch_tree("test of tag", "T_MIXED", False) cc.cleanup() > t9600 fails with "-a is no longer supported", even without having > cvsps-3.x on the $PATH (i.e. attempting to use the fallback). I > wonder if this is an option the updated cvsimport would want to > simply ignore? Probably. But I don't think you should keep these tests in the git tree. That wasn't a great idea even when you were supporting just one engine; with two (and soon three) it's going to be just silly. Let sanity-checking the engines be *my* problem, since I have to do it anyway. (I'm working towards the generalized test suite as fast as I can. First results probably in four days or so.) > It is a way to tell the old cvsps/cvsimport to disable its > heuristics to ignore commits made within the last 10 minutes (this > is done in the hope of waiting for the per-file nature of CVS > commits to stabilize, IIUC); the user tells the command that he > knows that the CVS repository is now quiescent and it is safe to > import the whole thing. Yes, that's just what -a is supposed to do. But is should be irrelevant for testing - in the test framework CVS is running locally, so there's no network lag. > So... does this mean that we now set the minimum required version of > Python to 2.7? I dunno. That would be bad, IMO. I'll put backporting to 2.6 high on my to-do list. -- http://www.catb.org/~esr/";>Eric S. Raymond -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
On 01/11/2013 04:32 AM, Junio C Hamano wrote: > From: "Eric S. Raymond" > > The combination of git-cvsimport and cvsps had serious problems. Agreed. > [...] > This patch also removes Michael Haggerty's git-cvsimport tests > (t960[123]) from the git tree. These are actually conversion-engine > tests and have been merged into a larger cvsps test suite, which I > intend to spin out into a general CVS-lifting test that can also be > applied to utilities such as cvs2git and parsecvs. The t9604 test > will move in a future patch, when I likewise have it integrated > into the general test suite. > > The following known bug has not been fixed: "If any files were ever > "cvs import"ed more than once (e.g., import of more than one vendor > release) the HEAD contains the wrong content." However, cvsps now > emits a warning in this case. There is also one pathological tagging > case that was successful in the former t9602 test that now fails > (with a warning). > > I plan to address these problems. This patch at least gets the > cvsps-3.x/git-cvsimport combination to a state that is not too > broken to ship - that is, in all failure cases known to me it > now emits useful warnings rather than silently botching the > import. I don't understand the logic of removing the cvsimport tests, at least not at this time. It is true that the tests mostly ensure that the conversion engine is working correctly, especially with your new version of cvsps. But I think the git project, by implicitly endorsing the use of cvsps, has some responsibility to verify that the combination cvsps + git-cvsimport continues to work and to document any known breakages via its test suite. Otherwise, how do we know that cvsps currently works with git-cvsimport? (OK, you claim that it does, but in the next breath you admit that there is a new failure in "one pathological tagging case".) How can we understand its strengths/weaknesses? How can we gain confidence that it works on different platforms? How will we find out if a future versions of cvsps stops working (e.g., because of a breakage or a non-backwards-compatible change)? Normally one would expect an improvement like this to be combined with patches that turn test expected failures into expected successes, not to rip out the very tests that establish the correctness of the change that is being proposed! Let me describe what I would consider to be the optimum state of the test suite. Maybe your idea of "optimum" differs from mine, or maybe the optimum is unrealistic due to lack of resources or for some other reason. But if so, let's explicitly spell out why we are deviating from whatever optimum we define. * The old tests should be retained (and possibly new tests added to show off your improvements). * There should be a way for users to choose which cvsps executable to use when running test suite. (In the future, the selection might be expanded to cover altogether different conversion engines.) * The tests should determine which version of cvsps has been selected (e.g., by running "cvsps --version"). * The individual tests should be marked expected success/expected failure based on the selected version of cvsps; in other words, some tests might be marked "expected failure" if cvsps 2.x is being used but "expected success" if cvsps 3.x is being used. Regarding your claim that "within a few months the Perl git-cvsimport is going to cease even pretending to work": It might be that the old git-cvsimport will stop working *for people who upgrade to cvsps 3.x*. But it is not realistic to expect people to synchronize their git and cvsps version upgrades. It is even quite possible that this or that Linux distribution will package incompatible versions of the two packages. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
I cloned git://gitorious.org/cvsps/cvsps.git and installed cvsps-3.7 at c2ce6cc (More fun with test loads, sigh. Timezones suck., 2013-01-09) earlier on my $PATH, and tried to run t96xx series with this patch applied on top of Git 1.8.1. The first thing I noticed was that all the tests were skipped. A patch to t/lib-cvs.sh might be sufficient, - >8 - diff --git a/t/lib-cvs.sh b/t/lib-cvs.sh index 44263ad..423953f 100644 --- a/t/lib-cvs.sh +++ b/t/lib-cvs.sh @@ -17,6 +17,8 @@ cvsps_version=`cvsps -h 2>&1 | sed -ne 's/cvsps version //p'` case "$cvsps_version" in 2.1 | 2.2*) ;; +3.*) + ;; '') skip_all='skipping cvsimport tests, cvsps not found' test_done - 8< - but I didn't check more than "now it seems not to skip them". And here is what I got: - >8 - Test Summary Report --- t9600-cvsimport.sh (Wstat: 256 Tests: 15 Failed: 9) Failed tests: 4-6, 8-9, 11-13, 15 Non-zero exit status: 1 t9601-cvsimport-vendor-branch.sh (Wstat: 256 Tests: 9 Failed: 8) Failed tests: 1-4, 6-9 Non-zero exit status: 1 t9602-cvsimport-branches-tags.sh (Wstat: 256 Tests: 11 Failed: 5) Failed tests: 1-3, 7, 9 Non-zero exit status: 1 t9604-cvsimport-timestamps.sh (Wstat: 256 Tests: 2 Failed: 2) Failed tests: 1-2 Non-zero exit status: 1 Files=5, Tests=38, 5 wallclock secs ( 0.05 usr 0.01 sys + 0.49 cusr 0.16 csys = 0.71 CPU) Result: FAIL - 8< - A funny thing was that without cvsps-3.7 on $PATH (which means I am getting distro packaged cvsps 2.1), I got identical errors. Looking at the log message, it seems that you meant to remove t960[123], so perhaps the patch simply forgot to remove 9601 and 9602? As neither test runs "git cvsimport" with -o/-m/-M options, ideally we should be able to pass them with and without having cvsps-3.x. Not passing them without cvsps-3.x would mean that the fallback mode of rewritten cvsimport is not working as expected. Not passing them with cvsps-3.x may mean the tests were expecting a wrong conversion result, or they uncover bugs in the replacement cvsimport. t9600 fails with "-a is no longer supported", even without having cvsps-3.x on the $PATH (i.e. attempting to use the fallback). I wonder if this is an option the updated cvsimport would want to simply ignore? It is a way to tell the old cvsps/cvsimport to disable its heuristics to ignore commits made within the last 10 minutes (this is done in the hope of waiting for the per-file nature of CVS commits to stabilize, IIUC); the user tells the command that he knows that the CVS repository is now quiescent and it is safe to import the whole thing. If the updated cvsps can identify the changeset more reliably and it no longer needs "-a" option, it may be more helpful to the users to migrate their script if it allowed, warned and then ignored the option. It certainly would help sharing of this test script between runs that use the old and new cvsps as backends. t9601 (after resurrecting the t/t9601/cvsroot directory) fails in an interesting way. - >8 - $ sh t9601-cvsimport-vendor-branch.sh -i -v Initialized empty Git repository in /git/git.build/t/trash directory.t9601-cvsimport-vendor-branch/.git/ expecting success: git cvsimport -C module-git module Traceback (most recent call last): File "/git/git.build/git-cvsimport", line 262, in subprocess.check_output("cvsps -V 2>/dev/null", shell=True) AttributeError: 'module' object has no attribute 'check_output' not ok - 1 import a module with a vendor branch - 8< - Apparently, the copy of "subprocess.py" I have does not give us the check_output thing: - >8 - $ python Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48) [GCC 4.4.5] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import subprocess >>> dir(subprocess) ['CalledProcessError', 'MAXFD', 'PIPE', 'Popen', 'STDOUT', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '_active', '_cleanup', '_demo_posix', '_demo_windows', '_eintr_retry_call', 'call', 'check_call', 'errno', 'fcntl', 'gc', 'list2cmdline', 'mswindows', 'os', 'pickle', 'select', 'signal', 'sys', 'traceback', 'types'] - 8< - The story is the same for t9602 and t9603 (again after resurrecting the necessary files). http://docs.python.org/2/library/subprocess.html tells me that check_output has first become available in 2.7. So... does this mean that we now set the minimum required version of Python to 2.7? I dunno. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http:/
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Junio C Hamano : > Yeah, it is OK to _discourage_ its use, but to me it looks like that > the above is a fairly subjective policy decision, not something I > should let you impose on the users of the old cvsimport, which you > do not seem to even treat as your users. Er. You still don't seem to grasp the fundamentals of this situation. I'm not imposing any damn thing on the users. What's imposing is the fact that cvsps-2.x and the Perl cvsimport are both individually and collectively *broken right now*, and within a few months the Perl git-cvsimport is going to cease even pretending to work. I'm trying to *fix that problem* as best I can, fixing it required two radical rewrites, and criticizing me for not emulating every last detail and misfeature immediately is every bit as pointless and annoying as arguing about the fabric on the deck chairs while the ship is sinking. To put it bluntly, you should be grateful to be getting back any functionality at all - because the alternative is that the Perl git-cvsimport will hang out in your tree as a dead piece of cruft. Your choice is between making it easy for me replace it with minimum disruption now and hoping for someone else to replace it months from now after you've had a bunch of unhappy users bitching at you. So let me be more direct. I think the -M and -m options are sufficiently bad ideas that I am *not willing* to put in the quite large amount of effort that would be required to implement them in cvsps or parsecvs. That would be a bad use of my time. This is not the case with -o; that might be a good idea if I understood it. This is also not like the 2.x fallback; I thought that was a bad idea (because it would be better for users that the combination break in an obvious way than continue breaking in a silent one), but it was a small enough effort that I was willing to do it anyway to keep the git maintainer happy. The effort to fix the quoting bugs is even easier for me to justify; they are actual bugs. Those are my engineering judgments; go ahead and call them "subjective" if you like, but neither the facts nor my judgment will change on that account. > The "major" in my sentence was from your description (the bugs you > fixed), and not about the new ones you still have in this draft. I > did not mean to say that you are trading fixes to "major" bugs with > different "major" bugs. OK, thank you. In the future I will try to bear in mind that English is not your primary language when I evaluate statements that seems a bit offensive. So what's your next bid? Note that you can't increase my friction and hassle costs much more before I give up and let you deal with the consequences without me. I want to do the right thing, but I have more other projects clamoring for my attention than you could easily guess. I need to get git-cvsimport *finished* and off my hands - I may already have given it more time than I really should have. So give me your minimum list of deliverables before you'll merge, please, and then stick to it. I assume fixes for the quoting bugs will be on that list. -- http://www.catb.org/~esr/";>Eric S. Raymond -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
"Eric S. Raymond" writes: > Junio C Hamano : > ... > The other is a design-level problem - these options were a bad idea to > begin with. In earlier list mail I said > > An example of the batchiness mistake close to home is the -m and -M > options in the old version of cvsimport. It takes human judgment > looking at the whole commit DAG in gitspace to decide what merge > points would best express the (as you say, sometimes ambiguous) CVS > history - what's needed is a scalpel and sutures in a surgeon's hand, > not a regexp hammer. > > One specific problem with the regexp hammer is false-positive matches > leading to unintended merges. Yeah, it is OK to _discourage_ its use, but to me it looks like that the above is a fairly subjective policy decision, not something I should let you impose on the users of the old cvsimport, which you do not seem to even treat as your users. >> Having the code to die when it sees options the rewritten version >> does not yet support before it calls the fallback makes the fallback >> much less effective, no? > > Only to the extent that -o/-m/-M are really important, which I doubt. > But that might be fixable, and I'll put it on the to-do list. > >> Not very impressed (yet). The advertised "fix major bugs" sounds >> more like "trade major bugs with different ones with a couple of >> feature removals" at this point. > > If you think that, you have failed to understand just how broken and > dangerous the old combination is. None of the details you've called > out are "major" by any stretch of the imagination compared to it > silently botching the translation of repositories. The "major" in my sentence was from your description (the bugs you fixed), and not about the new ones you still have in this draft. I did not mean to say that you are trading fixes to "major" bugs with different "major" bugs. Insecure quoting of parameters is much easier to fix; it does need to be addressed, though. It is just that looking at the state of the patch as submitted left me with a feeling that this topic needs a lot more time to mature than I previously was led to believe by your earlier messages, which made me someaht sad. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
"Eric S. Raymond" writes: > Junio C Hamano : >> I think the prevalent style in this script is to write "print" >> without parentheses: >> >> print STDERR "msg\n"; > > That can be easily fixed. > >> This looks lazy and unsafe quoting. Is there anything that makes >> sure repository path does not contain a single quote? > > No. But...wait, checking...the Perl code didn't have the analogous > check, so there's no increased vulnerability here. I'll put it on the > to-do list for after I ship parsecvs. I checked before I sent that review, and as far as I could tell, it was fairly consistently avoiding the lazy and insecure forms, e.g. system("com mand " . $param); open($fh, "com mand " . $param . " |"); while (<$fh>) { ... } but used the more sequre list form, e.g. system(qw(com mand), $param); open($fh, "-|", qw(com mand), $param); while (<$fh>){ ... } But of course there may be some places that were careless that I didn't spot (and previous reviewers of the current cvsimport didn't). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
Junio C Hamano : > I think the prevalent style in this script is to write "print" > without parentheses: > > print STDERR "msg\n"; That can be easily fixed. > This looks lazy and unsafe quoting. Is there anything that makes > sure repository path does not contain a single quote? No. But...wait, checking...the Perl code didn't have the analogous check, so there's no increased vulnerability here. I'll put it on the to-do list for after I ship parsecvs. > > +def command(self): > > +"Emit the command implied by all previous options." > > +return "(cvs2git --username=git-cvsimport --quiet --quiet > > --blobfile={0} --dumpfile={1} {2} {3} && cat {0} {1} && rm {0} > > {1})".format(tempfile.mkstemp()[1], tempfile.mkstemp()[1], self.opts, > > self.modulepath) > > Could we do something better with this overlong source line? Yes. The correct fix is to simplify cvs2git's rather baroque command-line interface. Michael Haggerty has accepted that patch. Soon that line will look like this: return "cvs2git --quiet --quiet {0} {1}".format(self.opts, self.modulepath) Older versions of cvs2git will terminate cleanly with an error message when called this way. > > +elif opt == '-o': > > +sys.stderr.write("git cvsimport: -o is no longer supported.\n") > > +sys.exit(1) > > Isn't this a regression? It would be if the -o behavior were consistent and decently documented. When I tested this option with the Perl version I got no result. Possibly my usage was incorrect; if anyone can be found who actually understands how it's supposed to work in detail and will tell me, I can probably support it. > > +elif opt in ('-m', '-M'): > > +sys.stderr.write("git cvsimport: -m and -M are no longer > > supported: use reposurgeon instead.\n") > > +sys.exit(1) > > I wonder if it is better to ignore these options with a warning but > still let the command continue; cvsps-3.x was supposed to get merges > right without the help of these ad-hoc options, no? Sorry, I don't know where you got that idea. I don't think the general merge detection that would need is possible even in principle. > Otherwise it looks like a regression to me. There are two reasons -m and -M aren't supported. One is implementation-level. The wrapper script no longer deals with individual files or changesets or branches; it relies on the conversion engine to do all that. (As it should - the old design was a mess with lots of stuff being done at the wrong level.) But the conversion engines don't implement -m or -M, and aren't ever going to (see next paragraph). The other is a design-level problem - these options were a bad idea to begin with. In earlier list mail I said An example of the batchiness mistake close to home is the -m and -M options in the old version of cvsimport. It takes human judgment looking at the whole commit DAG in gitspace to decide what merge points would best express the (as you say, sometimes ambiguous) CVS history - what's needed is a scalpel and sutures in a surgeon's hand, not a regexp hammer. One specific problem with the regexp hammer is false-positive matches leading to unintended merges. That's why I won't implement these in cvsps or parsecvs. Instead I've just added DAG visualization via graphviz in reposurgeon, so a human can quickly see candidate merges in the visualization and do them by hand. This is better and safer. > Having the code to die when it sees options the rewritten version > does not yet support before it calls the fallback makes the fallback > much less effective, no? Only to the extent that -o/-m/-M are really important, which I doubt. But that might be fixable, and I'll put it on the to-do list. > Not very impressed (yet). The advertised "fix major bugs" sounds > more like "trade major bugs with different ones with a couple of > feature removals" at this point. If you think that, you have failed to understand just how broken and dangerous the old combination is. None of the details you've called out are "major" by any stretch of the imagination compared to it silently botching the translation of repositories. Also bear in mind that leaving the old Perl code in place is not going to be a viable option for more than a few months out. As cvsps-3.x propagates to the distros what you have is going to stop even its current pretense of working. Finally...my own purposes are fulfilled by having CVS exporters that can do a decent job of front-ending for reposurgeon. Rewriting git's wrapper script was extra work that I took on only because I wanted to be friendly to the git project, *but*... ...there is a limit to the amount of what I consider pointless hoop-jumping that friendliness will buy you, and the 2.x fallback eas already pushing that limit. Tread a little more gently, Junio; I've put in a lot of hard, boring work on git-cvsimport over the last two weeks when I w
Re: [PATCH] cvsimport: rewrite to use cvsps 3.x to fix major bugs
> From: "Eric S. Raymond" > ... > diff --git a/git-cvsimport.perl b/git-cvsimport-fallback.perl > similarity index 98% > rename from git-cvsimport.perl > rename to git-cvsimport-fallback.perl > index 0a31ebd..4bc0717 100755 > --- a/git-cvsimport.perl > +++ b/git-cvsimport-fallback.perl > @@ -1,4 +1,8 @@ > #!/usr/bin/perl > +# This code became obsolete in January 2013, and is retained only as a > +# fallback from git-cvsimport.py for users who have only cvsps-2.x. > +# It (and the code in cvsimport.py that calls it) should be removed > +# once the 3.x version has had a reasonable time to propagate. > > # This tool is copyright (c) 2005, Matthias Urlichs. > # It is released under the Gnu Public License, version 2. > @@ -27,6 +31,10 @@ > use POSIX qw(strftime tzset dup2 ENOENT); > use IPC::Open2; > > +print(STDERR "You do not appear to have cvsps 3.x.\n"); > +print(STDERR "Falling back to unmaintained Perl cvsimport for cvsps 2.x.\n"); > +print(STDERR "Upgrade your cvsps for best results.\n"); I think the prevalent style in this script is to write "print" without parentheses: print STDERR "msg\n"; > diff --git a/git-cvsimport.py b/git-cvsimport.py > new file mode 100755 > index 000..129471e > --- /dev/null > +++ b/git-cvsimport.py > @@ -0,0 +1,354 @@ > +#!/usr/bin/env python > +# > +# Import CVS history into git > +# > +# Intended to be a near-workalike of Matthias Urlichs's Perl implementation. > ... > +class cvsps: > +"Method class for cvsps back end." > +def __init__(self): > +self.opts = "" > +self.revmap = None > +def set_repo(self, val): > +"Set the repository root option." > +if not val.startswith(":"): > +if not val.startswith(os.sep): > +val = os.path.abspath(val) > +val = ":local:" + val > +self.opts += " --root '%s'" % val This looks lazy and unsafe quoting. Is there anything that makes sure repository path does not contain a single quote? > +def set_authormap(self, val): > +"Set the author-map file." > +self.opts += " -A '%s'" % val > +def set_fuzz(self, val): > +"Set the commit-similarity window." > +self.opts += " -z %s" % val > +def set_nokeywords(self): > +"Suppress CVS keyword expansion." > +self.opts += " -k" > +def add_opts(self, val): > +"Add options to the engine command line." > +self.opts += " " + val ... especially for callers of this method. The same comment applies to many uses of "val" in the method implementations of this class and the cvs2git class. > +def command(self): > +"Emit the command implied by all previous options." > +return "(cvs2git --username=git-cvsimport --quiet --quiet > --blobfile={0} --dumpfile={1} {2} {3} && cat {0} {1} && rm {0} > {1})".format(tempfile.mkstemp()[1], tempfile.mkstemp()[1], self.opts, > self.modulepath) Could we do something better with this overlong source line? > +if __name__ == '__main__': > ... > +for (opt, val) in options: > +if opt == '-v': > +verbose += 1 > +elif opt == '-b': > +bare = True > +elif opt == '-e': > +for cls in (cvsps, cvs2git): > +if cls.__name__ == val: > +backend = cls() > +break > +else: > +sys.stderr.write("git cvsimport: unknown engine %s.\n" % val) > +sys.exit(1) > +elif opt == '-d': > +backend.set_repo(val) > +elif opt == '-C': > +outdir = val > +elif opt == '-r': > +remotize = True > +elif opt == '-o': > +sys.stderr.write("git cvsimport: -o is no longer supported.\n") > +sys.exit(1) Isn't this a regression? > +elif opt == '-i': > +import_only = True > +elif opt == '-k': > +backend.set_nokeywords() > +elif opt == '-u': > +underscore_to_dot = True > +elif opt == '-s': > +slashsubst = val > +elif opt == '-p': > +backend.add_opts(val.replace(",", " ")) > +elif opt == '-z': > ... > +elif opt == '-P': > +backend = filesource(val) > +sys.exit(1) ??? > +elif opt in ('-m', '-M'): > +sys.stderr.write("git cvsimport: -m and -M are no longer > supported: use reposurgeon instead.\n") > +sys.exit(1) I wonder if it is better to ignore these options with a warning but still let the command continue; cvsps-3.x was supposed to get merges right without the help of these ad-hoc options, no? Otherwise it looks like a regression to me. > +elif opt == '-S': > +backend.set_exclusion(val) > +elif opt == '-a': > +sys.stderr.write("git cvsimport: -a is no longer supported.\n") > +sys.exit(1) > +elif opt == '-L': > +sys.st