On Thu, Nov 13, 2014 at 09:46:13AM +0100, Raphael Hertzog wrote:
> [ I skip the more detailed discussions on naming conventions to
> concentrate on your higher level questions for now ]

Agreed, if we solve the tricky problems, that part is mostly just yak
shaving (and if we can't, it's probably mostly irrelevant ...)

> On Thu, 13 Nov 2014, Ron wrote:
> > Sure, I understood those were your goals.
> >
> > What I haven't seen, and what I'm asking for, is an actual detailed
> > rationale describing the actual detailed problem(s) that you think
> > these goals will be a remedy for.
>
> Problem 1: the derivatives
> --------------------------
>
> So I am a Kali Linux contributor. We use git repos to maintain all our
> packages and we use git-buildpackage.

I guess the first question there is what were the arguments put forward
for deciding to 'standardise' on gbp? If there weren't any, maybe that's
an argument you should have (and if there were, maybe it's one to
revisit :)

If you have a clearer idea now of the problems you are facing, it might
make properly evaluating things that avoid those problems easier.

> Most of the Kali contributors
> are not long-term Debian contributors, I write documentation so that
> they can contribute to Kali (while basing our work on Debian).
>
> To make this manageable I opted to always use a workflow based on
> "git-import-orig". Even when Debian has its own git repo, we start
> from the released source packages because that's the only level of
> uniformity that we can rely on. And it's a pity because if we could
> build on Debian's git repo, it also means that our work would be easier
> to merge for the Debian maintainer.

That's not a totally terrible way to kick off a new repo when there
isn't an existing one. I wrote git-debimport to build a history from
just existing Debian source packages when that's all you have, and
debsnap (in devscripts) originally got written for use with it to
collect the whole set from snapshot when you didn't already have them.
And you can fairly easily splice a repo created with it to a real
upstream one to continue maintenance in a more sensible way from there.

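A minimal sketch of one way such a splice could be done, using
`git replace --graft`. This is an assumption for illustration only, not
necessarily how a git-debimport history gets joined in practice; the
repo, branch names and commit messages are all made up.

```shell
#!/bin/sh
# Sketch: splice an imported packaging history onto an upstream history.
# All names here (branches, messages) are hypothetical demo values.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/upstream

# Stand-in for the real upstream history.
git commit -q --allow-empty -m "upstream 1.0"
upstream_tip=$(git rev-parse HEAD)

# Stand-in for a history built only from source packages: it starts
# from its own root commit and shares nothing with upstream.
git checkout -q --orphan imported
git commit -q --allow-empty -m "import package 1.0-1"
import_root=$(git rev-parse HEAD)
git commit -q --allow-empty -m "import package 1.0-2"

# Record upstream's tip as the parent of the import root. The original
# commit object is untouched; git consults the replacement ref instead.
git replace --graft "$import_root" "$upstream_tip"

# The packaging branch now shows upstream's commit at its base.
git log --oneline imported
```

Replacement refs are local unless they are pushed explicitly, so a
permanent splice would need something more (a rebase or a history
rewrite, for instance). That part is left out of this sketch.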
But it is a kind of terrible way to continue maintaining them if there
is a repo.

Unfortunately I think that if you make "uniformity" the overarching
consideration here, you've basically doomed yourself to "failure" from
the outset. Even if everyone did stick to the conventions already
discussed (which in reality, they won't) there are still far too many
degrees of (quite necessary) freedom to really approach a "just follow
these three easy steps" kind of uniformity. And even if you got close
to achieving that for the "debian branches", the (again necessary)
variability between upstream repos is going to be even greater.

I think at some point you're going to have to rely on people using real
human intelligence to look at a repo and form their own understanding
of its structure. Most really aren't all that complicated (however much
they vary between each other), and in the worst case you can always
actually ask the 'upstream' maintainer to explain anything that is
unclear.

That's going to get annoying really fast if someone new asks me really
basic questions like that every week. But if you do this right it's
also something that in the worst case someone should only need to ever
ask once, because they can document what they learned for the next
person if there are no 'dedicated' maintainers for individual packages,
and this is a needed thing.

I don't think I've ever had to ask an upstream "how does your repo
work" though. So you possibly could also handle this internally with a
few knowledgeable mentors too.

> They could add the Kali repo as a
> remote (without fearing any conflict in terms of tags, branch names)
> and just merge or cherry-pick as appropriate.

All that said, this part however is not a problem at all. If you've
branched your repo off the 'upstream' one (so you share its history)
then there's never going to be a conflict between branch names.

You can have a Kali repo, cloned from a Debian repo, cloned from an
upstream repo, and *all* of those repos could have their own separate
and distinct branch called "master", and still there would be no
conflict. git is already going to namespace them so when you add the
remotes the branch refs will be (for remotes named upstream, debian,
kali):

  remotes/upstream/master
  remotes/debian/master
  remotes/kali/master

What you name those branches if you check them out locally is totally
up to you. The local names need to be unique, but you can:

  $ git checkout -t debian/master -b debian-master
  $ git checkout -t kali/master -b master

And that will work just fine.

Tag names aren't quite so forgiving. But realistically, even if you
simply name your tags v$version in all of upstream, debian, and kali,
then if you actually have conflicting names you were already in deep
trouble anyway, because now you have a kali package with *exactly* the
same version as a debian package. If it really is the 'real' debian
package, you have no problem and don't need a kali tag for it anyway.
If it's a kali special, then the package already ought to have a
x.y.z-1kali2 type version anyway. So this part already naturally
avoids conflicts too.

> And we could also build on work in progress that has not yet been
> released as a source package. Right now, the only packages where we
> build on top of the Debian git repositories are some native packages
> (like debian-installer).

I can't really tell you how kali ought to work, but if you can solve it
for this case, you should be able to do the same for non-native
packages too. And I would recommend that you try to, even if you can't
get the "uniformity" that you'd think of as ideal.

What is the exact difference that makes native packages work for this
but others be harder? In the case of gitpkg, this is really just the
difference between passing one ref and two to export it.

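The remote namespacing described above can be tried out in a scratch
directory. The following is only a sketch under stated assumptions: the
three repos are empty stand-ins created on the spot, and every name in
it other than the remote names upstream, debian and kali is invented
for the demo.

```shell
#!/bin/sh
# Sketch: three repos that each have their own "master" branch, all
# tracked from one working repo without any name conflict.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

work=$(mktemp -d)
cd "$work"

# "upstream": one commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "upstream release"

# "debian" is cloned from upstream, "kali" is cloned from debian.
git clone -q upstream debian
git -C debian commit -q --allow-empty -m "debian packaging"
git clone -q debian kali
git -C kali commit -q --allow-empty -m "kali changes"

# A working repo that adds all three as remotes.
git init -q mine
cd mine
git remote add upstream ../upstream
git remote add debian ../debian
git remote add kali ../kali
git fetch -q --all

# Each master lives under its own remote's namespace.
git branch -r

# Local names are our own choice; they only need to be unique locally.
git checkout -q -t debian/master -b debian-master
git checkout -q -t kali/master -b master
```

After this runs, `git branch -vv` in the working repo shows which
remote branch each local branch tracks.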
> I can't afford to document all the possible ways Debian is maintaining
> their package but if I can write a documentation that covers the common
> case and if I can tell them "when you see those branches, you can follow
> the instructions below", then we have made some real progress.

I really do think that the names of the branches are actually going to
be the least of your worries here, unfortunately. Even with a naming
scheme that's widely adopted, things just aren't going to be that sort
of uniform outside of (a fairly large number of) fairly small subsets.

You're going to need a better solution than this unfortunately, if you
want it to actually work.

> Problem 2: interoperability between the tools
> ---------------------------------------------
>
> I am part of the Python Modules team who wants to switch to git but not
> all contributors are using the same git helper tools and yet we would like
> to all work together on the same repositories without forcing everybody
> to use the same helper tool (habits are hard to change).

Really the only way this can possibly work is if all the tools that you
consider viable candidates to use *don't* require some special
structure of their own in the repo to work (or knowing the right magic
incantation specific to each repo for them to work correctly).

Otherwise, the tools that do are simply never going to work with a repo
that doesn't have that. There really isn't any way around that, except
to replace those tools with better ones ...

I completely agree with the habits thing. I blame svn for all sorts of
terrible habits that seem to persist perniciously :) At some point
though, you'll probably have to decide if the habits that seem
acceptable in private are habits that really are acceptable in a group
situation.

It might pay to remember that a willingness to fix bad habits *is* what
separates the best and most productive developers from the pack ...

> We can't just let each maintainer use the default layout suggested
> by his preferred helper tool and the default tagging scheme, we have to
> define some common layout for the whole team. Then it matters less if
> people are using git-buildpackage or git-dpm or gitpkg.

Again, the tagging and "layout" are going to be the least of your
problems here. Even if you get consistency there, most of these tools
still aren't going to be interchangeable if they make assumptions about
repo content that git itself doesn't enforce.

> It might be awkward at times but at least there is some consistency among
> the team, and the few problems that will arise will be occasions to
> improve the tools.
>
> But we have to define the common layout to use and this discussion should
> hopefully solve this too.

I think the differences you're going to really get stuck on are far
more fundamental than that. I think until you sort those out (or rule
out use of the tools for which this is not reconcilable) it's premature
to worry about layout conventions. If you solve that, any convention
"will do". If you don't, no convention is going to be able to help.

It's not enough to just say "if all the layouts looked more like what
gbp uses, everything will interop". That's not the thing that makes it
different to them.

> Problem 3: making it easier for new contributors
> -------------------------------------------------
>
> While I can appreciate the versatility of gitpkg, new contributors
> are looking for guidance and clear instructions. It's difficult to give
> those when we have zero common ground on how we manage our git
> repositories within our project.

You should probably go have a play with gitpkg :) I have a hunch
you're going to be shocked at how trivial the instructions for using it
will be. And how much of what you say you want is going to Just Work
out of the box with it.

We talked Jacob through this on IRC a few weeks back.
He came from having read some hideously complicated tutorial for using
gbp, and was bashing his head against the wall trying to figure out how
to do the simple task of being able to manage his debian patches
against his own upstream repo.

The biggest problem he had once we started that, was *unlearning* that
this was a hideously complicated process. It took longer to explain
"no, no, you don't need to do all that stuff anymore" than it did to
explain what he did need to do. He simply couldn't believe at first
that it was all he needed to do.

I'm happy to walk anyone else through that if it's not immediately
intuitive to them and they want to try it.

Primacy in learning is a powerful thing. Training someone to copy a
bad habit is a lot easier than teaching them to break out of it again.
If you're worried about new contributors, this really is something you
should seriously explore in more depth.

> While I don't see us converging on any single helper tool right now,

One thing that might help existing users here, in the group case you
described above, is it's probably quite possible to build some front
ends to gitpkg that mimic the other tools, for people who simply find
that finger memory is hard to break.

Unlike several of them, gitpkg didn't try to be a "framework" for
package management. It simply focussed on the job of exporting a
source package, in any valid source package format, from an existing
git repo. It provides hooks to let you tweak that, and to perform
automated tasks after export (like shipping the package off to be
built), but it explicitly doesn't provide high level functions for repo
management tasks *prior* to export. Mostly because it's trivial to
write those as separate scripts which call gitpkg once it's time to do
the export.

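A minimal sketch of what such a separate script could look like: a
hypothetical `release` front end that does a repo-side check with plain
git and only then hands off to the exporter. It assumes gitpkg is
called with one ref for a native package, or with a debian ref followed
by an upstream ref otherwise; check gitpkg(1) for the actual interface.
The function name, the `EXPORTER` variable and the check itself are
invented for this sketch.

```shell
#!/bin/sh
# Hypothetical front end: do repo-side checks with plain git, then call
# the exporter. EXPORTER defaults to gitpkg; the calling convention
# "gitpkg <debian-ref> [<upstream-ref>]" is an assumption of this demo.
set -eu

release() {
    debian_ref=$1
    upstream_ref=${2:-}

    # Pre-export check: refuse to export a ref that does not exist.
    git rev-parse --verify --quiet "$debian_ref^{commit}" >/dev/null || {
        echo "no such ref: $debian_ref" >&2
        return 1
    }

    # One ref for a native package, two when there is an upstream ref.
    if [ -n "$upstream_ref" ]; then
        "${EXPORTER:-gitpkg}" "$debian_ref" "$upstream_ref"
    else
        "${EXPORTER:-gitpkg}" "$debian_ref"
    fi
}
```

Setting `EXPORTER=echo` prints the command line that would be run
instead of exporting anything, which is handy for trying the wrapper
out on a machine without gitpkg installed.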
So you might find that it's actually much easier to implement any
functionality from the more troublesome tools that you do find valuable
on top of gitpkg than it is to actually fix the more fundamental
problems in those tools themselves ...

> it's important to start taking steps that bring them closer so that
> we can give more useful explanations to newcomers and so that they
> can get started
>
> > Likewise, it's not clear to me that tools other than gitpkg are
> > actually interchangeable, because they weren't designed to be from
> > the outset and rely on magic being committed into the repo to work.
> >
> > I don't really see how some naming conventions can fix that either.
>
> Naming conventions won't fix that but it's still a pre-requisite
> to be able to fix the tools that (unlike gitpkg) voluntarily set (by
> default) more constraints on the expected layout of the repository.

Like I said above, I don't think that's going to get you past their
more fundamental differences.

> > Maybe if you start by detailing the problems, we will be able to
> > see some better solutions that actually achieve your real goals
> > and result in real improvements to the tools that created them.
>
> Let's see!

Well, at the very least I can promise you that if we find real limiting
problems in gitpkg that will be something I'll be keen to see remedied!

Whether the other tools can be fixed ... yeah. After all, there was a
reason gitpkg was born as a separate thing in the first place :)

> > > Fine if the other tools do not need anything like that. But who knows,
> > > maybe you will want to enhance git-debcherry to not only update
> > > debian/patches/ but also store the corresponding git branch for long-term
> > > storage. In which case, you will already have a recommended tag name
> > > for this purpose :-)
> >
> > Why would you want to do that?
>
> To share them with upstream in a form ready for merge (or more practical
> for review/analysis, etc.).

Ok.
Time for me to share a case study :)

I've never really bought the argument that "quilt patches in a debian
package will result in them being more efficiently upstreamed", and
I've never really seen any evidence to contradict that -- but we don't
need to rely on speculation, I have a real example to share!

I fairly recently adopted tftp-hpa after it was orphaned. The previous
git repo was deleted when it was orphaned so I never got a copy of that
and I'm not sure exactly what form it was in, but I believe it was
using tarball imports using gbp, and exporting format 3 packages with a
quilt series for patches to upstream.

There were upstream patches languishing in there dating back ~5 years.

I quickly enough reconstructed a base history with a git-debimport of
snapshot.d.o (as described above), spliced that to the real upstream
git repo, pulled their latest changes, ditched the debian/patches and
turned them all into real commits, exported and uploaded a new package,
and pushed my repo to alioth.

... less than 8 hours later, *all* of those patches had been cherry
picked and merged into the mainline upstream repo.

This is why I think having them as real commits, and working in the way
that real upstreams really work with git, is infinitely more valuable
than having a patch series in either the package or as duplicated
diff-of-a-diff commits to the repo. This is far from the only upstream
that I've had rapidly accept patches in this way.

Now that gitpkg can automatically find and export them as a
debian/patches series for people who want that too, that can be done at
basically "no cost", so there really is no sensible reason to be
handling those separately anymore, even with the assistance of a
"helper tool". Just use git as git was designed and if you still want
them in the package, let the package export automation do the rest.

There's no better way to share with an upstream using git than using
git in the way that upstream already understands.

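To make the mechanics concrete, here is a small sketch of what picking
up a fix looks like from the upstream side when that fix is a real
commit on a branch sharing upstream's history. The repo, branch names,
files and commit messages are all invented for the demo; this is not
the tftp-hpa history.

```shell
#!/bin/sh
# Sketch: a distro fix kept as a real commit can be cherry-picked by
# upstream directly. All names and contents are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Upstream mainline.
echo "timeout = 5" > config
git add config
git commit -q -m "initial release"

# The packaging branch starts from upstream's history.
git checkout -q -b debian
mkdir debian
echo "placeholder packaging file" > debian/control
git add debian
git commit -q -m "add packaging"

# A fix to upstream code, committed on its own rather than kept as a
# file under debian/patches.
echo "timeout = 30" > config
git commit -q -am "raise default timeout"
fix=$(git rev-parse HEAD)

# Upstream takes just that commit; the packaging commit stays behind.
git checkout -q master
git cherry-pick "$fix" >/dev/null
git log --oneline master
```

In a real setup the two branches would live in different repos, and
upstream would fetch from the packaging repo before cherry-picking; a
single repo is used here only to keep the sketch short.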
(but yes, I know, bad habits die slowly :)

> > A lot of the problems you seem to be worried about here are things that
> > gitpkg was designed to avoid ever having from the outset and simply doesn't
> > have. I think if we can raise awareness about those things and fix them
> > in the tools that have them, that would be an awesome thing. I'm less
> > excited at the idea of codifying those limitations as if they were an
> > inevitably necessary thing, as a way to avoid fixing problems in the
> > tools that might have them though. That would just paint us all into
> > a corner that will be even harder to get out of again later.
>
> I understand, though we can certainly set a default naming convention
> without codifying it as limitations to be imposed. My goal is not to
> restrict the workflows, my goal is to standardize the basic (common)
> concepts and associated branch/tag names and build on that to improve all
> our tools.

Oh, the other one I meant to mention about this the other day and then
forgot, is you also can't rely on "retagging" an upstream branch to not
break things. An increasing number of upstreams are doing automated
version stamping based on the presence of tags in the repo, and if you
lay a new tag on top of theirs, some proportion of those things will
surely break too.

The core problem with making conventions like that though, is that as
soon as anything relies on assumptions that aren't something which git
itself prohibits, it's simply a matter of time before that thing will
explode in your face, on some repo, somewhere.

So if you aren't going to enforce them (which realistically is quite
impossible to do anyway), ultimately the tools need to support
exceptions to them. And as soon as the tools do that, and do that
well, the need for the conventions (beyond the ones that already exist
outside of Debian as common best practices) quickly evaporates.

There really isn't anything special about Debian branches so far as git
is concerned.
They're really no different from any other feature branch in any other
repo. It's the trying to make them special, like they had to be for
CVS and SVN, that has been the cause of so much confusion and
overcomplication in this space.

That's the trap that a lot of tools have fallen into. Maybe some of
them can still dig themselves out of it, maybe some of them we'll find
are fundamentally stuck fast in it. But I do believe that gitpkg is a
working proof that this isn't a trap that can't be avoided.

I'd certainly be interested in any feedback you have on what gitpkg
*doesn't* do for you that you find to be an essential feature of gbp.
I'd be quite surprised if any of those things weren't quite easy to fix
with some fairly simple scripting.

  Cheers,
  Ron