Unless there's something new in the packager that I've not seen yet, using d.o Git pulls in production bypasses the packager. That is, you're then missing:

- The full version information in the info file, which is used by update manager.
- The LICENSE.txt file that every module is supposed to have.

Is that no longer the case? I'm pretty sure both of those still only happen with a tarball, so if you want those (and I do) then you need to use a tarball.
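
For reference, here's roughly what the packaging script appends to a module's .info file and what a raw Git pull therefore lacks; the module name, version, and datestamp below are made up, and the exact comment wording may vary:

    $ tail -n 5 sites/all/modules/example/example.info
    ; Information added by drupal.org packaging script on 2011-02-15
    version = "7.x-1.0"
    core = "7.x"
    project = "example"
    datestamp = "1297788521"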

Also, if you want to manage both core and contrib modules that way, it means you're now using git submodules, which are, AFAIK, generally agreed to suck, or complex sub-tree merging that is out of reach of 99% of developers. Hell, I've done it and I don't want to do it. :-)

Shallow clones are fine for cutting down disk usage, certainly. But there are workflow considerations there that I don't believe Git solves (at least not yet). If I'm not doing site-specific or company-specific branches of core or modules (that is, hacking core or hacking modules, which is a no-no in 95% of cases), then the extra patch-level control that the more complex all-Git approach would allow is useless to me, because I'm not even using it.
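
(To be clear, the disk-size part is easy enough; a rough sketch, assuming the usual d.o clone URL and the 7.x branch:

    # shallow clone: only grab recent history instead of the whole repo
    git clone --depth 5 --branch 7.x http://git.drupal.org/project/drupal.git
    du -sh drupal/.git   # sanity-check how much history you're actually carrying

My point is about the workflow around it, not the bytes.)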

I'm not saying there are no use cases for an all-git-all-the-time site building process, just that it has implications that you're glossing over in return for a benefit that the majority of use cases don't even need.

--Larry Garfield

On 3/1/11 12:38 PM, Sam Boyer wrote:


On 3/1/11 8:13 AM, [email protected] wrote:
I think the question is more about non-custom dev history; there's
little need for a client site to have the complete development history
of Drupal 4.3 in its repo, for instance.

So you do a shallow clone that skips irrelevant branches and only grabs recent history on the ones you want; that's fine.


Lately, what I've been doing/advocating is using Drush and real releases to download stuff from Drupal.org (core, contrib modules, etc.) and then checking the whole site into Git. If I update a module, I use Drush for that and then update the code in my Git repo. Then I deploy to production using *my* git repo (which has my full dev history, but not every upstream commit of every project I've ever used) and tags.
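
Roughly, that looks like this; project names, versions, paths and the tag are just placeholders:

    # pull in a real, packaged release and record it in *my* repo
    drush dl views-7.x-3.0
    git add sites/all/modules/views
    git commit -m "Add Views 7.x-3.0 (d.o release)"

    # later: update via drush, review, commit, tag, deploy the tag
    drush pm-update views
    git add -A sites/all/modules/views
    git commit -m "Update Views to 7.x-3.1"
    git tag deploy-2011-03-01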

...which is *exactly* what I'm saying is pointless. Why stick a stupider
intermediary - tarballs - into a system that's already highly capable of
doing patch & vendor management? The only thing you've accomplished is
diluting the capabilities of your version control system to manage
upstream changes.


That keeps me on real releases, avoids unnecessary repository bloat, but
still gives me a full history of all work on that project specifically.

"Unnecessary repository bloat?" Two great words there, let's address
each one:

"Unnecessary": well, the full branch history is a requirement if you
want to use git's smart merging algorithms. So the only way it's
"unnecessary" is if you prefer manually hauling chunks out of
patch-generated .rej and .orig files.

"Bloat": Really, step back and think about this. Are you solving a real,
compelling problem faced by most modern servers? How much does it matter
that your Drupal tree is, say, 70MB instead of 700MB? It really doesn't.
Not even on shared hosting. And, let's not forget - judicious use of
shallow clones & compression whittles that number way, WAY down. IMO,
ripping out the vendor history is something a lot of us got in the habit
of doing because we were used to having CVS vendor data that earned us
nothing but headaches, and it was an easy "optimization" that made our
Drupal trees feel more svelte.
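
(Concretely, something along these lines is usually all the "optimization" a full-history tree needs; exact savings will vary:

    # repack/compress the object store to shrink .git on disk
    git gc --aggressive --prune=now
    du -sh .git

That, plus a shallow clone where it genuinely matters, and the size argument evaporates.)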

Well, now it does get you something. It gets you a _ton_. Now, all you
need for company-specific or site-specific customizations that can
easily coexist with rich vendor data is some branch naming conventions
and practice with reading git logs. Yeah, that takes some learning too,
but it's worth it.
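
A sketch of what I mean, with made-up branch and site names, assuming you cloned core straight from d.o:

    # vendor history stays on the tracking branch; your work lives on your own branch
    git clone http://git.drupal.org/project/drupal.git mysite
    cd mysite
    git checkout -b mysite-7.x origin/7.x   # site-specific branch, named by convention

    # ...commit your site-specific changes on mysite-7.x...

    # when upstream moves, reconcile with a real merge instead of patch juggling
    git fetch origin
    git merge origin/7.x
    git log --oneline --graph    # read what's yours vs. what came from upstream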

cheers
s


--Larry Garfield

On 3/1/11 1:56 AM, Sam Boyer wrote:
I tend to advocate full clone. You're talking about a task that version
control is designed for. Now that we've made the switch, IMO it's
native code : Git :: bytecode : another VCS (or worse, patch stacks, etc.). I don't
know what drush did before to "make this easy" - maybe pop off patch
stacks, update the module, then re-apply the patches? Fact is, though,
nothing Drush could have done under CVS can compare to patching with
native Git commits: your patches can speak the same language as upstream
changes, and you have all of Git's merge & rebase behavior at your
fingertips to reconcile them.
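
For example (the issue number, patch file name and tag below are placeholders):

    # record an issue-queue patch as a real commit on your branch
    git apply --index fix-something-123456-1.patch
    git commit -m "Apply patch from issue #123456"

    # when a new release lands, let git reconcile your commits with upstream
    git fetch origin
    git rebase 7.x-1.2       # or merge, if you'd rather keep the history explicit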

There are some occasional exceptions to this, but I really do think it's
a bit daft not to keep the full history. Keeping that history means
peace of mind that your patches (now commits) can be intelligently
merged with all changes ever made to that module for all time, across
new versions, across Drupal major versions...blah blah blah. Trading a
few hundred MB of disk space for that is MORE than worth it.

cheers
s

On 2/28/11 10:56 AM, Marco Carbone wrote:
Since a Git clone downloads the entire Drupal repository, the Drupal
codebase is no longer so lightweight (~50MB) if you are using Git,
especially if you clone contrib module repositories as well.

With CVS, our usual practice with clients was to check out core and
contrib from CVS, so that we could easily monitor any patches that had
been applied and make sure they weren't lost when updating to newer
releases.  (Drush makes this particularly easy.) This is doable with Git
as well, but now there seems to be the added cost of having to download
the full repository. That's great when doing core/contrib development,
but not really necessary for client work. It seems unavoidable as far as
I can tell, but I don't think I'm satisfied with the "just use a tarball
and don't hack core/contrib" solution, especially when patches come into
play.
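
(What I'd want to keep is basically the ability to do this kind of check inside a module checkout cloned from d.o; the tag name is just an example:

    # what local patches/hacks are we carrying relative to the release we started from?
    git diff 7.x-1.0
    git log --oneline 7.x-1.0..HEAD

...without paying for the whole upstream history to get it.)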

Is there something I'm missing/not understanding here, or does one just
have to accept the price of a bigger codebase when using Git to manage
core/contrib code? Or is managing core/contrib code this way passe now
that updates can be done through the UI?

-marco



