Re: Speeding up dpkg, a proposal

2011-03-03 Thread Marius Vollmer
ext Chow Loong Jin hyper...@ubuntu.com writes:

> Could we somehow avoid using sync()? sync() syncs all mounted filesystems,
> which isn't exactly very friendly when you have a few slow-syncing
> filesystems like btrfs (or even NFS) mounted.

Hmm, right.  We could keep a list of all files that need fsyncing, and
then fsync them all just before writing the checkpoint.

Half of that is already done (for the content of the packages); we would
need to add it for the files in /var/lib/dpkg/, or we could just fsync
the whole directory.
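
The "list of files that need fsyncing" idea can be sketched roughly as
follows.  This is only an illustration in Python, not dpkg's C code; the
class and method names are made up for the example, and it assumes a
POSIX system where a directory can be opened and fsynced.

```python
import os


class DeferredSync:
    """Write files without fsync and remember them for one batched sync."""

    def __init__(self):
        self._pending = []   # paths in the order they were first written
        self._seen = set()   # same paths, for cheap de-duplication

    def write(self, path, data):
        # Fast path: no fsync here, only note that the file needs one.
        with open(path, "wb") as f:
            f.write(data)
        if path not in self._seen:
            self._seen.add(path)
            self._pending.append(path)

    def flush(self):
        """fsync every pending file, then each directory containing one."""
        dirs = []
        for path in self._pending:
            self._fsync_path(path)
            parent = os.path.dirname(os.path.abspath(path))
            if parent not in dirs:
                dirs.append(parent)
        for parent in dirs:
            self._fsync_path(parent)
        synced = len(self._pending)
        self._pending.clear()
        self._seen.clear()
        return synced

    @staticmethod
    def _fsync_path(path):
        fd = os.open(path, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
```

In this sketch, flush() would be called once, right before the checkpoint
is written, instead of calling sync() for all mounted filesystems.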

But then again, I would argue that the sync() is actually always
necessary for correct semantics: you also want to sync everything that the
postinst script has done before recording that a package is fully
installed.


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/87y64w8zy1@big.research.nokia.com



Re: Speeding up dpkg, a proposal

2011-03-03 Thread Marius Vollmer
ext Raphael Hertzog hert...@debian.org writes:

> On Wed, 02 Mar 2011, Marius Vollmer wrote:
>> - Instead, we move all packages that are to be unpacked into
>>   half-installed / reinstreq before touching the first one, and put a
>>   big sync() right before carefully writing /var/lib/dpkg/status.
>
> The big sync() doesn't work. It means dpkg never finishes its work on
> systems with lots of unrelated I/O.

Ok, understood.  It's now clear to me that the big sync should be
replaced with deferred fsyncs.  (I would defer the fsync of the content
of all packages until modstatdb_checkpoint, not just until
tar_deferred_extract.)

With that change, do you think the approach is sound?

> We've seen reports of poor performance with btrfs (and that's what you use
> for Meego IIRC) so you might want to investigate why btrfs is coping so
> badly with a few fsync() just on the status files.

This is about Harmattan, which uses ext4.

To understand our troubles, you need to know that we have around 2500
packages with just a single file in each.  For those packages, dpkg spends
the largest part of its time writing the nine journal entries to
/var/lib/dpkg/updates.

We will reduce the number of our packages, so this issue might solve
itself that way, but I had good success in reducing the per-package
overhead of dpkg, and if it is correct and works for us, why not use the
'reckless' option as well?


Archive: http://lists.debian.org/87mxlc8vem@big.research.nokia.com



Speeding up dpkg, a proposal

2011-03-02 Thread Marius Vollmer
Hi,

I have recently been looking into where dpkg spends most of its time
when installing very many small packages, and came up with the following
idea to speed it up.

- Most of the time is spent writing files very carefully, a lot of them
  in /var/lib/dpkg/updates.

- We can avoid this by writing the files less carefully (without fsync)
  and even skipping the journal entries in /var/lib/dpkg/updates
  completely.

- Instead, we move all packages that are to be unpacked into
  half-installed / reinstreq before touching the first one, and put a
  big sync() right before carefully writing /var/lib/dpkg/status.

[ There is more to this than described here; please check the code before
  trying to find the holes in this short version of the idea.
]

This should be just as safe as writing very many small journal entries,
but if dpkg does get interrupted harshly, it leaves its database behind
in a correct but quite outdated and not so friendly state.  Many
packages that have not been touched will have to be reinstalled because
dpkg can't be sure that they have in fact not been touched.

This should only happen when the system goes down abruptly without any
chance for dpkg to write a checkpoint and without unmounting the
filesystem cleanly.  In any other case, such as a maintainer script
failing or the user interrupting dpkg with C-c, dpkg will write an
accurate checkpoint as the last thing it does.
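
To make the ordering of the steps concrete, here is a toy model in Python.
It is not the experimental dpkg code: run_batch, unpack and checkpoint are
hypothetical names, the status database is just a dict, and checkpoint()
stands in for "sync, then carefully write /var/lib/dpkg/status".

```python
REINSTREQ = "half-installed/reinstreq"


def run_batch(status, to_unpack, unpack, checkpoint):
    """Unpack a batch of packages with only two careful status writes.

    status     -- dict mapping package name to its state (toy database)
    to_unpack  -- list of package names to process
    unpack     -- callable doing the fast, unsynced work; may raise
    checkpoint -- callable standing in for the careful status write
    """
    previous = {pkg: status.get(pkg, "not-installed") for pkg in to_unpack}

    # Flag every package pessimistically before touching the first one.
    for pkg in to_unpack:
        status[pkg] = REINSTREQ
    checkpoint(status)

    touched = set()
    try:
        # The fast part: no journal entry per package.
        for pkg in to_unpack:
            touched.add(pkg)
            unpack(pkg)
            status[pkg] = "unpacked"
    finally:
        # On success or on an ordinary failure, packages that were never
        # touched get their old state back, and one accurate checkpoint
        # is written as the last step.
        for pkg in to_unpack:
            if pkg not in touched:
                status[pkg] = previous[pkg]
        checkpoint(status)
    return status
```

Only if the process dies between the two checkpoints does the first,
pessimistic one remain on disk, which is the "correct but outdated" state
described above.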


I have experimental code for this here, based on dpkg 1.15.8.8:


http://meego.gitorious.org/~mvo/meego-platform-security/reckless-dpkg/commits/mvo/reckless

It shows a speed-up of between a factor of two and six in our environment
(ext4 on a slowish flash drive).  I am not sure whether messing with the
fundamentals of dpkg is worth a factor of two in performance, but I
still think the idea is sound and worth sharing here, if only to be shot
down.

So, opinions?


Archive: http://lists.debian.org/8762s1a4u4@big.research.nokia.com



Re: Speeding up dpkg, a proposal

2011-03-02 Thread Marius Vollmer
ext Chow Loong Jin hyper...@ubuntu.com writes:

> I remember seeing there being some list of files to be fsynced in one of the
> older dpkgs. It's probably that which led to the ext4 slowdown [...]

Hmm, performance is the ultimate reason for doing all this, but right
now, I am mostly interested in whether my changes are correct.  I know
that they improve performance, but I am not totally convinced that they
are actually correct in the way that they change the status of packages,
etc.

I am only proposing to add this as an option to dpkg, not to make it the
default.

We might enable it in Harmattan, if I have the balls and it does in fact
speed things up enough, but nothing of that is certain right now.  We
might get the improvement we need just from reducing our number of
packages to something reasonable.

>> But then again, I would argue that the sync() is actually necessary
>> always, for correct semantics: You also want to sync everything that the
>> postinst script has done before recording that a package is fully
>> installed.
>
> Yes, you're right. I completely forgot about that. I don't think most postinst
> scripts sync when done. I suppose the best that can be done is to batch the
> stuff as best as can be done to reduce the number of sync()s needed.

On the other hand, it _is_ the job of the maintainer scripts to sync if
that's necessary for correctness, and maybe we don't want to take that
responsibility away from them.

And in the big picture, all we need is some guarantee that renames are
committed in order, and after the content of the file that is being
renamed.  I have the impression that all reasonable filesystems give
that guarantee now, no?
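
For reference, the pattern in question looks roughly like this.  It is a
sketch in Python, not dpkg code; replace_file and the ".new" suffix are
made up for the example.  With sync=False the file content is not forced
out before the rename, so correctness after a crash rests entirely on the
filesystem ordering guarantee asked about above, which POSIX itself does
not promise.

```python
import os


def replace_file(path, data, sync=True):
    """Replace path with new content via a temporary file and a rename."""
    tmp = path + ".new"          # hypothetical temporary name
    with open(tmp, "wb") as f:
        f.write(data)
        if sync:
            # The careful variant: content is on disk before the rename.
            f.flush()
            os.fsync(f.fileno())
    # Atomic on POSIX: readers see either the old or the new file.
    os.replace(tmp, path)
```

A caller that wants the speed-up would pass sync=False for every file and
rely on ordering; a caller that does not trust the filesystem keeps the
default.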


Archive: http://lists.debian.org/87r5ao8xqi@big.research.nokia.com



Accepted magit 0.7-1 (source all)

2009-05-16 Thread Marius Vollmer
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Sun, 26 Apr 2009 19:13:16 +0300
Source: magit
Binary: magit
Architecture: source all
Version: 0.7-1
Distribution: unstable
Urgency: low
Maintainer: Marius Vollmer marius.voll...@gmail.com
Changed-By: Marius Vollmer marius.voll...@gmail.com
Description: 
 magit  - A Emacs interface for Git
Closes: 518818
Changes: 
 magit (0.7-1) unstable; urgency=low
 .
   * first official debian upload (closes: #518818)
 .
   [ RISKO Gergely ]
   * pointer added to GFDL-1.2 in debian/copyright
   * changed debian/compat to 7
   * standards-version to 3.8.1
 .
   [ Marius Vollmer ]
   * Follow Debian Emacsen policy.
     - Use scripts in /usr/lib/emacsen-common/packages to
       byte-compile during installation.
   * List all authors in debian/copyright.
   * Lintian fixes.
 .
   New upstream release.
 .
   * Tagging, on 't' and 'T'.
 .
   * Stashing, on 'z' and 'Z'.
 .
   * Wazzup, on 'w'.  Wazzup gives you an overview over how other
     branches relate to the current one.
 .
   * There is more control over pushing.  'P' now takes a prefix argument
     and pushing a branch without a default remote will ask for one.
 .
   * Logs have changed a bit: 'l' shows the traditional brief log, and
     'L' shows a more verbose log.  Use the prefix arg to specify the
     range of the log.
 .
   * M-x magit-status doesn't prompt anymore for a directory when invoked
     from within a Git repository.  Use C-u to force a prompt.
 .
   * When you have nothing staged, 'c' will now explicitly ask whether to
     commit everything instead of just going ahead and doing it.  This can
     be customized.
 .
   * The digit keys '1', '2', '3', and '4' now show sections on the
     respective level and hide everything below.  With Meta, they work on
     all sections; without, they work only on sections that are a parent
     or child of the current section.
 .
   * Typing '+' and '-' will change the size of hunks, via the -U
     option to git diff.  '0' resets hunks to their default size.
 .
   * Typing 'k' on the Untracked files section title will offer to
     delete all untracked files.
 .
   * Magit understands a bit of git-svn: the status buffer shows unpushed
     and unpulled commits, 'N r' runs git svn rebase, and 'N c' runs git
     svn dcommit.
 .
   * Magit now also works when the directory is accessed via tramp.
 .
   * M-x magit-status can also create new repositories when given a
     directory that is not a Git repository.
 .
   * Magit works better with oldish Gits that don't understand --graph,
     for example.
 .
   * The name of the Git program and common options for it can be
     customized.
Checksums-Sha1: 
 960f763ed470dbd5feec9005617ee7c1644e836c 945 magit_0.7-1.dsc
 cfc618f505b161742f50eb21b3bcfb64032e3cc0 188026 magit_0.7.orig.tar.gz
 0453b2caef85995c7e71bc32f7792749fae78386 2906 magit_0.7-1.diff.gz
 e757f41bb1b89e8ec59887e06727908a4917cf03 36352 magit_0.7-1_all.deb
Checksums-Sha256: 
 e45e7d900b6246b3ff342f838ba3a6470346716dd008cc9516c4b036cec893b0 945 magit_0.7-1.dsc
 3fcaaca73c9a60a6b5320233555efb05c1fffef819eb70dce4ccdf40e40e3e63 188026 magit_0.7.orig.tar.gz
 02c70b627fc582770c98ec0440e20a439fc20348d17292daab5a239ebce15159 2906 magit_0.7-1.diff.gz
 f069209e9e5618857d560bf4aa986da6ce3fdd6f32729d0e8df932e495935a06 36352 magit_0.7-1_all.deb
Files: 
 fa76c400e143b820d5c15e58329efd11 945 devel optional magit_0.7-1.dsc
 1ea442bd6f83f7ac82967059754c8c87 188026 devel optional magit_0.7.orig.tar.gz
 cf46933efb81dfdca95f8cc47bde2a35 2906 devel optional magit_0.7-1.diff.gz
 840f9214b4a3c4f8d5990c50609dbeb7 36352 devel optional magit_0.7-1_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkn0iowACgkQO0PrGO4KNccsoACfX75W5AnimJMTSC3R38RQf9pG
88IAnA7S4UDIownkjfaJ0Cbgk2WFY3EN
=yQGs
-----END PGP SIGNATURE-----


Accepted:
magit_0.7-1.diff.gz
  to pool/main/m/magit/magit_0.7-1.diff.gz
magit_0.7-1.dsc
  to pool/main/m/magit/magit_0.7-1.dsc
magit_0.7-1_all.deb
  to pool/main/m/magit/magit_0.7-1_all.deb
magit_0.7.orig.tar.gz
  to pool/main/m/magit/magit_0.7.orig.tar.gz





Source-Depends? Autoreconf?

2009-05-02 Thread Marius Vollmer
Hi,

here is some ghost spooking around in my head, and maybe you can help me
put it to rest.

The GNU build system makes a distinction between maintainers of a
source package and the people who eventually install it.  Essentially,
GNU is producing a source distribution that is aiming for very high
portability and ease of installation from source.

Because of that, the source packages produced by GNU projects are 'fat':
they contain the well-known biggish autotools build machinery, and they
even had a formal 'maintainer mode' concept.  That mode is falling out
of favor (because it is not doing its job in the best possible way), but
the release tarballs are still carefully constructed to need far less
external software than the maintainers need when creating that release
tarball.

Debian is much less concerned about this distinction: for example, there
is no dist or source target in debian/rules, and no Source-Depends
or Release-Depends field that would state the required packages for
running it.

Making a Debian source package always felt cumbersome: dpkg-source just
tars up everything in front of it (I exaggerate, of course), and at
least I am not used to having to clean my source tree before making a
distribution tarball out of it.

Of course, Debian doesn't really need to be as careful about making
release tarballs as the GNU project: a Debian source package only needs
to work in Debian, and there is no need to include pre-built
documentation in the source, for example, because the tools to build it
are always there.

Still, I feel that this area of what should and shouldn't be in a source
package is a bit under-addressed in Debian.  (That is very likely only my
own ignorance.  I learned the little I know about packaging in Maemo;
apologies might be in order for me speaking up on this list... :)

On one end of the spectrum, a Debian source package could try to have as
few build-time dependencies as possible.  This would be nice
for distributions downstream of Debian, which would have less trouble
importing Debian source packages.  In this case, I think we would need
something like Source-Depends, and making a source package would consist
of running the equivalent of make maintainer-clean followed by autoreconf.

(If a downstream distribution wants to make changes to the source
packages that it imports from Debian, it can create the changed source
packages in Debian itself.  Ideally, there shouldn't ever be a need to
create a source package in such a--by choice--anemic downstream
distribution.)

On the other end of the spectrum, a source package would only contain
genuine sources; it wouldn't contain anything that can be regenerated
from other files in the source package.  For the typical package made
from a GNU upstream, this would mean always running autoreconf before
configure, for example.

I have the feeling that most Debian packages are somewhere in the
middle, without having any real opinion about where to go: they contain
lots of generated files, but there is no formal, uniform means to
regenerate them.  Also, no second thought is given to adding monster
toolchains like gtk-doc-tools to Build-Depends.


I think the first end of the spectrum is more flexible, and adding the
necessary bits to Debian (such as a source target in debian/rules)
would allow Debian to become a lot stronger as a repository of source
packages.  This added strength would mostly help the people lifting code
from Debian, of course, but I think Debian itself would also benefit
from it.

For example, Debian's 'lesser' architectures might have an easier time
with reduced build-dependencies, and it might even be possible to
formally allow 'cross' architectures into Debian: those architectures
would not be as general as the big ones and all packages for them would
be cross-compiled, perhaps.  (Reducing build-dependencies of source
packages goes a long way towards cross-compilability.)

Also, being more careful about source packages can help with keeping
things in VCS.  The new formalism for creating source packages can make
it easier to create them automatically from a 'bare' VCS branch that
doesn't contain any generated files.

That is actually what got me started thinking about this in the first
place: I want to keep my sources (both upstream and packaged) in a VCS
without any generated files; I want to automatically create source
packages of a bare branch in a rich environment; and I want to
automatically compile these source packages in a poor environment (that
doesn't have all the tools I use during development).

So, what do you think?





Re: Should we still purge GConf schemas from the old directory?

2009-04-27 Thread Marius Vollmer
Josselin Mouette j...@debian.org writes:

> I’m considering asking for the removal of this snippet, since it is only
> useful for those having upgraded a pre-woody system all along. While I’m
> one of those doing that, I’m not sure there are as many people like
> that, and I guess they could live with some file left over if they have
> already been left over for so many years.

If all files in /etc/gconf/schemas are totally useless (i.e., nobody
reads them), why not put a big cleanup hack in, say, gconf itself that
just removes that directory entirely?  (In addition to removing the
snippet from dh_gconf.)

(But maybe I have misunderstood the situation, of course.)





Re: Why do we have to support tmpfs for /var/run (policy changes in 3.8.1)

2009-04-04 Thread Marius Vollmer
Paul Wise p...@debian.org writes:

> On Sat, Apr 4, 2009 at 7:37 AM, Michael Biebl bi...@debian.org wrote:
>
>> Afaik, Ubuntu is the only Linux distro which supports and uses tmpfs by
>> default.
>
> The OpenEmbedded distros do this too, I've especially seen that the
> ones associated with OpenMoko do that.

Maemo does it, too.  (Actually, it links /var/run to /tmp/.run.)





Re: grouping of alternative depends

2009-03-29 Thread Marius Vollmer
Rafael Laboissiere raf...@debian.org writes:

> Yes, the formulae are logically equivalent to each other. However, for
> some packages I would like to have (A B) | (C D), meaning ((A and B) or
> (C and D)).  This does not seem to be doable with the current Depends
> syntax.

You just need to do the expansion one more time:

    (A, B) | (C, D)
 -> A | (C, D), B | (C, D)
 -> A | C, A | D, B | C, B | D

Every expression involving AND / OR can be written in a 'two-level' form

(A or B or ...) and (X or Y or ...) and ...

Thus, the Depends syntax is completely general, but not very convenient
for complicated expressions.  My feeling is that it is not worth
changing the syntax.  Too many tools depend on it, and complicated
dependencies should be avoided anyway.
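
The expansion is mechanical, so it could be left to a small script.  The
following Python sketch is only an illustration; to_depends and
format_depends are made-up names and not part of any Debian tool.  The
input is a list of alternatives, each being a group of packages that are
all needed together.

```python
from itertools import product


def to_depends(alternatives):
    """Turn (g1 and ...) or (g2 and ...) into the 'two-level' form.

    alternatives -- list of groups; each group is a list of package names
                    that must all be installed for that alternative.
    Returns a list of clauses; names within a clause are OR'ed, and the
    clauses are AND'ed, which is what the Depends syntax can express.
    """
    clauses = []
    # Pick one package from every group: each pick is one OR-clause.
    for combo in product(*alternatives):
        clause = list(dict.fromkeys(combo))   # drop duplicate names
        if clause not in clauses:
            clauses.append(clause)
    return clauses


def format_depends(clauses):
    return ", ".join(" | ".join(clause) for clause in clauses)


print(format_depends(to_depends([["A", "B"], ["C", "D"]])))
# prints: A | C, A | D, B | C, B | D
```

The number of clauses is the product of the group sizes, which shows why
this form gets inconvenient quickly for complicated expressions.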

Depending on what you want to do, you might create new packages and use
virtual packages to simplify your dependency expressions.  I.e., to
express (A, B) | (C, D), you would create two new packages X and Y where
X depends on A and B, and Y depends on C and D.  Both could provide the
virtual package V, and the original expression is then just V.

This could give better modularity and better documentation: You make it
explicit that your package depends on 'someone' providing interface V,
and it is up to those interface providers to define what they need.
This is probably better than hard coding it all into a single
complicated dependency expression.





Bug#518818: ITP: magit -- A Emacs interface for Git

2009-03-08 Thread Marius Vollmer
Package: wnpp
Severity: wishlist
Owner: Marius Vollmer marius.voll...@gmail.com


* Package name: magit
  Version : 0.7
  Upstream Author : Marius Vollmer marius.voll...@gmail.com
* URL : http://zagadka.vm.bytemark.co.uk/magit/
* License : GPLv3, FDL
  Programming Lang: Elisp
  Description : A Emacs interface for Git

With Magit, you can inspect and modify your Git repositories with
Emacs.  You can review and commit the changes you have made to the
tracked files, for example, and you can browse the history of past
changes.  There is support for cherry picking, reverting, merging,
rebasing, and other common Git operations.

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)






Re: slib and Debian ?

1997-12-27 Thread Marius Vollmer
Gregor Hoffleit [EMAIL PROTECTED] writes:

> Now I have no idea about guile and scm and slib: Should I file a bug
> against guile that it should apply Marius' recent patch to
> ice-9/slib.scm

I have already done that, no need to worry.

> and furthermore that it should use slib when
> installed ? Or is this a problem with the slib_2c0-2 package that it
> should announce itself to guile somehow ?

Hmm, I think the symlink that Karl mentioned is the right solution.  I
don't know if it could/should be handled by the Debian installation
scripts.


--
TO UNSUBSCRIBE FROM THIS MAILING LIST: e-mail the word unsubscribe to
[EMAIL PROTECTED] . 
Trouble?  e-mail to [EMAIL PROTECTED] .