Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Jonathan Nieder
Guillem Jover wrote:

> 

Ah, sorry for the noise.

I can see why one would be annoyed with Ted's proposed solutions
(e.g., don't crash), yes.

Jonathan


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101107055447.ga32...@burratino



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Guillem Jover
On Sat, 2010-11-06 at 16:33:14 -0500, Jonathan Nieder wrote:
> Guillem Jover wrote:
> > The worst that could happen on other file systems w/o the sync()/fsync()
> > before rename()s for extracted files was that the dpkg database might
> > get slightly out of sync relative to what was installed on disk, but
> > that's at most confusing, nothing compared to getting zero-length files
> > all over the place.
> [...]
> > The zero-length problem should affect only new recently installed
> > systems with ext4 anyway.
> 
> To throw another variable in the works: doesn't ext4 fall in the same
> category as "other file systems" today?  See v2.6.30-rc1~416^2~15
> (ext4: Automatically allocate delay allocated blocks on rename,
> 2009-02-23).

Given  was filed
against 2.6.32-21-generic, no, I don't think so. Although it could have
been improved somehow with later versions, but then the bug report has
not been updated. Eric Sandeen's comment is relevant in that he states
applications cannot rely on not using fsync() for ext4.

regards,
guillem


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101107051204.ga28...@gaara.hadrons.org



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Jonathan Nieder
Guillem Jover wrote:

> The worst that could happen on other file systems w/o the sync()/fsync()
> before rename()s for extracted files was that the dpkg database might
> get slightly out of sync relative to what was installed on disk, but
> that's at most confusing, nothing compared to getting zero-length files
> all over the place.
[...]
> The zero-length problem should affect only new recently installed
> systems with ext4 anyway.

To throw another variable in the works: doesn't ext4 fall in the same
category as "other file systems" today?  See v2.6.30-rc1~416^2~15
(ext4: Automatically allocate delay allocated blocks on rename,
2009-02-23).


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101106213314.ga32...@burratino



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Guillem Jover
Hi!

On Sat, 2010-11-06 at 10:20:29 +0100, Raphael Hertzog wrote:
> On Sat, 06 Nov 2010, Guillem Jover wrote:
> > Finally, while accepting patch 2) alone might make sense, accepting 1)
> > alone would not.
> 
> BTW, I think we should go with patch 2 alone currently (i.e. just add
> --force-unsafe-io).
> 
> Continuing to use sync() instead of fsync() is the best compromise we have
> currently to have reasonable performance on Linux with all filesystems.

I disagree, as that does not seem to be the case currently anyway.

> I don't think that introducing a big performance degradation on ext4 users
> is a good idea at this point of the release cycle even if they have a
> work-around with --force-unsafe-io.

Installing/upgrading software is not something you usually do that often
on a stable system. And I'm planning to apply the switch to fsync()
for 1.16.0 so that will almost immediately affect testing/sid once the
freeze is over anyway.

> I would like however to see --force-unsafe-io so that we can have the best
> performance when we want it (in d-i, in temporary build chroots, etc.).

Well, I think it's completely unfair, that due to "design flaws" (some
might want to call that conformance to the spec) in a file system,
everyone has to suffer for it. We have to remember all this is mostly
for ext4 benefit after all. I'm always eager to do the right thing, in
this case using fsync() if it fixes real problems, but when the correct
and simple solution [0] which is also what was initially proposed and
expected (just look at man mount, auto_da_alloc, among others) by those
same who designed such system is impracticable on *that* file system,
then that's just not right.

 [0] Not to take into account the delayed rename() gymnastics we have
 had to do in dpkg, and possible future additional complexity,
 aio/fsync() in threads or subprocesses/etc?.

The worst that could happen on other file systems w/o the sync()/fsync()
before rename()s for extracted files was that the dpkg database might
get slightly out of sync relative to what was installed on disk, but
that's at most confusing, nothing compared to getting zero-length files
all over the place.

Switching to sync() was a mistake, a workaround that seemingly improved
things, but a workaround non the less with pernicious side effects.
Obviously --force-unsafe-io is also a workaround, one I'd rather not
have, but at least it's portable and has some utility outside ext4.

The zero-length problem should affect only new recently installed
systems with ext4 anyway. Those users might be ok with dpkg doing safe
I/O, but they will not be ok with say user data which can get lost as
easily, something I think is worse than system files. For that matter
using mostly any software (except few, probably database related) will
be dangerous, you just have to check around for the few projects using
fsync() at all! Not even rpm does for the extracted files.

Using nodelalloc as Sven pointed out, is IMO the only sane option for
those users if they value their data (also data=ordered). And something
that should probably be mentioned in the release notes anyway regardless
of any dpkg change. Or just recommend not using ext4 at all?

This and my other related mails might come across probably as quite
loaded, but the situation regarding ext4 does not leave much options
available, so I've been feeling between a rock and a hard place.

Anyway, I think I'm starting to repeat myself, and that I've exposed
my case, I'll leave the release team to deliberate, and in case they
decline I'll probably prepare dpkg backports to be placed somewhere.

thanks,
guillem


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101106211855.gb19...@gaara.hadrons.org



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Guillem Jover
On Sat, 2010-11-06 at 10:41:46 +0100, Sven Joachim wrote:
> On 2010-11-06 08:46 +0100, Guillem Jover wrote:
> > On Fri, 2010-11-05 at 23:18:47 +0100, Philipp Kern wrote:
> > > On Fri, Oct 22, 2010 at 11:35:54AM +0200, Guillem Jover wrote:
> > > What are the implications for ext4 and btrfs?
> >
> > They are going to be slower with this change.
> 
> Are you sure this is the case for btrfs?  Mike Hommey has described the
> sync() implementation as really horrible[0] (unfortunately the webserver on
> glandium.org is currently down, but the Google cache[1] has a copy).

Buh, yeah, sorry, I meant that to apply only to ext4.

I didn't remember Mike's blog post about btrfs, but Modestas pointed out
in [2] it's twice as worse using sync().

[2] 

> For ext4, mounting with the nodelalloc option helps a lot, although this
> option allegedly slows down ext4 in the general case.

Ah right, that was pointed out in a thread recently. This should also
fix any zero-length problems on ext4 too AFAIUI.

> 0. http://glandium.org/blog/?p=1169
> 1. 
> http://webcache.googleusercontent.com/search?q=cache:zFsVyqYwxAAJ:glandium.org/blog/%3Fp%3D1169+mike+hommey+dpkg+sync&cd=1&ct=clnk

thanks,
guillem


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101106191317.ga19...@gaara.hadrons.org



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Sven Joachim
On 2010-11-06 08:46 +0100, Guillem Jover wrote:

> On Fri, 2010-11-05 at 23:18:47 +0100, Philipp Kern wrote:
>> On Fri, Oct 22, 2010 at 11:35:54AM +0200, Guillem Jover wrote:
>
>> What are the implications for ext4 and btrfs?
>
> They are going to be slower with this change.

Are you sure this is the case for btrfs?  Mike Hommey has described the
sync() implementation as really horrible[0] (unfortunately the webserver on
glandium.org is currently down, but the Google cache[1] has a copy).

For ext4, mounting with the nodelalloc option helps a lot, although this
option allegedly slows down ext4 in the general case.

Regards,
Sven


0. http://glandium.org/blog/?p=1169
1. 
http://webcache.googleusercontent.com/search?q=cache:zFsVyqYwxAAJ:glandium.org/blog/%3Fp%3D1169+mike+hommey+dpkg+sync&cd=1&ct=clnk


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/87r5eywzz9@turtle.gmx.de



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Raphael Hertzog
Hi,

On Sat, 06 Nov 2010, Guillem Jover wrote:
> Finally, while accepting patch 2) alone might make sense, accepting 1)
> alone would not.

BTW, I think we should go with patch 2 alone currently (i.e. just add
--force-unsafe-io).

Continuing to use sync() instead of fsync() is the best compromise we have
currently to have reasonable performance on Linux with all filesystems.

I don't think that introducing a big performance degradation on ext4 users
is a good idea at this point of the release cycle even if they have a
work-around with --force-unsafe-io.

I would like however to see --force-unsafe-io so that we can have the best
performance when we want it (in d-i, in temporary build chroots, etc.).

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer ◈ [Flattr=20693]

Follow my Debian News ▶ http://RaphaelHertzog.com (English)
  ▶ http://RaphaelHertzog.fr (Français)


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101106092029.ge1...@rivendell.home.ouaza.com



Re: problems with autoreconf

2010-11-06 Thread Guillem Jover
On Fri, 2010-10-29 at 07:20:42 +0200, Guillem Jover wrote:
> On Wed, 2010-10-27 at 20:43:17 -0500, Michael Schmidt wrote:
> > then i tried running the configure script and it broke like this
> 
> > checking for bcopy... yes
> > checking for memcpy... yes
> > checking for setsid... yes
> > checking for getdtablesize... yes
> > checking for posix_fadvise... yes
> > ./configure: line 13869: syntax error near unexpected token `fi'
> > ./configure: line 13869: `fi'
> 
> Most probably that's due to some m4 missquoting.

It was due to an empty 'else fi'.

> > I tried investigating why this happens, but I can't find the reason.
> > I checked the configure script at that line, and from what I can
> > tell this problem is that the if statement never was opened, but
> > there is a closing else en fi.Maybe you guys can try and reproduce
> > this error, just so that I know that it definitely is a problem on
> > my system.

> What version of automake are you using? Can you post your configure
> script somewhere? Or send it to me privately if that's not possible
> (as it's quite huge).

He is using automake 1.10.1, and autoconf 2.61 from lenny, and he sent
me the failing configure off-list.

I tried to reproduce it with automake 1.10.3 but it works with that.
I'm guessing though it's a problem with autoconf. Current autoconf
fills the empty 'else fi' body with a colon (null command).

> Most of the time the if is there, it's just concatenated with some
> other text.

I queued the attached patch which should fix this bug (you'll need to
apply the dpkg-m4-quotes.patch first), please let me know if there's
something else. They will be in my next push, today or tomorrow.

Sorry for now having come back to you earlier.

thanks,
guillem
commit 28bd5ea3ff451181657e32211300a57398e14c23
Author: Guillem Jover 
Date:   Sat Oct 30 09:00:24 2010 +0200

build: Add missing m4 quotes to sed regex

This was making the regex non-functional, as the square brackets
were being removed when generating the output file, thus making the
--disable-compiler-optimisations and --disable-linker-optimisations
non-functional.

diff --git a/m4/dpkg-compiler.m4 b/m4/dpkg-compiler.m4
index 5a0f791..e885af4 100644
--- a/m4/dpkg-compiler.m4
+++ b/m4/dpkg-compiler.m4
@@ -36,7 +36,7 @@ AC_DEFUN([DPKG_COMPILER_OPTIMISATIONS],
 	AS_HELP_STRING([--disable-compiler-optimisations],
 		   [Disable compiler optimisations]),
 [if test "x$enable_compiler_optimisations" = "xno"; then
-	[CFLAGS=$(echo "$CFLAGS" | sed -e "s/ -O[1-9]*\b/ -O0/g")]
+	[CFLAGS=$(echo "$CFLAGS" | sed -e "s/ -O[[1-9]]*\b/ -O0/g")]
 fi])dnl
 ])
 
diff --git a/m4/dpkg-linker.m4 b/m4/dpkg-linker.m4
index 9bd98f0..bff4d22 100644
--- a/m4/dpkg-linker.m4
+++ b/m4/dpkg-linker.m4
@@ -8,7 +8,7 @@ AC_DEFUN([DPKG_LINKER_OPTIMISATIONS],
 	AS_HELP_STRING([--disable-linker-optimisations],
 		   [Disable linker optimisations]),
 [if test "x$enable_linker_optimisations" = "xno"; then
-	[LDFLAGS=$(echo "$LDFLAGS" | sed -e "s/ -Wl,-O[0-9]*\b//g")]
+	[LDFLAGS=$(echo "$LDFLAGS" | sed -e "s/ -Wl,-O[[0-9]]*\b//g")]
 else
 	[LDFLAGS="$LDFLAGS -Wl,-O1"]
 fi], [LDFLAGS="$LDFLAGS -Wl,-O1"])dnl
commit 2f6287298b00a194539d93452d8c192de0d8d30c
Author: Guillem Jover 
Date:   Sun Oct 31 03:27:29 2010 +0100

build: Unify and fix AC_ARG_ENABLE usage

The current code was executing code in the action arguments, instead
of just setting boolean flags and processing them afterwards. This
poses several problems, it implies jugling code around in case the the
default changes, it might also duplicate code, and it might leave the
ACTION-IF-NOT-GIVEN argument empty which could turn into an empty
“then fi” shell block which is a syntax error on POSIX shell. Leaving
the ACTION-IF-GIVEN argument empty is fine as it's always used by
autoconf to set $enableval to the specific enable variable, and setting
that variable from $enableval is redundant and might be wrong depending
on the order they are set, which could empty it.

Reported-by: Michael Schmidt 
Signed-off-by: Guillem Jover 

diff --git a/m4/dpkg-compiler.m4 b/m4/dpkg-compiler.m4
index e885af4..3a38601 100644
--- a/m4/dpkg-compiler.m4
+++ b/m4/dpkg-compiler.m4
@@ -8,7 +8,7 @@ AC_DEFUN([DPKG_COMPILER_WARNINGS],
 [AC_ARG_ENABLE(compiler-warnings,
 	AS_HELP_STRING([--disable-compiler-warnings],
 	   [Disable additional compiler warnings]),
-	[enable_compiler_warnings=$enableval],
+	[],
 	[enable_compiler_warnings=yes])
 
 WFLAGS="-Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers \
@@ -35,9 +35,12 @@ AC_DEFUN([DPKG_COMPILER_OPTIMISATIONS],
 [AC_ARG_ENABLE(compiler-optimisations,
 	AS_HELP_STRING([--disable-compiler-optimisations],
 		   [Disable compiler optimisations]),
-[if test "x$enable_compiler_optimisations" = "xno"; then
-	[CFLAGS=$(echo "$CFLAGS" | sed -e "s/ -O[[1-9]]*\b/ -O0/g")]
-fi])dnl
+	[],
+	[enable_compiler_optimisations=yes])
+
+  AS_IF([test "x$

Re: Transferring conffiles between packages (Re: Bug#564254: conflicting /etc/bash_completion)

2010-11-06 Thread Guillem Jover
Hi!

On Tue, 2010-11-02 at 15:14:36 -0500, Jonathan Nieder wrote:
> On 2010-01-08, Kurt Roeckx wrote:
> > I got this on the buildd:
> > Unpacking bash-completion (from .../bash-completion_1%3a1.1-3_all.deb) ...
> > dpkg: error processing 
> > /home/buildd/build/chroot-unstable/var/cache/apt/archives/bash-completion_1%3a1.1-3_all.deb
> > (--unpack): trying to overwrite `/etc/bash_completion', which is also in 
> > package bash
> > 
> > On the system:
> > excelsior:~# ls -l /etc/bash_completion
> > -rw-r--r-- 1 root root 215907 Jul  5  2006 /etc/bash_completion
> > excelsior:~# dpkg --search /etc/bash_completion
> > bash: /etc/bash_completion
> > 
> > This is with bash 4.0-7.
> 
> The message is in tarobject().  I think dpkg 1.13.14~19 (Improve
> processing of disappearing conffiles, 2006-02-10) was supposed to deal
> with this case:
> 
>If the file to be unpacked is (1) a conffile in the new package and
>(2) a regular file rather than a symlink or directory, and some
>installed conffile with the same inode is obsolete, then let the
>installation continue.

Right. I fixed few bugs from that patch, but not related to this:

  4021e3db0f30bf4a19abb2a54fe5758654baa4e3
  368b3934bbf1d106e8448b8587657292c24da777

> Checking on snapshot.debian.org, I see that /etc/bash_completion was
> indeed a conffile in bash-completion 1:1.1-3.

> Kurt Roeckx wrote:
> 
> > At some point in time the chroot had the version from oldstable
> > or older, just like all my chroots and main systems.  And I have
> > upgraded from that version.  I never installed bash-completion.
> > But now some pacakge build-depends on that for some strange reason,
> > and I get that error.

> Any idea what could have gone wrong?

My guess would be that bash got upgraded to the package w/o the
obsolete conffile before the fixed dpkg on those systems.

A proof of that I guess, might be checking if the file has the obsolete
flag, if it does not then it was installed by a buggy dpkg.

  $ dpkg-query -W -f '${Conffiles}\n' bash | grep bash_completion

And an unversioned Replaces in bash-completion would be the correct way
to handle that. Fixing that in bash would imply removing the file on
upgrade, removing it on remove or purge would not really happen (bash
is Essential). And as such it would need to check if bash-completion is
installed, to not remove a file it does not own, or check if it has the
conffile listed in the Conffile field w/o the obsolete flag? etc, which
seems overcomplex and just wrong, compared with just a Replaces field.

regards,
guillem


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101106083941.ga12...@gaara.hadrons.org



Re: Pre-approval request for dpkg sync() changes for squeeze

2010-11-06 Thread Guillem Jover
On Fri, 2010-11-05 at 23:18:47 +0100, Philipp Kern wrote:
> On Fri, Oct 22, 2010 at 11:35:54AM +0200, Guillem Jover wrote:
> > Those are related to the fsync()/sync() changes in dpkg from some time
> > ago, the patches would:
> > 
> >   1) Switch back from sync() to fsync() before rename() (while keeping
> >  the sync() code around for the benefit of other distributions
> >  that might not want to switch just yet). So to avoid unrelated
> >  I/O when there's background work being done for example. This
> >  hack also only works on Linux where sync() is synchronous, so
> >  it would unify that code path for all dpkg supported platforms.
> > 
> >  Bug: #588339
> > 
> >   
> 
> I do have to say that I'm not overly happy about this change this late in the
> freeze.

My idea was to get those on the planned 1.15.9, but after the sudden
freeze and the initial reactions to the first unblock request I didn't
feel like asking for any more changes. Anyway, the blame would be
obviously on me in case these do not get accepted due to the delay.

> I suppose this resembles what we currently have in Lenny?

No, lenny's dpkg does not do sync() nor fsync() before renames on
extracted files from archives.

dpkg 1.15.6 started doing fsync() for extracted files.
dpkg 1.15.7.2 switched from fsync() to sync() on Linux.

> What are the implications for ext4 and btrfs?

They are going to be slower with this change.

> Why was it changed in the first place?

I wrote about this in . Extremely short
summary: because fsync() was considered too slow on ext4 (#578635).

> >   2) Add a new --force-unsafe-io to disable those fsync() for people
> >  using certain file systems where the only option is to choose
> >  between reliability or acceptable performance. Or in cases where
> >  the data is cached and it might not be important to lose it.
> > 
> >  Bug: #584254
> > 
> >   
> 
> So even when you don't do sync() anymore you still want to disable those
> fsyncs()?

Given the performance degradation the first patch will introduce on at
least ext4, users might want to disable fsync()s. Arguably they seem
to want to do that now even with sync(), as people are already using
hacks like eatmydata to disable them, the problem is that this will
disable *all* syncs, including the ones for the database among others.
If the file system gets corrupted you might still have a chance of
recovering, if the databases do, you most probably need either backups
or a reinstall.

> Granted, it's a self-contained new feature.  What's the use case
> for squeeze?  d-i I heared, but I'm not sure that they'd use it in this
> cycle.  And tmpfs users, or who?

Other users are buildds using fresh unpacked snapshots for each build
session. ext4 users who would prefer faster dpkg runs rather than data
safety.


The problem with sync() is that even if it gives better performance
than fsync()s on ext4, it's just the wrong thing to use, sync()
affects all mount points, so if you happen to have for example a server
with lots of writes on say a /srv mount point, and the admin decides to
do a large install/upgrade, it will force all that data down to all
mount points, for each archive unpacked.

The target for this is ext4 users mainly, the syncs were included mostly
to protect them from data loss. They are also the ones suffering the
I/O performance degradation. I don't think it's fair to make users not
suffering such problems suffer unrelated issues like the ones imposed by
sync(). As such I think it should be them who have to decide between
performance or data loss.


Also I found it interesting that the ext4 developers were complaining
about userspace developers not doing the right thing regarding atomic
renames (which would include fsync()) during the huge zero-length
discussions, to then state that fsync() is obviously too slow and heavy
handed and thus should better not be used.

  

Ultimately I just think this is a problem with ext4. Regardless of that,
I'll try to come up with a better, portable alternative to improve the
performance for 1.16.x if possible.


Finally, while accepting patch 2) alone might make sense, accepting 1)
alone would not.

thanks,
guillem


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101106074601.ga32...@gaara.hadrons.org