Re: [paludis-user] Rebuilding everything with cave --resume-file is awfully slow

2011-01-10 Thread Rodolphe Rocca

On 01/01/2011 01:50 AM, Ciaran McCreesh wrote:
> On Fri, 31 Dec 2010 12:51:57 +0100
> Rodolphe Rocca wrote:
>> Of course every user would appreciate things going a little bit
>> faster if possible here :-)
>
> The next major release includes a workaround for some libstdc++
> stupidity that tends to make it try to write() one byte at a time.
> That'll make a fair difference there...


paludis-0.58 is much faster indeed :-)

Thank you for this work !
___
paludis-user mailing list
paludis-user@lists.pioto.org
http://lists.pioto.org/mailman/listinfo/paludis-user


Re: [paludis-user] Rebuilding everything with cave --resume-file is awfully slow

2010-12-31 Thread Rodolphe Rocca

On 12/31/2010 08:59 AM, Ciaran McCreesh wrote:
> On Wed, 29 Dec 2010 12:41:20 +0100
> Rodolphe Rocca wrote:
>> Each time an archive's integrity is checked, the resume file is written.
>
> No, it's written after a successful fetch.
>
>> Is everything written to the resume file really meaningful
>> resume information ?
>
> Yes.
>
>> The paludis client is many times faster here.
>
> The Paludis client doesn't store continue-on-failure information in
> there, which means it gets things wrong later on.

Thank you for your answer, Ciaran. Correctness is better than
quick-and-dirtiness, I agree.

Of course every user would appreciate things going a little bit faster
if possible here :-)



>> Moreover, if I press Ctrl-C, the resume file gets corrupted.
>> Isn't there a signal handler catching this signal and waiting for the
>> resume file to be completely written before exiting gracefully ?
>
> No.

Is it a design choice or a question of priority ?

>> NB: The reason why I want to press Ctrl-C here is that the
>> installation is stuck at etqw-data waiting for me to insert a
>> data DVD, but it fails to find it. So I want to modify the resume
>> file by hand to remove etqw-data. Do you see any better way to handle
>> this kind of situation ?
>
> You can't edit resume files.


Any advice on how to handle this situation ?
Because at this time I don't know how to rebuild my system with cave 
without uninstalling a few packages like etqw-data...


--
Rodolphe Rocca


[paludis-user] Rebuilding everything with cave --resume-file is awfully slow

2010-12-29 Thread Rodolphe Rocca

Hi,

rebuilding my whole system with the following command takes ages :

$ cave resolve --resume-file /var/tmp/cave-resume-XX \
    installed-packages -e -x --continue-on-failure always


I have 1250 installed packages.
Each time an archive's integrity is checked, the resume file is written.


[...]

Checking 'acct-6.5.4.tar.gz'... ok

Done fetch for sys-process/acct-6.5.4-r2:0::gentoo


Writing resume information to /var/tmp/cave-resume-1...

[...]

My resume file is about 50 MB ; writing the resume file takes 30 seconds 
each time.

(1250 * 30) / 3600 = about 10.4 hours just to check file integrity !
Is it really necessary to rewrite the resume file after each integrity check ?
Is everything written to the resume file really meaningful resume 
information ?
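
As a quick check of the estimate above, the arithmetic can be reproduced in a few lines. The figures are the ones quoted in this message; the function name is only illustrative.

```python
def total_rewrite_hours(packages: int, seconds_per_write: float) -> float:
    """Hours spent if the resume file is rewritten once per package."""
    return packages * seconds_per_write / 3600


# 1250 packages, 30 seconds per rewrite of the resume file
print(f"{total_rewrite_hours(1250, 30):.1f} hours")  # prints "10.4 hours"
```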


The paludis client is many times faster here.

Moreover, if I press Ctrl-C, the resume file gets corrupted.
Isn't there a signal handler catching this signal and waiting for the
resume file to be completely written before exiting gracefully ?
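
One common way to keep an interrupted write from leaving a half-written file, independent of any signal handling, is to write to a temporary file and rename it over the old one. The sketch below only illustrates that idea; it is not how cave writes its resume file, and `write_atomically` is a hypothetical helper. It protects against the process being interrupted, not against a power loss.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Replace `path` with `data`; readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".resume-tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Interrupted (for example KeyboardInterrupt from Ctrl-C) or failed:
        # drop the partial temporary file, the old file is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern a Ctrl-C in the middle of the write loses only the newest state, while the previous file remains readable.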


NB: The reason why I want to press Ctrl-C here is that the installation
is stuck at etqw-data waiting for me to insert a data DVD, but it
fails to find it. So I want to modify the resume file by hand to remove
etqw-data. Do you see any better way to handle this kind of situation ?


--
Rodolphe Rocca



Re: [paludis-user] cave : libX11-1.4.0 issue

2010-12-17 Thread Rodolphe Rocca

On 12/12/2010 03:10 PM, Ciaran McCreesh wrote:
> On Sun, 12 Dec 2010 11:43:33 +0100
> Rodolphe Rocca wrote:
>> Looking at the vlc ebuild I can see :
>>
>>   opengl? ( virtual/opengl ||
>> (=x11-libs/libX11-1.3.99.901 ) )
>>
>> Which looks conceptually good since libX11-1.4 lost its xcb use flag,
>> but... I'm not an ebuild syntax expert.
>>
>> There is a similar line in the pulseaudio ebuild.
>>
>> Is it an ebuild bug or a cave one ?
>
> It's an ebuild bug. || ( a b ) means "prefer a, but b is fine too".
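
Read literally, "prefer a, but b is fine too" describes an ordered choice: take the first alternative that can be satisfied. The sketch below is a deliberately simplified model of that reading, not the resolver's actual algorithm; `pick_any_of` and the acceptability callback are hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_any_of(alternatives: Sequence[str],
                is_acceptable: Callable[[str], bool]) -> Optional[str]:
    """Return the first acceptable alternative in written order, else None."""
    for spec in alternatives:
        if is_acceptable(spec):
            return spec
    return None


# "a" cannot be satisfied in this example, so "b" is chosen.
print(pick_any_of(["a", "b"], lambda spec: spec == "b"))  # prints "b"
```

Under this model, order matters: whatever is written first wins whenever it can be satisfied, even if a later alternative would suit the rest of the system better.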


Mmhh that's weird.

Pulseaudio has been fixed in the tree. I've fixed vlc in a local repository.

I've managed to reinstall both individually without worry :

$ cave resolve pulseaudio vlc -x1

This, though, did not trigger the libX11-1.4.0 bump.

Checking the deps :

$ paludis -q pulseaudio --show-deps | grep -A1 -B1 X11
|| (
>=x11-libs/libX11-1.4.0
=x11-libs/libX11-1.4.0
=x11-libs/libX11-1.3.99.901
=x11-libs/libX11-1.3.99.901
Did not meet =x11-libs/libX11-1.4.0, never using existing, 
installing to / from target
Did not meet >=x11-libs/libX11-1.3.99.901, use existing if 
possible, installing to / from media-libs/mesa

  * x11-libs/libX11-1.3.6:0::gentoo
Did not meet =x11-libs/libX11-1.4.0, never using existing, 
installing to / from target
Did not meet >=x11-libs/libX11-1.3.99.901, use existing if 
possible, installing to / from media-libs/mesa

  * x11-libs/libX11-1.4.0:0::gentoo
Did not meet possible, installing to / from media-video/vlc

  * x11-libs/libX11-:0::x11
Masked by repository
Repository masked 
/var/paludis/repositories/x11/profiles/package.mask

Don't let people install these accidentally
Masked by user
Did not meet =x11-libs/libX11-1.4.0, never using existing, 
installing to / from target
Did not meet possible, installing to / from media-video/vlc




Re: [paludis-user] cave : libX11-1.4.0 issue

2010-12-12 Thread Rodolphe Rocca

On 12/12/2010 03:10 PM, Ciaran McCreesh wrote:
> On Sun, 12 Dec 2010 11:43:33 +0100
> Rodolphe Rocca wrote:
>> Looking at the vlc ebuild I can see :
>>
>>   opengl? ( virtual/opengl ||
>> (=x11-libs/libX11-1.3.99.901 ) )
>>
>> Which looks conceptually good since libX11-1.4 lost its xcb use flag,
>> but... I'm not an ebuild syntax expert.
>>
>> There is a similar line in the pulseaudio ebuild.
>>
>> Is it an ebuild bug or a cave one ?
>
> It's an ebuild bug. || ( a b ) means "prefer a, but b is fine too".


What would be the correct syntax then ?

I tried :

opengl? ( virtual/opengl ||
(=x11-libs/libX11-1.3.99.901 ) )


But it looks like a real syntax error, not a semantic mistake.

Thank you for your help so that I can complete :

http://bugs.gentoo.org/show_bug.cgi?id=348518

Oh, and feel free to add your word to this issue since you have much more ammo 
than I do on the subject.



Re: [paludis-user] cave : libX11-1.4.0 issue

2010-12-12 Thread Rodolphe Rocca

On 12/12/2010 03:10 PM, Ciaran McCreesh wrote:
> On Sun, 12 Dec 2010 11:43:33 +0100
> Rodolphe Rocca wrote:
>> Looking at the vlc ebuild I can see :
>>
>>   opengl? ( virtual/opengl ||
>> (=x11-libs/libX11-1.3.99.901 ) )
>>
>> Which looks conceptually good since libX11-1.4 lost its xcb use flag,
>> but... I'm not an ebuild syntax expert.
>>
>> There is a similar line in the pulseaudio ebuild.
>>
>> Is it an ebuild bug or a cave one ?
>
> It's an ebuild bug. || ( a b ) means "prefer a, but b is fine too".


Before filing an issue in Gentoo's Bugzilla, I'd like to understand :

If b is fine, why does cave reject it ?

  * x11-libs/libX11-1.4.0:0::gentoo
Did not meet possible, installing to / from media-sound/pulseaudio
Did not meet possible, installing to / from media-video/vlc






[paludis-user] cave : libX11-1.4.0 issue

2010-12-12 Thread Rodolphe Rocca

Hi,

trying to update my gentoo to libX11-1.4 / mesa-7.9, I'm facing this 
issue (paludis-0.56) :


!   x11-libs/libX11
Reasons: target (installed-packages::installed), 
app-emulation/emul-linux-x86-xlibs, dev-dotnet/libgdiplus, 126 more

Unsuitable candidates:
  * x11-libs/libX11-1.3.4:0::gentoo
Did not meet >=x11-libs/libX11-1.3.99.901, use existing if 
possible, installing to / from media-libs/mesa

  * x11-libs/libX11-1.3.6:0::gentoo
Did not meet >=x11-libs/libX11-1.3.99.901, use existing if 
possible, installing to / from media-libs/mesa

  * x11-libs/libX11-1.4.0:0::gentoo
Masked by user
Did not meet possible, installing to / from media-sound/pulseaudio
Did not meet possible, installing to / from media-video/vlc

  * x11-libs/libX11-:0::x11
Masked by repository
Repository masked 
/var/paludis/repositories/x11/profiles/package.mask

Don't let people install these accidentally
Masked by user
Did not meet possible, installing to / from media-sound/pulseaudio
Did not meet possible, installing to / from media-video/vlc



Looking at the vlc ebuild I can see :

opengl? ( virtual/opengl || ( >=x11-libs/libX11-1.3.99.901 ) )


Which looks conceptually good since libX11-1.4 lost its xcb use flag, 
but... I'm not an ebuild syntax expert.


There is a similar line in the pulseaudio ebuild.

Is it an ebuild bug or a cave one ?

--
Rodolphe







Re: [paludis-user] Paludis, ext4, delayed allocation, zero file length

2009-10-05 Thread Rodolphe Rocca

Richard Freeman wrote:
> Rodolphe Rocca wrote:
>> Ciaran McCreesh wrote:
>>> On Sun, 04 Oct 2009 15:42:27 +0200
>>> Rodolphe Rocca wrote:
>>>> Maybe a forgotten fsync after writing the SLOT file ?
>>>
>>> No, it's unrelated. We don't fsync anything, at all.
>>
>> I'm not a filesystem expert but isn't that the problem ?
>
> First - apologies for replying to an otherwise-closed thread.  Glad to
> see that Ciaran isn't planning on making any changes.
>
> Since this didn't come up I figured I'd add it - many people consider
> the argument advanced in that blog you referenced bogus.  That
> includes Linus Torvalds.  I believe the most recent kernels include an
> ext4 patch that forces ordered writes by default as was the case with
> ext3 as a result.
>
> Applications generally shouldn't use fsync at all, except in very
> specific circumstances.  If the application implements its own
> transactions (particularly across a network where one server might go
> down and another might not), then it is appropriate to use fsync to
> ensure transactions are synchronized.  That comes with a significant
> penalty to filesystem performance.
>
> As an example of why fsync shouldn't be used, consider mythtv.  By
> default it fsyncs video recordings every couple of seconds.  On a
> loaded system with RAID that means that the disks are constantly
> seeking.  On my backend that resulted in video loss due to buffer
> overruns.  When I edited out the fsyncs from the source, the problem
> went away, as the underlying device drivers could pool data in a sane
> way and not rewrite the RAID5 stripes every time 1/10th of a stripe
> changed.
>
> Filesystems should safely store files and avoid zeroing out modified
> files any time they are written to normally.  It shouldn't be up to
> application developers to figure out the implementation-level details
> of the filesystems they are running on.
>
> Sure, it is best to not have a system crash at all.  However, if it is
> going to fail there are better ways of doing it than what the ext4
> team came up with.  That is why Linus overrode them in the kernel.

Thanks for your answer Richard. I'm inclined to have the same opinion
about a filesystem zeroing files upon a system crash. But this is an
extreme and complex situation and there is still room for debate.


Leaving my system crash aside, it occurred to me today that the fact
that the VDB repository is not updated (almost) atomically could be an
issue not only in the context of a system crash.


Correct me if I'm wrong, but if _paludis_ crashes in the middle of a VDB 
update, /var/db/pkg will be left in an inconsistent state.


Fortunately, having been a paludis user for a long time, I happily admit
that a paludis crash is very rare. Actually the few paludis crashes that I
had in the past all came from a broken environment :-) But in the end,
crashes do happen.


Isn't it an issue ?



Re: [paludis-user] Paludis, ext4, delayed allocation, zero file length

2009-10-04 Thread Rodolphe Rocca
Ciaran McCreesh wrote:
> On Sun, 04 Oct 2009 21:53:11 +0200
> Rodolphe Rocca  wrote:
>>> or if you don't cleanly power off your computer after
>>> unmounting a filesystem, things will break.
>> but not this one.
>>
>> AFAIK when a fs is unmounted, a sync happens on the virtual device and
>> the unmount operation is blocking until all data has been flushed to
>> the disk (write cache). After that, it's up to the hard drive firmware
>> to flush the write cache if it is enabled. So normally when unmount
>> returns, data is at least in the write cache of the drive. A power off
>> during the internal write cache flushing process will trigger data
>> loss. Not sure if the reboot will too.
> 
> Unmounting a filesystem doesn't force the disk to really really write
> its contents properly. In the general case, there's absolutely nothing
> a computer can do to force data to be really really written, whatever
> that even means.

That's actually what I said...

>> Concerning the merged files I agree that not much more can be done to
>> make paludis more resilient to a system crash.
>>
>> But what about /var/cache/db ?
> 
> Assuming you mean /var/db/pkg...

Right sorry.

> 
>> What I'm thinking about is letting paludis work as much as possible
>> in a temporary vdb directory and rename this directory in the safest
>> possible way once everything is done.
> 
> No point.
> 
>> Next time paludis runs, it could be able to detect inconsistencies and
>> automatically fix them. Something like :
> 
> A full VDB scan is slow, and might require permissions Paludis
> doesn't currently have. That's really not something we want to do.

Ok. The scan could be handled by an explicit option (--check-vdb).
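
Such an explicit check could be as simple as walking the database directory and reporting empty files. A minimal sketch follows; `--check-vdb` is only the option name proposed in this message, not an existing Paludis option, and `find_empty_files` is a hypothetical helper. Some files may be legitimately empty, so the result is a list of candidates to inspect rather than a list of confirmed corruptions.

```python
import os
from typing import List


def find_empty_files(vdb_root: str) -> List[str]:
    """Return the sorted paths of zero-length regular files below vdb_root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(vdb_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

Called as `find_empty_files("/var/db/pkg")`, it would need read permission on the whole tree, which is one of the costs mentioned in the quoted reply.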

>> Would it be insane ?
> 
> Yes, it would.
> 
> First, there is absolutely nothing whatsoever that userland software
> can do to deal with the user randomly powering off their computer.
> 
> Second, this is a huge amount of effort to avoid something that is
> entirely caused by a particularly unpleasant case of user error.

Not a user error, a system or hardware error.

Anyway I surrender, thanks a lot for your answers, I understand your
point of view.



Re: [paludis-user] Paludis, ext4, delayed allocation, zero file length

2009-10-04 Thread Rodolphe Rocca
Ciaran McCreesh wrote:
> On Sun, 04 Oct 2009 18:25:14 +0200
> Rodolphe Rocca  wrote:
>> I understand that in case of a machine crashing, a package gets
>> partially installed, as long as the package files content is either
>> the old one or the new one. This can potentially cause trouble, but
>> has a higher resilience than ending with some empty files in VDB
>> causing paludis to be unusable until a manual intervention
>> in /var/cache/db.
> 
> No, it can also result in partially written new files being installed.
> 
>> I tried to turn the auto_da_alloc ext4 mount option on as it is
>> supposed to fix the "zero file length" issue for some file replacing
>> patterns. A few minutes later I got a new crash and empty files
>> again. So I guess the corruption is unrelated or paludis uses a
>> rename pattern which is not detected by ext4.
> 
> The problem is not a rename pattern, and it has nothing to do with
> allocation. It's quite simple: if you don't cleanly unmount a
> filesystem, 

OK I understand that...

> or if you don't cleanly power off your computer after
> unmounting a filesystem, things will break.

but not this one.

AFAIK when a fs is unmounted, a sync happens on the virtual device and
the unmount operation is blocking until all data has been flushed to the
disk (write cache). After that, it's up to the hard drive firmware to
flush the write cache if it is enabled. So normally when unmount
returns, data is at least in the write cache of the drive. A power off
during the internal write cache flushing process will trigger data loss.
Not sure if the reboot will too.

> Where Paludis does renames for merging, it does so to prevent a
> partially written executable from existing. If anyone tried to launch
> an executable when it was partially written, weird things would happen;
> using a rename removes that case, although there is a small amount of
> time between when the old executable is removed and the new one is
> renamed into place. Handling unclean unmounts is not a consideration.

Concerning the merged files I agree that not much more can be done to
make paludis more resilient to a system crash.

But what about /var/cache/db ?

What I'm thinking about is letting paludis work as much as possible in a
temporary vdb directory and rename this directory in the safest possible
way once everything is done.

For some "paludis -i pkg" command :

1. d1 = "/var/cache/db/cat/pkg"
2. d2 = "/var/cache/db/cat/.pkgtmp"
3. cp -R $d1/. $d2
4. compile, merge files etc.
5. update DB entries in $d2
6. fsync files in $d2
7. remove $d1
8. rename $d2 $d1

I can even imagine a mechanism that would make a crash happening
between steps 7 and 8 recoverable.

Something like :

7. move $d1 $d11(=/var/cache/db/cat/.pkg.original)
8. rename $d2 $d1
9. remove $d11

Next time paludis runs, it could be able to detect inconsistencies and
automatically fix them. Something like :

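if [ -e "$d11" ]; then
    if [ ! -e "$d1" ]; then
        # crash between 7 and 8: put the original directory back
        mv "$d11" "$d1"
    else
        # crash between 8 and 9: the new directory is in place, drop the backup
        rm -rf "$d11"
    fi
fi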

I'm getting a bit tired so I'm probably missing some cases, but you get
the picture.
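
Steps 7 to 9 and the recovery check above can be sketched in Python as follows. This is only an illustration of the proposal, not something Paludis does; `swap_in`, `recover` and the directory arguments are hypothetical names, and the sketch assumes all three paths are on the same filesystem so that each rename is a single operation.

```python
import os
import shutil


def swap_in(new_dir: str, live_dir: str, backup_dir: str) -> None:
    """Steps 7-9: keep the old directory as a backup until the new one is live."""
    os.rename(live_dir, backup_dir)  # step 7
    os.rename(new_dir, live_dir)     # step 8
    shutil.rmtree(backup_dir)        # step 9


def recover(live_dir: str, backup_dir: str) -> str:
    """Repair the state left by an interrupted swap_in and report what was done."""
    if not os.path.exists(backup_dir):
        return "clean"
    if not os.path.exists(live_dir):
        os.rename(backup_dir, live_dir)  # interrupted between steps 7 and 8
        return "restored"
    shutil.rmtree(backup_dir)            # interrupted between steps 8 and 9
    return "cleaned"
```

`recover` would have to run before anything reads the directory, and it only handles an interrupted process; whether the renames survive a power loss depends on the filesystem.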

Would it be insane ?

>> Now I'm at the point where I disabled ext4 delayed allocation
>> (nodelalloc mount option). Let's see what happens.
> 
> Things will still break if you randomly power off your computer. The
> only difference is that the breakage may display itself slightly
> differently. You can still end up with partially written or empty
> files; you may just not notice them as frequently.

Agreed.


Re: [paludis-user] Paludis, ext4, delayed allocation, zero file length

2009-10-04 Thread Rodolphe Rocca
Ciaran McCreesh wrote:
> On Sun, 04 Oct 2009 17:25:48 +0200
> Rodolphe Rocca  wrote:
>>> No, it's unrelated. We don't fsync anything, at all.
>> I'm not a filesystem expert but isn't that the problem ?
> 
> No. There's absolutely no need to call fsync in normal cases. The only
> time you call fsync is if you need to get some kind of synchronisation
> between two different processes.
> 
> Your problem is that you killed your system in the middle of doing
> something. That will cause breakage regardless of what an application
> does.

True. But there are different kinds of breakage.

I understand that in case of a machine crashing, a package gets
partially installed, as long as the package files content is either the
old one or the new one. This can potentially cause trouble, but has a
higher resilience than ending with some empty files in VDB causing
paludis to be unusable until a manual intervention in /var/cache/db.

Notice I'm not saying it's necessarily paludis' fault. Just opening a
discussion to hear about your thoughts and share my experience with
other potential users ;-)

For the record, a few days ago my /var was under reiserfs. I had several
machine crashes of the same kind while running paludis (and other
programs), but never ended with such empty files, and it never really
broke paludis in such an ugly way that a manual intervention was
required other than reinstalling the package or removing a temporary
directory.

I tried to turn the auto_da_alloc ext4 mount option on as it is supposed
to fix the "zero file length" issue for some file replacing patterns. A
few minutes later I got a new crash and empty files again. So I guess
the corruption is unrelated or paludis uses a rename pattern which is
not detected by ext4.

Now I'm at the point where I disabled ext4 delayed allocation
(nodelalloc mount option). Let's see what happens.


Re: [paludis-user] Paludis, ext4, delayed allocation, zero file length

2009-10-04 Thread Rodolphe Rocca
Ciaran McCreesh wrote:
> On Sun, 04 Oct 2009 15:42:27 +0200
> Rodolphe Rocca  wrote:
>> After a reboot and a regular fsck, I found out that the
>> /var/db/pkg///SLOT was empty (size 0) for a few
>> packages that paludis was installing during the upgrade phase.
>>
>> This issue looks very much like the famous "ext4, delayed allocation
>> and zero file length" issue.
>> I've read many things about the subject, most of them written by the
>> author of ext4, Theodore Ts'o.
>>
>> What I've read from Theodore and the fact that the corruption always
>> happens on the same file (SLOT) makes me think that there may be an
>> issue in paludis.
>>
>> Maybe a forgotten fsync after writing the SLOT file ?
> 
> No, it's unrelated. We don't fsync anything, at all.

I'm not a filesystem expert but isn't that the problem ?

Am I misinterpreting this blog discussion (especially comment #7) ?

http://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/


[paludis-user] Paludis, ext4, delayed allocation, zero file length

2009-10-04 Thread Rodolphe Rocca
Hi guys,

I have a gentoo system with kernel gentoo-sources-2.6.31-r1.

My /var partition is formatted as ext4 and mounted with default options
(so delayed allocation is enabled).

While upgrading some packages my system crashed. The crash in itself is
not related to paludis or ext4.

After a reboot and a regular fsck, I found out that the
/var/db/pkg///SLOT was empty (size 0) for a few
packages that paludis was installing during the upgrade phase.

This issue looks very much like the famous "ext4, delayed allocation and
zero file length" issue.

I've read many things about the subject, most of them written by the
author of ext4, Theodore Ts'o.

What I've read from Theodore and the fact that the corruption always
happens on the same file (SLOT) makes me think that there may be an issue
in paludis.

Maybe a forgotten fsync after writing the SLOT file ?

Thanks !

Rodolphe Rocca



[paludis-user] Forcing reinstall of all packages in a given repository

2008-07-22 Thread Rodolphe Rocca
Hi,

first and foremost, thank you for this great piece of software called
paludis.
I've been using it almost since it was first published and it just works :-)

Now I have a question that may quickly turn into an improvement wish.

I've just installed the enlightenment overlay which provides many cvs
ebuilds.
How could I force paludis to reinstall all the already installed
packages that belong to the enlightenment overlay ?

I would have thought of something like :

paludis -i world --dl-reinstall always --repository enlightment

but it seems the --repository option is only used when issuing a
--list-* command.

Rodolphe
