Re: general protection fault in css_release_work_fn()

2021-04-19 Thread Christian Hesse
Hillf Danton  on Mo, 2021/04/12 16:05:
> Looks like double free or use after free based on 0xdead.
> If possible, would you try the mainline with KASAN enabled, given the fear
> that few guys can find time for 5.10 this week?

I have been running 5.11.13 with KASAN enabled for about a week now. Either
this has been fixed recently, or I am hitting a race that does not occur with
KASAN enabled.
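
A note on the 0xdead hint quoted above: the kernel's list_del() overwrites the
pointers of a removed entry with poison constants (LIST_POISON1 and
LIST_POISON2 in include/linux/poison.h), so a second removal of the same entry
dereferences an obviously invalid address and faults. The following is a
minimal userspace sketch of that mechanism, not kernel code: every demo_* name
is hypothetical, the poison values are stand-ins chosen for illustration, and
the removal function checks for the poison and returns false where the kernel
would simply fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_list {
    struct demo_list *next, *prev;
};

/* Illustrative stand-ins for the kernel's LIST_POISON1/LIST_POISON2.
 * The exact values are an assumption of this sketch. */
#define DEMO_POISON1 ((struct demo_list *)(uintptr_t)0x100)
#define DEMO_POISON2 ((struct demo_list *)(uintptr_t)0x122)

/* An empty circular list points at itself in both directions. */
static void demo_list_init(struct demo_list *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry directly after head. */
static void demo_list_add(struct demo_list *entry, struct demo_list *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* True if the entry carries a poison value, i.e. was already removed. */
static bool demo_is_poisoned(const struct demo_list *entry)
{
    return entry->next == DEMO_POISON1 || entry->prev == DEMO_POISON2;
}

/* Unlink entry and poison its pointers. Returns false on a repeated
 * removal; the kernel has no such check in the fast path and would
 * dereference the poison value instead. */
static bool demo_list_del(struct demo_list *entry)
{
    if (demo_is_poisoned(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = DEMO_POISON1;
    entry->prev = DEMO_POISON2;
    return true;
}
```

Removing an entry once succeeds and leaves it poisoned; calling
demo_list_del() on it again returns false. Whether the crash in this thread
really is a double removal is not established here; KASAN, as suggested
above, is the tool that could confirm it.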
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpcLkslVjPtB.pgp
Description: OpenPGP digital signature


Re: general protection fault in css_release_work_fn()

2021-04-12 Thread Christian Hesse
Christian Hesse  on Mo, 2021/03/15 14:10:
> Hello everybody,
> 
> Christian Hesse  on Tue, 2021/03/02 09:34:
> > I see this on a git server with lots of ssh logins. It happens every few
> > hours to days. No idea how to reproduce, guess it's a race condition?
> > 
> > general protection fault, probably for non-canonical address
> > 0xdead0122:  [#1] SMP NOPTI CPU: 3 PID: 2213757 Comm:
> > kworker/3:2 Not tainted 5.10.18-1-lts #1  
> 
> I've seen more crashes with 5.10.23-1-lts and 5.11.6-arch1-1. Looks like
> 5.11.2-arch1-1 is stable for now, but I did not test everything in between.

I have had several more crashes, the latest with 5.10.29-1-lts. Does anybody
have an idea which commit could have caused this?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgplh_M3Bdih6.pgp
Description: OpenPGP digital signature


crash in process_one_work()

2021-03-27 Thread Christian Hesse
Hello everybody,

I just had a crash in process_one_work() with linux 5.10.26... Sadly the log
did not contain anything, so the attached screenshot is all I have.
This is a database server running mariadb.
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpPGnikQ8WuP.pgp
Description: OpenPGP digital signature


Re: general protection fault in css_release_work_fn()

2021-03-15 Thread Christian Hesse
Hello everybody,

Christian Hesse  on Tue, 2021/03/02 09:34:
> I see this on a git server with lots of ssh logins. It happens every few
> hours to days. No idea how to reproduce, guess it's a race condition?
> 
> general protection fault, probably for non-canonical address
> 0xdead0122:  [#1] SMP NOPTI CPU: 3 PID: 2213757 Comm:
> kworker/3:2 Not tainted 5.10.18-1-lts #1

I've seen more crashes with 5.10.23-1-lts and 5.11.6-arch1-1. Looks like
5.11.2-arch1-1 is stable for now, but I did not test everything in between.

Does anybody have an idea which commit could have caused this?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgp21oS2nEav6.pgp
Description: OpenPGP digital signature


general protection fault in css_release_work_fn()

2021-03-02 Thread Christian Hesse
Hello everybody,

I see this on a git server with lots of ssh logins. It happens every few
hours to days. No idea how to reproduce, guess it's a race condition?

general protection fault, probably for non-canonical address 
0xdead0122:  [#1] SMP NOPTI
CPU: 3 PID: 2213757 Comm: kworker/3:2 Not tainted 5.10.18-1-lts #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference 
Platform, BIOS 6.00 05/28/2020
Workqueue: cgroup_destroy css_release_work_fn
RIP: 0010:css_release_work_fn+0x3c/0x200
Code: 54 55 53 48 89 fb 48 8b 6f a0 4c 8b 67 98 48 c7 c7 80 d3 ed b4 e8 d4 83 
91 00 48 8b 43 c0 48 8b 53 b8 83 4b ec 04 48 89 42 08 <48> 89 10 4c 89 6b c0 48 
85 ed 0f 84 ab 00 00 00 48 8b 53 d8 48 8d
RSP: 0018:b75f4098fe78 EFLAGS: 00010206
RAX: dead0122 RBX: 9b112c157068 RCX: 9b117ddab5a0
RDX: 9b110e5c2020 RSI: 807f RDI: b4edd380
RBP: b4ff6120 R08: 9b11012e7eb0 R09: 9b11234f5c74
R10: 0018 R11: 0018 R12: 9b110c01a000
R13: dead0122 R14: 9b117ddb2600 R15: 
FS:  () GS:9b117dd8() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f0ff5098620 CR3: 0e74 CR4: 003506e0
Call Trace:
 process_one_work+0x1df/0x370
 worker_thread+0x50/0x400
 ? process_one_work+0x370/0x370
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30
Modules linked in: vsock_loopback vmw_vsock_virtio_transport_common nf_tables 
vmw_vsock_vmci_transport vsock libcrc32c nfnetlink vmwgfx amd_energy joydev 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mousedev aesni_intel 
crypto_simd vmw_balloon ttm cryptd glue_helper rapl drm_kms_helper psmouse 
pcspkr vmxnet3 cec syscopyarea sysfillrect sysimgblt intel_agp vmw_vmci 
fb_sys_fops i2c_piix4 intel_gtt mac_hid drm sg fuse agpgart ip_tables x_tables 
ext4 crc32c_generic crc16 mbcache jbd2 dm_mod sr_mod cdrom ata_generic 
pata_acpi crc32c_intel serio_raw vmw_pvscsi ata_piix
---[ end trace e3405678b69341c6 ]---
RIP: 0010:css_release_work_fn+0x3c/0x200
Code: 54 55 53 48 89 fb 48 8b 6f a0 4c 8b 67 98 48 c7 c7 80 d3 ed b4 e8 d4 83 
91 00 48 8b 43 c0 48 8b 53 b8 83 4b ec 04 48 89 42 08 <48> 89 10 4c 89 6b c0 48 
85 ed 0f 84 ab 00 00 00 48 8b 53 d8 48 8d
RSP: 0018:b75f4098fe78 EFLAGS: 00010206
RAX: dead0122 RBX: 9b112c157068 RCX: 9b117ddab5a0
RDX: 9b110e5c2020 RSI: 807f RDI: b4edd380
RBP: b4ff6120 R08: 9b11012e7eb0 R09: 9b11234f5c74
R10: 0018 R11: 0018 R12: 9b110c01a000
R13: dead0122 R14: 9b117ddb2600 R15: 
FS:  () GS:9b117dd8() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f0ff5098620 CR3: 0e74 CR4: 003506e0

-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpfly3qoFYRP.pgp
Description: OpenPGP digital signature


Re: Linux 5.9.8

2020-11-11 Thread Christian Hesse
Greg Kroah-Hartman  on Tue, 2020/11/10 21:56:
> I'm announcing the release of the 5.9.8 kernel.

This is not yet linked on kernel.org; the same goes for LTS version 5.4.77.
I guess this is not intentional?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpExxcivbE1K.pgp
Description: OpenPGP digital signature


Re: [Regression 5.9][Bisected 1df2bdba528b] Wifi GTK rekeying fails: Sending of EAPol packages broken

2020-10-19 Thread Christian Hesse
Mathy Vanhoef  on Sat, 2020/10/17 23:08:
> I've managed to reproduce the issue, or at least a related issue. Can
> you try the draft patch below and see if that fixes it?

This patch fixes the regression for me. Thanks a lot!
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpJvHzeL4FtP.pgp
Description: OpenPGP digital signature


Re: Linux 4.20.8

2019-02-14 Thread Christian Hesse
Greg KH  on Tue, 2019/02/12 21:18:
> I'm announcing the release of the 4.20.8 kernel.

Versions 4.20.7 and 4.20.8 are missing signatures in git notes. Can you
please add (or push) them? Thanks!
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpOD_vubbrcc.pgp
Description: OpenPGP digital signature


Re: [PATCH 4.4 71/96] e1000e: Separate signaling for link check/link up

2017-12-08 Thread Christian Hesse
Benjamin Poirier  on Fri, 2017/12/08 17:34:
> On 2017/12/07 20:02, Ben Hutchings wrote:
> > On Tue, 2017-11-28 at 11:23 +0100, Greg Kroah-Hartman wrote:  
> > > 4.4-stable review patch.  If anyone has any objections, please let me
> > > know.
> > > 
> > > --
> > > 
> > > From: Benjamin Poirier 
> > > 
> > > commit 19110cfbb34d4af0cdfe14cd243f3b09dc95b013 upstream.  
> > [...]  
> > > --- a/drivers/net/ethernet/intel/e1000e/mac.c
> > > +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> > > @@ -410,6 +410,9 @@ void e1000e_clear_hw_cntrs_base(struct e
> > >   *  Checks to see of the link status of the hardware has changed.  If a
> > >   *  change in link status has been detected, then we read the PHY
> > > registers
> > >   *  to get the current speed/duplex if link exists.
> > > + *
> > > + *  Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1
> > > (link
> > > + *  up).
> > >   **/
> > >  s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
> > >  {  
> > [...]  
> > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> > > @@ -5017,7 +5017,7 @@ static bool e1000e_has_link(struct e1000  
> > > >     case e1000_media_type_copper:
> > > >     if (hw->mac.get_link_status) {
> > > >     ret_val = hw->mac.ops.check_for_link(hw);
> > > > -   link_active = !hw->mac.get_link_status;
> > > > +   link_active = ret_val > 0;
> > > >     } else {
> > > >     link_active = true;
> > > >     }  
> > 
> > As this change in e1000e_has_link() is conditional only on the media
> > type, doesn't e1000_check_for_copper_link_ich8lan() also need to be
> > changed to return 1 for link up?  
> 
> You're right. I looked at it again, in the commit log I wrote that
> "hw->mac.ops.check_for_link(hw) === e1000e_check_for_copper_link" which
> is true for the race condition reported (because that's the function in
> use on adapters that have msix vectors mac.type == e1000_82574) but not
> generally true. The other check_for_link callback needs to be adjusted
> likewise.
> 
> However, I happen to have a I218-LM (e1000_pch_lpt) so I tested 4.14.3
> and this error only delays link up, it doesn't prevent it.
> e1000_check_for_copper_link_ich8lan() sets mac->get_link_status = false;
> and on the next watchdog execution, we fall in the second branch of the
> following e1000e_has_link code:
> 
>   case e1000_media_type_copper:
>   if (hw->mac.get_link_status) {
>   ret_val = hw->mac.ops.check_for_link(hw);
>   link_active = ret_val > 0;
>   } else {
>   link_active = true;
> 
> OTOH, there are multiple reports in
> https://bugzilla.kernel.org/show_bug.cgi?id=198047
> that reverting 830466993daf ("e1000e: Separate signaling for link
> check/link up") fixes the issue so there's something I'm missing.
> 
> Gabriel and Christian, can you test the following patch?

With this patch applied my connection is up and running again. Thanks!
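
The delayed link-up that Benjamin describes in the quoted text can be
reproduced in a small userspace model. The sketch below is an illustration
under assumptions, not driver code: struct demo_hw and all demo_* functions
are hypothetical, and only the get_link_status flag and the return-value
convention from the quoted patch are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's hardware state. */
struct demo_hw {
    bool get_link_status;                      /* true: link must be re-checked */
    int (*check_for_link)(struct demo_hw *hw); /* per-adapter callback */
};

/* Callback following the patched convention:
 * negative error code, 0 for link down, 1 for link up. */
static int demo_check_new_convention(struct demo_hw *hw)
{
    hw->get_link_status = false;
    return 1; /* link is up */
}

/* Callback still on the old convention: 0 means "no error",
 * even though the link is up. */
static int demo_check_old_convention(struct demo_hw *hw)
{
    hw->get_link_status = false;
    return 0;
}

/* Mirrors the copper branch of e1000e_has_link() as quoted above. */
static bool demo_has_link(struct demo_hw *hw)
{
    bool link_active;

    if (hw->get_link_status) {
        int ret_val = hw->check_for_link(hw);
        link_active = ret_val > 0;
    } else {
        link_active = true;
    }
    return link_active;
}

/* Run the check once per simulated watchdog tick and return the tick
 * on which the link is first reported up, or -1 if it never is. */
static int demo_ticks_until_link(struct demo_hw *hw, int max_ticks)
{
    for (int tick = 1; tick <= max_ticks; tick++) {
        if (demo_has_link(hw))
            return tick;
    }
    return -1;
}
```

With a callback on the new convention, demo_ticks_until_link() reports link up
on the first tick; with a callback that still returns 0 on success it takes
two ticks, which matches the "only delays link up" observation. The model says
nothing about why reverting the commit helped the reporters in the bugzilla
entry.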
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpj4X4vO9djf.pgp
Description: OpenPGP digital signature


Re: Linux 4.0.6 breaks my disk/lvm/filesystem setup

2015-06-25 Thread Christian Hesse
David Herrmann  on Thu, 2015/06/25 14:01:
> Hi
> 
> On Thu, Jun 25, 2015 at 9:05 AM, Christian Hesse  wrote:
> > Hello everybody,
> >
> > I kind of nailed the issue. Adding CC to Greg for kdbus and to Herbert for
> > the bad commit. Details are below.
> >
> > Christian Hesse  on Tue, 2015/06/23 10:14:
> >> with Linux 4.0.5 everything was perfectly fine, Linux 4.0.6 breaks the
> >> setup on one of my systems: Only three of my logical volumes are
> >> available, systemd reports:
> >>
> >> lvm2-pvscan@254:3.service: State 'stop-sigterm' timed out. Killing.
> >>
> >> Followed by a lot of failed dependencies. The setup looks like this:
> >>
> >> NAME                MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> >> sda                   8:0    0 953,9G  0 disk
> >> |-sda1                8:1    0     1M  0 part
> >> |-sda2                8:2    0   256M  0 part  /boot/efi
> >> |-sda3                8:3    0   7,8G  0 part
> >> | |-vg-iso          254:0    0     4G  0 lvm   /srv/iso
> >> | |-vg-persist      254:1    0     2G  0 lvm   /srv/iso/persist
> >> | `-vg-boot         254:2    0   128M  0 lvm   /boot
> >> |-sda4                8:4    0 913,9G  0 part
> >> | `-cvg             254:3    0 913,9G  0 crypt
> >> |   |-cvg-swap      254:4    0     4G  0 lvm   [SWAP]
> >> |   |-cvg-root      254:5    0    40G  0 lvm   /
> >> |   |-cvg-log       254:6    0     1G  0 lvm   /var/log
> >> |   |-cvg-home      254:7    0   500G  0 lvm   /home
> >> |   |-cvg-vbox_win7 254:8    0    32G  0 lvm
> >> |   |-cvg-vbox_win8 254:9    0    32G  0 lvm
> >> |   |-cvg-git       254:10   0    12G  0 lvm   /srv/git
> >> |   `-cvg-chroots   254:11   0    16G  0 lvm   /var/lib/archbuild
> >> `-sda5                8:5    0    32G  0 part
> >>
> >> Another system is just fine, the only difference is a logical volume with
> >> btrfs (cvg-chroots). Possibly the btrfs fixes are involved?
> >
> > I am running Linux 4.0.x with kdbus from Greg's char-misc tree kdbus
> > branch [0] merged, last commit is b69af624a0 ("kdbus: optimize if
> > statements in kdbus_conn_disconnect()"). Everything works fine with Linux
> > 4.0.5 but breaks with 4.0.6 on one of my systems.
> >
> > I bisected the problem and found this to be the bad commit [1]:
> >
> >   From cf8befcc1a5538b035d478424efcc2d50e66928e Mon Sep 17 00:00:00 2001
> >   From: Herbert Xu 
> >   Date: Sat, 16 May 2015 21:16:28 +0800
> >   Subject: netlink: Disable insertions/removals during rehash
> >
> >   [ Upstream commit: Not applicable ]
> >
> >   The current rhashtable rehash code is buggy and can't deal with
> >   parallel insertions/removals without corrupting the hash table.
> >
> >   This patch disables it by partially reverting
> >   c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink: eliminate
> >   nl_sk_hash_lock").
> >
> > I can fix my system by booting with kdbus=0 to disable kdbus or by
> > reverting this single commit. Looks like anything deadlocks... Any idea?
> 
> Greg's kdbus tree does not work on 4.0. How exactly did you do the
> back-merge? You need to revert these patches at least to make it work
> on 4.0 (in this order):
> kdbus: no need to ref current->mm
> kdbus: use rcu to access exe file in metadata
> kdbus: pool: use __vfs_read()
> Furthermore, we don't support kdbus on 4.0. So if this does not happen
> on 4.1, I'd recommend staying with 4.1. It'd still be interesting to
> see whether the netlink-locking back-port is indeed broken.

Probably I borked it with my back-merge then...
Everything else worked perfectly fine, though.

I am on 4.1.0 now, which did not have any issues so far.

> Regardless: It is highly unlikely that the netlink commit and kdbus
> are in any way related. Either kdbus triggers some uncommon user-space
> path, or you have a borked kdbus-merge.

I don't know... Possibly the latter. Let's ignore this as it is
unsupported. ;)

Thanks for your support!
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpggEYDGbw2G.pgp
Description: OpenPGP digital signature


Re: Linux 4.0.6 breaks my disk/lvm/filesystem setup

2015-06-25 Thread Christian Hesse
Hello everybody,

I kind of nailed the issue. Adding CC to Greg for kdbus and to Herbert for
the bad commit. Details are below.

Christian Hesse  on Tue, 2015/06/23 10:14:
> with Linux 4.0.5 everything was perfectly fine, Linux 4.0.6 breaks the setup
> on one of my systems: Only three of my logical volumes are available,
> systemd reports:
> 
> lvm2-pvscan@254:3.service: State 'stop-sigterm' timed out. Killing.
> 
> Followed by a lot of failed dependencies. The setup looks like this:
> 
> NAME                MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sda                   8:0    0 953,9G  0 disk
> |-sda1                8:1    0     1M  0 part
> |-sda2                8:2    0   256M  0 part  /boot/efi
> |-sda3                8:3    0   7,8G  0 part
> | |-vg-iso          254:0    0     4G  0 lvm   /srv/iso
> | |-vg-persist      254:1    0     2G  0 lvm   /srv/iso/persist
> | `-vg-boot         254:2    0   128M  0 lvm   /boot
> |-sda4                8:4    0 913,9G  0 part
> | `-cvg             254:3    0 913,9G  0 crypt
> |   |-cvg-swap      254:4    0     4G  0 lvm   [SWAP]
> |   |-cvg-root      254:5    0    40G  0 lvm   /
> |   |-cvg-log       254:6    0     1G  0 lvm   /var/log
> |   |-cvg-home      254:7    0   500G  0 lvm   /home
> |   |-cvg-vbox_win7 254:8    0    32G  0 lvm
> |   |-cvg-vbox_win8 254:9    0    32G  0 lvm
> |   |-cvg-git       254:10   0    12G  0 lvm   /srv/git
> |   `-cvg-chroots   254:11   0    16G  0 lvm   /var/lib/archbuild
> `-sda5                8:5    0    32G  0 part
> 
> Another system is just fine, the only difference is a logical volume with
> btrfs (cvg-chroots). Possibly the btrfs fixes are involved?

I am running Linux 4.0.x with kdbus from Greg's char-misc tree kdbus branch
[0] merged, last commit is b69af624a0 ("kdbus: optimize if statements in
kdbus_conn_disconnect()"). Everything works fine with Linux 4.0.5 but breaks
with 4.0.6 on one of my systems.

I bisected the problem and found this to be the bad commit [1]:

  From cf8befcc1a5538b035d478424efcc2d50e66928e Mon Sep 17 00:00:00 2001
  From: Herbert Xu 
  Date: Sat, 16 May 2015 21:16:28 +0800
  Subject: netlink: Disable insertions/removals during rehash

  [ Upstream commit: Not applicable ]

  The current rhashtable rehash code is buggy and can't deal with
  parallel insertions/removals without corrupting the hash table.

  This patch disables it by partially reverting
  c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink: eliminate
  nl_sk_hash_lock").

I can fix my system by booting with kdbus=0 to disable kdbus or by reverting
this single commit. Looks like something deadlocks... Any idea?

[0] https://git.kernel.org/cgit/linux/kernel/git/gregkh/char-misc.git/?h=kdbus
[1]
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=cf8befcc
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpytV81Oxajb.pgp
Description: OpenPGP digital signature


Linux 4.0.6 breaks my disk/lvm/filesystem setup

2015-06-23 Thread Christian Hesse
Hello everybody,

with Linux 4.0.5 everything was perfectly fine, Linux 4.0.6 breaks the setup
on one of my systems: Only three of my logical volumes are available, systemd
reports:

lvm2-pvscan@254:3.service: State 'stop-sigterm' timed out. Killing.

Followed by a lot of failed dependencies. The setup looks like this:

NAME                MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                   8:0    0 953,9G  0 disk
|-sda1                8:1    0     1M  0 part
|-sda2                8:2    0   256M  0 part  /boot/efi
|-sda3                8:3    0   7,8G  0 part
| |-vg-iso          254:0    0     4G  0 lvm   /srv/iso
| |-vg-persist      254:1    0     2G  0 lvm   /srv/iso/persist
| `-vg-boot         254:2    0   128M  0 lvm   /boot
|-sda4                8:4    0 913,9G  0 part
| `-cvg             254:3    0 913,9G  0 crypt
|   |-cvg-swap      254:4    0     4G  0 lvm   [SWAP]
|   |-cvg-root      254:5    0    40G  0 lvm   /
|   |-cvg-log       254:6    0     1G  0 lvm   /var/log
|   |-cvg-home      254:7    0   500G  0 lvm   /home
|   |-cvg-vbox_win7 254:8    0    32G  0 lvm
|   |-cvg-vbox_win8 254:9    0    32G  0 lvm
|   |-cvg-git       254:10   0    12G  0 lvm   /srv/git
|   `-cvg-chroots   254:11   0    16G  0 lvm   /var/lib/archbuild
`-sda5                8:5    0    32G  0 part

Another system is just fine, the only difference is a logical volume with
btrfs (cvg-chroots). Possibly the btrfs fixes are involved?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpqKZwfozaE6.pgp
Description: OpenPGP digital signature


Re: task kworker / kloopd blocked for more than 120 seconds

2015-05-29 Thread Christian Hesse
Ming Lei  on Thu, 2015/05/28 20:20:
> On Thu, May 28, 2015 at 5:37 PM, Christian Hesse  wrote:
> > Assembling the root filesystem stack always succeeds, but sometimes boot
> > hangs with a lot of:  
> 
> Could you try the following two patches to see if they can fix your issue?
> 
> http://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git/commit/?h=for-next&id=f4aa4c7bbac6c4afdd4adccf90898c1a3685396d
> 
> http://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git/commit/?h=for-next&id=4d4e41aef9429872ea3b105e83426941f7185ab6

Sounds reasonable. The system has been perfectly stable and reliable since I
booted a patched kernel. Thanks a lot for the hint!

Looks like the changes did not make it into Linus' tree yet. So it will take
some time for the patches to show up in a stable release, no?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpOg2VbKezFt.pgp
Description: OpenPGP digital signature


task kworker / kloopd blocked for more than 120 seconds

2015-05-28 Thread Christian Hesse
Hello everybody,

I am experiencing issues on a live system based on Arch Linux [0][1] with
Linux kernel 4.0.x when booted in RAM mode (parameter 'copytoram') [2].
The problem has probably existed for some time; I am not sure when it first
happened or which versions are affected.

The boot sequence looks like this:

-> boot kernel
-> start initramfs
-> get the squashfs image file (there is no difference with loop mount,
   nbd or http)
-> root filesystem is set up with loop devices and device mapper
-> boot continues

The root filesystem is accessed as follows:

-> ext4 filesystem
  -> device mapper snapshot target
    -> /dev/loop1: r/o ext4 image
      -> mounted squashfs
        -> /dev/loop0: squashfs image
    -> /dev/loop2: r/w COW image (sparse file)


Assembling the root filesystem stack always succeeds, but sometimes boot hangs
with a lot of:

INFO: task kworker/u#:#:### blocked for more than 120 seconds

and a single:

INFO: task kloopd/### blocked for more than 120 seconds

Any idea what goes wrong? Looks like a race condition or deadlock.

[0] https://wiki.archlinux.org/index.php/Archiso
[1] https://projects.archlinux.org/archiso.git/
[2] https://projects.archlinux.org/archiso.git/tree/docs/README.bootparams
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpvol8zlseVa.pgp
Description: OpenPGP digital signature


Re: linux 3.19 iSCSI issue

2015-03-20 Thread Christian Hesse
Christian Hesse  on Thu, 2015/02/19 11:47:
> Hello everybody,
> 
> beginning with linux 3.19 (Arch Linux x86_64 package version 3.19-1) I see
> an iSCSI issue. This works perfectly with linux 3.18.6 and before. The logs
> tell the story:
>
> [snip log]

This is a real regression and still happens with 3.19.x stable (3.19.2) and
4.0 git (4.0rc4.r199.gb314aca). Anybody with a hint?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpaO59fq1x6e.pgp
Description: OpenPGP digital signature


linux 3.19 iSCSI issue

2015-02-19 Thread Christian Hesse
Hello everybody,

beginning with linux 3.19 (Arch Linux x86_64 package version 3.19-1) I see an
iSCSI issue. This works perfectly with linux 3.18.6 and before. The logs tell
the story:

Feb 19 11:26:49 thebe kernel: scsi host6: iSCSI Initiator over TCP/IP
Feb 19 11:26:49 thebe kernel: scsi 6:0:0:0: Direct-Access QNAP iSCSI 
Storage4.0  PQ: 0 ANSI: 5
Feb 19 11:26:49 thebe kernel: sd 6:0:0:0: [sdb] 1073741824 512-byte logical 
blocks: (549 GB/512 GiB)
Feb 19 11:26:49 thebe kernel: sd 6:0:0:0: [sdb] Write Protect is off
Feb 19 11:26:49 thebe kernel: sd 6:0:0:0: [sdb] Mode Sense: 2f 00 00 00
Feb 19 11:26:49 thebe kernel: sd 6:0:0:0: [sdb] Write cache: disabled, read 
cache: enabled, doesn't support DPO or FUA
Feb 19 11:26:49 thebe kernel:  sdb: unknown partition table
Feb 19 11:26:49 thebe kernel: sd 6:0:0:0: [sdb] Attached SCSI disk
Feb 19 11:26:49 thebe iscsid[10804]: Connection1:0 to [target: 
iqn.2004-04.com.qnap:ts-859:iscsi.xx.xx, portal: 
xx.xx.xxx.xx,3260] through [iface: default] is operational now
Feb 19 11:26:57 thebe kernel:  sdb: unknown partition table
Feb 19 11:28:20 thebe kernel: EXT4-fs (dm-8): mounting with "discard" option, 
but the device does not support discard
Feb 19 11:28:20 thebe kernel: EXT4-fs (dm-8): mounted filesystem with ordered 
data mode. Opts: (null)
Feb 19 11:28:24 thebe kernel: sd 6:0:0:0: [sdb] UNKNOWN Result: hostbyte=0x00 
driverbyte=0x08
Feb 19 11:28:24 thebe kernel: sd 6:0:0:0: [sdb] Sense Key : 0x5 [current]
Feb 19 11:28:24 thebe kernel: sd 6:0:0:0: [sdb] ASC=0x24 ASCQ=0x0
Feb 19 11:28:24 thebe kernel: sd 6:0:0:0: [sdb] CDB: 
Feb 19 11:28:24 thebe kernel: cdb[0]=0x2a: 2a 00 34 5b 07 ff 00 2f 88 00
Feb 19 11:28:24 thebe kernel: blk_update_request: critical target error, dev 
sdb, sector 878381055
Feb 19 11:28:24 thebe kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: 
I/O error -121 writing to inode 33196503 (offset 8388608 size 7278592 starting 
block 108749056)
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749056
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749057
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749058
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749059
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749060
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749061
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749062
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749063
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749064
Feb 19 11:28:24 thebe kernel: Buffer I/O error on device dm-8, logical block 
108749065
Feb 19 11:28:24 thebe kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: 
I/O error -121 writing to inode 33196503 (offset 8388608 size 7278592 starting 
block 108749312)
Feb 19 11:28:24 thebe kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: 
I/O error -121 writing to inode 33196503 (offset 8388608 size 7278592 starting 
block 108749568)
Feb 19 11:28:24 thebe kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: 
I/O error -121 writing to inode 33196503 (offset 8388608 size 7278592 starting 
block 108749824)
Feb 19 11:28:24 thebe kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: 
I/O error -121 writing to inode 33196503 (offset 8388608 size 7278592 starting 
block 108750080)
Feb 19 11:28:24 thebe kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: 
I/O error -121 writing to inode 33196503 (offset 8388608 size 7278592 starting 
block 108750336)
Feb 19 11:29:10 thebe kernel: sd 6:0:0:0: [sdb] UNKNOWN Result: hostbyte=0x00 
driverbyte=0x08
Feb 19 11:29:10 thebe kernel: sd 6:0:0:0: [sdb] Sense Key : 0x5 [current]
Feb 19 11:29:10 thebe kernel: sd 6:0:0:0: [sdb] ASC=0x24 ASCQ=0x0
Feb 19 11:29:10 thebe kernel: sd 6:0:0:0: [sdb] CDB: 
Feb 19 11:29:10 thebe kernel: cdb[0]=0x2a: 2a 00 20 44 89 17 00 20 50 00
Feb 19 11:29:10 thebe kernel: blk_update_request: critical target error, dev 
sdb, sector 541362455
Feb 19 11:29:10 thebe kernel: Buffer I/O error on dev dm-8, logical block 
66621731, lost sync page write
Feb 19 11:29:10 thebe kernel: Aborting journal on device dm-8-8.
Feb 19 11:29:10 thebe kernel: EXT4-fs error (device dm-8): 
ext4_journal_check_start:56: Detected aborted journal
Feb 19 11:29:10 thebe kernel: EXT4-fs (dm-8): Remounting filesystem read-only
Feb 19 11:29:20 thebe kernel: EXT4-fs error (device dm-8): ext4_put_super:780: 
Couldn't clean up the journal
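
The two failing commands can be read straight from the CDB lines in the log
above. What follows is a minimal sketch, assuming both are standard 10-byte
WRITE(10) commands (opcode 0x2a, big-endian LBA in bytes 2-5, transfer length
in bytes 7-8). The function name decode_write10 is made up for illustration.
In the SCSI specifications, sense key 0x5 is ILLEGAL REQUEST and ASC 0x24 /
ASCQ 0x00 is INVALID FIELD IN CDB.

```python
def decode_write10(cdb_hex: str) -> dict:
    """Decode a WRITE(10) CDB given as space-separated hex bytes.

    Hypothetical helper for reading the log above, not part of any tool.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    return {
        "opcode": cdb[0],
        # Bytes 2..5: logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: transfer length in logical blocks, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


if __name__ == "__main__":
    for line in ("2a 00 34 5b 07 ff 00 2f 88 00",
                 "2a 00 20 44 89 17 00 20 50 00"):
        print(decode_write10(line))
```

The decoded LBAs are 878381055 and 541362455, the same sectors that
blk_update_request reports, and the rejected writes are 12168 and 8272 blocks
long.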

-- 
main(a){char*c=/*Best regards,   */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


pgpqUUUdhtEUO.pgp
Description: OpenPGP digital signature


[PATCH 1/1] Documentation: image containing ucode needs to be uncompressed

2014-10-29 Thread Christian Hesse
While the regular initramfs is allowed to be compressed, the image
containing microcode is not.

Signed-off-by: Christian Hesse 
---
 Documentation/x86/early-microcode.txt | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/x86/early-microcode.txt b/Documentation/x86/early-microcode.txt
index d62bea6..da6ded3 100644
--- a/Documentation/x86/early-microcode.txt
+++ b/Documentation/x86/early-microcode.txt
@@ -8,9 +8,10 @@ can fix CPU issues before they are observed during kernel boot time.
 Microcode is stored in an initrd file. The microcode is read from the initrd
 file and loaded to CPUs during boot time.
 
-The format of the combined initrd image is microcode in cpio format followed by
-the initrd image (maybe compressed). Kernel parses the combined initrd image
-during boot time. The microcode file in cpio name space is:
+The format of the combined initrd image is microcode in uncompressed
+cpio format followed by the initrd image (maybe compressed). Kernel
+parses the combined initrd image during boot time. The microcode file in
+cpio name space is:
 on Intel: kernel/x86/microcode/GenuineIntel.bin
 on AMD  : kernel/x86/microcode/AuthenticAMD.bin
 
-- 
2.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
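
The rule stated in the patch above can be checked before booting. This is a
rough sketch, assuming the microcode archive uses the cpio "newc" format
(magic 070701, or 070702 for the checksummed variant) and that the regular
initrd may be compressed. The helper names are hypothetical and this is not
the kernel's own parser.

```python
# ASCII magic at the start of a cpio "newc" / "crc" archive.
CPIO_NEWC_MAGICS = (b"070701", b"070702")


def is_uncompressed_cpio(blob: bytes) -> bool:
    """Return True if blob begins with a plain (uncompressed) cpio header."""
    return blob[:6] in CPIO_NEWC_MAGICS


def combine_initrd(ucode_cpio: bytes, initrd: bytes) -> bytes:
    """Prepend the microcode archive to the (possibly compressed) initrd.

    Refuses a microcode part that is not an uncompressed cpio archive,
    which is the requirement the documentation patch spells out.
    """
    if not is_uncompressed_cpio(ucode_cpio):
        raise ValueError("microcode archive must be an uncompressed cpio")
    return ucode_cpio + initrd
```

Only the first part of the combined image is checked; the second part is
passed through untouched, so a compressed initramfs stays valid there.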


Re: [ANNOUNCE] iproute2 3.10

2013-07-16 Thread Christian Hesse
Stephen Hemminger  on Tue, 2013/07/16 10:17:
> This is update for iproute2 tools for 3.10 kernel.
> [...]
>
> Iproute2 package is available at:
>   http://kernel.org/pub/linux/utils/net/iproute2/iproute2-3.10.0.tar.gz
> 
> You can download the source from:
>   git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

Is there any reason the last two releases have not been tagged in git?
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


signature.asc
Description: PGP signature


Re: [Btrfs-devel] [ANNOUNCE] Btrfs v0.10 (online growing/shrinking, ext3 conversion, and more)

2008-01-16 Thread Christian Hesse
On Tuesday 15 January 2008, Chris Mason wrote:
> Hello everyone,
>
> Btrfs v0.10 is now available for download from:

It does not even compile for me, tested with 2.6.24-rc{7,8}. I will look at 
that later.

fs/built-in.o: In function `btrfs_xattr_set_acl':
acl.c:(.text+0x68f33): undefined reference to `posix_acl_from_xattr'
acl.c:(.text+0x68f47): undefined reference to `posix_acl_valid'
make: *** [.tmp_vmlinux1] Error 1

Is this release supposed to fix the suspend problem?
-- 
Regards,
Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Suspend2-devel] Reboot problem

2008-01-02 Thread Christian Hesse
On Wednesday 02 January 2008, Rafael J. Wysocki wrote:
> On Wednesday, 2 of January 2008, Christian Hesse wrote:
> > On Wednesday 02 January 2008, Nigel Cunningham wrote:
> > > Hi Christian.
> > >
> > > Christian Hesse wrote:
> > > > On Tuesday 01 January 2008, Nigel Cunningham wrote:
> > > >> Third, regarding the patch itself, I'm taking my time in working
> > > >> towards the 3.0 release. We don't have any major bugs with 3.0-rc3
> > > >> reported [...].
> > > >
> > > > Well, I think I still have a bug, though it is possibly a mainline
> > > > problem and it's not a showstopper. After a suspend/resume cycle the
> > > > reboot does not work. The system hangs with "Rebooting system" (or
> > > > similar). After that you have to hard reset the system, which is not
> > > > really a problem as filesystems have been unmounted before. Reboot
> > > > without a suspend cycle before and halt with and without suspend
> > > > cycle work without problems.
> > >
> > > Just to clarify, do you mean rebooting after writing an image, or
> > > shutting down and rebooting? It could be that there's some change to
> > > the semantics in 2.6.24 that I haven't noticed yet.
> >
> > I speak about shutting down and rebooting. I have not used reboot after
> > writing an image for a long time now. Will test what happens in this
> > case.

Reboot after writing an image does not work either. The system hangs with
"Ready to reboot".

> > I had the issue before 2.6.24(-rc) already, thought I don't know whether
> > there were times it worked. I use it way too seldom.
>
> Well, this is similar to http://bugzilla.kernel.org/show_bug.cgi?id=6655 ,
> which definitely is a mainline problem (still pending).

Yes, that sounds like my problem. I will test the patches and keep an eye on
the bug report.
-- 
Regards,
Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Suspend2-devel] Reboot problem

2008-01-01 Thread Christian Hesse
On Wednesday 02 January 2008, Nigel Cunningham wrote:
> Hi Christian.
>
> Christian Hesse wrote:
> > On Tuesday 01 January 2008, Nigel Cunningham wrote:
> >> Third, regarding the patch itself, I'm taking my time in working towards
> >> the 3.0 release. We don't have any major bugs with 3.0-rc3 reported
> >> [...].
> >
> > Well, I think I still have a bug, though it is possibly a mainline
> > problem and it's not a showstopper. After a suspend/resume cycle the
> > reboot does not work. The system hangs with "Rebooting system" (or
> > similar). After that you have to hard reset the system, which is not
> > really a problem as filesystems have been unmounted before. Reboot
> > without a suspend cycle before and halt with and without suspend cycle
> > work without problems.
>
> Just to clarify, do you mean rebooting after writing an image, or
> shutting down and rebooting? It could be that there's some change to the
> semantics in 2.6.24 that I haven't noticed yet.

I am talking about shutting down and rebooting. I have not used reboot after
writing an image for a long time now. I will test what happens in this case.

I had the issue before 2.6.24(-rc) already, though I don't know whether there
were times when it worked. I use it far too seldom.
-- 
Regards,
Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Reboot problem (was: Re: [Suspend2-devel] What's in store for 2008 for TuxOnIce?)

2008-01-01 Thread Christian Hesse
On Tuesday 01 January 2008, Nigel Cunningham wrote:
> Third, regarding the patch itself, I'm taking my time in working towards
> the 3.0 release. We don't have any major bugs with 3.0-rc3 reported [...].

Well, I think I still have a bug, though it is possibly a mainline problem and 
it's not a showstopper. After a suspend/resume cycle the reboot does not 
work. The system hangs with "Rebooting system" (or similar). After that you 
have to hard reset the system, which is not really a problem as filesystems 
have been unmounted before. Reboot without a suspend cycle before and halt 
with and without suspend cycle work without problems.
I'm using toi 3.0-rc3 with kernel 2.6.24-rc6 and, apart from the problem
described above, I'm really happy with toi.

Happy new year to everybody!
-- 
Regards,
Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Christian Hesse
On Saturday 25 August 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > Reproduced on a Intel Centrino based laptop with gentoo kamikaze7
> > > sources (http://forums.gentoo.org/viewtopic-t-577970.html)
> >
> > Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS v20.2.
>
> please try the patch below - does it fix the problem?

Works for me. Thanks a lot!
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Christian Hesse
On Saturday 25 August 2007, Fabio Comolli wrote:
> On 8/25/07, David Rodriguez <[EMAIL PROTECTED]> wrote:
> > I'm using 2.6.22.5 with cfs v20.3 and  suspend2  2.2.10.2.
> > With that combination, suspend is not working anymore (with cfs v19
> > was working).
> > Stops on suspend in "Suspending tasks"
> > Looking at cfs patch, I managed to change the  migration_thread,
> > adding again the  try_to_freeze() removed in last patch and now the
> > suspend finished, but resume not work. Of course I don't know why that
> > was removed, and rewriting it is not a solution, but I want to report
> > it.
>
> Reproduced on a Intel Centrino based laptop with gentoo kamikaze7
> sources (http://forums.gentoo.org/viewtopic-t-577970.html)

Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS v20.2.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


dm(-crypt) and /dev/disk/by-label/

2007-05-14 Thread Christian Hesse
Hello everybody,

If this is the wrong place to ask, please tell me where to ask instead.

I have ext3 filesystems with labels on devicemapper crypted devices. These do 
not show up in /dev/disk/by-label/, in contrast to filesystems of my "real" 
partitions. Is this the expected behaviour or what could go wrong?
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.
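
A minimal sketch of the check described in the message above: whether a given filesystem label has an entry under /dev/disk/by-label/. The function name has_label and the example label are hypothetical; only the directory path comes from the message.

```shell
# has_label DIR LABEL: succeed if DIR contains a symlink named LABEL.
# udev normally populates /dev/disk/by-label/ with such symlinks.
has_label() {
    [ -L "$1/$2" ]
}

# Usage (the label "home" is only an example):
#   has_label /dev/disk/by-label home && echo "label is visible"
```

Running it once for a label on a plain partition and once for a label on a device-mapper device would show the difference reported in the message.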


Re: build system: no module target ending with slash?

2007-05-04 Thread Christian Hesse
On Thursday 03 May 2007, Sam Ravnborg wrote:
> On Thu, May 03, 2007 at 09:17:15AM +0200, Christian Hesse wrote:
> > On Thursday 03 May 2007, Sam Ravnborg wrote:
> > > On Thu, May 03, 2007 at 06:25:11AM +0200, Sam Ravnborg wrote:
> > > > On Thu, May 03, 2007 at 12:43:43AM +0200, Christian Hesse wrote:
> > > > > Hi James, hi everybody,
> > > > >
> > > > > playing with iwlwifi I try to patch it into the kernel and to build
> > > > > it from there. But I have a problem with the build system.
> > > > >
> > > > > The file drivers/net/wireless/mac80211/Makefile contains one single
> > > > > line:
> > > > >
> > > > > obj-$(CONFIG_IWLWIFI)   += iwlwifi/
> > > > >
> > > > > When CONFIG_IWLWIFI=m in scripts/Makefile.lib line 29 the target is
> > > > > filtered as it ends with a slash. That results in
> > > > > drivers/net/wireless/mac80211/built-in.o not being built and the
> > > > > build process breaks with an error. What is the correct way to
> > > > > handle this? Why are targets ending with a slash filtered?
> > > >
> > > > Looks buggy. I will take a look tonight.
> > >
> > > After some coffee...
> > >
> > > Line 29 in Kbuild.include finds all modules and a directory is not a
> > > module. In line 26 in same file the directory iwlwifi is included in
> > > the list of directories to visit.
> > > So there is something else going on.
> >
> > In scripts/Kbuild.include line 26 is empty and line 29 is a comment... Do
> > I look at the wrong place?
>
> I looked at lxr.linux.no - so probably an outdated version.
>
> > I still believe in my version: built-in.o is built if any of $(obj-y)
> > $(obj-m) $(obj-n) $(obj-) $(lib-target) contains anything in
> > scripts/Makefile.build line 77. As scripts/Makefile.lib line 29 filters
> > the only target the object file is not built.
>
> I have applied your patch and tried it out.
> The reason for the problem is the placeholder directory mac80211.
> kbuild will not waste time building built-in.o for a directory where
> it is not necessary. So for mac80211 no built-in.o is created since there
> is no need. The only reference is to a module.

Agreed that it is not really needed. But if you don't build it you should not 
try to link it later...

> The quick-and-dirty workaround is to add a single
> obj-n := xx
> in mac80211/Makefile and kbuild is happy again.
>
> I could teach kbuild to create built-in.o also in the case
> where we refer to a subdirectory only. But then we would end up with a
> built-in.o in all directories where we have a kbuild Makefile (almost) and
> that is not desirable.

I would prefer to teach it not to link object files that are not built.

> So I recommend the proposed workaround for now with a proper comment.

Ok, thanks for your help.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: build system: no module target ending with slash?

2007-05-03 Thread Christian Hesse
On Thursday 03 May 2007, Sam Ravnborg wrote:
> On Thu, May 03, 2007 at 06:25:11AM +0200, Sam Ravnborg wrote:
> > On Thu, May 03, 2007 at 12:43:43AM +0200, Christian Hesse wrote:
> > > Hi James, hi everybody,
> > >
> > > playing with iwlwifi I try to patch it into the kernel and to build it
> > > from there. But I have a problem with the build system.
> > >
> > > The file drivers/net/wireless/mac80211/Makefile contains one single
> > > line:
> > >
> > > obj-$(CONFIG_IWLWIFI)   += iwlwifi/
> > >
> > > When CONFIG_IWLWIFI=m in scripts/Makefile.lib line 29 the target is
> > > filtered as it ends with a slash. That results in
> > > drivers/net/wireless/mac80211/built-in.o not being built and the build
> > > process breaks with an error. What is the correct way to handle this?
> > > Why are targets ending with a slash filtered?
> >
> > Looks buggy. I will take a look tonight.
>
> After some coffee...
>
> Line 29 in Kbuild.include finds all modules and a directory is not a module.
> In line 26 in same file the directory iwlwifi is included in the list
> of directories to visit.
> So there is something else going on.

In scripts/Kbuild.include line 26 is empty and line 29 is a comment... Do I 
look at the wrong place?

I still believe in my version: built-in.o is built if any of $(obj-y) $(obj-m) 
$(obj-n) $(obj-) $(lib-target) contains anything in scripts/Makefile.build 
line 77. As scripts/Makefile.lib line 29 filters the only target the object 
file is not built.

> Anywhere I can get access to the combined source or could you try to post
> the full Makefile.

I just generated a patch [0] against vanilla 2.6.21 with latest mac80211 and 
iwlwifi from git. Get my config [1] and you should get my error.

[0] http://www.eworm.de/tmp/iwlwifi.patch.bz2
[1] http://www.eworm.de/tmp/config-iwlwifi-2.6.21
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


build system: no module target ending with slash?

2007-05-02 Thread Christian Hesse
Hi James, hi everybody,

playing with iwlwifi I try to patch it into the kernel and to build it from 
there. But I have a problem with the build system.

The file drivers/net/wireless/mac80211/Makefile contains one single line:

obj-$(CONFIG_IWLWIFI)   += iwlwifi/

When CONFIG_IWLWIFI=m in scripts/Makefile.lib line 29 the target is filtered 
as it ends with a slash. That results in 
drivers/net/wireless/mac80211/built-in.o not being built and the build 
process breaks with an error. What is the correct way to handle this? Why are 
targets ending with a slash filtered?
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: [patch] CFS scheduler, -v5

2007-04-25 Thread Christian Hesse
On Wednesday 25 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > On Monday 23 April 2007, Ingo Molnar wrote:
> > > i'm pleased to announce release -v5 of the CFS scheduler patchset.
> >
> > Hi Ingo,
> >
> > I just noticed that with cfs all processes (except some kernel
> > threads) run on cpu 0. I don't think this is expected cpu affinity for
> > an smp system? I remember about half of the processes running on each
> > core with mainline.
>
> i've got several SMP systems with CFS and all distribute the load
> properly to all CPUs, so it would be nice if you could tell me more
> about how the problem manifests itself on your system.
>
> for example, if you start two infinite loops:
>
> for (( N=0; N < 2; N++ )); do ( while :; do :; done ) & done
>
> do they end up on the same CPU?
>
> Or do you mean that the default placement of single tasks starts at
> CPU#0, while with mainline they were alternating?

That was not your fault. I updated suspend2 to 2.2.9.13 and everything works 
as expected again. Sorry for the noise.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: [patch] CFS scheduler, -v5

2007-04-24 Thread Christian Hesse
On Monday 23 April 2007, Ingo Molnar wrote:
> i'm pleased to announce release -v5 of the CFS scheduler patchset.

Hi Ingo,

I just noticed that with cfs all processes (except some kernel threads) run on 
cpu 0. I don't think this is expected cpu affinity for an smp system? I 
remember about half of the processes running on each core with mainline.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: crash with CFS v4 and qemu/kvm (was: [patch] CFS scheduler, v4)

2007-04-24 Thread Christian Hesse
On Monday 23 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > On Friday 20 April 2007, Ingo Molnar wrote:
> > > i'm pleased to announce release -v4 of the CFS patchset.
> >
> > Hi Ingo, hi Avi, hi all,
> >
> > I'm trying to use kvm-20 with cfs v4 and get a crash:
> >
> > [EMAIL PROTECTED]:~$ /usr/local/kvm/bin/qemu -snapshot
> > /mnt/data/virtual/qemu/winxp.img
> > kvm_run: failed entry, reason 7
> > kvm_run returned -8
> >
> > It works (though it is a bit slow) if I start qemu with strace, so for
> > me it looks like a race condition?
>
> hm. Can you work it around with:
>
>echo 0 > /proc/sys/kernel/sched_granularity_ns
>
> ?
>
> If yes then this is a wakeup race: some piece of code relies on the
> upstream scheduler preempting the waker task immediately in 99% of the
> cases.
>
> and you might want to test -v5 too which i released earlier today. It
> has no bugfix in this area though, so it will likely still trigger this
> race - but it will also hopefully be even more pleasant to use than -v4
> ;-)

Hi Ingo,

This was kvm's fault. It works perfectly now without modifications to the 
scheduler. If anybody is interested in details please see the kvm mailing 
list archives.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


crash with CFS v4 and qemu/kvm (was: [patch] CFS scheduler, v4)

2007-04-23 Thread Christian Hesse
On Friday 20 April 2007, Ingo Molnar wrote:
> i'm pleased to announce release -v4 of the CFS patchset.

Hi Ingo, hi Avi, hi all,

I'm trying to use kvm-20 with cfs v4 and get a crash:

[EMAIL PROTECTED]:~$ /usr/local/kvm/bin/qemu -snapshot 
/mnt/data/virtual/qemu/winxp.img
kvm_run: failed entry, reason 7
kvm_run returned -8

It works (though it is a bit slow) if I start qemu with strace, so for me it 
looks like a race condition?

I did not test any earlier versions of cfs and kvm in combination - I can't 
say if it happens there as well.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: CFS and suspend2: hang in atomic copy

2007-04-19 Thread Christian Hesse
On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > I now got some error message from my system:
> >
> > http://www.eworm.de/tmp/cfs-suspend.jpg
>
> ah, this pinpoints a bug: for performance reasons pick_next_task()
> assumes that the runqueue is not empty - which is true for schedule(),
> but not in migrate_dead_tasks(). Does the patch below fix the crash for
> you?
>
>  kernel/sched.c |2 ++
>  1 file changed, 2 insertions(+)
>
> Index: linux/kernel/sched.c
> ===
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -4425,6 +4425,8 @@ static void migrate_dead_tasks(unsigned
>  	struct task_struct *next;
>
>  	for (;;) {
> +		if (!rq->nr_running)
> +			break;
>  		next = pick_next_task(rq, rq->curr);
>  		if (!next)
>  			break;

Suspend works perfectly with this patch. Thanks a lot and keep up the good 
work!
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse
On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> i just tried the same and it suspended+resumed just fine:
>
> Restarting tasks ... done.
> Suspend2 debugging info:
> - Suspend core   : 2.2.9.12
> - Kernel Version : 2.6.21-rc7-CFS-v3
> - Compiler vers. : 4.0
> - Attempt number : 2
> - Parameters : 0 81920 0 0 0 0
> - Overall expected compression percentage: 0.
> - Compressor is 'lzf'.
>   Compressed 31133696 bytes into 14880587 (52 percent compression).
> - SwapAllocator active.
>   Swap available for image: 512036 pages.
> - FileAllocator inactive.
> - I/O speed: Write 76 MB/s, Read 42 MB/s.
> - Extra pages: 18 used/500.
>
> could you send me your .config?

My config is attached.

I now got some error message from my system:

http://www.eworm.de/tmp/cfs-suspend.jpg
-- 
Regards,
Chris
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc7-r1
# Wed Apr 18 22:25:20 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_IKPATCHES=y
CONFIG_IKPATCHES_PROC=y
# CONFIG_CPUSETS is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
# CONFIG_HPET_TIMER is not set
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not s

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse
On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i drive
> this by echoing to /sys/power/state?)

Perhaps you have to install suspend2-userui as well for the output (I'm not 
sure whether it works without). Then you can trigger the suspend by echoing 
to /sys/power/suspend2/do_suspend.
Useful information can be found in the Howto:

http://www.suspend2.net/HOWTO

I dropped some ccs to not abuse Linus and friends.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse
On Wednesday 18 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > i took a quick look at suspend2 and it makes some use of yield().
> > > There's a bug in CFS's yield code, i've attached a patch that should
> > > fix it, does it make any difference to the hang?
> >
> > This patch should apply cleanly against what? The second hunk is
> > ignored as it has already been applied. Is this correct?
>
> hm, i think you might have had one of the earlier CFS patches.

You are right.

> > But no, it does not change anything. Let me know if you have any other
> > patches to test.
>
> could you try the -v3 patch i released a few hours ago:
>
>http://redhat.com/~mingo/cfs-scheduler/
>
> although probably your suspend2 problem is still not fixed, it's worth a
> try nevertheless. Which suspend2 patch did you apply, and was it against
> -rc6 or -rc7?

You are right again. ;-)

Linux 2.6.21-rc7
Suspend2 2.2.9.11 (applies cleanly to -rc7)
CFS v3 (without any additional patches)

And it still hangs on suspend.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse
On Wednesday 18 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > On Friday 13 April 2007, Ingo Molnar wrote:
> > > as usual, any sort of feedback, bugreports, fixes and suggestions are
> > > more than welcome,
> >
> > When trying to suspend a system patched
> > with suspend2 2.2.9.11 it hangs with "doing atomic copy". Pressing the
> > ESC key results in a message that it tries to abort suspend, but then
> > still hangs.
>
> i took a quick look at suspend2 and it makes some use of yield().
> There's a bug in CFS's yield code, i've attached a patch that should fix
> it, does it make any difference to the hang?

This patch should apply cleanly against what? The second hunk is ignored as it 
has already been applied. Is this correct?

But no, it does not change anything. Let me know if you have any other patches 
to test.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


CFS and suspend2: hang in atomic copy (was: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS])

2007-04-18 Thread Christian Hesse
Hi Ingo and all,

On Friday 13 April 2007, Ingo Molnar wrote:
> as usual, any sort of feedback, bugreports, fixes and suggestions are
> more than welcome,

I just gave CFS a try on my system. From a user's point of view it looks good 
so far. Thanks for your work.

However I found a problem: When trying to suspend a system patched with 
suspend2 2.2.9.11 it hangs with "doing atomic copy". Pressing the ESC key 
results in a message that it tries to abort suspend, but then still hangs.

I cced suspend2 devel list, perhaps Nigel is interested as well.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse
On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i drive
> this by echoing to /sys/power/state?)

Perhaps you have to install suspend2-userui as well for the output (I'm not
sure whether it works without). Then you can trigger the suspend by echoing
to /sys/power/suspend2/do_suspend.
Useful information can be found in the HOWTO:

http://www.suspend2.net/HOWTO

I dropped some ccs to not abuse Linus and friends.
-- 
Regards,
Chris


signature.asc
Description: This is a digitally signed message part.


Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse
On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> i just tried the same and it suspended+resumed just fine:
>
> Restarting tasks ... done.
> Suspend2 debugging info:
> - Suspend core   : 2.2.9.12
> - Kernel Version : 2.6.21-rc7-CFS-v3
> - Compiler vers. : 4.0
> - Attempt number : 2
> - Parameters : 0 81920 0 0 0 0
> - Overall expected compression percentage: 0.
> - Compressor is 'lzf'.
>   Compressed 31133696 bytes into 14880587 (52 percent compression).
> - SwapAllocator active.
>   Swap available for image: 512036 pages.
> - FileAllocator inactive.
> - I/O speed: Write 76 MB/s, Read 42 MB/s.
> - Extra pages: 18 used/500.
>
> could you send me your .config?

My config is attached.

I now got some error message from my system:

http://www.eworm.de/tmp/cfs-suspend.jpg
-- 
Regards,
Chris
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc7-r1
# Wed Apr 18 22:25:20 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_IKPATCHES=y
CONFIG_IKPATCHES_PROC=y
# CONFIG_CPUSETS is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
# CONFIG_HPET_TIMER is not set
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
CONFIG_X86_LOCAL_APIC=y

Re: Hang at resume with AC adapter not plugged

2005-08-12 Thread Christian Hesse
On Tuesday 09 August 2005 07:45, Nigel Cunningham wrote:
> Hi Christian.
>
> On Tue, 2005-08-09 at 15:41, Christian Hesse wrote:
> > Hi everybody,
> >
> > I have a little problem with software suspend 2.1.9.1[012] on
> > 2.6.13-rc[3456]. The system hangs on resume if the AC adapter is not
> > plugged in. Everything works well if I use 2.1.9.5 on 2.6.12.x or plug in
> > the AC adapter. I've tried acpi-20050729 for 2.6.13-rc6 but that did not
> > change anything. The system is a Samsung X10.
> >
> > Any ideas what could be the problem?
>
> Do you have the ACPI modules compiled in, or built as modules? I'd
> suggest that you try building them as modules and unloading while
> suspending if you're not doing that already.

Sometimes (very seldom) it also hangs if the AC adapter is plugged in, so I 
tested some more and found another interesting fact: It boots just fine if I 
use splash=verbose instead of splash=silent (even without AC adapter). I've 
patched the kernel with fbsplash-0.9.2-r4-2.6.13-rc[16].

Any idea what could be the cause?

-- 
Christian


pgp6RenGRl0SO.pgp
Description: PGP signature


Hang at resume with AC adapter not plugged

2005-08-08 Thread Christian Hesse
Hi everybody,

I have a little problem with software suspend 2.1.9.1[012] on 2.6.13-rc[3456]. 
The system hangs on resume if the AC adapter is not plugged in. Everything 
works well if I use 2.1.9.5 on 2.6.12.x or plug in the AC adapter. I've tried 
acpi-20050729 for 2.6.13-rc6 but that did not change anything. The system is 
a Samsung X10.

Any ideas what could be the problem?
-- 
Regards,
Christian


pgpXWfjUC3Ut5.pgp
Description: PGP signature


Re: 2.6.12-ck4

2005-07-27 Thread Christian Hesse
On Wednesday 27 July 2005 13:11, Con Kolivas wrote:
> HZ-864.diff
> +My take on the never ending config HZ debate. Apart from the number not
> being pleasing on the eyes, a HZ value that isn't a multiple of 10 is
> perfectly valid. Setting HZ to 864 gives us very similar low latency
> performance to a 1000HZ kernel, decreases overhead ever so slightly, and
> minimises clock drift substantially. The -server patch uses HZ=82 for
> similar reasons, with the emphasis on throughput rather than low latency.
> Madness? Probably, but then I can't see any valid argument against using
> these values.

Some time ago I tried HZ=209, but the system then froze after a few
minutes... Any idea what the reason could be? Are only even numbers allowed?
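
For reference, the clock drift argument in the quoted HZ-864.diff text can be
checked with a little arithmetic. The sketch below is only an illustration. It
assumes the tick comes from the PC's PIT with an input clock of 1193182 Hz and
that the timer is programmed with an integer divisor rounded to the nearest
value; the function names are made up for this example. It says nothing about
why HZ=209 froze.

```python
# Assumption: PIT input clock in Hz, as commonly used on x86.
PIT_TICK_RATE = 1193182


def latch(hz):
    """Integer divisor programmed into the timer, rounded to nearest."""
    return (PIT_TICK_RATE + hz // 2) // hz


def drift_ppm(hz):
    """Relative error between requested and achievable tick rate, in ppm."""
    actual_hz = PIT_TICK_RATE / latch(hz)
    return (actual_hz - hz) / hz * 1e6


for hz in (1000, 864, 209, 82):
    print(f"HZ={hz:4d}  divisor={latch(hz):5d}  "
          f"tick={1e6 / hz:8.1f} us  drift={drift_ppm(hz):+8.2f} ppm")
```

Under these assumptions 864, 209 and 82 all end up within about 2 ppm of the
requested rate, while 1000 is off by roughly 150 ppm.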

-- 
Christian


pgplW4wgxiC6H.pgp
Description: PGP signature


Re: list patches in kernel

2005-07-26 Thread Christian Hesse
On Tuesday 26 July 2005 19:49, Brad Tilley wrote:
> Is there an easy way to make a running kernel display how it has been
> patched from vanilla? Probably not, but I thought I'd ask.

I provided a patch some time ago (search google groups for "kernel .patches 
support"), that works like the config.gz in /proc. But the majority didn't 
like it...

-- 
Christian


pgpbyERVMMp2I.pgp
Description: PGP signature


Re: [2.6.12.3] dyntick 050610-1 breaks makes S3 suspend

2005-07-25 Thread Christian Hesse
On Monday 25 July 2005 12:27, Tony Lindgren wrote:
> * Christian Hesse <[EMAIL PROTECTED]> [050723 05:51]:
> > On Saturday 23 July 2005 14:35, Jan De Luyck wrote:
> > > Hello,
> > >
> > > I recently tried out dyntick 050610-1 against 2.6.12.3, works great, it
> > > actually makes a noticeable difference on my laptop's battery life. I
> > > don't have hard numbers, lets just say that instead of the usual ~3
> > > hours i get out of it, i was ~4 before it started nagging, usual use
> > > pattern at work.
> > >
> > > The only gripe I have with it that it stops S3 from working. If the
> > > patch is compiled in the kernel, it makes S3 suspend correctly, but
> > > resuming goes into a solid hang (nothing gets it back alive, have to
> > > keep the powerbutton for ~5 secs to shutdown the system)
> > >
> > > Anything I could test? The logs don't give anything useful..
> >
> > I reported this some time ago [1], but there's no solution so far...
> >
> > [1] http://groups.google.com/groups?selm=4b4NI-7mJ-9%40gated-at.bofh.it
>
> In theory it should not happen... And it's working on my laptop for resume
> just fine with dyntick on. Can you try it without APIC support? Maybe
> that's the difference again. (I don't have APIC on my laptop)

[EMAIL PROTECTED]:~# zcat /proc/config.gz | grep APIC
CONFIG_X86_GOOD_APIC=y
# CONFIG_X86_UP_APIC is not set

Only the second one can be changed in make (menu)config. So I think this is 
what you have?
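
The same check as the zcat/grep line above, as a small Python sketch for
anyone who wants to script it. The function name is made up for this example,
and it assumes the running kernel exposes /proc/config.gz (CONFIG_IKCONFIG_PROC).

```python
import gzip
import os


def config_options(path, needle):
    """Return the lines of a gzip-compressed kernel config containing needle."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in fh if needle in line]


if __name__ == "__main__":
    # Only works if the kernel was built with /proc/config.gz support.
    if os.path.exists("/proc/config.gz"):
        for line in config_options("/proc/config.gz", "APIC"):
            print(line)
```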

> Also a workaround is to disable dyntick before suspend with:
>
> # echo 0 > /sys/devices/system/timer/timer0/dyn_tick_state
>
> and then enable it again after resume.

IIRC, this didn't work; the system hangs at resume as well. I will try again
once you have released an updated version.
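
For anyone who wants to retest the workaround, it could be wrapped roughly like
this. A minimal sketch: the sysfs path is the one from Tony's mail, the value
"1" for re-enabling is an assumption, and the helper name is made up for this
example. Writing to the real file needs root and a dyntick-patched kernel.

```python
from contextlib import contextmanager

# Path as given in the quoted mail.
DYN_TICK_STATE = "/sys/devices/system/timer/timer0/dyn_tick_state"


@contextmanager
def dyntick_disabled(path=DYN_TICK_STATE, off="0", on="1"):
    """Write `off` on entry and `on` on exit ("1" to re-enable is an assumption)."""
    with open(path, "w") as fh:
        fh.write(off + "\n")
    try:
        yield
    finally:
        # Re-enable even if the suspend step raised an exception.
        with open(path, "w") as fh:
            fh.write(on + "\n")
```

The suspend itself would then be triggered inside a "with dyntick_disabled():"
block, by whatever mechanism is in use.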

-- 
Christian


pgpeNQS7NjPD8.pgp
Description: PGP signature


Re: [2.6.12.3] dyntick 050610-1 breaks makes S3 suspend

2005-07-23 Thread Christian Hesse
On Saturday 23 July 2005 14:35, Jan De Luyck wrote:
> Hello,
>
> I recently tried out dyntick 050610-1 against 2.6.12.3, works great, it
> actually makes a noticeable difference on my laptop's battery life. I don't
> have hard numbers, lets just say that instead of the usual ~3 hours i get
> out of it, i was ~4 before it started nagging, usual use pattern at work.
>
> The only gripe I have with it that it stops S3 from working. If the patch
> is compiled in the kernel, it makes S3 suspend correctly, but resuming goes
> into a solid hang (nothing gets it back alive, have to keep the
> powerbutton for ~5 secs to shutdown the system)
>
> Anything I could test? The logs don't give anything useful..

I reported this some time ago [1], but there's no solution so far...

[1] http://groups.google.com/groups?selm=4b4NI-7mJ-9%40gated-at.bofh.it

-- 
Christian


pgpyBhP39QXNl.pgp
Description: PGP signature

