[kvm-devel] kvm-27 host oopses

2007-07-02 Thread Dave Hansen
I had a host running kvm-27 oops on me last week.  The system had been
up for about 2 weeks, and had probably run and stopped at least a couple
hundred kvm guests.  I don't think it is very reproducible, but here it
is anyway.  The host is running 2.6.20.4.  

Here's the actual BUG_ON() that was hit:

static void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc,
                                    size_t size)
{
        void *p;

        BUG_ON(!mc->nobjs);
        p = mc->objects[--mc->nobjs];
        memset(p, 0, size);
        return p;
}

There were several oopses in there, but I think probably only the first
one is of any real interest.

Jun 29 06:40:45 elm3b173 kernel: [982460.818197] [ cut here ]
Jun 29 06:40:45 elm3b173 kernel: [982460.827716] kernel BUG at /home/dave/kvm-27/kernel/mmu.c:276!
Jun 29 06:40:45 elm3b173 kernel: [982460.841230] invalid opcode:  [1] SMP
Jun 29 06:40:45 elm3b173 kernel: [982460.849587] CPU 0
Jun 29 06:40:45 elm3b173 kernel: [982460.853928] Modules linked in: kvm_intel kvm aic94xx
Jun 29 06:40:45 elm3b173 kernel: [982460.864257] Pid: 11430, comm: qemu-system-x86 Not tainted 2.6.20.4 #6
Jun 29 06:40:45 elm3b173 kernel: [982460.877419] RIP: 0010:[_end+124740199/2127394328]  [_end+124740199/2127394328] :kvm:mmu_memory_cache_alloc+0xf/0x40
Jun 29 06:40:45 elm3b173 kernel: [982460.896268] RSP: 0018:810129485548  EFLAGS: 00010246
Jun 29 06:40:45 elm3b173 kernel: [982460.907199] RAX:  RBX: 8101d49cf820 RCX: 0002
Jun 29 06:40:45 elm3b173 kernel: [982460.921831] RDX:  RSI: 0050 RDI: 8101f9ad24e8
Jun 29 06:40:45 elm3b173 kernel: [982460.936469] RBP: 810129485558 R08:  R09:
Jun 29 06:40:45 elm3b173 kernel: [982460.951101] R10: 8100504f4000 R11: 1000 R12: 8101f9ad2128
Jun 29 06:40:45 elm3b173 kernel: [982460.965739] R13:  R14: 8101f9ad1990 R15: 8101f9ad2128
Jun 29 06:40:45 elm3b173 kernel: [982460.980375] FS:  2ba55b51fa00() GS:807d4000() knlGS:
Jun 29 06:40:45 elm3b173 kernel: [982460.996918] CS:  0010 DS: 002b ES: 002b CR0: 80050033
Jun 29 06:40:45 elm3b173 kernel: [982461.008690] CR2: b7f3e884 CR3: 0001d37b CR4: 26e0
Jun 29 06:40:45 elm3b173 kernel: [982461.023339] Process qemu-system-x86 (pid: 11430, threadinfo 810129484000, task 8102172900c0)
Jun 29 06:40:45 elm3b173 kernel: [982461.041972] Stack:  810088fd31e0 8101d49cf820 810129485588 8801f1d8
Jun 29 06:40:45 elm3b173 kernel: [982461.058500]  81019896fefc 8101d49cf820 0322 270d
Jun 29 06:40:45 elm3b173 kernel: [982461.073776]  8101294855c8 8801f5ea  8101f9ad2128
Jun 29 06:40:45 elm3b173 kernel: [982461.088592] Call Trace:
Jun 29 06:40:45 elm3b173 kernel: [982461.094248]  [_end+124741616/2127394328] :kvm:kvm_mmu_alloc_page+0x38/0x110
Jun 29 06:40:45 elm3b173 kernel: [982461.107243]  [_end+124742658/2127394328] :kvm:kvm_mmu_get_page+0xda/0x150
Jun 29 06:40:45 elm3b173 kernel: [982461.119888]  [_end+124744897/2127394328] :kvm:mmu_alloc_roots+0xf9/0x190
Jun 29 06:40:45 elm3b173 kernel: [982461.132360]  [_end+124745400/2127394328] :kvm:paging_new_cr3+0x30/0x60
Jun 29 06:40:45 elm3b173 kernel: [982461.144467]  [_end+124724233/2127394328] :kvm:set_cr3+0xb1/0xd0
Jun 29 06:40:45 elm3b173 kernel: [982461.155369]  [_end+124818018/2127394328] :kvm_intel:handle_cr+0xba/0x1d0
Jun 29 06:40:45 elm3b173 kernel: [982461.167839]  [_end+124819078/2127394328] :kvm_intel:kvm_handle_exit+0x7e/0xb0
Jun 29 06:40:45 elm3b173 kernel: [982461.181163]  [_end+124819740/2127394328] :kvm_intel:vmx_vcpu_run+0x224/0x2d0
Jun 29 06:40:45 elm3b173 kernel: [982461.194314]  [_end+124732211/2127394328] :kvm:kvm_vcpu_ioctl_run+0xfb/0x140
Jun 29 06:40:45 elm3b173 kernel: [982461.207305]  [_end+124736555/2127394328] :kvm:kvm_vcpu_ioctl+0x113/0x420
Jun 29 06:40:45 elm3b173 kernel: [982461.219779]  [__alloc_pages+100/768] __alloc_pages+0x64/0x300
Jun 29 06:40:45 elm3b173 kernel: [982461.231031]  [find_busiest_group+673/1856] find_busiest_group+0x2a1/0x740
Jun 29 06:40:45 elm3b173 kernel: [982461.243319]  [find_next_zero_string+39/128] find_next_zero_string+0x27/0x80
Jun 29 06:40:45 elm3b173 kernel: [982461.255772]  [iommu_range_alloc+40/208] iommu_range_alloc+0x28/0xd0
Jun 29 06:40:45 elm3b173 kernel: [982461.267531]  [iommu_alloc+134/208] iommu_alloc+0x86/0xd0
Jun 29 06:40:45 elm3b173 kernel: [982461.278256]  [calgary_map_single+75/128] calgary_map_single+0x4b/0x80
Jun 29 06:40:45 elm3b173 kernel: [982461.290199]  [tg3_start_xmit_dma_bug+1215/1344] tg3_start_xmit_dma_bug+0x4bf/0x540
Jun 29 06:40:45 elm3b173 kernel: [982461.303175]  [dev_hard_start_xmit+131/256] dev_hard_start_xmit+0x83/0x100
Jun 29 06:40:45 elm3b173 kernel: [982461.315457]  [__qdisc_run+83/480] __qdisc_run+0x53/0x1e0
Jun 29 

Re: [kvm-devel] kvm-27 host oopses

2007-07-02 Thread Luca
On 7/2/07, Dave Hansen [EMAIL PROTECTED] wrote:
> I had a host running kvm-27 oops on me last week.  The system had been
> up for about 2 weeks, and had probably run and stopped at least a couple
> hundred kvm guests.  I don't think it is very reproducible, but here it
> is anyway.  The host is running 2.6.20.4.
>
> Here's the actual BUG_ON() that was hit:
>
> static void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc,
>                                     size_t size)
> {
>         void *p;
>
>         BUG_ON(!mc->nobjs);
>         p = mc->objects[--mc->nobjs];
>         memset(p, 0, size);
>         return p;
> }

MMU working memory was exhausted during a guest context switch. It has
been fixed by:

KVM: Lazy guest cr3 switching
4b82b37a35a085a07d9ed84efee06c69655fd3d1

which is included in KVM-28.

Luca
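
For readers following the BUG_ON(!mc->nobjs) above: the failure mode is a
preallocated object pool being popped when it is already empty. The sketch
below is a user-space model of that pattern only, not the kvm-27 code and
not the fix in the commit named above. The names memory_cache, cache_topup,
cache_alloc, cache_release and the CACHE_CAPACITY value are assumptions
made for illustration; only the nobjs/objects fields mirror the quoted
function. Where the kernel function would hit BUG_ON(), the model returns
NULL so the condition can be observed safely.

```c
#include <stdlib.h>
#include <string.h>

#define CACHE_CAPACITY 4 /* hypothetical size, not the kvm-27 value */

/* User-space stand-in for a preallocated object cache. */
struct memory_cache {
    int nobjs;                      /* objects currently available */
    void *objects[CACHE_CAPACITY];  /* stack of preallocated objects */
};

/*
 * Refill the cache until it holds at least `min` objects. This runs at a
 * point where allocation failure can still be reported to the caller.
 */
static int cache_topup(struct memory_cache *mc, size_t size, int min)
{
    if (min > CACHE_CAPACITY)
        return -1;
    while (mc->nobjs < min) {
        void *obj = malloc(size);
        if (!obj)
            return -1;
        mc->objects[mc->nobjs++] = obj;
    }
    return 0;
}

/*
 * Pop one zeroed object. An empty cache is the condition that
 * BUG_ON(!mc->nobjs) guards against; this model returns NULL instead.
 */
static void *cache_alloc(struct memory_cache *mc, size_t size)
{
    void *p;

    if (mc->nobjs == 0)
        return NULL;
    p = mc->objects[--mc->nobjs];
    memset(p, 0, size);
    return p;
}

/* Free whatever is still sitting in the cache. */
static void cache_release(struct memory_cache *mc)
{
    while (mc->nobjs > 0)
        free(mc->objects[--mc->nobjs]);
}
```

Topping the cache up to two objects and then calling cache_alloc() three
times makes the third call return NULL, which corresponds to a code path
consuming more objects than were reserved for it.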

-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] kvm-27 host oopses

2007-07-02 Thread Avi Kivity
Dave Hansen wrote:
> I had a host running kvm-27 oops on me last week.  The system had been
> up for about 2 weeks, and had probably run and stopped at least a couple
> hundred kvm guests.  I don't think it is very reproducible, but here it
> is anyway.  The host is running 2.6.20.4.
> :kvm:kvm_mmu_alloc_page+0x38/0x110
> Jun 29 06:40:45 elm3b173 kernel: [982461.107243]  [_end+124742658/2127394328] :kvm:kvm_mmu_get_page+0xda/0x150
> Jun 29 06:40:45 elm3b173 kernel: [982461.119888]  [_end+124744897/2127394328] :kvm:mmu_alloc_roots+0xf9/0x190
> Jun 29 06:40:45 elm3b173 kernel: [982461.132360]  [_end+124745400/2127394328] :kvm:paging_new_cr3+0x30/0x60
> Jun 29 06:40:45 elm3b173 kernel: [982461.144467]  [_end+124724233/2127394328] :kvm:set_cr3+0xb1/0xd0
> Jun 29 06:40:45 elm3b173 kernel: [982461.155369]  [_end+124818018/2127394328] :kvm_intel:handle_cr+0xba/0x1d0

That's fixed in kvm-28 (4b82b37a35a085a07d9ed84efee06c69655fd3d1).




Re: [kvm-devel] kvm-27 host oopses

2007-07-02 Thread Dave Hansen
On Mon, 2007-07-02 at 20:58 +0200, Luca wrote:
 
> MMU working memory was exhausted during a guest context switch. It has
> been fixed by:
>
> KVM: Lazy guest cr3 switching
> 4b82b37a35a085a07d9ed84efee06c69655fd3d1
>
> which is included in KVM-28.

OK, I'll give kvm-28 a shot.  Thanks for the help!

-- Dave

