Re: Runtime-modified DIMMs and live migration issue

2015-08-18 Thread Andrey Korolyov
Fixed by cherry-picking
7a72f7a140bfd3a5dae73088947010bfdbcf6a40 and its predecessor
7103f60de8bed21a0ad5d15d2ad5b7a333dda201. Of course this is not a real
fix, as the race precondition is merely shifted or hidden by a clearer
assumption. Though there are not many hotplug users around, I hope
this information will be useful to those who hit the same issue in the
next year or so, until 3.18+ is stable enough for the hypervisor
kernel role. Any suggestions on further debugging or re-exposing the
race are of course very welcome.

CCing kvm@ as this looks like a hypervisor subsystem issue. The
entire discussion can be found at
https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03117.html .
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-07-02 Thread Andrey Korolyov
 But you are very appositely mistaken: copy_huge_page() used to make
 the same mistake, and Dave Hansen fixed it back in v3.13, but the fix
 never went to the stable trees.

 commit 30b0a105d9f7141e4cbf72ae5511832457d89788
 Author: Dave Hansen dave.han...@linux.intel.com
 Date:   Thu Nov 21 14:31:58 2013 -0800

 mm: thp: give transparent hugepage code a separate copy_page

 Right now, the migration code in migrate_page_copy() uses 
 copy_huge_page()
 for hugetlbfs and thp pages:

if (PageHuge(page) || PageTransHuge(page))
 copy_huge_page(newpage, page);

 So, yay for code reuse.  But:

   void copy_huge_page(struct page *dst, struct page *src)
   {
 struct hstate *h = page_hstate(src);

 and a non-hugetlbfs page has no page_hstate().  This works 99% of the
 time because page_hstate() determines the hstate from the page order
 alone.  Since the page order of a THP page matches the default hugetlbfs
 page order, it works.

 But, if you change the default huge page size on the boot command-line
 (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
 so page_hstate() returns null and copy_huge_page() oopses pretty fast
 since copy_huge_page() dereferences the hstate:

   void copy_huge_page(struct page *dst, struct page *src)
   {
 struct hstate *h = page_hstate(src);
 if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
   ...

 Mel noticed that the migration code is really the only user of these
 functions.  This moves all the copy code over to migrate.c and makes
 copy_huge_page() work for THP by checking for it explicitly.

 I believe the bug was introduced in commit b32967ff101a (mm: numa: Add
 THP migration for the NUMA working set scanning fault case)

 [a...@linux-foundation.org: fix coding-style and comment text, per Naoya 
 Horiguchi]
 Signed-off-by: Dave Hansen dave.han...@linux.intel.com
 Acked-by: Mel Gorman mgor...@suse.de
 Reviewed-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com
 Cc: Hillf Danton dhi...@gmail.com
 Cc: Andrea Arcangeli aarca...@redhat.com
 Tested-by: Dave Jiang dave.ji...@intel.com
 Signed-off-by: Andrew Morton a...@linux-foundation.org
 Signed-off-by: Linus Torvalds torva...@linux-foundation.org


 Thanks, the issue is fixed on 3.10 with a trivial patch modification.

Ping? 3.10 is still missing that fix..


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-04-05 Thread Andrey Korolyov
A small update:

the behavior is caused by setting the unrestricted_guest feature to N. I
had this feature disabled everywhere from approximately three years ago,
when its enablement was one of the suspects in host crashes with the
then-contemporary KVM module. Also, nVMX is likely not to work at all,
producing the same traces as in https://lkml.org/lkml/2014/7/17/12,
without unrestricted_guest=1. I think this fact actually explains all
the real-mode weirdness we`ve seen before, and it should probably be
addressed either by putting appropriate notes in a README or in the
module information, by making a strict dependency between
apicv/unrestricted_guest and nested/unrestricted_guest, or by fixing the
issue at its root if that is possible and appropriate. Thanks everyone
for keeping up with ideas throughout this thread!


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-04-01 Thread Andrey Korolyov
On Wed, Apr 1, 2015 at 2:49 PM, Radim Krčmář rkrc...@redhat.com wrote:
 2015-03-31 21:23+0300, Andrey Korolyov:
 On Tue, Mar 31, 2015 at 9:04 PM, Bandan Das b...@redhat.com wrote:
  Bandan Das b...@redhat.com writes:
  Andrey Korolyov and...@xdel.ru writes:
  ...
  http://xdel.ru/downloads/kvm-e5v2-issue/another-tracepoint-fail-with-apicv.dat.gz
 
  Something a bit more interesting, but the mess is happening just
  *after* NMI firing.
 
  What happens if NMI is turned off on the host ?
 
  Sorry, I meant the watchdog..

 Thanks, everything goes well (as it probably should go there):
 http://xdel.ru/downloads/kvm-e5v2-issue/apicv-enabled-nmi-disabled.dat.gz

 Nice revelation!

 KVM doesn't expect host's NMIs to look like this so it doesn't pass them
 to the host.  What was the watchdog that casually sent NMIs?
 (It worked after nmi_watchdog=0 on the host?)

 (Guest's NMI should have a different result as well.  NMI_EXCEPTION is
  an expected exit reason for guest's hard exceptions, they are then
  differentiated by intr_info and nothing hinted that this was a NMI.)

Yes, I disabled the host watchdog at runtime. Indeed, guest-induced NMIs
would look different, and there was no reason for them to be fired at
this stage inside the guest. I`d suspect hypervisor hardware misbehavior
here, but I have very little idea of how APICv behavior (which is
completely microcode- and CPU-dependent but decoupled from peripheral
hardware) may vary at this point; I am using ucode version 1.20140913.1
from Debian, if that matters. I will send the trace suggested by Paolo
in the next couple of hours. It would also be great if hardware folks
from Intel could prove or disprove the above statement (as I was unable
to catch the problem on a 2603v2 so far, the hypothesis has some chance
of being real).


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-04-01 Thread Andrey Korolyov
On Wed, Apr 1, 2015 at 4:19 PM, Paolo Bonzini pbonz...@redhat.com wrote:


 On 01/04/2015 14:26, Andrey Korolyov wrote:
 Yes, I disabled host watchdog during runtime. Indeed guest-induced NMI
 would look different and they had no reasons to be fired at this stage
 inside guest. I`d suspect a hypervisor hardware misbehavior there but
 have a very little idea on how APICv behavior (which is completely
 microcode-dependent and CPU-dependent but decoupled from peripheral
 hardware) may vary at this point, I am using 1.20140913.1 ucode
 version from debian if this can matter. Will send trace suggested by
 Paolo in a next couple of hours. Also it would be awesome to ask
 hardware folks from Intel who can prove or disprove my abovementioned
 statement (as I was unable to catch the problem on 2603v2 so far, this
 hypothesis has some chance to be real).

 Yes, the interaction with the NMI watchdog is unexpected and makes a
 processor erratum somewhat more likely.

 Paolo


http://xdel.ru/downloads/kvm-e5v2-issue/trace-nmi-apicv-fail-at-reboot.dat.gz

Err, no NMI entries near the failure event, though the capture should be correct:
/sys/kernel/debug/tracing/events/kvm*/filter
/sys/kernel/debug/tracing/events/*/kvm*/filter
/sys/kernel/debug/tracing/events/nmi*/filter
/sys/kernel/debug/tracing/events/*/nmi*/filter


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-04-01 Thread Andrey Korolyov
On Wed, Apr 1, 2015 at 6:37 PM, Andrey Korolyov and...@xdel.ru wrote:
 On Wed, Apr 1, 2015 at 4:19 PM, Paolo Bonzini pbonz...@redhat.com wrote:


 On 01/04/2015 14:26, Andrey Korolyov wrote:
 Yes, I disabled host watchdog during runtime. Indeed guest-induced NMI
 would look different and they had no reasons to be fired at this stage
 inside guest. I`d suspect a hypervisor hardware misbehavior there but
 have a very little idea on how APICv behavior (which is completely
 microcode-dependent and CPU-dependent but decoupled from peripheral
 hardware) may vary at this point, I am using 1.20140913.1 ucode
 version from debian if this can matter. Will send trace suggested by
 Paolo in a next couple of hours. Also it would be awesome to ask
 hardware folks from Intel who can prove or disprove my abovementioned
 statement (as I was unable to catch the problem on 2603v2 so far, this
 hypothesis has some chance to be real).

 Yes, the interaction with the NMI watchdog is unexpected and makes a
 processor erratum somewhat more likely.

 Paolo


 http://xdel.ru/downloads/kvm-e5v2-issue/trace-nmi-apicv-fail-at-reboot.dat.gz

 err, no NMI entries nearby failure event, though capture should be correct:
 /sys/kernel/debug/tracing/events/kvm*/filter
 /sys/kernel/debug/tracing/events/*/kvm*/filter
 /sys/kernel/debug/tracing/events/nmi*/filter
 /sys/kernel/debug/tracing/events/*/nmi*/filter

Moved the 2603v2s back, and the issue is still here. In the previous
series of tests on those CPUs in the middle of the month I used the
wrong pattern for the issue, continuously respawning VMs, whereas the
real issue hides in the *first* reboot events after a hypervisor reboot
(or module load). So either it should be reproducible anywhere, or this
is not a hardware issue (or it is related to the mainboard instead of
the CPU itself :) ).


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-04-01 Thread Andrey Korolyov
*putting my tinfoil hat on*

After thinking a little bit more: the observable behavior is quite a
good match for a BIOS-level hypervisor (a hardware trojan, in modern
terminology). It is likely sensitive to timing[1], does not appear more
than once per VM during a boot cycle, seemingly does not care whether
kvm-intel was reloaded once or more, and is not reproducible outside
the domain of a single board model. If nobody has better suggestions to
try, I`ll take a couple of steps in the next few days:
- extract the BIOS with an SPI programmer and compare it to the vendor`s image,
- extract the BMC image and compare it with the public version (should be easy as well),
- try to analyze switch timings by writing sample code for bare
hardware (the hint being that an L2 Linux guest may show a larger
execution-time difference against L1 on a host with a top-level
hypervisor than on a supposedly 'non-infected' one),
- try to analyze the binary BIOS code itself, though that can be VERY
problematic; I am not even talking about the same possibility for the BMC.

Sorry for posting such naive and speculative stuff on a public ML, but
I am really out of clues about what`s happening here and why it is not
reproducible anywhere else.

1. https://xakep.ru/2011/12/26/58104/ (Russian text, but readable
through g-translate without loss of detail)


Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-03-31 Thread Andrey Korolyov
On Sun, Mar 29, 2015 at 3:25 AM, Hugh Dickins hu...@google.com wrote:
 On Sat, 28 Mar 2015, Andrey Korolyov wrote:
 On Tue, Feb 24, 2015 at 3:12 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
  On Wed, Feb 04, 2015 at 08:34:04PM +0400, Andrey Korolyov wrote:
  Hi,
  
  I've seen the problem quite a few times.  Before spending more time on
  it, I'd like to have a quick check here to see if anyone ever saw the
  same problem?  Hope it is a relevant question with this mail list.
  
  
  Jul  2 11:08:21 arno-3 kernel: [ 2165.078623] BUG: unable to handle
  kernel NULL pointer dereference at 0000000000000008
  Jul  2 11:08:21 arno-3 kernel: [ 2165.078916] IP: [8118d0fa]
  copy_huge_page+0x8a/0x2a0
  Jul  2 11:08:21 arno-3 kernel: [ 2165.079128] PGD 0
  Jul  2 11:08:21 arno-3 kernel: [ 2165.079198] Oops: 0000 [#1] SMP
  Jul  2 11:08:21 arno-3 kernel: [ 2165.079319] Modules linked in:
  ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
  iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
  xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp
  iptable_filter ip_tables x_tables kvm_intel kvm bridge stp llc ast ttm
  drm_kms_helper drm sysimgblt sysfillrect syscopyarea lp mei_me ioatdma
  ext2 parport mei shpchp dcdbas joydev mac_hid lpc_ich acpi_pad wmi
  hid_generic usbhid hid ixgbe igb dca i2c_algo_bit ahci ptp libahci
  mdio pps_core
  Jul  2 11:08:21 arno-3 kernel: [ 2165.081090] CPU: 19 PID: 3494 Comm:
  qemu-system-x86 Not tainted 3.11.0-15-generic #25~precise1-Ubuntu
  Jul  2 11:08:21 arno-3 kernel: [ 2165.081424] Hardware name: Dell Inc.
  PowerEdge C6220 II/09N44V, BIOS 2.0.3 07/03/2013
  Jul  2 11:08:21 arno-3 kernel: [ 2165.081705] task: 88102675
  ti: 881026056000 task.ti: 881026056000
  Jul  2 11:08:21 arno-3 kernel: [ 2165.081973] RIP:
  0010:[8118d0fa]  [8118d0fa]
  copy_huge_page+0x8a/0x2a0
 
 
  Hello,
 
  sorry for possible top-posting, the same issue appears on at least
  3.10 LTS series. The original thread is at
  http://marc.info/?l=kvm&m=14043742300901.
 
  Andrey,
 
  I am unable to access the URL above?
 
  The necessary components for failure to reappear are a single running
  kvm guest and mounted large thp: hugepagesz=1G (seemingly the same as
  in initial report). With default 2M pages everything is working well,
  the same for 3.18 with 1G THP. Are there any obvious clues for the
  issue?
 
  Thanks!
 
 

 Hello,

 Marcelo, sorry, I`ve missed your reply in time. The working link, for
 example is http://www.spinics.net/lists/linux-mm/msg75658.html. The
 reproducer is a very simple, you need 1G THP and mounted hugetlbfs.
 What is interesting, if guest is backed by THP like '-object
 memory-backend-file,id=mem,size=1G,mem-path=/hugepages,share=on' the
 failure is less likely to occur.

 I think you're mistaken when you write of 1G THP: although hugetlbfs
 can support 1G hugepages, we don't support that size with Transparent
 Huge Pages.

 But you are very appositely mistaken: copy_huge_page() used to make
 the same mistake, and Dave Hansen fixed it back in v3.13, but the fix
 never went to the stable trees.

 Your report was on an Ubuntu 3.11.0-15 kernel: I think Ubuntu have
 discontinued their 3.11-stable kernel series, but 3.10-longterm and
 3.12-longterm would benefit from including this fix.  I haven't tried
 patching, building, and testing it there, but it looks reasonable.

 Hugh

 commit 30b0a105d9f7141e4cbf72ae5511832457d89788
 Author: Dave Hansen dave.han...@linux.intel.com
 Date:   Thu Nov 21 14:31:58 2013 -0800

 mm: thp: give transparent hugepage code a separate copy_page

 Right now, the migration code in migrate_page_copy() uses copy_huge_page()
 for hugetlbfs and thp pages:

if (PageHuge(page) || PageTransHuge(page))
 copy_huge_page(newpage, page);

 So, yay for code reuse.  But:

   void copy_huge_page(struct page *dst, struct page *src)
   {
 struct hstate *h = page_hstate(src);

 and a non-hugetlbfs page has no page_hstate().  This works 99% of the
 time because page_hstate() determines the hstate from the page order
 alone.  Since the page order of a THP page matches the default hugetlbfs
 page order, it works.

 But, if you change the default huge page size on the boot command-line
 (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
 so page_hstate() returns null and copy_huge_page() oopses pretty fast
 since copy_huge_page() dereferences the hstate:

   void copy_huge_page(struct page *dst, struct page *src)
   {
 struct hstate *h = page_hstate(src);
 if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
   ...

 Mel noticed that the migration code is really the only user of these
 functions.  This moves all the copy code over to migrate.c and makes
 copy_huge_page() work for THP by checking for it explicitly

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-31 Thread Andrey Korolyov
On Tue, Mar 31, 2015 at 4:45 PM, Radim Krčmář rkrc...@redhat.com wrote:
 2015-03-30 22:32+0300, Andrey Korolyov:
 On Mon, Mar 30, 2015 at 9:56 PM, Radim Krčmář rkrc...@redhat.com wrote:
  2015-03-27 13:16+0300, Andrey Korolyov:
  On Fri, Mar 27, 2015 at 12:03 AM, Bandan Das b...@redhat.com wrote:
   Radim Krčmář rkrc...@redhat.com writes:
   I second Bandan -- checking that it reproduces on other machine would 
   be
   great for sanity :)  (Although a bug in our APICv is far more likely.)
  
   If it's APICv related, a run without apicv enabled could give more 
   hints.
  
   Your devices not getting reset hypothesis makes the most sense to me,
   maybe the timer vector in the error message is just one part of
   the whole story. Another misbehaving interrupt from the dark comes in 
   at the
   same time and leads to a double fault.
 
  Default trace (APICv enabled, first reboot introduced the issue):
  http://xdel.ru/downloads/kvm-e5v2-issue/hanged-reboot-apic-on.dat.gz
 
  The relevant part is here,
  prefixed with qemu-system-x86-4180  [002]   697.111550:
 
kvm_exit: reason CR_ACCESS rip 0xd272 info 0 0
kvm_cr:   cr_write 0 = 0x10
kvm_mmu_get_page: existing sp gfn 0 0/4 q0 direct --- !pge !nxe root 
  0 sync
kvm_entry:vcpu 0
kvm_emulate_insn: f:d275: ea 7a d2 00 f0
kvm_emulate_insn: f:d27a: 2e 0f 01 1e f0 6c
kvm_emulate_insn: f:d280: 31 c0
kvm_emulate_insn: f:d282: 8e e0
kvm_emulate_insn: f:d284: 8e e8
kvm_emulate_insn: f:d286: 8e c0
kvm_emulate_insn: f:d288: 8e d8
kvm_emulate_insn: f:d28a: 8e d0
kvm_entry:vcpu 0
kvm_exit: reason EXTERNAL_INTERRUPT rip 0xd28f info 0 
  80f6
kvm_entry:vcpu 0
kvm_exit: reason EPT_VIOLATION rip 0x8dd0 info 184 0
kvm_page_fault:   address f8dd0 error_code 184
kvm_entry:vcpu 0
kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8dd0 info 0 
  80f6
kvm_entry:vcpu 0
kvm_exit: reason EPT_VIOLATION rip 0x76d6 info 184 0
kvm_page_fault:   address f76d6 error_code 184
kvm_entry:vcpu 0
kvm_exit: reason EXTERNAL_INTERRUPT rip 0x76d6 info 0 
  80f6
kvm_entry:vcpu 0
kvm_exit: reason PENDING_INTERRUPT rip 0xd331 info 0 0
kvm_inj_virq: irq 8
kvm_entry:vcpu 0
kvm_exit: reason EXTERNAL_INTERRUPT rip 0xfea5 info 0 
  80f6
kvm_entry:vcpu 0
kvm_exit: reason EPT_VIOLATION rip 0xfea5 info 184 0
kvm_page_fault:   address ffea5 error_code 184
kvm_entry:vcpu 0
kvm_exit: reason EXTERNAL_INTERRUPT rip 0xfea5 info 0 
  80f6
kvm_entry:vcpu 0
kvm_exit: reason EPT_VIOLATION rip 0xe990 info 184 0
kvm_page_fault:   address fe990 error_code 184
kvm_entry:vcpu 0
kvm_exit: reason EXTERNAL_INTERRUPT rip 0xe990 info 0 
  80f6
kvm_entry:vcpu 0
kvm_exit: reason EXCEPTION_NMI rip 0xd334 info 0 8b0d
kvm_userspace_exit:   reason KVM_EXIT_INTERNAL_ERROR (17)
 
  Trace without APICv (three reboots, just to make sure to hit the
  problematic condition of supposed DF, as it still have not one hundred
  percent reproducibility):
  http://xdel.ru/downloads/kvm-e5v2-issue/apic-off.dat.gz
 
  The trace here contains a well matching excerpt, just instead of the
  EXCEPTION_NMI, it does
 
   169.905098: kvm_exit: reason EPT_VIOLATION rip 0xd334 info 
  181 0
   169.905102: kvm_page_fault:   address feffd066 error_code 181
 
  and works.  Page fault says we tried to read 0xfeffd066 -- probably IOPB
  of TSS.  (I guess it is pre-fetch for following IO instruction.)
 
  Nothing strikes me when looking at it, but some APICv boots don't fail,
  so it would be interesting to compare them ... host's 0xf6 interrupt
  (IRQ_WORK_VECTOR) is a possible source of races.  (We could look more
  closely.  It is fired too often for my liking as well.)

 Thanks Radim, 
 http://xdel.ru/downloads/kvm-e5v2-issue/no-fail-with-apicv.dat.gz

 The related bits looks the same as with enable_apicv=0 for me.

 Yeah,

  qemu-system-x86-4201  [007]   159.297337:
   kvm_exit: reason CR_ACCESS rip 0xd272 info 0 0
   kvm_cr:   cr_write 0 = 0x10
   kvm_mmu_get_page: existing sp gfn 0 0/4 q0 direct --- !pge !nxe root 0 
 sync
   kvm_entry:vcpu 0
   kvm_emulate_insn: f:d275: ea 7a d2 00 f0
   kvm_emulate_insn: f:d27a: 2e 0f 01 1e f0 6c
   kvm_emulate_insn: f:d280: 31 c0
   kvm_emulate_insn: f:d282: 8e e0
   kvm_emulate_insn: f:d284: 8e e8
   kvm_emulate_insn: f:d286: 8e c0
   kvm_emulate_insn: f:d288: 8e d8
   kvm_emulate_insn: f:d28a: 8e d0

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-31 Thread Andrey Korolyov
On Tue, Mar 31, 2015 at 7:45 PM, Radim Krčmář rkrc...@redhat.com wrote:
 2015-03-31 17:56+0300, Andrey Korolyov:
  Chasing the culprit this way could take a long time, so a new tracepoint
  that shows if 0xef is set on entry would let us guess the bug faster ...
 
  Please provide a failing trace with the following patch:

 Thanks, please see below:

 http://xdel.ru/downloads/kvm-e5v2-issue/new-tracepoint-fail-with-apicv.dat.gz

  qemu-system-x86-4022  [006]  255.915978:
   kvm_entry:vcpu 0
   kvm_emulate_insn: f:d275: ea 7a d2 00 f0
   kvm_emulate_insn: f:d27a: 2e 0f 01 1e f0 6c
   kvm_emulate_insn: f:d280: 31 c0
   kvm_emulate_insn: f:d282: 8e e0
   kvm_emulate_insn: f:d284: 8e e8
   kvm_emulate_insn: f:d286: 8e c0
   kvm_emulate_insn: f:d288: 8e d8
   kvm_emulate_insn: f:d28a: 8e d0
   kvm_entry:vcpu 0
   kvm_0xef: irr clear, isr clear, vmcs 0x0
   kvm_exit: reason EPT_VIOLATION rip 0x8dd0 info 184 0
   kvm_page_fault:   address f8dd0 error_code 184
   kvm_entry:vcpu 0
   kvm_0xef: irr clear, isr clear, vmcs 0x0
   kvm_exit: reason EPT_VIOLATION rip 0x76d6 info 184 0
   kvm_page_fault:   address f76d6 error_code 184
   kvm_entry:vcpu 0
   kvm_0xef: irr clear, isr clear, vmcs 0x0
   kvm_exit: reason EXCEPTION_NMI rip 0xd331 info 0 8b0d
   kvm_userspace_exit:   reason KVM_EXIT_INTERNAL_ERROR (17)

 Ok, nothing obvious here either ... I've desperately added all
 information I know about.  Please run it again, thanks.

 (The patch has to be applied instead of the previous one.)
 ---
 diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
 index 7c7bc8bef21f..f986636ad9d0 100644
 --- a/arch/x86/kvm/trace.h
 +++ b/arch/x86/kvm/trace.h
 @@ -742,6 +742,41 @@ TRACE_EVENT(kvm_emulate_insn,
  #define trace_kvm_emulate_insn_start(vcpu) trace_kvm_emulate_insn(vcpu, 0)
  #define trace_kvm_emulate_insn_failed(vcpu) trace_kvm_emulate_insn(vcpu, 1)

 +TRACE_EVENT(kvm_0xef,
 +   TP_PROTO(bool irr, bool isr, u32 info, bool on, bool pir, u16 status),
 +   TP_ARGS(irr, isr, info, on, pir, status),
 +
 +   TP_STRUCT__entry(
 +   __field(bool,  irr )
 +   __field(bool,  isr )
 +   __field(u32,   info)
 +   __field(bool,  on  )
 +   __field(bool,  pir )
 +   __field(u8,rvi )
 +   __field(u8,svi )
 +   ),
 +
 +   TP_fast_assign(
 +   __entry->irr  = irr;
 +   __entry->isr  = isr;
 +   __entry->info = info;
 +   __entry->on   = on;
 +   __entry->pir  = pir;
 +   __entry->rvi  = status & 0xff;
 +   __entry->svi  = status >> 8;
 +   ),
 +
 +   TP_printk("irr %s, isr %s, info 0x%x, on %s, pir %s, rvi 0x%x, svi 0x%x",
 + __entry->irr ? "set" : "clear",
 + __entry->isr ? "set" : "clear",
 + __entry->info,
 + __entry->on  ? "set" : "clear",
 + __entry->pir ? "set" : "clear",
 + __entry->rvi,
 + __entry->svi
 +)
 +   );
 +
  TRACE_EVENT(
 vcpu_match_mmio,
 TP_PROTO(gva_t gva, gpa_t gpa, bool write, bool gpa_match),
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index eee63dc33d89..b461edc93d53 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -5047,6 +5047,25 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
 return 1;
  }

 +#define VEC_POS(v) ((v) & (32 - 1))
 +#define REG_POS(v) (((v) >> 5) << 4)
 +static inline int apic_test_vector(int vec, void *bitmap)
 +{
 +   return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
 +}
 +
 +static inline void random_trace(struct kvm_vcpu *vcpu)
 +{
 +   struct vcpu_vmx *vmx = to_vmx(vcpu);
 +
 +   trace_kvm_0xef(apic_test_vector(0xef, vcpu->arch.apic->regs + APIC_IRR),
 +  apic_test_vector(0xef, vcpu->arch.apic->regs + APIC_ISR),
 +  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
 +  test_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.control),
 +  test_bit(0xef, (unsigned long *)vmx->pi_desc.pir),
 +  vmcs_read16(GUEST_INTR_STATUS));
 +}
 +
  static int handle_exception(struct kvm_vcpu *vcpu)
  {
 struct vcpu_vmx *vmx = to_vmx(vcpu);
 @@ -5077,6 +5096,8 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 return 1;
 }

 +   random_trace(vcpu);
 +
 error_code = 0;
 if (intr_info & INTR_INFO_DELIVER_CODE_MASK)
 error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
 @@ -8143,6 +8164,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu 
 *vcpu)
 if (vmx-emulation_required)
 return;

 +   random_trace(vcpu);
 +
 if (vmx

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-31 Thread Andrey Korolyov
On Tue, Mar 31, 2015 at 9:04 PM, Bandan Das b...@redhat.com wrote:
 Bandan Das b...@redhat.com writes:

 Andrey Korolyov and...@xdel.ru writes:
 ...
 http://xdel.ru/downloads/kvm-e5v2-issue/another-tracepoint-fail-with-apicv.dat.gz

 Something a bit more interesting, but the mess is happening just
 *after* NMI firing.

 What happens if NMI is turned off on the host ?

 Sorry, I meant the watchdog..


Thanks, everything goes well (as it probably should go there):
http://xdel.ru/downloads/kvm-e5v2-issue/apicv-enabled-nmi-disabled.dat.gz


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-30 Thread Andrey Korolyov
On Mon, Mar 30, 2015 at 9:56 PM, Radim Krčmář rkrc...@redhat.com wrote:
 2015-03-27 13:16+0300, Andrey Korolyov:
 On Fri, Mar 27, 2015 at 12:03 AM, Bandan Das b...@redhat.com wrote:
  Radim Krčmář rkrc...@redhat.com writes:
  I second Bandan -- checking that it reproduces on other machine would be
  great for sanity :)  (Although a bug in our APICv is far more likely.)
 
  If it's APICv related, a run without apicv enabled could give more hints.
 
  Your devices not getting reset hypothesis makes the most sense to me,
  maybe the timer vector in the error message is just one part of
  the whole story. Another misbehaving interrupt from the dark comes in at 
  the
  same time and leads to a double fault.

 Default trace (APICv enabled, first reboot introduced the issue):
 http://xdel.ru/downloads/kvm-e5v2-issue/hanged-reboot-apic-on.dat.gz

 The relevant part is here,
 prefixed with qemu-system-x86-4180  [002]   697.111550:

   kvm_exit: reason CR_ACCESS rip 0xd272 info 0 0
   kvm_cr:   cr_write 0 = 0x10
   kvm_mmu_get_page: existing sp gfn 0 0/4 q0 direct --- !pge !nxe root 0 
 sync
   kvm_entry:vcpu 0
   kvm_emulate_insn: f:d275: ea 7a d2 00 f0
   kvm_emulate_insn: f:d27a: 2e 0f 01 1e f0 6c
   kvm_emulate_insn: f:d280: 31 c0
   kvm_emulate_insn: f:d282: 8e e0
   kvm_emulate_insn: f:d284: 8e e8
   kvm_emulate_insn: f:d286: 8e c0
   kvm_emulate_insn: f:d288: 8e d8
   kvm_emulate_insn: f:d28a: 8e d0
   kvm_entry:vcpu 0
   kvm_exit: reason EXTERNAL_INTERRUPT rip 0xd28f info 0 80f6
   kvm_entry:vcpu 0
   kvm_exit: reason EPT_VIOLATION rip 0x8dd0 info 184 0
   kvm_page_fault:   address f8dd0 error_code 184
   kvm_entry:vcpu 0
   kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8dd0 info 0 80f6
   kvm_entry:vcpu 0
   kvm_exit: reason EPT_VIOLATION rip 0x76d6 info 184 0
   kvm_page_fault:   address f76d6 error_code 184
   kvm_entry:vcpu 0
   kvm_exit: reason EXTERNAL_INTERRUPT rip 0x76d6 info 0 80f6
   kvm_entry:vcpu 0
   kvm_exit: reason PENDING_INTERRUPT rip 0xd331 info 0 0
   kvm_inj_virq: irq 8
   kvm_entry:vcpu 0
   kvm_exit: reason EXTERNAL_INTERRUPT rip 0xfea5 info 0 80f6
   kvm_entry:vcpu 0
   kvm_exit: reason EPT_VIOLATION rip 0xfea5 info 184 0
   kvm_page_fault:   address ffea5 error_code 184
   kvm_entry:vcpu 0
   kvm_exit: reason EXTERNAL_INTERRUPT rip 0xfea5 info 0 80f6
   kvm_entry:vcpu 0
   kvm_exit: reason EPT_VIOLATION rip 0xe990 info 184 0
   kvm_page_fault:   address fe990 error_code 184
   kvm_entry:vcpu 0
   kvm_exit: reason EXTERNAL_INTERRUPT rip 0xe990 info 0 80f6
   kvm_entry:vcpu 0
   kvm_exit: reason EXCEPTION_NMI rip 0xd334 info 0 8b0d
   kvm_userspace_exit:   reason KVM_EXIT_INTERNAL_ERROR (17)

 Trace without APICv (three reboots, just to make sure to hit the
 problematic condition of supposed DF, as it still have not one hundred
 percent reproducibility):
 http://xdel.ru/downloads/kvm-e5v2-issue/apic-off.dat.gz

 The trace here contains a well matching excerpt, just instead of the
 EXCEPTION_NMI, it does

  169.905098: kvm_exit: reason EPT_VIOLATION rip 0xd334 info 181 0
  169.905102: kvm_page_fault:   address feffd066 error_code 181

 and works.  Page fault says we tried to read 0xfeffd066 -- probably IOPB
 of TSS.  (I guess it is pre-fetch for following IO instruction.)

 Nothing strikes me when looking at it, but some APICv boots don't fail,
 so it would be interesting to compare them ... host's 0xf6 interrupt
 (IRQ_WORK_VECTOR) is a possible source of races.  (We could look more
 closely.  It is fired too often for my liking as well.)


Thanks Radim, http://xdel.ru/downloads/kvm-e5v2-issue/no-fail-with-apicv.dat.gz

(missed right button in mailer previously)

The related bits look the same as with enable_apicv=0 to me.


Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-03-28 Thread Andrey Korolyov
On Tue, Feb 24, 2015 at 3:12 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Wed, Feb 04, 2015 at 08:34:04PM +0400, Andrey Korolyov wrote:
 Hi,
 
 I've seen the problem quite a few times.  Before spending more time on
 it, I'd like to have a quick check here to see if anyone ever saw the
 same problem?  Hope it is a relevant question with this mail list.
 
 
 Jul  2 11:08:21 arno-3 kernel: [ 2165.078623] BUG: unable to handle
 kernel NULL pointer dereference at 0000000000000008
 Jul  2 11:08:21 arno-3 kernel: [ 2165.078916] IP: [8118d0fa]
 copy_huge_page+0x8a/0x2a0
 Jul  2 11:08:21 arno-3 kernel: [ 2165.079128] PGD 0
 Jul  2 11:08:21 arno-3 kernel: [ 2165.079198] Oops: 0000 [#1] SMP
 Jul  2 11:08:21 arno-3 kernel: [ 2165.079319] Modules linked in:
 ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp
 iptable_filter ip_tables x_tables kvm_intel kvm bridge stp llc ast ttm
 drm_kms_helper drm sysimgblt sysfillrect syscopyarea lp mei_me ioatdma
 ext2 parport mei shpchp dcdbas joydev mac_hid lpc_ich acpi_pad wmi
 hid_generic usbhid hid ixgbe igb dca i2c_algo_bit ahci ptp libahci
 mdio pps_core
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081090] CPU: 19 PID: 3494 Comm:
 qemu-system-x86 Not tainted 3.11.0-15-generic #25~precise1-Ubuntu
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081424] Hardware name: Dell Inc.
 PowerEdge C6220 II/09N44V, BIOS 2.0.3 07/03/2013
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081705] task: 88102675
 ti: 881026056000 task.ti: 881026056000
 Jul  2 11:08:21 arno-3 kernel: [ 2165.081973] RIP:
 0010:[8118d0fa]  [8118d0fa]
 copy_huge_page+0x8a/0x2a0


 Hello,

 sorry for possible top-posting, the same issue appears on at least
 3.10 LTS series. The original thread is at
 http://marc.info/?l=kvm&m=14043742300901.

 Andrey,

 I am unable to access the URL above?

 The necessary components for failure to reappear are a single running
 kvm guest and mounted large thp: hugepagesz=1G (seemingly the same as
 in initial report). With default 2M pages everything is working well,
 the same for 3.18 with 1G THP. Are there any obvious clues for the
 issue?

 Thanks!



Hello,

Marcelo, sorry, I missed your reply at the time. A working link, for
example, is http://www.spinics.net/lists/linux-mm/msg75658.html. The
reproducer is very simple: you need 1G THP and a mounted hugetlbfs.
Interestingly, if the guest is backed by THP via '-object
memory-backend-file,id=mem,size=1G,mem-path=/hugepages,share=on', the
failure is less likely to occur.


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-27 Thread Andrey Korolyov
On Fri, Mar 27, 2015 at 12:03 AM, Bandan Das b...@redhat.com wrote:
 Radim Krčmář rkrc...@redhat.com writes:

 2015-03-26 21:24+0300, Andrey Korolyov:
 On Thu, Mar 26, 2015 at 8:40 PM, Radim Krčmář rkrc...@redhat.com wrote:
  2015-03-26 20:08+0300, Andrey Korolyov:
  KVM internal error. Suberror: 2
  extra data[0]: 80ef
  extra data[1]: 8b0d
 
  Btw. does this part ever change?
 
  I see that first report had:
 
KVM internal error. Suberror: 2
extra data[0]: 80d1
extra data[1]: 8b0d
 
  Was that a Windows guest by any chance?

 Yes, exactly, different extra data output was from a Windows VMs.

 Windows uses vector 0xd1 for timer interrupts.

 I second Bandan -- checking that it reproduces on other machine would be
 great for sanity :)  (Although a bug in our APICv is far more likely.)

 If it's APICv related, a run without apicv enabled could give more hints.

 Your devices not getting reset hypothesis makes the most sense to me,
 maybe the timer vector in the error message is just one part of
 the whole story. Another misbehaving interrupt from the dark comes in at the
 same time and leads to a double fault.

Default trace (APICv enabled, first reboot introduced the issue):
http://xdel.ru/downloads/kvm-e5v2-issue/hanged-reboot-apic-on.dat.gz

Trace without APICv (three reboots, just to make sure to hit the
problematic condition of the supposed double fault, as it still does
not reproduce one hundred percent of the time):
http://xdel.ru/downloads/kvm-e5v2-issue/apic-off.dat.gz

It would of course be great to reproduce this somewhere else;
otherwise this whole thread may end in fixing a bug that exists only
on my particular platform. Right now I have no hardware except a lot
of well-known (in terms of existing issues) Supermicro boards of a
single model.


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-27 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 11:40 PM, Radim Krčmář rkrc...@redhat.com wrote:
 2015-03-26 21:24+0300, Andrey Korolyov:
 On Thu, Mar 26, 2015 at 8:40 PM, Radim Krčmář rkrc...@redhat.com wrote:
  2015-03-26 20:08+0300, Andrey Korolyov:
  KVM internal error. Suberror: 2
  extra data[0]: 80ef
  extra data[1]: 8b0d
 
  Btw. does this part ever change?
 
  I see that first report had:
 
KVM internal error. Suberror: 2
extra data[0]: 80d1
extra data[1]: 8b0d
 
  Was that a Windows guest by any chance?

 Yes, exactly, different extra data output was from a Windows VMs.

 Windows uses vector 0xd1 for timer interrupts.

 I second Bandan -- checking that it reproduces on other machine would be
 great for sanity :)  (Although a bug in our APICv is far more likely.)

Trace with new bits:

KVM internal error. Suberror: 2
extra data[0]: 80ef
extra data[1]: 8b0d
extra data[2]: 77b
EAX= EBX= ECX= EDX=
ESI= EDI= EBP= ESP=6d24
EIP=d331 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000 000f  9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT= 000f6cb0 0037
IDT=  03ff
CR0=0010 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
EFER=
Code=66 c3 cd 02 cb cd 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd
19 cb cd 1c cb cd 4a cb fa fc 66 ba 47 d3 0f 00 e9 ad fe f3 90 f0 0f
ba 2d d4 fe fb 3f


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-26 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 5:47 AM, Bandan Das b...@redhat.com wrote:
 Hi Andrey,

 Andrey Korolyov and...@xdel.ru writes:

 On Mon, Mar 16, 2015 at 10:17 PM, Andrey Korolyov and...@xdel.ru wrote:
 For now, it looks like bug have a mixed Murphy-Heisenberg nature, as
 it appearance is very rare (compared to the number of actual launches)
 and most probably bounded to the physical characteristics of my
 production nodes. As soon as I reach any reproducible path for a
 regular workstation environment, I`ll let everyone know. Also I am
 starting to think that issue can belong to the particular motherboard
 firmware revision, despite fact that the CPU microcode is the same
 everywhere.

 I will take the risk and say this - could it be a processor bug ? :)


 Hello everyone, I`ve managed to reproduce this issue
 *deterministically* with latest seabios with smp fix and 3.18.3. The
 error occuring just *once* per vm until hypervisor reboots, at least
 in my setup, this is definitely crazy...

 - launch two VMs (Centos 7 in my case),
 - wait a little while they are booting,
 - attach serial console (I am using virsh list for this exact purpose),
 - issue acpi reboot or reset, does not matter,
 - VM always hangs at boot, most times with sgabios initialization
 string printed out [1], but sometimes it hangs a bit later [2],
 - no matter how many times I try to relaunch the QEMU afterwards, the
 issue does not appear on VM which experienced problem once;
 - trace and sample args can be seen in [3] and [4] respectively.

 My system is a Dell R720 dual socket which has 2620v2s. I tried your
 setup but couldn't reproduce (my qemu cmdline isn't exactly the same
 as yours), although, if you could simplify your command line a bit,
 I can try again.

 Bandan

 1)
 Google, Inc.
 Serial Graphics Adapter 06/11/14
 SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
 (pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
 Term: 211x62
 4 0

 2)
 Google, Inc.
 Serial Graphics Adapter 06/11/14
 SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
 (pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
 Term: 211x62
 4 0
 [...empty screen...]
 SeaBIOS (version 1.8.1-20150325_230423-testnode)
 Machine UUID 3c78721f-7317-4f85-bcbe-f5ad46d293a1


 iPXE (http://ipxe.org) 00:02.0 C100 PCI2.10 PnP PMM+3FF95BA0+3FEF5BA0 C10

 3)

 KVM internal error. Suberror: 2
 extra data[0]: 80ef
 extra data[1]: 8b0d
 EAX= EBX= ECX= EDX=
 ESI= EDI= EBP= ESP=6d2c
 EIP=d331 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =   9300
 CS =f000 000f  9b00
 SS =   9300
 DS =   9300
 FS =   9300
 GS =   9300
 LDT=   8200
 TR =   8b00
 GDT= 000f6cb0 0037
 IDT=  03ff
 CR0=0010 CR2= CR3= CR4=
 DR0= DR1= DR2=
 DR3=
 DR6=0ff0 DR7=0400
 EFER=
 Code=66 c3 cd 02 cb cd 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd
 19 cb cd 1c cb cd 4a cb fa fc 66 ba 47 d3 0f 00 e9 ad fe f3 90 f0 0f
 ba 2d d4 fe fb 3f

 4)
 /usr/bin/qemu-system-x86_64 -name centos71 -S -machine
 pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi -bios
 /usr/share/seabios/bios.bin -m 1024 -realtime mlock=off -smp
 12,sockets=1,cores=12,threads=12 -uuid
 3c78721f-7317-4f85-bcbe-f5ad46d293a1 -nographic -no-user-config
 -nodefaults -device sga -chardev
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos71.monitor,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control -rtc
 base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard
 -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global
 PIIX4_PM.disable_s4=1 -boot strict=on -device
 nec-usb-xhci,id=usb,bus=pci.0,addr=0x3 -device
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
 file=rbd:dev-rack2/centos7-1.raw:id=qemukvm:key=XX:auth_supported=cephx\;none:mon_host=10.6.0.1\:6789\;10.6.0.3\:6789\;10.6.0.4\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native
 -device 
 virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -chardev pty,id=charserial0 -device
 isa-serial,chardev=charserial0,id=serial0 -chardev
 socket,id=charchannel0,path=/var/lib/libvirt/qemu/centos71.sock,server,nowait
 -device 
 virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
 -msg timestamp=on

Hehe, 2.2 works just perfectly but 2.1 doesn't. I'll bisect the issue
in the next couple of days and post the offending commit (though as
far as I remember, none of the commits between 2.1 and 2.2 fixes a
similar issue on purpose). I've attached a reference XML to simplify
playing with libvirt, if anyone is willing to do so.

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-26 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 7:36 PM, Kevin O'Connor ke...@koconnor.net wrote:
 On Thu, Mar 26, 2015 at 04:58:07PM +0100, Radim Krčmář wrote:
 2015-03-25 20:05-0400, Kevin O'Connor:
  On Thu, Mar 26, 2015 at 02:35:58AM +0300, Andrey Korolyov wrote:
   Thanks, strangely the reboot is always failing now and always reaching
   seabios greeting. May be prints straightened up a race (e.g. it is not
   int19 problem really).
  
   object file part:
  
   d331 irq_trampoline_0x19:
   irq_trampoline_0x19():
   /root/seabios-1.8.1/src/romlayout.S:195
   d331:   cd 19   int$0x19
   d333:   cb  lretw
 
  [...]
   Jump to int19 (vector=f000e6f2)
 
  Thanks.  So, it dies on the int $0x19 instruction itself.  The
  vector looks correct and I don't see anything in the cpu register
  state that looks wrong.  Maybe one of the kvm developers will have an
  idea what could cause a fault there.

 The place agrees with the cd 19 cb part of KVM error output.
 Suberror 2 means that we were interrupted while delivering a vector,
 here it is disected: (delivering 'vect_info')

   vect_info (extra data[0]: 80ef)
   - vector 0xef
   - INTR_TYPE_EXT_INTR (0x000)
   - no error code (0x000)
   - valid (0x8000)

   intr_info (extra data[1]: 8b0d)
   - #GP (0x0d)
   - INTR_TYPE_HARD_EXCEPTION (0x300)
   - error code on stack (0x800)  [Hunk at the bottom exposes it.]
   - valid (0x8000)

 Thanks for the background info.

 Notice the 0xef.  My best hypothesis so far is that we fail at resetting
 devices, and 0xef is LOCAL_TIMER_VECTOR from Linux before we rebooted.
 (The bug happens at the first place that enables interrupts.)

 FYI, the int $0x19 isn't the first place SeaBIOS will enable
 interrupts.  Each screen print (every character in the seabios banner
 and uuid string) will call the vga bios (int $0x10) with irqs enabled
 (see output.c:screenc).

 Also, SeaBIOS loads a default vector (f000:ff53) at 0xef which does a
 simple iretw.

 Things that are unusual about the int $0x19 call:
   - it is likely the first place that the cpu is transitioned into
 16bit real mode as opposed to big real mode.  (That is, the
 first place interrupts are enabled with the segment limits set to
 0x.)
   - it's right after the fw/shadow.c:make_bios_readonly() call, which
 attempts to configures the memory at 0xf-0x10 as
 read-only.  That code also issues a wbinvd() call.

 I'm not sure if the crash always happens at the int $0x19 location
 though.  Andrey, does the crash always happen with EIP=d331 and/or
 with Code=... cd 19?

 -Kevin

There are also rare occurrences for d3f9 (in the middle of the ep) and
d334 (less than one tenth of events for both). I'll post a sample event
capture with and without Radim's proposed patch, maybe today or
tomorrow.

/root/seabios-1.8.1/src/romlayout.S:289
d3eb:   66 50   pushl  %eax
d3ed:   66 51   pushl  %ecx
d3ef:   66 52   pushl  %edx
d3f1:   66 53   pushl  %ebx
d3f3:   66 55   pushl  %ebp
d3f5:   66 56   pushl  %esi
d3f7:   66 57   pushl  %edi
d3f9:   06  pushw  %es
d3fa:   1e  pushw  %ds

d334 irq_trampoline_0x1c:
irq_trampoline_0x1c():
/root/seabios-1.8.1/src/romlayout.S:196
d334:   cd 1c   int$0x1c
d336:   cb  lretw


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-26 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 8:06 PM, Kevin O'Connor ke...@koconnor.net wrote:
 On Thu, Mar 26, 2015 at 07:48:09PM +0300, Andrey Korolyov wrote:
 On Thu, Mar 26, 2015 at 7:36 PM, Kevin O'Connor ke...@koconnor.net wrote:
  I'm not sure if the crash always happens at the int $0x19 location
  though.  Andrey, does the crash always happen with EIP=d331 and/or
  with Code=... cd 19?

 There are also rare occurences for d3f9 (in the middle of ep) and d334
 ep (less than one tenth of events for both). I`ll post a sample event
 capture with and without Radim`s proposed patch maybe today or
 tomorrow.

 /root/seabios-1.8.1/src/romlayout.S:289
 d3eb:   66 50   pushl  %eax
 d3ed:   66 51   pushl  %ecx
 d3ef:   66 52   pushl  %edx
 d3f1:   66 53   pushl  %ebx
 d3f3:   66 55   pushl  %ebp
 d3f5:   66 56   pushl  %esi
 d3f7:   66 57   pushl  %edi
 d3f9:   06  pushw  %es
 d3fa:   1e  pushw  %ds

 d334 irq_trampoline_0x1c:
 irq_trampoline_0x1c():
 /root/seabios-1.8.1/src/romlayout.S:196
 d334:   cd 1c   int$0x1c
 d336:   cb  lretw

 Thanks.  The d334 looks very similar to the d331 report (code=cd
 1c).  That path could happen during post (big real mode) or
 immiediately after post (real mode).

 The d3f9 report does not look like the others - interrupts are
 disabled there.  If you still have the error logs, can you post the
 full kvm crash report for d3f9?


Here you go:

KVM internal error. Suberror: 2
extra data[0]: 80ef
extra data[1]: 8b0d
EAX=0003 EBX= ECX= EDX=
ESI= EDI= EBP= ESP=6cd4
EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000 000f  9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT= 000f6e98 0037
IDT=  03ff
CR0=0010 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
EFER=
Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
b8 00 e0 00 00 8e


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-26 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 8:18 PM, Kevin O'Connor ke...@koconnor.net wrote:
 On Thu, Mar 26, 2015 at 08:08:52PM +0300, Andrey Korolyov wrote:
 On Thu, Mar 26, 2015 at 8:06 PM, Kevin O'Connor ke...@koconnor.net wrote:
  On Thu, Mar 26, 2015 at 07:48:09PM +0300, Andrey Korolyov wrote:
  On Thu, Mar 26, 2015 at 7:36 PM, Kevin O'Connor ke...@koconnor.net 
  wrote:
   I'm not sure if the crash always happens at the int $0x19 location
   though.  Andrey, does the crash always happen with EIP=d331 and/or
   with Code=... cd 19?
 
  There are also rare occurences for d3f9 (in the middle of ep) and d334
  ep (less than one tenth of events for both). I`ll post a sample event
  capture with and without Radim`s proposed patch maybe today or
  tomorrow.
 
  /root/seabios-1.8.1/src/romlayout.S:289
  d3eb:   66 50   pushl  %eax
  d3ed:   66 51   pushl  %ecx
  d3ef:   66 52   pushl  %edx
  d3f1:   66 53   pushl  %ebx
  d3f3:   66 55   pushl  %ebp
  d3f5:   66 56   pushl  %esi
  d3f7:   66 57   pushl  %edi
  d3f9:   06  pushw  %es
  d3fa:   1e  pushw  %ds
 
  d334 irq_trampoline_0x1c:
  irq_trampoline_0x1c():
  /root/seabios-1.8.1/src/romlayout.S:196
  d334:   cd 1c   int$0x1c
  d336:   cb  lretw
 
  Thanks.  The d334 looks very similar to the d331 report (code=cd
  1c).  That path could happen during post (big real mode) or
  immiediately after post (real mode).
 
  The d3f9 report does not look like the others - interrupts are
  disabled there.  If you still have the error logs, can you post the
  full kvm crash report for d3f9?
 

 Here you go:

 Thanks.  While we're at, can you verify if all your reports are
 showing the cpu in real mode.  That is, do they all have 
 in the third column of the segment registers - as in:

 ES =   9300


That's positive.

 [...]
 Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
 b8 00 e0 00 00 8e

 KVM reports the code as int $0x10 here.  Was it possible this report
 was from a different build of seabios (that had a different code
 layout)?


Yep, sorry, I've mixed in logs from just before the transition away from 1.7.5.

 Interestingly, this int $0x10 is also in real-mode and not big real
 mode, so I think it would have occurred after post completed.

 -Kevin


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-26 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 8:40 PM, Radim Krčmář rkrc...@redhat.com wrote:
 2015-03-26 20:08+0300, Andrey Korolyov:
 KVM internal error. Suberror: 2
 extra data[0]: 80ef
 extra data[1]: 8b0d

 Btw. does this part ever change?

 I see that first report had:

   KVM internal error. Suberror: 2
   extra data[0]: 80d1
   extra data[1]: 8b0d

 Was that a Windows guest by any chance?

Yes, exactly, the different extra data output was from Windows VMs.
Thanks for clarifying things regarding your patch; I hadn't looked at
the vmx code yet and thought that it changed things.


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-26 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 12:18 PM, Andrey Korolyov and...@xdel.ru wrote:
 On Thu, Mar 26, 2015 at 5:47 AM, Bandan Das b...@redhat.com wrote:
 Hi Andrey,

 Andrey Korolyov and...@xdel.ru writes:

 On Mon, Mar 16, 2015 at 10:17 PM, Andrey Korolyov and...@xdel.ru wrote:
 For now, it looks like bug have a mixed Murphy-Heisenberg nature, as
 it appearance is very rare (compared to the number of actual launches)
 and most probably bounded to the physical characteristics of my
 production nodes. As soon as I reach any reproducible path for a
 regular workstation environment, I`ll let everyone know. Also I am
 starting to think that issue can belong to the particular motherboard
 firmware revision, despite fact that the CPU microcode is the same
 everywhere.

 I will take the risk and say this - could it be a processor bug ? :)


 Hello everyone, I`ve managed to reproduce this issue
 *deterministically* with latest seabios with smp fix and 3.18.3. The
 error occuring just *once* per vm until hypervisor reboots, at least
 in my setup, this is definitely crazy...

 - launch two VMs (Centos 7 in my case),
 - wait a little while they are booting,
 - attach serial console (I am using virsh list for this exact purpose),
 - issue acpi reboot or reset, does not matter,
 - VM always hangs at boot, most times with sgabios initialization
 string printed out [1], but sometimes it hangs a bit later [2],
 - no matter how many times I try to relaunch the QEMU afterwards, the
 issue does not appear on VM which experienced problem once;
 - trace and sample args can be seen in [3] and [4] respectively.

 My system is a Dell R720 dual socket which has 2620v2s. I tried your
 setup but couldn't reproduce (my qemu cmdline isn't exactly the same
 as yours), although, if you could simplify your command line a bit,
 I can try again.

 Bandan

 1)
 Google, Inc.
 Serial Graphics Adapter 06/11/14
 SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
 (pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
 Term: 211x62
 4 0

 2)
 Google, Inc.
 Serial Graphics Adapter 06/11/14
 SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
 (pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
 Term: 211x62
 4 0
 [...empty screen...]
 SeaBIOS (version 1.8.1-20150325_230423-testnode)
 Machine UUID 3c78721f-7317-4f85-bcbe-f5ad46d293a1


 iPXE (http://ipxe.org) 00:02.0 C100 PCI2.10 PnP PMM+3FF95BA0+3FEF5BA0 C10

 3)

 KVM internal error. Suberror: 2
 extra data[0]: 80ef
 extra data[1]: 8b0d
 EAX= EBX= ECX= EDX=
 ESI= EDI= EBP= ESP=6d2c
 EIP=d331 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =   9300
 CS =f000 000f  9b00
 SS =   9300
 DS =   9300
 FS =   9300
 GS =   9300
 LDT=   8200
 TR =   8b00
 GDT= 000f6cb0 0037
 IDT=  03ff
 CR0=0010 CR2= CR3= CR4=
 DR0= DR1= DR2=
 DR3=
 DR6=0ff0 DR7=0400
 EFER=
 Code=66 c3 cd 02 cb cd 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd
 19 cb cd 1c cb cd 4a cb fa fc 66 ba 47 d3 0f 00 e9 ad fe f3 90 f0 0f
 ba 2d d4 fe fb 3f

 4)
 /usr/bin/qemu-system-x86_64 -name centos71 -S -machine
 pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi -bios
 /usr/share/seabios/bios.bin -m 1024 -realtime mlock=off -smp
 12,sockets=1,cores=12,threads=12 -uuid
 3c78721f-7317-4f85-bcbe-f5ad46d293a1 -nographic -no-user-config
 -nodefaults -device sga -chardev
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos71.monitor,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control -rtc
 base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard
 -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global
 PIIX4_PM.disable_s4=1 -boot strict=on -device
 nec-usb-xhci,id=usb,bus=pci.0,addr=0x3 -device
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
 file=rbd:dev-rack2/centos7-1.raw:id=qemukvm:key=XX:auth_supported=cephx\;none:mon_host=10.6.0.1\:6789\;10.6.0.3\:6789\;10.6.0.4\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native
 -device 
 virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -chardev pty,id=charserial0 -device
 isa-serial,chardev=charserial0,id=serial0 -chardev
 socket,id=charchannel0,path=/var/lib/libvirt/qemu/centos71.sock,server,nowait
 -device 
 virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
 -msg timestamp=on

 Hehe, 2.2 works just perfectly but 2.1 isn`t. I`ll bisect the issue in
 a next couple of days and post the right commit (but as can remember
 none of commits b/w 2.1 and 2.2 can fix simular issue by a purpose).
 I`ve attached a reference xml

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-25 Thread Andrey Korolyov
On Mon, Mar 16, 2015 at 10:17 PM, Andrey Korolyov and...@xdel.ru wrote:
 For now, it looks like bug have a mixed Murphy-Heisenberg nature, as
 it appearance is very rare (compared to the number of actual launches)
 and most probably bounded to the physical characteristics of my
 production nodes. As soon as I reach any reproducible path for a
 regular workstation environment, I`ll let everyone know. Also I am
 starting to think that issue can belong to the particular motherboard
 firmware revision, despite fact that the CPU microcode is the same
 everywhere.


Hello everyone, I've managed to reproduce this issue
*deterministically* with the latest SeaBIOS (with the SMP fix) and
3.18.3. The error occurs just *once* per VM until the hypervisor
reboots, at least in my setup; this is definitely crazy...

- launch two VMs (Centos 7 in my case),
- wait a little while they are booting,
- attach serial console (I am using virsh list for this exact purpose),
- issue acpi reboot or reset, does not matter,
- VM always hangs at boot, most times with the sgabios initialization
string printed out [1], but sometimes it hangs a bit later [2],
- no matter how many times I relaunch QEMU afterwards, the issue does
not reappear on a VM which has experienced the problem once;
- trace and sample args can be seen in [3] and [4] respectively.

1)
Google, Inc.
Serial Graphics Adapter 06/11/14
SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
(pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
Term: 211x62
4 0

2)
Google, Inc.
Serial Graphics Adapter 06/11/14
SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
(pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
Term: 211x62
4 0
[...empty screen...]
SeaBIOS (version 1.8.1-20150325_230423-testnode)
Machine UUID 3c78721f-7317-4f85-bcbe-f5ad46d293a1


iPXE (http://ipxe.org) 00:02.0 C100 PCI2.10 PnP PMM+3FF95BA0+3FEF5BA0 C10

3)

KVM internal error. Suberror: 2
extra data[0]: 80ef
extra data[1]: 8b0d
EAX= EBX= ECX= EDX=
ESI= EDI= EBP= ESP=6d2c
EIP=d331 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000 000f  9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT= 000f6cb0 0037
IDT=  03ff
CR0=0010 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
EFER=
Code=66 c3 cd 02 cb cd 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd
19 cb cd 1c cb cd 4a cb fa fc 66 ba 47 d3 0f 00 e9 ad fe f3 90 f0 0f
ba 2d d4 fe fb 3f

4)
/usr/bin/qemu-system-x86_64 -name centos71 -S -machine
pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi -bios
/usr/share/seabios/bios.bin -m 1024 -realtime mlock=off -smp
12,sockets=1,cores=12,threads=12 -uuid
3c78721f-7317-4f85-bcbe-f5ad46d293a1 -nographic -no-user-config
-nodefaults -device sga -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos71.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard
-no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global
PIIX4_PM.disable_s4=1 -boot strict=on -device
nec-usb-xhci,id=usb,bus=pci.0,addr=0x3 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=rbd:dev-rack2/centos7-1.raw:id=qemukvm:key=XX:auth_supported=cephx\;none:mon_host=10.6.0.1\:6789\;10.6.0.3\:6789\;10.6.0.4\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/centos71.sock,server,nowait
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
-msg timestamp=on


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-25 Thread Andrey Korolyov
 - attach serial console (I am using virsh list for this exact purpose),

virsh console of course, sorry


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-25 Thread Andrey Korolyov
On Wed, Mar 25, 2015 at 11:54 PM, Kevin O'Connor ke...@koconnor.net wrote:
 On Wed, Mar 25, 2015 at 11:43:31PM +0300, Andrey Korolyov wrote:
 On Mon, Mar 16, 2015 at 10:17 PM, Andrey Korolyov and...@xdel.ru wrote:
  For now, it looks like bug have a mixed Murphy-Heisenberg nature, as
  it appearance is very rare (compared to the number of actual launches)
  and most probably bounded to the physical characteristics of my
  production nodes. As soon as I reach any reproducible path for a
  regular workstation environment, I`ll let everyone know. Also I am
  starting to think that issue can belong to the particular motherboard
  firmware revision, despite fact that the CPU microcode is the same
  everywhere.


 Hello everyone, I`ve managed to reproduce this issue
 *deterministically* with latest seabios with smp fix and 3.18.3. The
 error occuring just *once* per vm until hypervisor reboots, at least
 in my setup, this is definitely crazy...

 - launch two VMs (Centos 7 in my case),
 - wait a little while they are booting,
 - attach serial console (I am using virsh list for this exact purpose),
 - issue acpi reboot or reset, does not matter,
 - VM always hangs at boot, most times with sgabios initialization
 string printed out [1], but sometimes it hangs a bit later [2],
 - no matter how many times I try to relaunch the QEMU afterwards, the
 issue does not appear on VM which experienced problem once;
 - trace and sample args can be seen in [3] and [4] respectively.

 Can you add something like:

   -chardev file,path=seabioslog.`date +%s`,id=seabios -device 
 isa-debugcon,iobase=0x402,chardev=seabios

 to the qemu command line and forward the resulting log from both a
 succesful boot and a failed one?

 -Kevin

Of course, logs are attached.


reboot.failed
Description: Binary data


reboot.succeeded
Description: Binary data


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-25 Thread Andrey Korolyov
On Thu, Mar 26, 2015 at 2:02 AM, Kevin O'Connor ke...@koconnor.net wrote:
 On Thu, Mar 26, 2015 at 01:31:11AM +0300, Andrey Korolyov wrote:
 On Wed, Mar 25, 2015 at 11:54 PM, Kevin O'Connor ke...@koconnor.net wrote:
 
  Can you add something like:
 
-chardev file,path=seabioslog.`date +%s`,id=seabios -device 
  isa-debugcon,iobase=0x402,chardev=seabios
 
  to the qemu command line and forward the resulting log from both a
  succesful boot and a failed one?
 
  -Kevin

 Of course, logs are attached.

 Thanks.  From a diff of the two logs:

  4: 3ffe - 4000 = 2 RESERVED
  5: feffc000 - ff00 = 2 RESERVED
  6: fffc - 0001 = 2 RESERVED
   -enter handle_19:
   -  NULL
   -Booting from Hard Disk...
   -Booting from :7c00

 So, it got most of the way through the reboot - there's only a few
 function calls between the e820 map being dumped and the handle_19
 call.  The fault also seems to show it stopped in the BIOS in 16bit
 mode:

 EIP=d331 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =   9300
 CS =f000 000f  9b00

 Can you add the patch below, force the fault, and forward the log.

 Also, if you recreate the failure can you take the EIP from the fault
 (eg, d331) and search for the corresponding function in the output of:
   objdump -m i386 -M i8086 -M suffix -ldr out/rom16.o | less
 (That is, search for d331:.)  If that's too much of a pain, just
 send me a direct email with the seabios out/rom16.o file and the new
 EIP of the fault.  (I need the out/rom16.o that was used to build the
 version of SeaBIOS that faulted.)

 -Kevin


 diff --git a/src/post.c b/src/post.c
 index 9ea5620..bbd19c0 100644
 --- a/src/post.c
 +++ b/src/post.c
 @@ -185,21 +185,24 @@ prepareboot(void)
  pmm_prepboot();
  malloc_prepboot();
  memmap_prepboot();
 +dprintf(1, a\n);

  HaveRunPost = 2;

  // Setup bios checksum.
  BiosChecksum -= checksum((u8*)BUILD_BIOS_ADDR, BUILD_BIOS_SIZE);
 +dprintf(1, b\n);
  }

  // Begin the boot process by invoking an int0x19 in 16bit mode.
  void VISIBLE32FLAT
  startBoot(void)
  {
 +dprintf(1, e\n);
  // Clear low-memory allocations (required by PMM spec).
  memset((void*)BUILD_STACK_ADDR, 0, BUILD_EBDA_MINIMUM - 
 BUILD_STACK_ADDR);

 -dprintf(3, Jump to int19\n);
 +dprintf(1, Jump to int19 (vector=%x)\n, GET_IVT(0x19).segoff);
  struct bregs br;
  memset(br, 0, sizeof(br));
  br.flags = F_IF;
 @@ -239,9 +242,11 @@ maininit(void)
  // Prepare for boot.
  prepareboot();

 +dprintf(1, c\n);
  // Write protect bios memory.
  make_bios_readonly();

 +dprintf(1, "d\n");
  // Invoke int 19 to start boot process.
  startBoot();
  }

Thanks; strangely, the reboot is always failing now and always reaching the
SeaBIOS greeting. Maybe the prints straightened out a race (i.e. it is not
really an int19 problem).

object file part:

d331 irq_trampoline_0x19:
irq_trampoline_0x19():
/root/seabios-1.8.1/src/romlayout.S:195
d331:   cd 19   int$0x19
d333:   cb  lretw


reboot.failed
Description: Binary data


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-16 Thread Andrey Korolyov
For now, it looks like the bug has a mixed Murphy-Heisenberg nature, as
its appearance is very rare (compared to the number of actual launches)
and is most probably bound to the physical characteristics of my
production nodes. As soon as I find a reproducible path on a regular
workstation environment, I'll let everyone know. Also, I am starting to
think that the issue may belong to a particular motherboard firmware
revision, despite the fact that the CPU microcode is the same
everywhere.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-12 Thread Andrey Korolyov
On Wed, Mar 11, 2015 at 10:59 PM, Dr. David Alan Gilbert
dgilb...@redhat.com wrote:
 * Andrey Korolyov (and...@xdel.ru) wrote:
 On Wed, Mar 11, 2015 at 10:33 PM, Dr. David Alan Gilbert
 dgilb...@redhat.com wrote:
  * Kevin O'Connor (ke...@koconnor.net) wrote:
  On Wed, Mar 11, 2015 at 02:45:31PM -0400, Kevin O'Connor wrote:
   On Wed, Mar 11, 2015 at 02:40:39PM -0400, Kevin O'Connor wrote:
For what it's worth, I can't seem to trigger the problem if I move the
cmos read above the SIPI/LAPIC code (see patch below).
  
   Ugh!
  
   That's a seabios bug.  Main processor modifies the rtc index
   (rtc_read()) while APs try to clear the NMI bit by modifying the rtc
   index (romlayout.S:transition32).
  
   I'll put together a fix.
 
  The seabios patch below resolves the issue for me.
 
  Thanks! Looks good here.
 
  Andrey, Paolo, Bandan: Does it fix it for you as well?
 

 Thanks Kevin, Dave,

 I`m afraid that I`m hitting something different not only because
 different suberror code but also because of mine version of seabios -
 I am using 1.7.5 and corresponding code in the proposed patch looks
 different - there is no smp-related code patch is about of. Those
 mentioned devices went to production successfully and I`m afraid I
 cannot afford playing on them anymore, even if I re-trigger the issue
 with patched 1.8.1-rc, there is no way to switch to a different kernel
 and retest due to specific conditions of this production suite. I`ve
 ordered a pair of new shoes^W 2620v2-s which should arrive to me next

 Well I was testing on a pair of 'E5-2620 v2'; but as you saw my test case
 was pretty simple.  If you can suggest any flags I should add etc to the
 test I'd be happy to give it a go.

 Dave

Here is my launch string:

qemu-system-x86_64 -enable-kvm -name vmtest -S -machine
pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi -m 512
-realtime mlock=off -smp 12,sockets=1,cores=12,threads=12 -numa
node,nodeid=0,cpus=0-11,mem=512 -nographic -no-user-config -nodefaults
-device sga -rtc base=utc,driftfix=slew -global
kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global
PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on
-device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -m
512,slots=31,maxmem=16384M -object
memory-backend-ram,id=mem0,size=512M -device
pc-dimm,id=dimm0,node=0,memdev=mem0

I omitted the disk backend in this example, but there is a chance that my
problem is not reproducible without some calls made explicitly by a
bootloader (not sure what to say about mid-runtime failures).


 Monday, so I`ll be able to test a) against 1.8.0-release, b) against
 patched bios code, c) reproduce initial error on master/3.19 (may be
 I`ll take them before weekend by going into this computer shop in
 person). Until then, I have a very deep feeling that mine issue is not
 there :) Also I became very curious on how a lack of IDT feature may
 completely eliminate the issue appearance for me, the only possible
 explanation is a clock-related race which is kinda stupid suggestion
 and unlikely to exist in nature.

 Thanks again for everyone for throughout testing and ideas!

 
  -Kevin
 
 
  --- a/src/romlayout.S
  +++ b/src/romlayout.S
  @@ -22,7 +22,8 @@
   // %edx = return location (in 32bit mode)
   // Clobbers: ecx, flags, segment registers, cr0, idt/gdt
   DECLFUNC transition32
  -transition32_for_smi:
  +transition32_nmi_off:
  +// transition32 when NMI and A20 are already initialized
   movl %eax, %ecx
   jmp 1f
   transition32:
  @@ -205,7 +206,7 @@ __farcall16:
   entry_smi:
   // Transition to 32bit mode.
   movl $1f + BUILD_BIOS_ADDR, %edx
  -jmp transition32_for_smi
  +jmp transition32_nmi_off
   .code32
   1:  movl $BUILD_SMM_ADDR + 0x8000, %esp
   calll _cfunc32flat_handle_smi - BUILD_BIOS_ADDR
  @@ -216,8 +217,10 @@ entry_smi:
   DECLFUNC entry_smp
   entry_smp:
   // Transition to 32bit mode.
  +cli
  +cld
   movl $2f + BUILD_BIOS_ADDR, %edx
  -jmp transition32
  +jmp transition32_nmi_off
   .code32
   // Acquire lock and take ownership of shared stack
   1:  rep ; nop
  --
  Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
 --
 Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-12 Thread Andrey Korolyov
On Wed, Mar 11, 2015 at 10:33 PM, Dr. David Alan Gilbert
dgilb...@redhat.com wrote:
 * Kevin O'Connor (ke...@koconnor.net) wrote:
 On Wed, Mar 11, 2015 at 02:45:31PM -0400, Kevin O'Connor wrote:
  On Wed, Mar 11, 2015 at 02:40:39PM -0400, Kevin O'Connor wrote:
   For what it's worth, I can't seem to trigger the problem if I move the
   cmos read above the SIPI/LAPIC code (see patch below).
 
  Ugh!
 
  That's a seabios bug.  Main processor modifies the rtc index
  (rtc_read()) while APs try to clear the NMI bit by modifying the rtc
  index (romlayout.S:transition32).
 
  I'll put together a fix.

 The seabios patch below resolves the issue for me.

 Thanks! Looks good here.

 Andrey, Paolo, Bandan: Does it fix it for you as well?


Thanks Kevin, Dave,

I'm afraid that I'm hitting something different, not only because of a
different suberror code but also because of my version of SeaBIOS -
I am using 1.7.5, and the corresponding code in the proposed patch looks
different: there is none of the SMP-related code the patch touches. The
mentioned devices went to production successfully, and I'm afraid I
cannot afford experimenting on them anymore; even if I re-trigger the issue
with a patched 1.8.1-rc, there is no way to switch to a different kernel
and retest due to the specific conditions of this production suite. I've
ordered a pair of new shoes^W 2620v2-s which should arrive to me next
Monday, so I'll be able to test a) against the 1.8.0 release, b) against
the patched BIOS code, and c) reproducing the initial error on master/3.19
(maybe I'll get them before the weekend by going to the computer shop in
person). Until then, I have a very deep feeling that my issue is not
there :) I also became very curious how a lack of the IDT feature can
completely eliminate the issue for me; the only possible
explanation is a clock-related race, which is a kinda stupid suggestion
and unlikely to exist in nature.

Thanks again to everyone for the thorough testing and ideas!


 -Kevin


 --- a/src/romlayout.S
 +++ b/src/romlayout.S
 @@ -22,7 +22,8 @@
  // %edx = return location (in 32bit mode)
  // Clobbers: ecx, flags, segment registers, cr0, idt/gdt
  DECLFUNC transition32
 -transition32_for_smi:
 +transition32_nmi_off:
 +// transition32 when NMI and A20 are already initialized
  movl %eax, %ecx
  jmp 1f
  transition32:
 @@ -205,7 +206,7 @@ __farcall16:
  entry_smi:
  // Transition to 32bit mode.
  movl $1f + BUILD_BIOS_ADDR, %edx
 -jmp transition32_for_smi
 +jmp transition32_nmi_off
  .code32
  1:  movl $BUILD_SMM_ADDR + 0x8000, %esp
  calll _cfunc32flat_handle_smi - BUILD_BIOS_ADDR
 @@ -216,8 +217,10 @@ entry_smi:
  DECLFUNC entry_smp
  entry_smp:
  // Transition to 32bit mode.
 +cli
 +cld
  movl $2f + BUILD_BIOS_ADDR, %edx
 -jmp transition32
 +jmp transition32_nmi_off
  .code32
  // Acquire lock and take ownership of shared stack
  1:  rep ; nop
 --
 Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-12 Thread Andrey Korolyov
On Thu, Mar 12, 2015 at 12:59 PM, Dr. David Alan Gilbert
dgilb...@redhat.com wrote:
 * Andrey Korolyov (and...@xdel.ru) wrote:
 On Wed, Mar 11, 2015 at 10:59 PM, Dr. David Alan Gilbert
 dgilb...@redhat.com wrote:
  * Andrey Korolyov (and...@xdel.ru) wrote:
  On Wed, Mar 11, 2015 at 10:33 PM, Dr. David Alan Gilbert
  dgilb...@redhat.com wrote:
   * Kevin O'Connor (ke...@koconnor.net) wrote:
   On Wed, Mar 11, 2015 at 02:45:31PM -0400, Kevin O'Connor wrote:
On Wed, Mar 11, 2015 at 02:40:39PM -0400, Kevin O'Connor wrote:
 For what it's worth, I can't seem to trigger the problem if I move 
 the
 cmos read above the SIPI/LAPIC code (see patch below).
   
Ugh!
   
That's a seabios bug.  Main processor modifies the rtc index
(rtc_read()) while APs try to clear the NMI bit by modifying the rtc
index (romlayout.S:transition32).
   
I'll put together a fix.
  
   The seabios patch below resolves the issue for me.
  
   Thanks! Looks good here.
  
   Andrey, Paolo, Bandan: Does it fix it for you as well?
  
 
  Thanks Kevin, Dave,
 
  I`m afraid that I`m hitting something different not only because
  different suberror code but also because of mine version of seabios -
  I am using 1.7.5 and corresponding code in the proposed patch looks
  different - there is no smp-related code patch is about of. Those
  mentioned devices went to production successfully and I`m afraid I
  cannot afford playing on them anymore, even if I re-trigger the issue
  with patched 1.8.1-rc, there is no way to switch to a different kernel
  and retest due to specific conditions of this production suite. I`ve
  ordered a pair of new shoes^W 2620v2-s which should arrive to me next
 
  Well I was testing on a pair of 'E5-2620 v2'; but as you saw my test case
  was pretty simple.  If you can suggest any flags I should add etc to the
  test I'd be happy to give it a go.
 
  Dave

 Here is mine launch string:

 qemu-system-x86_64 -enable-kvm -name vmtest -S -machine
 pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi -m 512
 -realtime mlock=off -smp 12,sockets=1,cores=12,threads=12 -numa
 node,nodeid=0,cpus=0-11,mem=512 -nographic -no-user-config -nodefaults
 -device sga -rtc base=utc,driftfix=slew -global
 kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global
 PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on
 -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 -device
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -m
 512,slots=31,maxmem=16384M -object
 memory-backend-ram,id=mem0,size=512M -device
 pc-dimm,id=dimm0,node=0,memdev=mem0

 I omitted disk backend in this example, but there is a chance that my
 problem is not reproducible without some calls made explicitly by a
 bootloader (not sure what to say for mid-runtime failures).

 It seems to survive OK:

Thanks David, I'll go through the test sequence and report. Unfortunately
my orchestration does not have even hundred-millisecond precision
for libvirt events, so I can't tell whether the immediate start-up failures
happened before bootloader execution or during it; all I have for
those is a less-than-two-second interval between the actual issue of a
launch command and the paused-state event. QEMU logging also does not give
me timestamps for emulation errors, even with the appropriate timestamp
arg.


 while true; do (sleep 1; echo -e '\001cc\n'; sleep 5; echo -e 'q\n') | \
 /opt/qemu-try-world3/bin/qemu-system-x86_64 -enable-kvm -name vmtest \
 -S -machine pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi \
 -m 512 -realtime mlock=off -smp 12,sockets=1,cores=12,threads=12 -numa \
 node,nodeid=0,cpus=0-11,mem=512 -nographic -no-user-config -device sga -rtc \
 base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet \
 -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 \
 -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 -device \
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -m \
 512,slots=31,maxmem=16384M -object memory-backend-ram,id=mem0,size=512M \
 -device pc-dimm,id=dimm0,node=0,memdev=mem0 < ~/pi.vfd 2>&1 | tee \
 /tmp/qemu.op; grep -q "internal error" /tmp/qemu.op && break; done

 Dave


 
  Monday, so I`ll be able to test a) against 1.8.0-release, b) against
  patched bios code, c) reproduce initial error on master/3.19 (may be
  I`ll take them before weekend by going into this computer shop in
  person). Until then, I have a very deep feeling that mine issue is not
  there :) Also I became very curious on how a lack of IDT feature may
  completely eliminate the issue appearance for me, the only possible
  explanation is a clock-related race which is kinda stupid suggestion
  and unlikely to exist in nature.
 
  Thanks again for everyone for throughout testing and ideas!
 
  
   -Kevin
  
  
   --- a/src/romlayout.S
   +++ b/src/romlayout.S
   @@ -22,7 +22,8 @@
// %edx = return location (in 32bit mode)
// Clobbers: ecx, flags, segment registers, cr0, idt/gdt

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-10 Thread Andrey Korolyov
On Sat, Mar 7, 2015 at 3:00 AM, Andrey Korolyov and...@xdel.ru wrote:
 On Fri, Mar 6, 2015 at 7:57 PM, Bandan Das b...@redhat.com wrote:
 Andrey Korolyov and...@xdel.ru writes:

 On Fri, Mar 6, 2015 at 1:14 AM, Andrey Korolyov and...@xdel.ru wrote:
 Hello,

 recently I`ve got a couple of shiny new Intel 2620v2s for future
 replacement of the E5-2620v1, but I experienced relatively many events
 with emulation errors, all traces looks simular to the one below. I am
 running qemu-2.1 on x86 on top of 3.10 branch for testing purposes but
 can switch to some other versions if necessary. Most of crashes
 happened during reboot cycle or at the end of ACPI-based shutdown
 action, if this can help. I have zero clues of what can introduce such
 a mess inside same processor family using identical software, as
 2620v1 has no simular problem ever. Please let me know if there can be
 some side measures for making entire story more clear.

 Thanks!

 KVM internal error. Suberror: 2
 extra data[0]: 80d1
 extra data[1]: 8b0d
 EAX=0003 EBX= ECX= EDX=
 ESI= EDI= EBP= ESP=6cd4
 EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =   9300
 CS =f000 000f  9b00
 SS =   9300
 DS =   9300
 FS =   9300
 GS =   9300
 LDT=   8200
 TR =   8b00
 GDT= 000f6e98 0037
 IDT=  03ff
 CR0=0010 CR2= CR3= CR4=
 DR0= DR1= DR2=
 DR3=
 DR6=0ff0 DR7=0400
 EFER=
 Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
 b8 00 e0 00 00 8e


 It turns out that those errors are introduced by APICv, which gets
 enabled due to different feature set. If anyone is interested in
 reproducing/fixing this exactly on 3.10, it takes about one hundred of
 migrations/power state changes for an issue to appear, guest OS can be
 Linux or Win.

 Are you able to reproduce this on a more recent upstream kernel as well ?

 Bandan

 I`ll go through test cycle with 3.18 and 2603v2 around tomorrow and
 follow up with any reproduceable results.

Heh.. the issue is not triggered on the 2603v2 at all, at least I am not
able to hit it. The only difference from the 2620v2, apart from the lower
frequency, is the Intel Dynamic Acceleration feature. I'd appreciate any
testing with higher CPU models with the same or a richer feature set. The
testing itself can be done on both the generic 3.10 and RH7 kernels, as
both of them exhibit this issue. I conducted all tests with C-states
disabled, so I advise doing the same as a first reproduction step.

Thanks!

model name  : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
stepping: 4
microcode   : 0x416
cpu MHz : 2100.039
cache size  : 15360 KB
siblings: 12
apicid  : 43
initial apicid  : 43
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c
rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi
flexpriority ept vpid fsgsbase smep erms


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-10 Thread Andrey Korolyov
On Tue, Mar 10, 2015 at 7:57 PM, Dr. David Alan Gilbert
dgilb...@redhat.com wrote:
 * Andrey Korolyov (and...@xdel.ru) wrote:
 On Sat, Mar 7, 2015 at 3:00 AM, Andrey Korolyov and...@xdel.ru wrote:
  On Fri, Mar 6, 2015 at 7:57 PM, Bandan Das b...@redhat.com wrote:
  Andrey Korolyov and...@xdel.ru writes:
 
  On Fri, Mar 6, 2015 at 1:14 AM, Andrey Korolyov and...@xdel.ru wrote:
  Hello,
 
  recently I`ve got a couple of shiny new Intel 2620v2s for future
  replacement of the E5-2620v1, but I experienced relatively many events
  with emulation errors, all traces looks simular to the one below. I am
  running qemu-2.1 on x86 on top of 3.10 branch for testing purposes but
  can switch to some other versions if necessary. Most of crashes
  happened during reboot cycle or at the end of ACPI-based shutdown
  action, if this can help. I have zero clues of what can introduce such
  a mess inside same processor family using identical software, as
  2620v1 has no simular problem ever. Please let me know if there can be
  some side measures for making entire story more clear.
 
  Thanks!
 
  KVM internal error. Suberror: 2
  extra data[0]: 80d1
  extra data[1]: 8b0d
  EAX=0003 EBX= ECX= EDX=
  ESI= EDI= EBP= ESP=6cd4
  EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =   9300
  CS =f000 000f  9b00
  SS =   9300
  DS =   9300
  FS =   9300
  GS =   9300
  LDT=   8200
  TR =   8b00
  GDT= 000f6e98 0037
  IDT=  03ff
  CR0=0010 CR2= CR3= CR4=
  DR0= DR1= DR2=
  DR3=
  DR6=0ff0 DR7=0400
  EFER=
  Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
  10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
  b8 00 e0 00 00 8e
 
 
  It turns out that those errors are introduced by APICv, which gets
  enabled due to different feature set. If anyone is interested in
  reproducing/fixing this exactly on 3.10, it takes about one hundred of
  migrations/power state changes for an issue to appear, guest OS can be
  Linux or Win.
 
  Are you able to reproduce this on a more recent upstream kernel as well ?
 
  Bandan
 
  I`ll go through test cycle with 3.18 and 2603v2 around tomorrow and
  follow up with any reproduceable results.

 Heh.. issue is not triggered on 2603v2 at all, at least I am not able
 to hit this. The only difference with 2620v2 except lower frequency is
 an Intel Dynamic Acceleration feature. I`d appreciate any testing with
 higher CPU models with same or richer feature set. The testing itself
 can be done on both generic 3.10 or RH7 kernels, as both of them are
 experiencing this issue. I conducted all tests with disabled cstates
 so I advise to do the same for a first reproduction step.

 Thanks!

 model name  : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
 stepping: 4
 microcode   : 0x416
 cpu MHz : 2100.039
 cache size  : 15360 KB
 siblings: 12
 apicid  : 43
 initial apicid  : 43
 fpu : yes
 fpu_exception   : yes
 cpuid level : 13
 wp  : yes
 flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
 mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
 syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
 rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
 dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c
 rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi
 flexpriority ept vpid fsgsbase smep erms

 I'm seeing something similar; it's very intermittent and generally
 happening right at boot of the guest;   I'm running this on qemu
 head+my postcopy world (but it's happening right at boot before postcopy
 gets a chance), and I'm using a 3.19ish kernel. Xeon E5-2407 in my case
 but hey maybe I'm seeing a different bug.

 Dave

Yep, it looks like we are hitting the same bug - two thirds of my failure
events occurred during the boot/reboot cycle and approximately one third
happened in the middle of runtime. Which CPU, v0 or v2, are you using
(in other words, is APICv enabled)?


Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-10 Thread Andrey Korolyov
On Tue, Mar 10, 2015 at 9:16 PM, Dr. David Alan Gilbert
dgilb...@redhat.com wrote:
 * Andrey Korolyov (and...@xdel.ru) wrote:
 On Tue, Mar 10, 2015 at 7:57 PM, Dr. David Alan Gilbert
 dgilb...@redhat.com wrote:
  * Andrey Korolyov (and...@xdel.ru) wrote:
  On Sat, Mar 7, 2015 at 3:00 AM, Andrey Korolyov and...@xdel.ru wrote:
   On Fri, Mar 6, 2015 at 7:57 PM, Bandan Das b...@redhat.com wrote:
   Andrey Korolyov and...@xdel.ru writes:
  
   On Fri, Mar 6, 2015 at 1:14 AM, Andrey Korolyov and...@xdel.ru 
   wrote:
   Hello,
  
   recently I`ve got a couple of shiny new Intel 2620v2s for future
   replacement of the E5-2620v1, but I experienced relatively many 
   events
   with emulation errors, all traces looks simular to the one below. I 
   am
   running qemu-2.1 on x86 on top of 3.10 branch for testing purposes 
   but
   can switch to some other versions if necessary. Most of crashes
   happened during reboot cycle or at the end of ACPI-based shutdown
   action, if this can help. I have zero clues of what can introduce 
   such
   a mess inside same processor family using identical software, as
   2620v1 has no simular problem ever. Please let me know if there can 
   be
   some side measures for making entire story more clear.
  
   Thanks!
  
   KVM internal error. Suberror: 2
   extra data[0]: 80d1
   extra data[1]: 8b0d
   EAX=0003 EBX= ECX= EDX=
   ESI= EDI= EBP= ESP=6cd4
   EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
   ES =   9300
   CS =f000 000f  9b00
   SS =   9300
   DS =   9300
   FS =   9300
   GS =   9300
   LDT=   8200
   TR =   8b00
   GDT= 000f6e98 0037
   IDT=  03ff
   CR0=0010 CR2= CR3= CR4=
   DR0= DR1= DR2=
   DR3=
   DR6=0ff0 DR7=0400
   EFER=
   Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
   10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
   b8 00 e0 00 00 8e
  
  
   It turns out that those errors are introduced by APICv, which gets
   enabled due to different feature set. If anyone is interested in
   reproducing/fixing this exactly on 3.10, it takes about one hundred of
   migrations/power state changes for an issue to appear, guest OS can be
   Linux or Win.
  
   Are you able to reproduce this on a more recent upstream kernel as 
   well ?
  
   Bandan
  
   I`ll go through test cycle with 3.18 and 2603v2 around tomorrow and
   follow up with any reproduceable results.
 
  Heh.. issue is not triggered on 2603v2 at all, at least I am not able
  to hit this. The only difference with 2620v2 except lower frequency is
  an Intel Dynamic Acceleration feature. I`d appreciate any testing with
  higher CPU models with same or richer feature set. The testing itself
  can be done on both generic 3.10 or RH7 kernels, as both of them are
  experiencing this issue. I conducted all tests with disabled cstates
  so I advise to do the same for a first reproduction step.
 
  Thanks!
 
  model name  : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
  stepping: 4
  microcode   : 0x416
  cpu MHz : 2100.039
  cache size  : 15360 KB
  siblings: 12
  apicid  : 43
  initial apicid  : 43
  fpu : yes
  fpu_exception   : yes
  cpuid level : 13
  wp  : yes
  flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
  mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
  syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
  rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
  dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
  sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c
  rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi
  flexpriority ept vpid fsgsbase smep erms
 
  I'm seeing something similar; it's very intermittent and generally
  happening right at boot of the guest;   I'm running this on qemu
  head+my postcopy world (but it's happening right at boot before postcopy
  gets a chance), and I'm using a 3.19ish kernel. Xeon E5-2407 in my case
  but hey maybe I'm seeing a different bug.
 
  Dave

 Yep, looks like we are hitting same bug - two thirds of mine failure
 events shot during boot/reboot cycle and approx. one third of events
 happened in the middle of runtime. What CPU, v0 or v2 are you using
 (in other words, is APICv enabled)?

 processor   : 7
 vendor_id   : GenuineIntel
 cpu family  : 6
 model   : 45
 model name  : Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz
 stepping: 7
 microcode   : 0x70d
 cpu MHz

Re: [Qemu-devel] E5-2620v2 - emulation stop error

2015-03-06 Thread Andrey Korolyov
On Fri, Mar 6, 2015 at 7:57 PM, Bandan Das b...@redhat.com wrote:
 Andrey Korolyov and...@xdel.ru writes:

 On Fri, Mar 6, 2015 at 1:14 AM, Andrey Korolyov and...@xdel.ru wrote:
 Hello,

 recently I`ve got a couple of shiny new Intel 2620v2s for future
 replacement of the E5-2620v1, but I experienced relatively many events
 with emulation errors, all traces looks simular to the one below. I am
 running qemu-2.1 on x86 on top of 3.10 branch for testing purposes but
 can switch to some other versions if necessary. Most of crashes
 happened during reboot cycle or at the end of ACPI-based shutdown
 action, if this can help. I have zero clues of what can introduce such
 a mess inside same processor family using identical software, as
 2620v1 has no simular problem ever. Please let me know if there can be
 some side measures for making entire story more clear.

 Thanks!

 KVM internal error. Suberror: 2
 extra data[0]: 80d1
 extra data[1]: 8b0d
 EAX=0003 EBX= ECX= EDX=
 ESI= EDI= EBP= ESP=6cd4
 EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =   9300
 CS =f000 000f  9b00
 SS =   9300
 DS =   9300
 FS =   9300
 GS =   9300
 LDT=   8200
 TR =   8b00
 GDT= 000f6e98 0037
 IDT=  03ff
 CR0=0010 CR2= CR3= CR4=
 DR0= DR1= DR2=
 DR3=
 DR6=0ff0 DR7=0400
 EFER=
 Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
 b8 00 e0 00 00 8e


 It turns out that those errors are introduced by APICv, which gets
 enabled due to different feature set. If anyone is interested in
 reproducing/fixing this exactly on 3.10, it takes about one hundred of
 migrations/power state changes for an issue to appear, guest OS can be
 Linux or Win.

 Are you able to reproduce this on a more recent upstream kernel as well ?

 Bandan

I'll go through a test cycle with 3.18 and a 2603v2 around tomorrow and
follow up with any reproducible results.


E5-2620v2 - emulation stop error

2015-03-05 Thread Andrey Korolyov
Hello,

recently I've got a couple of shiny new Intel 2620v2s as a future
replacement for the E5-2620v1, but I have experienced relatively many
events with emulation errors; all traces look similar to the one below.
I am running qemu-2.1 on x86 on top of the 3.10 branch for testing
purposes, but can switch to some other versions if necessary. Most of the
crashes happened during the reboot cycle or at the end of an ACPI-based
shutdown action, if that helps. I have zero clues about what can introduce
such a mess within the same processor family using identical software, as
the 2620v1 never shows a similar problem. Please let me know if there are
some side measures for making the entire story clearer.

Thanks!

KVM internal error. Suberror: 2
extra data[0]: 80d1
extra data[1]: 8b0d
EAX=0003 EBX= ECX= EDX=
ESI= EDI= EBP= ESP=6cd4
EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000 000f  9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT= 000f6e98 0037
IDT=  03ff
CR0=0010 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
EFER=
Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
b8 00 e0 00 00 8e


Re: E5-2620v2 - emulation stop error

2015-03-05 Thread Andrey Korolyov
On Fri, Mar 6, 2015 at 1:14 AM, Andrey Korolyov and...@xdel.ru wrote:
 Hello,

 recently I`ve got a couple of shiny new Intel 2620v2s for future
 replacement of the E5-2620v1, but I experienced relatively many events
 with emulation errors, all traces looks simular to the one below. I am
 running qemu-2.1 on x86 on top of 3.10 branch for testing purposes but
 can switch to some other versions if necessary. Most of crashes
 happened during reboot cycle or at the end of ACPI-based shutdown
 action, if this can help. I have zero clues of what can introduce such
 a mess inside same processor family using identical software, as
 2620v1 has no simular problem ever. Please let me know if there can be
 some side measures for making entire story more clear.

 Thanks!

 KVM internal error. Suberror: 2
 extra data[0]: 80d1
 extra data[1]: 8b0d
 EAX=0003 EBX= ECX= EDX=
 ESI= EDI= EBP= ESP=6cd4
 EIP=d3f9 EFL=00010202 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =   9300
 CS =f000 000f  9b00
 SS =   9300
 DS =   9300
 FS =   9300
 GS =   9300
 LDT=   8200
 TR =   8b00
 GDT= 000f6e98 0037
 IDT=  03ff
 CR0=0010 CR2= CR3= CR4=
 DR0= DR1= DR2=
 DR3=
 DR6=0ff0 DR7=0400
 EFER=
 Code=48 18 67 8c 00 8c d1 8e d9 66 5a 66 58 66 5d 66 c3 cd 02 cb cd
 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb cd 19 cb cd 1c cb fa fc 66
 b8 00 e0 00 00 8e


It turns out that those errors are introduced by APICv, which gets
enabled due to the different feature set. If anyone is interested in
reproducing/fixing this exactly on 3.10, it takes about one hundred
migrations/power-state changes for the issue to appear; the guest OS can
be Linux or Windows.


Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-02-04 Thread Andrey Korolyov
Hi,

I've seen the problem quite a few times.  Before spending more time on
it, I'd like to do a quick check here to see if anyone has ever seen the
same problem.  I hope this is a relevant question for this mailing list.


Jul  2 11:08:21 arno-3 kernel: [ 2165.078623] BUG: unable to handle
kernel NULL pointer dereference at 0000000000000008
Jul  2 11:08:21 arno-3 kernel: [ 2165.078916] IP: [8118d0fa]
copy_huge_page+0x8a/0x2a0
Jul  2 11:08:21 arno-3 kernel: [ 2165.079128] PGD 0
Jul  2 11:08:21 arno-3 kernel: [ 2165.079198] Oops:  [#1] SMP
Jul  2 11:08:21 arno-3 kernel: [ 2165.079319] Modules linked in:
ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp
iptable_filter ip_tables x_tables kvm_intel kvm bridge stp llc ast ttm
drm_kms_helper drm sysimgblt sysfillrect syscopyarea lp mei_me ioatdma
ext2 parport mei shpchp dcdbas joydev mac_hid lpc_ich acpi_pad wmi
hid_generic usbhid hid ixgbe igb dca i2c_algo_bit ahci ptp libahci
mdio pps_core
Jul  2 11:08:21 arno-3 kernel: [ 2165.081090] CPU: 19 PID: 3494 Comm:
qemu-system-x86 Not tainted 3.11.0-15-generic #25~precise1-Ubuntu
Jul  2 11:08:21 arno-3 kernel: [ 2165.081424] Hardware name: Dell Inc.
PowerEdge C6220 II/09N44V, BIOS 2.0.3 07/03/2013
Jul  2 11:08:21 arno-3 kernel: [ 2165.081705] task: 88102675
ti: 881026056000 task.ti: 881026056000
Jul  2 11:08:21 arno-3 kernel: [ 2165.081973] RIP:
0010:[8118d0fa]  [8118d0fa]
copy_huge_page+0x8a/0x2a0


Hello,

sorry for possible top-posting; the same issue appears on at least the
3.10 LTS series. The original thread is at
http://marc.info/?l=kvm&m=14043742300901. The necessary conditions for
the failure to reappear are a single running KVM guest and large THP
mounted with hugepagesz=1G (seemingly the same as in the initial
report). With default 2M pages everything works well, and the same goes
for 3.18 with 1G THP. Are there any obvious clues for the issue?

Thanks!
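For reference, the hugepagesz=1G setup that reproduces the crash is configured on the kernel command line at boot; a sketch with illustrative page counts and mount path:

```
# /etc/default/grub -- reserve 1 GiB pages at boot (counts are illustrative)
GRUB_CMDLINE_LINUX="default_hugepagesz=1G hugepagesz=1G hugepages=8"
# then: update-grub && reboot, and back the guest memory with, e.g.:
#   qemu-system-x86_64 -mem-path /dev/hugepages-1G -mem-prealloc ...
```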


Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-02-04 Thread Andrey Korolyov
Sorry for all the previous mess, my Claws-mailer went nuts for no reason.


Re: cpu frequency

2015-02-03 Thread Andrey Korolyov
On Wed, Feb 4, 2015 at 3:06 AM, Nerijus Baliunas
neri...@users.sourceforge.net wrote:
 On Tue, 3 Feb 2015 18:07:57 +0400 Andrey Korolyov and...@xdel.ru wrote:

 Have you tried disabling turbo mode (assuming you have a new enough CPU
 model) and fixing the frequency via the frequency governor's settings? If
 that helps, it can be an ugly hack with pre-up/post-up libvirt actions,
 though you'd probably want to keep the frequency the same to maximize
 performance.

 I tried to alter the frequency governor settings, but unsuccessfully.
 They seem to change but then revert back after a short time.
 CentOS 7, Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz.


I remember that a floating frequency resulted in incorrect guest CPU
information, so this may well be the exact solution for your situation.
Those frequency values could be altered by a running service;
unfortunately, I don't know enough about CentOS 7 packages to name it.


Re: cpu frequency

2015-02-03 Thread Andrey Korolyov
 It did not help. Today that commercial application detects 2400, although
 Control Panel - System shows 2.20 GHz.
 So my question again - is it possible to patch qemu-kvm so that it shows some
 constant frequency to the guest? But the answer is probably not, because
 I don't know how the application computes the frequency...


Have you tried disabling turbo mode (assuming you have a new enough CPU
model) and fixing the frequency via the frequency governor's settings? If
that helps, it can be an ugly hack with pre-up/post-up libvirt actions,
though you'd probably want to keep the frequency the same to maximize
performance.
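A sketch of "fix the frequency via the governor settings" as shell; the sysfs root is parameterized so the function can be exercised against a fake tree, and the intel_pstate `no_turbo` path is an assumption about the host's cpufreq driver:

```sh
# set_fixed_freq SYSROOT [GOVERNOR]
# Pin every CPU's cpufreq governor and disable intel_pstate turbo.
# SYSROOT is normally /sys; passing another root makes this testable.
set_fixed_freq() {
    sysroot="$1"
    gov="${2:-performance}"
    for g in "$sysroot"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$g" ] && echo "$gov" > "$g"
    done
    nt="$sysroot/devices/system/cpu/intel_pstate/no_turbo"
    [ -e "$nt" ] && echo 1 > "$nt"   # 1 = turbo disabled
}
```

Run as root with `set_fixed_freq /sys performance`, wired into the pre-up/post-up libvirt hooks mentioned above.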


Possible approaches to limit csw overhead

2014-11-17 Thread Andrey Korolyov
Hello,

I have a rather practical question: is it possible to limit the number of
VM-initiated events for a single VM? As an example, a VM which
experienced an OOM and is effectively stuck dead generates a lot of
unnecessary context switches, triggering do_raw_spin_lock very often
and therefore increasing the overall compute workload. This could
possibly be done via reactive limitation of the CPU quota via cgroups,
but such a method is quite impractical, because every orchestration
solution would need to implement its own piece of code to detect such VM
states and act properly. I wonder if there may be a proposal which would
do this job better than a userspace-implemented perf statistics loop.

Thanks!
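As a sketch of the userspace fallback mentioned above (a polling loop over perf-style statistics), the per-process context-switch counters that /proc already exposes could be sampled like this; the helper names and the threshold are purely illustrative, not an existing interface:

```python
# Illustrative sketch: detect a "stuck dead" guest by its context-switch
# rate, sampled from /proc/<pid>/status. Names and threshold are
# assumptions, not an existing kernel or libvirt mechanism.

def parse_ctxt_switches(status_text: str) -> int:
    """Sum voluntary + nonvoluntary context switches from a
    /proc/<pid>/status dump."""
    total = 0
    for line in status_text.splitlines():
        if line.startswith(("voluntary_ctxt_switches",
                            "nonvoluntary_ctxt_switches")):
            total += int(line.split(":")[1])
    return total

def is_runaway(prev: int, curr: int, interval_s: float,
               limit_per_s: int = 50_000) -> bool:
    """True when the switch rate over the sampling interval exceeds the
    cap, i.e. when an orchestrator might tighten the cgroup CPU quota."""
    return (curr - prev) / interval_s > limit_per_s
```

An orchestrator could run this per qemu process and react by writing a lower cpu.cfs_quota_us into the offending VM's cgroup.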


Re: [Qemu-devel] [question] virtio-blk performance degradationhappened with virito-serial

2014-09-02 Thread Andrey Korolyov
On Tue, Sep 2, 2014 at 10:36 AM, Amit Shah amit.s...@redhat.com wrote:
 On (Mon) 01 Sep 2014 [20:52:46], Zhang Haoyu wrote:
  Hi, all
 
  I start a VM with virtio-serial (default ports number: 31), and found 
  that virtio-blk performance degradation happened, about 25%, this 
  problem can be reproduced 100%.
  without virtio-serial:
  4k-read-random 1186 IOPS
  with virtio-serial:
  4k-read-random 871 IOPS
 
  but if use max_ports=2 option to limit the max number of virio-serial 
  ports, then the IO performance degradation is not so serious, about 5%.
 
  And, ide performance degradation does not happen with virtio-serial.
 
 Pretty sure it's related to MSI vectors in use.  It's possible that
 the virtio-serial device takes up all the avl vectors in the guests,
 leaving old-style irqs for the virtio-blk device.
 
 I don't think so,
 I use iometer to test 64k-read(or write)-sequence case, if I disable the 
 virtio-serial dynamically via device manager-virtio-serial = disable,
 then the performance get promotion about 25% immediately, then I re-enable 
 the virtio-serial via device manager-virtio-serial = enable,
 the performance got back again, very obvious.
 add comments:
 Although the virtio-serial is enabled, I don't use it at all, the 
 degradation still happened.

 Using the vectors= option as mentioned below, you can restrict the
 number of MSI vectors the virtio-serial device gets.  You can then
 confirm whether it's MSI that's related to these issues.

 So, I think it has no business with legacy interrupt mode, right?
 
 I am going to observe the difference of perf top data on qemu and perf kvm 
 stat data when disable/enable virtio-serial in guest,
 and the difference of perf top data on guest when disable/enable 
 virtio-serial in guest,
 any ideas?
 
 Thanks,
 Zhang Haoyu
 If you restrict the number of vectors the virtio-serial device gets
 (using the -device virtio-serial-pci,vectors= param), does that make
 things better for you?



 Amit
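The `vectors=` experiment described above can be expressed on the QEMU command line roughly as follows (device IDs and counts are illustrative):

```
# cap the serial device's MSI vectors so virtio-blk can still get its own
qemu-system-x86_64 ... \
    -device virtio-serial-pci,id=vser0,max_ports=2,vectors=4 \
    -device virtio-blk-pci,drive=drive0,id=blk0
```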


I can confirm serious degradation compared to 1.1 with regular serial
output - I am able to hang a VM forever some tens of seconds after
continuously printing dmesg to ttyS0. The VM just ate all available
CPU quota during the test and hung over some tens of seconds, not even
responding to regular pings and progressively raising CPU consumption
up to the limit.


Re: [Qemu-devel] [question] virtio-blk performance degradationhappened with virito-serial

2014-09-02 Thread Andrey Korolyov
On Tue, Sep 2, 2014 at 10:11 PM, Amit Shah amit.s...@redhat.com wrote:
 On (Tue) 02 Sep 2014 [22:05:45], Andrey Korolyov wrote:

 Can confirm serious degradation comparing to the 1.1 with regular
 serial output  - I am able to hang VM forever after some tens of
 seconds after continuously printing dmest to the ttyS0. VM just ate
 all available CPU quota during test and hanged over some tens of
 seconds, not even responding to regular pings and progressively
 raising CPU consumption up to the limit.

 Entirely different to what's being discussed here.  You're observing
 slowdown with ttyS0 in the guest -- the isa-serial device.  This
 thread is discussing virtio-blk and virtio-serial.

 Amit

Sorry for the thread hijacking; the problem is definitely not related to
the interrupt rework, so I will start a new thread.


Bug: No irq handler for vector (irq -1) on C602

2014-08-19 Thread Andrey Korolyov
Hello,

I ran into this error for the first time over a very large hardware
span/uptime (the server which experienced the error is identical to the
others, and I have had absolutely no MSI-related problems with this
hardware ever).

Running 3.10 on the host, I had one VM (of many) on it which produced an
enormous number of context switches due to a mess inside (hundreds of
active apache-itk workers). All VM threads are pinned to the first
sibling of every core on a two-socket system: of the 24 HT cores, the
second half are just HT siblings, and a cpuset cgroup limits threads to
the first half only. The error itself was produced a second after a
reset event for this VM (through libvirt, if the exact call matters):

[7696746.523478] do_IRQ: 11.233 No irq handler for vector (irq -1)

Since there have been no recent hints for this exact error, and it is
triggered by a critical part of the kernel code, I think it may be
worth re-raising the issue (or, at least, establishing a better bound
on the error source).


Re: Verifying Execution Integrity in Untrusted hypervisors

2014-07-26 Thread Andrey Korolyov
On Sat, Jul 26, 2014 at 2:06 AM, Paolo Bonzini pbonz...@redhat.com wrote:

 Thanks a lot Paolo.

 Is there a way to atleast detect that the hypervisor has done something
 malicious and the client will be able to refer to some kind of logs to
 prove it?

 If you want a theoretical, perfect solution, no.  I wouldn't be surprised
 if this is equivalent to the halting problem.

 If you want a practical solution, you have to define a threat model.  What
 kind of attacks are you worried about?  Which parts of the environment can
 you control?  Can you place something trusted between the vulnerable VM
 and its clients?  And so on.

 Paolo

Here are some bits I read before:
https://www.cs.purdue.edu/homes/bb/cs590/papers/secure_vm.pdf. It's all
about timing measurement in the end: if you are able to measure timings,
or derive methods from, say, cache-correlation attacks, you can rule out
the possibility of a continuous hijack, since the required knowledge of
the amount of computation/timing is not attainable under constant
measurement by Eve; that gets you halfway through the task. Complete
execution without continuous checking against a locally placed trusted
black-box equivalent (a hardware token, trusted execution replaying, or
the like) is hardly possible, by my understanding. Anyway, any
imaginable case relies on a finite amount of computing power being
available to a single thread, so I can hardly say that a real-world
implementation *is secure*, though we can establish a high probability
of it. I believe that homomorphic encryption will pave the way for at
least some kinds of services by the next decade, given the tendency
toward total cloudization, and this is definitely better than the
sticks-and-mud approach with timings.


No-downtime ROM upgrades

2014-05-02 Thread Andrey Korolyov
Hello,

As it turns out, upgrading any system ROMs initially loaded into the
emulator is not possible without a complete restart of the emulator
itself, as live migration refuses to complete with different payloads at
both ends. Assuming that the guest-side payload for VGA/Ethernet can
actually be re-read by powering the corresponding PCI devices off and on
at runtime, it is hard to say what to do with the BIOS itself. Does
anyone have an idea whether such upgrades are discussed/implemented
anywhere? Though ROMs are very unlikely to be buggy, additional features
may come in over a couple of releases, adding the necessity of
relaunching the virtual machine (like the upcoming PNP080 update, for
example). And, of course, there are business-driven cases where such a
thing must be avoided. :)

Thanks!


Re: 3.10.X kernel/jump_label kvm

2014-03-01 Thread Andrey Korolyov
On 02/28/2014 11:47 PM, Stefan Priebe wrote:
 Hello,
 
 i got this stack trace multiple times while using a vanilla 3.10.32
 kernel and already sent it to the list in december but got no replies.
 

Hi,

What kind of workload the host system is experiencing at same time? Does
this event correlate with high memory pressure?

 [78136.551061] WARNING: at kernel/jump_label.c:80
 __static_key_slow_dec+0xb6/0xc0()
 [78136.551062] jump label: negative count!
 [78136.551063] Modules linked in: sch_htb act_police cls_u32 sch_ingress
 vhost_net tun macvtap macvlan netconsole ipt_REJECT dlm sctp
 iptable_filter ip_tables x_tables iscsi_tcp libiscsi_tcp libiscsi
 scsi_transport_iscsi nfsd auth_rpcgss oid_registry bonding ext2 8021q
 garp fuse mperf coretemp kvm_intel kvm crc32_pclmul ghash_clmulni_intel
 microcode i2c_i801 button dm_mod raid1 md_mod usbhid usb_storage
 ohci_hcd sg sd_mod ehci_pci ahci ehci_hcd igb libahci i2c_algo_bit isci
 usbcore i2c_core usb_common libsas ptp ixgbe(O) scsi_transport_sas pps_core
 [78136.551080] CPU: 21 PID: 47183 Comm: kvm Tainted: GW  O
 3.10.32+68-ph #1
 [78136.551081] Hardware name: Supermicro
 X9DRW-3LN4F+/X9DRW-3TF+/X9DRW-3LN4F+/X9DRW-3TF+, BIOS 3.00 07/05/2013
 [78136.551081]  0009 882f4a669be8 81524606
 882f4a669c28
 [78136.551085]  8104853b 4a669c08 a045cc40
 00fa
 [78136.551088]  a045cc60 882f51460160 882f74ab8110
 882f4a669c88
 [78136.551091] Call Trace:
 [78136.551093]  [81524606] dump_stack+0x19/0x1b
 [78136.551095]  [8104853b] warn_slowpath_common+0x6b/0xa0
 [78136.551098]  [81048611] warn_slowpath_fmt+0x41/0x50
 [78136.551100]  [810e05f6] __static_key_slow_dec+0xb6/0xc0
 [78136.551102]  [810e0631] static_key_slow_dec_deferred+0x11/0x20
 [78136.551110]  [a043ff60] kvm_free_lapic+0x90/0xa0 [kvm]
 [78136.551116]  [a0429ef3] kvm_arch_vcpu_uninit+0x23/0x90 [kvm]
 [78136.551122]  [a0410a20] kvm_vcpu_uninit+0x20/0x40 [kvm]
 [78136.551125]  [a021fc12] vmx_free_vcpu+0x52/0x70 [kvm_intel]
 [78136.551132]  [a04295ef] kvm_arch_vcpu_free+0x4f/0x60 [kvm]
 [78136.551138]  [a042a112] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]
 [78136.551141]  [81071048] ? synchronize_srcu+0x18/0x20
 [78136.551143]  [8112677a] ? mmu_notifier_unregister+0xaa/0xe0
 [78136.551149]  [a041380e] kvm_put_kvm+0x10e/0x1b0 [kvm]
 [78136.551155]  [a0413a33] kvm_vcpu_release+0x13/0x20 [kvm]
 [78136.551157]  [811452d1] __fput+0xe1/0x230
 [78136.551160]  [81145429] fput+0x9/0x10
 [78136.551162]  [81068de5] task_work_run+0xb5/0xd0
 [78136.551164]  [8104da1c] do_exit+0x2ac/0xa30
 [78136.551166]  [8107a89b] ? wake_up_state+0xb/0x10
 [78136.551169]  [81059fad] ? signal_wake_up_state+0x1d/0x30
 [78136.551171]  [8105b1c3] ? zap_other_threads+0x83/0xa0
 [78136.551173]  [8104e21a] do_group_exit+0x3a/0xa0
 [78136.551175]  [8104e292] SyS_exit_group+0x12/0x20
 [78136.551177]  [81529fd2] system_call_fastpath+0x16/0x1b
 [78136.551178] ---[ end trace b9ebb6de9753ef4c ]---
 
 Thanks!
 
 Stefan
 



Re: [RFH] Qemu main thread is blocked in g_poll in windows guest

2014-02-20 Thread Andrey Korolyov
On 10/15/2013 04:18 PM, Xiexiangyou wrote:
 Thanks for your reply :-)
 The QEMU version is 1.5.1,and the KVM version is 3.6
 
 QEMU command:
 /usr/bin/qemu-kvm -name win2008_dc_5 -S -machine 
 pc-i440fx-1.5,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 
 4,maxcpus=64,sockets=16,cores=4,threads=1 -uuid 
 13e08e3e-cd23-4450-8bd3-60e7c220316d -no-user-config -nodefaults -chardev 
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/win2008_dc_5.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc 
 base=utc,clock=vm,driftfix=slew -no-hpet -no-shutdown -device 
 piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
 virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device 
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive 
 file=/dev/vmdisk/win2008_dc_5,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none,aio=native
  -device 
 scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
  -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device 
 virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:16:49:23,bus=pci.0,addr=0x3
  -chardev socket,id=charchannel0,path=/var/run/libvirt/qe
 m
u/win2008_dc_5.extend,server,nowait -device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
 -chardev 
socket,id=charchannel1,path=/var/run/libvirt/qemu/win2008_dc_5.agent,server,nowait
 -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
 -device usb-tablet,id=input0 -vnc 0.0.0.0:4 -device 
cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
 
 (gdb) bt
 #0  0x7f9ba661a423 in poll () from /lib64/libc.so.6
 #1  0x0059460f in os_host_main_loop_wait (timeout=4294967295) at 
 main-loop.c:226
 #2  0x005946a4 in main_loop_wait (nonblocking=0) at main-loop.c:464
 #3  0x00619309 in main_loop () at vl.c:2182
 #4  0x0061fb5e in main (argc=54, argv=0x7fff879830c8, 
 envp=0x7fff87983280) at vl.c:4611
 
 Main thread's strace message:
 # strace -p 6386
 Process 6386 attached - interrupt to quit
 restart_syscall(... resuming interrupted call ...
 
 cpu thread's strace message:
 # strace -p 6389
 Process 6389 attached - interrupt to quit
 rt_sigtimedwait([BUS USR1], 0x7f9ba36fbc00) = -1 EAGAIN (Resource temporarily 
 unavailable)
 rt_sigpending([])   = 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ioctl(17, 0xae80, 0)= 0
 ...
 
 Thanks!
 --xie
 
 -Original Message-
 From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
 Bonzini
 Sent: Tuesday, October 15, 2013 7:52 PM
 To: Xiexiangyou
 Cc: qemu-de...@nongnu.org; qemu-devel-requ...@nongnu.org; 
 kvm@vger.kernel.org; Huangpeng (Peter); Luonengjun
 Subject: Re: [RFH] Qemu main thread is blocked in g_poll in windows guest
 
 Il 15/10/2013 12:21, Xiexiangyou ha scritto:
 Hi all:

 Windows2008 Guest run without pressure for long time. Sometimes, it
 stop and looks like hanging. But when I connect to it with VNC, It
 resume to run, but VM's time is delayed . When the vm is hanging, I
 check the main thread of QEMU. I find that the thread is blocked in
 g_poll function. it is waiting for a SIG, However, there is no SIG .

 I tried the clock with hpet and no hpet, but came out the same
 problem. Then I upgrade the glibc to newer, it didn't work too. I'm
 confused. Is the reason that VM in sleep state and doesn't emit the
 signal. I set the windows 's power option, enable/disable the
 allow the wake timers, I didn't work.

 Is anybody have met the same problem before, or know the reason. Your
 reply will be very helpful.
 
 This post is missing a few pieces of information:
 
 * What version of QEMU is this?
 
 * What is the command line?
 
 * How do you know g_poll is waiting for a signal and not for a file
 descriptor?
 
 * What is the backtrace of the main thread?  What is the backtrace of
 the VCPU thread?
 
 etc.
 
 Paolo
 


Hello,

To revive this thread - I have exactly the same problem on freshly
migrated virtual machines. The guest operating system is almost always
Linux, and the bug impact ratio is very low, about one per tens of migrations.
VM 'uptime', 

Re: QEMU P2P migration speed

2014-02-08 Thread Andrey Korolyov
On 02/07/2014 07:32 PM, Paolo Bonzini wrote:
 Il 07/02/2014 14:07, Andrey Korolyov ha scritto:
 Ok, I will do, but looks like libvirt version(1.0.2) in not relevant -
 it meets criteria set by debian packagers
 
 Then Debian's qemu packaging it's wrong, QEMU 1.6 or newer should conflict
 with libvirt 1.2.0.
 
 and again, 'broken state' is
 not relevant to the libvirt state history, it more likely to be qemu/kvm
 problem.
 
 It is relevant, qemu introduced a new migration status before active 
 (setup) and libvirt doesn't recognize it.  That's why you need at
 least 1.2.0.
 
 Paolo
 

Thanks - both issues, the reverted CPU dependency and the migration
itself, went away.
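The mismatch Paolo describes (QEMU 1.6 reporting the new 'setup' state before 'active', while older libvirt treats anything unknown as an error) amounts to needing a tolerant status poll. A minimal sketch, with the QMP transport injected as a callable so nothing here is tied to a real VM; the state sets and poll count are illustrative:

```python
# Sketch: poll QMP "query-migrate" and treat "setup" as a transient state
# rather than an error. send_qmp is any callable taking a QMP command dict
# and returning the decoded reply.

TRANSIENT = {"setup", "active"}                  # keep polling
TERMINAL = {"completed", "failed", "cancelled"}  # stop here

def wait_for_migration(send_qmp, max_polls: int = 10) -> str:
    """Return the terminal migration status, or 'timeout'."""
    for _ in range(max_polls):
        reply = send_qmp({"execute": "query-migrate"})
        status = reply["return"]["status"]
        if status in TERMINAL:
            return status
        if status not in TRANSIENT:
            raise RuntimeError(f"unexpected migration status {status!r}")
    return "timeout"
```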


Re: QEMU P2P migration speed

2014-02-07 Thread Andrey Korolyov
On 02/07/2014 12:14 PM, Paolo Bonzini wrote:
 Il 06/02/2014 14:40, Andrey Korolyov ha scritto:
 Took and build 1.6.2 and faced a problem - after a couple of bounce
 iterations of migration (1-2-1-2) VM is not able to migrate anymore
 back in a probabilistic manner with an error 'internal error unexpected
 migration status in setup'. Error may disappear over a time, or may not
 disappear at all and it may took a lot of tries in a row to succeed.
 There are no obvious hints with default logging level in libvirt/qemu
 logs and seemingly libvirt is not a cause because accumulated error
 state preserves over service restarts. Also every VM is affected, not
 ones which are experiencing multiple migration actions. Error happens on
 3rd-5th second of the migration procedure, if it may help.
 
 You need to update libvirt too.
 
 Paolo

OK, I will do that, but it looks like the libvirt version (1.0.2) is not
relevant - it meets the criteria set by the Debian packagers - and,
again, the 'broken state' is not related to the libvirt state history;
it is more likely to be a qemu/kvm problem.


Re: QEMU P2P migration speed

2014-02-06 Thread Andrey Korolyov
On 02/05/2014 07:15 PM, Paolo Bonzini wrote:
 Il 05/02/2014 11:46, Andrey Korolyov ha scritto:
 On 02/05/2014 11:27 AM, Paolo Bonzini wrote:
 Il 04/02/2014 18:06, Andrey Korolyov ha scritto:
 Migration time is almost independent of VM RSS(varies by ten percent at
 maximum), for situation when VM is active on target host, time is about
 85 seconds to migrate 8G between hosts, and when it is turned off,
 migration time *increasing* to 120s. For curious ones, frequency
 management is completely inactive on both nodes, neither CStates
 mechanism. Interconnection is relatively fast (20+Gbit/s by IPoIB).

 What version of QEMU?

 Paolo

 Ancie.. ehm, stable - 1.1.2 from wheezy. Should I try 1.6/1.7?
 
 Yeah, you can checkout the release notes on wiki.qemu.org to find out
 which versions had good improvements.  You can also try compiling
 straight from git, there are more speedups there.
 
 Paolo
 

I took and built 1.6.2 and faced a problem: after a couple of bounce
iterations of migration (1-2-1-2), a VM is not able to migrate back
anymore, in a probabilistic manner, with the error 'internal error
unexpected migration status in setup'. The error may disappear over
time, or may not disappear at all, and it may take a lot of tries in a
row to succeed. There are no obvious hints at the default logging level
in the libvirt/qemu logs, and seemingly libvirt is not the cause,
because the accumulated error state is preserved over service restarts.
Also, every VM is affected, not only the ones experiencing multiple
migration actions. The error happens in the 3rd-5th second of the
migration procedure, if that helps.

What is more interesting, the original counter-intuitive behavior has
not disappeared but has increased its relative span: 25 vs 70 seconds
for a fully committed 8G VM. I suspect some mechanism falling back to
idle and therefore dropping overall performance, but cannot imagine one
beyond the standard frequency/C-states mechanisms, which are definitely
turned off.


Re: QEMU P2P migration speed

2014-02-05 Thread Andrey Korolyov
On 02/05/2014 11:27 AM, Paolo Bonzini wrote:
 Il 04/02/2014 18:06, Andrey Korolyov ha scritto:
 Migration time is almost independent of VM RSS(varies by ten percent at
 maximum), for situation when VM is active on target host, time is about
 85 seconds to migrate 8G between hosts, and when it is turned off,
 migration time *increasing* to 120s. For curious ones, frequency
 management is completely inactive on both nodes, neither CStates
 mechanism. Interconnection is relatively fast (20+Gbit/s by IPoIB).
 
 What version of QEMU?
 
 Paolo

Ancie.. ehm, stable - 1.1.2 from wheezy. Should I try 1.6/1.7?


QEMU P2P migration speed

2014-02-04 Thread Andrey Korolyov
Hello,

I've got strange results while benchmarking migration speed for
different kinds of loads on the source/target host: when the source host
is 'empty', migration takes approx. 30 percent longer than the same
migration from a host already occupied by one VM with CPU overcommit
ratio=1.

[src host, three equal vms, each with ability to eat all cores once]
[tgt host, one VM, with same appetite and limitations]

All VMs were put into cgroups with the same CPU ceiling and CPU shares values.

Migration time is almost independent of VM RSS (it varies by ten percent
at most). When a VM is active on the target host, the time to migrate 8G
between hosts is about 85 seconds, and when it is turned off, migration
time *increases* to 120s. For the curious, frequency management is
completely inactive on both nodes, as is the C-states mechanism. The
interconnect is relatively fast (20+ Gbit/s via IPoIB).

Does anyone have a suggestion on how to explain/fix this?


Re: Equivalent of vmware SIOC (Storage IO Control) in KVM

2013-10-13 Thread Andrey Korolyov
Hello,

By the way, are there plans to enhance QEMU I/O throttling to be able to
swallow peaks or to apply various disciplines? The current one-second
flat discipline is seemingly not enough for uneven workloads, especially
when there is no alternative like cgroups for NBD usage.

Thanks!

On Sun, Oct 13, 2013 at 5:26 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 2) enable I/O throttling in QEMU, to apply limits at the level of the
 guest disk.  If you're using libvirt, add the iotune element within
 the disk element in the definition of the virtual machine.
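The iotune element Paolo mentions nests inside the disk definition; a minimal sketch with illustrative limits and paths:

```xml
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='vda' bus='virtio'/>
  <iotune>
    <total_bytes_sec>52428800</total_bytes_sec>  <!-- 50 MiB/s ceiling -->
    <total_iops_sec>500</total_iops_sec>
  </iotune>
</disk>
```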


Lockups using per-thread cgs and kvm

2013-02-12 Thread Andrey Korolyov
Hi,

We (a cloud hosting provider) have recently observed a couple of strange
lockups when a physical node runs a significant number of Win2008R2 KVM
appliances; one can see a collection of those lockups at the link below.
After checking a lot of ideas without any valuable result, I suspected
that the nested per-thread cgroup placement created by libvirt may lead
to this problem (libvirt puts the emulator and each of the vcpu threads
into separate sub-cgroups). Disabling that behavior, i.e. having only
one cgroup per KVM process per cgroup type, solved the problem; at
least it did not happen in the most stressful tests we are able to run.
Since it is generally unusual for a well-known kernel mechanism such as
cgroups to break this way, I hope we have found a quite rare kind of
bug. Just for the record, the bug may also happen with Linux guests,
but one or two orders of magnitude more rarely. We stayed at the default
scheduler granularity value in these tests, if it matters.

For anyone who wants to see entire timeline of this bug, please see [1].


[1]. http://www.spinics.net/lists/kvm/msg85956.html
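The workaround described above (one cgroup per KVM process instead of libvirt's per-thread sub-cgroups) can be sketched as a flattening step. The cgroup paths are illustrative; on real cgroupfs, writing a TID to `tasks` migrates the thread, and the root directory is parameterized so the function can be tested against a fake tree:

```sh
# flatten_vm_cgroup: move every task from per-thread sub-cgroups
# (emulator, vcpu0, vcpu1, ...) into the VM-level cgroup itself.
# $1 is the VM's cgroup directory, e.g. /sys/fs/cgroup/cpu/libvirt/qemu/vm1
flatten_vm_cgroup() {
    root="$1"
    for t in "$root"/*/tasks; do
        [ -e "$t" ] || continue          # no sub-cgroups at all
        while IFS= read -r tid; do
            echo "$tid" >> "$root/tasks" # on cgroupfs this migrates the thread
        done < "$t"
    done
}
```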


Re: windows 2008 guest causing rcu_shed to emit NMI

2013-01-31 Thread Andrey Korolyov
On Thu, Jan 31, 2013 at 12:11 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Wed, Jan 30, 2013 at 11:21:08AM +0300, Andrey Korolyov wrote:
 On Wed, Jan 30, 2013 at 3:15 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
  On Tue, Jan 29, 2013 at 02:35:02AM +0300, Andrey Korolyov wrote:
  On Mon, Jan 28, 2013 at 5:56 PM, Andrey Korolyov and...@xdel.ru wrote:
   On Mon, Jan 28, 2013 at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com 
   wrote:
   On Mon, Jan 28, 2013 at 12:04:50AM +0300, Andrey Korolyov wrote:
   On Sat, Jan 26, 2013 at 12:49 AM, Marcelo Tosatti 
   mtosa...@redhat.com wrote:
On Fri, Jan 25, 2013 at 10:45:02AM +0300, Andrey Korolyov wrote:
On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti 
mtosa...@redhat.com wrote:
 On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
 Thank you Marcelo,

 Host node locking up sometimes later than yesterday, bur 
 problem still
 here, please see attached dmesg. Stuck process looks like
 root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
 /usr/bin/kvm -no-user-config -device ? -device pci-assign,? 
 -device
 virtio-blk-pci,? -device

 on fourth vm by count.

 Should I try upstream kernel instead of applying patch to the 
 latest
 3.4 or it is useless?

 If you can upgrade to an upstream kernel, please do that.

   
With vanilla 3.7.4 there is almost no changes, and NMI started 
firing
again. External symptoms looks like following: starting from some
count, may be third or sixth vm, qemu-kvm process allocating its
memory very slowly and by jumps, 20M-200M-700M-1.6G in minutes. 
Patch
helps, of course - on both patched 3.4 and vanilla 3.7 I`m able to
kill stuck kvm processes and node returned back to the normal, 
when on
3.2 sending SIGKILL to the process causing zombies and hanged 
``ps''
output (problem and workaround when no scheduler involved described
here http://www.spinics.net/lists/kvm/msg84799.html).
   
Try disabling pause loop exiting with ple_gap=0 kvm-intel.ko module 
parameter.
   
  
   Hi Marcelo,
  
   thanks, this parameter helped to increase number of working VMs in a
   half of order of magnitude, from 3-4 to 10-15. Very high SY load, 10
   to 15 percents, persists on such numbers for a long time, where linux
   guests in same configuration do not jump over one percent even under
   stress bench. After I disabled HT, crash happens only in long runs and
   now it is kernel panic :)
   Stair-like memory allocation behaviour disappeared, but other symptom
   leading to the crash which I have not counted previously, persists: if
   VM count is ``enough'' for crash, some qemu processes starting to eat
   one core, and they`ll panic system after run in tens of minutes in
   such state or if I try to attach debugger to one of them. If needed, I
   can log entire crash output via netconsole, now I have some tail,
   almost the same every time:
   http://xdel.ru/downloads/btwin.png
  
   Yes, please log entire crash output, thanks.
  
  
   Here please, 3.7.4-vanilla, 16 vms, ple_gap=0:
  
   http://xdel.ru/downloads/oops-default-kvmintel.txt
 
  Just an update: I was able to reproduce that on pure linux VMs using
  qemu-1.3.0 and ``stress'' benchmark running on them - panic occurs at
  start of vm(with count ten working machines at the moment). Qemu-1.1.2
  generally is not able to reproduce that, but host node with older
  version crashing on less amount of Windows VMs(three to six instead
  ten to fifteen) than with 1.3, please see trace below:
 
  http://xdel.ru/downloads/oops-old-qemu.txt
 
  Single bit memory error, apparently. Try:
 
  1. memtest86.
  2. Boot with slub_debug=ZFPU kernel parameter.
  3. Reproduce on different machine
 
 

 Hi Marcelo,

 I always follow the rule - if some weird bug exists, check it on
 ECC-enabled machine and check IPMI logs too before start complaining
 :) I have finally managed to ``fix'' the problem, but my solution
 seems a bit strange:
 - I have noticed that if virtual machines started without any cgroup
 setting they will not cause this bug under any conditions,
 - I have thought, very wrong in my mind, that the
 CONFIG_SCHED_AUTOGROUP should regroup the tasks without any cgroup and
 should not touch tasks already inside any existing cpu cgroup. First
 sight on the 200-line patch shows that the autogrouping always applies
 to all tasks, so I tried to disable it,
 - wild magic appears - VMs didn`t crashed host any more, even in count
 30+ they work fine.
 I still don`t know what exactly triggered that and will I face it
 again under different conditions, so my solution more likely to be a
 patch of mud in wall of the dam, instead of proper fixing.

 There seems to be two possible origins of such error - a very very
 hideous race condition involving cgroups and processes like qemu-kvm
 causing frequent context switches and simple

Re: windows 2008 guest causing rcu_sched to emit NMI

2013-01-30 Thread Andrey Korolyov
On Wed, Jan 30, 2013 at 3:15 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Tue, Jan 29, 2013 at 02:35:02AM +0300, Andrey Korolyov wrote:
 On Mon, Jan 28, 2013 at 5:56 PM, Andrey Korolyov and...@xdel.ru wrote:
  On Mon, Jan 28, 2013 at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com 
  wrote:
  On Mon, Jan 28, 2013 at 12:04:50AM +0300, Andrey Korolyov wrote:
  On Sat, Jan 26, 2013 at 12:49 AM, Marcelo Tosatti mtosa...@redhat.com 
  wrote:
   On Fri, Jan 25, 2013 at 10:45:02AM +0300, Andrey Korolyov wrote:
   On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti 
   mtosa...@redhat.com wrote:
On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
Thank you Marcelo,
   
Host node locking up sometimes later than yesterday, bur problem 
still
here, please see attached dmesg. Stuck process looks like
root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
/usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
virtio-blk-pci,? -device
   
on fourth vm by count.
   
Should I try upstream kernel instead of applying patch to the 
latest
3.4 or it is useless?
   
If you can upgrade to an upstream kernel, please do that.
   
  
   With vanilla 3.7.4 there is almost no changes, and NMI started firing
   again. External symptoms looks like following: starting from some
   count, may be third or sixth vm, qemu-kvm process allocating its
   memory very slowly and by jumps, 20M-200M-700M-1.6G in minutes. Patch
   helps, of course - on both patched 3.4 and vanilla 3.7 I`m able to
   kill stuck kvm processes and node returned back to the normal, when on
   3.2 sending SIGKILL to the process causing zombies and hanged ``ps''
   output (problem and workaround when no scheduler involved described
   here http://www.spinics.net/lists/kvm/msg84799.html).
  
   Try disabling pause loop exiting with ple_gap=0 kvm-intel.ko module 
   parameter.
  
 
  Hi Marcelo,
 
  thanks, this parameter helped to increase number of working VMs in a
  half of order of magnitude, from 3-4 to 10-15. Very high SY load, 10
  to 15 percents, persists on such numbers for a long time, where linux
  guests in same configuration do not jump over one percent even under
  stress bench. After I disabled HT, crash happens only in long runs and
  now it is kernel panic :)
  Stair-like memory allocation behaviour disappeared, but other symptom
  leading to the crash which I have not counted previously, persists: if
  VM count is ``enough'' for crash, some qemu processes starting to eat
  one core, and they`ll panic system after run in tens of minutes in
  such state or if I try to attach debugger to one of them. If needed, I
  can log entire crash output via netconsole, now I have some tail,
  almost the same every time:
  http://xdel.ru/downloads/btwin.png
 
  Yes, please log entire crash output, thanks.
 
 
  Here please, 3.7.4-vanilla, 16 vms, ple_gap=0:
 
  http://xdel.ru/downloads/oops-default-kvmintel.txt

 Just an update: I was able to reproduce that on pure linux VMs using
 qemu-1.3.0 and ``stress'' benchmark running on them - panic occurs at
 start of vm(with count ten working machines at the moment). Qemu-1.1.2
 generally is not able to reproduce that, but host node with older
 version crashing on less amount of Windows VMs(three to six instead
 ten to fifteen) than with 1.3, please see trace below:

 http://xdel.ru/downloads/oops-old-qemu.txt

 Single bit memory error, apparently. Try:

 1. memtest86.
 2. Boot with slub_debug=ZFPU kernel parameter.
 3. Reproduce on different machine



Hi Marcelo,

I always follow the rule: if some weird bug exists, check it on an
ECC-enabled machine and check the IPMI logs too before starting to
complain :) I have finally managed to ``fix'' the problem, but my
solution seems a bit strange:
- I noticed that virtual machines started without any cgroup setting
do not cause this bug under any conditions,
- I had thought, quite wrongly, that CONFIG_SCHED_AUTOGROUP would only
regroup tasks that are not in any cgroup and would not touch tasks
already inside an existing cpu cgroup. A first look at the 200-line
patch shows that autogrouping always applies to all tasks, so I tried
to disable it,
- wild magic appeared: the VMs no longer crash the host, and even at
counts of 30+ they work fine.
I still don't know what exactly triggered this, or whether I will face
it again under different conditions, so my solution is more likely a
patch of mud in the wall of the dam than a proper fix.

There seem to be two possible origins of such an error: a very, very
hideous race condition involving cgroups and processes like qemu-kvm
that cause frequent context switches, or a simple incompatibility
between NUMA, the CONFIG_SCHED_AUTOGROUP logic and qemu VMs already
working inside a cgroup, since I have not observed these errors on a
single NUMA node (i.e. a desktop) under relatively heavier conditions.

Re: windows 2008 guest causing rcu_sched to emit NMI

2013-01-28 Thread Andrey Korolyov
On Mon, Jan 28, 2013 at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Mon, Jan 28, 2013 at 12:04:50AM +0300, Andrey Korolyov wrote:
 On Sat, Jan 26, 2013 at 12:49 AM, Marcelo Tosatti mtosa...@redhat.com 
 wrote:
  On Fri, Jan 25, 2013 at 10:45:02AM +0300, Andrey Korolyov wrote:
  On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti mtosa...@redhat.com 
  wrote:
   On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
   Thank you Marcelo,
  
   Host node locking up sometimes later than yesterday, bur problem still
   here, please see attached dmesg. Stuck process looks like
   root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
   /usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
   virtio-blk-pci,? -device
  
   on fourth vm by count.
  
   Should I try upstream kernel instead of applying patch to the latest
   3.4 or it is useless?
  
   If you can upgrade to an upstream kernel, please do that.
  
 
  With vanilla 3.7.4 there is almost no changes, and NMI started firing
  again. External symptoms looks like following: starting from some
  count, may be third or sixth vm, qemu-kvm process allocating its
  memory very slowly and by jumps, 20M-200M-700M-1.6G in minutes. Patch
  helps, of course - on both patched 3.4 and vanilla 3.7 I`m able to
  kill stuck kvm processes and node returned back to the normal, when on
  3.2 sending SIGKILL to the process causing zombies and hanged ``ps''
  output (problem and workaround when no scheduler involved described
  here http://www.spinics.net/lists/kvm/msg84799.html).
 
  Try disabling pause loop exiting with ple_gap=0 kvm-intel.ko module 
  parameter.
 

 Hi Marcelo,

 thanks, this parameter helped to increase number of working VMs in a
 half of order of magnitude, from 3-4 to 10-15. Very high SY load, 10
 to 15 percents, persists on such numbers for a long time, where linux
 guests in same configuration do not jump over one percent even under
 stress bench. After I disabled HT, crash happens only in long runs and
 now it is kernel panic :)
 Stair-like memory allocation behaviour disappeared, but other symptom
 leading to the crash which I have not counted previously, persists: if
 VM count is ``enough'' for crash, some qemu processes starting to eat
 one core, and they`ll panic system after run in tens of minutes in
 such state or if I try to attach debugger to one of them. If needed, I
 can log entire crash output via netconsole, now I have some tail,
 almost the same every time:
 http://xdel.ru/downloads/btwin.png

 Yes, please log entire crash output, thanks.


Here please, 3.7.4-vanilla, 16 vms, ple_gap=0:

http://xdel.ru/downloads/oops-default-kvmintel.txt


Re: windows 2008 guest causing rcu_sched to emit NMI

2013-01-28 Thread Andrey Korolyov
On Mon, Jan 28, 2013 at 5:56 PM, Andrey Korolyov and...@xdel.ru wrote:
 On Mon, Jan 28, 2013 at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Mon, Jan 28, 2013 at 12:04:50AM +0300, Andrey Korolyov wrote:
 On Sat, Jan 26, 2013 at 12:49 AM, Marcelo Tosatti mtosa...@redhat.com 
 wrote:
  On Fri, Jan 25, 2013 at 10:45:02AM +0300, Andrey Korolyov wrote:
  On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti mtosa...@redhat.com 
  wrote:
   On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
   Thank you Marcelo,
  
   Host node locking up sometimes later than yesterday, bur problem still
   here, please see attached dmesg. Stuck process looks like
   root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
   /usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
   virtio-blk-pci,? -device
  
   on fourth vm by count.
  
   Should I try upstream kernel instead of applying patch to the latest
   3.4 or it is useless?
  
   If you can upgrade to an upstream kernel, please do that.
  
 
  With vanilla 3.7.4 there is almost no changes, and NMI started firing
  again. External symptoms looks like following: starting from some
  count, may be third or sixth vm, qemu-kvm process allocating its
  memory very slowly and by jumps, 20M-200M-700M-1.6G in minutes. Patch
  helps, of course - on both patched 3.4 and vanilla 3.7 I`m able to
  kill stuck kvm processes and node returned back to the normal, when on
  3.2 sending SIGKILL to the process causing zombies and hanged ``ps''
  output (problem and workaround when no scheduler involved described
  here http://www.spinics.net/lists/kvm/msg84799.html).
 
  Try disabling pause loop exiting with ple_gap=0 kvm-intel.ko module 
  parameter.
 

 Hi Marcelo,

 thanks, this parameter helped to increase number of working VMs in a
 half of order of magnitude, from 3-4 to 10-15. Very high SY load, 10
 to 15 percents, persists on such numbers for a long time, where linux
 guests in same configuration do not jump over one percent even under
 stress bench. After I disabled HT, crash happens only in long runs and
 now it is kernel panic :)
 Stair-like memory allocation behaviour disappeared, but other symptom
 leading to the crash which I have not counted previously, persists: if
 VM count is ``enough'' for crash, some qemu processes starting to eat
 one core, and they`ll panic system after run in tens of minutes in
 such state or if I try to attach debugger to one of them. If needed, I
 can log entire crash output via netconsole, now I have some tail,
 almost the same every time:
 http://xdel.ru/downloads/btwin.png

 Yes, please log entire crash output, thanks.


 Here please, 3.7.4-vanilla, 16 vms, ple_gap=0:

 http://xdel.ru/downloads/oops-default-kvmintel.txt

Just an update: I was able to reproduce this on pure Linux VMs using
qemu-1.3.0 with the ``stress'' benchmark running in them; the panic
occurs at the start of a VM (with ten machines working at that
moment). Qemu-1.1.2 is generally not able to reproduce it, but the
host node with the older version crashes with fewer Windows VMs (three
to six instead of ten to fifteen) than with 1.3; please see the trace
below:

http://xdel.ru/downloads/oops-old-qemu.txt


Re: windows 2008 guest causing rcu_sched to emit NMI

2013-01-27 Thread Andrey Korolyov
On Sat, Jan 26, 2013 at 12:49 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Fri, Jan 25, 2013 at 10:45:02AM +0300, Andrey Korolyov wrote:
 On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti mtosa...@redhat.com wrote:
  On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
  Thank you Marcelo,
 
  Host node locking up sometimes later than yesterday, bur problem still
  here, please see attached dmesg. Stuck process looks like
  root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
  /usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
  virtio-blk-pci,? -device
 
  on fourth vm by count.
 
  Should I try upstream kernel instead of applying patch to the latest
  3.4 or it is useless?
 
  If you can upgrade to an upstream kernel, please do that.
 

 With vanilla 3.7.4 there is almost no changes, and NMI started firing
 again. External symptoms looks like following: starting from some
 count, may be third or sixth vm, qemu-kvm process allocating its
 memory very slowly and by jumps, 20M-200M-700M-1.6G in minutes. Patch
 helps, of course - on both patched 3.4 and vanilla 3.7 I`m able to
 kill stuck kvm processes and node returned back to the normal, when on
 3.2 sending SIGKILL to the process causing zombies and hanged ``ps''
 output (problem and workaround when no scheduler involved described
 here http://www.spinics.net/lists/kvm/msg84799.html).

 Try disabling pause loop exiting with ple_gap=0 kvm-intel.ko module parameter.


Hi Marcelo,

thanks, this parameter helped to increase the number of working VMs
by half an order of magnitude, from 3-4 to 10-15. A very high SY load,
10 to 15 percent, persists at such VM counts for a long time, whereas
Linux guests in the same configuration do not jump over one percent
even under a stress benchmark. After I disabled HT, the crash happens
only in long runs, and now it is a kernel panic :)
The stair-like memory allocation behaviour disappeared, but another
symptom leading to the crash, which I had not noticed previously,
persists: if the VM count is ``enough'' for a crash, some qemu
processes start to eat one core each, and they will panic the system
after running in that state for tens of minutes, or if I try to attach
a debugger to one of them. If needed, I can log the entire crash
output via netconsole; for now I have a tail, almost the same every
time:
http://xdel.ru/downloads/btwin.png
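For anyone wanting to try the same mitigation, a sketch of how ple_gap=0 can be applied, assuming an Intel host with the kvm_intel module and no VMs running during the reload; the modprobe.d file name is an illustrative placeholder:

```shell
# Reload kvm_intel with pause-loop exiting disabled (stop all VMs first):
modprobe -r kvm_intel
modprobe kvm_intel ple_gap=0
# Verify the active value:
cat /sys/module/kvm_intel/parameters/ple_gap
# Make it persistent across reboots via a modprobe.d fragment:
echo 'options kvm_intel ple_gap=0' > /etc/modprobe.d/kvm-intel-ple.conf
```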


Re: windows 2008 guest causing rcu_sched to emit NMI

2013-01-24 Thread Andrey Korolyov
Thank you Marcelo,

The host node now locks up somewhat later than yesterday, but the
problem is still here; please see the attached dmesg. The stuck
process looks like
root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
/usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
virtio-blk-pci,? -device

on the fourth VM by count.

Should I try an upstream kernel instead of applying the patch to the
latest 3.4, or would it be useless?

On Thu, Jan 24, 2013 at 4:52 AM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Tue, Jan 22, 2013 at 09:00:25PM +0300, Andrey Korolyov wrote:
 Hi,

 problem described in the title happens on heavy I/O pressure on the
 host, without idle=poll trace almost always is the same, involving
 mwait, with poll and nohz=off RIP varies from time to time, at the
 previous hang it was tg_throttle_down, rather than test_ti_thread_flag
 in attached one. Both possible clocksource drivers, hpet and tsc, able
 to reproduce that with equal probability. VMs are pinned over one of
 two numa sets on two-head machine, mean emulator thread and each of
 vcpu threads has its own cpuset cg with '0-5,12-17' or '6-11,18-23'.
 I`ll appreciate any suggestions to try.

 Andrey,

 Can you reproduce with an upstream kernel? Commit
 5cfc2aabcb282f fixes a livelock.

  d2 75 c3 eb 03 41 89 c6 48 83 c4 18 44 89 f0 5b 5d 41 5c 41 5d 41 5e 41
 5f c3 31 c0 c3 48 63 ff 48 c7 c2 80 37 01 00 48 8b 0c fd e0 d6 68 81
 [12738.508644] Call Trace:
 [12738.508648]  [81035a66] ? walk_tg_tree_from+0x70/0x99
 [12738.508652]  [81014c03] ? __switch_to_xtra+0x14c/0x160
 [12738.508656]  [8103bcce] ? throttle_cfs_rq+0x4d/0x109
 [12738.508660]  [8103be70] ? put_prev_task_fair+0x3f/0x65
 [12738.508663]  [8134c8ae] ? __schedule+0x32e/0x5c3
 [12738.508666]  [8134ceee] ? yield_to+0xfa/0x10c
 [12738.508669]  [8105d5af] ? atomic_inc+0x3/0x4
 [12738.508678]  [a03a8fc4] ? kvm_vcpu_on_spin+0x8c/0xf7 [kvm]
 [12738.508684]  [a030602f] ? handle_pause+0x11/0x18


dmesg.txt.gz
Description: GNU Zip compressed data


Re: windows 2008 guest causing rcu_sched to emit NMI

2013-01-24 Thread Andrey Korolyov
On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti mtosa...@redhat.com wrote:
 On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
 Thank you Marcelo,

 Host node locking up sometimes later than yesterday, bur problem still
 here, please see attached dmesg. Stuck process looks like
 root 19251  0.0  0.0 228476 12488 ?D14:42   0:00
 /usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
 virtio-blk-pci,? -device

 on fourth vm by count.

 Should I try upstream kernel instead of applying patch to the latest
 3.4 or it is useless?

 If you can upgrade to an upstream kernel, please do that.


With vanilla 3.7.4 there is almost no change, and the NMIs started
firing again. The external symptoms look like the following: starting
from some VM count, maybe the third or the sixth, the qemu-kvm process
allocates its memory very slowly and in jumps, 20M-200M-700M-1.6G over
minutes. The patch helps, of course: on both patched 3.4 and vanilla
3.7 I am able to kill stuck kvm processes and the node returns to
normal, whereas on 3.2 sending SIGKILL to the process produces zombies
and a hung ``ps'' output (the problem, and a workaround when no
scheduler is involved, are described here
http://www.spinics.net/lists/kvm/msg84799.html).


dmesg-3.7.4.txt.gz
Description: GNU Zip compressed data


windows 2008 guest causing rcu_sched to emit NMI

2013-01-22 Thread Andrey Korolyov
Hi,

the problem described in the title happens under heavy I/O pressure
on the host. Without idle=poll the trace is almost always the same,
involving mwait; with poll and nohz=off the RIP varies from time to
time - at the previous hang it was tg_throttle_down rather than the
test_ti_thread_flag in the attached one. Both possible clocksource
drivers, hpet and tsc, are able to reproduce this with equal
probability. The VMs are pinned over one of the two NUMA sets of a
two-socket machine, meaning the emulator thread and each of the vcpu
threads has its own cpuset cg with '0-5,12-17' or '6-11,18-23'.
I'll appreciate any suggestions to try.
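The per-thread cpuset layout described above can be sketched roughly like this, assuming a cgroup v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset; the group name "vm01" and the $QEMU_TID variable are placeholders, not taken from the original setup:

```shell
# Create a cpuset for one VM and pin it to the first NUMA set quoted above.
mkdir /sys/fs/cgroup/cpuset/vm01
echo '0-5,12-17' > /sys/fs/cgroup/cpuset/vm01/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/vm01/cpuset.mems  # memory node matching the cpu set
# Move the emulator thread and each vcpu thread into the group
# (thread IDs can be taken from /proc/<qemu-pid>/task):
echo "$QEMU_TID" > /sys/fs/cgroup/cpuset/vm01/tasks
```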


dmesg2.txt.gz
Description: GNU Zip compressed data


Proper taming of oom-killer with kvm

2013-01-10 Thread Andrey Korolyov
Hi,

I have recently run into the following issue: under certain
conditions, if the emulator process has exceeded its own memory limit
in its cgroup and the oom killer shot it, the /proc entry may stay
around indefinitely. There are two possible side effects. First, if
one tries to read cmdline from such an entry, the request will hang
indefinitely too; e.g. issuing ``ps aux'' once per minute will exhaust
the default PID limit with ps processes in D state in less than half a
day. The second effect may appear only on a heavily loaded node: the
scheduler will eat 100% of selected cores (almost always only one),
with the system becoming unresponsive within a couple of minutes.
This should be easy to reproduce:

- start a kvm process,
- put it into a memory cgroup and set the limits,
- disable the oom killer via oom_control,
- put the process into an oom condition (using the balloon, this
should be very simple),
- check the kvm process state: if it is stuck in D state, all should
be okay, since you are able to catch the oom condition - simply send a
TERM signal, raise the memory limit by an insignificant amount, and
the process will end normally.
If instead you observe the kvm process with the under_oom flag
triggered and in _sleep_ state, a TERM signal will kill it, producing
the nice lockup described above.

I have worked around the problem in a quite stupid way: after being
informed of the oom event (with oom disabled in the cg), I freeze the
kvm process via the freezer cg, thereby moving it to D state, then
send it TERM, then raise the memory limits and finally unfreeze it.
It is very ugly, but at least I got rid of the problem.
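The freeze/signal/unfreeze sequence can be sketched like this, assuming cgroup v1 memory and freezer hierarchies; the group name "vm01", the 2G limit and $KVM_PID are illustrative placeholders:

```shell
# oom_kill_disable was set beforehand:
#   echo 1 > /sys/fs/cgroup/memory/vm01/memory.oom_control
# On an oom notification:
echo FROZEN > /sys/fs/cgroup/freezer/vm01/freezer.state  # stop the process
kill -TERM "$KVM_PID"                                    # queue termination
echo $((2 * 1024 * 1024 * 1024)) > \
    /sys/fs/cgroup/memory/vm01/memory.limit_in_bytes     # clear the oom condition
echo THAWED > /sys/fs/cgroup/freezer/vm01/freezer.state  # deliver the signal
```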