[Bug 1788098] Comment bridged from LTC Bugzilla

2019-09-27 Thread bugproxy
--- Comment From p...@au1.ibm.com 2019-09-27 01:51 EDT---
(In reply to comment #73)
> Leonardo, can you elaborate on the 'other possible issues'? We're hesitant
> to pull 18 patches into a stable kernel under the assumption that they
> *might* fix some *potential* issues, without clear evidence. If you can test
> the single-patch kernel and report back that there are still issues then
> that's a much stronger case for the other patches.
>
> Commit 'KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix page
> fault' that you're asking for requires all these additional backports to
> apply cleanly. Which makes me wonder if we're not actually introducing a
> problem with these backports just to fix it again later. Not saying that's
> the case, just wondering...
>
> Also, the following seem to be totally unrelated and unnecessary:
> - KVM: PPC: Remove unused kvm_unmap_hva callback
> - powerpc/mm/radix: Remove unused code
>
> While looking through the patches I also noticed that the following is the
> second patch of a series of 11 but it's the only one from the series that
> you're backporting.
> - powerpc/kvm: Switch kvm pmd allocator to custom allocator
> Its commit message mentions subsequent patches of that series so I'm
> wondering why we need/want only this single patch??
>
> Remember that we have to support this kernel for years and years to come so
> we only want to backport the absolute necessary.
>
> Lastly and FYI, the following is the minimal subset of your patches that all
> cherry-pick cleanly:
> - KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix page fault
> - KVM: PPC: Book3S HV: Don't use compound_order to determine host mapping
> size
> - KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix()
> - KVM: PPC: Book3S HV: radix: Refine IO region partition scope attributes
> - KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot() in page fault handler
> - KVM: PPC: Book3S HV: Handle 1GB pages in radix page fault handler
> - KVM: PPC: Book3S HV: Streamline setting of reference and change bits
> - KVM: PPC: Book3S HV: Radix page fault handler optimizations
>
> Please provide some context why we need all the above (and potentially more).

OK, so these are the ones *not* included in the above list (oldest to
newest, with upstream commit IDs):

39c983ea0f96 KVM: PPC: Remove unused kvm_unmap_hva callback

This one is dead code removal, it can be dropped.

e2560b108fb1 KVM: PPC: Book3S HV: Make radix use correct tlbie sequence
in kvmppc_radix_tlbie_page

This one adds barriers which are required according to the architecture
specification. It is not strictly related to fixing this bug, but if not
included here, another bug should be raised to include it. It is quite
safe since it is just adding barrier instructions. Without it there is a
possibility of occasional mis-translation of addresses (though perhaps a
very small possibility). If another bug is raised for this patch,
include df158189dbcc below as well in the same bug.

7e3d9a1d0f2c KVM: PPC: Book3S HV: Make radix clear pte when unmapping

This fixes a real bug, though it is not strictly related to the bug in
this bugzilla. If it is not included here then another bug should be
raised to include it. It is a small, simple and safe change. Without it
there is a possibility of guests getting stuck doing continual
hypervisor page faults.

df158189dbcc KVM: PPC: Book 3S HV: Do ptesync in radix guest exit path

This one, like e2560b108fb1 above, adds barriers which are required
according to the architecture specification. It is not strictly related
to fixing this bug, but if not included here, another bug should be
raised to include it. It is quite safe since it is just adding barrier
instructions.

21828c99ee91 powerpc/kvm: Switch kvm pmd allocator to custom allocator

This one is not needed and can be dropped.

99491e2d0e50 powerpc/mm/radix: Remove unused code

This is dead code removal and can be dropped.

0078778a86b1 powerpc/mm/radix: implement LPID based TLB flushes to be
used by KVM

This is not strictly needed and can be dropped if d91cb39ffa7b and
9a4506e11b97 are being dropped.

a5fad1e95952 KVM: PPC: Book3S HV: Use a helper to unmap ptes in the
radix fault path

This is not strictly needed (code refactoring) and can be dropped.

a5704e83aa3d KVM: PPC: Book3S HV: Recursively unmap all page table
entries when unmapping

This one fixes a memory leak, so is not strictly related to this bug.
The memory leak will probably not be apparent unless users are using 1GB
huge pages to back guests.

d91cb39ffa7b KVM: PPC: Book3S HV: Make radix use the Linux translation
flush functions for partition scope

This is code refactoring and can be dropped.

9a4506e11b97 KVM: PPC: Book3S HV: Make radix handle process scoped LPID
flush in C, with relocation on

This is code refactoring and can be dropped.

878cf2bb2d8d KVM: PPC: Book3S HV: radix: Do not clear partition PTE when
RC or write bits do not match

This one is a 

[Bug 1788098] Comment bridged from LTC Bugzilla

2019-09-26 Thread bugproxy
--- Comment From mranw...@us.ibm.com 2019-09-26 15:42 EDT---
Re-opening on our side to test on 19.10. Everything should already be there,
but it would be good to confirm this in time to get any needed fixes into
20.04 as well. To be clear: at this point we don't need to target bionic,
only to validate on 19.10.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1788098

Title:
  Avoid migration issues with aligned 2MB THB

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1788098/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1788098] Comment bridged from LTC Bugzilla

2019-05-13 Thread bugproxy
--- Comment From mranw...@us.ibm.com 2019-05-13 17:04 EDT---
Adding Paul Mackerras. Paul, can you help with the context for the patches,
beyond the potential performance impact? We were picking up this series
because it fixes the migration problem, which appeared after adding a
performance patch for bug 169712. Thanks!


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-05-10 Thread bugproxy
--- Comment From leona...@ibm.com 2019-05-10 19:54 EDT---
Hello Juerg,

As this complete list was suggested by Paul, I think he may be the best
person to show the context of the patch series.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-04-29 Thread bugproxy
--- Comment From leona...@ibm.com 2019-04-29 18:50 EDT---
(In reply to comment #70)
> Leonardo, since you seem to have a reliable reproducer now, could you give
> this test kernel [1] a try? It just contains commit c066fafc595e ("KVM: PPC:
> Book3S HV: Use correct pagesize in kvm_unmap_radix()") and is basically what
> Joe gave you (comment #5) but at that time you weren't able to reproduce the
> issue.
>
> [1] https://kernel.ubuntu.com/~juergh/lp1788098/

Hello Juerg,

As you pointed out, this kernel has only one of the 19 patches in the series.
IMHO it wouldn't be very productive to test this kernel as is. It may well
work just fine, but it doesn't have the complete solution to these problems.
The kernel with the whole patch series has already been tested, and it solves
many other possible issues.

But if you think it's really important to test this one, I will try to
schedule it for testing ASAP.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-04-10 Thread bugproxy
--- Comment From leona...@ibm.com 2019-04-10 12:08 EDT---
(In reply to comment #68)
> In comment #22 above, it states that "In a meeting with lagarcia, I was
> informed this patch is very important, and that it is already on kernel
> 4.18-15 onwards."
>
> So, I assume that the required patchset(s) are already applied to the 18.04
> HWE kernel, and this bug requests a backport to the bionic 4.15 kernel.
>
> Next step is for the Canonical kernel team to analyse this backport request,
> dropping the commit fb1522e099f0 ("KVM: update to new mmu_notifier semantic
> v2", 2017-08-31), to assess whether it can be SRU'ed into the bionic 4.15
> kernel.

I may be wrong, but I believe the patch to be dropped is "KVM: PPC: Remove
unused kvm_unmap_hva callback" (7fe24f427a09).

Its commit message says it removes code that has been dead since commit
fb1522e099f0.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-04-10 Thread bugproxy
--- Comment From p...@au1.ibm.com 2019-04-10 04:10 EDT---
(In reply to comment #66)
> Stefan NACK'ed the series. For some unknown reason that email did not make
> it into the archive so here is its content:
>
> > Since commit fb1522e099f0 ("KVM: update to new mmu_notifier semantic
> > v2", 2017-08-31), the MMU notifier code in KVM no longer calls the
> > kvm_unmap_hva callback.  This removes the PPC implementations of
> > kvm_unmap_hva().
>
> This is not really the way SRUs should be done. We cannot remove support for
> interfaces after release. Also the amount of change as a requisite should be
> kept as minimal as possible. This just feels like too many changes without a
> strong argument on why this must be done that way.
>
> -Stefan

Well it was just removing dead code, but whatever.

The series should be fine without that patch.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-04-08 Thread bugproxy
--- Comment From leona...@ibm.com 2019-04-08 10:37 EDT---
(In reply to comment #64)
> Hi Leonardo,
> unfortunately there was an issue with the SRU request and Juerg NACK-ed it,
> please have a look here:
> https://lists.ubuntu.com/archives/kernel-team/2019-March/099128.html
> Please re-submit the SRU request with the requested corrections.

The email you linked is from March 10 and is outdated. The required changes
were made and the series was acked on March 13, as noted in the previous
comment.

Please see
https://lists.ubuntu.com/archives/kernel-team/2019-March/099221.html


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-03-26 Thread bugproxy
--- Comment From leona...@ibm.com 2019-03-26 13:12 EDT---
Updating:

The patchset was acked by Juerg Haefliger on Mar 13.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-03-14 Thread bugproxy
--- Comment From leona...@ibm.com 2019-03-14 15:47 EDT---
Patchset SRU

[Impact]
* VMs have a high chance of hitting guest migration issues if more than one
guest migration happens at a time while using THP on ppc64le.

* Migrating VMs in parallel will cause at least one guest to crash about
half the time. Since VM migration is an upgrade/uptime strategy, this has
a fairly large customer impact.

* The uploaded patches correct the behavior of THP on guests. They are
available from v4.18.x onwards.

[Test Case]

* One can reproduce the bug by running two guest migrations at the same
time, following the instructions in comment 12:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1788098/comments/12
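
A minimal sketch of starting two migrations at roughly the same time and
counting how many failed. It assumes each migration can be triggered by a
command of your own; the script names in the usage comment are hypothetical
(elsewhere in this bug the migration is started by hand with
"migrate -d tcp:0:5801" in each QEMU monitor).

```shell
#!/bin/sh
# run_parallel CMD1 CMD2
# Start both commands in the background, wait for each one,
# and print the number of commands that exited non-zero.
run_parallel() {
    sh -c "$1" &
    pid1=$!
    sh -c "$2" &
    pid2=$!

    failed=0
    wait "$pid1" || failed=$((failed + 1))
    wait "$pid2" || failed=$((failed + 1))
    echo "$failed"
}

# Usage (hypothetical wrapper scripts, one per guest):
#   run_parallel ./migrate_vm1.sh ./migrate_vm2.sh
```

A result of 0 means both commands exited cleanly; anything else is worth
inspecting on the destination guests.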

[Regression Potential]

* These patches have been in linux-stable since v4.18.15 (and are also in
the HWE kernel), so the chance of regression is low.

8afc7da95a7e [Bionic] KVM: PPC: Book3S HV: Avoid crash from THP collapse during 
radix page fault
82f7758a9c99 [Bionic] KVM: PPC: Book3S HV: Don't use compound_order to 
determine host mapping size
b0f7664dc993 [Bionic] KVM: PPC: Book3S HV: Use correct pagesize in 
kvm_unmap_radix()
1991612ab005 [Bionic] KVM: PPC: Book3S HV: radix: Do not clear partition PTE 
when RC or write bits do not match
04fea11aa5fe [Bionic] KVM: PPC: Book3S HV: radix: Refine IO region partition 
scope attributes
9037e89d8093 [Bionic] KVM: PPC: Book3S HV: Make radix handle process scoped 
LPID flush in C, with relocation on
ed0a86a433c7 [Bionic] KVM: PPC: Book3S HV: Make radix use the Linux translation 
flush functions for partition scope
0effe5dc3cf4 [Bionic] KVM: PPC: Book3S HV: Recursively unmap all page table 
entries when unmapping
42cbaef5361b [Bionic] KVM: PPC: Book3S HV: Use a helper to unmap ptes in the 
radix fault path
414207e08540 [Bionic] powerpc/mm/radix: implement LPID based TLB flushes to be 
used by KVM
eb2a70df7099 [Bionic] powerpc/mm/radix: Remove unused code
ad052e60a417 [Bionic] powerpc/kvm: Switch kvm pmd allocator to custom allocator
bb2c03e387f4 [Bionic] KVM: PPC: Book 3S HV: Do ptesync in radix guest exit path
699642e0a4f8 [Bionic] KVM: PPC: Book3S HV: Make radix clear pte when unmapping
297755f60b17 [Bionic] KVM: PPC: Book3S HV: Make radix use correct tlbie 
sequence in kvmppc_radix_tlbie_page
d5f5570b7df4 [Bionic] KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot() in page 
fault handler
b0adb3223100 [Bionic] KVM: PPC: Book3S HV: Handle 1GB pages in radix page fault 
handler
5be468e7408b [Bionic] KVM: PPC: Book3S HV: Streamline setting of reference and 
change bits
860816ea1680 [Bionic] KVM: PPC: Book3S HV: Radix page fault handler 
optimizations
7fe24f427a09 [Bionic] KVM: PPC: Remove unused kvm_unmap_hva callback


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-03-12 Thread bugproxy
--- Comment From leona...@ibm.com 2019-03-12 14:52 EDT---
(In reply to comment #60)
The patches were sent to Ubuntu kernel-team mailing list.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-02-26 Thread bugproxy
--- Comment From leona...@ibm.com 2019-02-26 12:58 EDT---
Here are the patches:

https://gitlab.com/LeoBras/bionic/compare/master...lp1788098

Also, I attached a tgz with the patches.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-02-26 Thread bugproxy
--- Comment From leona...@ibm.com 2019-02-26 12:36 EDT---
(In reply to comment #41)
> ...or perhaps I've misunderstood. Are the patches listed in comment #23 the
> complete set required to resolve the issue (with no complex backporting
> required)?

Yes, the patches listed by Paul are the only ones required to fix the
issue.

As noted by Paul, there is only one patch that causes some conflict.
I have solved this conflict and I will soon attach the full patch series.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-02-25 Thread bugproxy
--- Comment From leona...@ibm.com 2019-02-25 18:35 EDT---
I cherry-picked all patches on top of ubuntu-bionic (Ubuntu-4.15.0-45.48).

Then, the next step was trying to find a way to reproduce the bug.

After several tests I found that Michael Ranweiler's earlier suggestion was
valid, but its reproduction rate is only about 50%. Since I had previously
tested only a few times, I had not been able to reproduce it.

How it fails:
During the second part of 'memtest', on a 'migrated to' guest, one of the
migrations (which run in parallel) exits with a "Segmentation Fault" and the
test does not complete its normal flow.
(It never reaches the puts part.)

After applying the kernel patches, it seems to work fine every time (I have
tested 10+ times by now).
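
As a rough check on how much confidence a run of clean tests gives: assuming
the runs are independent and an unpatched kernel fails about half the time
(the ~50% rate mentioned above), the chance of seeing no failure purely by
luck can be computed as below. The helper name p_all_pass is made up for
illustration.

```shell
#!/bin/sh
# p_all_pass RATE RUNS
# Probability that RUNS independent runs all pass when each run
# fails with probability RATE (assumption: runs are independent).
p_all_pass() {
    LC_ALL=C awk -v rate="$1" -v runs="$2" \
        'BEGIN { printf "%.6f\n", (1 - rate) ^ runs }'
}

p_all_pass 0.5 10   # prints 0.000977
p_all_pass 0.5 3    # prints 0.125000
```

Under these assumptions, 10 clean runs on an unpatched kernel would happen
by chance about 0.1% of the time, while 3 clean runs would happen 12.5% of
the time, which is consistent with a handful of early attempts failing to
reproduce the problem.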

The kernel debs generated by the build can be downloaded from the link
below:

ftp://testcase.software.ibm.com/fromibm/linux/patched_kernel.tar.gz
- Please use user=anonymous, passwd=anonymous if asked
- Make sure to download it soon, as the link will be available for 3 business 
days.

Building info:
command: fakeroot debian/rules binary-generic binary-perarch
git repo (before patches) : git://kernel.ubuntu.com/ubuntu/ubuntu-bionic.git
(tag: Ubuntu-4.15.0-45.48)


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-02-19 Thread bugproxy
--- Comment From p...@au1.ibm.com 2019-02-19 20:52 EDT---
(In reply to comment #34)
> In a meeting with lagarcia, I was informed this patch is very important, and
> that it is already on kernel 4.18-15 onwards.
>
> In fact, including this one, there are two important patches on this subject:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next=c066fafc595eef5ae3c83ae3a8305956b8c3ef15
> https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next=6579804c431712d56956a63b1a01509441cc6800

To get those you will need to cherry-pick the following patches from
upstream:

39c983ea0f96 KVM: PPC: Remove unused kvm_unmap_hva callback
c4c8a7643e74 KVM: PPC: Book3S HV: Radix page fault handler optimizations
f7caf712d885 KVM: PPC: Book3S HV: Streamline setting of reference and change 
bits
58c5c276b4c2 KVM: PPC: Book3S HV: Handle 1GB pages in radix page fault handler
31c8b0d0694a KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot() in page fault 
handler
e2560b108fb1 KVM: PPC: Book3S HV: Make radix use correct tlbie sequence in 
kvmppc_radix_tlbie_page
7e3d9a1d0f2c KVM: PPC: Book3S HV: Make radix clear pte when unmapping
df158189dbcc KVM: PPC: Book 3S HV: Do ptesync in radix guest exit path
21828c99ee91 powerpc/kvm: Switch kvm pmd allocator to custom allocator
99491e2d0e50 powerpc/mm/radix: Remove unused code
0078778a86b1 powerpc/mm/radix: implement LPID based TLB flushes to be used by 
KVM (note that this one will generate some conflicts)
a5fad1e95952 KVM: PPC: Book3S HV: Use a helper to unmap ptes in the radix fault 
path
a5704e83aa3d KVM: PPC: Book3S HV: Recursively unmap all page table entries when 
unmapping
d91cb39ffa7b KVM: PPC: Book3S HV: Make radix use the Linux translation flush 
functions for partition scope
9a4506e11b97 KVM: PPC: Book3S HV: Make radix handle process scoped LPID flush 
in C, with relocation on
bc64dd0e1c4e KVM: PPC: Book3S HV: radix: Refine IO region partition scope 
attributes
878cf2bb2d8d KVM: PPC: Book3S HV: radix: Do not clear partition PTE when RC or 
write bits do not match
c066fafc595e KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix()
71d29f43b633 KVM: PPC: Book3S HV: Don't use compound_order to determine host 
mapping size
6579804c4317 KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix 
page fault
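
A sketch of how a list like this could be applied in order, assuming it is
run inside a checkout of the target tree with the upstream commits already
fetched. pick_series and the GIT override are illustrative names, not part
of any existing tooling; "git cherry-pick -x" is the standard option that
records the upstream commit ID in the new commit message.

```shell
#!/bin/sh
# GIT can be overridden (e.g. GIT=echo) for a dry run; this is an
# assumption of this sketch, not a git feature.
GIT=${GIT:-git}

# pick_series COMMIT...
# Cherry-pick each commit in the order given and stop at the first
# one that does not apply cleanly.
pick_series() {
    for commit in "$@"; do
        if ! "$GIT" cherry-pick -x "$commit"; then
            echo "stopped at $commit: resolve the conflict, then run" \
                 "'git cherry-pick --continue'" >&2
            return 1
        fi
    done
}

# Usage, with the first IDs from the list above:
#   pick_series 39c983ea0f96 c4c8a7643e74 f7caf712d885
```

Stopping at the first failure keeps the tree in the state where the
conflicting patch (0078778a86b1 is flagged above as one that will conflict)
can be fixed by hand before continuing.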


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-02-08 Thread bugproxy
--- Comment From leona...@ibm.com 2019-02-08 14:05 EDT---
In a meeting with lagarcia, I was informed this patch is very important, and 
that it is already on kernel 4.18-15 onwards.

In fact, including this one, there are two important patches on this
subject:

https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next=c066fafc595eef5ae3c83ae3a8305956b8c3ef15
https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next=6579804c431712d56956a63b1a01509441cc6800

As I said before, for 18.10 onwards (kernel >= 4.18) the patches are
available from the upstream kernel source, but for Ubuntu 18.04 they may
not apply so easily.

So I will work on backporting them to v4.15.


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-01-24 Thread bugproxy
--- Comment From leona...@ibm.com 2019-01-24 09:34 EDT---
Judging by the test results, the problem does not seem to reproduce.

Are there any other suggestions for reproducing it?


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-01-24 Thread bugproxy
--- Comment From leona...@ibm.com 2019-01-24 09:26 EDT---
At Michael Ranweiler's suggestion, I ran some concurrent migration tests.
I repeated the procedure used before, but ran it twice at roughly the same
time (in parallel).

The results are attached.
Migration 1:  from1.txt to1.txt
Migration 2:  from2.txt to2.txt


[Bug 1788098] Comment bridged from LTC Bugzilla

2019-01-04 Thread bugproxy
--- Comment From leona...@ibm.com 2019-01-04 14:29 EDT---
Test: Verify all memory after migration

###
Host:
###

# uname -a
Linux host 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

#cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

#cat /proc/cpuinfo
[...]
processor   : 159
cpu : POWER9, altivec supported
clock   : 2300.00MHz
revision: 2.2 (pvr 004e 1202)

timebase: 51200
platform: PowerNV
model   : 8375-42A
machine : PowerNV 8375-42A
firmware: OPAL
MMU : Radix

As before, I built QEMU 3.1.0 and made sure the patch that enables THP was
included:
#../configure \
 --target-list=ppc-linux-user,ppc64-linux-user,ppc64le-linux-user,ppc-softmmu,ppc64-softmmu \
 --enable-debug-info --enable-trace-backends=log --python=/usr/bin/python3 && \
make -j $(nproc)

#./ppc-softmmu/qemu-system-ppc -version
QEMU emulator version 3.1.0 (v3.1.0-dirty)

###
Guest:
###

### CLI 1:  Migrating from:
MALLOC_PERTURB_=1 /home/leonardo/qemu/build/ppc64-softmmu/qemu-system-ppc64 \
-nographic \
-serial mon:stdio \
-name 'avocado-vt-vm1'  \
-machine pseries  \
-nodefaults  \
-vga std \
-device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x3,chassis_nr=1  \
-device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=0x4 \
-object rng-random,filename=/dev/random,id=passthrough-RHq4nIpF \
-device virtio-rng-pci,id=virtio-rng-pci-aXCni2OX,rng=passthrough-RHq4nIpF,bus=pci.0,addr=0x5 \
-device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x6 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x7 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/leonardo/images/ubuntu-18.04-ppc64le.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-m 8192  \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
-vnc :0  \
-rtc base=utc,clock=host  \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm  \
-watchdog i6300esb \
-watchdog-action reset \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9 \
-initrd /boot/initrd.img-4.15.0-20-generic \
-kernel /boot/vmlinux-4.15.0-20-generic \
-append "root=UUID=b4ef9412-06d6-4947-9969-f15c7cc2c986 ro quiet splash"

### CLI 2:  Migrating To
Copy of CLI 1, changing:

- -name 'avocado-vt-vm1'  \
+ -name 'avocado-vt-vm2'  \
+ -S
- -vnc :0  \
+ -vnc :1  \
+ -incoming tcp:0:5801 \

### Inside Guest:

#uname -a
Linux localhost 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

#cat /proc/cpuinfo
processor   : 3
cpu : POWER9 (architected), altivec supported
clock   : 2900.00MHz
revision: 2.2 (pvr 004e 1202)

timebase: 51200
platform: pSeries
model   : IBM pSeries (emulated by qemu)
machine : CHRP IBM pSeries (emulated by qemu)
MMU : Radix

###
Test Software:
###
I wrote a simple C program that:
- allocates 2MB blocks,
- writes data from urandom to them,
- computes the md5sum of all the blocks together,
- stops, allowing migration,
- recomputes the md5sum of everything,
- frees the blocks.

The attached source file is copied to the guest, then compiled:
#gcc -o memtest memtest.c -lcrypto

###
Procedure
###

Use the CLI commands above to bring up the "Migrate from" and "Migrate to"
guests.

On "Migrate from":
root@localhost:~# ./memtest
Block 0
Block 128
[...]
Block 3968
Allocated 4075 blocks of 2097152 size.
Md5 = 209a63b9c1f9acd13fa32236229daa9b 
Press enter key to check memory integrity

[1]+  Stopped ./memtest
root@localhost:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           8.0G        7.7G        246M         64K         21M         37M
Swap:          758M        758M          0B

- Enter Qemu Monitor: 
QEMU 3.1.0 monitor - type 'help' for more information
(qemu) migrate -d tcp:0:5801

(qemu) info status
VM status: paused (postmigrate)
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off 
compress: off events: off
postcopy-ram: off x-colo: off release-ram: off block: off return-path: off 
pause-before-switchover: off
x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: 
off
Migration status: completed
total time: 248950 milliseconds
downtime: 112 milliseconds
setup: 18 milliseconds
transferred ram: 9847781 kbytes
throughput: 269.52 mbps
remaining ram: 0 kbytes
total ram: 8405056 kbytes
duplicate: 143398 pages
skipped: 0 pages
normal: 2456826 pages
normal 

[Bug 1788098] Comment bridged from LTC Bugzilla

2019-01-04 Thread bugproxy
--- Comment From leona...@ibm.com 2019-01-04 06:12 EDT---
I have tried the following test in order to reproduce the bug:

##
root@localhost:~# uname -a
Linux localhost 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
root@localhost:~# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
##

dd if=/dev/urandom of=/dev/shm/img bs=2M  count=2000
md5sum /dev/shm/img > test.md5

After the migration, I ran:
md5sum -c test.md5
The result was OK (memory not corrupted).

I also modified the test above to allocate separate 2M chunks:

for i in {0001..2000} ; do dd if=/dev/urandom of=/dev/shm/img_${i} bs=2M count=1 ; done
md5sum /dev/shm/* > test.md5

After the migration, I ran:
md5sum -c test.md5
The result was OK for every file (memory not corrupted).

Conclusion:
- I found no difference between the patched and unpatched kernels during the tests.
- The memory seems fine after the migration: the files return the same contents (verified with md5sum).

Are there any other suggestions on how to reproduce the bug?

Thanks!

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1788098

Title:
  Avoid migration issues with aligned 2MB THB

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1788098/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1788098] Comment bridged from LTC Bugzilla

2018-12-21 Thread bugproxy
--- Comment From leona...@ibm.com 2018-12-21 12:10 EDT---
Hello,

I have been trying to reproduce this bug all week, but I could not do
so on Ubuntu.

Could anyone check what I might be doing wrong?

#

## QEMU

I built QEMU 3.1.0 and made sure the patch that enables THP was included:
../configure \
  --target-list=ppc-linux-user,ppc64-linux-user,ppc64le-linux-user,ppc-softmmu,ppc64-softmmu \
  --enable-debug-info --enable-trace-backends=log --python=/usr/bin/python3 \
  && make -j $(nproc)

./ppc-softmmu/qemu-system-ppc -version
QEMU emulator version 3.1.0 (v3.1.0-dirty)

## Kernel

uname -a
Linux NAME 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
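
For scripting this check, the active THP mode is the bracketed word in that file. A minimal sketch for extracting it follows; the helper name `thp_mode` and its optional path argument are hypothetical and only added for illustration.

```shell
# Sketch only: the helper name and optional argument are assumptions.

# thp_mode [FILE]
# Prints the bracketed (active) mode from a transparent_hugepage
# "enabled" file, e.g. "always" for "[always] madvise never".
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Example: refuse to run the test unless THP is set to "always".
#   [ "$(thp_mode)" = "always" ] || echo "THP is not set to always" >&2
```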

## CLI command

Both commands were run on the same host: (1) is the "migrating from"
instance and (2) is the "migrate to" instance.

(1)
MALLOC_PERTURB_=1 /home/leonardo/qemu/build/ppc64-softmmu/qemu-system-ppc64 \
-nographic \
-serial mon:stdio \
-S  \
-name 'avocado-vt-vm1'  \
-machine pseries  \
-nodefaults  \
-vga std \
-device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x3,chassis_nr=1  \
-device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=0x4 \
-object rng-random,filename=/dev/random,id=passthrough-RHq4nIpF \
-device virtio-rng-pci,id=virtio-rng-pci-aXCni2OX,rng=passthrough-RHq4nIpF,bus=pci.0,addr=0x5 \
-device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x6 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x7 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/leonardo/images/ubuntu-18.04-ppc64le.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-m 8192  \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
-vnc :0  \
-rtc base=utc,clock=host  \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm  \
-watchdog i6300esb \
-watchdog-action reset \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9

(2) Same as above, with only a few changes:
- -name 'avocado-vt-vm1'  \
+ -name 'avocado-vt-vm2'  \
- -vnc :0  \
+ -vnc :1  \
+ -incoming tcp:0:5801 \

## Testing and Results

(1) On the guest:
# stress --io 5 --cpu 4
stress: info: [812] dispatching hogs: 4 cpu, 5 io, 0 vm, 0 hdd

(1) On the QEMU terminal:
(qemu) migrate_set_speed 256
(qemu) migrate -d tcp:0:5801
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off block: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off
Migration status: completed
total time: 1776 milliseconds
downtime: 61 milliseconds
setup: 9 milliseconds
transferred ram: 422571 kbytes
throughput: 1964.89 mbps
remaining ram: 0 kbytes
total ram: 8405056 kbytes
duplicate: 2006371 pages
skipped: 0 pages
normal: 101037 pages
normal bytes: 404148 kbytes
dirty sync count: 3
page size: 4 kbytes
(qemu) info status
VM status: paused (postmigrate)

It is all over in ~2 seconds, with no issues. Stress keeps running on the
new machine (after cont).
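
As a sanity check on the counters above, "normal bytes" should equal "normal" pages times "page size": 101037 pages * 4 kbytes = 404148 kbytes, which matches the reported value. A small sketch that performs the same check on saved `info migrate` output follows; the helper name `check_normal_bytes` is hypothetical and it assumes the field labels shown above.

```shell
# Sketch only: the helper name is an assumption; field labels are the
# ones printed by "info migrate" in the output above.

# check_normal_bytes < info-migrate-output
# Returns 0 if normal pages * page size equals the reported normal bytes.
check_normal_bytes() {
    awk -F': *' '
        $1 == "normal"       { pages  = $2 + 0 }   # e.g. "101037 pages"
        $1 == "normal bytes" { kbytes = $2 + 0 }   # e.g. "404148 kbytes"
        $1 == "page size"    { psize  = $2 + 0 }   # e.g. "4 kbytes"
        END {
            expected = pages * psize
            printf "normal pages * page size = %d kbytes, reported %d kbytes\n", expected, kbytes
            exit (expected == kbytes ? 0 : 1)
        }'
}

# Example: save the monitor output to a file, then run
#   check_normal_bytes < info-migrate.txt
```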

###

Other QEMU versions tested, with the same result:
v2.12 git
v3.0.0 git
Debian 1:2.12+dfsg-3ubuntu8

Other host kernels tested, with the same result:
4.18.0 - Vanilla, no patch
4.15.0-42-generic
4.15.0-42-generic + patch
4.15.0-32-generic (provided by jsalisbury)
4.15.0-20-generic
4.15.0 - Vanilla, no patch
