Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command
Thanks to all for the suggestions and notes. As Andrew Cooper noticed, the current approach is too oversimplified, and as Tamas K Lengyel noticed, the effect could be negligible and some OS-specific logic would need to be present. So, as of today, we can drop the patch.

20.03.2023 19:32, Ковалёв Сергей writes:

gva_to_gfn command used for fast address translation in LibVMI project.
With such a command it is possible to perform address translation in a
single call instead of a series of queries to get every page table.

Thanks to Dmitry Isaykin for involvement.

Signed-off-by: Sergey Kovalev
---
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: "Roger Pau Monné"
Cc: Wei Liu
Cc: George Dunlap
Cc: Julien Grall
Cc: Stefano Stabellini
Cc: Tamas K Lengyel
Cc: xen-devel@lists.xenproject.org
---
 xen/arch/x86/domctl.c       | 17 +++++++++++++++++
 xen/include/public/domctl.h | 13 +++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 2118fcad5d..0c9706ea0a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1364,6 +1364,23 @@ long arch_do_domctl(
         copyback = true;
         break;

+    case XEN_DOMCTL_gva_to_gfn:
+    {
+        uint64_t ga = domctl->u.gva_to_gfn.addr;
+        uint64_t cr3 = domctl->u.gva_to_gfn.cr3;
+        struct vcpu *v = d->vcpu[0];
+        uint32_t pfec = PFEC_page_present;
+        unsigned int page_order;
+
+        uint64_t gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec, &page_order);
+        domctl->u.gva_to_gfn.addr = gfn;
+        domctl->u.gva_to_gfn.page_order = page_order;
+        if ( __copy_to_guest(u_domctl, domctl, 1) )
+            ret = -EFAULT;
+
+        break;
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 51be28c3de..628dfc68fd 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -948,6 +948,17 @@ struct xen_domctl_paging_mempool {
     uint64_aligned_t size; /* Size in bytes. */
 };

+/*
+ * XEN_DOMCTL_gva_to_gfn.
+ *
+ * Translate a guest virtual address to a guest physical address.
+ */
+struct xen_domctl_gva_to_gfn {
+    uint64_aligned_t addr;
+    uint64_aligned_t cr3;
+    uint64_aligned_t page_order;
+};
+
 #if defined(__i386__) || defined(__x86_64__)
 struct xen_domctl_vcpu_msr {
     uint32_t index;
@@ -1278,6 +1289,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vmtrace_op                    84
 #define XEN_DOMCTL_get_paging_mempool_size       85
 #define XEN_DOMCTL_set_paging_mempool_size       86
+#define XEN_DOMCTL_gva_to_gfn                    87
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1340,6 +1352,7 @@ struct xen_domctl {
         struct xen_domctl_vuart_op          vuart_op;
         struct xen_domctl_vmtrace_op        vmtrace_op;
         struct xen_domctl_paging_mempool    paging_mempool;
+        struct xen_domctl_gva_to_gfn        gva_to_gfn;
         uint8_t                             pad[128];
     } u;
 };

--
Best regards, Sergey Kovalev
Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command
21.03.2023 2:34, Tamas K Lengyel writes:
> On Mon, Mar 20, 2023 at 3:23 PM Ковалёв Сергей <va...@list.ru> wrote:
>> 21.03.2023 1:51, Tamas K Lengyel wrote:
>>> On Mon, Mar 20, 2023 at 12:32 PM Ковалёв Сергей <va...@list.ru> wrote:
>>>> gva_to_gfn command used for fast address translation in LibVMI
>>>> project. With such a command it is possible to perform address
>>>> translation in a single call instead of a series of queries to get
>>>> every page table.
>>>
>>> You have a couple assumptions here:
>>> - Xen will always have a direct map of the entire guest memory - there
>>> are already plans to move away from that. Without that this approach
>>> won't have any advantage over doing the same mapping by LibVMI
>>
>> Thanks! I didn't know about the plan. Though I use this patch
>> back-ported into 4.16.
>>
>>> - LibVMI has to map every page for each page table for every lookup -
>>> you have to do that only for the first, afterwards the pages on which
>>> the pagetable is are kept in a cache and subsequent lookups would be
>>> actually faster than having to do this domctl since you can keep being
>>> in the same process instead of having to jump to Xen.
>>
>> Yes, I know about the page cache. But I have faced several issues with
>> a cache like this one: https://github.com/libvmi/libvmi/pull/1058 .
>> So I had to disable the cache.
>
> The issue you linked to is an issue with a stale v2p cache, which is a
> virtual TLB. The cache I talked about is the page cache, which is just
> maintaining a list of the pages that were accessed by LibVMI for future
> accesses. You can have one and not the other (ie. ./configure
> --disable-address-cache --enable-page-cache).
>
> Tamas

Thanks. I know about the page cache, though I'm not familiar with it
closely enough. As far as I understand, the page cache implementation in
LibVMI currently looks like this:

1. Call sequence: vmi_read > vmi_read_page > driver_read_page >
   xen_read_page > memory_cache_insert ..> get_memory_data >
   xen_get_memory > xen_get_memory_pfn > xc_map_foreign_range
2. This is perfectly valid while the guest OS keeps the page there. And
   physical pages are always there.
3. To renew the cache, the "age_limit" counter is used.
4. In the Xen driver implementation in LibVMI, the "age_limit" is
   disabled.
5. It is also possible to invalidate the cache with "xen_write" or
   "vmi_pagecache_flush". But that is not used.
6. The other way to keep the cache from growing too large is the cache
   size limit: on every insert, half of the cache is dropped on size
   overflow.

So the only thing we need to know is a valid mapping of guest virtual
address to guest physical address. And the slow paths are:

1. The first traversal of a new page table set, e.g. for a new process.
2. Or a new subset of page tables for a known process.
3. Subsequent page access after the cache is cleared on size overflow.

Am I right?

The main idea behind the patch:

1. For the very first lookup, the translation would be done faster with a
   hypercall.
2. For subsequent calls, a v2p translation cache could be used (used in
   my current work in LibVMI).
3. To avoid errors from a stale cache, the v2p cache could be invalidated
   on every event (VMI_FLUSH_RATE = 1).

--
Best regards, Sergey Kovalev
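The drop-half-on-overflow policy from point 6 above can be sketched in a few lines. This is a toy, hypothetical cache for illustration only — the names (`cache_insert`, `cache_lookup`, `CACHE_LIMIT`) and the fixed-size array are invented here and are not LibVMI's actual implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LIMIT 8           /* toy size limit; LibVMI's is configurable */

struct entry {
    uint64_t gfn;               /* guest frame number used as the key */
    int      valid;
    char     page[16];          /* stand-in for a mapped 4 KiB page */
};

static struct entry cache[CACHE_LIMIT];
static unsigned int count;

static struct entry *cache_lookup(uint64_t gfn)
{
    for ( unsigned int i = 0; i < count; i++ )
        if ( cache[i].valid && cache[i].gfn == gfn )
            return &cache[i];
    return NULL;
}

static void cache_insert(uint64_t gfn, const char *data)
{
    if ( count == CACHE_LIMIT )
    {
        /* Size overflow: drop the older half, keep the newer half. */
        memmove(cache, cache + CACHE_LIMIT / 2,
                (CACHE_LIMIT / 2) * sizeof(cache[0]));
        count = CACHE_LIMIT / 2;
    }
    cache[count].gfn = gfn;
    cache[count].valid = 1;
    strncpy(cache[count].page, data, sizeof(cache[count].page) - 1);
    count++;
}
```

The slow path in point 3 follows directly from this policy: any GFN that landed in the dropped half must be re-mapped (or re-translated) on its next access.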
Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command
20.03.2023 22:07, Andrew Cooper writes:
> On 20/03/2023 4:32 pm, Ковалёв Сергей wrote:
>> gva_to_gfn command used for fast address translation in LibVMI
>> project. With such a command it is possible to perform address
>> translation in a single call instead of a series of queries to get
>> every page table.
>>
>> Thanks to Dmitry Isaykin for involvement.
>>
>> Signed-off-by: Sergey Kovalev
>
> I fully appreciate why you want this hypercall, and I've said several
> times that libvmi wants something better than it has, but...
>
>> +    case XEN_DOMCTL_gva_to_gfn:
>> +    {
>> +        uint64_t ga = domctl->u.gva_to_gfn.addr;
>> +        uint64_t cr3 = domctl->u.gva_to_gfn.cr3;
>> +        struct vcpu *v = d->vcpu[0];
>
> ... this isn't safe if you happen to issue this hypercall too early in
> a domain's lifecycle. If nothing else, you want to do a domain_vcpu()
> check and return -ENOENT in the failure case.

Thanks!

> More generally, issuing the hypercall under vcpu0 isn't necessarily
> correct. It is common for all vCPUs to have equivalent paging settings,
> but e.g. Xen transiently disables CR4.CET and CR0.WP in order to make
> self-modifying code changes. Furthermore, the setting of CR4.{PAE,PSE}
> determines reserved bits, so you can't even ignore the access rights
> and hope that the translation works out correctly.

Thanks! I didn't think about such things earlier. I need to think this
over carefully now.

> Ideally we'd have a pagewalk algorithm which didn't require taking a
> vcpu, and instead just took a set of paging configuration, but it is
> all chronically entangled right now.

Do you mean to add a new implementation of "paging_ga_to_gfn_cr3"?

> I think, at a minimum, you need to take a vcpu_id as an input, but I
> suspect to make this a usable API you want an altp2m view id too.

Why should we consider altp2m while translating a guest virtual address
to a guest physical one?

> Also, I'm pretty sure this is only safe for a paused vCPU. If the vCPU
> isn't paused, then there's a TOCTOU race in the pagewalk code when
> inspecting control registers.

Thanks! Should we pause the domain?

>> +        uint32_t pfec = PFEC_page_present;
>> +        unsigned int page_order;
>> +
>> +        uint64_t gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec,
>> +                                            &page_order);
>> +        domctl->u.gva_to_gfn.addr = gfn;
>> +        domctl->u.gva_to_gfn.page_order = page_order;
>
> page_order is only not stack rubble if gfn is different to INVALID_GFN.

Sorry, but I don't understand "is only not stack rubble". Do you mean
that I should initialize "page_order" when defining it?

>> +        if ( __copy_to_guest(u_domctl, domctl, 1) )
>> +            ret = -EFAULT;
>
> You want to restrict this to just the gva_to_gfn sub-portion. No point
> copying back more than necessary.
>
> ~Andrew

Thanks a lot!

--
Best regards, Sergey Kovalev
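Andrew's "stack rubble" remark appears to mean that the translation function writes the page order only on success; on failure it returns an invalid GFN and leaves the output parameter untouched, so copying it back to the caller leaks uninitialized stack. A minimal, self-contained illustration of the guarded pattern — `toy_translate`, `translate_checked`, and `TOY_INVALID_GFN` are invented stand-ins, not Xen's actual functions:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_GFN UINT64_MAX

/* Toy stand-in for a translation routine that writes *page_order ONLY
 * on success, mirroring the behaviour described in the review. */
static uint64_t toy_translate(uint64_t gva, unsigned int *page_order)
{
    if ( gva & 1 )              /* pretend odd addresses are unmapped */
        return TOY_INVALID_GFN; /* *page_order deliberately left untouched */
    *page_order = 0;            /* 4 KiB mapping */
    return gva >> 12;
}

/* Guarded caller: page_order gets a defined value up front, and the
 * result is consumed only when the returned gfn is valid, so nothing
 * uninitialized ever escapes to the caller. */
static int translate_checked(uint64_t gva, uint64_t *gfn,
                             unsigned int *page_order)
{
    unsigned int order = 0;     /* defined fallback, not stack rubble */
    uint64_t g = toy_translate(gva, &order);

    if ( g == TOY_INVALID_GFN )
        return -1;              /* caller must not use the outputs */
    *gfn = g;
    *page_order = order;
    return 0;
}
```

Either fix works: initialize `page_order` at its definition, or check the returned GFN before copying `page_order` back — the guarded check additionally lets the hypercall report the failure to the caller.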
Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command
21.03.2023 1:51, Tamas K Lengyel wrote:
> On Mon, Mar 20, 2023 at 12:32 PM Ковалёв Сергей <va...@list.ru> wrote:
>> gva_to_gfn command used for fast address translation in LibVMI
>> project. With such a command it is possible to perform address
>> translation in a single call instead of a series of queries to get
>> every page table.
>
> You have a couple assumptions here:
> - Xen will always have a direct map of the entire guest memory - there
> are already plans to move away from that. Without that this approach
> won't have any advantage over doing the same mapping by LibVMI

Thanks! I didn't know about the plan. Though I use this patch
back-ported into 4.16.

> - LibVMI has to map every page for each page table for every lookup -
> you have to do that only for the first, afterwards the pages on which
> the pagetable is are kept in a cache and subsequent lookups would be
> actually faster than having to do this domctl since you can keep being
> in the same process instead of having to jump to Xen.

Yes, I know about the page cache. But I have faced several issues with a
cache like this one: https://github.com/libvmi/libvmi/pull/1058 . So I
had to disable the cache.

> With these perspectives in mind I don't think this would be a useful
> addition. Please prove me wrong with performance numbers and a specific
> use-case that warrants adding this and how you plan to introduce it
> into LibVMI without causing performance regression to all other
> use-cases.

I will send you a PR into LibVMI in a day or two. I don't have any
performance numbers at the moment; I sent this patch to share my current
work as soon as possible. To prevent regressions in all other use-cases
we could add a configure option. Thanks for making me notice that!

> Tamas

--
Best regards, Sergey Kovalev
[XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command
gva_to_gfn command used for fast address translation in LibVMI project.
With such a command it is possible to perform address translation in a
single call instead of a series of queries to get every page table.

Thanks to Dmitry Isaykin for involvement.

Signed-off-by: Sergey Kovalev
---
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: "Roger Pau Monné"
Cc: Wei Liu
Cc: George Dunlap
Cc: Julien Grall
Cc: Stefano Stabellini
Cc: Tamas K Lengyel
Cc: xen-devel@lists.xenproject.org
---
 xen/arch/x86/domctl.c       | 17 +++++++++++++++++
 xen/include/public/domctl.h | 13 +++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 2118fcad5d..0c9706ea0a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1364,6 +1364,23 @@ long arch_do_domctl(
         copyback = true;
         break;

+    case XEN_DOMCTL_gva_to_gfn:
+    {
+        uint64_t ga = domctl->u.gva_to_gfn.addr;
+        uint64_t cr3 = domctl->u.gva_to_gfn.cr3;
+        struct vcpu *v = d->vcpu[0];
+        uint32_t pfec = PFEC_page_present;
+        unsigned int page_order;
+
+        uint64_t gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec, &page_order);
+        domctl->u.gva_to_gfn.addr = gfn;
+        domctl->u.gva_to_gfn.page_order = page_order;
+        if ( __copy_to_guest(u_domctl, domctl, 1) )
+            ret = -EFAULT;
+
+        break;
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 51be28c3de..628dfc68fd 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -948,6 +948,17 @@ struct xen_domctl_paging_mempool {
     uint64_aligned_t size; /* Size in bytes. */
 };

+/*
+ * XEN_DOMCTL_gva_to_gfn.
+ *
+ * Translate a guest virtual address to a guest physical address.
+ */
+struct xen_domctl_gva_to_gfn {
+    uint64_aligned_t addr;
+    uint64_aligned_t cr3;
+    uint64_aligned_t page_order;
+};
+
 #if defined(__i386__) || defined(__x86_64__)
 struct xen_domctl_vcpu_msr {
     uint32_t index;
@@ -1278,6 +1289,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vmtrace_op                    84
 #define XEN_DOMCTL_get_paging_mempool_size       85
 #define XEN_DOMCTL_set_paging_mempool_size       86
+#define XEN_DOMCTL_gva_to_gfn                    87
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1340,6 +1352,7 @@ struct xen_domctl {
         struct xen_domctl_vuart_op          vuart_op;
         struct xen_domctl_vmtrace_op        vmtrace_op;
         struct xen_domctl_paging_mempool    paging_mempool;
+        struct xen_domctl_gva_to_gfn        gva_to_gfn;
         uint8_t                             pad[128];
     } u;
 };
--
2.38.1
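For readers unfamiliar with what this single hypercall replaces on the LibVMI side, here is a toy user-space sketch of the 4-level x86-64 walk that `paging_ga_to_gfn_cr3()` performs inside Xen. Without the hypercall, LibVMI does one page map/read per level. The tables and helper names (`toy_gva_to_gfn`, `level_index`) are invented illustrations, not Xen or LibVMI code, and each level is collapsed to a single table for brevity:

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES 512

/* Toy "guest memory": one table per level (a real guest has a tree of
 * tables selected by the upper bits of each entry's address field). */
static uint64_t pml4[ENTRIES], pdpt[ENTRIES], pd[ENTRIES], pt[ENTRIES];

/* Extract the 9-bit table index for a given level (3 = PML4 ... 0 = PT).
 * A 48-bit VA is: 9 + 9 + 9 + 9 index bits plus a 12-bit page offset. */
static unsigned int level_index(uint64_t va, int level)
{
    return (va >> (12 + 9 * level)) & 0x1ff;
}

/* Walk the toy tables; returns the GFN, or UINT64_MAX if a level is
 * not present (bit 0 of an entry is the present bit). */
static uint64_t toy_gva_to_gfn(uint64_t va)
{
    uint64_t e;

    e = pml4[level_index(va, 3)];
    if ( !(e & 1) ) return UINT64_MAX;
    e = pdpt[level_index(va, 2)];
    if ( !(e & 1) ) return UINT64_MAX;
    e = pd[level_index(va, 1)];
    if ( !(e & 1) ) return UINT64_MAX;
    e = pt[level_index(va, 0)];
    if ( !(e & 1) ) return UINT64_MAX;

    return e >> 12;             /* frame number from the leaf entry */
}
```

Each of the four lookups above corresponds to one guest-page read that LibVMI would otherwise have to map individually, which is where the claimed saving of the single-call translation comes from (large pages, access rights, and the CR4.{PAE,PSE} issues raised in the review are all omitted here).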
Xen Kdump analysis with crash utility
Hello,

I'm trying to start using Kdump in my Xen 4.16 setup with Ubuntu 18.04.6
(5.4.0-137-generic). I was able to load the "dump-capture kernel" with
kexec-tools and collect a crash dump with makedumpfile like this:

```
makedumpfile -E -X -d 0 /proc/vmcore /var/crash/dump
```

This dump file can be used to analyze Dom0 panics. Though I have some
issues while analyzing the dump file for the Xen kernel:

```
~/src/crash/crash --hyper ~/xen-syms-dbg/usr/lib/debug/xen-syms /var/crash/202301241536/dump.202301241536
crash 8.0.2++
...
GNU gdb (GDB) 10.2
...
crash: invalid kernel virtual address: 1ef8  type: "fill_pcpu_struct"
WARNING: cannot fill pcpu_struct.
crash: cannot read cpu_info.
```

As far as I know, the developer community of the crash utility doesn't
actively support Xen. From
https://github.com/crash-utility/crash/issues/21#issuecomment-330847410 :

```
I cannot help you with Xen-related issues because Red Hat stopped
releasing Xen kernels several years ago (RHEL5 was the last Red Hat
kernel that contained a Xen kernel). Since then, ongoing Xen kernel
support in the crash utility has been maintained by engineers who work
for other distributions that still offer Xen kernels.
```

Does anybody use kdump to analyze Xen crashes? Could anybody share some
tips and tricks for using crash or other tools with such dumps?

Thanks a lot.

--
Best regards, Sergey Kovalev
Re: [Xen-devel] [XEN PATCH v1 1/1] x86/vm_event: add fast single step
Andrew, Tamas, thank you very much. I will improve the patch.

December 17, 2019 3:13:42 PM UTC, Andrew Cooper writes:
> On 17/12/2019 15:10, Tamas K Lengyel wrote:
>> On Tue, Dec 17, 2019 at 8:08 AM Tamas K Lengyel wrote:
>>> On Tue, Dec 17, 2019 at 7:48 AM Andrew Cooper wrote:
>>>> On 17/12/2019 14:40, Sergey Kovalev wrote:
>>>>> On a break point event, eight context switches occur.
>>>>>
>>>>> With fast single step it is possible to shorten the path by two
>>>>> context switches and gain a 35% speed-up.
>>>>>
>>>>> Was tested on the Debian branch of Xen 4.12. See at:
>>>>> https://github.com/skvl/xen/tree/debian/knorrie/4.12/fast-singlestep
>>>>>
>>>>> Rebased on master:
>>>>> https://github.com/skvl/xen/tree/fast-singlestep
>>>>>
>>>>> Signed-off-by: Sergey Kovalev
>>>>
>>>> 35% looks like a good number, but what is "fast single step"? All
>>>> this appears to be is plumbing to cause an altp2m switch on single
>>>> step.
>>>
>>> Yes, a better explanation would be much needed here and I'm not 100%
>>> sure it correctly implements what I think it tries to.
>>>
>>> This is my interpretation of what the idea is: when using DRAKVUF (or
>>> another system using altp2m with shadow pages similar to what I
>>> describe in
>>> https://xenproject.org/2016/04/13/stealthy-monitoring-with-xen-altp2m),
>>> after a breakpoint is hit the system switches to the default
>>> unrestricted altp2m view with singlestep enabled. When the singlestep
>>> traps to Xen another vm_event is sent to the monitor agent, which
>>> then normally disables singlestepping and switches the altp2m view
>>> back to the restricted view. This patch looks like it's
>>> short-circuiting that last part so that it doesn't need to send the
>>> vm_event out for the singlestep event and should switch back to the
>>> restricted view in Xen automatically. It's a nice optimization. But
>>> what seems to be missing is the altp2m switch itself.
>>
>> Never mind, p2m_altp2m_check does the altp2m switch as well, so this
>> patch implements what I described above. Please update the patch
>> message to be more descriptive (you can copy my description from
>> above).
>
> Also please read CODING_STYLE in the root of the xen repository. The
> important ones you need to fix are spaces in "if ( ... )" statements,
> and binary operators on the end of the first line rather than the
> beginning of the continuation.
>
> ~Andrew

--
Forgive my brevity; written in K-9 Mail.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] How to specify vendor and product strings for disk?
Hello,

Is it possible to specify vendor and product strings for a disk through
xl.cfg or xenstore or something else? Something like
https://www.redhat.com/archives/libvir-list/2012-November/msg00205.html.
Re: [Xen-devel] Weird altp2m behaviour when switching early to a new view
> Not yet, we're working on it.

Could you point me to the branch with your patches, please? I could not
find it in https://xenbits.xen.org/gitweb/?p=xen.git

With best regards,
Sergey Kovalev.
Re: [Xen-devel] Weird altp2m behaviour when switching early to a new view
Hello Razvan,

Has your patch been accepted into the Xen hypervisor? Searching through
git, I have found commit "61bdddb82151fbf51c58f6ebc1b4a687942c45a8" dated
"Thu Jun 28 10:54:01 2018 +0300". Does that commit deal with the error?

With best regards,
Sergey Kovalev.
Re: [Xen-devel] OProfile with Xen-4.9
Hello Boris,

Thank you very much for your suggestion; I will check it. I'm trying to
understand the control flow during hypercalls to the Xen API and how much
it costs. I will try to measure this with perf. But that is only half of
the way.

With best regards,
Sergey Kovalev.
[Xen-devel] OProfile with Xen-4.9
Hello,

I have installed Xen 4.9 from the Ubuntu 17.10 package, and I would like
to profile it with OProfile (as far as I know this is the only option).
With the instructions from
http://wiki.prgmr.com/mediawiki/index.php/Chapter_10:_Profiling_and_Benchmarking_Under_Xen#Profiling_with_Xen
I have been able to build oprofile-0.9.4. There were some bugs (
https://patchwork.ozlabs.org/patch/679925/ ,
https://sourceforge.net/p/oprofile/mailman/message/30135604/ ,
https://acassis.wordpress.com/2012/11/13/error-reference-counts-cannot-be-declared-mutable/
).

With `opreport -t 2` I receive:

> opreport error: No sample file found: try running opcontrol --dump
> or specify a session containing sample files

With `more /proc/interrupts | grep NMI` I get:

> NMI: 0 0 0 0 0 0 0 0 Non-maskable interrupts

Could somebody point me to good instructions for profiling the current
version of Xen?

P.S. I have already rebuilt the package with Performance Counters and
Performance Counters Array Histograms.

--
With best regards,
Sergey Kovalev