Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 10.07.2013, at 20:24, Scott Wood wrote: > On 07/10/2013 05:23:36 AM, Alexander Graf wrote: >> On 10.07.2013, at 00:26, Scott Wood wrote: >> > On 07/09/2013 05:00:26 PM, Alexander Graf wrote: >> >> It'll also be more flexible at the same time. You could take the logs and >> >> actually check what's going on to debug issues that you're encountering >> >> for example. >> >> We could even go as far as sharing the same tool with other >> >> architectures, so that we only have to learn how to debug things once. >> > >> > Have you encountered an actual need for this flexibility, or is it >> > theoretical? >> Yeah, first thing I did back then to actually debug kvm failures was to add >> trace points. > > I meant specifically for handling exit timings this way. No, but I did encounter the need for debugging exits. And the only thing we would need to get exit timing stats once we already have trace points for exit types would be to have trace points for guest entry and maybe type specific events to indicate what the exit is about as well. > >> > Is there common infrastructure for dealing with measuring intervals and >> > tracking statistics thereof, rather than just tracking points and letting >> > userspace connect the dots (though it could still do that as an option)? >> > Even if it must be done in userspace, it doesn't seem like something that >> > should be KVM-specific. >> Would you like to have different ways of measuring mm subsystem overhead? I >> don't :). The same goes for KVM really. If we could converge towards a >> single user space interface to get exit timings, it'd make debugging a lot >> easier. > > I agree -- that's why I said it doesn't seem like something that should be > KVM-specific. But that's orthogonal to whether it's done in kernel space or > user space. 
The ability to get begin/end events from userspace would be nice > when it is specifically requested, but it would also be nice if the kernel > could track some basic statistics so we wouldn't have to ship so much data > around to arrive at the same result. > > At the very least, I'd like such a tool/infrastructure to exist before we > start complaining about doing minor maintenance of the current mechanism. I admit that I don't fully understand qemu/scripts/kvm/kvm_stat, but it seems to me as if it already does pretty much what we want. It sets up a filter to only get events and their time stamps through. It does use normal exit trace points on x86 to replace the old debugfs based stat counters. And it seems to work reasonably well for that. > >> We already have this for the debugfs counters btw. And the timing framework >> does break kvm_stat today already, as it emits textual stats rather than >> numbers which all of the other debugfs stats do. But at least I can take the >> x86 kvm_stat tool and run it on ppc just fine to see exit stats. > > We already have what? The last two sentences seem contradictory -- can you > or can't you use kvm_stat as is? I'm not familiar with kvm_stat. Kvm_stat back in the day used debugfs to give you an idea on what exit event happens most often. That mechanism got replaced by trace points later which the current kvm_stat uses. I still have a copy of the old kvm_stat that I always use to get a first feeling for what goes wrong if something goes wrong. The original code couldn't deal with the fact that we have a debugfs file that contains text though. I patched it locally. It also works just fine if you simply disable timing stats, since then you won't have the text file. > What does x86 KVM expose in debugfs? The same thing it always exposed - exit stats. I am fairly sure Avi wanted to completely deprecate that interface in favor of the trace point based approach, but I don't think he ever got around to it. 
Alex -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
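For reference, the tracepoint-based kvm_stat that Alex describes above boils down to counting kvm_exit events per reason. A minimal, self-contained sketch of that aggregation step (the trace lines and reason names here are made up for illustration; the real trace_pipe format differs per architecture):

```python
from collections import Counter

# Hypothetical sample of kvm_exit tracepoint output, in the style of
# /sys/kernel/debug/tracing/trace_pipe (format and reason names are
# illustrative only, not the real field layout).
SAMPLE = """\
qemu-1234 [000] 100.000100: kvm_exit: reason DTLB_MISS
qemu-1234 [000] 100.000250: kvm_exit: reason EMULATED_RFI
qemu-1234 [000] 100.000400: kvm_exit: reason DTLB_MISS
"""

def count_exit_reasons(trace_text):
    """Tally kvm_exit events per reason, as a kvm_stat-like tool would."""
    counts = Counter()
    for line in trace_text.splitlines():
        if "kvm_exit:" in line:
            # Everything after the last "reason" token names the exit type.
            reason = line.rsplit("reason", 1)[1].strip()
            counts[reason] += 1
    return counts

print(count_exit_reasons(SAMPLE))
```

The same counting loop works regardless of architecture, which is the portability point being argued above.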
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 07/10/2013 05:23:36 AM, Alexander Graf wrote: On 10.07.2013, at 00:26, Scott Wood wrote: > On 07/09/2013 05:00:26 PM, Alexander Graf wrote: >> It'll also be more flexible at the same time. You could take the logs and actually check what's going on to debug issues that you're encountering for example. >> We could even go as far as sharing the same tool with other architectures, so that we only have to learn how to debug things once. > > Have you encountered an actual need for this flexibility, or is it theoretical? Yeah, first thing I did back then to actually debug kvm failures was to add trace points. I meant specifically for handling exit timings this way. > Is there common infrastructure for dealing with measuring intervals and tracking statistics thereof, rather than just tracking points and letting userspace connect the dots (though it could still do that as an option)? Even if it must be done in userspace, it doesn't seem like something that should be KVM-specific. Would you like to have different ways of measuring mm subsystem overhead? I don't :). The same goes for KVM really. If we could converge towards a single user space interface to get exit timings, it'd make debugging a lot easier. I agree -- that's why I said it doesn't seem like something that should be KVM-specific. But that's orthogonal to whether it's done in kernel space or user space. The ability to get begin/end events from userspace would be nice when it is specifically requested, but it would also be nice if the kernel could track some basic statistics so we wouldn't have to ship so much data around to arrive at the same result. At the very least, I'd like such a tool/infrastructure to exist before we start complaining about doing minor maintenance of the current mechanism. We already have this for the debugfs counters btw. And the timing framework does break kvm_stat today already, as it emits textual stats rather than numbers which all of the other debugfs stats do. 
But at least I can take the x86 kvm_stat tool and run it on ppc just fine to see exit stats. We already have what? The last two sentences seem contradictory -- can you or can't you use kvm_stat as is? I'm not familiar with kvm_stat. What does x86 KVM expose in debugfs? >> > Lots of debug options are enabled at build time; why must this be different? >> Because I think it's valuable as debug tool for cases where compile time switches are not the best way of debugging things. It's not a high profile thing to tackle for me tbh, but I don't really think working heavily on the timing stat thing is the correct path to walk along. > > Adding new exit types isn't "working heavily" on it. No, but the fact that the first patch is a fix to add exit stats for exits that we missed out before doesn't give me a lot of confidence that lots of people use timing stats. And I am always very wary of #ifdef'ed code, as it blows up the test matrix heavily. I used it quite a lot when I was doing KVM performance work. It's just been a while since I last did that. -Scott
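The debugfs interface discussed above exposes plain, monotonically increasing exit counters; a kvm_stat-style tool derives rates by periodically snapshotting and diffing them. A rough sketch of that idea, with hypothetical counter names and values:

```python
# Old kvm_stat derived per-second rates by periodically re-reading the
# debugfs counters and diffing snapshots. Counter names and values here
# are made up; the real files live under /sys/kernel/debug/kvm/.
def rates(prev, curr, interval_s):
    """Per-second deltas between two counter snapshots."""
    return {name: (curr[name] - prev.get(name, 0)) / interval_s
            for name in curr}

prev = {"DTLB_MISS": 1000, "EMULATED_RFI": 40}
curr = {"DTLB_MISS": 1600, "EMULATED_RFI": 40}
print(rates(prev, curr, 1.0))  # -> {'DTLB_MISS': 600.0, 'EMULATED_RFI': 0.0}
```

This is also why a textual stats file breaks such a tool: the diffing only works if every file parses as a number.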
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 10.07.2013, at 00:26, Scott Wood wrote: > On 07/09/2013 05:00:26 PM, Alexander Graf wrote: >> On 09.07.2013, at 23:54, Scott Wood wrote: >> > On 07/09/2013 04:49:32 PM, Alexander Graf wrote: >> >> Not sure I understand. What the timing stats do is that they measure the >> >> time between [exit ... entry], right? We'd do the same thing, just all in >> >> C code. That means we would become slightly less accurate, but gain >> >> dynamic enabling of the traces and get rid of all the timing stat asm >> >> code. >> > >> > Compile-time enabling bothers me less than a loss of accuracy (not just a >> > small loss by moving into C code, but a potential for a large loss if we >> > overflow the buffer) >> Then don't overflow the buffer. Make it large enough. > > How large is that? Does the tool recognize and report when overflow happens? > > How much will the overhead of running some python script on the host, > consuming a large volume of data, affect the results? > >> IIRC ftrace improved recently to dynamically increase the buffer size too. >> Steven, do I remember correctly here? > > Yay more complexity. > > So now we get to worry about possible memory allocations happening when we > try to log something? Or if there is a way to do an "atomic" log, we're back > to the "buffer might be full" situation. > >> > and a dependency on a userspace tool >> We already have that for kvm_stat. It's a simple python script - and you >> surely have python on your rootfs, no? >> > (both in terms of the tool needing to be written, and in the hassle of >> > ensuring that it's present in the root filesystem of whatever system I'm >> > testing). And the whole mechanism will be more complicated. >> It'll also be more flexible at the same time. You could take the logs and >> actually check what's going on to debug issues that you're encountering for >> example. 
>> We could even go as far as sharing the same tool with other architectures, >> so that we only have to learn how to debug things once. > > Have you encountered an actual need for this flexibility, or is it > theoretical? Yeah, first thing I did back then to actually debug kvm failures was to add trace points. > Is there common infrastructure for dealing with measuring intervals and > tracking statistics thereof, rather than just tracking points and letting > userspace connect the dots (though it could still do that as an option)? > Even if it must be done in userspace, it doesn't seem like something that > should be KVM-specific. Would you like to have different ways of measuring mm subsystem overhead? I don't :). The same goes for KVM really. If we could converge towards a single user space interface to get exit timings, it'd make debugging a lot easier. We already have this for the debugfs counters btw. And the timing framework does break kvm_stat today already, as it emits textual stats rather than numbers which all of the other debugfs stats do. But at least I can take the x86 kvm_stat tool and run it on ppc just fine to see exit stats. > >> > Lots of debug options are enabled at build time; why must this be >> > different? >> Because I think it's valuable as debug tool for cases where compile time >> switches are not the best way of debugging things. It's not a high profile >> thing to tackle for me tbh, but I don't really think working heavily on the >> timing stat thing is the correct path to walk along. > > Adding new exit types isn't "working heavily" on it. No, but the fact that the first patch is a fix to add exit stats for exits that we missed out before doesn't give me a lot of confidence that lots of people use timing stats. And I am always very wary of #ifdef'ed code, as it blows up the test matrix heavily.
Alex
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On Tue, 2013-07-09 at 17:26 -0500, Scott Wood wrote: > On 07/09/2013 05:00:26 PM, Alexander Graf wrote: > > > > On 09.07.2013, at 23:54, Scott Wood wrote: > > > > > On 07/09/2013 04:49:32 PM, Alexander Graf wrote: > > >> Not sure I understand. What the timing stats do is that they > > measure the time between [exit ... entry], right? We'd do the same > > thing, just all in C code. That means we would become slightly less > > accurate, but gain dynamic enabling of the traces and get rid of all > > the timing stat asm code. > > > > > > Compile-time enabling bothers me less than a loss of accuracy (not > > just a small loss by moving into C code, but a potential for a large > > loss if we overflow the buffer) > > > > Then don't overflow the buffer. Make it large enough. > > How large is that? Does the tool recognize and report when overflow > happens? Note, the ftrace buffers allow you to see when overflow does happen. > > How much will the overhead of running some python script on the host, > consuming a large volume of data, affect the results? This doesn't need to be python, and you can read the buffers in binary as well. Mauro wrote a tool that uses ftrace for MCE errors. You can probably do something similar. I need to get the code that reads ftrace binary buffers out as a library. > > > IIRC ftrace improved recently to dynamically increase the buffer size > > too. What did change was that you can create buffers for your own use. > > > > Steven, do I remember correctly here? > > Yay more complexity. What? Is ftrace complex? ;-) > > So now we get to worry about possible memory allocations happening when > we try to log something? Or if there is a way to do an "atomic" log, > we're back to the "buffer might be full" situation. Nope, ftrace doesn't do dynamic allocation here. -- Steve > > > > and a dependency on a userspace tool > > > > We already have that for kvm_stat. It's a simple python script - and > > you surely have python on your rootfs, no? 
> > > > > (both in terms of the tool needing to be written, and in the hassle > of ensuring that it's present in the root filesystem of whatever > system I'm testing). And the whole mechanism will be more > complicated. > > It'll also be more flexible at the same time. You could take the logs > and actually check what's going on to debug issues that you're > encountering for example. > > We could even go as far as sharing the same tool with other > architectures, so that we only have to learn how to debug things once. > > Have you encountered an actual need for this flexibility, or is it > theoretical? > > Is there common infrastructure for dealing with measuring intervals and > tracking statistics thereof, rather than just tracking points and > letting userspace connect the dots (though it could still do that as an > option)? Even if it must be done in userspace, it doesn't seem like > something that should be KVM-specific. > > > > Lots of debug options are enabled at build time; why must this be > > different? > > > > Because I think it's valuable as debug tool for cases where compile > > time switches are not the best way of debugging things. It's not a > > high profile thing to tackle for me tbh, but I don't really think > > working heavily on the timing stat thing is the correct path to walk > > along. > > Adding new exit types isn't "working heavily" on it. > > -Scott
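On the overflow concern raised in this exchange: as Steve notes, the ftrace buffers do let you see when overflow happens. A tool could read the per-CPU stats files and flag lost events before trusting any totals. A sketch of that check, run here against a made-up stats snapshot rather than the real /sys/kernel/debug/tracing/per_cpu/cpu0/stats file:

```python
# Hypothetical contents of an ftrace per-CPU stats file. The field names
# follow the real file's layout, but the numbers are made up for the
# example; a real tool would read one such file per CPU.
SAMPLE_STATS = """\
entries: 2048
overrun: 17
commit overrun: 0
"""

def dropped_events(stats_text):
    """Return the number of events lost to ring-buffer overflow."""
    dropped = 0
    for line in stats_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() in ("overrun", "commit overrun"):
            dropped += int(value)
    return dropped

print(dropped_events(SAMPLE_STATS))  # -> 17
```

A nonzero result means any statistics computed from the trace are incomplete and the buffer needs to be enlarged before re-measuring.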
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On Wed, 2013-07-10 at 00:00 +0200, Alexander Graf wrote: > Then don't overflow the buffer. Make it large enough. IIRC ftrace improved > recently to dynamically increase the buffer size too. > > Steven, do I remember correctly here? Not really. Ftrace only dynamically increases the buffer when the trace is first used. Other than that, the size is static. I also wouldn't suggest allocating the buffer when needed as that has the overhead of allocating memory. -- Steve
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 07/09/2013 05:00:26 PM, Alexander Graf wrote: On 09.07.2013, at 23:54, Scott Wood wrote: > On 07/09/2013 04:49:32 PM, Alexander Graf wrote: >> Not sure I understand. What the timing stats do is that they measure the time between [exit ... entry], right? We'd do the same thing, just all in C code. That means we would become slightly less accurate, but gain dynamic enabling of the traces and get rid of all the timing stat asm code. > > Compile-time enabling bothers me less than a loss of accuracy (not just a small loss by moving into C code, but a potential for a large loss if we overflow the buffer) Then don't overflow the buffer. Make it large enough. How large is that? Does the tool recognize and report when overflow happens? How much will the overhead of running some python script on the host, consuming a large volume of data, affect the results? IIRC ftrace improved recently to dynamically increase the buffer size too. Steven, do I remember correctly here? Yay more complexity. So now we get to worry about possible memory allocations happening when we try to log something? Or if there is a way to do an "atomic" log, we're back to the "buffer might be full" situation. > and a dependency on a userspace tool We already have that for kvm_stat. It's a simple python script - and you surely have python on your rootfs, no? > (both in terms of the tool needing to be written, and in the hassle of ensuring that it's present in the root filesystem of whatever system I'm testing). And the whole mechanism will be more complicated. It'll also be more flexible at the same time. You could take the logs and actually check what's going on to debug issues that you're encountering for example. We could even go as far as sharing the same tool with other architectures, so that we only have to learn how to debug things once. Have you encountered an actual need for this flexibility, or is it theoretical? 
Is there common infrastructure for dealing with measuring intervals and tracking statistics thereof, rather than just tracking points and letting userspace connect the dots (though it could still do that as an option)? Even if it must be done in userspace, it doesn't seem like something that should be KVM-specific. > Lots of debug options are enabled at build time; why must this be different? Because I think it's valuable as debug tool for cases where compile time switches are not the best way of debugging things. It's not a high profile thing to tackle for me tbh, but I don't really think working heavily on the timing stat thing is the correct path to walk along. Adding new exit types isn't "working heavily" on it. -Scott
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 09.07.2013, at 23:54, Scott Wood wrote: > On 07/09/2013 04:49:32 PM, Alexander Graf wrote: >> On 09.07.2013, at 20:29, Scott Wood wrote: >> > On 07/09/2013 12:46:32 PM, Alexander Graf wrote: >> >> On 07/09/2013 07:16 PM, Scott Wood wrote: >> >>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote: >> On 03.07.2013, at 15:30, Mihai Caraman wrote: >> > Some guests are making use of return from machine check instruction >> > to do crazy things even though the 64-bit kernel doesn't handle yet >> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction >> > accordingly. >> > >> > Signed-off-by: Mihai Caraman >> > --- >> > arch/powerpc/include/asm/kvm_host.h |1 + >> > arch/powerpc/kvm/booke_emulate.c| 25 + >> > arch/powerpc/kvm/timing.c |1 + >> > 3 files changed, 27 insertions(+), 0 deletions(-) >> > >> > diff --git a/arch/powerpc/include/asm/kvm_host.h >> > b/arch/powerpc/include/asm/kvm_host.h >> > index af326cd..0466789 100644 >> > --- a/arch/powerpc/include/asm/kvm_host.h >> > +++ b/arch/powerpc/include/asm/kvm_host.h >> > @@ -148,6 +148,7 @@ enum kvm_exit_types { >> > EMULATED_TLBWE_EXITS, >> > EMULATED_RFI_EXITS, >> > EMULATED_RFCI_EXITS, >> > +EMULATED_RFMCI_EXITS, >> I would quite frankly prefer to see us abandon the whole exit timing >> framework in the kernel and instead use trace points. Then we don't >> have to maintain all of this randomly exercised code. >> >>> Would this map well to tracepoints? We're not trying to track discrete >> >>> events, so much as accumulated time spent in different areas. >> >> I think so. We'd just have to emit tracepoints as soon as we enter >> >> handle_exit and in prepare_to_enter. Then a user space program should >> >> have everything it needs to create statistics out of that. It would >> >> certainly simplify the entry/exit path. >> > >> > I was hoping that wasn't going to be your answer. 
:-) >> > >> > Such a change would introduce a new dependency, more complexity, and the >> > possibility for bad totals to result from a ring buffer filling faster >> > than userspace can drain it. >> Well, at least it would allow for optional tracing :). Today you have to >> change a compile flag to enable / disable timing stats. >> > >> > I also don't see how it would simplify entry/exit, since we'd still need >> > to take timestamps in the same places, in order to record a final event >> > that says how long a particular event took. >> Not sure I understand. What the timing stats do is that they measure the >> time between [exit ... entry], right? We'd do the same thing, just all in C >> code. That means we would become slightly less accurate, but gain dynamic >> enabling of the traces and get rid of all the timing stat asm code. > > Compile-time enabling bothers me less than a loss of accuracy (not just a > small loss by moving into C code, but a potential for a large loss if we > overflow the buffer) Then don't overflow the buffer. Make it large enough. IIRC ftrace improved recently to dynamically increase the buffer size too. Steven, do I remember correctly here? > and a dependency on a userspace tool We already have that for kvm_stat. It's a simple python script - and you surely have python on your rootfs, no? > (both in terms of the tool needing to be written, and in the hassle of > ensuring that it's present in the root filesystem of whatever system I'm > testing). And the whole mechanism will be more complicated. It'll also be more flexible at the same time. You could take the logs and actually check what's going on to debug issues that you're encountering for example. We could even go as far as sharing the same tool with other architectures, so that we only have to learn how to debug things once. > Lots of debug options are enabled at build time; why must this be different? 
Because I think it's valuable as debug tool for cases where compile time switches are not the best way of debugging things. It's not a high profile thing to tackle for me tbh, but I don't really think working heavily on the timing stat thing is the correct path to walk along. Alex
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 07/09/2013 04:49:32 PM, Alexander Graf wrote: On 09.07.2013, at 20:29, Scott Wood wrote: > On 07/09/2013 12:46:32 PM, Alexander Graf wrote: >> On 07/09/2013 07:16 PM, Scott Wood wrote: >>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote: On 03.07.2013, at 15:30, Mihai Caraman wrote: > Some guests are making use of return from machine check instruction > to do crazy things even though the 64-bit kernel doesn't handle yet > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction accordingly. > > Signed-off-by: Mihai Caraman > --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kvm/booke_emulate.c| 25 + > arch/powerpc/kvm/timing.c |1 + > 3 files changed, 27 insertions(+), 0 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h > index af326cd..0466789 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -148,6 +148,7 @@ enum kvm_exit_types { > EMULATED_TLBWE_EXITS, > EMULATED_RFI_EXITS, > EMULATED_RFCI_EXITS, > +EMULATED_RFMCI_EXITS, I would quite frankly prefer to see us abandon the whole exit timing framework in the kernel and instead use trace points. Then we don't have to maintain all of this randomly exercised code. >>> Would this map well to tracepoints? We're not trying to track discrete events, so much as accumulated time spent in different areas. >> I think so. We'd just have to emit tracepoints as soon as we enter handle_exit and in prepare_to_enter. Then a user space program should have everything it needs to create statistics out of that. It would certainly simplify the entry/exit path. > > I was hoping that wasn't going to be your answer. :-) > > Such a change would introduce a new dependency, more complexity, and the possibility for bad totals to result from a ring buffer filling faster than userspace can drain it. Well, at least it would allow for optional tracing :). Today you have to change a compile flag to enable / disable timing stats. 
> > I also don't see how it would simplify entry/exit, since we'd still need to take timestamps in the same places, in order to record a final event that says how long a particular event took. Not sure I understand. What the timing stats do is that they measure the time between [exit ... entry], right? We'd do the same thing, just all in C code. That means we would become slightly less accurate, but gain dynamic enabling of the traces and get rid of all the timing stat asm code. Compile-time enabling bothers me less than a loss of accuracy (not just a small loss by moving into C code, but a potential for a large loss if we overflow the buffer) and a dependency on a userspace tool (both in terms of the tool needing to be written, and in the hassle of ensuring that it's present in the root filesystem of whatever system I'm testing). And the whole mechanism will be more complicated. Lots of debug options are enabled at build time; why must this be different? -Scott
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 09.07.2013, at 20:29, Scott Wood wrote: > On 07/09/2013 12:46:32 PM, Alexander Graf wrote: >> On 07/09/2013 07:16 PM, Scott Wood wrote: >>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote: On 03.07.2013, at 15:30, Mihai Caraman wrote: > Some guests are making use of return from machine check instruction > to do crazy things even though the 64-bit kernel doesn't handle yet > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction accordingly. > > Signed-off-by: Mihai Caraman > --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kvm/booke_emulate.c| 25 + > arch/powerpc/kvm/timing.c |1 + > 3 files changed, 27 insertions(+), 0 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h > b/arch/powerpc/include/asm/kvm_host.h > index af326cd..0466789 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -148,6 +148,7 @@ enum kvm_exit_types { > EMULATED_TLBWE_EXITS, > EMULATED_RFI_EXITS, > EMULATED_RFCI_EXITS, > +EMULATED_RFMCI_EXITS, I would quite frankly prefer to see us abandon the whole exit timing framework in the kernel and instead use trace points. Then we don't have to maintain all of this randomly exercised code. >>> Would this map well to tracepoints? We're not trying to track discrete >>> events, so much as accumulated time spent in different areas. >> I think so. We'd just have to emit tracepoints as soon as we enter >> handle_exit and in prepare_to_enter. Then a user space program should have >> everything it needs to create statistics out of that. It would certainly >> simplify the entry/exit path. > > I was hoping that wasn't going to be your answer. :-) > > Such a change would introduce a new dependency, more complexity, and the > possibility for bad totals to result from a ring buffer filling faster than > userspace can drain it. Well, at least it would allow for optional tracing :). Today you have to change a compile flag to enable / disable timing stats. 
> > I also don't see how it would simplify entry/exit, since we'd still need to > take timestamps in the same places, in order to record a final event that > says how long a particular event took. Not sure I understand. What the timing stats do is that they measure the time between [exit ... entry], right? We'd do the same thing, just all in C code. That means we would become slightly less accurate, but gain dynamic enabling of the traces and get rid of all the timing stat asm code. Alex
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 07/09/2013 12:46:32 PM, Alexander Graf wrote: On 07/09/2013 07:16 PM, Scott Wood wrote: On 07/08/2013 01:45:58 PM, Alexander Graf wrote: On 03.07.2013, at 15:30, Mihai Caraman wrote: > Some guests are making use of return from machine check instruction > to do crazy things even though the 64-bit kernel doesn't handle yet > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction accordingly. > > Signed-off-by: Mihai Caraman > --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kvm/booke_emulate.c| 25 + > arch/powerpc/kvm/timing.c |1 + > 3 files changed, 27 insertions(+), 0 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h > index af326cd..0466789 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -148,6 +148,7 @@ enum kvm_exit_types { > EMULATED_TLBWE_EXITS, > EMULATED_RFI_EXITS, > EMULATED_RFCI_EXITS, > +EMULATED_RFMCI_EXITS, I would quite frankly prefer to see us abandon the whole exit timing framework in the kernel and instead use trace points. Then we don't have to maintain all of this randomly exercised code. Would this map well to tracepoints? We're not trying to track discrete events, so much as accumulated time spent in different areas. I think so. We'd just have to emit tracepoints as soon as we enter handle_exit and in prepare_to_enter. Then a user space program should have everything it needs to create statistics out of that. It would certainly simplify the entry/exit path. I was hoping that wasn't going to be your answer. :-) Such a change would introduce a new dependency, more complexity, and the possibility for bad totals to result from a ring buffer filling faster than userspace can drain it. I also don't see how it would simplify entry/exit, since we'd still need to take timestamps in the same places, in order to record a final event that says how long a particular event took. 
-Scott
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 07/09/2013 07:16 PM, Scott Wood wrote: On 07/08/2013 01:45:58 PM, Alexander Graf wrote: On 03.07.2013, at 15:30, Mihai Caraman wrote: > Some guests are making use of return from machine check instruction > to do crazy things even though the 64-bit kernel doesn't handle yet > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction accordingly. > > Signed-off-by: Mihai Caraman > --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kvm/booke_emulate.c| 25 + > arch/powerpc/kvm/timing.c |1 + > 3 files changed, 27 insertions(+), 0 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h > index af326cd..0466789 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -148,6 +148,7 @@ enum kvm_exit_types { > EMULATED_TLBWE_EXITS, > EMULATED_RFI_EXITS, > EMULATED_RFCI_EXITS, > +EMULATED_RFMCI_EXITS, I would quite frankly prefer to see us abandon the whole exit timing framework in the kernel and instead use trace points. Then we don't have to maintain all of this randomly exercised code. Would this map well to tracepoints? We're not trying to track discrete events, so much as accumulated time spent in different areas. I think so. We'd just have to emit tracepoints as soon as we enter handle_exit and in prepare_to_enter. Then a user space program should have everything it needs to create statistics out of that. It would certainly simplify the entry/exit path. Alex
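The scheme Alex sketches above (tracepoints at guest exit in handle_exit and at re-entry in prepare_to_enter, with userspace connecting the dots) amounts to pairing each exit event with the next entry event and accumulating the interval per exit type. A minimal illustration over hypothetical timestamped records, which is what a kvm_stat-style consumer would do with the trace stream:

```python
from collections import defaultdict

# Hypothetical (timestamp_us, event, detail) records, as a userspace tool
# might collect from guest-exit/guest-entry tracepoints; the timestamps
# and exit-type names are made up for the example.
EVENTS = [
    (100.0, "exit", "DTLB_MISS"),
    (107.5, "entry", None),
    (200.0, "exit", "EMULATED_RFI"),
    (203.0, "entry", None),
    (300.0, "exit", "DTLB_MISS"),
    (301.0, "entry", None),
]

def accumulate_exit_time(events):
    """Pair each exit with the following entry; sum time per exit type."""
    totals = defaultdict(float)
    pending = None  # (timestamp, exit_type) of an exit awaiting its entry
    for ts, kind, detail in events:
        if kind == "exit":
            pending = (ts, detail)
        elif kind == "entry" and pending is not None:
            start, exit_type = pending
            totals[exit_type] += ts - start
            pending = None
    return dict(totals)

print(accumulate_exit_time(EVENTS))
```

This reproduces the [exit ... entry] measurement the in-kernel timing stats perform, just computed after the fact from the trace log.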
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
> On 03.07.2013, at 15:30, Mihai Caraman wrote:
>> Some guests are making use of the return from machine check
>> instruction to do crazy things even though the 64-bit kernel doesn't
>> yet handle this interrupt. Emulate the MCSRR0/1 SPRs and the rfmci
>> instruction accordingly.
>>
>> [...]
>>
>> @@ -148,6 +148,7 @@ enum kvm_exit_types {
>>  	EMULATED_TLBWE_EXITS,
>>  	EMULATED_RFI_EXITS,
>>  	EMULATED_RFCI_EXITS,
>> +	EMULATED_RFMCI_EXITS,
>
> I would quite frankly prefer to see us abandon the whole exit timing
> framework in the kernel and instead use trace points. Then we don't
> have to maintain all of this randomly exercised code.

Would this map well to tracepoints? We're not trying to track discrete
events so much as accumulated time spent in different areas.

-Scott
Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
On 03.07.2013, at 15:30, Mihai Caraman wrote:
> Some guests are making use of the return from machine check
> instruction to do crazy things even though the 64-bit kernel doesn't
> yet handle this interrupt. Emulate the MCSRR0/1 SPRs and the rfmci
> instruction accordingly.
>
> [...]
>
> @@ -148,6 +148,7 @@ enum kvm_exit_types {
>  	EMULATED_TLBWE_EXITS,
>  	EMULATED_RFI_EXITS,
>  	EMULATED_RFCI_EXITS,
> +	EMULATED_RFMCI_EXITS,

I would quite frankly prefer to see us abandon the whole exit timing
framework in the kernel and instead use trace points. Then we don't
have to maintain all of this randomly exercised code.

FWIW, I think in this case, however, treating RFMCI the same as RFI or
generic "instruction emulation" shouldn't hurt. This whole table is only
about timing measurements. If you want to know for real what's going on,
use trace points.

Otherwise looks good.

Alex

> [rest of patch snipped]
[PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
Some guests are making use of the return from machine check instruction
to do crazy things even though the 64-bit kernel doesn't yet handle this
interrupt. Emulate the MCSRR0/1 SPRs and the rfmci instruction
accordingly.

Signed-off-by: Mihai Caraman
---
 arch/powerpc/include/asm/kvm_host.h |    1 +
 arch/powerpc/kvm/booke_emulate.c    |   25 +
 arch/powerpc/kvm/timing.c           |    1 +
 3 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index af326cd..0466789 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -148,6 +148,7 @@ enum kvm_exit_types {
 	EMULATED_TLBWE_EXITS,
 	EMULATED_RFI_EXITS,
 	EMULATED_RFCI_EXITS,
+	EMULATED_RFMCI_EXITS,
 	DEC_EXITS,
 	EXT_INTR_EXITS,
 	HALT_WAKEUP,
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 27a4b28..aaff1b7 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -23,6 +23,7 @@
 
 #include "booke.h"
 
+#define OP_19_XOP_RFMCI   38
 #define OP_19_XOP_RFI     50
 #define OP_19_XOP_RFCI    51
 
@@ -43,6 +44,12 @@ static void kvmppc_emul_rfci(struct kvm_vcpu *vcpu)
 	kvmppc_set_msr(vcpu, vcpu->arch.csrr1);
 }
 
+static void kvmppc_emul_rfmci(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.pc = vcpu->arch.mcsrr0;
+	kvmppc_set_msr(vcpu, vcpu->arch.mcsrr1);
+}
+
 int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			    unsigned int inst, int *advance)
 {
@@ -65,6 +72,12 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		*advance = 0;
 		break;
 
+	case OP_19_XOP_RFMCI:
+		kvmppc_emul_rfmci(vcpu);
+		kvmppc_set_exit_type(vcpu, EMULATED_RFMCI_EXITS);
+		*advance = 0;
+		break;
+
 	default:
 		emulated = EMULATE_FAIL;
 		break;
@@ -138,6 +151,12 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
 	case SPRN_DBCR1:
 		vcpu->arch.dbg_reg.dbcr1 = spr_val;
 		break;
+	case SPRN_MCSRR0:
+		vcpu->arch.mcsrr0 = spr_val;
+		break;
+	case SPRN_MCSRR1:
+		vcpu->arch.mcsrr1 = spr_val;
+		break;
 	case SPRN_DBSR:
 		vcpu->arch.dbsr &= ~spr_val;
 		break;
@@ -284,6 +303,12 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
 	case SPRN_DBCR1:
 		*spr_val = vcpu->arch.dbg_reg.dbcr1;
 		break;
+	case SPRN_MCSRR0:
+		*spr_val = vcpu->arch.mcsrr0;
+		break;
+	case SPRN_MCSRR1:
+		*spr_val = vcpu->arch.mcsrr1;
+		break;
 	case SPRN_DBSR:
 		*spr_val = vcpu->arch.dbsr;
 		break;
diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index c392d26..670f63d 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -129,6 +129,7 @@ static const char *kvm_exit_names[__NUMBER_OF_KVM_EXIT_TYPES] = {
 	[EMULATED_TLBSX_EXITS] = "EMUL_TLBSX",
 	[EMULATED_TLBWE_EXITS] = "EMUL_TLBWE",
 	[EMULATED_RFI_EXITS] = "EMUL_RFI",
+	[EMULATED_RFMCI_EXITS] = "EMUL_RFMCI",
 	[DEC_EXITS] = "DEC",
 	[EXT_INTR_EXITS] = "EXTINT",
 	[HALT_WAKEUP] = "HALT",
-- 
1.7.3.4