Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On (Tue) 21 Aug 2012 [11:47:16], Rusty Russell wrote:
> On Thu, 9 Aug 2012 15:46:20 +0530, Amit Shah wrote:
> > Hi,
> >
> > On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
> > > Hi All,
> > >
> > > The following patch set provides a low-overhead system for collecting
> > > kernel tracing data of guests by a host in a virtualization environment.
> >
> > So I just have one minor comment, please post a non-RFC version of the
> > patch.
> >
> > Since you have an ACK from Steven for the ftrace patch, I guess Rusty
> > can push this in via his virtio tree?
> >
> > I'll ack the virtio-console bits in the next series you send.
>
> You didn't Ack, BTW.  At least, AFAICT.

Ah, sorry.  Will do that now.

		Amit
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Thu, 9 Aug 2012 15:46:20 +0530, Amit Shah wrote:
> Hi,
>
> On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
> > Hi All,
> >
> > The following patch set provides a low-overhead system for collecting
> > kernel tracing data of guests by a host in a virtualization environment.
>
> So I just have one minor comment, please post a non-RFC version of the
> patch.
>
> Since you have an ACK from Steven for the ftrace patch, I guess Rusty
> can push this in via his virtio tree?
>
> I'll ack the virtio-console bits in the next series you send.

You didn't Ack, BTW.  At least, AFAICT.

Cheers,
Rusty.
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
Hi,

On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
> Hi All,
>
> The following patch set provides a low-overhead system for collecting
> kernel tracing data of guests by a host in a virtualization environment.

So I just have one minor comment, please post a non-RFC version of the
patch.

Since you have an ACK from Steven for the ftrace patch, I guess Rusty
can push this in via his virtio tree?

I'll ack the virtio-console bits in the next series you send.

Thanks,
Amit
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
Hi Amit,

Sorry for the late reply.

(2012/07/27 18:43), Amit Shah wrote:
> On (Fri) 27 Jul 2012 [17:55:11], Yoshihiro YUNOMAE wrote:
>> Hi Amit,
>>
>> Thank you for commenting on our work.
>>
>> (2012/07/26 20:35), Amit Shah wrote:
>>> On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
>>
>> [...]
>>
>>>> ***Just enhancement ideas***
>>>>  - Support for trace-cmd
>>>>  - Support for 9pfs protocol
>>>>  - Support for non-blocking mode in QEMU
>>>
>>> There were patches long back (by me) to make chardevs non-blocking but
>>> they didn't make it upstream. Fedora carries them, if you want to try
>>> out. Though we want to converge on a reasonable solution that's
>>> acceptable upstream as well. Just that no one's working on it
>>> currently. Any help here will be appreciated.
>>
>> Thanks! In this case, since a guest will stop running when the host
>> reads trace data of the guest, the char device needs a non-blocking
>> mode. I'll read your patch series. Is the latest version 8?
>> http://lists.gnu.org/archive/html/qemu-devel/2010-12/msg00035.html
>
> I suppose the latest version on-list is what you quote above. The
> objections to the patch series are mentioned in Anthony's mails.

I'll check the mails.

> Hans maintains a rebased version of the patches in his tree at
> http://cgit.freedesktop.org/~jwrdegoede/qemu/
> those patches are included in Fedora's qemu-kvm, so you can try that
> out if it improves performance for you.

Thanks. I'll check those patches.

>>>>  - Make "vhost-serial"
>>>
>>> I need to understand a) why it's perf-critical, and b) why should the
>>> host be involved at all, to comment on these.
>>
>> a) To decrease the collecting overhead for applications on a guest.
>>    (see above)
>> b) Trace data of the host kernel is not involved even if we introduce
>>    this patch set.
>
> I see, so you suggested vhost-serial only because you saw the guest
> stopping problem due to the absence of non-blocking code? If so, it now
> makes sense. I don't think we need vhost-serial in any way yet.

I understood. We suggested vhost-serial as one of the ideas for
improving performance. Other features (trace-cmd, 9pfs, and
non-blocking chardev) should be supported first, I think.

> BTW where do you parse the trace data obtained from guests? On a
> remote host?

It is best if we can parse the data on a remote host in this tracing
system. Existing trace-cmd can already parse it on a remote site. If we
add a feature collecting event-format data (the guest's debugfs has
that) from guests, we can parse tracing data on a remote host as well
as on the host running the guests.

Thank you,

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Wed, Jul 25, 2012 at 8:15 AM, Masami Hiramatsu wrote:
> (2012/07/25 5:26), Blue Swirl wrote:
>>> The following patch set provides a low-overhead system for collecting
>>> kernel tracing data of guests by a host in a virtualization environment.
>>>
>>> A guest OS generally shares some devices with other guests or a host, so
>>> reasons of any problems occurring in a guest may be from other guests or
>>> a host. Then, to collect some tracing data of a number of guests and a
>>> host is needed when some problems occur in a virtualization environment.
>>> One of the methods to realize that is to collect tracing data of guests
>>> in a host. To do this, the network is generally used. However, a high
>>> load will be put on applications on guests using network I/O because
>>> there are many network stack layers. Therefore, a communication method
>>> for collecting the data without using the network is needed.
>>
>> I implemented something similar earlier by passing trace data from
>> OpenBIOS to QEMU using the firmware configuration device. The data
>> format was the same as QEMU used for simpletrace event structure
>> instead of ftrace. I didn't commit it because of a few problems.
>
> Sounds interesting :)
> I guess you traced BIOS events, right?

Yes, I converted a few DPRINTFs to tracepoints as a proof of concept.

>> I'm not familiar with ftrace, is it possible to trace two guest
>> applications (BIOS and kernel) at the same time?
>
> Since ftrace itself is a tracing feature in the linux kernel, it
> can trace two or more applications (processes) if those run on linux
> kernel. However, I think OpenBIOS runs *under* the guest kernel.
> If so, ftrace currently can't trace OpenBIOS from guest side.

No, OpenBIOS boots the machine and then passes control to the boot
loader and that to the kernel. The kernel will make a few calls to
OpenBIOS at start but not later. OpenBIOS is used by QEMU as the Sparc
and PowerPC BIOS.

> I think it may need another enhancement on both OpenBIOS and linux
> kernel to trace BIOS event from linux kernel.

Ideally both OpenBIOS and Linux should be able to feed trace events
back to QEMU independently.

>> Or could this be
>> handled by opening two different virtio-serial pipes, one for BIOS and
>> the other for the kernel?
>
> Of course, virtio-serial itself can open multiple channels, thus, if
> OpenBIOS can handle virtio, it can pass trace data via another
> channel.

Currently OpenBIOS probes the PCI bus and identifies virtio devices but
ignores them, so adding virtio-serial support shouldn't be too hard.
There's a time window between CPU boot and PCI probe when the device
will not be available though.

>> In my version, the tracepoint ID would have been used to demultiplex
>> QEMU tracepoints from BIOS tracepoints, but something like separate ID
>> spaces would have been better.
>
> I guess your feature notifies events to QEMU and QEMU records that in
> their own buffer. Therefore it must have different tracepoint IDs.
> On the other hand, with this feature, QEMU just passes trace-data to
> host-side pipe. Since outer tracing tool separately collects trace
> data, we don't need to demultiplex the data.
>
> Perhaps, in the analyzing phase (after tracing), we have to mix events
> again. At that time, we'll add some guest-ID for each event-ID, but
> it can be done offline.

Yes, the multiplexing/demultiplexing is only needed in my version
because the feeds are not independent.

> Best Regards,
>
> --
> Masami HIRAMATSU
> Software Platform Research Dept. Linux Technology Center
> Hitachi, Ltd., Yokohama Research Laboratory
> E-mail: masami.hiramatsu...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On (Fri) 27 Jul 2012 [17:55:11], Yoshihiro YUNOMAE wrote:
> Hi Amit,
>
> Thank you for commenting on our work.
>
> (2012/07/26 20:35), Amit Shah wrote:
> >On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
>
> [...]
>
> >>Therefore, we propose a new system "virtio-trace", which uses enhanced
> >>virtio-serial and the existing ring-buffer of ftrace, for collecting
> >>guest kernel tracing data. In this system, there are 5 main components:
> >> (1) Ring-buffer of ftrace in a guest
> >>     - When the trace agent reads the ring-buffer, a page is removed
> >>       from the ring-buffer.
> >> (2) Trace agent in the guest
> >>     - Splice the page of the ring-buffer to read_pipe using splice()
> >>       without memory copying. Then, the page is spliced from
> >>       write_pipe to virtio without memory copying.
> >
> >I really like the splicing idea.
>
> Thanks. We will improve this patch set.
>
> >> (3) Virtio-console driver in the guest
> >>     - Pass the page to the virtio-ring
> >> (4) Virtio-serial bus in QEMU
> >>     - Copy the page to a kernel pipe
> >> (5) Reader in the host
> >>     - Read guest tracing data via a FIFO (named pipe)
> >
> >So will this be useful only if guest and host run the same kernel?
> >
> >I'd like to see the host kernel not being used at all -- collect all
> >relevant info from the guest and send it out to qemu, where it can be
> >consumed directly by apps driving the tracing.
>
> No, this patch set is used only for guest kernels, so guest and host
> don't need to run the same kernel.

OK - that's good to know.

> >>***Evaluation***
> >>When a host collects tracing data of a guest, the performance of using
> >>virtio-trace is compared with that of using native (just running
> >>ftrace), IVRing, and virtio-serial (normal method of read/write).
> >
> >Why is tracing performance-sensitive? i.e. why try to optimise this
> >at all?
>
> To minimize the effects on applications on guests when a host collects
> tracing data of the guests.
>
> For example, we assume the situation where guests A and B are running
> on a host sharing an I/O device. An I/O delay problem occurs in guest
> A, but it doesn't in guest B. In this case, we need to collect tracing
> data of guests A and B, but a usual method using the network puts a
> high load on applications of guest B even if guest B is running
> normally. Therefore, we try to decrease the load on guests.
> We also use this feature for performance analysis on production
> virtualization systems.

OK, got it.

> [...]
>
> >>***Just enhancement ideas***
> >> - Support for trace-cmd
> >> - Support for 9pfs protocol
> >> - Support for non-blocking mode in QEMU
> >
> >There were patches long back (by me) to make chardevs non-blocking but
> >they didn't make it upstream. Fedora carries them, if you want to try
> >out. Though we want to converge on a reasonable solution that's
> >acceptable upstream as well. Just that no one's working on it
> >currently. Any help here will be appreciated.
>
> Thanks! In this case, since a guest will stop running when the host
> reads trace data of the guest, the char device needs a non-blocking
> mode. I'll read your patch series. Is the latest version 8?
> http://lists.gnu.org/archive/html/qemu-devel/2010-12/msg00035.html

I suppose the latest version on-list is what you quote above. The
objections to the patch series are mentioned in Anthony's mails.

Hans maintains a rebased version of the patches in his tree at
http://cgit.freedesktop.org/~jwrdegoede/qemu/
those patches are included in Fedora's qemu-kvm, so you can try that
out if it improves performance for you.

> >> - Make "vhost-serial"
> >
> >I need to understand a) why it's perf-critical, and b) why should the
> >host be involved at all, to comment on these.
>
> a) To decrease the collecting overhead for applications on a guest.
>    (see above)
> b) Trace data of the host kernel is not involved even if we introduce
>    this patch set.

I see, so you suggested vhost-serial only because you saw the guest
stopping problem due to the absence of non-blocking code? If so, it now
makes sense. I don't think we need vhost-serial in any way yet.

BTW where do you parse the trace data obtained from guests? On a
remote host?

Thanks,
Amit
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
Hi Amit,

Thank you for commenting on our work.

(2012/07/26 20:35), Amit Shah wrote:
> On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:

[...]

>> Therefore, we propose a new system "virtio-trace", which uses enhanced
>> virtio-serial and the existing ring-buffer of ftrace, for collecting
>> guest kernel tracing data. In this system, there are 5 main components:
>>  (1) Ring-buffer of ftrace in a guest
>>      - When the trace agent reads the ring-buffer, a page is removed
>>        from the ring-buffer.
>>  (2) Trace agent in the guest
>>      - Splice the page of the ring-buffer to read_pipe using splice()
>>        without memory copying. Then, the page is spliced from
>>        write_pipe to virtio without memory copying.
>
> I really like the splicing idea.

Thanks. We will improve this patch set.

>>  (3) Virtio-console driver in the guest
>>      - Pass the page to the virtio-ring
>>  (4) Virtio-serial bus in QEMU
>>      - Copy the page to a kernel pipe
>>  (5) Reader in the host
>>      - Read guest tracing data via a FIFO (named pipe)
>
> So will this be useful only if guest and host run the same kernel?
>
> I'd like to see the host kernel not being used at all -- collect all
> relevant info from the guest and send it out to qemu, where it can be
> consumed directly by apps driving the tracing.

No, this patch set is used only for guest kernels, so guest and host
don't need to run the same kernel.

>> ***Evaluation***
>> When a host collects tracing data of a guest, the performance of using
>> virtio-trace is compared with that of using native (just running
>> ftrace), IVRing, and virtio-serial (normal method of read/write).
>
> Why is tracing performance-sensitive? i.e. why try to optimise this
> at all?

To minimize the effects on applications on guests when a host collects
tracing data of the guests.

For example, we assume the situation where guests A and B are running
on a host sharing an I/O device. An I/O delay problem occurs in guest
A, but it doesn't in guest B. In this case, we need to collect tracing
data of guests A and B, but a usual method using the network puts a
high load on applications of guest B even if guest B is running
normally. Therefore, we try to decrease the load on guests.
We also use this feature for performance analysis on production
virtualization systems.

[...]

>> ***Just enhancement ideas***
>>  - Support for trace-cmd
>>  - Support for 9pfs protocol
>>  - Support for non-blocking mode in QEMU
>
> There were patches long back (by me) to make chardevs non-blocking but
> they didn't make it upstream. Fedora carries them, if you want to try
> out. Though we want to converge on a reasonable solution that's
> acceptable upstream as well. Just that no one's working on it
> currently. Any help here will be appreciated.

Thanks! In this case, since a guest will stop running when the host
reads trace data of the guest, the char device needs a non-blocking
mode. I'll read your patch series. Is the latest version 8?
http://lists.gnu.org/archive/html/qemu-devel/2010-12/msg00035.html

>>  - Make "vhost-serial"
>
> I need to understand a) why it's perf-critical, and b) why should the
> host be involved at all, to comment on these.

a) To decrease the collecting overhead for applications on a guest.
   (see above)
b) Trace data of the host kernel is not involved even if we introduce
   this patch set.

Thank you,

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
> Hi All,
>
> The following patch set provides a low-overhead system for collecting
> kernel tracing data of guests by a host in a virtualization environment.
>
> A guest OS generally shares some devices with other guests or a host, so
> reasons of any problems occurring in a guest may be from other guests or
> a host. Then, to collect some tracing data of a number of guests and a
> host is needed when some problems occur in a virtualization environment.
> One of the methods to realize that is to collect tracing data of guests
> in a host. To do this, the network is generally used. However, a high
> load will be put on applications on guests using network I/O because
> there are many network stack layers. Therefore, a communication method
> for collecting the data without using the network is needed.
>
> We submitted a patch set of "IVRing", a ring-buffer driver constructed on
> Inter-VM shared memory (IVShmem), to LKML http://lwn.net/Articles/500304/
> this June. IVRing and the IVRing reader use POSIX shared memory with each
> other without using the network, so a low-overhead system for collecting
> guest tracing data is realized. However, this patch set has some problems
> as follows:
>  - use IVShmem instead of virtio
>  - create a new ring-buffer without using the existing ring-buffer in the
>    kernel
>  - scalability
>    -- not support SMP environment
>    -- buffer size limitation
>    -- not support live migration (maybe difficult to realize this)
>
> Therefore, we propose a new system "virtio-trace", which uses enhanced
> virtio-serial and the existing ring-buffer of ftrace, for collecting
> guest kernel tracing data. In this system, there are 5 main components:
>  (1) Ring-buffer of ftrace in a guest
>      - When the trace agent reads the ring-buffer, a page is removed
>        from the ring-buffer.
>  (2) Trace agent in the guest
>      - Splice the page of the ring-buffer to read_pipe using splice()
>        without memory copying. Then, the page is spliced from write_pipe
>        to virtio without memory copying.

I really like the splicing idea.

>  (3) Virtio-console driver in the guest
>      - Pass the page to the virtio-ring
>  (4) Virtio-serial bus in QEMU
>      - Copy the page to a kernel pipe
>  (5) Reader in the host
>      - Read guest tracing data via a FIFO (named pipe)

So will this be useful only if guest and host run the same kernel?

I'd like to see the host kernel not being used at all -- collect all
relevant info from the guest and send it out to qemu, where it can be
consumed directly by apps driving the tracing.

> ***Evaluation***
> When a host collects tracing data of a guest, the performance of using
> virtio-trace is compared with that of using native (just running
> ftrace), IVRing, and virtio-serial (normal method of read/write).

Why is tracing performance-sensitive? i.e. why try to optimise this
at all?

> The overview of this evaluation is as follows:
> (a) A guest on a KVM is prepared.
>     - The guest is dedicated one physical CPU as a virtual CPU (VCPU).
>
> (b) The guest starts to write tracing data to the ring-buffer of ftrace.
>     - The probe points are all trace points of sched, timer, and kmem.
>
> (c) While writing trace data, dhrystone 2 in UNIX bench is executed as a
>     benchmark tool in the guest.
>     - Dhrystone 2 indicates system performance by repeating integer
>       arithmetic, as a score.
>     - Since a higher score equals better system performance, if the score
>       decreases relative to the bare environment, it indicates that some
>       operation disturbs the integer arithmetic. The overhead of
>       transporting trace data is then calculated as follows:
>        OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.
>
> The performance of each method is compared as follows:
>  [1] Native
>      - only recording trace data to the ring-buffer on a guest
>  [2] Virtio-trace
>      - running a trace agent on a guest
>      - a reader on a host opens the FIFO using the cat command
>  [3] IVRing
>      - A SystemTap script in a guest records trace data to IVRing.
>        -- probe points are the same as ftrace.
>  [4] Virtio-serial (normal)
>      - A reader (using cat) on a guest outputs trace data to a host
>        using standard output via virtio-serial.
>
> Other information is as follows:
>  - host
>    kernel: 3.3.7-1 (Fedora16)
>    CPU: Intel Xeon x5660@2.80GHz (12 cores)
>    Memory: 48GB
>
>  - guest (only booting one guest)
>    kernel: 3.5.0-rc4+ (Fedora16)
>    CPU: 1 VCPU (dedicated)
>    Memory: 1GB
>
> 3 patterns based on the bare environment were indicated as follows:
>                        Scores        overhead against [0] Native
>  [0] Native:           28807569.5    -
>  [1] Virtio-trace:     28685049.5    0.43%
>  [2] IVRing:           28418595.5    1.35%
>  [3] Virtio-serial:    13262258.7    53.96%
>
> ***Just enhancement ideas***
>  - Support for trace-cmd
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Wed, Jul 25, 2012 at 10:13 AM, Yoshihiro YUNOMAE wrote:
> Hi Stefan,
>
> (2012/07/24 22:41), Stefan Hajnoczi wrote:
>> On Tue, Jul 24, 2012 at 12:19 PM, Yoshihiro YUNOMAE wrote:
>>>> Are you using text formatted ftrace?
>>>
>>> No, currently using raw format, but we'd like to reformat it in text.
>>
>> Capturing the info necessary to translate numbers into symbols is one
>> of the problems of host<->guest tracing so I'm curious how you handle
>> this :).
>
> Right, your consideration is true.
>
>> Apologies for my lack of ftrace knowledge but how useful is the raw
>> tracing data on the host? How do you pretty-print it in
>> human-readable form?
>
> perf and trace-cmd can actually translate raw-formatted trace data to
> text-formatted trace data by using information of the kernel or trace
> format under the tracing/events directory in debugfs. In the same way,
> if the information of a guest is exported to a host, we can translate
> raw trace data of a guest to text trace data on a host. We will use
> 9pfs to export that.

Thanks, it's clear now :).

Stefan
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
Hi Stefan,

(2012/07/24 22:41), Stefan Hajnoczi wrote:
> On Tue, Jul 24, 2012 at 12:19 PM, Yoshihiro YUNOMAE wrote:
>>> Are you using text formatted ftrace?
>>
>> No, currently using raw format, but we'd like to reformat it in text.
>
> Capturing the info necessary to translate numbers into symbols is one
> of the problems of host<->guest tracing so I'm curious how you handle
> this :).

Right, your consideration is true.

> Apologies for my lack of ftrace knowledge but how useful is the raw
> tracing data on the host? How do you pretty-print it in
> human-readable form?

perf and trace-cmd can actually translate raw-formatted trace data to
text-formatted trace data by using information of the kernel or trace
format under the tracing/events directory in debugfs. In the same way,
if the information of a guest is exported to a host, we can translate
raw trace data of a guest to text trace data on a host. We will use
9pfs to export that.

Thank you,

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
(2012/07/25 5:26), Blue Swirl wrote:
>> The following patch set provides a low-overhead system for collecting
>> kernel tracing data of guests by a host in a virtualization environment.
>>
>> A guest OS generally shares some devices with other guests or a host, so
>> reasons of any problems occurring in a guest may be from other guests or
>> a host. Then, to collect some tracing data of a number of guests and a
>> host is needed when some problems occur in a virtualization environment.
>> One of the methods to realize that is to collect tracing data of guests
>> in a host. To do this, the network is generally used. However, a high
>> load will be put on applications on guests using network I/O because
>> there are many network stack layers. Therefore, a communication method
>> for collecting the data without using the network is needed.
>
> I implemented something similar earlier by passing trace data from
> OpenBIOS to QEMU using the firmware configuration device. The data
> format was the same as QEMU used for simpletrace event structure
> instead of ftrace. I didn't commit it because of a few problems.

Sounds interesting :)
I guess you traced BIOS events, right?

> I'm not familiar with ftrace, is it possible to trace two guest
> applications (BIOS and kernel) at the same time?

Since ftrace itself is a tracing feature in the linux kernel, it
can trace two or more applications (processes) if those run on linux
kernel. However, I think OpenBIOS runs *under* the guest kernel.
If so, ftrace currently can't trace OpenBIOS from guest side.

I think it may need another enhancement on both OpenBIOS and linux
kernel to trace BIOS event from linux kernel.

> Or could this be
> handled by opening two different virtio-serial pipes, one for BIOS and
> the other for the kernel?

Of course, virtio-serial itself can open multiple channels, thus, if
OpenBIOS can handle virtio, it can pass trace data via another
channel.

> In my version, the tracepoint ID would have been used to demultiplex
> QEMU tracepoints from BIOS tracepoints, but something like separate ID
> spaces would have been better.

I guess your feature notifies events to QEMU and QEMU records that in
their own buffer. Therefore it must have different tracepoint IDs.
On the other hand, with this feature, QEMU just passes trace-data to
host-side pipe. Since outer tracing tool separately collects trace
data, we don't need to demultiplex the data.

Perhaps, in the analyzing phase (after tracing), we have to mix events
again. At that time, we'll add some guest-ID for each event-ID, but
it can be done offline.

Best Regards,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Tue, Jul 24, 2012 at 2:36 AM, Yoshihiro YUNOMAE wrote:
> Hi All,
>
> The following patch set provides a low-overhead system for collecting
> kernel tracing data of guests by a host in a virtualization environment.
>
> A guest OS generally shares some devices with other guests or a host, so
> reasons of any problems occurring in a guest may be from other guests or
> a host. Then, to collect some tracing data of a number of guests and a
> host is needed when some problems occur in a virtualization environment.
> One of the methods to realize that is to collect tracing data of guests
> in a host. To do this, the network is generally used. However, a high
> load will be put on applications on guests using network I/O because
> there are many network stack layers. Therefore, a communication method
> for collecting the data without using the network is needed.

I implemented something similar earlier by passing trace data from
OpenBIOS to QEMU using the firmware configuration device. The data
format was the same as QEMU used for simpletrace event structure
instead of ftrace. I didn't commit it because of a few problems.

I'm not familiar with ftrace, is it possible to trace two guest
applications (BIOS and kernel) at the same time? Or could this be
handled by opening two different virtio-serial pipes, one for BIOS and
the other for the kernel?

In my version, the tracepoint ID would have been used to demultiplex
QEMU tracepoints from BIOS tracepoints, but something like separate ID
spaces would have been better.

> We submitted a patch set of "IVRing", a ring-buffer driver constructed on
> Inter-VM shared memory (IVShmem), to LKML http://lwn.net/Articles/500304/
> this June. IVRing and the IVRing reader use POSIX shared memory with each
> other without using the network, so a low-overhead system for collecting
> guest tracing data is realized. However, this patch set has some problems
> as follows:
>  - use IVShmem instead of virtio
>  - create a new ring-buffer without using the existing ring-buffer in the
>    kernel
>  - scalability
>    -- not support SMP environment
>    -- buffer size limitation
>    -- not support live migration (maybe difficult to realize this)
>
> Therefore, we propose a new system "virtio-trace", which uses enhanced
> virtio-serial and the existing ring-buffer of ftrace, for collecting
> guest kernel tracing data. In this system, there are 5 main components:
>  (1) Ring-buffer of ftrace in a guest
>      - When the trace agent reads the ring-buffer, a page is removed
>        from the ring-buffer.
>  (2) Trace agent in the guest
>      - Splice the page of the ring-buffer to read_pipe using splice()
>        without memory copying. Then, the page is spliced from write_pipe
>        to virtio without memory copying.
>  (3) Virtio-console driver in the guest
>      - Pass the page to the virtio-ring
>  (4) Virtio-serial bus in QEMU
>      - Copy the page to a kernel pipe
>  (5) Reader in the host
>      - Read guest tracing data via a FIFO (named pipe)
>
> ***Evaluation***
> When a host collects tracing data of a guest, the performance of using
> virtio-trace is compared with that of using native (just running
> ftrace), IVRing, and virtio-serial (normal method of read/write).
>
> The overview of this evaluation is as follows:
> (a) A guest on a KVM is prepared.
>     - The guest is dedicated one physical CPU as a virtual CPU (VCPU).
>
> (b) The guest starts to write tracing data to the ring-buffer of ftrace.
>     - The probe points are all trace points of sched, timer, and kmem.
>
> (c) While writing trace data, dhrystone 2 in UNIX bench is executed as a
>     benchmark tool in the guest.
>     - Dhrystone 2 indicates system performance by repeating integer
>       arithmetic, as a score.
>     - Since a higher score equals better system performance, if the score
>       decreases relative to the bare environment, it indicates that some
>       operation disturbs the integer arithmetic. The overhead of
>       transporting trace data is then calculated as follows:
>        OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.
>
> The performance of each method is compared as follows:
>  [1] Native
>      - only recording trace data to the ring-buffer on a guest
>  [2] Virtio-trace
>      - running a trace agent on a guest
>      - a reader on a host opens the FIFO using the cat command
>  [3] IVRing
>      - A SystemTap script in a guest records trace data to IVRing.
>        -- probe points are the same as ftrace.
>  [4] Virtio-serial (normal)
>      - A reader (using cat) on a guest outputs trace data to a host
>        using standard output via virtio-serial.
>
> Other information is as follows:
>  - host
>    kernel: 3.3.7-1 (Fedora16)
>    CPU: Intel Xeon x5660@2.80GHz (12 cores)
>    Memory: 48GB
>
>  - guest (only booting one guest)
>    kernel: 3.5.0-rc4+ (Fedora16)
>    CPU: 1 VCPU (dedicated)
>    Memory: 1GB
>
> 3 patterns based on the bare environment were indicated as follows:
>                        Scores        overhead against [0]
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Tue, Jul 24, 2012 at 12:03 PM, Masami Hiramatsu wrote:
> (2012/07/24 19:02), Stefan Hajnoczi wrote:
>> On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE wrote:
>>> The performance of each method is compared as follows:
>>>  [1] Native
>>>      - only recording trace data to the ring-buffer on a guest
>>>  [2] Virtio-trace
>>>      - running a trace agent on a guest
>>>      - a reader on a host opens the FIFO using the cat command
>>>  [3] IVRing
>>>      - A SystemTap script in a guest records trace data to IVRing.
>>>        -- probe points are the same as ftrace.
>>>  [4] Virtio-serial (normal)
>>>      - A reader (using cat) on a guest outputs trace data to a host
>>>        using standard output via virtio-serial.
>>
>> The first time I read this I thought you are adding a new virtio-trace
>> device. But it looks like this series really adds splice support to
>> virtio-console and that yields a big performance improvement when
>> sending trace_pipe_raw.
>
> Yes, sorry for the confusion. Actually this is an enhancement of
> virtio-serial. I'm working with Yoshihiro on this feature.
>
>> Guest ftrace is useful and I like this. Have you thought about
>> controlling ftrace from the host? Perhaps a command could be added to
>> the QEMU guest agent which basically invokes trace-cmd/perf.
>
> As you can see, guest trace-agent can be controlled via a
> control channel. In our scenario, host tools can control that
> instead of guest one.
>
> We are considering exporting the tracing part of the guest's
> debugfs to the host via another virtio-serial channel by using
> 9pfs, so that the host tools can refer to that.
>
> (In this scenario, guest trace-agent will also provide a 9pfs server.
> Since it means that the agent can handle writing a special file,
> trace-agent can be controlled via the special file on the exported
> debugfs.)
>
> Of course, this also requires modifying trace-cmd/perf to accept
> some options like the guest-debugfs mount point, the guest's serial
> channel pipe (or unix socket?), etc. However, it will be a small
> change.

Okay, thanks for explaining some of the ideas you have. I won't ask
more because it's out of scope for this patch series :).

Stefan
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Tue, Jul 24, 2012 at 12:19 PM, Yoshihiro YUNOMAE wrote:
>>> Are you using text formatted ftrace?
>
> No, currently using raw format, but we'd like to reformat it in text.

Capturing the info necessary to translate numbers into symbols is one of the problems of host<->guest tracing, so I'm curious how you handle this :).

Apologies for my lack of ftrace knowledge, but how useful is the raw tracing data on the host? How do you pretty-print it in human-readable form?

Stefan
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
Hi Stefan,

Thank you for commenting on our patch set.

(2012/07/24 20:03), Masami Hiramatsu wrote:
> (2012/07/24 19:02), Stefan Hajnoczi wrote:
>> On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE wrote:
>>> The performance of each method is compared as follows:
>>> [1] Native
>>>     - only recording trace data to the ring buffer on a guest
>>> [2] Virtio-trace
>>>     - running a trace agent on a guest
>>>     - a reader on the host opens the FIFO using the cat command
>>> [3] IVRing
>>>     - A SystemTap script in a guest records trace data to IVRing.
>>>       -- probe points are the same as ftrace.
>>> [4] Virtio-serial (normal)
>>>     - A reader (using cat) on a guest outputs trace data to the host via
>>>       standard output over virtio-serial.
>>
>> The first time I read this I thought you were adding a new virtio-trace
>> device. But it looks like this series really adds splice support to
>> virtio-console, and that yields a big performance improvement when
>> sending trace_pipe_raw.
>
> Yes, sorry for the confusion. Actually this is an enhancement of
> virtio-serial. I'm working with Yoshihiro on this feature.
>
>> Guest ftrace is useful and I like this. Have you thought about
>> controlling ftrace from the host? Perhaps a command could be added to
>> the QEMU guest agent which basically invokes trace-cmd/perf.
>
> As you can see, the guest trace-agent can be controlled via a
> control channel. In our scenario, host tools can control it
> instead of guest ones.
>
> We are considering exporting the tracing part of the guest's
> debugfs to the host via another virtio-serial channel using
> 9pfs, so that the host tools can refer to it.
>
> (In this scenario, the guest trace-agent will also provide a 9pfs server.
> Since that means the agent can handle writes to a special file, the
> trace-agent can be controlled via that special file on the exported
> debugfs.)
>
> Of course, this also requires modifying trace-cmd/perf to accept
> some options like the guest-debugfs mount point, the guest's serial
> channel pipe (or unix socket?), etc. However, it will be a small
> change.
>
> Thank you,

>> Are you using text formatted ftrace?

No, currently using raw format, but we'd like to reformat it in text.

Thank you,

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
(2012/07/24 19:02), Stefan Hajnoczi wrote:
> On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE wrote:
>> The performance of each method is compared as follows:
>> [1] Native
>>     - only recording trace data to the ring buffer on a guest
>> [2] Virtio-trace
>>     - running a trace agent on a guest
>>     - a reader on the host opens the FIFO using the cat command
>> [3] IVRing
>>     - A SystemTap script in a guest records trace data to IVRing.
>>       -- probe points are the same as ftrace.
>> [4] Virtio-serial (normal)
>>     - A reader (using cat) on a guest outputs trace data to the host via
>>       standard output over virtio-serial.
>
> The first time I read this I thought you were adding a new virtio-trace
> device. But it looks like this series really adds splice support to
> virtio-console, and that yields a big performance improvement when
> sending trace_pipe_raw.

Yes, sorry for the confusion. Actually this is an enhancement of virtio-serial. I'm working with Yoshihiro on this feature.

> Guest ftrace is useful and I like this. Have you thought about
> controlling ftrace from the host? Perhaps a command could be added to
> the QEMU guest agent which basically invokes trace-cmd/perf.

As you can see, the guest trace-agent can be controlled via a control channel. In our scenario, host tools can control it instead of guest ones.

We are considering exporting the tracing part of the guest's debugfs to the host via another virtio-serial channel using 9pfs, so that the host tools can refer to it.

(In this scenario, the guest trace-agent will also provide a 9pfs server. Since that means the agent can handle writes to a special file, the trace-agent can be controlled via that special file on the exported debugfs.)

Of course, this also requires modifying trace-cmd/perf to accept some options like the guest-debugfs mount point, the guest's serial channel pipe (or unix socket?), etc. However, it will be a small change.

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE wrote:
> The performance of each method is compared as follows:
> [1] Native
>     - only recording trace data to the ring buffer on a guest
> [2] Virtio-trace
>     - running a trace agent on a guest
>     - a reader on the host opens the FIFO using the cat command
> [3] IVRing
>     - A SystemTap script in a guest records trace data to IVRing.
>       -- probe points are the same as ftrace.
> [4] Virtio-serial (normal)
>     - A reader (using cat) on a guest outputs trace data to the host via
>       standard output over virtio-serial.

The first time I read this I thought you were adding a new virtio-trace device. But it looks like this series really adds splice support to virtio-console, and that yields a big performance improvement when sending trace_pipe_raw.

Guest ftrace is useful and I like this. Have you thought about controlling ftrace from the host? Perhaps a command could be added to the QEMU guest agent which basically invokes trace-cmd/perf.

Are you using text formatted ftrace?

Stefan
Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
(2012/07/24 11:36), Yoshihiro YUNOMAE wrote:
> Therefore, we propose a new system, "virtio-trace", which uses an enhanced
> virtio-serial and the existing ring buffer of ftrace for collecting guest
> kernel tracing data. In this system, there are 5 main components:
>  (1) Ring buffer of ftrace in a guest
>      - When the trace agent reads the ring buffer, a page is removed from it.
>  (2) Trace agent in the guest
>      - Splices a page of the ring buffer to read_pipe using splice() without
>        memory copying; the page is then spliced from write_pipe to virtio,
>        again without memory copying.
>  (3) Virtio-console driver in the guest
>      - Passes the page to the virtio ring.
>  (4) Virtio-serial bus in QEMU
>      - Copies the page to a kernel pipe.
>  (5) Reader in the host
>      - Reads guest tracing data via a FIFO (named pipe).

So, this is our answer to the points argued in the previous thread. These virtio-serial and ftrace enhancements don't introduce a new "ringbuffer" in the kernel; they just use virtio's ring buffer. Also, using splice gives us a great performance advantage because of copy-less trace-data transfer.

Actually, one copy must occur in the host (to write the data into the pipe), because removing physical pages from the guest is hard to track and may involve a TLB flush per page, even if it is done in the background.

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
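[Editor's note: the copy-less guest-side path described above (ftrace ring buffer -> pipe -> virtio-serial port, with the single remaining copy happening on the host side) boils down to a chain of two splice() calls. The sketch below only illustrates that chain; `relay_page` is a hypothetical helper name, not code from the actual trace agent, which additionally handles per-CPU buffers and a control channel.]

```c
/* Sketch of the trace agent's copy-less data path: a page is spliced
 * from an input fd (in the agent, a per-CPU trace_pipe_raw file) into
 * a pipe, then from the pipe into an output fd (the virtio-serial
 * port), so the payload is never copied through user space. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Move up to len bytes from in_fd to out_fd through pipefd.
 * Returns the number of bytes moved, 0 on EOF, or -1 on error. */
ssize_t relay_page(int in_fd, int pipefd[2], int out_fd, size_t len)
{
    /* Steal the data into the pipe without a user-space copy ... */
    ssize_t n = splice(in_fd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
    if (n <= 0)
        return n;
    /* ... then hand the same pipe buffer pages on to the output fd. */
    return splice(pipefd[0], NULL, out_fd, NULL, n, SPLICE_F_MOVE);
}
```

In the agent, `in_fd` would be a `per_cpu/cpuN/trace_pipe_raw` file and `out_fd` the corresponding virtio-serial port device; splice() only requires that at least one side of each call is a pipe.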
[Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace
Hi All,

The following patch set provides a low-overhead system for collecting kernel tracing data of guests by a host in a virtualization environment.

A guest OS generally shares some devices with other guests or with the host, so the cause of a problem occurring in a guest may lie in another guest or in the host. To diagnose such problems, tracing data from a number of guests and from the host must be collected. One way to realize that is to collect the tracing data of guests on the host. Networking is generally used for this, but network I/O places a high load on applications in the guests because of the many network stack layers involved. Therefore, a communication method that collects the data without using the network is needed.

We submitted a patch set for "IVRing", a ring-buffer driver constructed on Inter-VM shared memory (IVShmem), to LKML http://lwn.net/Articles/500304/ this June. IVRing and the IVRing reader communicate through POSIX shared memory without using the network, realizing a low-overhead system for collecting guest tracing data. However, that patch set has some problems:
 - it uses IVShmem instead of virtio
 - it creates a new ring buffer instead of using the existing ring buffer in the kernel
 - scalability
   -- no SMP support
   -- buffer size limitation
   -- no live-migration support (probably difficult to realize)

Therefore, we propose a new system, "virtio-trace", which uses an enhanced virtio-serial and the existing ring buffer of ftrace for collecting guest kernel tracing data. In this system, there are 5 main components:
 (1) Ring buffer of ftrace in a guest
     - When the trace agent reads the ring buffer, a page is removed from it.
 (2) Trace agent in the guest
     - Splices a page of the ring buffer to read_pipe using splice() without
       memory copying; the page is then spliced from write_pipe to virtio,
       again without memory copying.
 (3) Virtio-console driver in the guest
     - Passes the page to the virtio ring.
 (4) Virtio-serial bus in QEMU
     - Copies the page to a kernel pipe.
 (5) Reader in the host
     - Reads guest tracing data via a FIFO (named pipe).

***Evaluation***

When a host collects the tracing data of a guest, the performance of virtio-trace is compared with that of native (just running ftrace), IVRing, and virtio-serial (normal read/write). The overview of this evaluation is as follows:
 (a) A guest on KVM is prepared.
     - The guest has one physical CPU dedicated to it as a virtual CPU (VCPU).
 (b) The guest starts writing tracing data to the ring buffer of ftrace.
     - The probe points are all trace points of sched, timer, and kmem.
 (c) While trace data is being written, dhrystone 2 from UNIX bench is executed
     as a benchmark tool in the guest.
     - Dhrystone 2 measures system performance by repeating integer arithmetic
       and reporting a score.
     - Since a higher score means better system performance, a decrease in
       score relative to the bare environment indicates that some operation is
       disturbing the integer arithmetic. We therefore define the overhead of
       transporting trace data as:
       OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.

The performance of each method is compared as follows:
 [1] Native
     - only recording trace data to the ring buffer on a guest
 [2] Virtio-trace
     - running a trace agent on a guest
     - a reader on the host opens the FIFO using the cat command
 [3] IVRing
     - A SystemTap script in a guest records trace data to IVRing.
       -- probe points are the same as ftrace.
 [4] Virtio-serial (normal)
     - A reader (using cat) on a guest outputs trace data to the host via
       standard output over virtio-serial.
Other information is as follows:
 - host
   kernel: 3.3.7-1 (Fedora 16)
   CPU: Intel Xeon X5660@2.80GHz (12 cores)
   Memory: 48GB
 - guest (only one guest booted)
   kernel: 3.5.0-rc4+ (Fedora 16)
   CPU: 1 VCPU (dedicated)
   Memory: 1GB

The scores of the 3 methods relative to the bare environment are as follows:

                      Scores        overhead against [0] Native
 [0] Native:          28807569.5    -
 [1] Virtio-trace:    28685049.5     0.43%
 [2] IVRing:          28418595.5     1.35%
 [3] Virtio-serial:   13262258.7    53.96%

***Just enhancement ideas***
 - Support for trace-cmd
 - Support for the 9pfs protocol
 - Support for non-blocking mode in QEMU
 - Make "vhost-serial"

Thank you,

---

Masami Hiramatsu (5):
      virtio/console: Allocate scatterlist according to the current pipe size
      ftrace: Allow stealing pages from pipe buffer
      virtio/console: Wait until the port is ready on splice
      virtio/console: Add a failback for unstealable pipe buffer
      virtio/console: Add splice_write support

Yoshihiro YUNOMAE (1):
      tools: Add guest trace agent as a user tool

 drivers/char/virtio_console.c | 198 ++--
 kernel/trace/trace.c
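[Editor's note: as a sanity check, the overhead column above can be recomputed from the raw scores with the formula stated in the evaluation overview. `overhead_pct` is a hypothetical helper name used only for this check.]

```c
/* Recompute the overhead column of the table above from the raw scores,
 * using OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100. */
double overhead_pct(double score, double native_score)
{
    return (1.0 - score / native_score) * 100.0;
}

/* With NATIVE_SCORE = 28807569.5, the results round to the table values:
 *   overhead_pct(28685049.5, 28807569.5)  ->  ~0.43%  (virtio-trace)
 *   overhead_pct(28418595.5, 28807569.5)  ->  ~1.35%  (IVRing)
 *   overhead_pct(13262258.7, 28807569.5)  -> ~53.96%  (virtio-serial)
 */
```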