Re: [PATCH] vhost/scsi: Use safe iteration in vhost_scsi_complete_cmd_work()
On Thu, Nov 16, 2017 at 08:52:39AM +0900, Byungchul Park wrote:
> On Thu, Nov 09, 2017 at 09:17:29AM +0900, Byungchul Park wrote:
> > I am sorry for having made a mistake on it.
>
> Hello Nicholas,
>
> Please consider this patch urgently. I'm sorry for having changed the
> original behavior with the previous patch.
>
> The safe version of llist API should be used to keep the original
> behavior.
>
> Thanks,
> Byungchul

I have included this patch in my tree.

> > -8<-
> > From ba9a0f76dffceffa4fa3aa2d9be49cdb0d9b7d4f Mon Sep 17 00:00:00 2001
> > From: Byungchul Park
> > Date: Thu, 9 Nov 2017 09:00:21 +0900
> > Subject: [PATCH] vhost/scsi: Use safe iteration in
> >  vhost_scsi_complete_cmd_work()
> >
> > The following patch changed the behavior which originally did safe
> > iteration. Make it safe as it was.
> >
> > 12bdcbd539c6327c09da0503c674733cb2d82cb5
> > vhost/scsi: Don't reinvent the wheel but use existing llist API
> >
> > Signed-off-by: Byungchul Park
> > ---
> >  drivers/vhost/scsi.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> > index 046f6d2..46539ca 100644
> > --- a/drivers/vhost/scsi.c
> > +++ b/drivers/vhost/scsi.c
> > @@ -519,7 +519,7 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
> >  					vs_completion_work);
> >  	DECLARE_BITMAP(signal, VHOST_SCSI_MAX_VQ);
> >  	struct virtio_scsi_cmd_resp v_rsp;
> > -	struct vhost_scsi_cmd *cmd;
> > +	struct vhost_scsi_cmd *cmd, *t;
> >  	struct llist_node *llnode;
> >  	struct se_cmd *se_cmd;
> >  	struct iov_iter iov_iter;
> > @@ -527,7 +527,7 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
> >
> >  	bitmap_zero(signal, VHOST_SCSI_MAX_VQ);
> >  	llnode = llist_del_all(&vs->vs_completion_list);
> > -	llist_for_each_entry(cmd, llnode, tvc_completion_list) {
> > +	llist_for_each_entry_safe(cmd, t, llnode, tvc_completion_list) {
> >  		se_cmd = &cmd->tvc_se_cmd;
> >
> >  		pr_debug("%s tv_cmd %p resid %u status %#02x\n", __func__,
> > --
> > 1.9.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
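[Archive note] The difference between the plain and the "safe" iterator can be illustrated outside the kernel. The following is a minimal userspace sketch, not the vhost code: struct node, push() and sum_and_free() are hypothetical names, and the assumption is that the loop body may release the current entry, which is the situation a safe iterator is meant for. The successor pointer is read into a temporary before the entry is freed, mirroring what the extra 't' cursor does in the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for an entry on a singly linked list. */
struct node {
	struct node *next;
	int value;
};

/* Push a new node on the front of the list and return the new head. */
static struct node *push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);	/* sketch only: no real error handling */
	n->value = value;
	n->next = head;
	return n;
}

/*
 * Sum all values and free every node while walking the list.
 * The next pointer is saved in 'tmp' before 'pos' is freed, so the
 * loop never dereferences memory that has already been released.
 */
static int sum_and_free(struct node *head)
{
	struct node *pos, *tmp;
	int sum = 0;

	for (pos = head; pos != NULL; pos = tmp) {
		tmp = pos->next;	/* read the link while pos is valid */
		sum += pos->value;
		free(pos);		/* pos must not be touched after this */
	}
	return sum;
}
```

A loop written as "pos = pos->next" in the increment step would read pos->next after free(pos), which is the kind of use-after-free a non-safe iterator allows when the body frees the entry.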
Re: [PATCH RFC v3 3/6] sched/idle: Add a generic poll before enter real idle path
On Wed, 15 Nov 2017, Peter Zijlstra wrote:
> On Mon, Nov 13, 2017 at 06:06:02PM +0800, Quan Xu wrote:
> > From: Yang Zhang
> >
> > Implement a generic idle poll which resembles the functionality
> > found in arch/. Provide weak arch_cpu_idle_poll function which
> > can be overridden by the architecture code if needed.
>
> No, we want less of those magic hooks, not more.
>
> > Interrupts arrive which may not cause a reschedule in idle loops.
> > In KVM guest, this costs several VM-exit/VM-entry cycles, VM-entry
> > for interrupts and VM-exit immediately. Also this becomes more
> > expensive than bare metal. Add a generic idle poll before enter
> > real idle path. When a reschedule event is pending, we can bypass
> > the real idle path.
>
> Why not do a HV specific idle driver?

If I understand the problem correctly then he wants to avoid the heavy
lifting in tick_nohz_idle_enter() in the first place, but there is already
an interesting quirk there which makes it exit early. See commit
3c5d92a0cfb5 ("nohz: Introduce arch_needs_cpu"). The reason for this commit
looks similar. But let's not proliferate that. I'd rather see that go away.

But the irq_timings stuff is heading into the same direction, with a more
complex prediction logic which should tell you pretty well how long that
idle period is going to be and in case of an interrupt heavy workload this
would skip the extra work of stopping and restarting the tick and provide a
very good input into a polling decision.

This can be handled either in a HV specific idle driver or even in the
generic core code. If the interrupt does not arrive within the predicted
time then you can assume that the flood stopped and invoke halt or
whatever.

That avoids all of that 'tunable and tweakable' x86 specific hackery and
utilizes common functionality which is mostly there already.

Thanks,

	tglx
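[Archive note] The "poll for the predicted time, then halt" idea from the message above can be sketched as a small, self-contained simulation. This is not kernel code and does not use the irq_timings interface: poll_for_event(), the callbacks and the simulated nanosecond clock are all hypothetical names chosen for illustration. The function polls in fixed steps up to the predicted arrival time and reports whether an event showed up, so the caller would fall back to halting only when it returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Poll on a simulated clock until either an event is pending or the
 * predicted arrival time has passed. Returns true if the event arrived
 * inside the window (the caller can skip the halt), false otherwise
 * (the caller assumes the interrupt flood stopped and halts).
 */
static bool poll_for_event(uint64_t predicted_ns, uint64_t step_ns,
			   bool (*event_pending)(uint64_t now_ns))
{
	uint64_t now_ns = 0;

	assert(event_pending != NULL);
	if (step_ns == 0)
		return false;	/* no progress possible, do not spin */

	for (;;) {
		if (event_pending(now_ns))
			return true;
		/* now_ns never exceeds predicted_ns, so this cannot wrap */
		if (predicted_ns - now_ns < step_ns)
			return false;
		now_ns += step_ns;
	}
}

/* Example callback: an interrupt becomes pending 300ns into the idle period. */
static bool event_at_300ns(uint64_t now_ns)
{
	return now_ns >= 300;
}

/* Example callback: nothing ever arrives. */
static bool event_never(uint64_t now_ns)
{
	(void)now_ns;
	return false;
}
```

With a 1000ns prediction the 300ns event is caught while polling; with a 200ns prediction the window closes first and the caller would halt.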
Re: [Xen-devel] [PATCH RFC v3 0/6] x86/idle: add halt poll support
On Mon, Nov 13, 2017 at 06:05:59PM +0800, Quan Xu wrote:
> From: Yang Zhang
>
> Some latency-intensive workloads have seen an obvious performance
> drop when running inside a VM. The main reason is that the overhead
> is amplified when running inside a VM. The most cost I have seen is
> inside the idle path.

Meaning a VMEXIT b/c it is a 'halt' operation? And then going
back into the guest (VMRESUME) takes time. And hence your latency gets
all whacked b/c of this?

So if I understand - you want to use your _full_ timeslice (of the guest)
without ever (or as little as possible) going into the hypervisor?

Which means in effect you don't care about power-saving or CPUfreq
savings, you just want to eat the full CPU for a snack?

>
> This patch introduces a new mechanism to poll for a while before
> entering idle state. If schedule is needed during poll, then we
> don't need to go through the heavy overhead path.

Schedule of what? The guest or the host?
Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ
On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:
> Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature indicates the
> support of reporting hints of guest free pages to the host via
> virtio-balloon. The host requests the guest to report the free pages by
> sending commands via the virtio-balloon configuration registers.
>
> When the guest starts to report, the first element added to the free page
> vq is a sequence id of the start reporting command. The id is given by
> the host, and it indicates whether the following free pages correspond
> to the command. For example, the host may stop the report and start again
> with a new command id. The obsolete pages for the previous start command
> can be detected by the id mismatching on the host. The id is added to the
> vq using an output buffer, and the free pages are added to the vq using
> input buffer.
>
> Here are some explanations about the added configuration registers:
> - host2guest_cmd: a register used by the host to send commands to the
> guest.
> - guest2host_cmd: written by the guest to ACK to the host about the
> commands that have been received. The host will clear the corresponding
> bits on the host2guest_cmd register. The guest also uses this register
> to send commands to the host (e.g. when finish free page reporting).
> - free_page_cmd_id: the sequence id of the free page report command
> given by the host.
>
> Signed-off-by: Wei Wang
> Signed-off-by: Liang Li
> Cc: Michael S. Tsirkin
> Cc: Michal Hocko
> ---
>  drivers/virtio/virtio_balloon.c     | 234
>  include/uapi/linux/virtio_balloon.h |  11 ++
>  2 files changed, 223 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index b31fc25..4087f04 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -55,7 +55,12 @@ static struct vfsmount *balloon_mnt;
>
>  struct virtio_balloon {
>  	struct virtio_device *vdev;
> -	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
> +	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq;
> +
> +	/* Balloon's own wq for cpu-intensive work items */
> +	struct workqueue_struct *balloon_wq;
> +	/* The free page reporting work item submitted to the balloon wq */
> +	struct work_struct report_free_page_work;
>
>  	/* The balloon servicing is delegated to a freezable workqueue. */
>  	struct work_struct update_balloon_stats_work;
> @@ -65,6 +70,10 @@ struct virtio_balloon {
>  	spinlock_t stop_update_lock;
>  	bool stop_update;
>
> +	/* Stop reporting free pages */
> +	bool report_free_page_stop;
> +	uint32_t free_page_cmd_id;
> +
>  	/* Waiting for host to ack the pages we released. */
>  	wait_queue_head_t acked;
>
> @@ -191,6 +200,30 @@ static void send_balloon_page_sg(struct virtio_balloon *vb,
>  	kick_and_wait(vq, vb->acked);
>  }
>
> +static void send_free_page_sg(struct virtqueue *vq, void *addr, uint32_t size)
> +{
> +	int err = 0;
> +	unsigned int len;
> +
> +	/* Detach all the used buffers from the vq */
> +	while (virtqueue_get_buf(vq, &len))
> +		;
> +
> +	/*
> +	 * Since this is an optimization feature, losing a couple of free
> +	 * pages to report isn't important. We simply return without adding
> +	 * the page if the vq is full.
> +	 */
> +	if (vq->num_free) {
> +		err = add_one_sg(vq, addr, size);
> +		BUG_ON(err);
> +	}
> +
> +	/* Batch till the vq is full */
> +	if (!vq->num_free)
> +		virtqueue_kick(vq);
> +}
> +
>  /*
>   * Send balloon pages in sgs to host. The balloon pages are recorded in the
>   * page xbitmap. Each bit in the bitmap corresponds to a page of PAGE_SIZE.
> @@ -495,9 +528,8 @@ static void stats_handle_request(struct virtio_balloon *vb)
>  	virtqueue_kick(vq);
>  }
>
> -static void virtballoon_changed(struct virtio_device *vdev)
> +static void virtballoon_cmd_balloon_memory(struct virtio_balloon *vb)
>  {
> -	struct virtio_balloon *vb = vdev->priv;
>  	unsigned long flags;
>
>  	spin_lock_irqsave(&vb->stop_update_lock, flags);
> @@ -506,6 +538,50 @@ static void virtballoon_changed(struct virtio_device *vdev)
>  	spin_unlock_irqrestore(&vb->stop_update_lock, flags);
>  }
>
> +static void virtballoon_cmd_report_free_page_start(struct virtio_balloon *vb)
> +{
> +	unsigned long flags;
> +
> +	vb->report_free_page_stop = false;
> +	spin_lock_irqsave(&vb->stop_update_lock, flags);
> +	if (!vb->stop_update)
> +		queue_work(vb->balloon_wq, &vb->report_free_page_work);
> +	spin_unlock_irqrestore(&vb->stop_update_lock, flags);
> +}
> +
> +static void virtballoon_changed(struct virtio_device *vdev)
> +{
> +	struct virtio_balloon *vb = vdev->priv;
> +	u32 host2guest_cmd, guest2host_cmd = 0;
> +
> +
Call for papers - WorldCIST'18 - Naples, Italy - Extended deadline: November 26
* Proceedings by Springer
** Extended versions of best selected papers will be published in JCR/SCI/SSCI journals

---
WorldCist'18 - 6th World Conference on Information Systems and Technologies
Naples, Italy, 27 - 29 March 2018
http://www.worldcist.org/
---

SCOPE

The WorldCist'18 - 6th World Conference on Information Systems and Technologies (http://www.worldcist.org/), to be held at Naples, Italy, 27 - 29 March 2018, is a global forum for researchers and practitioners to present and discuss the most recent innovations, trends, results, experiences and concerns in the several perspectives of Information Systems and Technologies.

We are pleased to invite you to submit your papers to WorldCist'18. All submissions will be reviewed on the basis of relevance, originality, importance and clarity.

THEMES

Submitted papers should be related to one or more of the main themes proposed for the Conference:

A) Information and Knowledge Management (IKM);
B) Organizational Models and Information Systems (OMIS);
C) Software and Systems Modeling (SSM);
D) Software Systems, Architectures, Applications and Tools (SSAAT);
E) Multimedia Systems and Applications (MSA);
F) Computer Networks, Mobility and Pervasive Systems (CNMPS);
G) Intelligent and Decision Support Systems (IDSS);
H) Big Data Analytics and Applications (BDAA);
I) Human-Computer Interaction (HCI);
J) Ethics, Computers and Security (ECS);
K) Health Informatics (HIS);
L) Information Technologies in Education (ITE);
M) Information Technologies in Radiocommunications (ITR);
N) Technologies for Biomedical Applications (TBA).

TYPES of SUBMISSIONS and DECISIONS

Four types of papers can be submitted:

Full paper: Finished or consolidated R&D works, to be included in one of the Conference themes. These papers are assigned a 10-page limit.

Short paper: Ongoing works with relevant preliminary results, open to discussion. These papers are assigned a 7-page limit.

Poster paper: Initial work with relevant ideas, open to discussion. These papers are assigned a 4-page limit.

Company paper: Companies' papers that show practical experience, R & D, tools, etc., focused on some topics of the conference. These papers are assigned a 4-page limit.

Submitted papers must comply with the format of Advances in Intelligent Systems and Computing Series (see Instructions for Authors at Springer Website or download a DOC example), be written in English, must not have been published before, not be under review for any other conference or publication and not include any information leading to the authors' identification. Therefore, the authors' names, affiliations and bibliographic references should not be included in the version for evaluation by the Program Committee. This information should only be included in the camera-ready version, saved in Word or Latex format and also in PDF format. These files must be accompanied by the Consent to Publication form filled out, in a ZIP file, and uploaded at the conference management system.

All papers will be subjected to a double-blind review by at least two members of the Program Committee.

Based on Program Committee evaluation, a paper can be rejected or accepted by the Conference Chairs. In the latter case, it can be accepted as the type originally submitted or as another type. Thus, full papers can be accepted as short papers or poster papers only. Similarly, short papers can be accepted as poster papers only. In these cases, the authors will be allowed to maintain the original number of pages in the camera-ready version.

The authors of accepted poster papers must also build and print a poster to be exhibited during the Conference. This poster must follow an A1 or A2 vertical format. The Conference may include Work Sessions where these posters are presented and orally discussed, with a 5-minute limit per poster.

The authors of accepted full papers will have 15 minutes to present their work in a Conference Work Session; approximately 5 minutes of discussion will follow each presentation. The authors of accepted short papers and company papers will have 11 minutes to present their work in a Conference Work Session; approximately 4 minutes of discussion will follow each presentation.

PUBLICATION & INDEXING

To ensure that a full paper, short paper, poster paper or company paper is published, at least one of the authors must be fully registered by the 7th of January 2018, and the paper must comply with the suggested layout and page-limit. Additionally, all recommended changes must be addressed by the authors before they submit the camera-ready version.

No more than one paper per registration will
Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ
On Wed, Nov 15, 2017 at 11:47:58AM +0800, Wei Wang wrote:
> On 11/15/2017 05:21 AM, Michael S. Tsirkin wrote:
> > On Tue, Nov 14, 2017 at 08:02:03PM +0800, Wei Wang wrote:
> > > On 11/14/2017 01:32 AM, Michael S. Tsirkin wrote:
> > > > > - guest2host_cmd: written by the guest to ACK to the host about the
> > > > > commands that have been received. The host will clear the corresponding
> > > > > bits on the host2guest_cmd register. The guest also uses this register
> > > > > to send commands to the host (e.g. when finish free page reporting).
> > > > I am not sure what is the role of guest2host_cmd. Reporting of
> > > > the correct cmd id seems sufficient indication that guest
> > > > received the start command. Not getting any more seems sufficient
> > > > to detect stop.
> > > >
> > > I think the issue is when the host is waiting for the guest to report pages,
> > > it does not know whether the guest is going to report more or the report is
> > > done already. That's why we need a way to let the guest tell the host "the
> > > report is done, don't wait for more", then the host continues to the next
> > > step - sending the non-free pages to the destination. The following method
> > > is a conclusion of other comments, with some new thought. Please have a
> > > check if it is good.
> > config won't work well for this IMHO.
> > Writes to config register are hard to synchronize with the VQ.
> > For example, guest sends free pages, host says stop, meanwhile
> > guest sends stop for 1st set of pages.
>
> I still don't see an issue with this. Please see below:
> (before jumping into the discussion, just make sure I've well explained this
> point: now host-to-guest commands are done via config, and guest-to-host
> commands are done via the free page vq)

This is fine by me actually. But right now you have guest to host
not going through vq, going through command register instead -
this is how sending stop to host seems to happen.
If you make it go through vq then I think all will be well.

>
> Case: Host starts to request the reporting with cmd_id=1. Some time later,
> Host writes "stop" to config, meantime guest happens to finish the reporting
> and plan to actively send a "stop" command from the free_page_vq().
> Essentially, this is like a sync between two threads - if we view
> the config interrupt handler as one thread, another is the free page
> reporting worker thread.
>
> - what the config handler does is simply:
> 1.1: WRITE_ONCE(vb->reporting_stop, true);
>
> - what the reporting thread will do is
> 2.1: WRITE_ONCE(vb->reporting_stop, true);
> 2.2: send_stop_to_host_via_vq();
>
> From the guest point of view, no matter 1.1 is executed first or 2.1 first,
> it doesn't make a difference to the end result - vb->reporting_stop is set.
>
> From the host point of view, it knows that cmd_id=1 has truly stopped the
> reporting when it receives a "stop" sign via the vq.
>
>
> > How about adding a buffer with "stop" in the VQ instead?
> > Wastes a VQ entry which you will need to reserve for this
> > but is it a big deal?
>
> The free page vq is guest-to-host direction.

Yes, for guest to host stop sign.

> Using it for host-to-guest
> requests will make it bidirectional, which will result in the same issue
> described before: https://lkml.org/lkml/2017/10/11/1009 (the first response)
>
> On the other hand, I think adding another new vq for host-to-guest
> requesting doesn't make a difference in essence, compared to using config
> (same 1.1, 2.1, 2.2 above), but will be more complicated.

I agree with this. Host to guest can just increment the "free command id"
register.

>
> > > Two new configuration registers in total:
> > > - cmd_reg: the command register, combined from the previous host2guest and
> > > guest2host. I think we can use the same register for host requesting and
> > > guest ACKing, since the guest writing will trap to QEMU, that is, all the
> > > writes to the register are performed in QEMU, and we can keep things work in
> > > a correct way there.
> > > - cmd_id_reg: the sequence id of the free page report command.
> > >
> > > -- free page report:
> > > - host requests the guest to start reporting by "cmd_reg | REPORT_START";
> > > - guest ACKs to the host about receiving the start reporting request by
> > > "cmd_reg | REPORT_START", host will clear the flag bit once receiving the
> > > ACK.
> > > - host requests the guest to stop reporting by "cmd_reg | REPORT_STOP";
> > > - guest ACKs to the host about receiving the stop reporting request by
> > > "cmd_reg | REPORT_STOP", host will clear the flag once receiving the ACK.
> > > - guest tells the host about the start of the reporting by writing "cmd
> > > id" into an outbuf, which is added to the free page vq.
> > > - guest tells the host about the end of the reporting
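[Archive note] The argument in steps 1.1, 2.1 and 2.2 of this message is that setting the stop flag is idempotent, so the order in which the config handler and the reporting worker run does not change the end state. A minimal userspace sketch of that reasoning follows. It is not the driver code: struct reporter, config_handler_stop(), worker_finish() and the stops_sent counter are hypothetical, and C11 atomic_store() stands in for the kernel's WRITE_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct virtio_balloon. */
struct reporter {
	atomic_bool reporting_stop;	/* set by either path */
	uint32_t stops_sent;		/* "stop" markers queued on the vq */
};

static void reporter_init(struct reporter *r)
{
	assert(r != NULL);
	atomic_init(&r->reporting_stop, false);
	r->stops_sent = 0;
}

/* Step 1.1: the config interrupt handler only sets the flag. */
static void config_handler_stop(struct reporter *r)
{
	atomic_store(&r->reporting_stop, true);
}

/* Steps 2.1 and 2.2: the worker sets the flag, then tells the host via the vq. */
static void worker_finish(struct reporter *r)
{
	atomic_store(&r->reporting_stop, true);
	r->stops_sent++;	/* models send_stop_to_host_via_vq() */
}
```

Running the two paths in either order leaves the flag set and exactly one stop marker queued, which is the property the message relies on; only the worker path ever talks to the host.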
Re: [PATCH RFC v3 0/6] x86/idle: add halt poll support
On Mon, Nov 13, 2017 at 07:01:40PM +0800, Quan Xu wrote:
>  Documentation/sysctl/kernel.txt       | 35
>  arch/x86/include/asm/paravirt.h       |  5
>  arch/x86/include/asm/paravirt_types.h |  6
>  arch/x86/kernel/kvm.c                 | 73
>  arch/x86/kernel/paravirt.c            | 10
>  arch/x86/kernel/process.c             |  7
>  include/linux/kernel.h                |  6
>  include/linux/tick.h                  |  2
>  kernel/sched/idle.c                   |  2
>  kernel/sysctl.c                       | 34
>  kernel/time/tick-sched.c              | 11
>  kernel/time/tick-sched.h              |  3
>  12 files changed, 194 insertions(+), 0 deletions(-)

You seem to have forgotten to CC me on the actual patches, but no.

Not going to happen.
Re: [PATCH RFC v3 3/6] sched/idle: Add a generic poll before enter real idle path
On Mon, Nov 13, 2017 at 06:06:02PM +0800, Quan Xu wrote:
> From: Yang Zhang
>
> Implement a generic idle poll which resembles the functionality
> found in arch/. Provide weak arch_cpu_idle_poll function which
> can be overridden by the architecture code if needed.

No, we want less of those magic hooks, not more.

> Interrupts arrive which may not cause a reschedule in idle loops.
> In KVM guest, this costs several VM-exit/VM-entry cycles, VM-entry
> for interrupts and VM-exit immediately. Also this becomes more
> expensive than bare metal. Add a generic idle poll before enter
> real idle path. When a reschedule event is pending, we can bypass
> the real idle path.

Why not do a HV specific idle driver?