Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Leonid Yegoshin
On 01/13/2016 02:45 AM, Will Deacon wrote: I don't think the address dependency is enough on its own. By that reasoning, the following variant (WRC+addr+addr) would work too: P0: Wx = 1 P1: Rx == 1 Wy = 1 P2: Ry == 1 Rx = 0 So are you saying that this is also forbidden? Imagine that P0

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Leonid Yegoshin
On 01/13/2016 12:48 PM, Peter Zijlstra wrote: On Wed, Jan 13, 2016 at 11:02:35AM -0800, Leonid Yegoshin wrote: I ask HW team about it but I have a question - has it any relationship with replacing MIPS SYNC with lightweight SYNCs (SYNC_WMB etc)? Of course. If you cannot explain the semantics o

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Peter Zijlstra
On Wed, Jan 13, 2016 at 11:02:35AM -0800, Leonid Yegoshin wrote: > I ask HW team about it but I have a question - has it any relationship with > replacing MIPS SYNC with lightweight SYNCs (SYNC_WMB etc)? Of course. If you cannot explain the semantics of the primitives you introduce, how can we ju

[PATCH v3 4/4] x86: drop mfence in favor of lock+addl

2016-01-13 Thread Michael S. Tsirkin
mfence appears to be way slower than a locked instruction - let's use lock+add unconditionally, as we always did on old 32-bit. Just poking at SP would be the most natural, but if we then read the value from SP, we get a false dependency which will slow us down. This was noted in this article: ht

[PATCH v3 2/4] x86: drop a comment left over from X86_OOSTORE

2016-01-13 Thread Michael S. Tsirkin
The comment about wmb being non-nop to deal with non-intel CPUs is a left over from before commit 09df7c4c8097 ("x86: Remove CONFIG_X86_OOSTORE"). It makes no sense now: in particular, wmb is not a nop even for regular intel CPUs because of weird use-cases e.g. dealing with WC memory. Drop this c

[PATCH v3 1/4] x86: add cc clobber for addl

2016-01-13 Thread Michael S. Tsirkin
addl clobbers flags (such as CF) but barrier.h didn't tell this to gcc. Historically, gcc doesn't need one on x86, and always considers flags clobbered. We are probably missing the cc clobber in a *lot* of places for this reason. But even if not necessary, it's probably a good thing to add for doc
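A minimal sketch of what such a clobber list can look like in GCC extended asm, assuming a locked add of zero to the top of the stack. The names `full_barrier_sketch` and `publish_then_read` are hypothetical illustrations, not the kernel's definitions, and the non-x86 fallback is only there so the sketch builds elsewhere:

```c
#include <stdint.h>

/* Hypothetical helper, not the kernel's mb(): a full barrier built from a
 * locked add of zero. "memory" stops the compiler from reordering memory
 * accesses across the asm; "cc" documents that addl rewrites the flags. */
void full_barrier_sketch(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#else
    __sync_synchronize(); /* assumption: generic fallback off x86 */
#endif
}

/* Store a value, issue the barrier, then read the value back. */
int publish_then_read(int value)
{
    static volatile int slot;

    slot = value;
    full_barrier_sketch();
    return slot;
}
```

Adding zero leaves the stack word unchanged, so the only architectural side effects are the ordering and the flag update that the "cc" clobber declares.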

[PATCH v3 3/4] x86: tweak the comment about use of wmb for IO

2016-01-13 Thread Michael S. Tsirkin
On x86, we *do* still use the non-nop rmb/wmb for IO barriers, but even that is generally questionable. Leave them around as historical unless somebody can point to a case where they care about the performance, but tweak the comment so people don't think they are strictly required in all cases. Si

[PATCH v3 0/4] x86: faster mb()+documentation tweaks

2016-01-13 Thread Michael S. Tsirkin
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So let's use the locked variant everywhere. While I was at it, I found some inconsistencies in comments in arch/x86/include/asm/barrier.h The documentati
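A comparison of this kind could be sketched roughly as follows. This is not the benchmark used in the series; `barrier_mfence`, `barrier_lock_addl` and `time_barrier_ns` are hypothetical names, and the timing loop assumes a POSIX `clock_gettime` is available:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

typedef void (*barrier_fn)(void);

/* Full barrier using mfence (falls back to a compiler builtin off x86). */
void barrier_mfence(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("mfence" ::: "memory");
#else
    __sync_synchronize();
#endif
}

/* Full barrier using a locked add of zero to the top of the stack. */
void barrier_lock_addl(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#else
    __sync_synchronize();
#endif
}

/* Run the barrier `iterations` times and return the elapsed nanoseconds. */
int64_t time_barrier_ns(barrier_fn fn, long iterations)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        fn();
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
           + (end.tv_nsec - start.tv_nsec);
}
```

Calling `time_barrier_ns(barrier_mfence, n)` and `time_barrier_ns(barrier_lock_addl, n)` with the same `n` gives two numbers to compare. A tight loop like this measures the barrier in isolation only, so any ratio it reports may not carry over to real workloads.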

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Leonid Yegoshin
(I'll try to answer multiple mails in one.) First of all, it seems like some generic notes should be given here: 1. The generic MIPS "SYNC" (aka "SYNC 0") instruction is very heavy on some CPUs. On those CPUs it basically kills pipelines in each CPU, can do a special memory/IO bus transaction (sim

[PATCH] uapi: use __u8 from linux/types.h

2016-01-13 Thread Gleb Fotengauer-Malinovskiy
Kernel headers should use linux/types.h based definitions. Signed-off-by: Gleb Fotengauer-Malinovskiy --- include/uapi/linux/virtio_gpu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h index 7a63faa..4b04ead 1

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Leonid Yegoshin
On 01/13/2016 02:45 AM, Will Deacon wrote: On Tue, Jan 12, 2016 at 12:45:14PM -0800, Leonid Yegoshin wrote: I don't think the address dependency is enough on its own. By that reasoning, the following variant (WRC+addr+addr) would work too: P0: Wx = 1 P1: Rx == 1 Wy = 1 P2: Ry == 1 Rx = 0

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Leonid Yegoshin
On 01/12/2016 01:40 PM, Peter Zijlstra wrote: It is selectable only for MIPS R2 but not MIPS R6. The reason is - most of MIPS R2 CPUs have short pipeline and that SYNC is just waste of CPU resource, especially taking into account that "lightweight syncs" are converted to a heavy "SYNC 0" in man

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Leonid Yegoshin
On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote: On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends, smp_read_barrier_depends, smp_store_release and smp_load_acquire match the asm-generic variants exactly. Drop the local definitions and pull in asm-generic/barrier.h instead. This sta

15-day Public Review for Virtual I/O Device (VIRTIO) Version 1.0 - ends January 26th

2016-01-13 Thread Chet Ensign
OASIS members and other interested parties, The OASIS Virtual I/O Device (VIRTIO) TC members [1] have produced an updated Committee Specification Draft (CSD) and submitted this specification for 15-day public review: Virtual I/O Device (VIRTIO) Version 1.0 Committee Specification Draft 05 / Publi

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Linus Torvalds
On Wed, Jan 13, 2016 at 8:42 AM, Michael S. Tsirkin wrote: > > Oh, I think this means we need a "cc" clobber. It's probably a good idea to add one. Historically, gcc doesn't need one on x86, and always considers flags clobbered. We are probably missing the cc clobber in a *lot* of places for thi

[PATCH 0/2] vhost: cross-endian code cleanup

2016-01-13 Thread Greg Kurz
This series is a respin of the following patch: http://patchwork.ozlabs.org/patch/565921/ Patch 1 is preliminary work: it gives better names to the helpers that are involved in cross-endian support. Patch 2 is actually a v2 of the original patch. All devices now call a helper in the generic code

[PATCH 2/2] vhost: disentangle vring endianness stuff from the core code

2016-01-13 Thread Greg Kurz
The way vring endianness is being handled currently obfuscates the code in vhost_init_used(). This patch tries to fix that by doing the following: - move the code that adjusts endianness to a dedicated helper - export this helper so that backends explicitly call it No behaviour change. Sign

[PATCH 1/2] vhost: helpers to enable/disable vring endianness

2016-01-13 Thread Greg Kurz
The default use case for vhost is when the host and the vring have the same endianness (default native endianness). But there are cases where they differ and vhost should byteswap when accessing the vring: - the host is big endian and the vring comes from a virtio 1.0 device which is always littl
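The byteswap decision described here can be sketched as below. This is an illustration only, not the vhost code: `host_is_little_endian`, `swap16` and `vring16_to_cpu` are hypothetical names, and the sketch handles 16-bit fields only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Detect host byte order at run time by inspecting the first byte of 1. */
bool host_is_little_endian(void)
{
    const uint16_t probe = 1;

    return *(const uint8_t *)&probe == 1;
}

/* Swap the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Convert a 16-bit vring field to host order: pass the value through when
 * the vring and the host agree on endianness, byteswap when they differ. */
uint16_t vring16_to_cpu(bool vring_is_le, uint16_t raw)
{
    return (vring_is_le == host_is_little_endian()) ? raw : swap16(raw);
}
```

In the common case, where host and vring share the same endianness, the helper returns the value untouched, so the swap cost is only paid in the cross-endian configurations.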

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Michael S. Tsirkin
On Wed, Jan 13, 2016 at 05:53:20PM +0100, Borislav Petkov wrote: > On Wed, Jan 13, 2016 at 06:42:48PM +0200, Michael S. Tsirkin wrote: > > Oh, I think this means we need a "cc" clobber. > > Btw, does your microbenchmark do it too? Yes - I fixed it now, but it did not affect the result. We'd need

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Borislav Petkov
On Wed, Jan 13, 2016 at 06:42:48PM +0200, Michael S. Tsirkin wrote: > Oh, I think this means we need a "cc" clobber. Btw, does your microbenchmark do it too? Because, the "cc" clobber should cause additional handling of flags, depending on the context. It won't matter if the context doesn't need

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Michael S. Tsirkin
On Wed, Jan 13, 2016 at 05:33:31PM +0100, Borislav Petkov wrote: > On Wed, Jan 13, 2016 at 06:25:21PM +0200, Michael S. Tsirkin wrote: > > Which flag do you refer to, exactly? > > All the flags in rFLAGS which ADD modifies: OF,SF,ZF,AF,PF,CF Oh, I think this means we need a "cc" clobber. This al

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Borislav Petkov
On Wed, Jan 13, 2016 at 06:25:21PM +0200, Michael S. Tsirkin wrote: > Which flag do you refer to, exactly? All the flags in rFLAGS which ADD modifies: OF,SF,ZF,AF,PF,CF -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ___ Vir

[PULL] virtio: barrier rework+fixes

2016-01-13 Thread Michael S. Tsirkin
The following changes since commit afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc: Linux 4.4 (2016-01-10 15:01:32 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus for you to fetch changes up to 43e361f23c49dbddf74f56ddf6cdd8

Re: [PATCH] uapi: use __u8 from linux/types.h

2016-01-13 Thread Michael S. Tsirkin
On Wed, Jan 13, 2016 at 07:10:15PM +0300, Gleb Fotengauer-Malinovskiy wrote: > Kernel headers should use linux/types.h based definitions. > > Signed-off-by: Gleb Fotengauer-Malinovskiy Acked-by: Michael S. Tsirkin > --- > include/uapi/linux/virtio_gpu.h | 2 +- > 1 file changed, 1 insertion(+

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Michael S. Tsirkin
On Wed, Jan 13, 2016 at 05:17:04PM +0100, Borislav Petkov wrote: > On Tue, Jan 12, 2016 at 03:24:05PM -0800, Linus Torvalds wrote: > > But talking to the hw people about this is certainly a good idea regardless. > > I'm not seeing it in this thread but I might've missed it too. Anyway, > I'm being

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Michael S. Tsirkin
On Tue, Jan 12, 2016 at 01:37:38PM -0800, Linus Torvalds wrote: > On Tue, Jan 12, 2016 at 12:59 PM, Andy Lutomirski wrote: > > > > Here's an article with numbers: > > > > http://shipilev.net/blog/2014/on-the-fence-with-dependencies/ > > Well, that's with the busy loop and one set of code generati

Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016-01-13 Thread Borislav Petkov
On Tue, Jan 12, 2016 at 03:24:05PM -0800, Linus Torvalds wrote: > But talking to the hw people about this is certainly a good idea regardless. I'm not seeing it in this thread but I might've missed it too. Anyway, I'm being reminded that the ADD will change rFLAGS while MFENCE doesn't touch them.

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Will Deacon
On Tue, Jan 12, 2016 at 12:45:14PM -0800, Leonid Yegoshin wrote: > >The issue I have with the SYNC description in the text above is that it > >describes the single CPU (program order) and the dual-CPU (confusingly > >named global order) cases, but then doesn't generalise any further. That > >means