
Re: [RFC PATCH v4 0/9] KVM: selftests: some improvement and a new test for kvm page table

2021-03-11 Thread wangyanan (Y)


Hi all,

Kindly ping :)!

Are there any further comments on this v4 series? Please let me know if
there is still something that needs fixing.

Or is this v4 series fine enough to be queued? Most of the patches have
already received Reviewed-by tags. If there are merge conflicts with the
newest branch, please also let me know and I will send a new version with
them fixed.

Regards,
Yanan


[RFC PATCH v4 0/9] KVM: selftests: some improvement and a new test for kvm page table

2021-03-02 Thread Yanan Wang

Hi,
This v4 series mainly consists of two parts.
Based on kvm queue branch: 
https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=queue
Links of v1: 
https://lore.kernel.org/lkml/20210208090841.333724-1-wangyana...@huawei.com/
Links of v2: 
https://lore.kernel.org/lkml/20210225055940.18748-1-wangyana...@huawei.com/
Links of v3: 
https://lore.kernel.org/lkml/20210301065916.11484-1-wangyana...@huawei.com/

In the first part, all the known hugetlb backing src types, each specified
with a different hugepage size, are listed, so that we can request a
hugetlb source of the exact granularity that we want instead of the
system default one. And as all the known hugetlb page sizes are listed,
this is appropriate for all architectures. Besides, a helper that gets
the granularity of the different backing src types (anonymous/thp/hugetlb)
is added, so that we can use the accurate backing src granularity for
all kinds of alignment or for guest memory accesses by vcpus.
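
As an illustration of how one exact hugetlb page size can be requested,
here is a minimal sketch of the mmap() flag encoding: log2 of the page
size is placed in the bits above the huge-page shift. The function name
hugetlb_flags_for_size() and the SKETCH_ constants are hypothetical and
are not taken from the patches; the constants copy the generic Linux uapi
values (some architectures define MAP_HUGETLB differently).

```c
#include <stddef.h>

/*
 * Assumption: these mirror the generic Linux uapi values of MAP_HUGETLB
 * and MAP_HUGE_SHIFT. They are redefined under hypothetical SKETCH_ names
 * so the block builds without kernel headers.
 */
#define SKETCH_MAP_HUGETLB     0x40000u
#define SKETCH_MAP_HUGE_SHIFT  26

/* Return log2(size) for a power-of-two size, or -1 otherwise. */
static int size_to_shift(size_t size)
{
	int shift = 0;

	if (size == 0 || (size & (size - 1)) != 0)
		return -1;
	while ((size >> shift) != 1)
		shift++;
	return shift;
}

/*
 * Build the mmap() flag bits that request one specific hugetlb page size.
 * Returns 0 when the size is not a power of two.
 */
unsigned int hugetlb_flags_for_size(size_t size)
{
	int shift = size_to_shift(size);

	if (shift < 0)
		return 0;
	return SKETCH_MAP_HUGETLB |
	       ((unsigned int)shift << SKETCH_MAP_HUGE_SHIFT);
}
```

For a 2 MiB page this yields MAP_HUGETLB combined with (21 << 26), the
same value the uapi headers expose as MAP_HUGE_2MB.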

In the second part, a new test is added:
This test serves as a performance tester and a bug reproducer for the
kvm page table code (GPA->HPA mappings), and gives guidance to people
trying to improve kvm. The following explains exactly what we can do
with this test.

The function guest_code() can cover the conditions where a single vcpu or
multiple vcpus access guest pages within the same memory region, in three
VM stages (before dirty logging, during dirty logging, after dirty logging).
Besides, the backing src memory type (ANONYMOUS/THP/HUGETLB) of the tested
memory region can be specified by users, which means users can choose
whether normal page mappings or block mappings are created in the test.

If ANONYMOUS memory is specified, kvm will create normal page mappings
for the tested memory region before dirty logging, and update attributes
of the page mappings from RO to RW during dirty logging. If THP/HUGETLB
memory is specified, kvm will create block mappings for the tested memory
region before dirty logging, split the block mappings into normal page
mappings during dirty logging, and coalesce the page mappings back into
block mappings after dirty logging is stopped.

So in summary, as a performance tester, this test can present the
performance of kvm creating/updating normal page mappings, or the
performance of kvm creating/splitting/recovering block mappings,
through execution time.
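
Since the results are reported as execution time, a per-stage measurement
could look roughly like the sketch below. It uses CLOCK_MONOTONIC_RAW,
the clock named in patch 3. The helper names timespec_diff_ns() and
time_stage_ns() are hypothetical and are not the helpers of the selftest
library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* CLOCK_MONOTONIC_RAW is Linux-specific */
#endif
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from start to end; assumes end is not before start. */
int64_t timespec_diff_ns(struct timespec start, struct timespec end)
{
	return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
	       (end.tv_nsec - start.tv_nsec);
}

/* Time one call of fn(arg), e.g. one test stage, in nanoseconds. */
int64_t time_stage_ns(void (*fn)(void *), void *arg)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC_RAW, &start);
	fn(arg);
	clock_gettime(CLOCK_MONOTONIC_RAW, &end);
	return timespec_diff_ns(start, end);
}
```

A caller would wrap each stage (before, during and after dirty logging)
in time_stage_ns() and compare the three numbers between kernels.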

When we need to coalesce the page mappings back to block mappings after
dirty logging is stopped, we first have to invalidate *all* the TLB
entries for the page mappings right before installation of the block entry,
because a TLB conflict abort could occur if we don't invalidate the
TLB entries fully. We have hit this TLB conflict twice in the aarch64
software implementation and fixed it. As this test can simulate the
process of a VM with block mappings going from dirty logging enabled to
dirty logging stopped, it can also reproduce this TLB conflict abort
caused by inadequate TLB invalidation when coalescing tables.
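
The ordering requirement can be shown with a toy model. This is only a
sketch: struct toy_tlb and both functions are hypothetical, and real
hardware does not "check" anything at install time; the abort happens
when translations of different sizes for the same address are cached
together. The model simply refuses the block install while any page
translation for the range is still cached.

```c
#include <stdbool.h>

#define PAGES_PER_BLOCK 512   /* e.g. a 2 MiB block made of 4 KiB pages */

/* Toy TLB: one flag per page translation that may still be cached. */
struct toy_tlb {
	bool page_cached[PAGES_PER_BLOCK];
	bool block_cached;
};

/* Drop cached page translations in [first, first + count). */
void toy_tlb_invalidate(struct toy_tlb *tlb, int first, int count)
{
	for (int i = first; i < first + count && i < PAGES_PER_BLOCK; i++)
		tlb->page_cached[i] = false;
}

/*
 * Install the block mapping. Returns false (a modelled "conflict") if any
 * page translation for the same range is still cached.
 */
bool toy_install_block(struct toy_tlb *tlb)
{
	for (int i = 0; i < PAGES_PER_BLOCK; i++)
		if (tlb->page_cached[i])
			return false;
	tlb->block_cached = true;
	return true;
}
```

Invalidating only part of the range leaves a stale page entry behind, so
the install is refused; invalidating the whole range first lets it
succeed.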

Links about the TLB conflict abort:
https://lore.kernel.org/lkml/20201201201034.116760-3-wangyana...@huawei.com/

---

Change logs:

v3->v4:
- Add a helper to get system default hugetlb page size
- Add Reviewed-by tags from Ben to the patches

v2->v3:
- Add Suggested-by and Reviewed-by tags to the patches
- Add a generic macro to get hugetlb page sizes
- Some changes following suggestions on the v2 series

v1->v2:
- Add a patch to sync header files
- Add helpers to get granularity of different backing src types
- Some changes following suggestions on the v1 series

---

Yanan Wang (9):
  tools headers: sync headers of asm-generic/hugetlb_encode.h
  tools headers: Add a macro to get HUGETLB page sizes for mmap
  KVM: selftests: Use flag CLOCK_MONOTONIC_RAW for timing
  KVM: selftests: Make a generic helper to get vm guest mode strings
  KVM: selftests: Add a helper to get system configured THP page size
  KVM: selftests: Add a helper to get system default hugetlb page size
  KVM: selftests: List all hugetlb src types specified with page sizes
  KVM: selftests: Adapt vm_userspace_mem_region_add to new helpers
  KVM: selftests: Add a test for kvm page table code

 include/uapi/linux/mman.h                     |   2 +
 tools/include/asm-generic/hugetlb_encode.h    |   3 +
 tools/include/uapi/linux/mman.h               |   2 +
 tools/testing/selftests/kvm/Makefile          |   3 +
 .../selftests/kvm/demand_paging_test.c        |   8 +-
 .../selftests/kvm/dirty_log_perf_test.c       |  14 +-
 .../testing/selftests/kvm/include/kvm_util.h  |   4 +-
 .../testing/selftests/kvm/include/test_util.h |  21 +-
 .../selftests/kvm/kvm_page_table_test.c       | 476 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    |  59 ++-
 tools/testing/selftests/kvm/lib/test_util.c   | 122 -
 tools/testing/selftests/kvm/steal_time.c      |   4 +-
 12 files changed, 659 insertions(+), 59 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/kvm_page_table_te

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-09 Thread Sergey Senozhatsky

On (09/06/19 16:01), Peter Zijlstra wrote:
> In fact, I've gotten output that is plain impossible with
> the current junk.

Peter, can you post any of those backtraces? Very curious.

-ss

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-08 Thread Peter Zijlstra

On Fri, Sep 06, 2019 at 04:01:26PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 06, 2019 at 02:42:11PM +0200, Petr Mladek wrote:

> > 7. People would complain when continuous lines become less
> >    reliable. It might be most visible when mixing backtraces
> >    from all CPUs. Simple sorting by prefix will not make
> >    it readable. The historic way was to synchronize CPUs
> >    by a spin lock. But then the cpu_lock() could cause
> >    deadlock.
> 
> Why? I'm running with that thing on, I've never seen a deadlock ever
> because of it. In fact, I've gotten output that is plain impossible with
> the current junk.
> 
> The cpu-lock is inside the all-backtrace spinlock, not outside. And as I
> said yesterday, only the lockless console has any wait-loops while
> holding the cpu-lock. It _will_ make progress.

So I've been a huge flaming idiot.. so while I'm not particularly
sympathetic to NMIs that block, there are a number of really trivial
deadlocks possible -- and it is a minor miracle I've not actually hit
them (I suppose because printk() isn't really all that common).

The whole cpu-lock thing I had needs to go. But not having it makes
lockless console output unreadable and unusable garbage.

I've got some ideas on a replacement, but I need to further consider it.

:-/

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread John Ogness

On 2019-09-06, Peter Zijlstra  wrote:
>> I wish it was that simple. It is possible that I see it as too
>> complicated. But this comes to my mind:
>> 
>> 1. The simple printk_buffer_store(buf, n) is not NMI safe. For this,
>>    we might need the reserve-store approach.
>
> Of course it is, and sure it has a reserve+commit internally. I'm sure
> I posted an implementation of something like this at some point.
>
> It is lockless (wait-free in fact, which is stronger) and supports
> multi-readers. I'm sure I posted something like that before, and ISTR
> John has something like that around somewhere too.

Yes. It was called RFCv1[0].

> The only thing I'm omitting is doing vscnprintf() twice, first to
> determine the length, and then into the reservation. Partly because I
> think that is silly and 256 chars should be plenty for everyone,
> partly because that avoids having vscnprintf() inside the cpu_lock()
> and partly because it is simpler to not do that.

Yes, this approach is more straightforward and was suggested in the
feedback to RFCv1. Although I think the current limit (1024) should
still be OK. Then we have 1 dedicated page per CPU for vscnprintf().
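
To make the two options concrete, here is a userspace sketch using the
standard vsnprintf(), which stands in for the kernel's vscnprintf(). The
function names format_two_pass() and format_fixed() are hypothetical.
The first measures and then formats into an exactly sized buffer; the
second formats once into a fixed buffer and truncates.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Two passes: measure first, then format into an exactly sized buffer.
 * The caller frees the result. Returns NULL on failure.
 */
char *format_two_pass(const char *fmt, ...)
{
	va_list args, copy;
	char *buf = NULL;
	int len;

	va_start(args, fmt);
	va_copy(copy, args);
	len = vsnprintf(NULL, 0, fmt, copy);	/* pass 1: length only */
	va_end(copy);

	if (len >= 0)
		buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, args);	/* pass 2 */
	va_end(args);
	return buf;
}

/*
 * One pass into a fixed buffer. Like vscnprintf(), returns the number of
 * characters actually stored, not counting the terminating NUL.
 */
int format_fixed(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0 || size == 0)
		return 0;
	return (size_t)len >= size ? (int)(size - 1) : len;
}
```

The single-pass variant needs no allocation and no second walk over the
arguments, at the price of cutting off lines longer than the buffer.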

>> 2. The simple approach works only with lockless consoles. We need
>>    something else for the rest at least for NMI. Simple offloading
>>    to a kthread has been blocked for years. People wanted the
>>    trylock-and-flush-immediately approach.
>
> Have an irq_work to wake up a kthread that will print to shit
> consoles.

This is the approach in all the RFC versions.

>> 5. John planned to use the cpu_lock in the lockless consoles.
>>    I wonder if it was only in the console->write() callback
>>    or if it would spread the lock more widely.

The 8250 driver in RFCv1 uses the cpu-lock in console->write() on a
per-character basis and in console->write_atomic() on a per-line
basis. This is necessary because the 8250 driver cannot run lockless. It
requires synchronization for its UART_IER clearing/setting before/after
transmit.

IMO the existing early console implementations are _not_ safe for
preemption. This was the reason for the new write_atomic() callback in
RFCv1.

> Right, I'm saying that since you need it anyway, lift it up one layer.
> It makes everything simpler. More simpler is more better.

This was my reasoning for using the cpu-lock in RFCv1. Moving to a
lockless ringbuffer for RFCv2 was because there was too much
resistance/concern surrounding the cpu-lock. But yes, if we want to
support atomic consoles, the cpu-lock will still be needed.

The cpu-lock (and the related concerns) were discussed here[1].

>> 7. People would complain when continuous lines become less
>>    reliable. It might be most visible when mixing backtraces
>>    from all CPUs. Simple sorting by prefix will not make
>>    it readable. The historic way was to synchronize CPUs
>>    by a spin lock. But then the cpu_lock() could cause
>>    deadlock.
>
> Why? I'm running with that thing on, I've never seen a deadlock ever
> because of it.

As was discussed in the thread I just mentioned, introducing the
cpu-lock means that _all_ NMI functions taking spinlocks need to use the
cpu-lock. Even though Peter has never seen a deadlock, a deadlock is
possible if a BUG is triggered while one such spinlock is held. Also
note that it is not allowed to have 2 cpu-locks in the system. This is
where the BKL references started showing up.

Spinlocks in NMI context are rare, but they have existed in the past and
could exist again in the future. My suggestion was to create the policy
that any needed locking in NMI context must be done using the one
cpu-lock.
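
For readers who have not seen RFCv1, the following is a userspace sketch
of what a re-entrant cpu-lock of this kind could look like with C11
atomics. It is an assumption-laden illustration, not the RFCv1 code: the
names sketch_cpu_lock()/sketch_cpu_unlock() are hypothetical, and the CPU
id is passed explicitly because userspace has no notion of "this CPU".

```c
#include <stdatomic.h>

#define CPU_LOCK_FREE (-1)

/* Id of the CPU that currently owns the lock, or CPU_LOCK_FREE. */
static atomic_int cpu_lock_owner = CPU_LOCK_FREE;

/*
 * Take the lock for @cpu. Returns the previous owner value:
 * CPU_LOCK_FREE for a fresh acquisition, or @cpu itself when the same
 * CPU re-enters (for example from an interrupting context).
 */
int sketch_cpu_lock(int cpu)
{
	int expected;

	if (atomic_load(&cpu_lock_owner) == cpu)
		return cpu;			/* nested: already ours */

	for (;;) {
		expected = CPU_LOCK_FREE;
		if (atomic_compare_exchange_weak(&cpu_lock_owner,
						 &expected, cpu))
			return CPU_LOCK_FREE;
		/* another CPU holds the lock: keep spinning */
	}
}

/* Restore the owner value that the matching sketch_cpu_lock() returned. */
void sketch_cpu_unlock(int old)
{
	atomic_store(&cpu_lock_owner, old);
}
```

Because the unlock restores the saved value, a nested unlock leaves the
lock held by the outer context and only the outermost unlock releases it.
This matches the old = cpu_lock(); ... cpu_unlock(old); pattern used in
the snippets elsewhere in this thread.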

John Ogness

[0] https://lkml.kernel.org/r/20190212143003.48446-1-john.ogn...@linutronix.de
[1] https://lkml.kernel.org/r/20190227094655.ecdwhsc2bf5sp...@pathway.suse.cz

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Sergey Senozhatsky

On (09/06/19 16:01), Peter Zijlstra wrote:
> > 2. The simple approach works only with lockless consoles. We need
> >    something else for the rest at least for NMI. Simple offloading
> >    to a kthread has been blocked for years. People wanted the
> >    trylock-and-flush-immediately approach.
> 
> Have an irq_work to wake up a kthread that will print to shit consoles.

Do we need the sched dependency? We can print a batch of pending
logbuf messages and queue another irq_work if there are more
pending messages, right?
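
The batch-and-requeue idea can be sketched as below. Everything here is
hypothetical (struct flush_state, flush_batch()); it only models the
control flow: print a bounded number of records per invocation and tell
the caller whether another round has to be queued.

```c
#include <stdbool.h>
#include <stdint.h>

struct flush_state {
	uint64_t next_seq;	/* next record to print */
	uint64_t head_seq;	/* one past the newest stored record */
	uint64_t printed;	/* total records emitted so far */
};

/*
 * Print at most @batch pending records. Returns true when records remain,
 * i.e. the caller should queue another round of work.
 */
bool flush_batch(struct flush_state *st, unsigned int batch)
{
	while (batch-- && st->next_seq < st->head_seq) {
		/* a real implementation would emit record next_seq here */
		st->next_seq++;
		st->printed++;
	}
	return st->next_seq < st->head_seq;
}
```

With 10 pending records and a batch size of 4, three rounds are needed
and only the last one reports that nothing is left.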

-ss

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Peter Zijlstra

On Fri, Sep 06, 2019 at 04:01:26PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 06, 2019 at 02:42:11PM +0200, Petr Mladek wrote:
> > 7. People would complain when continuous lines become less
> >    reliable. It might be most visible when mixing backtraces
> >    from all CPUs. Simple sorting by prefix will not make
> >    it readable. The historic way was to synchronize CPUs
> >    by a spin lock. But then the cpu_lock() could cause
> >    deadlock.
> 
> Why? I'm running with that thing on, I've never seen a deadlock ever
> because of it. In fact, I've gotten output that is plain impossible with
> the current junk.
> 
> The cpu-lock is inside the all-backtrace spinlock, not outside. And as I
> said yesterday, only the lockless console has any wait-loops while
> holding the cpu-lock. It _will_ make progress.

Oooh, I think I see. So one solution would be to pass the NMI along
chain-like: send it to a single CPU at a time and, when that CPU has
finished, send it to the next.

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Peter Zijlstra

On Fri, Sep 06, 2019 at 02:42:11PM +0200, Petr Mladek wrote:

> I wish it was that simple. It is possible that I see it as too
> complicated. But this comes to my mind:
> 
> 1. The simple printk_buffer_store(buf, n) is not NMI safe. For this,
>    we might need the reserve-store approach.

Of course it is, and sure it has a reserve+commit internally. I'm sure I
posted an implementation of something like this at some point.

It is lockless (wait-free in fact, which is stronger) and supports
multi-readers. I'm sure I posted something like that before, and ISTR
John has something like that around somewhere too.
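
As a rough illustration of "reserve+commit", here is a deliberately
simplified userspace sketch with C11 atomics. It is not the posted
implementation: all names (struct log_buf, log_reserve(), log_commit(),
log_store(), log_read()) are hypothetical, slots are fixed-size, and
there is no wrap-around, which is where the real difficulty lies. A
writer claims a private slot with a single atomic add, fills it, and
then publishes it; readers only look at published slots.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define LOG_SLOTS 8
#define LOG_LINE  256

struct log_slot {
	atomic_bool committed;		/* set once text is complete */
	char text[LOG_LINE];
};

struct log_buf {
	atomic_uint next;		/* next slot index to hand out */
	struct log_slot slot[LOG_SLOTS];
};

/* Reserve: claim a private slot with one atomic add; NULL when full. */
struct log_slot *log_reserve(struct log_buf *lb)
{
	unsigned int idx = atomic_fetch_add(&lb->next, 1);

	return idx < LOG_SLOTS ? &lb->slot[idx] : NULL;
}

/* Commit: publish the slot so that readers may look at its text. */
void log_commit(struct log_slot *s)
{
	atomic_store_explicit(&s->committed, true, memory_order_release);
}

/* Store = reserve + copy + commit. Returns false if no slot is left. */
bool log_store(struct log_buf *lb, const char *buf, size_t n)
{
	struct log_slot *s = log_reserve(lb);

	if (!s)
		return false;
	if (n >= LOG_LINE)
		n = LOG_LINE - 1;
	memcpy(s->text, buf, n);
	s->text[n] = '\0';
	log_commit(s);
	return true;
}

/* Reader side: text of slot @idx, or NULL if it is not committed yet. */
const char *log_read(struct log_buf *lb, unsigned int idx)
{
	if (idx >= LOG_SLOTS)
		return NULL;
	if (!atomic_load_explicit(&lb->slot[idx].committed,
				  memory_order_acquire))
		return NULL;
	return lb->slot[idx].text;
}
```

A writer interrupted between reserve and commit does not block others;
they take the next slot, and readers skip the unfinished one until it is
committed.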

The only thing I'm omitting is doing vscnprintf() twice, first to
determine the length, and then into the reservation. Partly because I
think that is silly and 256 chars should be plenty for everyone, partly
because that avoids having vscnprintf() inside the cpu_lock() and partly
because it is simpler to not do that.

> 2. The simple approach works only with lockless consoles. We need
>    something else for the rest at least for NMI. Simple offloading
>    to a kthread has been blocked for years. People wanted the
>    trylock-and-flush-immediately approach.

Have an irq_work to wake up a kthread that will print to shit consoles.
Seriously.. the trylock and flush stuff is horrific crap. You guys been
piling on the hack for years now, surely you're tired of that gunk?

(and if you _really_ care, build a flush function that 'works'
mostly and waits for the kthread of choice to finish printing to the
'important' shit console).

> 3. console_lock works in tty as a big kernel lock. I do not know
>    much details. But people familiar with the code said that
>    it was a disaster. I assume that tty is still rather
>    important console. I am not sure how it would fit into the
>    simple approach.

The kernel thread in charge of printing doesn't care.

> 4. The console handling became non-synchronous (console_trylock)
>    quite early (ver 2.4.10, year 2001). The reason was to not
>    serialize CPUs by the speed of the console.
> 
>    Serialized output could remove many troubles. The logic in
>    console_unlock() is really crazy. It might be acceptable
>    for debugging. But is it acceptable on production systems?

The kernel thread doesn't care. If you care about independent consoles,
have a kernel thread per console. That way a fast console can print fast
while a slow console will print slow and everybody is happy.
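
The per-console thread idea boils down to each console keeping its own
read position in the shared log. A toy sketch, with hypothetical names
(struct console_cursor, console_thread_step()) and a "budget" standing in
for how fast the device can take output:

```c
#include <stdint.h>

struct console_cursor {
	uint64_t seq;		/* next record this console will print */
	unsigned int budget;	/* records it can emit per wakeup */
};

/* One wakeup of a console's printing thread; returns records emitted. */
unsigned int console_thread_step(struct console_cursor *con,
				 uint64_t head_seq)
{
	unsigned int done = 0;

	while (done < con->budget && con->seq < head_seq) {
		/* a real thread would call the console's write() here */
		con->seq++;
		done++;
	}
	return done;
}
```

Since the cursors are independent, a fast console reaches the head of
the log in one step while a slow one simply lags behind without holding
anyone else up.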

> 5. John planned to use the cpu_lock in the lockless consoles.
>    I wonder if it was only in the console->write() callback
>    or if it would spread the lock more widely.

Right, I'm saying that since you need it anyway, lift it up one layer.
It makes everything simpler. More simpler is more better.

> 6. One huge nightmare is panic() and code called from there.
>    It is a maze of hacks, including arch-specific code, to
>    prevent deadlocks and get the messages out.
> 
>    Any lock might be blocked on any CPU at the moment. Or it
>    might become blocked when CPUs are stopped by NMI.
> 
>    Fully lock-less log buffer might save us some headache.
>    I am not sure whether a single lock shared between printk()
>    writers and console drivers will make the situation easier
>    or more complicated.

So panic is a non issue for the lockless console.

It only matters if you care to get something out of the crap consoles.
So print everything to the lockless buffer and lockless consoles, then
try and force as much as you can out of the crap consoles.

If you die, tough luck, at least the lockless consoles and kdump image
have the whole message.

> 7. People would complain when continuous lines become less
>    reliable. It might be most visible when mixing backtraces
>    from all CPUs. Simple sorting by prefix will not make
>    it readable. The historic way was to synchronize CPUs
>    by a spin lock. But then the cpu_lock() could cause
>    deadlock.

Why? I'm running with that thing on, I've never seen a deadlock ever
because of it. In fact, I've gotten output that is plain impossible with
the current junk.

The cpu-lock is inside the all-backtrace spinlock, not outside. And as I
said yesterday, only the lockless console has any wait-loops while
holding the cpu-lock. It _will_ make progress.

> I would be really happy if we could ignore some of the problems
> or find an easy solution. I just want to make sure that we take
> into account all the known aspects.
> 
> I am sure that we could do better than we do now. I do not want
> to block any improvements. I am just a bit lost in the many
> black corners.

I hope the above helps. Also note that Linus' memory buffer is a
lockless console.

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Sergey Senozhatsky

On (09/06/19 12:49), Peter Zijlstra wrote:
> On Fri, Sep 06, 2019 at 07:09:43PM +0900, Sergey Senozhatsky wrote:
> 
> > ---
> > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> > index 139c310049b1..9c73eb6259ce 100644
> > --- a/kernel/printk/printk_safe.c
> > +++ b/kernel/printk/printk_safe.c
> > @@ -103,7 +103,10 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
> >  	if (atomic_cmpxchg(&s->len, len, len + add) != len)
> >  		goto again;
> >  
> > -	queue_flush_work(s);
> > +	if (early_console)
> > +		early_console->write(early_console, s->buffer + len, add);
> > +	else
> > +		queue_flush_work(s);
> >  	return add;
> >  }
> 
> You've not been following along, that generates absolutely unreadable
> garbage.

This was more of a joke/reference to "Those NMI buffers are a trainwreck
and need to die a horrible death". Of course this needs a re-entrant cpu
lock to serialize access to atomic/early consoles. But here is one more
missing thing - we need atomic/early consoles on a separate, sort of
immutable, list. And we should probably forbid any modifications of such
console drivers (PM, etc.). If we can do this then we don't need to take
console_sem while we iterate that list, which removes sched/timekeeping
locks from the fast printk() path.

At the same time, we don't have that many options on systems without
atomic/early consoles. Move printing to NMI (e.g. up to X pending logbuf
lines per NMI)? Move printing to IPI (again, up to X pending logbuf lines
per IPI)? printk() softirqs?

-ss

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Petr Mladek

On Fri 2019-09-06 11:06:27, Peter Zijlstra wrote:
> On Thu, Sep 05, 2019 at 04:31:18PM +0200, Peter Zijlstra wrote:
> > So I have something roughly like the below; I'm suggesting you add the
> > line with + on:
> > 
> >   int early_vprintk(const char *fmt, va_list args)
> >   {
> > 	char buf[256]; // teh suck!
> > 	int old, n = vscnprintf(buf, sizeof(buf), fmt, args);
> > 
> > 	old = cpu_lock();
> > +	printk_buffer_store(buf, n);
> > 	early_console->write(early_console, buf, n);
> > 	cpu_unlock(old);
> > 
> > 	return n;
> >   }
> > 
> > (yes, yes, we can get rid of the on-stack @buf thing with a
> > reserve+commit API, but who cares :-))
> 
> Another approach is something like:
> 
> DEFINE_PER_CPU(int, printk_nest);
> DEFINE_PER_CPU(char, printk_line[4][256]);
> 
> int vprintk(const char *fmt, va_list args)
> {
>   int c, n, i;
>   char *buf;
> 
>   preempt_disable();
>   i = min(3, this_cpu_inc_return(printk_nest) - 1);
>   buf = this_cpu_ptr(printk_line[i]);
>   n = vscnprintf(buf, 256, fmt, args);
> 
>   c = cpu_lock();
>   printk_buffer_store(buf, n);
>   if (early_console)
>           early_console->write(early_console, buf, n);
>   cpu_unlock(c);
> 
>   this_cpu_dec(printk_nest);
>   preempt_enable();
> 
>   return n;
> }
> 
> Again, simple and straightforward (and I'm sure it's been mentioned
> before too).
> 
> We really should not be making this stuff harder than it needs to be
> (and anybody whining about lines longer than 256 characters can just go
> away, those are unreadable anyway).

I wish it was that simple. It is possible that I see it as too
complicated. But this comes to my mind:

1. The simple printk_buffer_store(buf, n) is not NMI safe. For this,
   we might need the reserve-store approach.

2. The simple approach works only with lockless consoles. We need
   something else for the rest at least for NMI. Simple offloading
   to a kthread has been blocked for years. People wanted the
   trylock-and-flush-immediately approach.

3. console_lock works in tty as a big kernel lock. I do not know
   much details. But people familiar with the code said that
   it was a disaster. I assume that tty is still rather
   important console. I am not sure how it would fit into the
   simple approach.

4. The console handling became non-synchronous (console_trylock)
   quite early (ver 2.4.10, year 2001). The reason was to not
   serialize CPUs by the speed of the console.

   Serialized output could remove many troubles. The logic in
   console_unlock() is really crazy. It might be acceptable
   for debugging. But is it acceptable on production systems?

5. John planned to use the cpu_lock in the lockless consoles.
   I wonder if it was only in the console->write() callback
   or if it would spread the lock more widely.

6. One huge nightmare is panic() and code called from there.
   It is a maze of hacks, including arch-specific code, to
   prevent deadlocks and get the messages out.

   Any lock might be blocked on any CPU at the moment. Or it
   might become blocked when CPUs are stopped by NMI.

   Fully lock-less log buffer might save us some headache.
   I am not sure whether a single lock shared between printk()
   writers and console drivers will make the situation easier
   or more complicated.

7. People would complain when continuous lines become less
   reliable. It might be most visible when mixing backtraces
   from all CPUs. Simple sorting by prefix will not make
   it readable. The historic way was to synchronize CPUs
   by a spin lock. But then the cpu_lock() could cause
   deadlock.

I would be really happy if we could ignore some of the problems
or find an easy solution. I just want to make sure that we take
into account all the known aspects.

I am sure that we could do better than we do now. I do not want
to block any improvements. I am just a bit lost in the many
black corners.

Best Regards,
Petr

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Peter Zijlstra

On Fri, Sep 06, 2019 at 07:09:43PM +0900, Sergey Senozhatsky wrote:

> ---
> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index 139c310049b1..9c73eb6259ce 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -103,7 +103,10 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
>  	if (atomic_cmpxchg(&s->len, len, len + add) != len)
>  		goto again;
>  
> -	queue_flush_work(s);
> +	if (early_console)
> +		early_console->write(early_console, s->buffer + len, add);
> +	else
> +		queue_flush_work(s);
>  	return add;
>  }

You've not been following along, that generates absolutely unreadable
garbage.

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Sergey Senozhatsky

On (09/06/19 11:06), Peter Zijlstra wrote:
> Another approach is something like:
> 
> DEFINE_PER_CPU(int, printk_nest);
> DEFINE_PER_CPU(char, printk_line[4][256]);
>
> int vprintk(const char *fmt, va_list args)
> {
>   int c, n, i;
>   char *buf;
> 
>   preempt_disable();
>   i = min(3, this_cpu_inc_return(printk_nest) - 1);
>   buf = this_cpu_ptr(printk_line[i]);
>   n = vscnprintf(buf, 256, fmt, args);
> 
>   c = cpu_lock();
>   printk_buffer_store(buf, n);
>   if (early_console)
>           early_console->write(early_console, buf, n);
>   cpu_unlock(c);
>
>   this_cpu_dec(printk_nest);
>   preempt_enable();
> 
>   return n;
> }
> 
> Again, simple and straightforward (and I'm sure it's been mentioned
> before too).

 :)

---
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 139c310049b1..9c73eb6259ce 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -103,7 +103,10 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	if (atomic_cmpxchg(&s->len, len, len + add) != len)
 		goto again;
 
-	queue_flush_work(s);
+	if (early_console)
+		early_console->write(early_console, s->buffer + len, add);
+	else
+		queue_flush_work(s);
 	return add;
 }
---

-ss

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Petr Mladek

On Thu 2019-09-05 12:11:01, Steven Rostedt wrote:
> 
> [ Added Ted and Linux Plumbers ]
> 
> On Thu, 5 Sep 2019 17:38:21 +0200 (CEST)
> Thomas Gleixner  wrote:
> 
> > On Thu, 5 Sep 2019, Peter Zijlstra wrote:
> > > On Thu, Sep 05, 2019 at 03:05:13PM +0200, Petr Mladek wrote:  
> > > > The alternative lockless approach is still more complicated than
> > > > the serialized one. But I think that it is manageable thanks to
> > > > the simplified state tracking. And it might save us some pain
> > > > in the long term.  
> > > 
> > > I've not looked at it yet, sorry. But per the above argument of needing
> > > the CPU serialization _anyway_, I don't see a compelling reason not to
> > > use it.
> > > 
> > > It is simple, it works. Let's use it.
> > > 
> > > If you really fancy a multi-writer buffer, you can always switch to one
> > > later, if you can convince someone it actually brings benefits and not
> > > just head-aches.  
> > 
> > Can we please grab one of the TBD slots at kernel summit next week, sit
> > down in a room and hash that out?
> >
> 
> We should definitely be able to find a room that will be available next
> week.

Sounds great. I am blocked only during Livepatching miniconference
that is scheduled on Wednesday, Sep 11 at 15:00
(basically the very last slot).

Best Regards,
Petr

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-06 Thread Peter Zijlstra

On Thu, Sep 05, 2019 at 04:31:18PM +0200, Peter Zijlstra wrote:
> So I have something roughly like the below; I'm suggesting you add the
> line with + on:
> 
>   int early_vprintk(const char *fmt, va_list args)
>   {
> 	char buf[256]; // teh suck!
> 	int old, n = vscnprintf(buf, sizeof(buf), fmt, args);
> 
> 	old = cpu_lock();
> +	printk_buffer_store(buf, n);
> 	early_console->write(early_console, buf, n);
> 	cpu_unlock(old);
> 
> 	return n;
>   }
> 
> (yes, yes, we can get rid of the on-stack @buf thing with a
> reserve+commit API, but who cares :-))

Another approach is something like:

DEFINE_PER_CPU(int, printk_nest);
DEFINE_PER_CPU(char, printk_line[4][256]);

int vprintk(const char *fmt, va_list args)
{
	int c, n, i;
	char *buf;

	preempt_disable();
	i = min(3, this_cpu_inc_return(printk_nest) - 1);
	buf = this_cpu_ptr(printk_line[i]);
	n = vscnprintf(buf, 256, fmt, args);

	c = cpu_lock();
	printk_buffer_store(buf, n);
	if (early_console)
		early_console->write(early_console, buf, n);
	cpu_unlock(c);

	this_cpu_dec(printk_nest);
	preempt_enable();

	return n;
}

Again, simple and straight forward (and I'm sure it's been mentioned
before too).

We really should not be making this stuff harder than it needs to be
(and anybody whining about lines longer than 256 characters can just go
away, those are unreadable anyway).
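
The nesting-level buffer selection in the vprintk() above can be modelled
in userspace. What follows is only a sketch under assumptions: thread-local
storage stands in for per-CPU data, and line_buf_get()/line_buf_put() are
hypothetical helper names that do not appear in the thread. It shows the
index clamping from the min(3, ...) expression: the first four nesting
levels each get their own line buffer, and anything deeper shares the last
one.

```c
#include <assert.h>

#define NEST_MAX 4      /* number of line buffers, as in printk_line[4][256] */
#define LINE_LEN 256

/* Thread-local stand-ins for the per-CPU variables in the mail. */
static _Thread_local int printk_nest;
static _Thread_local char printk_line[NEST_MAX][LINE_LEN];

/* Enter one nesting level and return the line buffer for that level. */
static char *line_buf_get(void)
{
	int i = printk_nest++;

	/* Levels beyond the last buffer all share the final slot. */
	if (i > NEST_MAX - 1)
		i = NEST_MAX - 1;
	return printk_line[i];
}

/* Leave one nesting level. */
static void line_buf_put(void)
{
	assert(printk_nest > 0);
	printk_nest--;
}
```

A caller would format into the returned buffer, emit it, and then call
line_buf_put(). Note that with this clamping a fifth nested caller writes
into the same buffer as the fourth, which is the trade-off the original
snippet accepts.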

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-05 Thread John Ogness

On 2019-09-05, Steven Rostedt  wrote:
>>> But per the above argument of needing the CPU serialization
>>> _anyway_, I don't see a compelling reason not to use it.
>>> 
>>> It is simple, it works. Let's use it.
>>> 
>>> If you really fancy a multi-writer buffer, you can always switch to
>>> one later, if you can convince someone it actually brings benefits
>>> and not just head-aches.
>> 
>> Can we please grab one of the TBD slots at kernel summit next week,
>> sit down in a room and hash that out?
>>
>
> We should definitely be able to find a room that will be available
> next week.

FWIW, on Monday at 12:45 I am giving a talk[0] on the printk
rework. I'll be dedicating a few slides to presenting the lockless
multi-writer design, but will also talk about the serialized CPU
approach from RFCv1.

John Ogness

[0] https://www.linuxplumbersconf.org/event/4/contributions/290/

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-05 Thread Steven Rostedt



[ Added Ted and Linux Plumbers ]

On Thu, 5 Sep 2019 17:38:21 +0200 (CEST)
Thomas Gleixner  wrote:

> On Thu, 5 Sep 2019, Peter Zijlstra wrote:
> > On Thu, Sep 05, 2019 at 03:05:13PM +0200, Petr Mladek wrote:  
> > > The alternative lockless approach is still more complicated than
> > > the serialized one. But I think that it is manageable thanks to
> > > the simplified state tracking. And it might save us some pain
> > > in the long term.  
> > 
> > I've not looked at it yet, sorry. But per the above argument of needing
> > the CPU serialization _anyway_, I don't see a compelling reason not to
> > use it.
> > 
> > It is simple, it works. Let's use it.
> > 
> > If you really fancy a multi-writer buffer, you can always switch to one
> > later, if you can convince someone it actually brings benefits and not
> > just head-aches.  
> 
> Can we please grab one of the TBD slots at kernel summit next week, sit
> down in a room and hash that out?
>

We should definitely be able to find a room that will be available next
week.

-- Steve

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-05 Thread Thomas Gleixner

On Thu, 5 Sep 2019, Peter Zijlstra wrote:
> On Thu, Sep 05, 2019 at 03:05:13PM +0200, Petr Mladek wrote:
> > The alternative lockless approach is still more complicated than
> > the serialized one. But I think that it is manageable thanks to
> > the simplified state tracking. And it might save us some pain
> > in the long term.
> 
> I've not looked at it yet, sorry. But per the above argument of needing
> the CPU serialization _anyway_, I don't see a compelling reason not to
> use it.
> 
> It is simple, it works. Let's use it.
> 
> If you really fancy a multi-writer buffer, you can always switch to one
> later, if you can convince someone it actually brings benefits and not
> just head-aches.

Can we please grab one of the TBD slots at kernel summit next week, sit
down in a room and hash that out?

Thanks,

tglx

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-05 Thread Peter Zijlstra

On Thu, Sep 05, 2019 at 03:05:13PM +0200, Petr Mladek wrote:

> The serialized approach used a lock. It was re-entrant and thus less
> error-prone but still a lock.
> 
> The lock was planned to be used not only to access the buffer but also
> for eventual locking inside lockless consoles. It might allow us to
> have some synchronization even in lockless consoles. But it
> would be big-kernel-lock-like style. It might create yet
> another maze of problems.

I really don't see your point. All it does is limit buffer writers to a
single CPU, and does the same for the atomic/early console output.

But it must very much be a leaf lock -- that is, there must not be any
locking inside it -- and that is fine, if a console cannot do lockless
output, it simply cannot be marked as having an atomic/early console.

You've seen the force_earlyprintk patches I use [*], that stuff works
and is infinitely better than the current printk trainwreck -- and it
uses exactly such serialization -- although I only added it to make the
output actually readable. And _that_ is exactly why I propose adding it,
you need it _anyway_.

So the argument goes like:

 - synchronous output to lockless consoles (early serial) is mandatory
 - such output needs to be CPU serialized, otherwise it becomes
   unreadable garbage.
 - since we need that serialization anyway, might as well lift it up one
   layer and put it around the buffer.

Since a single-cpu buffer writer can be wait free (and relatively
simple), the only possible waiting is on the lockless console (polling
until the UART is ready for its next byte). There is nothing else. It
will make progress.
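
To make the serialization argument concrete, here is a minimal userspace
sketch of a re-entrant "CPU lock" with the same calling convention as the
cpu_lock()/cpu_unlock() pair used in the snippets in this thread. It is an
illustration under assumptions, not the code from the force_earlyprintk
patches: the CPU id is passed in explicitly (the kernel would use
smp_processor_id() with preemption disabled), and C11 atomics stand in for
the kernel's cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>

/* -1 means unowned; otherwise the id of the CPU that holds the lock. */
static atomic_int cpu_lock_owner = -1;

/*
 * Take the lock on behalf of @cpu and return the previous owner, which
 * the caller must hand back to cpu_unlock(). A nested call from the CPU
 * that already owns the lock returns immediately instead of deadlocking.
 */
static int cpu_lock(int cpu)
{
	for (;;) {
		int old = -1;

		if (atomic_compare_exchange_weak_explicit(&cpu_lock_owner,
							  &old, cpu,
							  memory_order_acquire,
							  memory_order_relaxed))
			return -1;	/* fresh acquisition */
		if (old == cpu)
			return cpu;	/* nested acquisition on the owner */
		/* Another CPU owns the lock: spin and retry. */
	}
}

/* Restore the owner that cpu_lock() returned. */
static void cpu_unlock(int old)
{
	/* @old is either "unowned" or the CPU that still holds the lock. */
	assert(old == -1 || old == atomic_load(&cpu_lock_owner));
	atomic_store_explicit(&cpu_lock_owner, old, memory_order_release);
}
```

Because the nested unlock stores the owner back rather than -1, only the
outermost cpu_unlock() actually releases the lock. Nothing may sleep or
take other locks while it is held, which is the leaf-lock requirement
described above.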

> If we remove per-CPU buffers in NMI, we would need to synchronize
> again printing backtraces from all CPUs. Otherwise they would get
> mixed and hard to read. It might be solved by some prefix and
> sorting in userspace but...

It must have cpu prefixes anyway; the multi-writer thing will equally
mix them together. This is a complete non sequitur.

That current printk stuff is just pure and utter crap. Those NMI buffers
are a trainwreck and need to die a horrible death.

> I agree that this lockless variant is really complicated. I am not
> able to prove that it is race free as it is now. I understand
> the algorithm. But there are too many synchronization points.
> 
> Peter, have you seen my alternative approach? Please see
> https://lore.kernel.org/lkml/20190704103321.10022-1-pmla...@suse.com/
> 
> It uses two tricks:
> 
>    1. Two bits in the sequence number are used to track the state
>       of the related data. This allows implementing the entire
>       life cycle of each entry using atomic operations on a single
>       variable.
> 
>    2. There is a helper function to read valid data for each entry,
>       see prb_read_desc(). It checks the state before and after
>       reading the data to make sure that they are valid. And
>       it includes the needed read barriers. As a result there
>       are only three explicit barriers in the code. All others
>       are implicitly done by cmpxchg() atomic operations.
> 
> The alternative lockless approach is still more complicated than
> the serialized one. But I think that it is manageable thanks to
> the simplified state tracking. And it might save us some pain
> in the long term.

I've not looked at it yet, sorry. But per the above argument of needing
the CPU serialization _anyway_, I don't see a compelling reason not to
use it.

It is simple, it works. Let's use it.

If you really fancy a multi-writer buffer, you can always switch to one
later, if you can convince someone it actually brings benefits and not
just head-aches.

So I have something roughly like the below; I'm suggesting you add the
line with + on:

  int early_vprintk(const char *fmt, va_list args)
  {
          char buf[256]; // teh suck!
          int old, n = vscnprintf(buf, sizeof(buf), fmt, args);

          old = cpu_lock();
+         printk_buffer_store(buf, n);
          early_console->write(early_console, buf, n);
          cpu_unlock(old);

          return n;
  }

(yes, yes, we can get rid of the on-stack @buf thing with a
reserve+commit API, but who cares :-))

[*] git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git debug/experimental

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-05 Thread Petr Mladek

On Wed 2019-09-04 14:35:31, Peter Zijlstra wrote:
> On Thu, Aug 08, 2019 at 12:32:25AM +0206, John Ogness wrote:
> > Hello,
> > 
> > This is a follow-up RFC on the work to re-implement much of
> > the core of printk. The threads for the previous RFC versions
> > are here: v1[0], v2[1], v3[2].
> > 
> > This series only builds upon v3 (i.e. the first part of this
> > series is exactly v3). The main purpose of this series is to
> > replace the current printk ringbuffer with the new
> > ringbuffer. As was discussed[3], this is a conservative
> > first step to rework printk. For example, all logbuf_lock
> > usage is kept even though the new ringbuffer does not
> > require it. This avoids any side-effect bugs in case the
> > logbuf_lock is (unintentionally) synchronizing more than
> > just the ringbuffer. However, this also means that the
> > series does not bring any improvements, just swapping out
> > implementations. A future patch will remove the logbuf_lock.
> 
> So after reading most of the first patch (and it looks _much_ better than
> previous times), I'm left wondering *why* ?!
> 
> That is, why do we need this complexity, as compared to that
> CPU serialized approach?

The serialized approach used a lock. It was re-entrant and thus less
error-prone but still a lock.

The lock was planned to be used not only to access the buffer but also
for eventual locking inside lockless consoles. It might allow us to
have some synchronization even in lockless consoles. But it
would be big-kernel-lock-like style. It might create yet
another maze of problems.

If we remove per-CPU buffers in NMI, we would need to synchronize
again printing backtraces from all CPUs. Otherwise they would get
mixed and hard to read. It might be solved by some prefix and
sorting in userspace but...

This is why I asked to see fully lockless code, to see how
much more complicated it was. John told me that he had an early
version of it around.

I agree that this lockless variant is really complicated. I am not
able to prove that it is race free as it is now. I understand
the algorithm. But there are too many synchronization points.

Peter, have you seen my alternative approach? Please see
https://lore.kernel.org/lkml/20190704103321.10022-1-pmla...@suse.com/

It uses two tricks:

   1. Two bits in the sequence number are used to track the state
      of the related data. This allows implementing the entire
      life cycle of each entry using atomic operations on a single
      variable.

   2. There is a helper function to read valid data for each entry,
      see prb_read_desc(). It checks the state before and after
      reading the data to make sure that they are valid. And
      it includes the needed read barriers. As a result there
      are only three explicit barriers in the code. All others
      are implicitly done by cmpxchg() atomic operations.
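
The two tricks can be illustrated with a small userspace sketch. This is
not the code from the linked patch: the struct layout, the four state
names and the helpers pack(), desc_transition() and desc_read() are all
hypothetical, and C11 atomics stand in for the kernel's cmpxchg() and read
barriers. It only shows how a state packed into the low two bits of the
sequence number lets one compare-and-swap drive the life cycle, and how a
reader can validate the state before and after copying the data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STATE_BITS 2u

/* Hypothetical life-cycle states kept in the low two bits. */
enum desc_state { DESC_FREE, DESC_RESERVED, DESC_COMMITTED, DESC_REUSABLE };

struct desc {
	_Atomic uint64_t seq_state;	/* (seq << 2) | state */
	char text[32];
};

static uint64_t pack(uint64_t seq, enum desc_state st)
{
	return (seq << STATE_BITS) | (uint64_t)st;
}

/* One compare-and-swap moves entry @seq from state @from to state @to. */
static bool desc_transition(struct desc *d, uint64_t seq,
			    enum desc_state from, enum desc_state to)
{
	uint64_t expect = pack(seq, from);

	return atomic_compare_exchange_strong(&d->seq_state, &expect,
					      pack(seq, to));
}

/*
 * Copy out the text of entry @seq. Succeeds only if the entry was
 * committed with that sequence number both before and after the copy.
 */
static bool desc_read(struct desc *d, uint64_t seq, char *out, size_t len)
{
	uint64_t want = pack(seq, DESC_COMMITTED);

	assert(len <= sizeof(d->text));
	if (atomic_load_explicit(&d->seq_state, memory_order_acquire) != want)
		return false;
	memcpy(out, d->text, len);
	/* Order the copy before the second state check. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&d->seq_state,
				    memory_order_relaxed) == want;
}
```

A writer would reserve an entry, fill in the text, and then commit it with
a second desc_transition(). A reader that loses a race with a recycling
writer sees the state word change and discards the copy. A real concurrent
implementation would also have to make the data copy itself safe against
a simultaneous writer, which this sketch does not attempt.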

The alternative lockless approach is still more complicated than
the serialized one. But I think that it is manageable thanks to
the simplified state tracking. And it might save us some pain
in the long term.

> In my book simpler is better here. printk() is an absolute utter slow
> path anyway, nobody cares about the performance much, and I'm thinking
> that it should be plenty fast enough as long as you don't run a
> synchronous serial output (which is exactly what I do do/require
> anyway).

I fully agree.

Best Regards,
Petr

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-04 Thread Peter Zijlstra

On Thu, Aug 08, 2019 at 12:32:25AM +0206, John Ogness wrote:
> Hello,
> 
> This is a follow-up RFC on the work to re-implement much of
> the core of printk. The threads for the previous RFC versions
> are here: v1[0], v2[1], v3[2].
> 
> This series only builds upon v3 (i.e. the first part of this
> series is exactly v3). The main purpose of this series is to
> replace the current printk ringbuffer with the new
> ringbuffer. As was discussed[3], this is a conservative
> first step to rework printk. For example, all logbuf_lock
> usage is kept even though the new ringbuffer does not
> require it. This avoids any side-effect bugs in case the
> logbuf_lock is (unintentionally) synchronizing more than
> just the ringbuffer. However, this also means that the
> series does not bring any improvements, just swapping out
> implementations. A future patch will remove the logbuf_lock.

So after reading most of the first patch (and it looks _much_ better than
previous times), I'm left wondering *why* ?!

That is, why do we need this complexity, as compared to that
CPU serialized approach?

What do we hope to gain by doing a multi-writer buffer? Yes, it is
awesome, but from where I'm sitting it is also completely silly, because
we'll want to CPU serialize the serial console anyway (otherwise it gets
to be a completely unreadable mess).

By having the whole thing CPU serialized we lose multi-writer and
consequently the buffer gets to be significantly simpler (as you know;
because ISTR you've actually done this before -- but I cannot find here
why that didn't live).

In my book simpler is better here. printk() is an absolute utter slow
path anyway, nobody cares about the performance much, and I'm thinking
that it should be plenty fast enough as long as you don't run a
synchronous serial output (which is exactly what I do do/require
anyway).

So can we have a few words to explain why we need multi-writer and all
this complexity?

[RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-08-07 Thread John Ogness

Hello,

This is a follow-up RFC on the work to re-implement much of
the core of printk. The threads for the previous RFC versions
are here: v1[0], v2[1], v3[2].

This series only builds upon v3 (i.e. the first part of this
series is exactly v3). The main purpose of this series is to
replace the current printk ringbuffer with the new
ringbuffer. As was discussed[3], this is a conservative
first step to rework printk. For example, all logbuf_lock
usage is kept even though the new ringbuffer does not
require it. This avoids any side-effect bugs in case the
logbuf_lock is (unintentionally) synchronizing more than
just the ringbuffer. However, this also means that the
series does not bring any improvements, just swapping out
implementations. A future patch will remove the logbuf_lock.

Except for the test module (patches 2 and 6), the rest may
already be interesting for mainline as is. I have tested
the various interfaces (console, /dev/kmsg, syslog,
kmsg_dump) and their features and all looks good AFAICT.

The patches can be broken down as follows:

1-2: the previously posted RFCv3

3-7: addresses minor issues from RFCv3

8:   adds new high-level ringbuffer functions to support
     printk (nothing involving new memory barriers)

9:   replace the ringbuffer usage in printk.c

One important thing to know (as is mentioned in the commit
message of patch 9), there are 2 externally visible
changes:

- vmcore info changes

- powerpc powernv/opal memdump of log discontinued

I have no idea how acceptable these changes are.

I will not be posting any further printk patches until I
have received some feedback on this. I appreciate all the
help so far. I realize that this is a lot of code to go
through.

The series is based on 5.3-rc3. I would encourage people to
apply the series and give it a run. I expect that you
will not notice any difference with your printk behaviour.

John Ogness

[0] https://lkml.kernel.org/r/20190212143003.48446-1-john.ogn...@linutronix.de
[1] https://lkml.kernel.org/r/20190607162349.18199-1-john.ogn...@linutronix.de
[2] https://lkml.kernel.org/r/2019072701.11260-1-john.ogn...@linutronix.de
[3] https://lkml.kernel.org/r/87y35hn6ih@linutronix.de

John Ogness (9):
  printk-rb: add a new printk ringbuffer implementation
  printk-rb: add test module
  printk-rb: fix missing includes/exports
  printk-rb: initialize new descriptors as invalid
  printk-rb: remove extra data buffer size allocation
  printk-rb: adjust test module ringbuffer sizes
  printk-rb: increase size of seq and size variables
  printk-rb: new functionality to support printk
  printk: use a new ringbuffer implementation

 arch/powerpc/platforms/powernv/opal.c |   22 +-
 include/linux/kmsg_dump.h             |    6 +-
 include/linux/printk.h                |   12 -
 kernel/printk/Makefile                |    5 +
 kernel/printk/dataring.c              |  809 ++
 kernel/printk/dataring.h              |  108 +++
 kernel/printk/numlist.c               |  376 +
 kernel/printk/numlist.h               |   72 ++
 kernel/printk/printk.c                |  745 +
 kernel/printk/ringbuffer.c            | 1079 +
 kernel/printk/ringbuffer.h            |  354
 kernel/printk/test_prb.c              |  256 ++
 12 files changed, 3450 insertions(+), 394 deletions(-)
 create mode 100644 kernel/printk/dataring.c
 create mode 100644 kernel/printk/dataring.h
 create mode 100644 kernel/printk/numlist.c
 create mode 100644 kernel/printk/numlist.h
 create mode 100644 kernel/printk/ringbuffer.c
 create mode 100644 kernel/printk/ringbuffer.h
 create mode 100644 kernel/printk/test_prb.c

-- 
2.20.1

Re: [RFC PATCH v4 0/9]

2015-08-06 Thread Jaehoon Chung

On 08/06/2015 04:31 PM, Shawn Lin wrote:
> On 2015/8/6 15:08, Jaehoon Chung wrote:
>> Hi, Shawn.
>>
>> I remembered that Krzysztof has mentioned "Fix the title of cover letter."
>> Your cover letter's title is nothing.. "[RFC PATCH v4 0/9] " ??
>> [RFC PATCH v4 0/9] your title...
>  Sorry, I forgot it, and will fix in next version...

No problem :) 
Next time, please add the title to your cover letter.

Best Regards,
Jaehoon Chung

> 
>> Best Regards,
>> Jaehoon Chung
>>
>> On 08/06/2015 03:44 PM, Shawn Lin wrote:
>>> Add external dma support for Synopsys MSHC
>>>
>>> Synopsys DesignWare mobile storage host controller supports three
>>> types of transfer mode: pio, internal dma and external dma. However,
>>> dw_mmc can only support pio and internal dma now. Thus some platforms
>>> using dw-mshc integrated with generic dma can't work in dma mode. So we
>>> submit this patch to achieve it.
>>>
>>> And the config option, CONFIG_MMC_DW_IDMAC, was added by Will Newton
>>> (commit:f95f3850) for the first version of dw_mmc and has never been touched
>>> since then. At that time dt-bindings hadn't been introduced into dw_mmc yet,
>>> which meant we had to select CONFIG_MMC_DW_IDMAC to enable internal dma mode
>>> at compile time. Nowadays, device-tree helps us to support a variety of boards
>>> with one kernel. That's why we need to remove it and decide the transfer mode
>>> by reading dw_mmc's HCON reg at runtime.
>>>
>>> This RFC patch needs lots of ACKs. I know it's hard, but it does need someone
>>> to make the running.
>>>
>>> Patch does the following things:
>>> - remove CONFIG_MMC_DW_IDMAC config option
>>> - add bindings for edmac used by synopsys-dw-mshc
>>>at runtime
>>> - add edmac support for synopsys-dw-mshc
>>>
>>> Patch is based on next of git://git.linaro.org/people/ulf.hansson/mmc
>>>
>>>
>>> Changes in v4:
>>> - remove "host->trans_mode" and use "host->use_dma" to indicate
>>>transfer mode.
>>> - remove all dt-bindings' changes since we don't need new properties.
>>> - check transfer mode at runtime by reading HCON reg
>>> - split defconfig changes for each sub-architecture
>>> - fix the title of cover letter
>>> - reuse some code for reducing code size
>>>
>>> Changes in v3:
>>> - choose transfer mode at runtime
>>> - remove all CONFIG_MMC_DW_IDMAC config option
>>> - add supports-idmac property for some platforms
>>>
>>> Changes in v2:
>>> - Fix typo of dev_info msg
>>> - remove unused dmach from declaration of dw_mci_dma_slave
>>>
>>> Shawn Lin (9):
>>>mmc: dw_mmc: Add external dma interface support
>>>Documentation: synopsys-dw-mshc: add bindings for idmac and edmac
>>>mips: pistachio_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>arc: axs10x_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>arm: exynos_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>arm: hisi_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>arm: lpc18xx_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>arm: multi_v7_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>arm: zx_defconfig: remove CONFIG_MMC_DW_IDMAC
>>>
>>>   .../devicetree/bindings/mmc/synopsys-dw-mshc.txt   |  25 ++
>>>   arch/arc/configs/axs101_defconfig  |   1 -
>>>   arch/arc/configs/axs103_defconfig  |   1 -
>>>   arch/arc/configs/axs103_smp_defconfig  |   1 -
>>>   arch/arm/configs/exynos_defconfig  |   1 -
>>>   arch/arm/configs/hisi_defconfig|   1 -
>>>   arch/arm/configs/lpc18xx_defconfig |   1 -
>>>   arch/arm/configs/multi_v7_defconfig|   1 -
>>>   arch/arm/configs/zx_defconfig  |   1 -
>>>   arch/mips/configs/pistachio_defconfig  |   1 -
>>>   drivers/mmc/host/Kconfig   |  11 +-
>>>   drivers/mmc/host/dw_mmc-pltfm.c|   2 +
>>>   drivers/mmc/host/dw_mmc.c  | 258 +
>>>   include/linux/mmc/dw_mmc.h |  27 ++-
>>>   14 files changed, 257 insertions(+), 75 deletions(-)
>>>
>>
>>
>>
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH v4 0/9]

2015-08-06 Thread Shawn Lin


On 2015/8/6 15:08, Jaehoon Chung wrote:
> Hi, Shawn.
>
> I remembered that Krzysztof has mentioned "Fix the title of cover letter."
> Your cover letter's title is nothing.. "[RFC PATCH v4 0/9] " ??
> [RFC PATCH v4 0/9] your title...

Sorry, I forgot it, and will fix in next version...

> Best Regards,
> Jaehoon Chung
>
> On 08/06/2015 03:44 PM, Shawn Lin wrote:
>> Add external dma support for Synopsys MSHC
>>
>> Synopsys DesignWare mobile storage host controller supports three
>> types of transfer mode: pio, internal dma and external dma. However,
>> dw_mmc can only support pio and internal dma now. Thus some platforms
>> using dw-mshc integrated with generic dma can't work in dma mode. So we
>> submit this patch to achieve it.
>>
>> And the config option, CONFIG_MMC_DW_IDMAC, was added by Will Newton
>> (commit:f95f3850) for the first version of dw_mmc and has never been touched
>> since then. At that time dt-bindings hadn't been introduced into dw_mmc yet,
>> which meant we had to select CONFIG_MMC_DW_IDMAC to enable internal dma mode
>> at compile time. Nowadays, device-tree helps us to support a variety of boards
>> with one kernel. That's why we need to remove it and decide the transfer mode
>> by reading dw_mmc's HCON reg at runtime.
>>
>> This RFC patch needs lots of ACKs. I know it's hard, but it does need someone
>> to make the running.
>>
>> Patch does the following things:
>> - remove CONFIG_MMC_DW_IDMAC config option
>> - add bindings for edmac used by synopsys-dw-mshc
>>   at runtime
>> - add edmac support for synopsys-dw-mshc
>>
>> Patch is based on next of git://git.linaro.org/people/ulf.hansson/mmc
>>
>>
>> Changes in v4:
>> - remove "host->trans_mode" and use "host->use_dma" to indicate
>>   transfer mode.
>> - remove all dt-bindings' changes since we don't need new properties.
>> - check transfer mode at runtime by reading HCON reg
>> - split defconfig changes for each sub-architecture
>> - fix the title of cover letter
>> - reuse some code for reducing code size
>>
>> Changes in v3:
>> - choose transfer mode at runtime
>> - remove all CONFIG_MMC_DW_IDMAC config option
>> - add supports-idmac property for some platforms
>>
>> Changes in v2:
>> - Fix typo of dev_info msg
>> - remove unused dmach from declaration of dw_mci_dma_slave
>>
>> Shawn Lin (9):
>>   mmc: dw_mmc: Add external dma interface support
>>   Documentation: synopsys-dw-mshc: add bindings for idmac and edmac
>>   mips: pistachio_defconfig: remove CONFIG_MMC_DW_IDMAC
>>   arc: axs10x_defconfig: remove CONFIG_MMC_DW_IDMAC
>>   arm: exynos_defconfig: remove CONFIG_MMC_DW_IDMAC
>>   arm: hisi_defconfig: remove CONFIG_MMC_DW_IDMAC
>>   arm: lpc18xx_defconfig: remove CONFIG_MMC_DW_IDMAC
>>   arm: multi_v7_defconfig: remove CONFIG_MMC_DW_IDMAC
>>   arm: zx_defconfig: remove CONFIG_MMC_DW_IDMAC
>>
>>   .../devicetree/bindings/mmc/synopsys-dw-mshc.txt   |  25 ++
>>   arch/arc/configs/axs101_defconfig  |   1 -
>>   arch/arc/configs/axs103_defconfig  |   1 -
>>   arch/arc/configs/axs103_smp_defconfig  |   1 -
>>   arch/arm/configs/exynos_defconfig  |   1 -
>>   arch/arm/configs/hisi_defconfig|   1 -
>>   arch/arm/configs/lpc18xx_defconfig |   1 -
>>   arch/arm/configs/multi_v7_defconfig|   1 -
>>   arch/arm/configs/zx_defconfig  |   1 -
>>   arch/mips/configs/pistachio_defconfig  |   1 -
>>   drivers/mmc/host/Kconfig   |  11 +-
>>   drivers/mmc/host/dw_mmc-pltfm.c|   2 +
>>   drivers/mmc/host/dw_mmc.c  | 258 +
>>   include/linux/mmc/dw_mmc.h |  27 ++-
>>   14 files changed, 257 insertions(+), 75 deletions(-)

--
Shawn Lin


Re: [RFC PATCH v4 0/9]

2015-08-06 Thread Jaehoon Chung

Hi, Shawn.

I remembered that Krzysztof has mentioned "Fix the title of cover letter."
Your cover letter's title is nothing.. "[RFC PATCH v4 0/9] " ??
[RFC PATCH v4 0/9] your title...

Best Regards,
Jaehoon Chung

On 08/06/2015 03:44 PM, Shawn Lin wrote:
> Add external dma support for Synopsys MSHC
> 
> Synopsys DesignWare mobile storage host controller supports three
> types of transfer mode: pio, internal dma and external dma. However,
> dw_mmc can only support pio and internal dma now. Thus some platforms
> using dw-mshc integrated with generic dma can't work in dma mode. So we
> submit this patch to achieve it.
> 
> And the config option, CONFIG_MMC_DW_IDMAC, was added by Will Newton
> (commit:f95f3850) for the first version of dw_mmc and has never been touched
> since then. At that time dt-bindings hadn't been introduced into dw_mmc yet,
> which meant we had to select CONFIG_MMC_DW_IDMAC to enable internal dma mode
> at compile time. Nowadays, device-tree helps us to support a variety of boards
> with one kernel. That's why we need to remove it and decide the transfer mode
> by reading dw_mmc's HCON reg at runtime.
> 
> This RFC patch needs lots of ACKs. I know it's hard, but it does need someone
> to make the running.
> 
> Patch does the following things:
> - remove CONFIG_MMC_DW_IDMAC config option
> - add bindings for edmac used by synopsys-dw-mshc
>   at runtime
> - add edmac support for synopsys-dw-mshc
> 
> Patch is based on next of git://git.linaro.org/people/ulf.hansson/mmc
> 
> 
> Changes in v4:
> - remove "host->trans_mode" and use "host->use_dma" to indicate
>   transfer mode.
> - remove all dt-bindings' changes since we don't need new properties.
> - check transfer mode at runtime by reading HCON reg
> - split defconfig changes for each sub-architecture
> - fix the title of cover letter
> - reuse some code for reducing code size
> 
> Changes in v3:
> - choose transfer mode at runtime
> - remove all CONFIG_MMC_DW_IDMAC config option
> - add supports-idmac property for some platforms
> 
> Changes in v2:
> - Fix typo of dev_info msg
> - remove unused dmach from declaration of dw_mci_dma_slave
> 
> Shawn Lin (9):
>   mmc: dw_mmc: Add external dma interface support
>   Documentation: synopsys-dw-mshc: add bindings for idmac and edmac
>   mips: pistachio_defconfig: remove CONFIG_MMC_DW_IDMAC
>   arc: axs10x_defconfig: remove CONFIG_MMC_DW_IDMAC
>   arm: exynos_defconfig: remove CONFIG_MMC_DW_IDMAC
>   arm: hisi_defconfig: remove CONFIG_MMC_DW_IDMAC
>   arm: lpc18xx_defconfig: remove CONFIG_MMC_DW_IDMAC
>   arm: multi_v7_defconfig: remove CONFIG_MMC_DW_IDMAC
>   arm: zx_defconfig: remove CONFIG_MMC_DW_IDMAC
> 
>  .../devicetree/bindings/mmc/synopsys-dw-mshc.txt   |  25 ++
>  arch/arc/configs/axs101_defconfig  |   1 -
>  arch/arc/configs/axs103_defconfig  |   1 -
>  arch/arc/configs/axs103_smp_defconfig  |   1 -
>  arch/arm/configs/exynos_defconfig  |   1 -
>  arch/arm/configs/hisi_defconfig|   1 -
>  arch/arm/configs/lpc18xx_defconfig |   1 -
>  arch/arm/configs/multi_v7_defconfig|   1 -
>  arch/arm/configs/zx_defconfig  |   1 -
>  arch/mips/configs/pistachio_defconfig  |   1 -
>  drivers/mmc/host/Kconfig   |  11 +-
>  drivers/mmc/host/dw_mmc-pltfm.c|   2 +
>  drivers/mmc/host/dw_mmc.c  | 258 +
>  include/linux/mmc/dw_mmc.h |  27 ++-
>  14 files changed, 257 insertions(+), 75 deletions(-)
> 

[RFC PATCH v4 0/9]

2015-08-05 Thread Shawn Lin

Add external dma support for Synopsys MSHC

Synopsys DesignWare mobile storage host controller supports three
types of transfer mode: pio, internal dma and external dma. However,
dw_mmc can only support pio and internal dma now. Thus some platforms
using dw-mshc integrated with generic dma can't work in dma mode. So we
submit this patch to achieve it.

And the config option, CONFIG_MMC_DW_IDMAC, was added by Will Newton
(commit:f95f3850) for the first version of dw_mmc and has never been touched
since then. At that time dt-bindings hadn't been introduced into dw_mmc yet,
which meant we had to select CONFIG_MMC_DW_IDMAC to enable internal dma mode
at compile time. Nowadays, device-tree helps us to support a variety of boards
with one kernel. That's why we need to remove it and decide the transfer mode
by reading dw_mmc's HCON reg at runtime.
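
As a rough illustration of what deciding the transfer mode from HCON at
runtime could look like, here is a standalone sketch. The field position
(bits 17:16), the value encoding, and the names HCON_DMA_INTERFACE(),
enum trans_mode and pick_trans_mode() are assumptions made for this
example rather than something taken from the patches, so they should be
checked against the controller databook and the actual series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: a 2-bit DMA interface field at bits 17:16 of HCON. */
#define HCON_DMA_INTERFACE(hcon)	(((hcon) >> 16) & 0x3u)

/* Hypothetical transfer modes the driver could settle on. */
enum trans_mode { TRANS_MODE_PIO, TRANS_MODE_IDMAC, TRANS_MODE_EDMAC };

/* Map the hardware configuration word to a transfer mode. */
static enum trans_mode pick_trans_mode(uint32_t hcon)
{
	switch (HCON_DMA_INTERFACE(hcon)) {
	case 0x0:			/* assumed: internal DMA controller */
		return TRANS_MODE_IDMAC;
	case 0x1:			/* assumed: DesignWare DMA */
	case 0x2:			/* assumed: generic external DMA */
		return TRANS_MODE_EDMAC;
	case 0x3:			/* assumed: no DMA interface */
		return TRANS_MODE_PIO;
	}
	/* The field is masked to two bits, so this point is never reached. */
	assert(!"unreachable");
	return TRANS_MODE_PIO;
}
```

With something along these lines, one kernel image can serve boards whose
controllers were synthesized with different DMA interfaces, which is the
reason given above for dropping the compile-time option.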

This RFC patch needs lots of ACKs. I know it's hard, but it does need someone
to make the running.

Patch does the following things:
- remove CONFIG_MMC_DW_IDMAC config option
- add bindings for edmac used by synopsys-dw-mshc
  at runtime
- add edmac support for synopsys-dw-mshc

Patch is based on next of git://git.linaro.org/people/ulf.hansson/mmc


Changes in v4:
- remove "host->trans_mode" and use "host->use_dma" to indicate
  transfer mode.
- remove all dt-bindings' changes since we don't need new properties.
- check transfer mode at runtime by reading HCON reg
- split defconfig changes for each sub-architecture
- fix the title of cover letter
- reuse some code for reducing code size

Changes in v3:
- choose transfer mode at runtime
- remove all CONFIG_MMC_DW_IDMAC config option
- add supports-idmac property for some platforms

Changes in v2:
- Fix typo of dev_info msg
- remove unused dmach from declaration of dw_mci_dma_slave

Shawn Lin (9):
  mmc: dw_mmc: Add external dma interface support
  Documentation: synopsys-dw-mshc: add bindings for idmac and edmac
  mips: pistachio_defconfig: remove CONFIG_MMC_DW_IDMAC
  arc: axs10x_defconfig: remove CONFIG_MMC_DW_IDMAC
  arm: exynos_defconfig: remove CONFIG_MMC_DW_IDMAC
  arm: hisi_defconfig: remove CONFIG_MMC_DW_IDMAC
  arm: lpc18xx_defconfig: remove CONFIG_MMC_DW_IDMAC
  arm: multi_v7_defconfig: remove CONFIG_MMC_DW_IDMAC
  arm: zx_defconfig: remove CONFIG_MMC_DW_IDMAC

 .../devicetree/bindings/mmc/synopsys-dw-mshc.txt |  25 ++
 arch/arc/configs/axs101_defconfig                |   1 -
 arch/arc/configs/axs103_defconfig                |   1 -
 arch/arc/configs/axs103_smp_defconfig            |   1 -
 arch/arm/configs/exynos_defconfig                |   1 -
 arch/arm/configs/hisi_defconfig                  |   1 -
 arch/arm/configs/lpc18xx_defconfig               |   1 -
 arch/arm/configs/multi_v7_defconfig              |   1 -
 arch/arm/configs/zx_defconfig                    |   1 -
 arch/mips/configs/pistachio_defconfig            |   1 -
 drivers/mmc/host/Kconfig                         |  11 +-
 drivers/mmc/host/dw_mmc-pltfm.c                  |   2 +
 drivers/mmc/host/dw_mmc.c                        | 258 +
 include/linux/mmc/dw_mmc.h                       |  27 ++-
 14 files changed, 257 insertions(+), 75 deletions(-)

-- 
2.3.7


[RFC PATCH v4 0/9] CPU hotplug: stop_machine()-free CPU hotplug

2012-12-11 Thread Srivatsa S. Bhat

Hi,

This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
offline path and provides an alternative (set of APIs) to preempt_disable() to
prevent CPUs from going offline, which can be invoked from atomic context.

This is an RFC patchset with only a few call-sites of preempt_disable()
converted to the new APIs for now. The main goal is to get feedback on the
design of the new atomic APIs and to see whether they can serve as a viable
basis for stop_machine()-free CPU hotplug. A brief description of the
algorithm is available in the "Changes in vN" sections.

Overview of the patches:
------------------------

Patch 1 introduces the new APIs that can be used from atomic context, to
prevent CPUs from going offline.

Patch 2 is a cleanup; it converts preprocessor macros to static inline
functions.

Patches 3 to 8 convert various call-sites to use the new APIs.

Patch 9 is the one which actually removes stop_machine() from the CPU
offline path.

Changes in v4:
--------------
  The synchronization scheme has been simplified quite a bit, which makes it
  look a lot less complex than before. Some highlights:

* Implicit ACKs:

  The earlier design required the readers to explicitly ACK the writer's
  signal. The new design uses implicit ACKs instead. The reader switching
  over to rwlock implicitly tells the writer to stop waiting for that reader.

* No atomic operations:

  Since we got rid of explicit ACKs, we no longer have the need for a reader
  and a writer to update the same counter. So we can get rid of atomic ops
  too.
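
The implicit-ACK idea above can be sketched in plain C. This is only a
sequential model of one reading of the description, not the code from the
patches: struct reader_state, reader_switch_to_rwlock() and
writer_done_waiting() are hypothetical names, and a real implementation
would also need memory barriers, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* illustrative value, not taken from the patches */

/*
 * Hypothetical per-CPU reader state. Only the owning CPU writes its own
 * slot and the writer only reads it, so no counter is updated by both
 * sides and plain (non-atomic) stores are enough in this model.
 */
struct reader_state {
    int  refcnt;     /* nesting depth of fast-path read-side sections */
    bool on_rwlock;  /* reader has switched over to the global rwlock */
};

static struct reader_state readers[NR_CPUS];

/* Reader side: switching to the rwlock is itself the acknowledgement. */
static void reader_switch_to_rwlock(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    readers[cpu].on_rwlock = true;
}

/* Writer side: stop waiting for a CPU that is idle or has switched. */
static bool writer_done_waiting_for(int cpu)
{
    return readers[cpu].refcnt == 0 || readers[cpu].on_rwlock;
}

/* Writer side: true once no CPU needs to be waited for any longer. */
static bool writer_done_waiting(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (!writer_done_waiting_for(cpu))
            return false;
    return true;
}
```

With a reader active on CPU 1 (readers[1].refcnt = 1), writer_done_waiting()
returns false until reader_switch_to_rwlock(1) is called. No explicit ACK
counter shared between reader and writer is involved.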

Changes in v3:
--------------
* Dropped the _light() and _full() variants of the APIs. Provided a single
  interface: get/put_online_cpus_atomic().

* Completely redesigned the synchronization mechanism again, to make it
  fast and scalable at the reader-side in the fast-path (when no hotplug
  writers are active). This new scheme also ensures that there is no
  possibility of deadlocks due to circular locking dependency.
  In summary, this provides the scalability and speed of per-cpu rwlocks
  (without actually using them), while avoiding the downside (deadlock
  possibilities) which is inherent in any per-cpu locking scheme that is
  meant to compete with preempt_disable()/enable() in terms of flexibility.

  The problem with using per-cpu locking to replace
  preempt_disable()/enable() was explained here:
  https://lkml.org/lkml/2012/12/6/290

  Basically we use per-cpu counters (for scalability) when no writers are
  active, and then switch to global rwlocks (for lock-safety) when a writer
  becomes active. It is a slightly complex scheme, but it is based on
  standard principles of distributed algorithms.
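
As a rough illustration of that last paragraph, here is a minimal
sequential model in plain C. It is a sketch under assumptions, not the
implementation in the patches: reader_enter(), reader_exit(),
writer_can_proceed(), writer_signal and rwlock_readers are hypothetical
names, the global rwlock is reduced to a simple reader count, and all
memory-ordering concerns are ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* illustrative value, not taken from the patches */

enum path { FAST_PATH, SLOW_PATH };

static int  reader_refcnt[NR_CPUS]; /* per-CPU counters (fast path) */
static bool writer_signal;          /* set once a hotplug writer is active */
static int  rwlock_readers;         /* stand-in for the global rwlock */

/* Reader entry: per-CPU counter if no writer is active, else the rwlock. */
static enum path reader_enter(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (!writer_signal) {
        reader_refcnt[cpu]++;
        return FAST_PATH;
    }
    rwlock_readers++;
    return SLOW_PATH;
}

/* Reader exit: undo whichever path reader_enter() took. */
static void reader_exit(int cpu, enum path p)
{
    if (p == FAST_PATH) {
        assert(reader_refcnt[cpu] > 0);
        reader_refcnt[cpu]--;
    } else {
        assert(rwlock_readers > 0);
        rwlock_readers--;
    }
}

/* Writer: may proceed only once every kind of reader has drained. */
static bool writer_can_proceed(void)
{
    if (!writer_signal)
        return false;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (reader_refcnt[cpu] != 0)
            return false;
    return rwlock_readers == 0;
}
```

A reader that enters before writer_signal is set stays on its per-CPU
counter, while one that enters afterwards lands on the global lock. The
writer proceeds only after both have exited.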

Changes in v2:
--------------
* Completely redesigned the synchronization scheme to avoid using any extra
  cpumasks.

* Provided APIs for 2 types of atomic hotplug readers: "light" (for
  light-weight) and "full". We wish to have more "light" readers than
  the "full" ones, to avoid indirectly inducing the "stop_machine effect"
  without even actually using stop_machine().

  And the patches show that it _is_ generally true: 5 patches deal with
  "light" readers, whereas only 1 patch deals with a "full" reader.

  Also, the "light" readers happen to be in very hot paths. So it makes a
  lot of sense to have such a distinction and a corresponding light-weight
  API.

Links to previous versions:
v3: https://lkml.org/lkml/2012/12/7/287
v2: https://lkml.org/lkml/2012/12/5/322
v1: https://lkml.org/lkml/2012/12/4/88

Comments and suggestions welcome!

--
Paul E. McKenney (1):
  cpu: No more __stop_machine() in _cpu_down()

Srivatsa S. Bhat (8):
  CPU hotplug: Provide APIs to prevent CPU offline from atomic context
  CPU hotplug: Convert preprocessor macros to static inline functions
  smp, cpu hotplug: Fix smp_call_function_*() to prevent CPU offline properly
  smp, cpu hotplug: Fix on_each_cpu_*() to prevent CPU offline properly
  sched, cpu hotplug: Use stable online cpus in try_to_wake_up() & select_task_rq()
  kick_process(), cpu-hotplug: Prevent offlining of target CPU properly
  yield_to(), cpu-hotplug: Prevent offlining of other CPUs properly
  kvm, vmx: Add atomic synchronization with CPU Hotplug


 arch/x86/kvm/vmx.c  |    8 +-
 include/linux/cpu.h |    8 +-
 kernel/cpu.c        |  206 ++-
 kernel/sched/core.c |   22 +
 kernel/smp.c        |   63 ++--
 5 files changed, 273 insertions(+), 34 deletions(-)



Thanks,
Srivatsa S. Bhat
IBM Linux Technology Center


