Re: Problem with atomic accesses in pstore on some ARM CPUs

2016-08-16 Thread Colin Cross
On Mon, Aug 15, 2016 at 3:15 PM, Mark Rutland  wrote:
> On Tue, Aug 16, 2016 at 08:02:53AM -0700, Guenter Roeck wrote:
>> On Tue, Aug 16, 2016 at 6:21 AM, Will Deacon  wrote:
>> > On Tue, Aug 16, 2016 at 06:14:53AM -0700, Guenter Roeck wrote:
>> >> On Tue, Aug 16, 2016 at 3:32 AM, Robin Murphy  
>> >> wrote:
>> >> > On 16/08/16 00:19, Guenter Roeck wrote:
>> >> >> we are having a problem with atomic accesses in pstore on some ARM
>> >> >> CPUs (specifically rk3288 and rk3399). With those chips, atomic
>> >> >> accesses fail with both pgprot_noncached and pgprot_writecombine
>> >> >> memory. Atomic accesses do work when selecting PAGE_KERNEL protection.
>> >> >
>> >> > What's the pstore backed by? I'm guessing it's not normal DRAM.
>> >> >
>> >>
>> >> it is normal DRAM.
>> >
>> > In which case, why does it need to be mapped with weird attributes?
>> > Is there an alias in the linear map you can use?
>> >
>>
>> I don't really _want_ to do anything besides using pstore as-is, or,
>> in other words, to have the upstream kernel work with the affected
>> systems.
>>
>> The current pstore code runs the following code for memory with
>> pfn_valid() = true.
>>
>> if (memtype)
>> prot = pgprot_noncached(PAGE_KERNEL);
>> else
>> prot = pgprot_writecombine(PAGE_KERNEL);
>> ...
>> vaddr = vmap(pages, page_count, VM_MAP, prot);
>>
>> It then uses the memory pointed to by vaddr for atomic operations.
>
> This means that the generic ramoops / pstore code is making non-portable
> assumptions about memory types.
>
> So _something_ has to happen to that code.
>
>> In my case, both protection options don't work. Everything works fine
>> (or at least doesn't create an exception) if I use
>> vaddr = vmap(pages, page_count, VM_MAP, PAGE_KERNEL);
>> instead.
>
> Architecturally, that will give you a memory type to which we can safely use
> atomics on.
>
> It would be nice to know why the ramoops/pstore code must use atomics, and
> exactly what it's trying to achieve (i.e. is this just for serialisation, or 
> an
> attempt to ensure persistence).
>
> Depending on that, it may be possible to fix things more generically by using
> memremap by default, for example, and only allowing uncached mappings on those
> architectures which support them.

persistent_ram uses atomic ops in uncached memory to store the start
and end positions in the ringbuffer so that the state of the
ringbuffer will be valid if the kernel crashes at any time.  This was
inherited from Android's ram_console implementation, and worked
through armv7.  It has been causing more and more problems recently,
see for example 027bc8b08242c59e19356b4b2c189f2d849ab660 (pstore-ram:
Allow optional mapping with pgprot_noncached) and
7ae9cb81933515dc7db1aa3c47ef7653717e3090 (pstore-ram: Fix hangs by
using write-combine mappings).

Maybe it should be replaced with a spinlock in normal ram protecting
writes to the uncached region.


Re: Problem with atomic accesses in pstore on some ARM CPUs

2016-08-16 Thread Colin Cross
On Mon, Aug 15, 2016 at 3:15 PM, Mark Rutland  wrote:
> On Tue, Aug 16, 2016 at 08:02:53AM -0700, Guenter Roeck wrote:
>> On Tue, Aug 16, 2016 at 6:21 AM, Will Deacon  wrote:
>> > On Tue, Aug 16, 2016 at 06:14:53AM -0700, Guenter Roeck wrote:
>> >> On Tue, Aug 16, 2016 at 3:32 AM, Robin Murphy  
>> >> wrote:
>> >> > On 16/08/16 00:19, Guenter Roeck wrote:
>> >> >> we are having a problem with atomic accesses in pstore on some ARM
>> >> >> CPUs (specifically rk3288 and rk3399). With those chips, atomic
>> >> >> accesses fail with both pgprot_noncached and pgprot_writecombine
>> >> >> memory. Atomic accesses do work when selecting PAGE_KERNEL protection.
>> >> >
>> >> > What's the pstore backed by? I'm guessing it's not normal DRAM.
>> >> >
>> >>
>> >> it is normal DRAM.
>> >
>> > In which case, why does it need to be mapped with weird attributes?
>> > Is there an alias in the linear map you can use?
>> >
>>
>> I don't really _want_ to do anything besides using pstore as-is, or,
>> in other words, to have the upstream kernel work with the affected
>> systems.
>>
>> The current pstore code runs the following code for memory with
>> pfn_valid() = true.
>>
>> if (memtype)
>> prot = pgprot_noncached(PAGE_KERNEL);
>> else
>> prot = pgprot_writecombine(PAGE_KERNEL);
>> ...
>> vaddr = vmap(pages, page_count, VM_MAP, prot);
>>
>> It then uses the memory pointed to by vaddr for atomic operations.
>
> This means that the generic ramoops / pstore code is making non-portable
> assumptions about memory types.
>
> So _something_ has to happen to that code.
>
>> In my case, both protection options don't work. Everything works fine
>> (or at least doesn't create an exception) if I use
>> vaddr = vmap(pages, page_count, VM_MAP, PAGE_KERNEL);
>> instead.
>
> Architecturally, that will give you a memory type to which we can safely use
> atomics on.
>
> It would be nice to know why the ramoops/pstore code must use atomics, and
> exactly what it's trying to achieve (i.e. is this just for serialisation, or 
> an
> attempt to ensure persistence).
>
> Depending on that, it may be possible to fix things more generically by using
> memremap by default, for example, and only allowing uncached mappings on those
> architectures which support them.

persistent_ram uses atomic ops in uncached memory to store the start
and end positions in the ringbuffer so that the state of the
ringbuffer will be valid if the kernel crashes at any time.  This was
inherited from Android's ram_console implementation, and worked
through armv7.  It has been causing more and more problems recently,
see for example 027bc8b08242c59e19356b4b2c189f2d849ab660 (pstore-ram:
Allow optional mapping with pgprot_noncached) and
7ae9cb81933515dc7db1aa3c47ef7653717e3090 (pstore-ram: Fix hangs by
using write-combine mappings).

Maybe it should be replaced with a spinlock in normal ram protecting
writes to the uncached region.


Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

2016-07-13 Thread Colin Cross
On Wed, Jul 13, 2016 at 11:21 AM, Tejun Heo  wrote:
> (cc'ing Oleg)
>
> Hello,
>
> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
>>   So Dmitry Shmidt recently noticed that with 4.4 based systems we're
>> seeing quite a bit of performance overhead from
>> __cgroup_procs_write().
>>
>> With 4.4 tree as it stands, we're seeing __cgroup_procs_write() quite
>> often take 10s of miliseconds to execute (with max times up in the
>> 80ms range).
>
> Yikes, that's pretty high.  Does this happen only while the system is
> generally busy or regardless of overall load?
>
>> While with 4.1 it was quite often in the single usec range, and max
>> time values still in in sub-milisecond range.
>>
>> The majority of these performance regressions seem to come from the
>> locking changes in:
>>
>> 3014dde762f6 ("cgroup: simplify threadgroup locking")
>> and
>> 1ed1328792ff  ("sched, cgroup: replace signal_struct->group_rwsem with
>> a global percpu_rwsem")
>>
>> Dmitry has found that by reverting these two changes (which don't
>> revert easiliy), we can get back down to tens 10-100 usec range for
>> most calls, with max values occasionally spiking to ~18ms.
>>
>> Those two commits do talk about performance regressions, that were
>> supposedly alleviated by percpu_rwsem changes, but I'm not sure we are
>> seeing this.
>>
>> In 1ed1328792ff, the commit talks about the write path being a fairly
>> cold path, but with Android I worry this may not actually be the case,
>> as Android uses cpuset cgroups to group tasks into foreground and
>> background tasks, but this means when switching applications, tasks
>> are migrated between cgroups. Putting an additional 80 milisecond
>> delay on this adds potentially visible latencies on task switching.
>
> Switching between foreground and background isn't a hot path.  It's
> human initiated operations after all.  It taking 80 msecs sure is
> problematic but I'm skeptical that this is from actual contention
> given that the only reader side holders are fork and exit paths.

Slight correction here, we move tasks to the foreground cgroup and
back around binder IPC calls from foreground processes to background
processes, so it is significantly hotter than just human initiated
operations.

>> Reverting those two changes in the Android common.git tree doesn't
>> feel like a good long term solution here, so I was wondering if you
>> had any thoughts on how to further reduce the performance regression
>> here?
>
> One interesting thing to try would be replacing it with a regular
> non-percpu rwsem and see how it behaves.  That should easily tell us
> whether this is from actual contention or artifacts from percpu_rwsem
> implementation.
>
> Thanks.
>
> --
> tejun


Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

2016-07-13 Thread Colin Cross
On Wed, Jul 13, 2016 at 11:21 AM, Tejun Heo  wrote:
> (cc'ing Oleg)
>
> Hello,
>
> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
>>   So Dmitry Shmidt recently noticed that with 4.4 based systems we're
>> seeing quite a bit of performance overhead from
>> __cgroup_procs_write().
>>
>> With 4.4 tree as it stands, we're seeing __cgroup_procs_write() quite
>> often take 10s of miliseconds to execute (with max times up in the
>> 80ms range).
>
> Yikes, that's pretty high.  Does this happen only while the system is
> generally busy or regardless of overall load?
>
>> While with 4.1 it was quite often in the single usec range, and max
>> time values still in in sub-milisecond range.
>>
>> The majority of these performance regressions seem to come from the
>> locking changes in:
>>
>> 3014dde762f6 ("cgroup: simplify threadgroup locking")
>> and
>> 1ed1328792ff  ("sched, cgroup: replace signal_struct->group_rwsem with
>> a global percpu_rwsem")
>>
>> Dmitry has found that by reverting these two changes (which don't
>> revert easiliy), we can get back down to tens 10-100 usec range for
>> most calls, with max values occasionally spiking to ~18ms.
>>
>> Those two commits do talk about performance regressions, that were
>> supposedly alleviated by percpu_rwsem changes, but I'm not sure we are
>> seeing this.
>>
>> In 1ed1328792ff, the commit talks about the write path being a fairly
>> cold path, but with Android I worry this may not actually be the case,
>> as Android uses cpuset cgroups to group tasks into foreground and
>> background tasks, but this means when switching applications, tasks
>> are migrated between cgroups. Putting an additional 80 milisecond
>> delay on this adds potentially visible latencies on task switching.
>
> Switching between foreground and background isn't a hot path.  It's
> human initiated operations after all.  It taking 80 msecs sure is
> problematic but I'm skeptical that this is from actual contention
> given that the only reader side holders are fork and exit paths.

Slight correction here, we move tasks to the foreground cgroup and
back around binder IPC calls from foreground processes to background
processes, so it is significantly hotter than just human initiated
operations.

>> Reverting those two changes in the Android common.git tree doesn't
>> feel like a good long term solution here, so I was wondering if you
>> had any thoughts on how to further reduce the performance regression
>> here?
>
> One interesting thing to try would be replacing it with a regular
> non-percpu rwsem and see how it behaves.  That should easily tell us
> whether this is from actual contention or artifacts from percpu_rwsem
> implementation.
>
> Thanks.
>
> --
> tejun


Re: [PATCH] ion: scatterlist offset not used for buffer map

2016-04-08 Thread Colin Cross
On Thu, Apr 7, 2016 at 11:56 PM, John Einar Reitan
 wrote:
> On Thu, Apr 07, 2016 at 12:37:50PM -0700, Laura Abbott wrote:
>> On 04/07/2016 04:29 AM, John Einar Reitan wrote:
>> > ion's default user/kernel page mapping code don't honor the offset
>> > option for scatterlists. It uses sg_page and expect the whole page to be
>> > mapped, while the offset could dictate an offset within a large page.
>> >
>> > sg_phys correctly accounts for the offset, so should be used instead.
>> >
>>
>> Can you be more specific about which heap and which allocation pattern
>> is exposing this bug?
>
> The heap that exposed the bug is one I'm developing and will be posting
> as a RFC soon. It uses compound pages and an sub-divides it into surface
> buffers. The ion buffers are configured to hold sgl's with the compound
> page and the correct offset of the buffer, via
> sg_set_page(.., compound_page, .., offset_of_logical_buffer);

I don't think this is right.  A compound_page still has a page struct
for every page, you should be passing the page struct where your data
starts.  Using an offset > PAGE_SIZE is going to break lots of places,
for example anywhere that uses kmap(sg_page(sg)).

> sg_phys/sg_virt  includes this offset, but if you poke the sg and extract
> the page with sg_page yourself you must include this offset in your
> calculations too.


Re: [PATCH] ion: scatterlist offset not used for buffer map

2016-04-08 Thread Colin Cross
On Thu, Apr 7, 2016 at 11:56 PM, John Einar Reitan
 wrote:
> On Thu, Apr 07, 2016 at 12:37:50PM -0700, Laura Abbott wrote:
>> On 04/07/2016 04:29 AM, John Einar Reitan wrote:
>> > ion's default user/kernel page mapping code don't honor the offset
>> > option for scatterlists. It uses sg_page and expect the whole page to be
>> > mapped, while the offset could dictate an offset within a large page.
>> >
>> > sg_phys correctly accounts for the offset, so should be used instead.
>> >
>>
>> Can you be more specific about which heap and which allocation pattern
>> is exposing this bug?
>
> The heap that exposed the bug is one I'm developing and will be posting
> as a RFC soon. It uses compound pages and an sub-divides it into surface
> buffers. The ion buffers are configured to hold sgl's with the compound
> page and the correct offset of the buffer, via
> sg_set_page(.., compound_page, .., offset_of_logical_buffer);

I don't think this is right.  A compound_page still has a page struct
for every page, you should be passing the page struct where your data
starts.  Using an offset > PAGE_SIZE is going to break lots of places,
for example anywhere that uses kmap(sg_page(sg)).

> sg_phys/sg_virt  includes this offset, but if you poke the sg and extract
> the page with sg_page yourself you must include this offset in your
> calculations too.


VM_GROWSDOWN and fixed size stacks

2016-03-21 Thread Colin Cross
I recently came across some Android userspace code that jumps through
some strange hoops to produce a fixed size stack on the main stack
(https://android.googlesource.com/platform/art/+/db1f7dac02f6dcecac3e032f10abbcdbf3cf4331/runtime/thread.cc#543).
ART (the Android runtime) uses a unified stack for managed and native
code.  It installs its own guard pages at the bottom of the stack, and
converts stack overflow segfaults into the appropriate exceptions.  In
order to run the exception handling code, it unprotects some of the
guard pages and uses them as stack.

To get a fixed size stack, ART accessing every page in the desired
stack, starting from the current SP and moving down to the desired
guard page.  This method was determined empirically, and is required
by a strange combination of rules in arch/*/mm/fault.c and
check_stack_guard_page for VM_GROWSDOWN mappings.

On arm and arm64, fault.c will happily extend the stack as far as
necessary with a single read below the stack and above any other
mapping.  x86 fault.c places an additional restriction, the fault
address cannot be more than ~64kB below the current stack pointer (not
the bottom of the current stack mapping).  However, that stack pointer
restriction is not enforced by check_stack_guard_page, which will grow
the stack by 4kB for any access in the last page of the current stack
mapping, which is why the repeated reads in ART can work on x86.

On a pthread_create'd thread, mprotecting the bottom of the stack to
PROT_NONE would be sufficient.  For the VM_GROWSDOWN stack, manually
placing guard pages at the bottom of the desired stack without
expanding it doesn't work, because check_stack_guard_page will fault
one page before that.  In addition, other non-stack mappings might get
placed between the stack and the guard pages.

Manually mapping the entirety of the desired stack would work, but
causes confusing reporting in /proc/pid/maps.  The manual mapping
would not merge with the VM_GROWSDOWN mapping because of the mismatch
flags, resulting in a stack that spans two mappings, and only one of
them would get annotated with [stack].  There would also be a one page
gap shown in /proc/pid/maps, because task_mmu.c show_map_vma subtracts
off the virtual guard page, although since it is already mapped
accesses to the gap would not fault.

Hiding the stack guard page also causes incorrect reporting for the
current ART stack growing hack.  The code reads up to and including
the desired guard pages, and then mprotects them to PROT_NONE.  The
virtual guard page is one page below the last read, so there is a one
page VM_GROWSDOWN mapping located below the guard page.  When
show_map_vma subtracts a page it ends up showing a mapping whose start
and end addresses are the same:
7ff82c5000-7ff82c5000 rw-p  00:00 0
7ff82c5000-7ff82c6000 ---p  00:00 0
7ff82c6000-7ff8ac5000 rw-p  00:00 0  [stack]

The hack that is in place now works, although it is a unnecessarily
slow.  We've recently restricted it to only running on the main
VM_GROWSDOWN thread and not every spawned thread by first checking if
mprotect PROT_NONE at the bottom of the stack works.  It seems like
there should be a better way to handle this though.  Switching to
another stack would work, but cleaning up and freeing the old stack
would be hard since the top of the stack generally contains TLS and
global getauxval storage.  If there were some way to disable the
VM_GROWSDOWN flag we could manually extend the stack without
introducing the /proc/pid/maps reporting problems.  Or if there was
some way to manually extend a  VM_GROWSDOWN stack we could get the
same behavior as today without faulting 2000 times and hoping that a
future kernel doesn't decide that a check_stack_guard_page far below
the current stack pointer is a segfault.


VM_GROWSDOWN and fixed size stacks

2016-03-21 Thread Colin Cross
I recently came across some Android userspace code that jumps through
some strange hoops to produce a fixed size stack on the main stack
(https://android.googlesource.com/platform/art/+/db1f7dac02f6dcecac3e032f10abbcdbf3cf4331/runtime/thread.cc#543).
ART (the Android runtime) uses a unified stack for managed and native
code.  It installs its own guard pages at the bottom of the stack, and
converts stack overflow segfaults into the appropriate exceptions.  In
order to run the exception handling code, it unprotects some of the
guard pages and uses them as stack.

To get a fixed size stack, ART accessing every page in the desired
stack, starting from the current SP and moving down to the desired
guard page.  This method was determined empirically, and is required
by a strange combination of rules in arch/*/mm/fault.c and
check_stack_guard_page for VM_GROWSDOWN mappings.

On arm and arm64, fault.c will happily extend the stack as far as
necessary with a single read below the stack and above any other
mapping.  x86 fault.c places an additional restriction, the fault
address cannot be more than ~64kB below the current stack pointer (not
the bottom of the current stack mapping).  However, that stack pointer
restriction is not enforced by check_stack_guard_page, which will grow
the stack by 4kB for any access in the last page of the current stack
mapping, which is why the repeated reads in ART can work on x86.

On a pthread_create'd thread, mprotecting the bottom of the stack to
PROT_NONE would be sufficient.  For the VM_GROWSDOWN stack, manually
placing guard pages at the bottom of the desired stack without
expanding it doesn't work, because check_stack_guard_page will fault
one page before that.  In addition, other non-stack mappings might get
placed between the stack and the guard pages.

Manually mapping the entirety of the desired stack would work, but
causes confusing reporting in /proc/pid/maps.  The manual mapping
would not merge with the VM_GROWSDOWN mapping because of the mismatch
flags, resulting in a stack that spans two mappings, and only one of
them would get annotated with [stack].  There would also be a one page
gap shown in /proc/pid/maps, because task_mmu.c show_map_vma subtracts
off the virtual guard page, although since it is already mapped
accesses to the gap would not fault.

Hiding the stack guard page also causes incorrect reporting for the
current ART stack growing hack.  The code reads up to and including
the desired guard pages, and then mprotects them to PROT_NONE.  The
virtual guard page is one page below the last read, so there is a one
page VM_GROWSDOWN mapping located below the guard page.  When
show_map_vma subtracts a page it ends up showing a mapping whose start
and end addresses are the same:
7ff82c5000-7ff82c5000 rw-p  00:00 0
7ff82c5000-7ff82c6000 ---p  00:00 0
7ff82c6000-7ff8ac5000 rw-p  00:00 0  [stack]

The hack that is in place now works, although it is a unnecessarily
slow.  We've recently restricted it to only running on the main
VM_GROWSDOWN thread and not every spawned thread by first checking if
mprotect PROT_NONE at the bottom of the stack works.  It seems like
there should be a better way to handle this though.  Switching to
another stack would work, but cleaning up and freeing the old stack
would be hard since the top of the stack generally contains TLS and
global getauxval storage.  If there were some way to disable the
VM_GROWSDOWN flag we could manually extend the stack without
introducing the /proc/pid/maps reporting problems.  Or if there was
some way to manually extend a  VM_GROWSDOWN stack we could get the
same behavior as today without faulting 2000 times and hoping that a
future kernel doesn't decide that a check_stack_guard_page far below
the current stack pointer is a segfault.


Re: [PATCH] proc: revert /proc//maps [stack:TID] annotation

2016-01-25 Thread Colin Cross
On Mon, Jan 25, 2016 at 3:14 PM, Kirill A. Shutemov
 wrote:
> On Mon, Jan 25, 2016 at 01:30:00PM -0800, Colin Cross wrote:
>> On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
>>  wrote:
>> > On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
>> >> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner  
>> >> wrote:
>> >>
>> >> > b764375 ("procfs: mark thread stack correctly in proc//maps")
>> >> > added [stack:TID] annotation to /proc//maps. Finding the task of
>> >> > a stack VMA requires walking the entire thread list, turning this into
>> >> > quadratic behavior: a thousand threads means a thousand stacks, so the
>> >> > rendering of /proc//maps needs to look at a million threads. The
>> >> > cost is not in proportion to the usefulness as described in the patch.
>> >> >
>> >> > Drop the [stack:TID] annotation to make /proc//maps (and
>> >> > /proc//numa_maps) usable again for higher thread counts.
>> >> >
>> >> > The [stack] annotation inside /proc//task//maps is retained,
>> >> > as identifying the stack VMA there is an O(1) operation.
>> >>
>> >> Four years ago, ouch.
>> >>
>> >> Any thoughts on the obvious back-compatibility concerns?  ie, why did
>> >> Siddhesh implement this in the first place?  My bad for not ensuring
>> >> that the changelog told us this.
>> >>
>> >> https://lkml.org/lkml/2012/1/14/25 has more info:
>> >>
>> >> : Memory mmaped by glibc for a thread stack currently shows up as a
>> >> : simple anonymous map, which makes it difficult to differentiate between
>> >> : memory usage of the thread on stack and other dynamic allocation.
>> >> : Since glibc already uses MAP_STACK to request this mapping, the
>> >> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
>> >> : resulting vma so that the mapping is treated as a stack and not any
>> >> : regular anonymous mapping.  Also, one may use vm_flags to decide if a
>> >> : vma is a stack.
>> >>
>> >> But even that doesn't really tell us what the actual *value* of the
>> >> patch is to end-users.
>> >
>> > I doubt it can be very useful as it's unreliable: if two stacks are
>> > allocated end-to-end (which is not good idea, but still) it can only
>> > report [stack:XXX] for the first one as they are merged into one VMA.
>> > Any other anon VMA merged with the stack will be also claimed as stack,
>> > which is not always correct.
>> >
>> > I think report the VMA as anon is the best we can know about it,
>> > everything else just rather expensive guesses.
>>
>> An alternative to guessing is the anonymous VMA naming patch used on
>> Android, https://lkml.org/lkml/2013/10/30/518.  It allows userspace to
>> name anonymous memory however it wishes, and prevents vma merging
>> adjacent regions with different names.  Android uses it to label
>> native heap memory, but it would work well for stacks too.
>
> I don't think preventing vma merging is fair price for the feature: you
> would pay extra in every find_vma() (meaning all page faults).
>
> I think it would be nice to have a way to store this kind of sideband info
> without impacting critical code path.
>
> One other use case I see for such sideband info is storing hits from
> MADV_HUGEPAGE/MADV_NOHUGEPAGE: need to split vma just for these hints is
> unfortunate.

In practice we don't see many extra VMAs from naming; alignment
requirements, guard pages, and permissions differences are usually
enough to keep adjacent anonymous VMAs from merging.  Here's an
example from a process on Android:
7f9086c000-7f9086d000 rw-p 6000 fd:00 1495
  /system/lib64/libhardware_legacy.so
7f9086d000-7f9086e000 rw-p  00:00 0
7f9086e000-7f9086f000 rw-p  00:00 0
  [anon:linker_alloc]
7f90875000-7f90876000 r--p  00:00 0
  [anon:linker_alloc]
7f9087c000-7f9087d000 r--p  00:00 0
  [anon:linker_alloc]
7f90901000-7f90902000 ---p  00:00 0
  [anon:thread stack guard page]
7f90902000-7f90a0 rw-p  00:00 0
  [stack:410]
7f90a0-7f90c0 rw-p  00:00 0
  [anon:libc_malloc]
7f90c02000-7f90c03000 ---p  00:00 0
  [anon:thread stack guard page]
7f90c03000-7f90d01000 rw-p  00:00 0
  [stack:409]
7f90d01000-7f90d02000 ---p  00:00 0
  [anon:thread stack guard page]
7f90d02000-7f90e0 rw-p  00:00 0
  [stack:408]
7f90e0-7f9120 rw-p  00:00 0
  [anon:libc_ma

Re: [PATCH] proc: revert /proc//maps [stack:TID] annotation

2016-01-25 Thread Colin Cross
On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
 wrote:
> On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
>> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner  
>> wrote:
>>
>> > b764375 ("procfs: mark thread stack correctly in proc//maps")
>> > added [stack:TID] annotation to /proc//maps. Finding the task of
>> > a stack VMA requires walking the entire thread list, turning this into
>> > quadratic behavior: a thousand threads means a thousand stacks, so the
>> > rendering of /proc//maps needs to look at a million threads. The
>> > cost is not in proportion to the usefulness as described in the patch.
>> >
>> > Drop the [stack:TID] annotation to make /proc//maps (and
>> > /proc//numa_maps) usable again for higher thread counts.
>> >
>> > The [stack] annotation inside /proc//task//maps is retained,
>> > as identifying the stack VMA there is an O(1) operation.
>>
>> Four years ago, ouch.
>>
>> Any thoughts on the obvious back-compatibility concerns?  ie, why did
>> Siddhesh implement this in the first place?  My bad for not ensuring
>> that the changelog told us this.
>>
>> https://lkml.org/lkml/2012/1/14/25 has more info:
>>
>> : Memory mmaped by glibc for a thread stack currently shows up as a
>> : simple anonymous map, which makes it difficult to differentiate between
>> : memory usage of the thread on stack and other dynamic allocation.
>> : Since glibc already uses MAP_STACK to request this mapping, the
>> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
>> : resulting vma so that the mapping is treated as a stack and not any
>> : regular anonymous mapping.  Also, one may use vm_flags to decide if a
>> : vma is a stack.
>>
>> But even that doesn't really tell us what the actual *value* of the
>> patch is to end-users.
>
> I doubt it can be very useful as it's unreliable: if two stacks are
> allocated end-to-end (which is not good idea, but still) it can only
> report [stack:XXX] for the first one as they are merged into one VMA.
> Any other anon VMA merged with the stack will be also claimed as stack,
> which is not always correct.
>
> I think report the VMA as anon is the best we can know about it,
> everything else just rather expensive guesses.

An alternative to guessing is the anonymous VMA naming patch used on
Android, https://lkml.org/lkml/2013/10/30/518.  It allows userspace to
name anonymous memory however it wishes, and prevents vma merging
adjacent regions with different names.  Android uses it to label
native heap memory, but it would work well for stacks too.


Re: [PATCH] proc: revert /proc//maps [stack:TID] annotation

2016-01-25 Thread Colin Cross
On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
 wrote:
> On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
>> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner  
>> wrote:
>>
>> > b764375 ("procfs: mark thread stack correctly in proc//maps")
>> > added [stack:TID] annotation to /proc//maps. Finding the task of
>> > a stack VMA requires walking the entire thread list, turning this into
>> > quadratic behavior: a thousand threads means a thousand stacks, so the
>> > rendering of /proc//maps needs to look at a million threads. The
>> > cost is not in proportion to the usefulness as described in the patch.
>> >
>> > Drop the [stack:TID] annotation to make /proc//maps (and
>> > /proc//numa_maps) usable again for higher thread counts.
>> >
>> > The [stack] annotation inside /proc//task//maps is retained,
>> > as identifying the stack VMA there is an O(1) operation.
>>
>> Four years ago, ouch.
>>
>> Any thoughts on the obvious back-compatibility concerns?  ie, why did
>> Siddhesh implement this in the first place?  My bad for not ensuring
>> that the changelog told us this.
>>
>> https://lkml.org/lkml/2012/1/14/25 has more info:
>>
>> : Memory mmaped by glibc for a thread stack currently shows up as a
>> : simple anonymous map, which makes it difficult to differentiate between
>> : memory usage of the thread on stack and other dynamic allocation.
>> : Since glibc already uses MAP_STACK to request this mapping, the
>> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
>> : resulting vma so that the mapping is treated as a stack and not any
>> : regular anonymous mapping.  Also, one may use vm_flags to decide if a
>> : vma is a stack.
>>
>> But even that doesn't really tell us what the actual *value* of the
>> patch is to end-users.
>
> I doubt it can be very useful as it's unreliable: if two stacks are
> allocated end-to-end (which is not good idea, but still) it can only
> report [stack:XXX] for the first one as they are merged into one VMA.
> Any other anon VMA merged with the stack will be also claimed as stack,
> which is not always correct.
>
> I think report the VMA as anon is the best we can know about it,
> everything else just rather expensive guesses.

An alternative to guessing is the anonymous VMA naming patch used on
Android, https://lkml.org/lkml/2013/10/30/518.  It allows userspace to
name anonymous memory however it wishes, and prevents vma merging
adjacent regions with different names.  Android uses it to label
native heap memory, but it would work well for stacks too.


Re: [PATCH] proc: revert /proc//maps [stack:TID] annotation

2016-01-25 Thread Colin Cross
On Mon, Jan 25, 2016 at 3:14 PM, Kirill A. Shutemov
<kir...@shutemov.name> wrote:
> On Mon, Jan 25, 2016 at 01:30:00PM -0800, Colin Cross wrote:
>> On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
>> <kir...@shutemov.name> wrote:
>> > On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
>> >> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner <han...@cmpxchg.org> 
>> >> wrote:
>> >>
>> >> > b764375 ("procfs: mark thread stack correctly in proc//maps")
>> >> > added [stack:TID] annotation to /proc//maps. Finding the task of
>> >> > a stack VMA requires walking the entire thread list, turning this into
>> >> > quadratic behavior: a thousand threads means a thousand stacks, so the
>> >> > rendering of /proc//maps needs to look at a million threads. The
>> >> > cost is not in proportion to the usefulness as described in the patch.
>> >> >
>> >> > Drop the [stack:TID] annotation to make /proc//maps (and
>> >> > /proc//numa_maps) usable again for higher thread counts.
>> >> >
>> >> > The [stack] annotation inside /proc//task//maps is retained,
>> >> > as identifying the stack VMA there is an O(1) operation.
>> >>
>> >> Four years ago, ouch.
>> >>
>> >> Any thoughts on the obvious back-compatibility concerns?  ie, why did
>> >> Siddhesh implement this in the first place?  My bad for not ensuring
>> >> that the changelog told us this.
>> >>
>> >> https://lkml.org/lkml/2012/1/14/25 has more info:
>> >>
>> >> : Memory mmaped by glibc for a thread stack currently shows up as a
>> >> : simple anonymous map, which makes it difficult to differentiate between
>> >> : memory usage of the thread on stack and other dynamic allocation.
>> >> : Since glibc already uses MAP_STACK to request this mapping, the
>> >> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
>> >> : resulting vma so that the mapping is treated as a stack and not any
>> >> : regular anonymous mapping.  Also, one may use vm_flags to decide if a
>> >> : vma is a stack.
>> >>
>> >> But even that doesn't really tell us what the actual *value* of the
>> >> patch is to end-users.
>> >
>> > I doubt it can be very useful as it's unreliable: if two stacks are
>> > allocated end-to-end (which is not good idea, but still) it can only
>> > report [stack:XXX] for the first one as they are merged into one VMA.
>> > Any other anon VMA merged with the stack will be also claimed as stack,
>> > which is not always correct.
>> >
>> > I think report the VMA as anon is the best we can know about it,
>> > everything else just rather expensive guesses.
>>
>> An alternative to guessing is the anonymous VMA naming patch used on
>> Android, https://lkml.org/lkml/2013/10/30/518.  It allows userspace to
>> name anonymous memory however it wishes, and prevents vma merging
>> adjacent regions with different names.  Android uses it to label
>> native heap memory, but it would work well for stacks too.
>
> I don't think preventing vma merging is fair price for the feature: you
> would pay extra in every find_vma() (meaning all page faults).
>
> I think it would be nice to have a way to store this kind of sideband info
> without impacting critical code path.
>
> One other use case I see for such sideband info is storing hits from
> MADV_HUGEPAGE/MADV_NOHUGEPAGE: need to split vma just for these hints is
> unfortunate.

In practice we don't see many extra VMAs from naming; alignment
requirements, guard pages, and permissions differences are usually
enough to keep adjacent anonymous VMAs from merging.  Here's an
example from a process on Android:
7f9086c000-7f9086d000 rw-p 6000 fd:00 1495
  /system/lib64/libhardware_legacy.so
7f9086d000-7f9086e000 rw-p  00:00 0
7f9086e000-7f9086f000 rw-p  00:00 0
  [anon:linker_alloc]
7f90875000-7f90876000 r--p  00:00 0
  [anon:linker_alloc]
7f9087c000-7f9087d000 r--p  00:00 0
  [anon:linker_alloc]
7f90901000-7f90902000 ---p  00:00 0
  [anon:thread stack guard page]
7f90902000-7f90a0 rw-p  00:00 0
  [stack:410]
7f90a0-7f90c0 rw-p  00:00 0
  [anon:libc_malloc]
7f90c02000-7f90c03000 ---p  00:00 0
  [anon:thread stack guard page]
7f90c03000-7f90d01000 rw-p  00:00 0
  [stack:409]
7f90d01000-7f90d02000 ---p  00:00 0
  [anon:thread stack guard page]
7f90d02000-7f90e0 rw-p  00:0

Re: [PATCH] staging: ion: fix corruption of ion_import_dma_buf

2015-09-09 Thread Colin Cross
On Wed, Sep 9, 2015 at 10:19 AM, Laura Abbott  wrote:
> (adding Colin and John)
>
>
> On 09/09/2015 12:41 AM, Shawn Lin wrote:
>>
>> we found this issue but still exit in lastest kernel. Simply
>> keep ion_handle_create under mutex_lock to avoid this race.
>>
>> WARNING: CPU: 2 PID: 2648 at drivers/staging/android/ion/ion.c:512
>> ion_handle_add+0xb4/0xc0()
>> ion_handle_add: buffer already found.
>> Modules linked in: iwlmvm iwlwifi mac80211 cfg80211 compat
>> CPU: 2 PID: 2648 Comm: TimedEventQueue Tainted: GW3.14.0 #7
>>     9a3efd2c 80faf273 9a3efd6c 9a3efd5c 80935dc9 811d7fd3
>>   9a3efd88 0a58 812208a0 0200 80e128d4 80e128d4 8d4ae00c a8cd8600
>>   a8cd8094 9a3efd74 80935e0e 0009 9a3efd6c 811d7fd3 9a3efd88 9a3efd9c
>> Call Trace:
>>[<80faf273>] dump_stack+0x48/0x69
>>[<80935dc9>] warn_slowpath_common+0x79/0x90
>>[<80e128d4>] ? ion_handle_add+0xb4/0xc0
>>[<80e128d4>] ? ion_handle_add+0xb4/0xc0
>>[<80935e0e>] warn_slowpath_fmt+0x2e/0x30
>>[<80e128d4>] ion_handle_add+0xb4/0xc0
>>[<80e144cc>] ion_import_dma_buf+0x8c/0x110
>>[<80c517c4>] reg_init+0x364/0x7d0
>>[<80993363>] ? futex_wait+0x123/0x210
>>[<80992e0e>] ? get_futex_key+0x16e/0x1e0
>>[<8099308f>] ? futex_wake+0x5f/0x120
>>[<80c51e19>] vpu_service_ioctl+0x1e9/0x500
>>[<80994aec>] ? do_futex+0xec/0x8e0
>>[<80971080>] ? prepare_to_wait_event+0xc0/0xc0
>>[<80c51c30>] ? reg_init+0x7d0/0x7d0
>>[<80a22562>] do_vfs_ioctl+0x2d2/0x4c0
>>[<80b198ad>] ? inode_has_perm.isra.41+0x2d/0x40
>>[<80b199cf>] ? file_has_perm+0x7f/0x90
>>[<80b1a5f7>] ? selinux_file_ioctl+0x47/0xf0
>>[<80a227a8>] SyS_ioctl+0x58/0x80
>>[<80fb45e8>] syscall_call+0x7/0x7
>>[<80fb>] ? mmc_do_calc_max_discard+0xab/0xe4
>>
>> Fixes: 83271f626 ("ion: hold reference to handle...")
>> Signed-off-by: Shawn Lin 
>> ---
>>
>>   drivers/staging/android/ion/ion.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/staging/android/ion/ion.c
>> b/drivers/staging/android/ion/ion.c
>> index eec878e..32e7b5c 100644
>> --- a/drivers/staging/android/ion/ion.c
>> +++ b/drivers/staging/android/ion/ion.c
>> @@ -1179,13 +1179,13 @@ struct ion_handle *ion_import_dma_buf(struct
>> ion_client *client, int fd)
>> mutex_unlock(>lock);
>> goto end;
>> }
>> -   mutex_unlock(>lock);
>>
>> handle = ion_handle_create(client, buffer);
>> -   if (IS_ERR(handle))
>> +   if (IS_ERR(handle)) {
>> +   mutex_unlock(>lock);
>> goto end;
>> +   }
>>
>> -   mutex_lock(>lock);
>> ret = ion_handle_add(client, handle);
>> mutex_unlock(>lock);
>> if (ret) {
>>
>
> So the patch looks correct but the locking change there seems like it was
> added
> deliberately. Colin/John, do you remember why the locking for
> ion_import_dma_buf
> changed? Was there a deadlock condition somewhere?
>
> Thanks,
> Laura

I can't see any reason to not hold the mutex across ion_handle_create.
The patch that introduced the bug
(83271f6262c91a49df325c52bec8f00f4de294ca, ion: hold reference to
handle after ion_uhandle_get) required that the mutex not be held
around the call to ion_handle_put, but didn't affect
ion_handle_create.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH] PM / Runtime: runtime: Add sysfs option for forcing runtime suspend

2015-09-09 Thread Colin Cross
On Wed, Sep 9, 2015 at 1:35 PM, Rafael J. Wysocki  wrote:
>
> On Wednesday, September 09, 2015 11:20:25 AM Alan Stern wrote:
> > On Wed, 9 Sep 2015, Rafael J. Wysocki wrote:
> >
> > > > The best example and actually the very specific problem we want to
> > > > solve is handling touchscreens on a phone / tablet. When the screen is
> > > > turned off, it is ideal to suspend the touchscreen for two reasons: to
> > > > lower the power consumption as much as possible and to prevent
> > > > interrupts to wake-up the CPU when the user touches the device, and
> > > > thus save even more power as we allow the CPU to stay in deep idle
> > > > states for longer periods.
> > > >
> > > > Note that when the screen is turned-on again, we want to resume the
> > > > touchscreen so that it can send events again.
> > >
> > > In fact, then, what you need seems to be the feature discussed by Alan
> > > and me some time ago allowing remote wakeup do be disabled for runtime
> > > PM from user space as that in combination with autosuspend should
> > > address your use case.
> >
> > That, plus they want the touchscreen to go into runtime suspend
> > whenever the screen is off (was this not the main reason for the
> > patch?).
>
> Right.
>
> > It seems to me that it should be possible to arrange for this to happen
> > simply by making userspace close the touchscreen device when the screen
> > is turned off.  Or am I missing something?
>
> Honestly, I don't know.
>
> Octavian, Irina, any reasons why things can't be done as Alan is suggesting?


Early Android used early suspend, which notified various kernel
drivers that the screen was turning off so they could go into low
power states.  That meant there was no reason to close the touchscreen
fd - the kernel already knew about the screen off event.  We then got
rid of the early suspend hack and moved everything into userspace
using the power HAL and sysfs files.  Getting rid of the touchscreen
sysfs files and closing the touchscreen on screen off was on the
nice-to-have list, but hasn't made it onto anybody's todo list.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging: ion: fix corruption of ion_import_dma_buf

2015-09-09 Thread Colin Cross
On Wed, Sep 9, 2015 at 10:19 AM, Laura Abbott  wrote:
> (adding Colin and John)
>
>
> On 09/09/2015 12:41 AM, Shawn Lin wrote:
>>
>> we found this issue but still exit in lastest kernel. Simply
>> keep ion_handle_create under mutex_lock to avoid this race.
>>
>> WARNING: CPU: 2 PID: 2648 at drivers/staging/android/ion/ion.c:512
>> ion_handle_add+0xb4/0xc0()
>> ion_handle_add: buffer already found.
>> Modules linked in: iwlmvm iwlwifi mac80211 cfg80211 compat
>> CPU: 2 PID: 2648 Comm: TimedEventQueue Tainted: GW3.14.0 #7
>>     9a3efd2c 80faf273 9a3efd6c 9a3efd5c 80935dc9 811d7fd3
>>   9a3efd88 0a58 812208a0 0200 80e128d4 80e128d4 8d4ae00c a8cd8600
>>   a8cd8094 9a3efd74 80935e0e 0009 9a3efd6c 811d7fd3 9a3efd88 9a3efd9c
>> Call Trace:
>>[<80faf273>] dump_stack+0x48/0x69
>>[<80935dc9>] warn_slowpath_common+0x79/0x90
>>[<80e128d4>] ? ion_handle_add+0xb4/0xc0
>>[<80e128d4>] ? ion_handle_add+0xb4/0xc0
>>[<80935e0e>] warn_slowpath_fmt+0x2e/0x30
>>[<80e128d4>] ion_handle_add+0xb4/0xc0
>>[<80e144cc>] ion_import_dma_buf+0x8c/0x110
>>[<80c517c4>] reg_init+0x364/0x7d0
>>[<80993363>] ? futex_wait+0x123/0x210
>>[<80992e0e>] ? get_futex_key+0x16e/0x1e0
>>[<8099308f>] ? futex_wake+0x5f/0x120
>>[<80c51e19>] vpu_service_ioctl+0x1e9/0x500
>>[<80994aec>] ? do_futex+0xec/0x8e0
>>[<80971080>] ? prepare_to_wait_event+0xc0/0xc0
>>[<80c51c30>] ? reg_init+0x7d0/0x7d0
>>[<80a22562>] do_vfs_ioctl+0x2d2/0x4c0
>>[<80b198ad>] ? inode_has_perm.isra.41+0x2d/0x40
>>[<80b199cf>] ? file_has_perm+0x7f/0x90
>>[<80b1a5f7>] ? selinux_file_ioctl+0x47/0xf0
>>[<80a227a8>] SyS_ioctl+0x58/0x80
>>[<80fb45e8>] syscall_call+0x7/0x7
>>[<80fb>] ? mmc_do_calc_max_discard+0xab/0xe4
>>
>> Fixes: 83271f626 ("ion: hold reference to handle...")
>> Signed-off-by: Shawn Lin 
>> ---
>>
>>   drivers/staging/android/ion/ion.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/staging/android/ion/ion.c
>> b/drivers/staging/android/ion/ion.c
>> index eec878e..32e7b5c 100644
>> --- a/drivers/staging/android/ion/ion.c
>> +++ b/drivers/staging/android/ion/ion.c
>> @@ -1179,13 +1179,13 @@ struct ion_handle *ion_import_dma_buf(struct
>> ion_client *client, int fd)
>> mutex_unlock(>lock);
>> goto end;
>> }
>> -   mutex_unlock(>lock);
>>
>> handle = ion_handle_create(client, buffer);
>> -   if (IS_ERR(handle))
>> +   if (IS_ERR(handle)) {
>> +   mutex_unlock(>lock);
>> goto end;
>> +   }
>>
>> -   mutex_lock(>lock);
>> ret = ion_handle_add(client, handle);
>> mutex_unlock(>lock);
>> if (ret) {
>>
>
> So the patch looks correct but the locking change there seems like it was
> added
> deliberately. Colin/John, do you remember why the locking for
> ion_import_dma_buf
> changed? Was there a deadlock condition somewhere?
>
> Thanks,
> Laura

I can't see any reason to not hold the mutex across ion_handle_create.
The patch that introduced the bug
(83271f6262c91a49df325c52bec8f00f4de294ca, ion: hold reference to
handle after ion_uhandle_get) required that the mutex not be held
around the call to ion_handle_put, but didn't affect
ion_handle_create.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH] PM / Runtime: runtime: Add sysfs option for forcing runtime suspend

2015-09-09 Thread Colin Cross
On Wed, Sep 9, 2015 at 1:35 PM, Rafael J. Wysocki  wrote:
>
> On Wednesday, September 09, 2015 11:20:25 AM Alan Stern wrote:
> > On Wed, 9 Sep 2015, Rafael J. Wysocki wrote:
> >
> > > > The best example and actually the very specific problem we want to
> > > > solve is handling touchscreens on a phone / tablet. When the screen is
> > > > turned off, it is ideal to suspend the touchscreen for two reasons: to
> > > > lower the power consumption as much as possible and to prevent
> > > > interrupts to wake-up the CPU when the user touches the device, and
> > > > thus save even more power as we allow the CPU to stay in deep idle
> > > > states for longer periods.
> > > >
> > > > Note that when the screen is turned-on again, we want to resume the
> > > > touchscreen so that it can send events again.
> > >
> > > In fact, then, what you need seems to be the feature discussed by Alan
> > > and me some time ago allowing remote wakeup do be disabled for runtime
> > > PM from user space as that in combination with autosuspend should
> > > address your use case.
> >
> > That, plus they want the touchscreen to go into runtime suspend
> > whenever the screen is off (was this not the main reason for the
> > patch?).
>
> Right.
>
> > It seems to me that it should be possible to arrange for this to happen
> > simply by making userspace close the touchscreen device when the screen
> > is turned off.  Or am I missing something?
>
> Honestly, I don't know.
>
> Octavian, Irina, any reasons why things can't be done as Alan is suggesting?


Early Android used early suspend, which notified various kernel
drivers that the screen was turning off so they could go into low
power states.  That meant there was no reason to close the touchscreen
fd - the kernel already knew about the screen off event.  We then got
rid of the early suspend hack and moved everything into userspace
using the power HAL and sysfs files.  Getting rid of the touchscreen
sysfs files and closing the touchscreen on screen off was on the
nice-to-have list, but hasn't made it onto anybody's todo list.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/2] staging: ion: chunk_heap: use %pad for printing dma_addr_t's

2015-05-04 Thread Colin Cross
On Mon, May 4, 2015 at 1:22 AM, Dan Carpenter  wrote:
> On Thu, Apr 09, 2015 at 06:10:04PM -0700, Mitchel Humpherys wrote:
>> We're currently using %lu and %ld to print some variables of type
>> dma_addr_t, which results in the following warning when dma_addr_t is
>> 64-bits wide:
>>
>> drivers/staging/android/ion/ion_chunk_heap.c: In function 
>> 'ion_chunk_heap_create':
>> drivers/staging/android/ion/ion_chunk_heap.c:176:2: warning: format 
>> '%lu' expects argument of type 'long unsigned int', but argument 3 has type 
>> 'dma_addr_t' [-Wformat=]
>>   pr_info("%s: base %lu size %zu align %ld\n", __func__, 
>> chunk_heap->base,
>>   ^
>> drivers/staging/android/ion/ion_chunk_heap.c:176:2: warning: format 
>> '%ld' expects argument of type 'long int', but argument 5 has type 
>> 'dma_addr_t' [-Wformat=]
>>
>> Fix this by using %pad as instructed in printk-formats.txt.
>>
>> Signed-off-by: Mitchel Humpherys 
>
> This one was just merged and I was about to email you that it introduces
> some new Smatch warnings, but actually looking at it, it's just wrong.
>
> We want to print "chunk_heap->base" and not "_heap->base".

This would be correct if base was a dma_addr_t...

> And anyway "_heap->base" is a regular pointer, not a dma_addr_t.

But it is actually an ion_phys_addr_t, which is currently typedef'd to
unsigned long.  Are you using a local patch that replaces
ion_phys_addr_t with dma_addr_t?

> So please send a new patch that removes the &.

Removing the & is not correct, lib/vsprintf.c will dereference the arg
for %pad or %pap.  I think this patch should just be dropped, the old
%lu was correct for what is in Linus' tree.

> regards,
> dan carpenter
>
>> ---
>>  drivers/staging/android/ion/ion_chunk_heap.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/android/ion/ion_chunk_heap.c 
>> b/drivers/staging/android/ion/ion_chunk_heap.c
>> index 54746157d799..6b3e18aa1c64 100644
>> --- a/drivers/staging/android/ion/ion_chunk_heap.c
>> +++ b/drivers/staging/android/ion/ion_chunk_heap.c
>> @@ -173,8 +173,8 @@ struct ion_heap *ion_chunk_heap_create(struct 
>> ion_platform_heap *heap_data)
>>   chunk_heap->heap.ops = _heap_ops;
>>   chunk_heap->heap.type = ION_HEAP_TYPE_CHUNK;
>>   chunk_heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
>> - pr_debug("%s: base %lu size %zu align %ld\n", __func__, 
>> chunk_heap->base,
>> - heap_data->size, heap_data->align);
>> + pr_debug("%s: base %pad size %zu align %pad\n", __func__,
>> + _heap->base, heap_data->size, _data->align);
>>
>>   return _heap->heap;
>>
>> --
>> Qualcomm Innovation Center, Inc.
>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
>> a Linux Foundation Collaborative Project
>>
>> ___
>> devel mailing list
>> de...@linuxdriverproject.org
>> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.

+
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/2] staging: ion: chunk_heap: use %pad for printing dma_addr_t's

2015-05-04 Thread Colin Cross
On Mon, May 4, 2015 at 1:22 AM, Dan Carpenter dan.carpen...@oracle.com wrote:
 On Thu, Apr 09, 2015 at 06:10:04PM -0700, Mitchel Humpherys wrote:
 We're currently using %lu and %ld to print some variables of type
 dma_addr_t, which results in the following warning when dma_addr_t is
 64-bits wide:

 drivers/staging/android/ion/ion_chunk_heap.c: In function 
 'ion_chunk_heap_create':
 drivers/staging/android/ion/ion_chunk_heap.c:176:2: warning: format 
 '%lu' expects argument of type 'long unsigned int', but argument 3 has type 
 'dma_addr_t' [-Wformat=]
   pr_info(%s: base %lu size %zu align %ld\n, __func__, 
 chunk_heap-base,
   ^
 drivers/staging/android/ion/ion_chunk_heap.c:176:2: warning: format 
 '%ld' expects argument of type 'long int', but argument 5 has type 
 'dma_addr_t' [-Wformat=]

 Fix this by using %pad as instructed in printk-formats.txt.

 Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org

 This one was just merged and I was about to email you that it introduces
 some new Smatch warnings, but actually looking at it, it's just wrong.

 We want to print chunk_heap-base and not chunk_heap-base.

This would be correct if base was a dma_addr_t...

 And anyway chunk_heap-base is a regular pointer, not a dma_addr_t.

But it is actually an ion_phys_addr_t, which is currently typedef'd to
unsigned long.  Are you using a local patch that replaces
ion_phys_addr_t with dma_addr_t?

 So please send a new patch that removes the .

Removing the  is not correct, lib/vsprintf.c will dereference the arg
for %pad or %pap.  I think this patch should just be dropped, the old
%lu was correct for what is in Linus' tree.

 regards,
 dan carpenter

 ---
  drivers/staging/android/ion/ion_chunk_heap.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/drivers/staging/android/ion/ion_chunk_heap.c 
 b/drivers/staging/android/ion/ion_chunk_heap.c
 index 54746157d799..6b3e18aa1c64 100644
 --- a/drivers/staging/android/ion/ion_chunk_heap.c
 +++ b/drivers/staging/android/ion/ion_chunk_heap.c
 @@ -173,8 +173,8 @@ struct ion_heap *ion_chunk_heap_create(struct 
 ion_platform_heap *heap_data)
   chunk_heap-heap.ops = chunk_heap_ops;
   chunk_heap-heap.type = ION_HEAP_TYPE_CHUNK;
   chunk_heap-heap.flags = ION_HEAP_FLAG_DEFER_FREE;
 - pr_debug(%s: base %lu size %zu align %ld\n, __func__, 
 chunk_heap-base,
 - heap_data-size, heap_data-align);
 + pr_debug(%s: base %pad size %zu align %pad\n, __func__,
 + chunk_heap-base, heap_data-size, heap_data-align);

   return chunk_heap-heap;

 --
 Qualcomm Innovation Center, Inc.
 The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
 a Linux Foundation Collaborative Project

 ___
 devel mailing list
 de...@linuxdriverproject.org
 http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

 To unsubscribe from this group and stop receiving emails from it, send an 
 email to kernel-team+unsubscr...@android.com.

+
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3] debug: prevent entering debug mode on errors

2014-12-01 Thread Colin Cross
On Thu, Nov 27, 2014 at 1:49 AM, Daniel Thompson
 wrote:
> On 26/11/14 17:45, Colin Cross wrote:
>> The original patch was more useful as it allowed re-enabling break on
>> panic on specific devices where you were trying to debug a
>> reproducible issue.  What about using a module_param similar to
>> kgdbreboot, but setting the default based on CONFIG_PANIC_TIMEOUT to
>> avoid extra configuration?
>
> This change was due to my review so perhaps I'd better answer this...
>
> panic_timeout is the value of the panic sysctl. In addition to the
> normal sysctl tooling (which I don't think is available on most android
> systems), its value can be set using panic=0 on the kernel command line
> or via /proc/sys/kernel/panic at runtime.
>
> CONFIG_PANIC_TIMEOUT merely sets the default value of the sysctl. I
> guess perhaps the patch description could be improved to make this clearer.
>
> Therefore, the only loss of function I expected versus the original is
> that it would be hard to get as far as a reproducible panic if the
> system also has a ton of reproducible oopses that we don't want to fix.
> Is such a use-case important?

OK, works for me.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3] debug: prevent entering debug mode on errors

2014-12-01 Thread Colin Cross
On Thu, Nov 27, 2014 at 1:49 AM, Daniel Thompson
daniel.thomp...@linaro.org wrote:
 On 26/11/14 17:45, Colin Cross wrote:
 The original patch was more useful as it allowed re-enabling break on
 panic on specific devices where you were trying to debug a
 reproducible issue.  What about using a module_param similar to
 kgdbreboot, but setting the default based on CONFIG_PANIC_TIMEOUT to
 avoid extra configuration?

 This change was due to my review so perhaps I'd better answer this...

 panic_timeout is the value of the panic sysctl. In addition to the
 normal sysctl tooling (which I don't think is available on most android
 systems), its value can be set using panic=0 on the kernel command line
 or via /proc/sys/kernel/panic at runtime.

 CONFIG_PANIC_TIMEOUT merely sets the default value of the sysctl. I
 guess perhaps the patch description could be improved to make this clearer.

 Therefore, the only loss of function I expected versus the original is
 that it would be hard to get as far as a reproducible panic if the
 system also has a ton of reproducible oopses that we don't want to fix.
 Is such a use-case important?

OK, works for me.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3] debug: prevent entering debug mode on errors

2014-11-26 Thread Colin Cross
On Wed, Nov 26, 2014 at 1:14 AM, Kiran Raparthy  wrote:
> From: Colin Cross 
>
> debug: prevent entering debug mode on errors
>
> On non-developer devices kgdb prevents CONFIG_PANIC_TIMEOUT from rebooting the
> device after a panic.
>
> In case of panics and exceptions, to honor CONFIG_PANIC_TIMEOUT, prevent
> entering debug mode to avoid getting stuck waiting for the user to interact
> with debugger.
>
> Cc: Jason Wessel 
> Cc: kgdb-bugrep...@lists.sourceforge.net
> Cc: linux-kernel@vger.kernel.org
> Cc: Android Kernel Team 
> Cc: John Stultz 
> Cc: Sumit Semwal 
> Signed-off-by: Colin Cross 
> [Kiran: Added context to commit message.
> panic_timeout is used instead of break_on_panic and
> break_on_exception to honor CONFIG_PANIC_TIMEOUT]
> Signed-off-by: Kiran Raparthy 
> ---
>  kernel/debug/debug_core.c | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 1adf62b..0012a1f 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -689,6 +689,14 @@ kgdb_handle_exception(int evector, int signo, int ecode, 
> struct pt_regs *regs)
>
> if (arch_kgdb_ops.enable_nmi)
> arch_kgdb_ops.enable_nmi(0);
> +   /*
> +* Avoid entering the debugger if we were triggered due to an oops
> +* but panic_timeout indicates the system should automatically
> +* reboot on panic. We don't want to get stuck waiting for input
> +* on such systems, especially if its "just" an oops.
> +*/
> +   if (signo != SIGTRAP && panic_timeout)
> +   return 1;
>
> memset(ks, 0, sizeof(struct kgdb_state));
> ks->cpu = raw_smp_processor_id();
> @@ -821,6 +829,15 @@ static int kgdb_panic_event(struct notifier_block *self,
> unsigned long val,
> void *data)
>  {
> +   /*
> +* Avoid entering the debugger if we were triggered due to a panic
> +* We don't want to get stuck waiting for input from user in such 
> case.
> +* panic_timeout indicates the system should automatically
> +* reboot on panic.
> +*/
> +   if (panic_timeout)
> +   return NOTIFY_DONE;
> +
> if (dbg_kdb_mode)
> kdb_printf("PANIC: %s\n", (char *)data);
> kgdb_breakpoint();
> --
> 1.8.2.1
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.

The original patch was more useful as it allowed re-enabling break on
panic on specific devices where you were trying to debug a
reproducible issue.  What about using a module_param similar to
kgdbreboot, but setting the default based on CONFIG_PANIC_TIMEOUT to
avoid extra configuration?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3] debug: prevent entering debug mode on errors

2014-11-26 Thread Colin Cross
On Wed, Nov 26, 2014 at 1:14 AM, Kiran Raparthy kiran.ku...@linaro.org wrote:
 From: Colin Cross ccr...@android.com

 debug: prevent entering debug mode on errors

 On non-developer devices kgdb prevents CONFIG_PANIC_TIMEOUT from rebooting the
 device after a panic.

 In case of panics and exceptions, to honor CONFIG_PANIC_TIMEOUT, prevent
 entering debug mode to avoid getting stuck waiting for the user to interact
 with debugger.

 Cc: Jason Wessel jason.wes...@windriver.com
 Cc: kgdb-bugrep...@lists.sourceforge.net
 Cc: linux-kernel@vger.kernel.org
 Cc: Android Kernel Team kernel-t...@android.com
 Cc: John Stultz john.stu...@linaro.org
 Cc: Sumit Semwal sumit.sem...@linaro.org
 Signed-off-by: Colin Cross ccr...@android.com
 [Kiran: Added context to commit message.
 panic_timeout is used instead of break_on_panic and
 break_on_exception to honor CONFIG_PANIC_TIMEOUT]
 Signed-off-by: Kiran Raparthy kiran.ku...@linaro.org
 ---
  kernel/debug/debug_core.c | 17 +
  1 file changed, 17 insertions(+)

 diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
 index 1adf62b..0012a1f 100644
 --- a/kernel/debug/debug_core.c
 +++ b/kernel/debug/debug_core.c
 @@ -689,6 +689,14 @@ kgdb_handle_exception(int evector, int signo, int ecode, 
 struct pt_regs *regs)

 if (arch_kgdb_ops.enable_nmi)
 arch_kgdb_ops.enable_nmi(0);
 +   /*
 +* Avoid entering the debugger if we were triggered due to an oops
 +* but panic_timeout indicates the system should automatically
 +* reboot on panic. We don't want to get stuck waiting for input
 +* on such systems, especially if its just an oops.
 +*/
 +   if (signo != SIGTRAP  panic_timeout)
 +   return 1;

 memset(ks, 0, sizeof(struct kgdb_state));
 ks-cpu = raw_smp_processor_id();
 @@ -821,6 +829,15 @@ static int kgdb_panic_event(struct notifier_block *self,
 unsigned long val,
 void *data)
  {
 +   /*
 +* Avoid entering the debugger if we were triggered due to a panic
 +* We don't want to get stuck waiting for input from user in such 
 case.
 +* panic_timeout indicates the system should automatically
 +* reboot on panic.
 +*/
 +   if (panic_timeout)
 +   return NOTIFY_DONE;
 +
 if (dbg_kdb_mode)
 kdb_printf(PANIC: %s\n, (char *)data);
 kgdb_breakpoint();
 --
 1.8.2.1

 To unsubscribe from this group and stop receiving emails from it, send an 
 email to kernel-team+unsubscr...@android.com.

The original patch was more useful as it allowed re-enabling break on
panic on specific devices where you were trying to debug a
reproducible issue.  What about using a module_param similar to
kgdbreboot, but setting the default based on CONFIG_PANIC_TIMEOUT to
avoid extra configuration?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ARM: b.L: fix unmet dependency for CPU_PM

2014-11-14 Thread Colin Cross
On Thu, Nov 13, 2014 at 10:10 PM, Pankaj Dubey  wrote:
> +CC: Colin Cross, James Hogan
>
> On Thursday, November 13, 2014 11:30 PM, Nicolas Pitre wrote:
>> To: Russell King - ARM Linux
>> Cc: Pankaj Dubey; linux-arm-ker...@lists.infradead.org; linux-
>> ker...@vger.kernel.org
>> Subject: Re: [PATCH] ARM: b.L: fix unmet dependency for CPU_PM
>>
>> On Thu, 13 Nov 2014, Russell King - ARM Linux wrote:
>>
>> > On Thu, Nov 13, 2014 at 12:44:22PM -0500, Nicolas Pitre wrote:
>> > > On Thu, 13 Nov 2014, Pankaj Dubey wrote:
>> > >
>> > > > If BL_SWITCHER is enabled but SUSPEND and CPU_IDLE is not enabled
>> > > > we are getting following config warning.
>> > > >
>> > > > warning: (BL_SWITCHER) selects CPU_PM which has unmet direct
>> > > > dependencies (SUSPEND || CPU_IDLE)
>> > > >
>> > > > So BL_SWITCHER should enable CPU_PM only if either of SUSPEND or
>> > > > CPU_IDLE is selected.
>> > > >
>> > > > Signed-off-by: Pankaj Dubey 
>> > > > ---
>> > > >  arch/arm/Kconfig |2 +-
>> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index
>> > > > 9d580d0..fe3d969 100644
>> > > > --- a/arch/arm/Kconfig
>> > > > +++ b/arch/arm/Kconfig
>> > > > @@ -1417,7 +1417,7 @@ config BL_SWITCHER
>> > > > bool "big.LITTLE switcher support"
>> > > > depends on BIG_LITTLE && MCPM && HOTPLUG_CPU
>> > > > select ARM_CPU_SUSPEND
>> > > > -   select CPU_PM
>> > > > +   select CPU_PM if (SUSPEND || CPU_IDLE)
>> > >
>> > > NAK.  You just broke the code by doing this. CPU_PM is a requirement
>> > > here.  The dependencies for CPU_PM is lacking.
>> >
>
> OK, got it. Even though compilation worked, but as you mentioned by doing
> this way
> bL_switcher functionality may broke.
>
>> > Is there any real technical reason that CPU_PM depends on SUSPEND ||
>> > CPU_IDLE ?  If not, those dependencies should be killed.
>>
>> Those dependencies look artificial to me.
>>
>
> For me also it looks like these dependencies are artificial.
> As far as I can see CONFIG_CPU_PM compiles following two files
> 1: kernel/cpu_pm.c - I can't see any dependency of SUSPEND or CPU_IDLE in
> this file.
> 2: arch/mips/kernel/pm.c: A quick look does not show any dependency of
> SUSPEND or
>CPU_IDLE here too.
>
> So for me it looks like it is OK to kill these dependencies of CPU_PM.
> Still safer side I am CCing this to author of these files for confirmation.
>
> Thanks,
> Pankaj Dubey
>>
>> Nicolas
>

I agree, the dependencies in kernel/power/Kconfig should be removed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ARM: b.L: fix unmet dependency for CPU_PM

2014-11-14 Thread Colin Cross
On Thu, Nov 13, 2014 at 10:10 PM, Pankaj Dubey pankaj.du...@samsung.com wrote:
 +CC: Colin Cross, James Hogan

 On Thursday, November 13, 2014 11:30 PM, Nicolas Pitre wrote:
 To: Russell King - ARM Linux
 Cc: Pankaj Dubey; linux-arm-ker...@lists.infradead.org; linux-
 ker...@vger.kernel.org
 Subject: Re: [PATCH] ARM: b.L: fix unmet dependency for CPU_PM

 On Thu, 13 Nov 2014, Russell King - ARM Linux wrote:

  On Thu, Nov 13, 2014 at 12:44:22PM -0500, Nicolas Pitre wrote:
   On Thu, 13 Nov 2014, Pankaj Dubey wrote:
  
If BL_SWITCHER is enabled but SUSPEND and CPU_IDLE is not enabled
we are getting following config warning.
   
warning: (BL_SWITCHER) selects CPU_PM which has unmet direct
dependencies (SUSPEND || CPU_IDLE)
   
So BL_SWITCHER should enable CPU_PM only if either of SUSPEND or
CPU_IDLE is selected.
   
Signed-off-by: Pankaj Dubey pankaj.du...@samsung.com
---
 arch/arm/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
   
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index
9d580d0..fe3d969 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1417,7 +1417,7 @@ config BL_SWITCHER
bool big.LITTLE switcher support
depends on BIG_LITTLE  MCPM  HOTPLUG_CPU
select ARM_CPU_SUSPEND
-   select CPU_PM
+   select CPU_PM if (SUSPEND || CPU_IDLE)
  
   NAK.  You just broke the code by doing this. CPU_PM is a requirement
   here.  The dependencies for CPU_PM is lacking.
 

 OK, got it. Even though compilation worked, but as you mentioned by doing
 this way
 bL_switcher functionality may broke.

  Is there any real technical reason that CPU_PM depends on SUSPEND ||
  CPU_IDLE ?  If not, those dependencies should be killed.

 Those dependencies look artificial to me.


 For me also it looks like these dependencies are artificial.
 As far as I can see CONFIG_CPU_PM compiles following two files
 1: kernel/cpu_pm.c - I can't see any dependency of SUSPEND or CPU_IDLE in
 this file.
 2: arch/mips/kernel/pm.c: A quick look does not show any dependency of
 SUSPEND or
CPU_IDLE here too.

 So for me it looks like it is OK to kill these dependencies of CPU_PM.
 Still safer side I am CCing this to author of these files for confirmation.

 Thanks,
 Pankaj Dubey

 Nicolas


I agree, the dependencies in kernel/power/Kconfig should be removed.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order

2014-10-07 Thread Colin Cross
On Tue, Oct 7, 2014 at 9:07 AM, PINTU KUMAR  wrote:
> - Original Message -
>> From: Colin Cross 
>> To: pint...@samsung.com
>> Cc: Laura Abbott ; Heesub Shin 
>> ; "a...@linux-foundation.org" 
>> ; "gre...@linuxfoundation.org" 
>> ; "john.stu...@linaro.org" 
>> ; "rebe...@android.com" ; 
>> "de...@driverdev.osuosl.org" ; 
>> "linux-kernel@vger.kernel.org" ; IQBAL SHAREEF 
>> ; "pintu_agar...@yahoo.com" 
>> ; Vishnu Pratap Singh ; 
>> "c...@samsung.com" 
>> Sent: Monday, 6 October 2014 11:01 PM
>> Subject: Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER 
>> for high order
>>
>> On Mon, Oct 6, 2014 at 9:26 AM, PINTU KUMAR  wrote:
>>
>>>
>>> Hi,
>>> >
>>> > From: Laura Abbott 
>>> >To: Heesub Shin ; Pintu Kumar
>> ; a...@linux-foundation.org;
>> gre...@linuxfoundation.org; john.stu...@linaro.org; rebe...@android.com;
>> ccr...@android.com; de...@driverdev.osuosl.org; linux-kernel@vger.kernel.org
>>> >Cc: iqbal@samsung.com; pintu_agar...@yahoo.com;
>> vishnu...@samsung.com
>>> >Sent: Monday, 6 October 2014 7:37 PM
>>> >Subject: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER
>> for high order
>>> >
>>> >
>>> >On 10/6/2014 3:27 AM, Heesub Shin wrote:
>>> >
>>> >
>>> >
>>> >
>>> >> Hello Kumar,
>>> >>
>>> >> On 10/06/2014 05:31 PM, Pintu Kumar wrote:
>>> >>> The Android ion_system_heap uses allocation fallback mechanism
>>> >>> based on 8,4,0 order pages available in the system.
>>> >>> It changes gfp flags based on higher order allocation request.
>>> >>> This higher order value is hard-coded as 4, instead of using
>>> >>> the system defined higher order value.
>>> >>> Thus replacing this hard-coded value with
>> PAGE_ALLOC_COSTLY_ORDER
>>> >>> which is defined as 3.
>>> >>> This will help mapping the higher order request in system heap
>> with
>>> >>> the actual allocation request.
>>> >>
>>> >> Quite reasonable.
>>> >>
>>> >> Reviewed-by: Heesub Shin 
>>> >>
>>> >> BTW, Anyone knows how the allocation order (8,4 and 0) was
>> decided? I
>>> >> think only Google guys might know the answer.
>>> >>
>>> >> regards,
>>> >> heesub
>>> >>
>>> >
>>> >My understanding was this was completely unrelated to the costly order
>>> >and was related to the page sizes corresponding to IOMMU page sizes
>>> >(1MB, 64K, 4K). This won't make a difference for the uncached page
>>> >pool case but for the not page pool case, I'm not sure if there
>> would
>>> >be a benefit for trying to get 32K pages with some effort vs. just
>>> >going back to 4K pages.
>>>
>>> No, it is not just related to IOMMU case. It comes into picture also for
>>> normal system-heap allocation (without iommu cases).
>>> Also, it is applicable for both uncached and page_pool cases.
>>> Please also check the changes under ion_system_heap_create.
>>> Here the gfp_flags are set under the pool structure.
>>> This value is used in ion_page_pool_alloc_pages.
>>> In both the cases, it internally calls alloc_pages, with this gfp_flags.
>>> Now, during memory pressure scenario, when alloc_pages moves to slowpath
>>> this gfp_flags will be used to decide allocation retry.
>>> In the current code, the higher-order flag is set only when order is
>> greater than 4.
>>> But, in MM, the order 4 is also considered as higher-order request.
>>> This higher-order is decided based on PAGE_ALLOC_COSTLY_ORDER (3) value.
>>> Hence, I think this value should be in sync with the MM code.
>>> >
>>> >Do you have any data/metrics that show a benefit from this patch?
>>> I think it is not related to any data or metrics.
>>> It is about replacing the hard-coded higher-order check to be in sync with
>>> the MM code.
>>>
>>
>> The selection of the orders used for allocation (8, then 4, then 0) is
>> designed to match with the sizes often found in IOMMUs, but this isn't
>> changing the order of the allocation, it is changing the GFP flags
&

Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order

2014-10-07 Thread Colin Cross
On Tue, Oct 7, 2014 at 9:07 AM, PINTU KUMAR pint...@samsung.com wrote:
 - Original Message -
 From: Colin Cross ccr...@android.com
 To: pint...@samsung.com
 Cc: Laura Abbott lau...@codeaurora.org; Heesub Shin 
 heesub.s...@samsung.com; a...@linux-foundation.org 
 a...@linux-foundation.org; gre...@linuxfoundation.org 
 gre...@linuxfoundation.org; john.stu...@linaro.org 
 john.stu...@linaro.org; rebe...@android.com rebe...@android.com; 
 de...@driverdev.osuosl.org de...@driverdev.osuosl.org; 
 linux-kernel@vger.kernel.org linux-kernel@vger.kernel.org; IQBAL SHAREEF 
 iqbal@samsung.com; pintu_agar...@yahoo.com 
 pintu_agar...@yahoo.com; Vishnu Pratap Singh vishnu...@samsung.com; 
 c...@samsung.com c...@samsung.com
 Sent: Monday, 6 October 2014 11:01 PM
 Subject: Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER 
 for high order

 On Mon, Oct 6, 2014 at 9:26 AM, PINTU KUMAR pint...@samsung.com wrote:


 Hi,
 
  From: Laura Abbott lau...@codeaurora.org
 To: Heesub Shin heesub.s...@samsung.com; Pintu Kumar
 pint...@samsung.com; a...@linux-foundation.org;
 gre...@linuxfoundation.org; john.stu...@linaro.org; rebe...@android.com;
 ccr...@android.com; de...@driverdev.osuosl.org; linux-kernel@vger.kernel.org
 Cc: iqbal@samsung.com; pintu_agar...@yahoo.com;
 vishnu...@samsung.com
 Sent: Monday, 6 October 2014 7:37 PM
 Subject: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER
 for high order
 
 
 On 10/6/2014 3:27 AM, Heesub Shin wrote:
 
 
 
 
  Hello Kumar,
 
  On 10/06/2014 05:31 PM, Pintu Kumar wrote:
  The Android ion_system_heap uses allocation fallback mechanism
  based on 8,4,0 order pages available in the system.
  It changes gfp flags based on higher order allocation request.
  This higher order value is hard-coded as 4, instead of using
  the system defined higher order value.
  Thus replacing this hard-coded value with
 PAGE_ALLOC_COSTLY_ORDER
  which is defined as 3.
  This will help mapping the higher order request in system heap
 with
  the actual allocation request.
 
  Quite reasonable.
 
  Reviewed-by: Heesub Shin heesub.s...@samsung.com
 
  BTW, Anyone knows how the allocation order (8,4 and 0) was
 decided? I
  think only Google guys might know the answer.
 
  regards,
  heesub
 
 
 My understanding was this was completely unrelated to the costly order
 and was related to the page sizes corresponding to IOMMU page sizes
 (1MB, 64K, 4K). This won't make a difference for the uncached page
 pool case but for the not page pool case, I'm not sure if there
 would
 be a benefit for trying to get 32K pages with some effort vs. just
 going back to 4K pages.

 No, it is not just related to IOMMU case. It comes into picture also for
 normal system-heap allocation (without iommu cases).
 Also, it is applicable for both uncached and page_pool cases.
 Please also check the changes under ion_system_heap_create.
 Here the gfp_flags are set under the pool structure.
 This value is used in ion_page_pool_alloc_pages.
 In both the cases, it internally calls alloc_pages, with this gfp_flags.
 Now, during memory pressure scenario, when alloc_pages moves to slowpath
 this gfp_flags will be used to decide allocation retry.
 In the current code, the higher-order flag is set only when order is
 greater than 4.
 But, in MM, the order 4 is also considered as higher-order request.
 This higher-order is decided based on PAGE_ALLOC_COSTLY_ORDER (3) value.
 Hence, I think this value should be in sync with the MM code.
 
 Do you have any data/metrics that show a benefit from this patch?
 I think it is not related to any data or metrics.
 It is about replacing the hard-coded higher-order check to be in sync with
 the MM code.


 The selection of the orders used for allocation (8, then 4, then 0) is
 designed to match with the sizes often found in IOMMUs, but this isn't
 changing the order of the allocation, it is changing the GFP flags
 used for the order 4 allocation.  Right now we are using the
 low_order_gfp_flags for order 4, this patch would change it to use
 high_order_gfp_flags.  We originally used low_order_gfp_flags here
 because the MM subsystem can usually satisfy these allocations, and
 the additional load placed on the MM subsystem to kick off kswapd to
 free up more order 4 chunks is generally worth it.  Using order 4
 pages instead of order 0 pages can significantly improve the
 performance of many IOMMUs by reducing TLB pressure and time spent
 updating page tables.  Unless you have data showing that this improves
 something, and doesn't just cause all allocations to be order 0 when
 under memory pressure, I don't suggest merging this.


 Ok agree. It is worth retrying the allocation with order-4 pages.
 But, since 4 is considered higher order for MM and is greater than 
 PAGE_ALLOC_COSTLY_ORDER.
 I guess the retrying will not happen, because of the following check in 
 page_alloc:
 if (order  PAGE_ALLOC_COSTLY_ORDER)
 goto nopage

Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order

2014-10-06 Thread Colin Cross
On Mon, Oct 6, 2014 at 9:26 AM, PINTU KUMAR  wrote:
>
> Hi,
> >
> > From: Laura Abbott 
> >To: Heesub Shin ; Pintu Kumar 
> >; a...@linux-foundation.org; 
> >gre...@linuxfoundation.org; john.stu...@linaro.org; rebe...@android.com; 
> >ccr...@android.com; de...@driverdev.osuosl.org; linux-kernel@vger.kernel.org
> >Cc: iqbal@samsung.com; pintu_agar...@yahoo.com; vishnu...@samsung.com
> >Sent: Monday, 6 October 2014 7:37 PM
> >Subject: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for 
> >high order
> >
> >
> >On 10/6/2014 3:27 AM, Heesub Shin wrote:
> >
> >
> >
> >
> >> Hello Kumar,
> >>
> >> On 10/06/2014 05:31 PM, Pintu Kumar wrote:
> >>> The Android ion_system_heap uses allocation fallback mechanism
> >>> based on 8,4,0 order pages available in the system.
> >>> It changes gfp flags based on higher order allocation request.
> >>> This higher order value is hard-coded as 4, instead of using
> >>> the system defined higher order value.
> >>> Thus replacing this hard-coded value with PAGE_ALLOC_COSTLY_ORDER
> >>> which is defined as 3.
> >>> This will help mapping the higher order request in system heap with
> >>> the actual allocation request.
> >>
> >> Quite reasonable.
> >>
> >> Reviewed-by: Heesub Shin 
> >>
> >> BTW, Anyone knows how the allocation order (8,4 and 0) was decided? I
> >> think only Google guys might know the answer.
> >>
> >> regards,
> >> heesub
> >>
> >
> >My understanding was this was completely unrelated to the costly order
> >and was related to the page sizes corresponding to IOMMU page sizes
> >(1MB, 64K, 4K). This won't make a difference for the uncached page
> >pool case but for the not page pool case, I'm not sure if there would
> >be a benefit for trying to get 32K pages with some effort vs. just
> >going back to 4K pages.
>
> No, it is not just related to IOMMU case. It comes into picture also for
> normal system-heap allocation (without iommu cases).
> Also, it is applicable for both uncached and page_pool cases.
> Please also check the changes under ion_system_heap_create.
> Here the gfp_flags are set under the pool structure.
> This value is used in ion_page_pool_alloc_pages.
> In both the cases, it internally calls alloc_pages, with this gfp_flags.
> Now, during memory pressure scenario, when alloc_pages moves to slowpath
> this gfp_flags will be used to decide allocation retry.
> In the current code, the higher-order flag is set only when order is greater 
> than 4.
> But, in MM, the order 4 is also considered as higher-order request.
> This higher-order is decided based on PAGE_ALLOC_COSTLY_ORDER (3) value.
> Hence, I think this value should be in sync with the MM code.
> >
> >Do you have any data/metrics that show a benefit from this patch?
> I think it is not related to any data or metrics.
> It is about replacing the hard-coded higher-order check to be in sync with
> the MM code.
>

The selection of the orders used for allocation (8, then 4, then 0) is
designed to match with the sizes often found in IOMMUs, but this isn't
changing the order of the allocation, it is changing the GFP flags
used for the order 4 allocation.  Right now we are using the
low_order_gfp_flags for order 4, this patch would change it to use
high_order_gfp_flags.  We originally used low_order_gfp_flags here
because the MM subsystem can usually satisfy these allocations, and
the additional load placed on the MM subsystem to kick off kswapd to
free up more order 4 chunks is generally worth it.  Using order 4
pages instead of order 0 pages can significantly improve the
performance of many IOMMUs by reducing TLB pressure and time spent
updating page tables.  Unless you have data showing that this improves
something, and doesn't just cause all allocations to be order 0 when
under memory pressure, I don't suggest merging this.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order

2014-10-06 Thread Colin Cross
On Mon, Oct 6, 2014 at 9:26 AM, PINTU KUMAR pint...@samsung.com wrote:

 Hi,
 
  From: Laura Abbott lau...@codeaurora.org
 To: Heesub Shin heesub.s...@samsung.com; Pintu Kumar 
 pint...@samsung.com; a...@linux-foundation.org; 
 gre...@linuxfoundation.org; john.stu...@linaro.org; rebe...@android.com; 
 ccr...@android.com; de...@driverdev.osuosl.org; linux-kernel@vger.kernel.org
 Cc: iqbal@samsung.com; pintu_agar...@yahoo.com; vishnu...@samsung.com
 Sent: Monday, 6 October 2014 7:37 PM
 Subject: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for 
 high order
 
 
 On 10/6/2014 3:27 AM, Heesub Shin wrote:
 
 
 
 
  Hello Kumar,
 
  On 10/06/2014 05:31 PM, Pintu Kumar wrote:
  The Android ion_system_heap uses allocation fallback mechanism
  based on 8,4,0 order pages available in the system.
  It changes gfp flags based on higher order allocation request.
  This higher order value is hard-coded as 4, instead of using
  the system defined higher order value.
  Thus replacing this hard-coded value with PAGE_ALLOC_COSTLY_ORDER
  which is defined as 3.
  This will help mapping the higher order request in system heap with
  the actual allocation request.
 
  Quite reasonable.
 
  Reviewed-by: Heesub Shin heesub.s...@samsung.com
 
  BTW, Anyone knows how the allocation order (8,4 and 0) was decided? I
  think only Google guys might know the answer.
 
  regards,
  heesub
 
 
 My understanding was this was completely unrelated to the costly order
 and was related to the page sizes corresponding to IOMMU page sizes
 (1MB, 64K, 4K). This won't make a difference for the uncached page
 pool case but for the not page pool case, I'm not sure if there would
 be a benefit for trying to get 32K pages with some effort vs. just
 going back to 4K pages.

 No, it is not just related to IOMMU case. It comes into picture also for
 normal system-heap allocation (without iommu cases).
 Also, it is applicable for both uncached and page_pool cases.
 Please also check the changes under ion_system_heap_create.
 Here the gfp_flags are set under the pool structure.
 This value is used in ion_page_pool_alloc_pages.
 In both the cases, it internally calls alloc_pages, with this gfp_flags.
 Now, during memory pressure scenario, when alloc_pages moves to slowpath
 this gfp_flags will be used to decide allocation retry.
 In the current code, the higher-order flag is set only when order is greater 
 than 4.
 But, in MM, the order 4 is also considered as higher-order request.
 This higher-order is decided based on PAGE_ALLOC_COSTLY_ORDER (3) value.
 Hence, I think this value should be in sync with the MM code.
 
 Do you have any data/metrics that show a benefit from this patch?
 I think it is not related to any data or metrics.
 It is about replacing the hard-coded higher-order check to be in sync with
 the MM code.


The selection of the orders used for allocation (8, then 4, then 0) is
designed to match with the sizes often found in IOMMUs, but this isn't
changing the order of the allocation, it is changing the GFP flags
used for the order 4 allocation.  Right now we are using the
low_order_gfp_flags for order 4, this patch would change it to use
high_order_gfp_flags.  We originally used low_order_gfp_flags here
because the MM subsystem can usually satisfy these allocations, and
the additional load placed on the MM subsystem to kick off kswapd to
free up more order 4 chunks is generally worth it.  Using order 4
pages instead of order 0 pages can significantly improve the
performance of many IOMMUs by reducing TLB pressure and time spent
updating page tables.  Unless you have data showing that this improves
something, and doesn't just cause all allocations to be order 0 when
under memory pressure, I don't suggest merging this.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: staging: Unwritten function for ion_carveout_heap.c

2014-07-23 Thread Colin Cross
On Wed, Jul 23, 2014 at 1:04 PM, Nick Krause  wrote:
> Hey Greg and others.
> Sorry for another email but it seems the function,
> ion_carveout_heap_unmap_dma is
> just returning and not doing anything useful. Furthermore I am new so
> I don't known how
> to write this function but this may  be causing  some rather  serious
> bugs as if the dma heap
> is not unmaped and we  call this function a lot this will make the
> kernel not able to handle dma requests
> for this driver and other drivers that need this and in turn lead to
> a oops or even a kernel panic due to leaked
> dma allocated memory. I would recommend writing this function or
> helping me write it in
> other to avoid some rather serious bugs without a proper dma unmapping
> function for this driver  :).
> Nick

Look at ion_carveout_heap_map_dma - it doesn't do anything, it just
returns a pre-existing virtual address.  That means there is nothing
to do in unmap.

map_dma is actually a bit of a misnomer here, as the actual mapping is
done in ion_map_dma_buf.  All ion_carveout_heap_map_dma does is return
the sg table for ion_map_dma_buf to pass to dma_map_sg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: staging: Unwritten function for ion_carveout_heap.c

2014-07-23 Thread Colin Cross
On Wed, Jul 23, 2014 at 1:04 PM, Nick Krause xerofo...@gmail.com wrote:
 Hey Greg and others.
 Sorry for another email but it seems the function,
 ion_carveout_heap_unmap_dma is
 just returning and not doing anything useful. Furthermore I am new so
 I don't known how
 to write this function but this may  be causing  some rather  serious
 bugs as if the dma heap
 is not unmaped and we  call this function a lot this will make the
 kernel not able to handle dma requests
 for this driver and other drivers that need this and in turn lead to
 a oops or even a kernel panic due to leaked
 dma allocated memory. I would recommend writing this function or
 helping me write it in
 other to avoid some rather serious bugs without a proper dma unmapping
 function for this driver  :).
 Nick

Look at ion_carveout_heap_map_dma - it doesn't do anything, it just
returns a pre-existing virtual address.  That means there is nothing
to do in unmap.

map_dma is actually a bit of a misnomer here, as the actual mapping is
done in ion_map_dma_buf.  All ion_carveout_heap_map_dma does is return
the sg table for ion_map_dma_buf to pass to dma_map_sg.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers: staging: android: ion: Kconfig: Let it also depend on HAS_DMA

2014-07-09 Thread Colin Cross
On Mon, Jul 7, 2014 at 1:49 AM, Chen Gang  wrote:
> ION need HAS_DMA (e.g. need DMA_SHARED_BUFFER), so it has to depend on
> HAS_DMA, or can not pass compiling with allmodconfig under score which
> NO_DMA.  And the related error:
>
> CC  drivers/staging/android/ion/ion_cma_heap.o
>   drivers/staging/android/ion/ion_cma_heap.c: In function 'ion_cma_mmap':
>   drivers/staging/android/ion/ion_cma_heap.c:168:2: error: implicit 
> declaration of function 'dma_mmap_coherent' 
> [-Werror=implicit-function-declaration]
> return dma_mmap_coherent(dev, vma, info->cpu_addr, info->handle,
> ^
>   cc1: some warnings being treated as errors
>   make[4]: *** [drivers/staging/android/ion/ion_cma_heap.o] Error 1
>   make[3]: *** [drivers/staging/android/ion] Error 2
>   make[2]: *** [drivers/staging/android] Error 2
>   make[1]: *** [drivers/staging] Error 2
>   make: *** [drivers] Error 2
>
>
> Signed-off-by: Chen Gang 
> ---
>  drivers/staging/android/ion/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/Kconfig 
> b/drivers/staging/android/ion/Kconfig
> index 0f8fec1..0a6e4d0 100644
> --- a/drivers/staging/android/ion/Kconfig
> +++ b/drivers/staging/android/ion/Kconfig
> @@ -1,6 +1,6 @@
>  menuconfig ION
> bool "Ion Memory Manager"
> -   depends on HAVE_MEMBLOCK
> +   depends on HAVE_MEMBLOCK && HAS_DMA
>     select GENERIC_ALLOCATOR
> select DMA_SHARED_BUFFER
> ---help---
> --
> 1.9.2.459.g68773ac

Acked-by: Colin Cross 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers: staging: android: ion: Kconfig: Let it also depend on HAS_DMA

2014-07-09 Thread Colin Cross
On Mon, Jul 7, 2014 at 1:49 AM, Chen Gang gang.chen.5...@gmail.com wrote:
 ION need HAS_DMA (e.g. need DMA_SHARED_BUFFER), so it has to depend on
 HAS_DMA, or can not pass compiling with allmodconfig under score which
 NO_DMA.  And the related error:

 CC  drivers/staging/android/ion/ion_cma_heap.o
   drivers/staging/android/ion/ion_cma_heap.c: In function 'ion_cma_mmap':
   drivers/staging/android/ion/ion_cma_heap.c:168:2: error: implicit 
 declaration of function 'dma_mmap_coherent' 
 [-Werror=implicit-function-declaration]
 return dma_mmap_coherent(dev, vma, info-cpu_addr, info-handle,
 ^
   cc1: some warnings being treated as errors
   make[4]: *** [drivers/staging/android/ion/ion_cma_heap.o] Error 1
   make[3]: *** [drivers/staging/android/ion] Error 2
   make[2]: *** [drivers/staging/android] Error 2
   make[1]: *** [drivers/staging] Error 2
   make: *** [drivers] Error 2


 Signed-off-by: Chen Gang gang.chen.5...@gmail.com
 ---
  drivers/staging/android/ion/Kconfig | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/staging/android/ion/Kconfig 
 b/drivers/staging/android/ion/Kconfig
 index 0f8fec1..0a6e4d0 100644
 --- a/drivers/staging/android/ion/Kconfig
 +++ b/drivers/staging/android/ion/Kconfig
 @@ -1,6 +1,6 @@
  menuconfig ION
 bool Ion Memory Manager
 -   depends on HAVE_MEMBLOCK
 +   depends on HAVE_MEMBLOCK  HAS_DMA
 select GENERIC_ALLOCATOR
 select DMA_SHARED_BUFFER
 ---help---
 --
 1.9.2.459.g68773ac

Acked-by: Colin Cross ccr...@android.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ARM64: implement TASK_SIZE_OF

2014-06-20 Thread Colin Cross
On Fri, Jun 20, 2014 at 1:25 AM, Will Deacon  wrote:
> Hi Colin,
>
> On Wed, Jun 18, 2014 at 09:10:09PM +0100, Colin Cross wrote:
>> include/linux/sched.h implements TASK_SIZE_OF as TASK_SIZE if it
>> is not set by the architecture headers.  TASK_SIZE uses the
>> current task to determine the size of the virtual address space.
>> On a 64-bit kernel this will cause reading /proc/pid/pagemap of a
>> 64-bit process from a 32-bit process to return EOF when it reads
>> past 0x.
>>
>> Implement TASK_SIZE_OF exactly the same as TASK_SIZE with
>> test_tsk_thread_flag instead of test_thread_flag.
>
> Looks sane to me.
>
>   Acked-by: Will Deacon 
>
> I take it you're being bitten by this in real software, hence the CC stable
> tag?
>
> Will

Yes, it causes memory statistics produced by Android's procrank tool
to be completely wrong.  It applies cleanly to at least 3.10.44,
3.12.22, and 3.14.8.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ARM64: implement TASK_SIZE_OF

2014-06-20 Thread Colin Cross
On Fri, Jun 20, 2014 at 1:25 AM, Will Deacon will.dea...@arm.com wrote:
 Hi Colin,

 On Wed, Jun 18, 2014 at 09:10:09PM +0100, Colin Cross wrote:
 include/linux/sched.h implements TASK_SIZE_OF as TASK_SIZE if it
 is not set by the architecture headers.  TASK_SIZE uses the
 current task to determine the size of the virtual address space.
 On a 64-bit kernel this will cause reading /proc/pid/pagemap of a
 64-bit process from a 32-bit process to return EOF when it reads
 past 0x.

 Implement TASK_SIZE_OF exactly the same as TASK_SIZE with
 test_tsk_thread_flag instead of test_thread_flag.

 Looks sane to me.

   Acked-by: Will Deacon will.dea...@arm.com

 I take it you're being bitten by this in real software, hence the CC stable
 tag?

 Will

Yes, it causes memory statistics produced by Android's procrank tool
to be completely wrong.  It applies cleanly to at least 3.10.44,
3.12.22, and 3.14.8.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST PATCH 4/8] android: convert sync to fence api, v5

2014-06-19 Thread Colin Cross
On Thu, Jun 19, 2014 at 5:28 AM, Daniel Vetter  wrote:
> On Thu, Jun 19, 2014 at 1:48 PM, Thierry Reding
>  wrote:
>>> > With these changes, can we pull the android sync logic out of
>>> > drivers/staging/ now?
>>>
>>> Afaik the google guys never really looked at this and acked it. So I'm not
>>> sure whether they'll follow along. The other issue I have as the
>>> maintainer of gfx driver is that I don't want to implement support for two
>>> different sync object primitives (once for dma-buf and once for android
>>> syncpts), and my impression thus far has been that even with this we're
>>> not there.
>>>
>>> I'm trying to get our own android guys to upstream their i915 syncpts
>>> support, but thus far I haven't managed to convince them to throw people's
>>> time at this.
>>
>> This has been discussed a fair bit internally recently and some of our
>> GPU experts have raised concerns that this may result in seriously
>> degraded performance in our proprietary graphics stack. Now I don't care
>> very much for the proprietary graphics stack, but by extension I would
>> assume that the same restrictions are relevant for any open-source
>> driver as well.
>>
>> I'm still trying to fully understand all the implications and at the
>> same time get some of the people who raised concerns to join in this
>> discussion. As I understand it the concern is mostly about explicit vs.
>> implicit synchronization and having this mechanism in the kernel will
>> implicitly synchronize all accesses to these buffers even in cases where
>> it's not needed (read vs. write locks, etc.). In one particular instance
>> it was even mentioned that this kind of implicit synchronization can
>> lead to deadlocks in some use-cases (this was mentioned for Android
>> compositing, but I suspect that the same may happen for Wayland or X
>> compositors).
>
> Well the implicit fences here actually can't deadlock. That's the
> entire point behind using ww mutexes. I've also heard tons of
> complaints about implicit enforced syncing (especially from opencl
> people), but in the end drivers and always expose unsynchronized
> access for specific cases. We do that in i915 for upload buffers and
> other fun stuff. This is about shared stuff across different drivers
> and different processes.
>
> I also expect that i915 will loose implicit syncing in a few upcoming
> hw modes because explicit syncing is a more natural fit there.
>
> All that isn't about the discussion at hand imo since no matter what
> i915 needs to have on internal representation for a bit of gpu work,
> and afaics right now we don't have that. With this patch android
> syncpts use Maarten's fences internally, but I can't freely exchange
> one for the other. So in i915 I still expect to get stuck with both of
> them, which is one too many.
>
> The other issue (and I haven't dug into details that much) I have with
> syncpts are some of the interface choices. Apparently you can commit a
> fence after creation (or at least the hw composer interface works like
> that) which means userspace can construct deadlocks with android
> syncpts if I'm not super careful in my driver. I haven't seen any
> generic code to do that, so I presume everyone just blindly trusts
> surface-flinger to not do that. Speaks of the average quality of an
> android gfx driver if the kernel is less trusted than the compositor
> in userspace ...

Android sync is designed not to allow userspace to deadlock the
kernel, a sync_pt should only get created by the kernel when it has
received a chunk of work that it expects to complete in the near
future.  The CONFIG_SW_SYNC_USER driver violates that by allowing
userspace to create and signal arbitrary sync points, but that is
intended only for testing sync.

> There's a few other things like exposing timestamps (which are tricky
> to do right, our driver is littered with wrong attempts) and other
> details.

Timestamps are necessary for vsync synchronization to reduce the frame latency.

> Finally I've never seen anyone from google or any android product guy
> push a real driver enabling for syncpts to upstream, and google itself
> has a bit a history of constantly exchanging their gfx framework for
> the next best thing. So I really doubt this is worthwhile to pursue in
> upstream with our essentially eternal api guarantees. At least until
> we see serious uptake from vendors and gfx driver guys. Unfortunately
> the Intel android folks are no exception here and haven't pushed
> anything like this in my direction yet at all. Despite multiple pokes
> from my side.

As far as I know, every SoC vendor that supports android is using sync
now, but none of them have succeeded in pushing their drivers upstream
for a variety of other reasons (interfaces only used by closed source
userspaces, KMS/DRM vs ADF, ION, etc.).

> So from my side I think we should move ahead with Maarten's work and
> figure the android side out once there's real interest.

As long as that doesn't involve 

Re: [REPOST PATCH 4/8] android: convert sync to fence api, v5

2014-06-19 Thread Colin Cross
On Wed, Jun 18, 2014 at 11:37 PM, Daniel Vetter  wrote:
> On Wed, Jun 18, 2014 at 06:15:56PM -0700, Greg KH wrote:
>> On Wed, Jun 18, 2014 at 12:37:11PM +0200, Maarten Lankhorst wrote:
>> > Just to show it's easy.
>> >
>> > Android syncpoints can be mapped to a timeline. This removes the need
>> > to maintain a separate api for synchronization. I've left the android
>> > trace events in place, but the core fence events should already be
>> > sufficient for debugging.
>> >
>> > v2:
>> > - Call fence_remove_callback in sync_fence_free if not all fences have 
>> > fired.
>> > v3:
>> > - Merge Colin Cross' bugfixes, and the android fence merge optimization.
>> > v4:
>> > - Merge with the upstream fixes.
>> > v5:
>> > - Fix small style issues pointed out by Thomas Hellstrom.
>> >
>> > Signed-off-by: Maarten Lankhorst 
>> > Acked-by: John Stultz 
>> > ---
>> >  drivers/staging/android/Kconfig  |1
>> >  drivers/staging/android/Makefile |2
>> >  drivers/staging/android/sw_sync.c|6
>> >  drivers/staging/android/sync.c   |  913 
>> > +++---
>> >  drivers/staging/android/sync.h   |   79 ++-
>> >  drivers/staging/android/sync_debug.c |  247 +
>> >  drivers/staging/android/trace/sync.h |   12
>> >  7 files changed, 609 insertions(+), 651 deletions(-)
>> >  create mode 100644 drivers/staging/android/sync_debug.c
>>
>> With these changes, can we pull the android sync logic out of
>> drivers/staging/ now?
>
> Afaik the google guys never really looked at this and acked it. So I'm not
> sure whether they'll follow along. The other issue I have as the
> maintainer of gfx driver is that I don't want to implement support for two
> different sync object primitives (once for dma-buf and once for android
> syncpts), and my impression thus far has been that even with this we're
> not there.

We have tested these patches to use dma fences to back the android
sync driver and not found any major issues.  However, my understanding
is that dma fences are designed for implicit sync, and explicit sync
through the android sync driver is bolted on the side to share code.
Android is not moving away from explicit sync, but we do wrap all of
our userspace sync accesses through libsync
(https://android.googlesource.com/platform/system/core/+/master/libsync/sync.c,
ignore the sw_sync parts), so if the kernel supported a slightly
different userspace explicit sync interface we could adapt to it
fairly easily.  All we require is that individual kernel drivers need
to be able to accept work alongisde an fd to wait on, and to return an
fd that will signal when the work is done, and that userspace has some
way to merge two of those fds, wait on an fd, and get some debugging
info from an fd.  However, this patch set doesn't do that, it has no
way to export a dma fence as an fd except through the android sync
driver, so it is not yet ready to fully replace android sync.

> I'm trying to get our own android guys to upstream their i915 syncpts
> support, but thus far I haven't managed to convince them to throw people's
> time at this.
>
> It looks like a step into the right direction, but until I have the proof
> in the form of i915 patches that I won't have to support 2 gfx fencing
> frameworks I'm opposed to de-staging android syncpts. Ofc someone else
> could do that too, but besides i915 I don't see a full-fledged (modeset
> side only kinda doesn't count) upstream gfx driver shipping on android.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST PATCH 4/8] android: convert sync to fence api, v5

2014-06-19 Thread Colin Cross
On Wed, Jun 18, 2014 at 11:37 PM, Daniel Vetter dan...@ffwll.ch wrote:
 On Wed, Jun 18, 2014 at 06:15:56PM -0700, Greg KH wrote:
 On Wed, Jun 18, 2014 at 12:37:11PM +0200, Maarten Lankhorst wrote:
  Just to show it's easy.
 
  Android syncpoints can be mapped to a timeline. This removes the need
  to maintain a separate api for synchronization. I've left the android
  trace events in place, but the core fence events should already be
  sufficient for debugging.
 
  v2:
  - Call fence_remove_callback in sync_fence_free if not all fences have 
  fired.
  v3:
  - Merge Colin Cross' bugfixes, and the android fence merge optimization.
  v4:
  - Merge with the upstream fixes.
  v5:
  - Fix small style issues pointed out by Thomas Hellstrom.
 
  Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
  Acked-by: John Stultz john.stu...@linaro.org
  ---
   drivers/staging/android/Kconfig  |1
   drivers/staging/android/Makefile |2
   drivers/staging/android/sw_sync.c|6
   drivers/staging/android/sync.c   |  913 
  +++---
   drivers/staging/android/sync.h   |   79 ++-
   drivers/staging/android/sync_debug.c |  247 +
   drivers/staging/android/trace/sync.h |   12
   7 files changed, 609 insertions(+), 651 deletions(-)
   create mode 100644 drivers/staging/android/sync_debug.c

 With these changes, can we pull the android sync logic out of
 drivers/staging/ now?

 Afaik the google guys never really looked at this and acked it. So I'm not
 sure whether they'll follow along. The other issue I have as the
 maintainer of gfx driver is that I don't want to implement support for two
 different sync object primitives (once for dma-buf and once for android
 syncpts), and my impression thus far has been that even with this we're
 not there.

We have tested these patches to use dma fences to back the android
sync driver and not found any major issues.  However, my understanding
is that dma fences are designed for implicit sync, and explicit sync
through the android sync driver is bolted on the side to share code.
Android is not moving away from explicit sync, but we do wrap all of
our userspace sync accesses through libsync
(https://android.googlesource.com/platform/system/core/+/master/libsync/sync.c,
ignore the sw_sync parts), so if the kernel supported a slightly
different userspace explicit sync interface we could adapt to it
fairly easily.  All we require is that individual kernel drivers need
to be able to accept work alongisde an fd to wait on, and to return an
fd that will signal when the work is done, and that userspace has some
way to merge two of those fds, wait on an fd, and get some debugging
info from an fd.  However, this patch set doesn't do that, it has no
way to export a dma fence as an fd except through the android sync
driver, so it is not yet ready to fully replace android sync.

 I'm trying to get our own android guys to upstream their i915 syncpts
 support, but thus far I haven't managed to convince them to throw people's
 time at this.

 It looks like a step into the right direction, but until I have the proof
 in the form of i915 patches that I won't have to support 2 gfx fencing
 frameworks I'm opposed to de-staging android syncpts. Ofc someone else
 could do that too, but besides i915 I don't see a full-fledged (modeset
 side only kinda doesn't count) upstream gfx driver shipping on android.
 -Daniel
 --
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST PATCH 4/8] android: convert sync to fence api, v5

2014-06-19 Thread Colin Cross
On Thu, Jun 19, 2014 at 5:28 AM, Daniel Vetter dan...@ffwll.ch wrote:
 On Thu, Jun 19, 2014 at 1:48 PM, Thierry Reding
 thierry.red...@gmail.com wrote:
  With these changes, can we pull the android sync logic out of
  drivers/staging/ now?

 Afaik the google guys never really looked at this and acked it. So I'm not
 sure whether they'll follow along. The other issue I have as the
 maintainer of gfx driver is that I don't want to implement support for two
 different sync object primitives (once for dma-buf and once for android
 syncpts), and my impression thus far has been that even with this we're
 not there.

 I'm trying to get our own android guys to upstream their i915 syncpts
 support, but thus far I haven't managed to convince them to throw people's
 time at this.

 This has been discussed a fair bit internally recently and some of our
 GPU experts have raised concerns that this may result in seriously
 degraded performance in our proprietary graphics stack. Now I don't care
 very much for the proprietary graphics stack, but by extension I would
 assume that the same restrictions are relevant for any open-source
 driver as well.

 I'm still trying to fully understand all the implications and at the
 same time get some of the people who raised concerns to join in this
 discussion. As I understand it the concern is mostly about explicit vs.
 implicit synchronization and having this mechanism in the kernel will
 implicitly synchronize all accesses to these buffers even in cases where
 it's not needed (read vs. write locks, etc.). In one particular instance
 it was even mentioned that this kind of implicit synchronization can
 lead to deadlocks in some use-cases (this was mentioned for Android
 compositing, but I suspect that the same may happen for Wayland or X
 compositors).

 Well the implicit fences here actually can't deadlock. That's the
 entire point behind using ww mutexes. I've also heard tons of
 complaints about implicit enforced syncing (especially from opencl
 people), but in the end drivers and always expose unsynchronized
 access for specific cases. We do that in i915 for upload buffers and
 other fun stuff. This is about shared stuff across different drivers
 and different processes.

 I also expect that i915 will loose implicit syncing in a few upcoming
 hw modes because explicit syncing is a more natural fit there.

 All that isn't about the discussion at hand imo since no matter what
 i915 needs to have on internal representation for a bit of gpu work,
 and afaics right now we don't have that. With this patch android
 syncpts use Maarten's fences internally, but I can't freely exchange
 one for the other. So in i915 I still expect to get stuck with both of
 them, which is one too many.

 The other issue (and I haven't dug into details that much) I have with
 syncpts are some of the interface choices. Apparently you can commit a
 fence after creation (or at least the hw composer interface works like
 that) which means userspace can construct deadlocks with android
 syncpts if I'm not super careful in my driver. I haven't seen any
 generic code to do that, so I presume everyone just blindly trusts
 surface-flinger to not do that. Speaks of the average quality of an
 android gfx driver if the kernel is less trusted than the compositor
 in userspace ...

Android sync is designed not to allow userspace to deadlock the
kernel, a sync_pt should only get created by the kernel when it has
received a chunk of work that it expects to complete in the near
future.  The CONFIG_SW_SYNC_USER driver violates that by allowing
userspace to create and signal arbitrary sync points, but that is
intended only for testing sync.

 There's a few other things like exposing timestamps (which are tricky
 to do right, our driver is littered with wrong attempts) and other
 details.

Timestamps are necessary for vsync synchronization to reduce the frame latency.

 Finally I've never seen anyone from google or any android product guy
 push a real driver enabling for syncpts to upstream, and google itself
 has a bit a history of constantly exchanging their gfx framework for
 the next best thing. So I really doubt this is worthwhile to pursue in
 upstream with our essentially eternal api guarantees. At least until
 we see serious uptake from vendors and gfx driver guys. Unfortunately
 the Intel android folks are no exception here and haven't pushed
 anything like this in my direction yet at all. Despite multiple pokes
 from my side.

As far as I know, every SoC vendor that supports android is using sync
now, but none of them have succeeded in pushing their drivers upstream
for a variety of other reasons (interfaces only used by closed source
userspaces, KMS/DRM vs ADF, ION, etc.).

 So from my side I think we should move ahead with Maarten's work and
 figure the android side out once there's real interest.

As long as that doesn't involve removing the Android sync interfaces
from staging until dma fence fds are 

[PATCH] ARM64: implement TASK_SIZE_OF

2014-06-18 Thread Colin Cross
include/linux/sched.h implements TASK_SIZE_OF as TASK_SIZE if it
is not set by the architecture headers.  TASK_SIZE uses the
current task to determine the size of the virtual address space.
On a 64-bit kernel this will cause reading /proc/pid/pagemap of a
64-bit process from a 32-bit process to return EOF when it reads
past 0x.

Implement TASK_SIZE_OF exactly the same as TASK_SIZE with
test_tsk_thread_flag instead of test_thread_flag.

Cc: sta...@vger.kernel.org
Signed-off-by: Colin Cross 
---
 arch/arm64/include/asm/memory.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 993bce527b85..902eb708804a 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -56,6 +56,8 @@
 #define TASK_SIZE_32   UL(0x1)
 #define TASK_SIZE  (test_thread_flag(TIF_32BIT) ? \
TASK_SIZE_32 : TASK_SIZE_64)
+#define TASK_SIZE_OF(tsk)  (test_tsk_thread_flag(tsk, TIF_32BIT) ? \
+   TASK_SIZE_32 : TASK_SIZE_64)
 #else
 #define TASK_SIZE  TASK_SIZE_64
 #endif /* CONFIG_COMPAT */
-- 
2.0.0.526.g5318336

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ARM64: implement TASK_SIZE_OF

2014-06-18 Thread Colin Cross
include/linux/sched.h implements TASK_SIZE_OF as TASK_SIZE if it
is not set by the architecture headers.  TASK_SIZE uses the
current task to determine the size of the virtual address space.
On a 64-bit kernel this will cause reading /proc/pid/pagemap of a
64-bit process from a 32-bit process to return EOF when it reads
past 0x.

Implement TASK_SIZE_OF exactly the same as TASK_SIZE with
test_tsk_thread_flag instead of test_thread_flag.

Cc: sta...@vger.kernel.org
Signed-off-by: Colin Cross ccr...@android.com
---
 arch/arm64/include/asm/memory.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 993bce527b85..902eb708804a 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -56,6 +56,8 @@
 #define TASK_SIZE_32   UL(0x1)
 #define TASK_SIZE  (test_thread_flag(TIF_32BIT) ? \
TASK_SIZE_32 : TASK_SIZE_64)
+#define TASK_SIZE_OF(tsk)  (test_tsk_thread_flag(tsk, TIF_32BIT) ? \
+   TASK_SIZE_32 : TASK_SIZE_64)
 #else
 #define TASK_SIZE  TASK_SIZE_64
 #endif /* CONFIG_COMPAT */
-- 
2.0.0.526.g5318336

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging: ion: WARN when the handle kmap_cnt is going to wrap around

2014-05-22 Thread Colin Cross
On Thu, May 22, 2014 at 5:51 PM, Mitchel Humpherys
 wrote:
> There are certain client bugs (double unmap, for example) that can cause
> the handle->kmap_cnt (an unsigned int) to wrap around from zero. This
> causes problems when the handle is destroyed because we have:
>
> while (handle->kmap_cnt)
> ion_handle_kmap_put(handle);
>
> which takes a long time to complete when kmap_cnt starts at ~0 and can
> result in a watchdog timeout.
>
> WARN and bail when kmap_cnt is about to wrap around from zero.
>
> Signed-off-by: Mitchel Humpherys 
> ---
>  drivers/staging/android/ion/ion.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/staging/android/ion/ion.c 
> b/drivers/staging/android/ion/ion.c
> index 3d5bf14722..f55f61a4cc 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -626,6 +626,10 @@ static void ion_handle_kmap_put(struct ion_handle 
> *handle)
>  {
> struct ion_buffer *buffer = handle->buffer;
>
> +   if (!handle->kmap_cnt) {
> +   WARN(1, "%s: Double unmap detected! bailing...\n", __func__);
> +   return;
> +   }
> handle->kmap_cnt--;
> if (!handle->kmap_cnt)
> ion_buffer_kmap_put(buffer);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.

Acked-by: Colin Cross 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging: ion: WARN when the handle kmap_cnt is going to wrap around

2014-05-22 Thread Colin Cross
On Thu, May 22, 2014 at 5:51 PM, Mitchel Humpherys
mitch...@codeaurora.org wrote:
 There are certain client bugs (double unmap, for example) that can cause
 the handle-kmap_cnt (an unsigned int) to wrap around from zero. This
 causes problems when the handle is destroyed because we have:

 while (handle-kmap_cnt)
 ion_handle_kmap_put(handle);

 which takes a long time to complete when kmap_cnt starts at ~0 and can
 result in a watchdog timeout.

 WARN and bail when kmap_cnt is about to wrap around from zero.

 Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org
 ---
  drivers/staging/android/ion/ion.c | 4 
  1 file changed, 4 insertions(+)

 diff --git a/drivers/staging/android/ion/ion.c 
 b/drivers/staging/android/ion/ion.c
 index 3d5bf14722..f55f61a4cc 100644
 --- a/drivers/staging/android/ion/ion.c
 +++ b/drivers/staging/android/ion/ion.c
 @@ -626,6 +626,10 @@ static void ion_handle_kmap_put(struct ion_handle 
 *handle)
  {
 struct ion_buffer *buffer = handle-buffer;

 +   if (!handle-kmap_cnt) {
 +   WARN(1, %s: Double unmap detected! bailing...\n, __func__);
 +   return;
 +   }
 handle-kmap_cnt--;
 if (!handle-kmap_cnt)
 ion_buffer_kmap_put(buffer);
 --
 The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
 hosted by The Linux Foundation

 To unsubscribe from this group and stop receiving emails from it, send an 
 email to kernel-team+unsubscr...@android.com.

Acked-by: Colin Cross ccr...@android.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3 1/9] sysrq: Implement __handle_sysrq_nolock to avoid recursive locking in kdb

2014-04-29 Thread Colin Cross
On Tue, Apr 29, 2014 at 1:59 AM, Daniel Thompson
 wrote:
> On 28/04/14 18:44, Colin Cross wrote:
>>>> Is that case documented somewhere in the code comments?
>>>
>>> Perhaps not near enough to the _nolock but the primary bit of comment is
>>> here (and in same file as kdb_sr).
>>> --- cut here ---
>>>  * kdb_main_loop - After initial setup and assignment of the
>>>  *  controlling cpu, all cpus are in this loop.  One cpu is in
>>>  *  control and will issue the kdb prompt, the others will spin
>>>  *  until 'go' or cpu switch.
>>> --- cut here ---
>>>
>>> The mechanism kgdb uses to quiesce other CPUs means other CPUs cannot be
>>> in irqsave critical sections.
>>>
>>>
>>
>> One of the advantages of FIQ debugger is that it can be triggered from
>> an FIQ (NMI for those in x86 land), and Jason and I have discussed
>> using FIQs for kgdb to allow interrupting cpus stuck in critical
>> sections.  If that gets implemented the above assumption will no
>> longer be correct.
>
> Reviewing this I realized I missed one of the most critical points in
> the above.
>
> Today kdb, even if triggered by FIQ/NMI, would still be likely to wedge
> waiting for the IPI interrupts to be delivered to other processors.
>
> Did you and Jason discuss getting the active CPU to quiesce the other
> processors using FIQ/NMI, or to allow the active CPU to timeout while
> waiting for them the stop?
>
>
> Daniel.

Yes, all cpus would have to get an FIQ/NMI.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3 1/9] sysrq: Implement __handle_sysrq_nolock to avoid recursive locking in kdb

2014-04-29 Thread Colin Cross
On Tue, Apr 29, 2014 at 1:59 AM, Daniel Thompson
daniel.thomp...@linaro.org wrote:
 On 28/04/14 18:44, Colin Cross wrote:
 Is that case documented somewhere in the code comments?

 Perhaps not near enough to the _nolock but the primary bit of comment is
 here (and in same file as kdb_sr).
 --- cut here ---
  * kdb_main_loop - After initial setup and assignment of the
  *  controlling cpu, all cpus are in this loop.  One cpu is in
  *  control and will issue the kdb prompt, the others will spin
  *  until 'go' or cpu switch.
 --- cut here ---

 The mechanism kgdb uses to quiesce other CPUs means other CPUs cannot be
 in irqsave critical sections.



 One of the advantages of FIQ debugger is that it can be triggered from
 an FIQ (NMI for those in x86 land), and Jason and I have discussed
 using FIQs for kgdb to allow interrupting cpus stuck in critical
 sections.  If that gets implemented the above assumption will no
 longer be correct.

 Reviewing this I realized I missed one of the most critical points in
 the above.

 Today kdb, even if triggered by FIQ/NMI, would still be likely to wedge
 waiting for the IPI interrupts to be delivered to other processors.

 Did you and Jason discuss getting the active CPU to quiesce the other
 processors using FIQ/NMI, or to allow the active CPU to timeout while
 waiting for them the stop?


 Daniel.

Yes, all cpus would have to get an FIQ/NMI.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3 1/9] sysrq: Implement __handle_sysrq_nolock to avoid recursive locking in kdb

2014-04-28 Thread Colin Cross
On Mon, Apr 28, 2014 at 3:24 AM, Daniel Thompson
 wrote:
> On 25/04/14 17:45, Steven Rostedt wrote:
>> On Fri, 25 Apr 2014 17:29:22 +0100
>> Daniel Thompson  wrote:
>>
>>> If kdb is triggered using SysRq-g then any use of the sr command results
>>> in the SysRq key table lock being recursively acquired, killing the debug
>>> session. That patch resolves the problem by introducing a _nolock
>>> alternative for __handle_sysrq.
>>>
>>> Strictly speaking this approach risks racing on the key table when kdb is
>>> triggered by something other than SysRq-g however in that case any other
>>> CPU involved should release the spin lock before kgdb parks the slave
>>> CPUs.
>>
>> Is that case documented somewhere in the code comments?
>
> Perhaps not near enough to the _nolock but the primary bit of comment is
> here (and in same file as kdb_sr).
> --- cut here ---
>  * kdb_main_loop - After initial setup and assignment of the
>  *  controlling cpu, all cpus are in this loop.  One cpu is in
>  *  control and will issue the kdb prompt, the others will spin
>  *  until 'go' or cpu switch.
> --- cut here ---
>
> The mechanism kgdb uses to quiesce other CPUs means other CPUs cannot be
> in irqsave critical sections.
>
>

One of the advantages of FIQ debugger is that it can be triggered from
an FIQ (NMI for those in x86 land), and Jason and I have discussed
using FIQs for kgdb to allow interrupting cpus stuck in critical
sections.  If that gets implemented the above assumption will no
longer be correct.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v3 1/9] sysrq: Implement __handle_sysrq_nolock to avoid recursive locking in kdb

2014-04-28 Thread Colin Cross
On Mon, Apr 28, 2014 at 3:24 AM, Daniel Thompson
daniel.thomp...@linaro.org wrote:
 On 25/04/14 17:45, Steven Rostedt wrote:
 On Fri, 25 Apr 2014 17:29:22 +0100
 Daniel Thompson daniel.thomp...@linaro.org wrote:

 If kdb is triggered using SysRq-g then any use of the sr command results
 in the SysRq key table lock being recursively acquired, killing the debug
 session. That patch resolves the problem by introducing a _nolock
 alternative for __handle_sysrq.

 Strictly speaking this approach risks racing on the key table when kdb is
 triggered by something other than SysRq-g however in that case any other
 CPU involved should release the spin lock before kgdb parks the slave
 CPUs.

 Is that case documented somewhere in the code comments?

 Perhaps not near enough to the _nolock but the primary bit of comment is
 here (and in same file as kdb_sr).
 --- cut here ---
  * kdb_main_loop - After initial setup and assignment of the
  *  controlling cpu, all cpus are in this loop.  One cpu is in
  *  control and will issue the kdb prompt, the others will spin
  *  until 'go' or cpu switch.
 --- cut here ---

 The mechanism kgdb uses to quiesce other CPUs means other CPUs cannot be
 in irqsave critical sections.



One of the advantages of FIQ debugger is that it can be triggered from
an FIQ (NMI for those in x86 land), and Jason and I have discussed
using FIQs for kgdb to allow interrupting cpus stuck in critical
sections.  If that gets implemented the above assumption will no
longer be correct.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging/android: Remove ram_console.h

2014-04-23 Thread Colin Cross
On Tue, Apr 22, 2014 at 2:22 AM, Bintian Wang  wrote:
> ram_console is replaced by pstore and pstore_ram drivers,
> and there is no code to use this head file, so remove it.
>
> Signed-off-by: Bintian Wang 
> ---
>  drivers/staging/android/ram_console.h |   22 --
>  1 file changed, 22 deletions(-)
>  delete mode 100644 drivers/staging/android/ram_console.h
>
> diff --git a/drivers/staging/android/ram_console.h 
> b/drivers/staging/android/ram_console.h
> deleted file mode 100644
> index 9f1125c..000
> --- a/drivers/staging/android/ram_console.h
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/*
> - * Copyright (C) 2010 Google, Inc.
> - *
> - * This software is licensed under the terms of the GNU General Public
> - * License version 2, as published by the Free Software Foundation, and
> - * may be copied, distributed, and modified under those terms.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - */
> -
> -#ifndef _INCLUDE_LINUX_PLATFORM_DATA_RAM_CONSOLE_H_
> -#define _INCLUDE_LINUX_PLATFORM_DATA_RAM_CONSOLE_H_
> -
> -struct ram_console_platform_data {
> -   const char *bootinfo;
> -};
> -
> -#endif /* _INCLUDE_LINUX_PLATFORM_DATA_RAM_CONSOLE_H_ */
> --
> 1.7.9.5
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

Acked-by: Colin Cross 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging/android: Remove ram_console.h

2014-04-23 Thread Colin Cross
On Tue, Apr 22, 2014 at 2:22 AM, Bintian Wang bintian.w...@huawei.com wrote:
 ram_console is replaced by pstore and pstore_ram drivers,
 and there is no code to use this head file, so remove it.

 Signed-off-by: Bintian Wang bintian.w...@huawei.com
 ---
  drivers/staging/android/ram_console.h |   22 --
  1 file changed, 22 deletions(-)
  delete mode 100644 drivers/staging/android/ram_console.h

 diff --git a/drivers/staging/android/ram_console.h 
 b/drivers/staging/android/ram_console.h
 deleted file mode 100644
 index 9f1125c..000
 --- a/drivers/staging/android/ram_console.h
 +++ /dev/null
 @@ -1,22 +0,0 @@
 -/*
 - * Copyright (C) 2010 Google, Inc.
 - *
 - * This software is licensed under the terms of the GNU General Public
 - * License version 2, as published by the Free Software Foundation, and
 - * may be copied, distributed, and modified under those terms.
 - *
 - * This program is distributed in the hope that it will be useful,
 - * but WITHOUT ANY WARRANTY; without even the implied warranty of
 - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 - * GNU General Public License for more details.
 - *
 - */
 -
 -#ifndef _INCLUDE_LINUX_PLATFORM_DATA_RAM_CONSOLE_H_
 -#define _INCLUDE_LINUX_PLATFORM_DATA_RAM_CONSOLE_H_
 -
 -struct ram_console_platform_data {
 -   const char *bootinfo;
 -};
 -
 -#endif /* _INCLUDE_LINUX_PLATFORM_DATA_RAM_CONSOLE_H_ */
 --
 1.7.9.5


 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/

Acked-by: Colin Cross ccr...@android.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging : android : uapi : fix coding style

2014-04-16 Thread Colin Cross
(resending without the html)

On Wed, Apr 16, 2014 at 9:36 AM, John Stultz  wrote:
>
> On 04/16/2014 07:39 AM, Joe Perches wrote:
> > On Wed, 2014-04-16 at 23:27 +0900, Seunghun Lee wrote:
> >> This patch fix checkpatch.pl warnings and errors.
> > []
> >> diff --git a/drivers/staging/android/uapi/binder.h 
> >> b/drivers/staging/android/uapi/binder.h
> > []
> >> @@ -169,7 +169,7 @@ struct binder_ptr_cookie {
> >>  struct binder_handle_cookie {
> >>  __u32 handle;
> >>  binder_uintptr_t cookie;
> >> -} __attribute__((packed));
> >> +} __packed;
> > If this .h file is meant to be a user-space #include,
> > then it should not use the kernel specific __packed
> > but keep the __attribute__((packed))
>
> Agreed.
>
> > It does use __u32 though and that's generally
> > kernel specific.
>
> Hmm. Theres a ton of __u32 usage in include/uapi/*  as well as typedefs
> for it too.
> include/uapi/asm-generic/int-l64.h:typedef unsigned int __u32;
> include/uapi/asm-generic/int-ll64.h:typedef unsigned int __u32;
>
> > John?  Does any of these binder uapi files need a
> > bit more sorting out?
> I suspect this is ok, but Cc'ing Colin to give him a heads up, as it
> would probably cause trouble w/ their libc headers first.

__u32 is explicitly for use in userspace headers so it doesn't
conflict with userspace definitions of u32, uint32_t, etc.  It gets
set through #include  when a copy of include/uapi is in
the include path.

>From Documentation/CodingStyle:
 (e) Types safe for use in userspace.

 In certain structures which are visible to userspace, we cannot
 require C99 types and cannot use the 'u32' form above. Thus, we
 use __u32 and similar types in all structures which are shared
 with userspace.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging : android : uapi : fix coding style

2014-04-16 Thread Colin Cross
(resending without the html)

On Wed, Apr 16, 2014 at 9:36 AM, John Stultz john.stu...@linaro.org wrote:

 On 04/16/2014 07:39 AM, Joe Perches wrote:
  On Wed, 2014-04-16 at 23:27 +0900, Seunghun Lee wrote:
  This patch fix checkpatch.pl warnings and errors.
  []
  diff --git a/drivers/staging/android/uapi/binder.h 
  b/drivers/staging/android/uapi/binder.h
  []
  @@ -169,7 +169,7 @@ struct binder_ptr_cookie {
   struct binder_handle_cookie {
   __u32 handle;
   binder_uintptr_t cookie;
  -} __attribute__((packed));
  +} __packed;
  If this .h file is meant to be a user-space #include,
  then it should not use the kernel specific __packed
  but keep the __attribute__((packed))

 Agreed.

  It does use __u32 though and that's generally
  kernel specific.

 Hmm. Theres a ton of __u32 usage in include/uapi/*  as well as typedefs
 for it too.
 include/uapi/asm-generic/int-l64.h:typedef unsigned int __u32;
 include/uapi/asm-generic/int-ll64.h:typedef unsigned int __u32;

  John?  Does any of these binder uapi files need a
  bit more sorting out?
 I suspect this is ok, but Cc'ing Colin to give him a heads up, as it
 would probably cause trouble w/ their libc headers first.

__u32 is explicitly for use in userspace headers so it doesn't
conflict with userspace definitions of u32, uint32_t, etc.  It gets
set through #include linux/types.h when a copy of include/uapi is in
the include path.

From Documentation/CodingStyle:
 (e) Types safe for use in userspace.

 In certain structures which are visible to userspace, we cannot
 require C99 types and cannot use the 'u32' form above. Thus, we
 use __u32 and similar types in all structures which are shared
 with userspace.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] dma-buf: avoid using IS_ERR_OR_NULL

2014-02-07 Thread Colin Cross
On Fri, Feb 7, 2014 at 8:43 AM, Greg Kroah-Hartman
 wrote:
> On Sat, Dec 21, 2013 at 07:42:17AM -0500, Rob Clark wrote:
>> On Fri, Dec 20, 2013 at 7:43 PM, Colin Cross  wrote:
>> > dma_buf_map_attachment and dma_buf_vmap can return NULL or
>> > ERR_PTR on a error.  This encourages a common buggy pattern in
>> > callers:
>> > sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
>> > if (IS_ERR_OR_NULL(sgt))
>> > return PTR_ERR(sgt);
>> >
>> > This causes the caller to return 0 on an error.  IS_ERR_OR_NULL
>> > is almost always a sign of poorly-defined error handling.
>> >
>> > This patch converts dma_buf_map_attachment to always return
>> > ERR_PTR, and fixes the callers that incorrectly handled NULL.
>> > There are a few more callers that were not checking for NULL
>> > at all, which would have dereferenced a NULL pointer later.
>> > There are also a few more callers that correctly handled NULL
>> > and ERR_PTR differently, I left those alone but they could also
>> > be modified to delete the NULL check.
>> >
>> > This patch also converts dma_buf_vmap to always return NULL.
>> > All the callers to dma_buf_vmap only check for NULL, and would
>> > have dereferenced an ERR_PTR and panic'd if one was ever
>> > returned. This is not consistent with the rest of the dma buf
>> > APIs, but matches the expectations of all of the callers.
>> >
>> > Signed-off-by: Colin Cross 
>> > ---
>> >  drivers/base/dma-buf.c | 18 +++---
>> >  drivers/gpu/drm/drm_prime.c|  2 +-
>> >  drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |  2 +-
>> >  drivers/media/v4l2-core/videobuf2-dma-contig.c |  2 +-
>> >  4 files changed, 14 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
>> > index 1e16cbd61da2..cfe1d8bc7bb8 100644
>> > --- a/drivers/base/dma-buf.c
>> > +++ b/drivers/base/dma-buf.c
>> > @@ -251,9 +251,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>> >   * @dmabuf:[in]buffer to attach device to.
>> >   * @dev:   [in]device to be attached.
>> >   *
>> > - * Returns struct dma_buf_attachment * for this attachment; may return 
>> > negative
>> > - * error codes.
>> > - *
>> > + * Returns struct dma_buf_attachment * for this attachment; returns 
>> > ERR_PTR on
>> > + * error.
>> >   */
>> >  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>> >   struct device *dev)
>> > @@ -319,9 +318,8 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
>> >   * @attach:[in]attachment whose scatterlist is to be returned
>> >   * @direction: [in]direction of DMA transfer
>> >   *
>> > - * Returns sg_table containing the scatterlist to be returned; may return 
>> > NULL
>> > - * or ERR_PTR.
>> > - *
>> > + * Returns sg_table containing the scatterlist to be returned; returns 
>> > ERR_PTR
>> > + * on error.
>> >   */
>> >  struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>> > enum dma_data_direction direction)
>> > @@ -334,6 +332,8 @@ struct sg_table *dma_buf_map_attachment(struct 
>> > dma_buf_attachment *attach,
>> > return ERR_PTR(-EINVAL);
>> >
>> > sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>> > +   if (!sg_table)
>> > +   sg_table = ERR_PTR(-ENOMEM);
>> >
>> > return sg_table;
>> >  }
>> > @@ -544,6 +544,8 @@ EXPORT_SYMBOL_GPL(dma_buf_mmap);
>> >   * These calls are optional in drivers. The intended use for them
>> >   * is for mapping objects linear in kernel space for high use objects.
>> >   * Please attempt to use kmap/kunmap before thinking about these 
>> > interfaces.
>> > + *
>> > + * Returns NULL on error.
>> >   */
>> >  void *dma_buf_vmap(struct dma_buf *dmabuf)
>> >  {
>> > @@ -566,7 +568,9 @@ void *dma_buf_vmap(struct dma_buf *dmabuf)
>> > BUG_ON(dmabuf->vmap_ptr);
>> >
>> > ptr = dmabuf->ops->vmap(dmabuf);
>> > -   if (IS_ERR_OR_NULL(ptr))
>> > +   if (WARN_ON_ONCE(IS_ERR(ptr)))
>>
>> since vmap is optional, the WARN_ON might be a bit strong..  although
>> it would be a bit strange for an exporter to supply a vmap fxn which
>> always returned NULL, not sure about that.  Just thought I'd mention
>> it in case anyone else had an opinion about that.
>
> Yeah, I don't like this, it could cause unnecessary reports of problems.

The WARN_ON_ONCE is only if the vmap op returns ERR_PTR, not if it
returns NULL.  This is designed to catch vmap ops that don't follow
the spec, so I would call them necessary reports, but I can take it
out if you still disagree.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] dma-buf: avoid using IS_ERR_OR_NULL

2014-02-07 Thread Colin Cross
On Fri, Feb 7, 2014 at 8:43 AM, Greg Kroah-Hartman
gre...@linuxfoundation.org wrote:
 On Sat, Dec 21, 2013 at 07:42:17AM -0500, Rob Clark wrote:
 On Fri, Dec 20, 2013 at 7:43 PM, Colin Cross ccr...@android.com wrote:
  dma_buf_map_attachment and dma_buf_vmap can return NULL or
  ERR_PTR on a error.  This encourages a common buggy pattern in
  callers:
  sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
  if (IS_ERR_OR_NULL(sgt))
  return PTR_ERR(sgt);
 
  This causes the caller to return 0 on an error.  IS_ERR_OR_NULL
  is almost always a sign of poorly-defined error handling.
 
  This patch converts dma_buf_map_attachment to always return
  ERR_PTR, and fixes the callers that incorrectly handled NULL.
  There are a few more callers that were not checking for NULL
  at all, which would have dereferenced a NULL pointer later.
  There are also a few more callers that correctly handled NULL
  and ERR_PTR differently, I left those alone but they could also
  be modified to delete the NULL check.
 
  This patch also converts dma_buf_vmap to always return NULL.
  All the callers to dma_buf_vmap only check for NULL, and would
  have dereferenced an ERR_PTR and panic'd if one was ever
  returned. This is not consistent with the rest of the dma buf
  APIs, but matches the expectations of all of the callers.
 
  Signed-off-by: Colin Cross ccr...@android.com
  ---
   drivers/base/dma-buf.c | 18 +++---
   drivers/gpu/drm/drm_prime.c|  2 +-
   drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |  2 +-
   drivers/media/v4l2-core/videobuf2-dma-contig.c |  2 +-
   4 files changed, 14 insertions(+), 10 deletions(-)
 
  diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
  index 1e16cbd61da2..cfe1d8bc7bb8 100644
  --- a/drivers/base/dma-buf.c
  +++ b/drivers/base/dma-buf.c
  @@ -251,9 +251,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
* @dmabuf:[in]buffer to attach device to.
* @dev:   [in]device to be attached.
*
  - * Returns struct dma_buf_attachment * for this attachment; may return 
  negative
  - * error codes.
  - *
  + * Returns struct dma_buf_attachment * for this attachment; returns 
  ERR_PTR on
  + * error.
*/
   struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
struct device *dev)
  @@ -319,9 +318,8 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
* @attach:[in]attachment whose scatterlist is to be returned
* @direction: [in]direction of DMA transfer
*
  - * Returns sg_table containing the scatterlist to be returned; may return 
  NULL
  - * or ERR_PTR.
  - *
  + * Returns sg_table containing the scatterlist to be returned; returns 
  ERR_PTR
  + * on error.
*/
   struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
  enum dma_data_direction direction)
  @@ -334,6 +332,8 @@ struct sg_table *dma_buf_map_attachment(struct 
  dma_buf_attachment *attach,
  return ERR_PTR(-EINVAL);
 
  sg_table = attach-dmabuf-ops-map_dma_buf(attach, direction);
  +   if (!sg_table)
  +   sg_table = ERR_PTR(-ENOMEM);
 
  return sg_table;
   }
  @@ -544,6 +544,8 @@ EXPORT_SYMBOL_GPL(dma_buf_mmap);
* These calls are optional in drivers. The intended use for them
* is for mapping objects linear in kernel space for high use objects.
* Please attempt to use kmap/kunmap before thinking about these 
  interfaces.
  + *
  + * Returns NULL on error.
*/
   void *dma_buf_vmap(struct dma_buf *dmabuf)
   {
  @@ -566,7 +568,9 @@ void *dma_buf_vmap(struct dma_buf *dmabuf)
  BUG_ON(dmabuf-vmap_ptr);
 
  ptr = dmabuf-ops-vmap(dmabuf);
  -   if (IS_ERR_OR_NULL(ptr))
  +   if (WARN_ON_ONCE(IS_ERR(ptr)))

 since vmap is optional, the WARN_ON might be a bit strong..  although
 it would be a bit strange for an exporter to supply a vmap fxn which
 always returned NULL, not sure about that.  Just thought I'd mention
 it in case anyone else had an opinion about that.

 Yeah, I don't like this, it could cause unnecessary reports of problems.

The WARN_ON_ONCE is only if the vmap op returns ERR_PTR, not if it
returns NULL.  This is designed to catch vmap ops that don't follow
the spec, so I would call them necessary reports, but I can take it
out if you still disagree.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64

2014-02-03 Thread Colin Cross
Binaries compiled for arm may run on arm64 if CONFIG_COMPAT is
selected.  Set LSM_MMAP_MIN_ADDR to 32768 if ARM64 && COMPAT to
prevent selinux failures launching 32-bit static executables that
are mapped at 0x8000.

Signed-off-by: Colin Cross 
---
 security/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/Kconfig b/security/Kconfig
index e9c6ac724fef..beb86b500adf 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -103,7 +103,7 @@ config INTEL_TXT
 config LSM_MMAP_MIN_ADDR
int "Low address space for LSM to protect from user allocation"
depends on SECURITY && SECURITY_SELINUX
-   default 32768 if ARM
+   default 32768 if ARM || (ARM64 && COMPAT)
default 65536
help
  This is the portion of low virtual memory which should be protected
-- 
1.9.0.rc1.175.g0b1dcb5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64

2014-02-03 Thread Colin Cross
Binaries compiled for arm may run on arm64 if CONFIG_COMPAT is
selected.  Set LSM_MMAP_MIN_ADDR to 32768 if ARM64  COMPAT to
prevent selinux failures launching 32-bit static executables that
are mapped at 0x8000.

Signed-off-by: Colin Cross ccr...@android.com
---
 security/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/Kconfig b/security/Kconfig
index e9c6ac724fef..beb86b500adf 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -103,7 +103,7 @@ config INTEL_TXT
 config LSM_MMAP_MIN_ADDR
int Low address space for LSM to protect from user allocation
depends on SECURITY  SECURITY_SELINUX
-   default 32768 if ARM
+   default 32768 if ARM || (ARM64  COMPAT)
default 65536
help
  This is the portion of low virtual memory which should be protected
-- 
1.9.0.rc1.175.g0b1dcb5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] sched: allow try_to_wake_up to be used internally outside of core.c

2014-01-13 Thread Colin Cross
On Mon, Jan 13, 2014 at 4:31 AM, Maarten Lankhorst
 wrote:
> The kernel fence implementation doesn't use event queues, but needs
> to perform the same wake up. The symbol is not exported, since the
> fence implementation is not built as a module.
>
> Signed-off-by: Maarten Lankhorst 
> ---
>  include/linux/wait.h |1 +
>  kernel/sched/core.c  |2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/wait.h b/include/linux/wait.h
> index eaa00b10abaa..c54e3ef50134 100644
> --- a/include/linux/wait.h
> +++ b/include/linux/wait.h
> @@ -12,6 +12,7 @@
>  typedef struct __wait_queue wait_queue_t;
>  typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int 
> flags, void *key);
>  int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void 
> *key);
> +int try_to_wake_up(struct task_struct *p, unsigned int state, int 
> wake_flags);
>
>  struct __wait_queue {
> unsigned intflags;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index a88f4a485c5e..f41d317042dd 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1578,7 +1578,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
>   * Return: %true if @p was woken up, %false if it was already running.
>   * or @state didn't match @p's state.
>   */
> -static int
> +int
>  try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
>  {
> unsigned long flags;
>

wake_up_state is already available in linux/sched.h, can you use that?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] sched: allow try_to_wake_up to be used internally outside of core.c

2014-01-13 Thread Colin Cross
On Mon, Jan 13, 2014 at 4:31 AM, Maarten Lankhorst
maarten.lankho...@canonical.com wrote:
 The kernel fence implementation doesn't use event queues, but needs
 to perform the same wake up. The symbol is not exported, since the
 fence implementation is not built as a module.

 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
 ---
  include/linux/wait.h |1 +
  kernel/sched/core.c  |2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

 diff --git a/include/linux/wait.h b/include/linux/wait.h
 index eaa00b10abaa..c54e3ef50134 100644
 --- a/include/linux/wait.h
 +++ b/include/linux/wait.h
 @@ -12,6 +12,7 @@
  typedef struct __wait_queue wait_queue_t;
  typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int 
 flags, void *key);
  int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void 
 *key);
 +int try_to_wake_up(struct task_struct *p, unsigned int state, int 
 wake_flags);

  struct __wait_queue {
 unsigned intflags;
 diff --git a/kernel/sched/core.c b/kernel/sched/core.c
 index a88f4a485c5e..f41d317042dd 100644
 --- a/kernel/sched/core.c
 +++ b/kernel/sched/core.c
 @@ -1578,7 +1578,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
   * Return: %true if @p was woken up, %false if it was already running.
   * or @state didn't match @p's state.
   */
 -static int
 +int
  try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
  {
 unsigned long flags;


wake_up_state is already available in linux/sched.h, can you use that?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] arm: mm: add memory type for inner-writeback

2014-01-07 Thread Colin Cross
On Tue, Jan 7, 2014 at 7:09 AM, Catalin Marinas  wrote:
> On Fri, Dec 27, 2013 at 04:58:48AM +, Mark Zhang wrote:
>> From: Colin Cross 
>>
>> For streaming-style operations (e.g., software rendering of graphics
>> surfaces shared with non-coherent DMA devices), the cost of performing
>> L2 cache maintenance can exceed the benefit of having the larger cache
>> (this is particularly true for OUTER_CACHE configurations like the ARM
>> PL2x0).
>>
>> This change uses the currently-unused mapping 5 (TEX[0]=1, C=0, B=1)
>> in the tex remapping tables as an inner-writeback-write-allocate, outer
>> non-cacheable memory type, so that this mapping will be available to
>> clients which will benefit from the reduced L2 maintenance.
>>
>> Signed-off-by: Gary King 
>
> Is Colin signing off this patch as well?
>
>> --- a/arch/arm/mm/proc-v7-2level.S
>> +++ b/arch/arm/mm/proc-v7-2level.S
>> @@ -144,8 +144,8 @@ ENDPROC(cpu_v7_set_pte_ext)
>>*   NS1 = PRRR[19] = 1 - normal shareable property
>>*   NOS = PRRR[24+n] = 1   - not outer shareable
>>*/
>> -.equ PRRR,   0xff0a81a8
>> -.equ NMRR,   0x40e040e0
>> +.equ PRRR,   0xff0a89a8
>> +.equ NMRR,   0x40e044e0
>
> It should be done for the *-3level files.
>
> --
> Catalin

I shouldn't have authorship on that patch at all.  The original is at
https://android.googlesource.com/kernel/tegra/+/fb382752691d74a849996daf77be961ca2cdae97
and I must have screwed up the authorship when fixing conflicts when
cherry-picking it to the android-tegra-moto-2.6.39 branch.  Please
change the author back to "Gary King ".
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] arm: mm: add memory type for inner-writeback

2014-01-07 Thread Colin Cross
On Tue, Jan 7, 2014 at 7:09 AM, Catalin Marinas catalin.mari...@arm.com wrote:
 On Fri, Dec 27, 2013 at 04:58:48AM +, Mark Zhang wrote:
 From: Colin Cross ccr...@android.com

 For streaming-style operations (e.g., software rendering of graphics
 surfaces shared with non-coherent DMA devices), the cost of performing
 L2 cache maintenance can exceed the benefit of having the larger cache
 (this is particularly true for OUTER_CACHE configurations like the ARM
 PL2x0).

 This change uses the currently-unused mapping 5 (TEX[0]=1, C=0, B=1)
 in the tex remapping tables as an inner-writeback-write-allocate, outer
 non-cacheable memory type, so that this mapping will be available to
 clients which will benefit from the reduced L2 maintenance.

 Signed-off-by: Gary King gk...@nvidia.com

 Is Colin signing off this patch as well?

 --- a/arch/arm/mm/proc-v7-2level.S
 +++ b/arch/arm/mm/proc-v7-2level.S
 @@ -144,8 +144,8 @@ ENDPROC(cpu_v7_set_pte_ext)
*   NS1 = PRRR[19] = 1 - normal shareable property
*   NOS = PRRR[24+n] = 1   - not outer shareable
*/
 -.equ PRRR,   0xff0a81a8
 -.equ NMRR,   0x40e040e0
 +.equ PRRR,   0xff0a89a8
 +.equ NMRR,   0x40e044e0

 It should be done for the *-3level files.

 --
 Catalin

I shouldn't have authorship on that patch at all.  The original is at
https://android.googlesource.com/kernel/tegra/+/fb382752691d74a849996daf77be961ca2cdae97
and I must have screwed up the authorship when fixing conflicts when
cherry-picking it to the android-tegra-moto-2.6.39 branch.  Please
change the author back to Gary King gk...@nvidia.com.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] dma-buf: avoid using IS_ERR_OR_NULL

2013-12-20 Thread Colin Cross
dma_buf_map_attachment and dma_buf_vmap can return NULL or
ERR_PTR on a error.  This encourages a common buggy pattern in
callers:
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
if (IS_ERR_OR_NULL(sgt))
return PTR_ERR(sgt);

This causes the caller to return 0 on an error.  IS_ERR_OR_NULL
is almost always a sign of poorly-defined error handling.

This patch converts dma_buf_map_attachment to always return
ERR_PTR, and fixes the callers that incorrectly handled NULL.
There are a few more callers that were not checking for NULL
at all, which would have dereferenced a NULL pointer later.
There are also a few more callers that correctly handled NULL
and ERR_PTR differently, I left those alone but they could also
be modified to delete the NULL check.

This patch also converts dma_buf_vmap to always return NULL.
All the callers to dma_buf_vmap only check for NULL, and would
have dereferenced an ERR_PTR and panic'd if one was ever
returned. This is not consistent with the rest of the dma buf
APIs, but matches the expectations of all of the callers.

Signed-off-by: Colin Cross 
---
 drivers/base/dma-buf.c | 18 +++---
 drivers/gpu/drm/drm_prime.c|  2 +-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |  2 +-
 drivers/media/v4l2-core/videobuf2-dma-contig.c |  2 +-
 4 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 1e16cbd61da2..cfe1d8bc7bb8 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -251,9 +251,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * @dmabuf:[in]buffer to attach device to.
  * @dev:   [in]device to be attached.
  *
- * Returns struct dma_buf_attachment * for this attachment; may return negative
- * error codes.
- *
+ * Returns struct dma_buf_attachment * for this attachment; returns ERR_PTR on
+ * error.
  */
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
  struct device *dev)
@@ -319,9 +318,8 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
  * @attach:[in]attachment whose scatterlist is to be returned
  * @direction: [in]direction of DMA transfer
  *
- * Returns sg_table containing the scatterlist to be returned; may return NULL
- * or ERR_PTR.
- *
+ * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
+ * on error.
  */
 struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
enum dma_data_direction direction)
@@ -334,6 +332,8 @@ struct sg_table *dma_buf_map_attachment(struct 
dma_buf_attachment *attach,
return ERR_PTR(-EINVAL);
 
sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
+   if (!sg_table)
+   sg_table = ERR_PTR(-ENOMEM);
 
return sg_table;
 }
@@ -544,6 +544,8 @@ EXPORT_SYMBOL_GPL(dma_buf_mmap);
  * These calls are optional in drivers. The intended use for them
  * is for mapping objects linear in kernel space for high use objects.
  * Please attempt to use kmap/kunmap before thinking about these interfaces.
+ *
+ * Returns NULL on error.
  */
 void *dma_buf_vmap(struct dma_buf *dmabuf)
 {
@@ -566,7 +568,9 @@ void *dma_buf_vmap(struct dma_buf *dmabuf)
BUG_ON(dmabuf->vmap_ptr);
 
ptr = dmabuf->ops->vmap(dmabuf);
-   if (IS_ERR_OR_NULL(ptr))
+   if (WARN_ON_ONCE(IS_ERR(ptr)))
+   ptr = NULL;
+   if (!ptr)
goto out_unlock;
 
dmabuf->vmap_ptr = ptr;
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 56805c39c906..bb516fdd195d 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -471,7 +471,7 @@ struct drm_gem_object *drm_gem_prime_import(struct 
drm_device *dev,
get_dma_buf(dma_buf);
 
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
-   if (IS_ERR_OR_NULL(sgt)) {
+   if (IS_ERR(sgt)) {
ret = PTR_ERR(sgt);
goto fail_detach;
}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c 
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 59827cc5e770..c786cd4f457b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -224,7 +224,7 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct 
drm_device *drm_dev,
get_dma_buf(dma_buf);
 
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
-   if (IS_ERR_OR_NULL(sgt)) {
+   if (IS_ERR(sgt)) {
ret = PTR_ERR(sgt);
goto err_buf_detach;
}
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 33d3871d1e13..880be0782dd9 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -719,7 +719,7 @@ static int vb2_dc_map_

[PATCH] dma-buf: avoid using IS_ERR_OR_NULL

2013-12-20 Thread Colin Cross
dma_buf_map_attachment and dma_buf_vmap can return NULL or
ERR_PTR on a error.  This encourages a common buggy pattern in
callers:
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
if (IS_ERR_OR_NULL(sgt))
return PTR_ERR(sgt);

This causes the caller to return 0 on an error.  IS_ERR_OR_NULL
is almost always a sign of poorly-defined error handling.

This patch converts dma_buf_map_attachment to always return
ERR_PTR, and fixes the callers that incorrectly handled NULL.
There are a few more callers that were not checking for NULL
at all, which would have dereferenced a NULL pointer later.
There are also a few more callers that correctly handled NULL
and ERR_PTR differently, I left those alone but they could also
be modified to delete the NULL check.

This patch also converts dma_buf_vmap to always return NULL.
All the callers to dma_buf_vmap only check for NULL, and would
have dereferenced an ERR_PTR and panic'd if one was ever
returned. This is not consistent with the rest of the dma buf
APIs, but matches the expectations of all of the callers.

Signed-off-by: Colin Cross ccr...@android.com
---
 drivers/base/dma-buf.c | 18 +++---
 drivers/gpu/drm/drm_prime.c|  2 +-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |  2 +-
 drivers/media/v4l2-core/videobuf2-dma-contig.c |  2 +-
 4 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 1e16cbd61da2..cfe1d8bc7bb8 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -251,9 +251,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * @dmabuf:[in]buffer to attach device to.
  * @dev:   [in]device to be attached.
  *
- * Returns struct dma_buf_attachment * for this attachment; may return negative
- * error codes.
- *
+ * Returns struct dma_buf_attachment * for this attachment; returns ERR_PTR on
+ * error.
  */
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
  struct device *dev)
@@ -319,9 +318,8 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
  * @attach:[in]attachment whose scatterlist is to be returned
  * @direction: [in]direction of DMA transfer
  *
- * Returns sg_table containing the scatterlist to be returned; may return NULL
- * or ERR_PTR.
- *
+ * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
+ * on error.
  */
 struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
enum dma_data_direction direction)
@@ -334,6 +332,8 @@ struct sg_table *dma_buf_map_attachment(struct 
dma_buf_attachment *attach,
return ERR_PTR(-EINVAL);
 
sg_table = attach-dmabuf-ops-map_dma_buf(attach, direction);
+   if (!sg_table)
+   sg_table = ERR_PTR(-ENOMEM);
 
return sg_table;
 }
@@ -544,6 +544,8 @@ EXPORT_SYMBOL_GPL(dma_buf_mmap);
  * These calls are optional in drivers. The intended use for them
  * is for mapping objects linear in kernel space for high use objects.
  * Please attempt to use kmap/kunmap before thinking about these interfaces.
+ *
+ * Returns NULL on error.
  */
 void *dma_buf_vmap(struct dma_buf *dmabuf)
 {
@@ -566,7 +568,9 @@ void *dma_buf_vmap(struct dma_buf *dmabuf)
BUG_ON(dmabuf-vmap_ptr);
 
ptr = dmabuf-ops-vmap(dmabuf);
-   if (IS_ERR_OR_NULL(ptr))
+   if (WARN_ON_ONCE(IS_ERR(ptr)))
+   ptr = NULL;
+   if (!ptr)
goto out_unlock;
 
dmabuf-vmap_ptr = ptr;
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 56805c39c906..bb516fdd195d 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -471,7 +471,7 @@ struct drm_gem_object *drm_gem_prime_import(struct 
drm_device *dev,
get_dma_buf(dma_buf);
 
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
-   if (IS_ERR_OR_NULL(sgt)) {
+   if (IS_ERR(sgt)) {
ret = PTR_ERR(sgt);
goto fail_detach;
}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c 
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 59827cc5e770..c786cd4f457b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -224,7 +224,7 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct 
drm_device *drm_dev,
get_dma_buf(dma_buf);
 
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
-   if (IS_ERR_OR_NULL(sgt)) {
+   if (IS_ERR(sgt)) {
ret = PTR_ERR(sgt);
goto err_buf_detach;
}
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 33d3871d1e13..880be0782dd9 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -719,7 +719,7 @@ static int vb2_dc_map_dmabuf(void

Re: [PATCH] ion_test: Add compat_ioctl support (v2)

2013-12-19 Thread Colin Cross
On Thu, Dec 19, 2013 at 3:56 PM, John Stultz  wrote:
> Prior to subitting this, Colin reworked the compat_ioctl support
> for the ion_test driver, moving the structure to be the same size
> on both 32 and 64 bit architectures.
>
> Two small things were left out. The compat_ioctl ptr assignment,
> and the fact that despite having uniform sized types in the
> structure, the structure pads out to different sizes on different
> arches.
>
> This patch resolves this issue by adding a padding entry after
> the write flag, and adding the compat_ioctl ptr.
>
> Changes in v2:
> - Add a padding int rather then making write a u64
>
> Cc: Colin Cross 
> Cc: Greg KH 
> Cc: Android Kernel Team 
> Signed-off-by: John Stultz 
> ---
>  drivers/staging/android/ion/ion_test.c  | 1 +
>  drivers/staging/android/uapi/ion_test.h | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/drivers/staging/android/ion/ion_test.c 
> b/drivers/staging/android/ion/ion_test.c
> index 3e20349..654acb5 100644
> --- a/drivers/staging/android/ion/ion_test.c
> +++ b/drivers/staging/android/ion/ion_test.c
> @@ -231,6 +231,7 @@ static int ion_test_release(struct inode *inode, struct 
> file *file)
>  static const struct file_operations ion_test_fops = {
> .owner = THIS_MODULE,
> .unlocked_ioctl = ion_test_ioctl,
> +   .compat_ioctl = ion_test_ioctl,
> .open = ion_test_open,
> .release = ion_test_release,
>  };
> diff --git a/drivers/staging/android/uapi/ion_test.h 
> b/drivers/staging/android/uapi/ion_test.h
> index 614d1e3..ffef06f 100644
> --- a/drivers/staging/android/uapi/ion_test.h
> +++ b/drivers/staging/android/uapi/ion_test.h
> @@ -32,6 +32,7 @@ struct ion_test_rw_data {
> __u64 offset;
> __u64 size;
> int write;
> +   int __padding;
>  };
>
>  #define ION_IOC_MAGIC  'I'
> --
> 1.8.3.2
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.

Acked-by: Colin Cross 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ion_test: Add compat_ioctl support

2013-12-19 Thread Colin Cross
On Thu, Dec 19, 2013 at 3:30 PM, John Stultz  wrote:
> On 12/19/2013 02:52 PM, Colin Cross wrote:
>> On Thu, Dec 19, 2013 at 2:40 PM, John Stultz  wrote:
>>> Prior to subitting this, Colin reworked the compat_ioctl support
>>> for the ion_test driver, moving the structure to be the same size
>>> on both 32 and 64 bit architectures.
>>>
>>> Two small things were left out. The compat_ioctl ptr assignment,
>>> and the fact that despite having uniform sized types in the
>>> structure, the structure pads out to different sizes on different
>>> arches.
>>>
>>> This patch resolves this issue by setting the write flag as
>>> a __u64, and adding the compat_ioctl ptr.
>>>
>>> While this does affect the ABI for 32bit users, its only
>>> the ABI for the ion_test driver, not ion itself.
>>>
>>> Cc: Colin Cross 
>>> Cc: Greg KH 
>>> Cc: Android Kernel Team 
>>> Signed-off-by: John Stultz 
>>> ---
>>>  drivers/staging/android/ion/ion_test.c  | 1 +
>>>  drivers/staging/android/uapi/ion_test.h | 2 +-
>>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/staging/android/ion/ion_test.c 
>>> b/drivers/staging/android/ion/ion_test.c
>>> index 3e20349..654acb5 100644
>>> --- a/drivers/staging/android/ion/ion_test.c
>>> +++ b/drivers/staging/android/ion/ion_test.c
>>> @@ -231,6 +231,7 @@ static int ion_test_release(struct inode *inode, struct 
>>> file *file)
>>>  static const struct file_operations ion_test_fops = {
>>> .owner = THIS_MODULE,
>>> .unlocked_ioctl = ion_test_ioctl,
>>> +   .compat_ioctl = ion_test_ioctl,
>> Setting the compat_ioctl to the same thing as unlocked_ioctl shouldn't
>> be necessary, compat_sys_ioctl will fall through to do_vfs_ioctl if
>> compat_ioctl is not set, which will call unlocked_ioctl.
>
> Hrm. I've not looked into the implementation, but doesn't seem to be the
> case on x86_64. Without it I get -1 back when calling the ioctl from a
> 32bit application. With it the ioctls work.

I missed that compat_sys_ioctl returns -ENOTTY if do_ioctl_trans
fails, so the compat_ioctl is necessary.

>>
>>> .open = ion_test_open,
>>> .release = ion_test_release,
>>>  };
>>> diff --git a/drivers/staging/android/uapi/ion_test.h 
>>> b/drivers/staging/android/uapi/ion_test.h
>>> index 614d1e3..f1727f5 100644
>>> --- a/drivers/staging/android/uapi/ion_test.h
>>> +++ b/drivers/staging/android/uapi/ion_test.h
>>> @@ -31,7 +31,7 @@ struct ion_test_rw_data {
>>> __u64 ptr;
>>> __u64 offset;
>>> __u64 size;
>>> -   int write;
>>> +   __u64 write;
>>>  };
>> Whoops, missed that one.  Can you add "int padding" after "int write"
>> instead?  That at least lets us use the 32 bits later.
>
> Will do.
>
>
> thanks
> -john
>
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ion_test: Add compat_ioctl support

2013-12-19 Thread Colin Cross
On Thu, Dec 19, 2013 at 2:40 PM, John Stultz  wrote:
> Prior to subitting this, Colin reworked the compat_ioctl support
> for the ion_test driver, moving the structure to be the same size
> on both 32 and 64 bit architectures.
>
> Two small things were left out. The compat_ioctl ptr assignment,
> and the fact that despite having uniform sized types in the
> structure, the structure pads out to different sizes on different
> arches.
>
> This patch resolves this issue by setting the write flag as
> a __u64, and adding the compat_ioctl ptr.
>
> While this does affect the ABI for 32bit users, its only
> the ABI for the ion_test driver, not ion itself.
>
> Cc: Colin Cross 
> Cc: Greg KH 
> Cc: Android Kernel Team 
> Signed-off-by: John Stultz 
> ---
>  drivers/staging/android/ion/ion_test.c  | 1 +
>  drivers/staging/android/uapi/ion_test.h | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/ion_test.c 
> b/drivers/staging/android/ion/ion_test.c
> index 3e20349..654acb5 100644
> --- a/drivers/staging/android/ion/ion_test.c
> +++ b/drivers/staging/android/ion/ion_test.c
> @@ -231,6 +231,7 @@ static int ion_test_release(struct inode *inode, struct 
> file *file)
>  static const struct file_operations ion_test_fops = {
> .owner = THIS_MODULE,
> .unlocked_ioctl = ion_test_ioctl,
> +   .compat_ioctl = ion_test_ioctl,

Setting the compat_ioctl to the same thing as unlocked_ioctl shouldn't
be necessary, compat_sys_ioctl will fall through to do_vfs_ioctl if
compat_ioctl is not set, which will call unlocked_ioctl.

> .open = ion_test_open,
> .release = ion_test_release,
>  };
> diff --git a/drivers/staging/android/uapi/ion_test.h 
> b/drivers/staging/android/uapi/ion_test.h
> index 614d1e3..f1727f5 100644
> --- a/drivers/staging/android/uapi/ion_test.h
> +++ b/drivers/staging/android/uapi/ion_test.h
> @@ -31,7 +31,7 @@ struct ion_test_rw_data {
> __u64 ptr;
> __u64 offset;
> __u64 size;
> -   int write;
> +   __u64 write;
>  };

Whoops, missed that one.  Can you add "int padding" after "int write"
instead?  That at least lets us use the 32 bits later.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ion_test: Add compat_ioctl support

2013-12-19 Thread Colin Cross
On Thu, Dec 19, 2013 at 2:40 PM, John Stultz john.stu...@linaro.org wrote:
 Prior to subitting this, Colin reworked the compat_ioctl support
 for the ion_test driver, moving the structure to be the same size
 on both 32 and 64 bit architectures.

 Two small things were left out. The compat_ioctl ptr assignment,
 and the fact that despite having uniform sized types in the
 structure, the structure pads out to different sizes on different
 arches.

 This patch resolves this issue by setting the write flag as
 a __u64, and adding the compat_ioctl ptr.

 While this does affect the ABI for 32bit users, its only
 the ABI for the ion_test driver, not ion itself.

 Cc: Colin Cross ccr...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Cc: Android Kernel Team kernel-t...@android.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/ion_test.c  | 1 +
  drivers/staging/android/uapi/ion_test.h | 2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

 diff --git a/drivers/staging/android/ion/ion_test.c 
 b/drivers/staging/android/ion/ion_test.c
 index 3e20349..654acb5 100644
 --- a/drivers/staging/android/ion/ion_test.c
 +++ b/drivers/staging/android/ion/ion_test.c
 @@ -231,6 +231,7 @@ static int ion_test_release(struct inode *inode, struct 
 file *file)
  static const struct file_operations ion_test_fops = {
 .owner = THIS_MODULE,
 .unlocked_ioctl = ion_test_ioctl,
 +   .compat_ioctl = ion_test_ioctl,

Setting the compat_ioctl to the same thing as unlocked_ioctl shouldn't
be necessary, compat_sys_ioctl will fall through to do_vfs_ioctl if
compat_ioctl is not set, which will call unlocked_ioctl.

 .open = ion_test_open,
 .release = ion_test_release,
  };
 diff --git a/drivers/staging/android/uapi/ion_test.h 
 b/drivers/staging/android/uapi/ion_test.h
 index 614d1e3..f1727f5 100644
 --- a/drivers/staging/android/uapi/ion_test.h
 +++ b/drivers/staging/android/uapi/ion_test.h
 @@ -31,7 +31,7 @@ struct ion_test_rw_data {
 __u64 ptr;
 __u64 offset;
 __u64 size;
 -   int write;
 +   __u64 write;
  };

Whoops, missed that one.  Can you add int padding after int write
instead?  That at least lets us use the 32 bits later.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ion_test: Add compat_ioctl support

2013-12-19 Thread Colin Cross
On Thu, Dec 19, 2013 at 3:30 PM, John Stultz john.stu...@linaro.org wrote:
 On 12/19/2013 02:52 PM, Colin Cross wrote:
 On Thu, Dec 19, 2013 at 2:40 PM, John Stultz john.stu...@linaro.org wrote:
 Prior to subitting this, Colin reworked the compat_ioctl support
 for the ion_test driver, moving the structure to be the same size
 on both 32 and 64 bit architectures.

 Two small things were left out. The compat_ioctl ptr assignment,
 and the fact that despite having uniform sized types in the
 structure, the structure pads out to different sizes on different
 arches.

 This patch resolves this issue by setting the write flag as
 a __u64, and adding the compat_ioctl ptr.

 While this does affect the ABI for 32bit users, its only
 the ABI for the ion_test driver, not ion itself.

 Cc: Colin Cross ccr...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Cc: Android Kernel Team kernel-t...@android.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/ion_test.c  | 1 +
  drivers/staging/android/uapi/ion_test.h | 2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

 diff --git a/drivers/staging/android/ion/ion_test.c 
 b/drivers/staging/android/ion/ion_test.c
 index 3e20349..654acb5 100644
 --- a/drivers/staging/android/ion/ion_test.c
 +++ b/drivers/staging/android/ion/ion_test.c
 @@ -231,6 +231,7 @@ static int ion_test_release(struct inode *inode, struct 
 file *file)
  static const struct file_operations ion_test_fops = {
 .owner = THIS_MODULE,
 .unlocked_ioctl = ion_test_ioctl,
 +   .compat_ioctl = ion_test_ioctl,
 Setting the compat_ioctl to the same thing as unlocked_ioctl shouldn't
 be necessary, compat_sys_ioctl will fall through to do_vfs_ioctl if
 compat_ioctl is not set, which will call unlocked_ioctl.

 Hrm. I've not looked into the implementation, but doesn't seem to be the
 case on x86_64. Without it I get -1 back when calling the ioctl from a
 32bit application. With it the ioctls work.

I missed that compat_sys_ioctl returns -ENOTTY if do_ioctl_trans
fails, so the compat_ioctl is necessary.


 .open = ion_test_open,
 .release = ion_test_release,
  };
 diff --git a/drivers/staging/android/uapi/ion_test.h 
 b/drivers/staging/android/uapi/ion_test.h
 index 614d1e3..f1727f5 100644
 --- a/drivers/staging/android/uapi/ion_test.h
 +++ b/drivers/staging/android/uapi/ion_test.h
 @@ -31,7 +31,7 @@ struct ion_test_rw_data {
 __u64 ptr;
 __u64 offset;
 __u64 size;
 -   int write;
 +   __u64 write;
  };
 Whoops, missed that one.  Can you add int padding after int write
 instead?  That at least lets us use the 32 bits later.

 Will do.


 thanks
 -john


 To unsubscribe from this group and stop receiving emails from it, send an 
 email to kernel-team+unsubscr...@android.com.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ion_test: Add compat_ioctl support (v2)

2013-12-19 Thread Colin Cross
On Thu, Dec 19, 2013 at 3:56 PM, John Stultz john.stu...@linaro.org wrote:
 Prior to subitting this, Colin reworked the compat_ioctl support
 for the ion_test driver, moving the structure to be the same size
 on both 32 and 64 bit architectures.

 Two small things were left out. The compat_ioctl ptr assignment,
 and the fact that despite having uniform sized types in the
 structure, the structure pads out to different sizes on different
 arches.

 This patch resolves this issue by adding a padding entry after
 the write flag, and adding the compat_ioctl ptr.

 Changes in v2:
 - Add a padding int rather then making write a u64

 Cc: Colin Cross ccr...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Cc: Android Kernel Team kernel-t...@android.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/ion_test.c  | 1 +
  drivers/staging/android/uapi/ion_test.h | 1 +
  2 files changed, 2 insertions(+)

 diff --git a/drivers/staging/android/ion/ion_test.c 
 b/drivers/staging/android/ion/ion_test.c
 index 3e20349..654acb5 100644
 --- a/drivers/staging/android/ion/ion_test.c
 +++ b/drivers/staging/android/ion/ion_test.c
 @@ -231,6 +231,7 @@ static int ion_test_release(struct inode *inode, struct 
 file *file)
  static const struct file_operations ion_test_fops = {
 .owner = THIS_MODULE,
 .unlocked_ioctl = ion_test_ioctl,
 +   .compat_ioctl = ion_test_ioctl,
 .open = ion_test_open,
 .release = ion_test_release,
  };
 diff --git a/drivers/staging/android/uapi/ion_test.h 
 b/drivers/staging/android/uapi/ion_test.h
 index 614d1e3..ffef06f 100644
 --- a/drivers/staging/android/uapi/ion_test.h
 +++ b/drivers/staging/android/uapi/ion_test.h
 @@ -32,6 +32,7 @@ struct ion_test_rw_data {
 __u64 offset;
 __u64 size;
 int write;
 +   int __padding;
  };

  #define ION_IOC_MAGIC  'I'
 --
 1.8.3.2

 To unsubscribe from this group and stop receiving emails from it, send an 
 email to kernel-team+unsubscr...@android.com.

Acked-by: Colin Cross ccr...@android.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3] staging: ion: Avoid using rt_mutexes directly

2013-12-17 Thread Colin Cross
On Tue, Dec 17, 2013 at 5:04 PM, John Stultz  wrote:
> RT_MUTEXES can be configured out of the kernel, causing compile
> problems with ION.
>
> To quote Colin:
> "rt_mutexes were added with the deferred freeing feature.  Heaps need
> to return zeroed memory to userspace, but zeroing the memory on every
> allocation was causing performance issues.  We added a SCHED_IDLE
> thread to zero memory in the background after freeing, but locking the
> heap from the SCHED_IDLE thread might block a high priority allocation
> thread for a long time.
>
> The lock is only used to protect the heap's free_list and
> free_list_size members, and is not held for any long or sleeping
> operations.  Converting to a spinlock should prevent priority
> inversion without using the rt_mutex.  I'd also rename it to free_lock
> to so it doesn't get used as a general heap lock."
>
> Thus this patch converts the rt_mutex usage to a spinlock and
> renames the lock free_lock to be more clear as to its use.
>
> I also had to change a bit of logic in ion_heap_freelist_drain()
> to safely avoid list corruption.
>
> Cc: Colin Cross 
> Cc: Android Kernel Team 
> Cc: Greg KH 
> Reported-by: Jim Davis 
> Signed-off-by: John Stultz 
> ---
>
> Changes from v2:
> - Modified while loop to be more clear, per Colin's suggestion
>
>
>  drivers/staging/android/ion/ion_heap.c | 28 
>  drivers/staging/android/ion/ion_priv.h |  2 +-
>  2 files changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion_heap.c 
> b/drivers/staging/android/ion/ion_heap.c
> index 9cf5622..296c74f 100644
> --- a/drivers/staging/android/ion/ion_heap.c
> +++ b/drivers/staging/android/ion/ion_heap.c
> @@ -160,10 +160,10 @@ int ion_heap_pages_zero(struct page *page, size_t size, 
> pgprot_t pgprot)
>
>  void ion_heap_freelist_add(struct ion_heap *heap, struct ion_buffer *buffer)
>  {
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> list_add(>list, >free_list);
> heap->free_list_size += buffer->size;
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
> wake_up(>waitqueue);
>  }
>
> @@ -171,34 +171,38 @@ size_t ion_heap_freelist_size(struct ion_heap *heap)
>  {
> size_t size;
>
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> size = heap->free_list_size;
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
>
> return size;
>  }
>
>  size_t ion_heap_freelist_drain(struct ion_heap *heap, size_t size)
>  {
> -   struct ion_buffer *buffer, *tmp;
> +   struct ion_buffer *buffer;
> size_t total_drained = 0;
>
> if (ion_heap_freelist_size(heap) == 0)
> return 0;
>
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> if (size == 0)
> size = heap->free_list_size;
>
> -   list_for_each_entry_safe(buffer, tmp, >free_list, list) {
> +   while (!list_empty(>free_list)) {
> if (total_drained >= size)
> break;
> +   buffer = list_first_entry(>free_list, struct ion_buffer,
> + list);
> list_del(>list);
> heap->free_list_size -= buffer->size;
> total_drained += buffer->size;
> +   spin_unlock(>free_lock);
> ion_buffer_destroy(buffer);
> +   spin_lock(>free_lock);
> }
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
>
> return total_drained;
>  }
> @@ -213,16 +217,16 @@ static int ion_heap_deferred_free(void *data)
> wait_event_freezable(heap->waitqueue,
>  ion_heap_freelist_size(heap) > 0);
>
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> if (list_empty(>free_list)) {
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
> continue;
> }
> buffer = list_first_entry(>free_list, struct ion_buffer,
>   list);
> list_del(>list);
> heap->free_list_size -= buffer->size;
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
> ion_buffer_destroy(buffer);
> }
>
> @@ -235,7 +239,7 @@ int ion_heap_init_defer

Re: [PATCH v3] staging: ion: Avoid using rt_mutexes directly

2013-12-17 Thread Colin Cross
On Tue, Dec 17, 2013 at 5:04 PM, John Stultz john.stu...@linaro.org wrote:
 RT_MUTEXES can be configured out of the kernel, causing compile
 problems with ION.

 To quote Colin:
 rt_mutexes were added with the deferred freeing feature.  Heaps need
 to return zeroed memory to userspace, but zeroing the memory on every
 allocation was causing performance issues.  We added a SCHED_IDLE
 thread to zero memory in the background after freeing, but locking the
 heap from the SCHED_IDLE thread might block a high priority allocation
 thread for a long time.

 The lock is only used to protect the heap's free_list and
 free_list_size members, and is not held for any long or sleeping
 operations.  Converting to a spinlock should prevent priority
 inversion without using the rt_mutex.  I'd also rename it to free_lock
 to so it doesn't get used as a general heap lock.

 Thus this patch converts the rt_mutex usage to a spinlock and
 renames the lock free_lock to be more clear as to its use.

 I also had to change a bit of logic in ion_heap_freelist_drain()
 to safely avoid list corruption.

 Cc: Colin Cross ccr...@android.com
 Cc: Android Kernel Team kernel-t...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Reported-by: Jim Davis jim.ep...@gmail.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---

 Changes from v2:
 - Modified while loop to be more clear, per Colin's suggestion


  drivers/staging/android/ion/ion_heap.c | 28 
  drivers/staging/android/ion/ion_priv.h |  2 +-
  2 files changed, 17 insertions(+), 13 deletions(-)

 diff --git a/drivers/staging/android/ion/ion_heap.c 
 b/drivers/staging/android/ion/ion_heap.c
 index 9cf5622..296c74f 100644
 --- a/drivers/staging/android/ion/ion_heap.c
 +++ b/drivers/staging/android/ion/ion_heap.c
 @@ -160,10 +160,10 @@ int ion_heap_pages_zero(struct page *page, size_t size, 
 pgprot_t pgprot)

  void ion_heap_freelist_add(struct ion_heap *heap, struct ion_buffer *buffer)
  {
 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 list_add(buffer-list, heap-free_list);
 heap-free_list_size += buffer-size;
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);
 wake_up(heap-waitqueue);
  }

 @@ -171,34 +171,38 @@ size_t ion_heap_freelist_size(struct ion_heap *heap)
  {
 size_t size;

 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 size = heap-free_list_size;
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);

 return size;
  }

  size_t ion_heap_freelist_drain(struct ion_heap *heap, size_t size)
  {
 -   struct ion_buffer *buffer, *tmp;
 +   struct ion_buffer *buffer;
 size_t total_drained = 0;

 if (ion_heap_freelist_size(heap) == 0)
 return 0;

 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 if (size == 0)
 size = heap-free_list_size;

 -   list_for_each_entry_safe(buffer, tmp, heap-free_list, list) {
 +   while (!list_empty(heap-free_list)) {
 if (total_drained = size)
 break;
 +   buffer = list_first_entry(heap-free_list, struct ion_buffer,
 + list);
 list_del(buffer-list);
 heap-free_list_size -= buffer-size;
 total_drained += buffer-size;
 +   spin_unlock(heap-free_lock);
 ion_buffer_destroy(buffer);
 +   spin_lock(heap-free_lock);
 }
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);

 return total_drained;
  }
 @@ -213,16 +217,16 @@ static int ion_heap_deferred_free(void *data)
 wait_event_freezable(heap-waitqueue,
  ion_heap_freelist_size(heap)  0);

 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 if (list_empty(heap-free_list)) {
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);
 continue;
 }
 buffer = list_first_entry(heap-free_list, struct ion_buffer,
   list);
 list_del(buffer-list);
 heap-free_list_size -= buffer-size;
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);
 ion_buffer_destroy(buffer);
 }

 @@ -235,7 +239,7 @@ int ion_heap_init_deferred_free(struct ion_heap *heap)

 INIT_LIST_HEAD(heap-free_list);
 heap-free_list_size = 0;
 -   rt_mutex_init(heap-lock);
 +   spin_lock_init(heap-free_lock);
 init_waitqueue_head(heap-waitqueue);
 heap-task = kthread_run(ion_heap_deferred_free, heap,
  %s, heap-name);
 diff --git a/drivers/staging/android/ion

Re: [PATCH 3/3] staging: ion: Avoid using rt_mutexes directly.

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 9:07 PM, John Stultz  wrote:
> RT_MUTEXES can be configured out of the kernel, causing compile
> problems with ION.
>
> To quote Colin:
> "rt_mutexes were added with the deferred freeing feature.  Heaps need
> to return zeroed memory to userspace, but zeroing the memory on every
> allocation was causing performance issues.  We added a SCHED_IDLE
> thread to zero memory in the background after freeing, but locking the
> heap from the SCHED_IDLE thread might block a high priority allocation
> thread for a long time.
>
> The lock is only used to protect the heap's free_list and
> free_list_size members, and is not held for any long or sleeping
> operations.  Converting to a spinlock should prevent priority
> inversion without using the rt_mutex.  I'd also rename it to free_lock
> to so it doesn't get used as a general heap lock."
>
> Thus this patch converts the rt_mutex usage to a spinlock and
> renames the lock free_lock to be more clear as to its use.
>
> I also had to change a bit of logic in ion_heap_freelist_drain()
> despite the for loop being a for_each_entry_safe(), I was still
> seeing list corruption or buffer sg table corruption if I dropped
> the lock before calling ion_buffer_destroy().
>
> Not being able to sort out exactly why, I borrowed the loop structure
> from ion_heap_deferred_free() and that works in my testing w/o issue.
>
> Not sure if its the mixing of list traversal methods causing the issue?
> Thoughts would be appreciated.

list_for_each_entry_safe just stores the next pointer to allow
deleting the current pointer, but that isn't safe if you follow my
suggestion to drop the lock because the next pointer could also become
invalid, so the while loop is necessary.

> Cc: Colin Cross 
> Cc: Android Kernel Team 
> Cc: Greg KH 
> Reported-by: Jim Davis 
> Signed-off-by: John Stultz 
> ---
>  drivers/staging/android/ion/ion_heap.c | 31 +++
>  drivers/staging/android/ion/ion_priv.h |  2 +-
>  2 files changed, 20 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion_heap.c 
> b/drivers/staging/android/ion/ion_heap.c
> index 9cf5622..72fe74b 100644
> --- a/drivers/staging/android/ion/ion_heap.c
> +++ b/drivers/staging/android/ion/ion_heap.c
> @@ -160,10 +160,10 @@ int ion_heap_pages_zero(struct page *page, size_t size, 
> pgprot_t pgprot)
>
>  void ion_heap_freelist_add(struct ion_heap *heap, struct ion_buffer *buffer)
>  {
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> list_add(>list, >free_list);
> heap->free_list_size += buffer->size;
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
> wake_up(>waitqueue);
>  }
>
> @@ -171,34 +171,41 @@ size_t ion_heap_freelist_size(struct ion_heap *heap)
>  {
> size_t size;
>
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> size = heap->free_list_size;
> -   rt_mutex_unlock(>lock);
> +   spin_unlock(>free_lock);
>
> return size;
>  }
>
>  size_t ion_heap_freelist_drain(struct ion_heap *heap, size_t size)
>  {
> -   struct ion_buffer *buffer, *tmp;
> +   struct ion_buffer *buffer;
> size_t total_drained = 0;
>
> if (ion_heap_freelist_size(heap) == 0)
> return 0;
>
> -   rt_mutex_lock(>lock);
> +   spin_lock(>free_lock);
> if (size == 0)
> size = heap->free_list_size;
>
> -   list_for_each_entry_safe(buffer, tmp, >free_list, list) {
> +   while (true) {
I'd use while (!list_empty(>free_list)), it makes it clearer
what the loop is for.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] staging: ion: Add HAVE_MEMBLOCK config dependency

2013-12-16 Thread Colin Cross
Acked-by: Colin Cross 

On Mon, Dec 16, 2013 at 9:07 PM, John Stultz  wrote:
> The kbuild test robot reported a build issue w/ ION on m68k:
>
> drivers/staging/android/ion/ion.c: In function 'ion_reserve':
> drivers/staging/android/ion/ion.c:1526:4: error: implicit declaration of 
> function 'memblock_alloc_base' [-Werror=implicit-function-declaration]
> drivers/staging/android/ion/ion.c:1528:11: error: 'MEMBLOCK_ALLOC_ANYWHERE' 
> undeclared (first use in this function)
> drivers/staging/android/ion/ion.c:1528:11: note: each undeclared identifier 
> is reported only once for each function it appears in
> drivers/staging/android/ion/ion.c:1537:4: error: implicit declaration of 
> function 'memblock_reserve' [-Werror=implicit-function-declaration]
>cc1: some warnings being treated as errors
>
> This is caused by ION using memblock functionality which m68k doesn't support.
>
> This patch adds a HAVE_MEMBLOCK dependency to the ION config.
>
> Cc: Colin Cross 
> Cc: Android Kernel Team 
> Cc: Greg KH 
> Cc: kbuild test robot 
> Reported-by: kbuild test robot 
> Signed-off-by: John Stultz 
> ---
>  drivers/staging/android/ion/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/staging/android/ion/Kconfig 
> b/drivers/staging/android/ion/Kconfig
> index b95281e..a9a64ea 100644
> --- a/drivers/staging/android/ion/Kconfig
> +++ b/drivers/staging/android/ion/Kconfig
> @@ -1,5 +1,6 @@
>  menuconfig ION
> bool "Ion Memory Manager"
> +   depends on HAVE_MEMBLOCK
> select GENERIC_ALLOCATOR
> select DMA_SHARED_BUFFER
> ---help---
> --
> 1.8.3.2
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] staging: ion: Fix possible null pointer dereference

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 9:07 PM, John Stultz  wrote:
> The kbuild test robot reported:
>
> drivers/staging/android/ion/ion_system_heap.c:122 alloc_largest_available() 
> error: potential null dereference 'info'.  (kmalloc returns null)
>
> Where the pointer returned from kmalloc goes unchecked for failure.
>
> This patch checks the return for NULL, and reworks the logic, as
> suggested by Colin, so we allocate the page_info structure first.
>
> Cc: Colin Cross 
> Cc: Greg KH 
> Cc: Android Kernel Team 
> Cc: kbuild test robot 
> Reported-by: kbuild test robot 
> Signed-off-by: John Stultz 
> ---
>  drivers/staging/android/ion/ion_system_heap.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/ion_system_heap.c 
> b/drivers/staging/android/ion/ion_system_heap.c
> index 144b2272..7f07291 100644
> --- a/drivers/staging/android/ion/ion_system_heap.c
> +++ b/drivers/staging/android/ion/ion_system_heap.c
> @@ -108,6 +108,10 @@ static struct page_info *alloc_largest_available(struct 
> ion_system_heap *heap,
> struct page_info *info;
> int i;
>
> +   info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
> +   if (!info)
> +   return NULL;
> +
> for (i = 0; i < num_orders; i++) {
> if (size < order_to_size(orders[i]))
> continue;
> @@ -118,11 +122,12 @@ static struct page_info *alloc_largest_available(struct 
> ion_system_heap *heap,
> if (!page)
> continue;
>
> -   info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
> info->page = page;
>     info->order = orders[i];
> return info;
> }
> +   kfree(info);
> +
> return NULL;
>  }
>

Acked-by: Colin Cross 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 3/3] staging: ion: Avoid using rt_mutexes directly.

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 5:22 PM, John Stultz  wrote:
> On 12/16/2013 04:17 PM, Colin Cross wrote:
>> The lock is only used to protect the heap's free_list and
>> free_list_size members, and is not held for any long or sleeping
>> operations.  Converting to a spinlock should prevent priority
>> inversion without using the rt_mutex.  I'd also rename it to free_lock
>> to so it doesn't get used as a general heap lock.
>
> Hrm.. So at least a trivial conversion to use spinlocks doesn't quite
> work out, as we call ion_buffer_destroy() in ion_heap_freelist_drain()
> while holding the lock, and that calls all sorts of not safe stuff.
>
> I'll spend some more time looking at it later tonight, but let me know
> if you have an approach for this case in mind.

Drop and re-grab the lock around ion_buffer_destroy, it's not
necessary during the destroy, and the list iteration is already using
list_for_each_entry_safe, so it doesn't matter if another caller
modifies the list during the destroy.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ion: remove ion_user_handle_t from ion_test.h

2013-12-16 Thread Colin Cross
ion_test.h should not define ion_user_handle_t, and defining it
causes a warning:
In file included from drivers/staging/android/ion/ion_test.c:31:
drivers/staging/android/ion/../uapi/ion_test.h:23: error: redefinition of 
typedef 'ion_user_handle_t'
drivers/staging/android/ion/../uapi/ion.h:23: note: previous declaration of 
'ion_user_handle_t' was here

Reported-by: Andrew Morton 
Signed-off-by: Colin Cross 
---
 drivers/staging/android/uapi/ion_test.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/android/uapi/ion_test.h 
b/drivers/staging/android/uapi/ion_test.h
index 352379a02690..614d1e36f72c 100644
--- a/drivers/staging/android/uapi/ion_test.h
+++ b/drivers/staging/android/uapi/ion_test.h
@@ -20,8 +20,6 @@
 #include 
 #include 
 
-typedef int ion_user_handle_t;
-
 /**
  * struct ion_test_rw_data - metadata passed to the kernel to read handle
  * @ptr:   a pointer to an area at least as large as size
-- 
1.8.5.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 2/3] staging: ion: Fix possible null pointer dereference

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 1:32 PM, John Stultz  wrote:
> The kbuild test robot reported:
>
> drivers/staging/android/ion/ion_system_heap.c:122 alloc_largest_available() 
> error: potential null dereference 'info'.  (kmalloc returns null)
>
> Where the pointer returned from kmalloc goes unchecked for failure.
>
> This patch adds a simple check for a null return, and handles the error.
>
> XXX: Not sure if continue or 'return NULL' is the right thing to do.

Returning NULL is fine, there is no reason the kmalloc is going to
succeed if it retries, and it will just have to allocate more of them
if it starts fulfilling the allocation with smaller order chunks.

Allocating the struct before entering the loop might make error handling easier.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 3/3] staging: ion: Avoid using rt_mutexes directly.

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 1:32 PM, John Stultz  wrote:
> RT_MUTEXES can be configured out of the kernel, causing compile
> problems with ION.
>
> Since there is no documentation as to why directly using rt_mutexes
> is necessary (and very few drivers directly use rt_mutexes), simply
> convert the ion_heap lock to a normal mutex.

rt_mutexes were added with the deferred freeing feature.  Heaps need
to return zeroed memory to userspace, but zeroing the memory on every
allocation was causing performance issues.  We added a SCHED_IDLE
thread to zero memory in the background after freeing, but locking the
heap from the SCHED_IDLE thread might block a high priority allocation
thread for a long time.

The lock is only used to protect the heap's free_list and
free_list_size members, and is not held for any long or sleeping
operations.  Converting to a spinlock should prevent priority
inversion without using the rt_mutex.  I'd also rename it to free_lock
to so it doesn't get used as a general heap lock.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 3/3] staging: ion: Avoid using rt_mutexes directly.

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 1:32 PM, John Stultz john.stu...@linaro.org wrote:
 RT_MUTEXES can be configured out of the kernel, causing compile
 problems with ION.

 Since there is no documentation as to why directly using rt_mutexes
 is necessary (and very few drivers directly use rt_mutexes), simply
 convert the ion_heap lock to a normal mutex.

rt_mutexes were added with the deferred freeing feature.  Heaps need
to return zeroed memory to userspace, but zeroing the memory on every
allocation was causing performance issues.  We added a SCHED_IDLE
thread to zero memory in the background after freeing, but locking the
heap from the SCHED_IDLE thread might block a high priority allocation
thread for a long time.

The lock is only used to protect the heap's free_list and
free_list_size members, and is not held for any long or sleeping
operations.  Converting to a spinlock should prevent priority
inversion without using the rt_mutex.  I'd also rename it to free_lock
to so it doesn't get used as a general heap lock.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 2/3] staging: ion: Fix possible null pointer dereference

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 1:32 PM, John Stultz john.stu...@linaro.org wrote:
 The kbuild test robot reported:

 drivers/staging/android/ion/ion_system_heap.c:122 alloc_largest_available() 
 error: potential null dereference 'info'.  (kmalloc returns null)

 Where the pointer returned from kmalloc goes unchecked for failure.

 This patch adds a simple check for a null return, and handles the error.

 XXX: Not sure if continue or 'return NULL' is the right thing to do.

Returning NULL is fine, there is no reason the kmalloc is going to
succeed if it retries, and it will just have to allocate more of them
if it starts fulfilling the allocation with smaller order chunks.

Allocating the struct before entering the loop might make error handling easier.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ion: remove ion_user_handle_t from ion_test.h

2013-12-16 Thread Colin Cross
ion_test.h should not define ion_user_handle_t, and defining it
causes a warning:
In file included from drivers/staging/android/ion/ion_test.c:31:
drivers/staging/android/ion/../uapi/ion_test.h:23: error: redefinition of 
typedef 'ion_user_handle_t'
drivers/staging/android/ion/../uapi/ion.h:23: note: previous declaration of 
'ion_user_handle_t' was here

Reported-by: Andrew Morton a...@linux-foundation.org
Signed-off-by: Colin Cross ccr...@android.com
---
 drivers/staging/android/uapi/ion_test.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/android/uapi/ion_test.h 
b/drivers/staging/android/uapi/ion_test.h
index 352379a02690..614d1e36f72c 100644
--- a/drivers/staging/android/uapi/ion_test.h
+++ b/drivers/staging/android/uapi/ion_test.h
@@ -20,8 +20,6 @@
 #include linux/ioctl.h
 #include linux/types.h
 
-typedef int ion_user_handle_t;
-
 /**
  * struct ion_test_rw_data - metadata passed to the kernel to read handle
  * @ptr:   a pointer to an area at least as large as size
-- 
1.8.5.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 3/3] staging: ion: Avoid using rt_mutexes directly.

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 5:22 PM, John Stultz john.stu...@linaro.org wrote:
 On 12/16/2013 04:17 PM, Colin Cross wrote:
 The lock is only used to protect the heap's free_list and
 free_list_size members, and is not held for any long or sleeping
 operations.  Converting to a spinlock should prevent priority
 inversion without using the rt_mutex.  I'd also rename it to free_lock
 to so it doesn't get used as a general heap lock.

 Hrm.. So at least a trivial conversion to use spinlocks doesn't quite
 work out, as we call ion_buffer_destroy() in ion_heap_freelist_drain()
 while holding the lock, and that calls all sorts of not safe stuff.

 I'll spend some more time looking at it later tonight, but let me know
 if you have an approach for this case in mind.

Drop and re-grab the lock around ion_buffer_destroy, it's not
necessary during the destroy, and the list iteration is already using
list_for_each_entry_safe, so it doesn't matter if another caller
modifies the list during the destroy.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] staging: ion: Add HAVE_MEMBLOCK config dependency

2013-12-16 Thread Colin Cross
Acked-by: Colin Cross ccr...@android.com

On Mon, Dec 16, 2013 at 9:07 PM, John Stultz john.stu...@linaro.org wrote:
 The kbuild test robot reported a build issue w/ ION on m68k:

 drivers/staging/android/ion/ion.c: In function 'ion_reserve':
 drivers/staging/android/ion/ion.c:1526:4: error: implicit declaration of 
 function 'memblock_alloc_base' [-Werror=implicit-function-declaration]
 drivers/staging/android/ion/ion.c:1528:11: error: 'MEMBLOCK_ALLOC_ANYWHERE' 
 undeclared (first use in this function)
 drivers/staging/android/ion/ion.c:1528:11: note: each undeclared identifier 
 is reported only once for each function it appears in
 drivers/staging/android/ion/ion.c:1537:4: error: implicit declaration of 
 function 'memblock_reserve' [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors

 This is caused by ION using memblock functionality which m68k doesn't support.

 This patch adds a HAVE_MEMBLOCK dependency to the ION config.

 Cc: Colin Cross ccr...@android.com
 Cc: Android Kernel Team kernel-t...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Cc: kbuild test robot fengguang...@intel.com
 Reported-by: kbuild test robot fengguang...@intel.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/Kconfig | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/drivers/staging/android/ion/Kconfig 
 b/drivers/staging/android/ion/Kconfig
 index b95281e..a9a64ea 100644
 --- a/drivers/staging/android/ion/Kconfig
 +++ b/drivers/staging/android/ion/Kconfig
 @@ -1,5 +1,6 @@
  menuconfig ION
 bool Ion Memory Manager
 +   depends on HAVE_MEMBLOCK
 select GENERIC_ALLOCATOR
 select DMA_SHARED_BUFFER
 ---help---
 --
 1.8.3.2

 To unsubscribe from this group and stop receiving emails from it, send an 
 email to kernel-team+unsubscr...@android.com.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] staging: ion: Fix possible null pointer dereference

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 9:07 PM, John Stultz john.stu...@linaro.org wrote:
 The kbuild test robot reported:

 drivers/staging/android/ion/ion_system_heap.c:122 alloc_largest_available() 
 error: potential null dereference 'info'.  (kmalloc returns null)

 Where the pointer returned from kmalloc goes unchecked for failure.

 This patch checks the return for NULL, and reworks the logic, as
 suggested by Colin, so we allocate the page_info structure first.

 Cc: Colin Cross ccr...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Cc: Android Kernel Team kernel-t...@android.com
 Cc: kbuild test robot fengguang...@intel.com
 Reported-by: kbuild test robot fengguang...@intel.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/ion_system_heap.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

 diff --git a/drivers/staging/android/ion/ion_system_heap.c 
 b/drivers/staging/android/ion/ion_system_heap.c
 index 144b2272..7f07291 100644
 --- a/drivers/staging/android/ion/ion_system_heap.c
 +++ b/drivers/staging/android/ion/ion_system_heap.c
 @@ -108,6 +108,10 @@ static struct page_info *alloc_largest_available(struct 
 ion_system_heap *heap,
 struct page_info *info;
 int i;

 +   info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
 +   if (!info)
 +   return NULL;
 +
 for (i = 0; i  num_orders; i++) {
 if (size  order_to_size(orders[i]))
 continue;
 @@ -118,11 +122,12 @@ static struct page_info *alloc_largest_available(struct 
 ion_system_heap *heap,
 if (!page)
 continue;

 -   info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
 info-page = page;
 info-order = orders[i];
 return info;
 }
 +   kfree(info);
 +
 return NULL;
  }


Acked-by: Colin Cross ccr...@android.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] staging: ion: Avoid using rt_mutexes directly.

2013-12-16 Thread Colin Cross
On Mon, Dec 16, 2013 at 9:07 PM, John Stultz john.stu...@linaro.org wrote:
 RT_MUTEXES can be configured out of the kernel, causing compile
 problems with ION.

 To quote Colin:
 rt_mutexes were added with the deferred freeing feature.  Heaps need
 to return zeroed memory to userspace, but zeroing the memory on every
 allocation was causing performance issues.  We added a SCHED_IDLE
 thread to zero memory in the background after freeing, but locking the
 heap from the SCHED_IDLE thread might block a high priority allocation
 thread for a long time.

 The lock is only used to protect the heap's free_list and
 free_list_size members, and is not held for any long or sleeping
 operations.  Converting to a spinlock should prevent priority
 inversion without using the rt_mutex.  I'd also rename it to free_lock
 to so it doesn't get used as a general heap lock.

 Thus this patch converts the rt_mutex usage to a spinlock and
 renames the lock free_lock to be more clear as to its use.

 I also had to change a bit of logic in ion_heap_freelist_drain()
 despite the for loop being a for_each_entry_safe(), I was still
 seeing list corruption or buffer sg table corruption if I dropped
 the lock before calling ion_buffer_destroy().

 Not being able to sort out exactly why, I borrowed the loop structure
 from ion_heap_deferred_free() and that works in my testing w/o issue.

 Not sure if its the mixing of list traversal methods causing the issue?
 Thoughts would be appreciated.

list_for_each_entry_safe just stores the next pointer to allow
deleting the current pointer, but that isn't safe if you follow my
suggestion to drop the lock because the next pointer could also become
invalid, so the while loop is necessary.

 Cc: Colin Cross ccr...@android.com
 Cc: Android Kernel Team kernel-t...@android.com
 Cc: Greg KH gre...@linuxfoundation.org
 Reported-by: Jim Davis jim.ep...@gmail.com
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/ion_heap.c | 31 +++
  drivers/staging/android/ion/ion_priv.h |  2 +-
  2 files changed, 20 insertions(+), 13 deletions(-)

 diff --git a/drivers/staging/android/ion/ion_heap.c 
 b/drivers/staging/android/ion/ion_heap.c
 index 9cf5622..72fe74b 100644
 --- a/drivers/staging/android/ion/ion_heap.c
 +++ b/drivers/staging/android/ion/ion_heap.c
 @@ -160,10 +160,10 @@ int ion_heap_pages_zero(struct page *page, size_t size, 
 pgprot_t pgprot)

  void ion_heap_freelist_add(struct ion_heap *heap, struct ion_buffer *buffer)
  {
 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 list_add(buffer-list, heap-free_list);
 heap-free_list_size += buffer-size;
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);
 wake_up(heap-waitqueue);
  }

 @@ -171,34 +171,41 @@ size_t ion_heap_freelist_size(struct ion_heap *heap)
  {
 size_t size;

 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 size = heap-free_list_size;
 -   rt_mutex_unlock(heap-lock);
 +   spin_unlock(heap-free_lock);

 return size;
  }

  size_t ion_heap_freelist_drain(struct ion_heap *heap, size_t size)
  {
 -   struct ion_buffer *buffer, *tmp;
 +   struct ion_buffer *buffer;
 size_t total_drained = 0;

 if (ion_heap_freelist_size(heap) == 0)
 return 0;

 -   rt_mutex_lock(heap-lock);
 +   spin_lock(heap-free_lock);
 if (size == 0)
 size = heap-free_list_size;

 -   list_for_each_entry_safe(buffer, tmp, heap-free_list, list) {
 +   while (true) {
I'd use while (!list_empty(heap-free_list)), it makes it clearer
what the loop is for.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ion: Don't allow building ION as a module.

2013-12-14 Thread Colin Cross
On Sat, Dec 14, 2013 at 2:19 PM, John Stultz  wrote:
> On 12/14/2013 01:48 PM, Greg KH wrote:
>> On Sat, Dec 14, 2013 at 12:06:45PM -0800, John Stultz wrote:
>>> ION doesn't export the proper symbols for it to be a module. This
>>> causes build issues when ION is configured as a module.
>>>
>>> Since Andorid kernels rarely use modules (I think recent policy
>>> requires no modules?), go ahead and set the ION config to a bool
>>> from the tristate option.
>>>
>>> If folks decide ION as a module is important, we will have to go
>>> through and export the various needed symbols.
>>>
>>> Signed-off-by: John Stultz 
>>> ---
>>>  drivers/staging/android/ion/Kconfig | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/staging/android/ion/Kconfig 
>>> b/drivers/staging/android/ion/Kconfig
>>> index a342d96..b95281e 100644
>>> --- a/drivers/staging/android/ion/Kconfig
>>> +++ b/drivers/staging/android/ion/Kconfig
>>> @@ -1,5 +1,5 @@
>>>  menuconfig ION
>>> -tristate "Ion Memory Manager"
>>> +bool "Ion Memory Manager"
>> Or you can do:
>>   depends on m
>> and leave it as a "tristate", then when switching it off, I think you
>> can then be prompted as 'm' again, but I could be wrong about that.
>>
>> So should ION ever be able to be built as a module in the future?
> My understanding is that Android is moving away from using modules, but
> others could comment more authoritatively. So I don't think this is an
> issue. But as I said in the patch, if folks do want to have ION as a
> module, we'll have to export all the various symbols that ION users need.

There's no reason ION couldn't be built as a module, it just doesn't
make much sense given how central it is to the Android devices it is
used on.  It will need some TODOs fixed in the removal paths, and
probably some refcounting to prevent removing the module while buffers
are allocated.

As for modules in general on Android, they are rarely used on ARM
Android devices and were considered a security risk.  I don't think
anybody would have a problem re-enabling them if
CONFIG_MODULE_SIG_FORCE was also enabled.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 001/115] gpu: ion: Add ION Memory Manager

2013-12-14 Thread Colin Cross
On Sat, Dec 14, 2013 at 1:44 PM, Greg KH  wrote:
> On Sat, Dec 14, 2013 at 01:10:38PM -0800, Colin Cross wrote:
>> On Sat, Dec 14, 2013 at 8:52 AM, Greg KH  wrote:
>> > On Fri, Dec 13, 2013 at 02:23:35PM -0800, John Stultz wrote:
>> >> + idev->debug_root = debugfs_create_dir("ion", NULL);
>> >> + if (IS_ERR_OR_NULL(idev->debug_root))
>> >> + pr_err("ion: failed to create debug files.\n");
>> >
>> > There's no need to check for this at all, you just printed out an error
>> > message if CONFIG_DEBUGFS is not enabled, which isn't all that nice.
>> >
>> > Something to clean up later, no big deal now.
>>
>> Patch 75/115 (ion: remove IS_ERR_OR_NULL) replaces this chunk with:
>> idev->debug_root = debugfs_create_dir("ion", NULL);
>> if (!idev->debug_root)
>> pr_err("ion: failed to create debug files.\n");
>> which won't print an error if debugfs is disabled.
>
> If debugfs is disabled, debugfs_create_dir() returns ERR_PTR(-ENODEV).

ERR_PTR(-ENODEV) is not NULL, so this won't print an error.  I thought
that was the point of returning ERR_PTR when disabled and NULL when
enabled.

> So are you sure you tested this?  :)

I have not tested this, but it seems correct to me.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 001/115] gpu: ion: Add ION Memory Manager

2013-12-14 Thread Colin Cross
On Sat, Dec 14, 2013 at 8:52 AM, Greg KH  wrote:
> On Fri, Dec 13, 2013 at 02:23:35PM -0800, John Stultz wrote:
>> + idev->debug_root = debugfs_create_dir("ion", NULL);
>> + if (IS_ERR_OR_NULL(idev->debug_root))
>> + pr_err("ion: failed to create debug files.\n");
>
> There's no need to check for this at all, you just printed out an error
> message if CONFIG_DEBUGFS is not enabled, which isn't all that nice.
>
> Something to clean up later, no big deal now.

Patch 75/115 (ion: remove IS_ERR_OR_NULL) replaces this chunk with:
idev->debug_root = debugfs_create_dir("ion", NULL);
if (!idev->debug_root)
pr_err("ion: failed to create debug files.\n");
which won't print an error if debugfs is disabled.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 001/115] gpu: ion: Add ION Memory Manager

2013-12-14 Thread Colin Cross
On Sat, Dec 14, 2013 at 8:52 AM, Greg KH gre...@linuxfoundation.org wrote:
 On Fri, Dec 13, 2013 at 02:23:35PM -0800, John Stultz wrote:
 + idev-debug_root = debugfs_create_dir(ion, NULL);
 + if (IS_ERR_OR_NULL(idev-debug_root))
 + pr_err(ion: failed to create debug files.\n);

 There's no need to check for this at all, you just printed out an error
 message if CONFIG_DEBUGFS is not enabled, which isn't all that nice.

 Something to clean up later, no big deal now.

Patch 75/115 (ion: remove IS_ERR_OR_NULL) replaces this chunk with:
idev-debug_root = debugfs_create_dir(ion, NULL);
if (!idev-debug_root)
pr_err(ion: failed to create debug files.\n);
which won't print an error if debugfs is disabled.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 001/115] gpu: ion: Add ION Memory Manager

2013-12-14 Thread Colin Cross
On Sat, Dec 14, 2013 at 1:44 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Sat, Dec 14, 2013 at 01:10:38PM -0800, Colin Cross wrote:
 On Sat, Dec 14, 2013 at 8:52 AM, Greg KH gre...@linuxfoundation.org wrote:
  On Fri, Dec 13, 2013 at 02:23:35PM -0800, John Stultz wrote:
  + idev-debug_root = debugfs_create_dir(ion, NULL);
  + if (IS_ERR_OR_NULL(idev-debug_root))
  + pr_err(ion: failed to create debug files.\n);
 
  There's no need to check for this at all, you just printed out an error
  message if CONFIG_DEBUGFS is not enabled, which isn't all that nice.
 
  Something to clean up later, no big deal now.

 Patch 75/115 (ion: remove IS_ERR_OR_NULL) replaces this chunk with:
 idev-debug_root = debugfs_create_dir(ion, NULL);
 if (!idev-debug_root)
 pr_err(ion: failed to create debug files.\n);
 which won't print an error if debugfs is disabled.

 If debugfs is disabled, debugfs_create_dir() returns ERR_PTR(-ENODEV).

ERR_PTR(-ENODEV) is not NULL, so this won't print an error.  I thought
that was the point of returning ERR_PTR when disabled and NULL when
enabled.

 So are you sure you tested this?  :)

I have not tested this, but it seems correct to me.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ion: Don't allow building ION as a module.

2013-12-14 Thread Colin Cross
On Sat, Dec 14, 2013 at 2:19 PM, John Stultz john.stu...@linaro.org wrote:
 On 12/14/2013 01:48 PM, Greg KH wrote:
 On Sat, Dec 14, 2013 at 12:06:45PM -0800, John Stultz wrote:
 ION doesn't export the proper symbols for it to be a module. This
 causes build issues when ION is configured as a module.

 Since Andorid kernels rarely use modules (I think recent policy
 requires no modules?), go ahead and set the ION config to a bool
 from the tristate option.

 If folks decide ION as a module is important, we will have to go
 through and export the various needed symbols.

 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  drivers/staging/android/ion/Kconfig | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/staging/android/ion/Kconfig 
 b/drivers/staging/android/ion/Kconfig
 index a342d96..b95281e 100644
 --- a/drivers/staging/android/ion/Kconfig
 +++ b/drivers/staging/android/ion/Kconfig
 @@ -1,5 +1,5 @@
  menuconfig ION
 -tristate Ion Memory Manager
 +bool Ion Memory Manager
 Or you can do:
   depends on m
 and leave it as a tristate, then when switching it off, I think you
 can then be prompted as 'm' again, but I could be wrong about that.

 So should ION ever be able to be built as a module in the future?
 My understanding is that Android is moving away from using modules, but
 others could comment more authoritatively. So I don't think this is an
 issue. But as I said in the patch, if folks do want to have ION as a
 module, we'll have to export all the various symbols that ION users need.

There's no reason ION couldn't be built as a module, it just doesn't
make much sense given how central it is to the Android devices it is
used on.  It will need some TODOs fixed in the removal paths, and
probably some refcounting to prevent removing the module while buffers
are allocated.

As for modules in general on Android, they are rarely used on ARM
Android devices and were considered a security risk.  I don't think
anybody would have a problem re-enabling them if
CONFIG_MODULE_SIG_FORCE was also enabled.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ioprio: move userspace api into uapi

2013-12-09 Thread Colin Cross
Move the userspace interface described in
Documentation/block/ioprio.txt from include/linux/ioprio.h to
include/uapi/linux/ioprio.h.

Signed-off-by: Colin Cross 
---
 include/linux/ioprio.h  | 42 +
 include/uapi/linux/ioprio.h | 46 +
 2 files changed, 47 insertions(+), 41 deletions(-)
 create mode 100644 include/uapi/linux/ioprio.h

diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h
index beb9ce1c2c23..45e19bafae6a 100644
--- a/include/linux/ioprio.h
+++ b/include/linux/ioprio.h
@@ -4,47 +4,7 @@
 #include 
 #include 
 
-/*
- * Gives us 8 prio classes with 13-bits of data for each class
- */
-#define IOPRIO_BITS(16)
-#define IOPRIO_CLASS_SHIFT (13)
-#define IOPRIO_PRIO_MASK   ((1UL << IOPRIO_CLASS_SHIFT) - 1)
-
-#define IOPRIO_PRIO_CLASS(mask)((mask) >> IOPRIO_CLASS_SHIFT)
-#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK)
-#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | data)
-
-#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE)
-
-/*
- * These are the io priority groups as implemented by CFQ. RT is the realtime
- * class, it always gets premium service. BE is the best-effort scheduling
- * class, the default for any process. IDLE is the idle scheduling class, it
- * is only served when no one else is using the disk.
- */
-enum {
-   IOPRIO_CLASS_NONE,
-   IOPRIO_CLASS_RT,
-   IOPRIO_CLASS_BE,
-   IOPRIO_CLASS_IDLE,
-};
-
-/*
- * 8 best effort priority levels are supported
- */
-#define IOPRIO_BE_NR   (8)
-
-enum {
-   IOPRIO_WHO_PROCESS = 1,
-   IOPRIO_WHO_PGRP,
-   IOPRIO_WHO_USER,
-};
-
-/*
- * Fallback BE priority
- */
-#define IOPRIO_NORM(4)
+#include 
 
 /*
  * if process has set io priority explicitly, use that. if not, convert
diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h
new file mode 100644
index ..2ee40e2881b8
--- /dev/null
+++ b/include/uapi/linux/ioprio.h
@@ -0,0 +1,46 @@
+#ifndef _UAPI_LINUX_IOPRIO_H
+#define _UAPI_LINUX_IOPRIO_H
+
+/*
+ * Gives us 8 prio classes with 13-bits of data for each class
+ */
+#define IOPRIO_BITS(16)
+#define IOPRIO_CLASS_SHIFT (13)
+#define IOPRIO_PRIO_MASK   ((1UL << IOPRIO_CLASS_SHIFT) - 1)
+
+#define IOPRIO_PRIO_CLASS(mask)((mask) >> IOPRIO_CLASS_SHIFT)
+#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK)
+#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | data)
+
+#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE)
+
+/*
+ * These are the io priority groups as implemented by CFQ. RT is the realtime
+ * class, it always gets premium service. BE is the best-effort scheduling
+ * class, the default for any process. IDLE is the idle scheduling class, it
+ * is only served when no one else is using the disk.
+ */
+enum {
+   IOPRIO_CLASS_NONE,
+   IOPRIO_CLASS_RT,
+   IOPRIO_CLASS_BE,
+   IOPRIO_CLASS_IDLE,
+};
+
+/*
+ * 8 best effort priority levels are supported
+ */
+#define IOPRIO_BE_NR   (8)
+
+enum {
+   IOPRIO_WHO_PROCESS = 1,
+   IOPRIO_WHO_PGRP,
+   IOPRIO_WHO_USER,
+};
+
+/*
+ * Fallback BE priority
+ */
+#define IOPRIO_NORM(4)
+
+#endif /* _UAPI_LINUX_IOPRIO_H */
-- 
1.8.5.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ioprio: move userspace api into uapi

2013-12-09 Thread Colin Cross
Move the userspace interface described in
Documentation/block/ioprio.txt from include/linux/ioprio.h to
include/uapi/linux/ioprio.h.

Signed-off-by: Colin Cross ccr...@android.com
---
 include/linux/ioprio.h  | 42 +
 include/uapi/linux/ioprio.h | 46 +
 2 files changed, 47 insertions(+), 41 deletions(-)
 create mode 100644 include/uapi/linux/ioprio.h

diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h
index beb9ce1c2c23..45e19bafae6a 100644
--- a/include/linux/ioprio.h
+++ b/include/linux/ioprio.h
@@ -4,47 +4,7 @@
 #include linux/sched.h
 #include linux/iocontext.h
 
-/*
- * Gives us 8 prio classes with 13-bits of data for each class
- */
-#define IOPRIO_BITS(16)
-#define IOPRIO_CLASS_SHIFT (13)
-#define IOPRIO_PRIO_MASK   ((1UL  IOPRIO_CLASS_SHIFT) - 1)
-
-#define IOPRIO_PRIO_CLASS(mask)((mask)  IOPRIO_CLASS_SHIFT)
-#define IOPRIO_PRIO_DATA(mask) ((mask)  IOPRIO_PRIO_MASK)
-#define IOPRIO_PRIO_VALUE(class, data) (((class)  IOPRIO_CLASS_SHIFT) | data)
-
-#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE)
-
-/*
- * These are the io priority groups as implemented by CFQ. RT is the realtime
- * class, it always gets premium service. BE is the best-effort scheduling
- * class, the default for any process. IDLE is the idle scheduling class, it
- * is only served when no one else is using the disk.
- */
-enum {
-   IOPRIO_CLASS_NONE,
-   IOPRIO_CLASS_RT,
-   IOPRIO_CLASS_BE,
-   IOPRIO_CLASS_IDLE,
-};
-
-/*
- * 8 best effort priority levels are supported
- */
-#define IOPRIO_BE_NR   (8)
-
-enum {
-   IOPRIO_WHO_PROCESS = 1,
-   IOPRIO_WHO_PGRP,
-   IOPRIO_WHO_USER,
-};
-
-/*
- * Fallback BE priority
- */
-#define IOPRIO_NORM(4)
+#include uapi/linux/ioprio.h
 
 /*
  * if process has set io priority explicitly, use that. if not, convert
diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h
new file mode 100644
index ..2ee40e2881b8
--- /dev/null
+++ b/include/uapi/linux/ioprio.h
@@ -0,0 +1,46 @@
+#ifndef _UAPI_LINUX_IOPRIO_H
+#define _UAPI_LINUX_IOPRIO_H
+
+/*
+ * Gives us 8 prio classes with 13-bits of data for each class
+ */
+#define IOPRIO_BITS(16)
+#define IOPRIO_CLASS_SHIFT (13)
+#define IOPRIO_PRIO_MASK   ((1UL  IOPRIO_CLASS_SHIFT) - 1)
+
+#define IOPRIO_PRIO_CLASS(mask)((mask)  IOPRIO_CLASS_SHIFT)
+#define IOPRIO_PRIO_DATA(mask) ((mask)  IOPRIO_PRIO_MASK)
+#define IOPRIO_PRIO_VALUE(class, data) (((class)  IOPRIO_CLASS_SHIFT) | data)
+
+#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE)
+
+/*
+ * These are the io priority groups as implemented by CFQ. RT is the realtime
+ * class, it always gets premium service. BE is the best-effort scheduling
+ * class, the default for any process. IDLE is the idle scheduling class, it
+ * is only served when no one else is using the disk.
+ */
+enum {
+   IOPRIO_CLASS_NONE,
+   IOPRIO_CLASS_RT,
+   IOPRIO_CLASS_BE,
+   IOPRIO_CLASS_IDLE,
+};
+
+/*
+ * 8 best effort priority levels are supported
+ */
+#define IOPRIO_BE_NR   (8)
+
+enum {
+   IOPRIO_WHO_PROCESS = 1,
+   IOPRIO_WHO_PGRP,
+   IOPRIO_WHO_USER,
+};
+
+/*
+ * Fallback BE priority
+ */
+#define IOPRIO_NORM(4)
+
+#endif /* _UAPI_LINUX_IOPRIO_H */
-- 
1.8.5.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 4:02 PM, Greg KH  wrote:
> On Wed, Dec 04, 2013 at 02:22:13PM -0800, Colin Cross wrote:
>> On Wed, Dec 4, 2013 at 2:02 PM, Greg KH  wrote:
>> > On Wed, Dec 04, 2013 at 01:55:34PM -0800, Colin Cross wrote:
>> >> On Wed, Dec 4, 2013 at 1:43 PM, Greg KH  
>> >> wrote:
>> >> > On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
>> >> >> On Wed, Dec 4, 2013 at 10:35 AM, Greg KH  
>> >> >> wrote:
>> >> >> 
>> >> >>
>> >> >> > And finally, is this all really needed?  Why not just fix the 
>> >> >> > structures
>> >> >> > to be "correct", and then fix userspace to use the correct 
>> >> >> > structures as
>> >> >> > well, thereby not needing a compat layer at all?
>> >> >>
>> >> >> Some of the binder ioctls take userspace pointers.  Are you suggesting
>> >> >> storing those pointers in a __u64 to avoid having to have a
>> >> >> compat_ioctl?
>> >> >
>> >> > Yes, that's the best way to solve the issue, right?
>> >>
>> >> It's the least code, but in exchange you lose all the type safety and
>> >> warnings when copying in and out of the pointers, as well as sparse
>> >> checking on the __user attribute.
>> >
>> > Not if you make the cast right at the beginning, when you first "touch"
>> > the data, but yes, it does take some of the type saftey away, at the
>> > expense of simpler code to mess up :)
>> >
>> >> That doesn't seem like a good tradeoff to me.  In addition it requires
>> >> modifying the existing heavily used 32 bit api, which means a
>> >> mostly-equivalent compat layer added in libbinder to support old
>> >> kernels.
>> >
>> > Wait, I thought that libbinder would have to be changed anyway here, to
>> > handle 64bit kernels (in both 32 and 64bit userspace).  Since you are
>> > already changing it, why not just "do it correctly"?
>>
>> libbinder will need changes to support 64-bit userspace and especially
>> a mixed 64-bit and 32-bit userspace, but this patch series is only
>> addressing a pure 32-bit userspace on a 64-bit kernel.  Support for a
>> 64-bit userspace in Android is obviously going to require a future
>> version of Android including, among other things, libbinder changes.
>> As far as I know, those changes won't need to change the ioctl api,
>> only the layout of the buffers that are passed through the ioctl api.
>
> "only" means you can rearrange things at that point in time, as you will
> have to be doing that anyway :)
>
>> > Or does this patch series mean that no userspace code is changed?  Is
>> > that a "requirement" here?
>>
>> Since this series only addresses 32-bit userspace on 64-bit kernel
>> support there are no associated userspace changes.  Changing the
>> 32-bit api here means that combining the KitKat branch from
>> http://android.googlesource.com with a newer kernel version will not
>> work.
>
> Is that something that anyone has said would work in the past?  It seems
> that other parts of the Android userspace are pretty tied to specific
> kernel features / versions, is this anything new if the binder code had
> to change?

We have explicitly said that Android userspace does not require a
specific kernel version.  I expect to see KitKat devices running at
least 3.4, 3.10, and whatever is the next long term stable version.
Sometimes new kernels need userspace support, but we try to avoid that
as much as possible.  The last major one I can remember was removing
early suspend in 3.4, but in that case I provided a compat 3.4 branch
to allow using 3.4 with an older userspace.  Changing the binder api
is not completely unsupportable for old versions, my guess is some
people would just ship with a new kernel with the binder changes
reverted.

> Anyway, the code as submitted has problems, see my response to the
> second patch, so it's not ready yet anyway :(
>
> thanks,
>
> greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 3:21 PM, One Thousand Gnomes
 wrote:
> On Wed, 4 Dec 2013 10:35:54 -0800
> Greg KH  wrote:
>
>> On Wed, Dec 04, 2013 at 06:09:41PM +, Serban Constantinescu wrote:
>> > +#define size_helper(x) ({  \
>> > +   size_t __size;  \
>> > +   if (!is_compat_task())  \
>> > +   __size = sizeof(x); \
>> > +   else if (sizeof(x) == sizeof(struct flat_binder_object))\
>> > +   __size = sizeof(struct compat_flat_binder_object);  \
>> > +   else if (sizeof(x) == sizeof(struct binder_transaction_data))   \
>> > +   __size = sizeof(struct compat_binder_transaction_data); \
>> > +   else if (sizeof(x) == sizeof(size_t))   \
>> > +   __size = sizeof(compat_size_t); \
>> > +   else\
>> > +BUG(); \
>> > +   __size; \
>> > +   })
>>
>> Ick.
>>
>> First off, no driver should ever be able to crash the kernel, which you
>> just did.
>
> And which would appear to mean that this code hasn't actually been
> tested - at least not properly with error cases ?
>
> You talk about type safety too but your code is already full of
> "put_user(node->ptr, (void * __user *)ptr))"

None of this (the patch series or the original code) is mine.  My
question was more of a general one on designing ioctls, as well as
concerns with changing the existing 32-bit api.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 2:02 PM, Greg KH  wrote:
> On Wed, Dec 04, 2013 at 01:55:34PM -0800, Colin Cross wrote:
>> On Wed, Dec 4, 2013 at 1:43 PM, Greg KH  wrote:
>> > On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
>> >> On Wed, Dec 4, 2013 at 10:35 AM, Greg KH  
>> >> wrote:
>> >> 
>> >>
>> >> > And finally, is this all really needed?  Why not just fix the structures
>> >> > to be "correct", and then fix userspace to use the correct structures as
>> >> > well, thereby not needing a compat layer at all?
>> >>
>> >> Some of the binder ioctls take userspace pointers.  Are you suggesting
>> >> storing those pointers in a __u64 to avoid having to have a
>> >> compat_ioctl?
>> >
>> > Yes, that's the best way to solve the issue, right?
>>
>> It's the least code, but in exchange you lose all the type safety and
>> warnings when copying in and out of the pointers, as well as sparse
>> checking on the __user attribute.
>
> Not if you make the cast right at the beginning, when you first "touch"
> the data, but yes, it does take some of the type saftey away, at the
> expense of simpler code to mess up :)
>
>> That doesn't seem like a good tradeoff to me.  In addition it requires
>> modifying the existing heavily used 32 bit api, which means a
>> mostly-equivalent compat layer added in libbinder to support old
>> kernels.
>
> Wait, I thought that libbinder would have to be changed anyway here, to
> handle 64bit kernels (in both 32 and 64bit userspace).  Since you are
> already changing it, why not just "do it correctly"?

libbinder will need changes to support 64-bit userspace and especially
a mixed 64-bit and 32-bit userspace, but this patch series is only
addressing a pure 32-bit userspace on a 64-bit kernel.  Support for a
64-bit userspace in Android is obviously going to require a future
version of Android including, among other things, libbinder changes.
As far as I know, those changes won't need to change the ioctl api,
only the layout of the buffers that are passed through the ioctl api.

> Or does this patch series mean that no userspace code is changed?  Is
> that a "requirement" here?

Since this series only addresses 32-bit userspace on 64-bit kernel
support there are no associated userspace changes.  Changing the
32-bit api here means that combining the KitKat branch from
http://android.googlesource.com with a newer kernel version will not
work.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 1:43 PM, Greg KH  wrote:
> On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
>> On Wed, Dec 4, 2013 at 10:35 AM, Greg KH  wrote:
>> 
>>
>> > And finally, is this all really needed?  Why not just fix the structures
>> > to be "correct", and then fix userspace to use the correct structures as
>> > well, thereby not needing a compat layer at all?
>>
>> Some of the binder ioctls take userspace pointers.  Are you suggesting
>> storing those pointers in a __u64 to avoid having to have a
>> compat_ioctl?
>
> Yes, that's the best way to solve the issue, right?

It's the least code, but in exchange you lose all the type safety and
warnings when copying in and out of the pointers, as well as sparse
checking on the __user attribute.  That doesn't seem like a good
tradeoff to me.  In addition it requires modifying the existing
heavily used 32 bit api, which means a mostly-equivalent compat layer
added in libbinder to support old kernels.

I would suggest fixing the 32-bit structures to use fixed sizes where
appropriate (__u32 instead of unsigned long) while maintaining
compatibility, and then using compat_ioctl where necessary to handle
pointers.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 10:35 AM, Greg KH  wrote:


> And finally, is this all really needed?  Why not just fix the structures
> to be "correct", and then fix userspace to use the correct structures as
> well, thereby not needing a compat layer at all?

Some of the binder ioctls take userspace pointers.  Are you suggesting
storing those pointers in a __u64 to avoid having to have a
compat_ioctl?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 10:35 AM, Greg KH gre...@linuxfoundation.org wrote:
snip

 And finally, is this all really needed?  Why not just fix the structures
 to be correct, and then fix userspace to use the correct structures as
 well, thereby not needing a compat layer at all?

Some of the binder ioctls take userspace pointers.  Are you suggesting
storing those pointers in a __u64 to avoid having to have a
compat_ioctl?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 1:43 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
 On Wed, Dec 4, 2013 at 10:35 AM, Greg KH gre...@linuxfoundation.org wrote:
 snip

  And finally, is this all really needed?  Why not just fix the structures
  to be correct, and then fix userspace to use the correct structures as
  well, thereby not needing a compat layer at all?

 Some of the binder ioctls take userspace pointers.  Are you suggesting
 storing those pointers in a __u64 to avoid having to have a
 compat_ioctl?

 Yes, that's the best way to solve the issue, right?

It's the least code, but in exchange you lose all the type safety and
warnings when copying in and out of the pointers, as well as sparse
checking on the __user attribute.  That doesn't seem like a good
tradeoff to me.  In addition it requires modifying the existing
heavily used 32 bit api, which means a mostly-equivalent compat layer
added in libbinder to support old kernels.

I would suggest fixing the 32-bit structures to use fixed sizes where
appropriate (__u32 instead of unsigned long) while maintaining
compatibility, and then using compat_ioctl where necessary to handle
pointers.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 2:02 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Wed, Dec 04, 2013 at 01:55:34PM -0800, Colin Cross wrote:
 On Wed, Dec 4, 2013 at 1:43 PM, Greg KH gre...@linuxfoundation.org wrote:
  On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
  On Wed, Dec 4, 2013 at 10:35 AM, Greg KH gre...@linuxfoundation.org 
  wrote:
  snip
 
   And finally, is this all really needed?  Why not just fix the structures
   to be correct, and then fix userspace to use the correct structures as
   well, thereby not needing a compat layer at all?
 
  Some of the binder ioctls take userspace pointers.  Are you suggesting
  storing those pointers in a __u64 to avoid having to have a
  compat_ioctl?
 
  Yes, that's the best way to solve the issue, right?

 It's the least code, but in exchange you lose all the type safety and
 warnings when copying in and out of the pointers, as well as sparse
 checking on the __user attribute.

 Not if you make the cast right at the beginning, when you first touch
 the data, but yes, it does take some of the type saftey away, at the
 expense of simpler code to mess up :)

 That doesn't seem like a good tradeoff to me.  In addition it requires
 modifying the existing heavily used 32 bit api, which means a
 mostly-equivalent compat layer added in libbinder to support old
 kernels.

 Wait, I thought that libbinder would have to be changed anyway here, to
 handle 64bit kernels (in both 32 and 64bit userspace).  Since you are
 already changing it, why not just do it correctly?

libbinder will need changes to support 64-bit userspace and especially
a mixed 64-bit and 32-bit userspace, but this patch series is only
addressing a pure 32-bit userspace on a 64-bit kernel.  Support for a
64-bit userspace in Android is obviously going to require a future
version of Android including, among other things, libbinder changes.
As far as I know, those changes won't need to change the ioctl api,
only the layout of the buffers that are passed through the ioctl api.

 Or does this patch series mean that no userspace code is changed?  Is
 that a requirement here?

Since this series only addresses 32-bit userspace on 64-bit kernel
support there are no associated userspace changes.  Changing the
32-bit api here means that combining the KitKat branch from
http://android.googlesource.com with a newer kernel version will not
work.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 3:21 PM, One Thousand Gnomes
gno...@lxorguk.ukuu.org.uk wrote:
 On Wed, 4 Dec 2013 10:35:54 -0800
 Greg KH gre...@linuxfoundation.org wrote:

 On Wed, Dec 04, 2013 at 06:09:41PM +, Serban Constantinescu wrote:
  +#define size_helper(x) ({  \
  +   size_t __size;  \
  +   if (!is_compat_task())  \
  +   __size = sizeof(x); \
  +   else if (sizeof(x) == sizeof(struct flat_binder_object))\
  +   __size = sizeof(struct compat_flat_binder_object);  \
  +   else if (sizeof(x) == sizeof(struct binder_transaction_data))   \
  +   __size = sizeof(struct compat_binder_transaction_data); \
  +   else if (sizeof(x) == sizeof(size_t))   \
  +   __size = sizeof(compat_size_t); \
  +   else\
  +BUG(); \
  +   __size; \
  +   })

 Ick.

 First off, no driver should ever be able to crash the kernel, which you
 just did.

 And which would appear to mean that this code hasn't actually been
 tested - at least not properly with error cases ?

 You talk about type safety too but your code is already full of
 put_user(node-ptr, (void * __user *)ptr))

None of this (the patch series or the original code) is mine.  My
question was more of a general one on designing ioctls, as well as
concerns with changing the existing 32-bit api.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Colin Cross
On Wed, Dec 4, 2013 at 4:02 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Wed, Dec 04, 2013 at 02:22:13PM -0800, Colin Cross wrote:
 On Wed, Dec 4, 2013 at 2:02 PM, Greg KH gre...@linuxfoundation.org wrote:
  On Wed, Dec 04, 2013 at 01:55:34PM -0800, Colin Cross wrote:
  On Wed, Dec 4, 2013 at 1:43 PM, Greg KH gre...@linuxfoundation.org 
  wrote:
   On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
   On Wed, Dec 4, 2013 at 10:35 AM, Greg KH gre...@linuxfoundation.org 
   wrote:
   snip
  
And finally, is this all really needed?  Why not just fix the 
structures
to be correct, and then fix userspace to use the correct 
structures as
well, thereby not needing a compat layer at all?
  
   Some of the binder ioctls take userspace pointers.  Are you suggesting
   storing those pointers in a __u64 to avoid having to have a
   compat_ioctl?
  
   Yes, that's the best way to solve the issue, right?
 
  It's the least code, but in exchange you lose all the type safety and
  warnings when copying in and out of the pointers, as well as sparse
  checking on the __user attribute.
 
  Not if you make the cast right at the beginning, when you first touch
  the data, but yes, it does take some of the type saftey away, at the
  expense of simpler code to mess up :)
 
  That doesn't seem like a good tradeoff to me.  In addition it requires
  modifying the existing heavily used 32 bit api, which means a
  mostly-equivalent compat layer added in libbinder to support old
  kernels.
 
  Wait, I thought that libbinder would have to be changed anyway here, to
  handle 64bit kernels (in both 32 and 64bit userspace).  Since you are
  already changing it, why not just do it correctly?

 libbinder will need changes to support 64-bit userspace and especially
 a mixed 64-bit and 32-bit userspace, but this patch series is only
 addressing a pure 32-bit userspace on a 64-bit kernel.  Support for a
 64-bit userspace in Android is obviously going to require a future
 version of Android including, among other things, libbinder changes.
 As far as I know, those changes won't need to change the ioctl api,
 only the layout of the buffers that are passed through the ioctl api.

 only means you can rearrange things at that point in time, as you will
 have to be doing that anyway :)

  Or does this patch series mean that no userspace code is changed?  Is
  that a requirement here?

 Since this series only addresses 32-bit userspace on 64-bit kernel
 support there are no associated userspace changes.  Changing the
 32-bit api here means that combining the KitKat branch from
 http://android.googlesource.com with a newer kernel version will not
 work.

 Is that something that anyone has said would work in the past?  It seems
 that other parts of the Android userspace are pretty tied to specific
 kernel features / versions, is this anything new if the binder code had
 to change?

We have explicitly said that Android userspace does not require a
specific kernel version.  I expect to see KitKat devices running at
least 3.4, 3.10, and whatever is the next long term stable version.
Sometimes new kernels need userspace support, but we try to avoid that
as much as possible.  The last major one I can remember was removing
early suspend in 3.4, but in that case I provided a compat 3.4 branch
to allow using 3.4 with an older userspace.  Changing the binder api
is not completely unsupportable for old versions, my guess is some
people would just ship with a new kernel with the binder changes
reverted.

 Anyway, the code as submitted has problems, see my response to the
 second patch, so it's not ready yet anyway :(

 thanks,

 greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   >