date:20130805

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:51 AM, H. Peter Anvin  wrote:
>>
>> Also, how would you pass the parameters? Every tracepoint has its own
>> parameters to pass to it. How would a trap know where to get "prev"
>> and "next"?
>
> How do you do that now?
>
> You have to do an IP lookup to find out what you are doing.

No, he just generates the code for the call and then uses a static_key
to jump to it. So normally it's all out-of-line, and the only thing in
the hot-path is that 5-byte nop (which gets turned into a 5-byte jump
when the tracing key is enabled)

Works fine, but the normally unused stubs end up mixing in the normal
code segment. Which I actually think is fine, but right now we don't
get the short-jump advantage from it (and there is likely some I$
disadvantage from just fragmentation of the code).

With two-byte jumps, you'd still get the I$ fragmentation (the
argument generation and the call and the branch back would all be in
the same code segment as the hot code), but that would be offset by
the fact that at least the hot code itself could use a short jump when
possible (ie a 2-byte nop rather than a 5-byte one).

Don't know which way it would go performance-wise. But it shouldn't
need gcc changes, it just needs the static key branch/nop rewriting to
be able to handle both sizes. I couldn't tell why Steven's series to
do that was so complex, though - I only glanced through the patches.

Linus
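
[Archive note] The two site sizes discussed above can be sketched in plain C.
This is only an illustration under stated assumptions, not the kernel's jump
label code: encode_site() is a hypothetical helper, and the NOP byte patterns
are one common choice (the kernel selects its NOPs per CPU). The jump opcodes
(EB for rel8, E9 for rel32) are the standard x86 encodings, and displacements
are counted from the end of the jump instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86 encodings: "jmp rel8" is EB cb, "jmp rel32" is E9 cd. */
#define JMP8_OPCODE  0xEB
#define JMP32_OPCODE 0xE9

/* One common choice of 2-byte and 5-byte NOPs (an assumption here). */
static const uint8_t nop2[2] = { 0x66, 0x90 };
static const uint8_t nop5[5] = { 0x0F, 0x1F, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper: write the enabled (jump) or disabled (nop) form
 * of a branch site of the given size (2 or 5 bytes) into out[].
 * Returns 0 on success, -1 if the target cannot be reached with a jump
 * of that size.
 */
static int encode_site(uint8_t *out, size_t size, uint64_t site,
                       uint64_t target, int enabled)
{
    /* The displacement is relative to the end of the instruction. */
    int64_t disp = (int64_t)(target - (site + size));
    uint32_t d;

    assert(size == 2 || size == 5);

    if (!enabled) {
        memcpy(out, size == 2 ? nop2 : nop5, size);
        return 0;
    }
    if (size == 2) {
        if (disp < INT8_MIN || disp > INT8_MAX)
            return -1;              /* too far for a short jump */
        out[0] = JMP8_OPCODE;
        out[1] = (uint8_t)(int8_t)disp;
        return 0;
    }
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;
    out[0] = JMP32_OPCODE;
    d = (uint32_t)(int32_t)disp;    /* stored little-endian */
    out[1] = (uint8_t)(d & 0xFF);
    out[2] = (uint8_t)((d >> 8) & 0xFF);
    out[3] = (uint8_t)((d >> 16) & 0xFF);
    out[4] = (uint8_t)((d >> 24) & 0xFF);
    return 0;
}
```

A rewriter that handles both sizes would call this with the size recorded for
(or decoded from) each site, and fall back to the 5-byte form whenever the
2-byte call returns -1.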
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[GIT PULL] timers/nohz updates for 3.12

2013-08-05 Thread Frederic Weisbecker

Ingo,

Please pull the timers/nohz branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/nohz

It mostly contains fixes and full dynticks off-case optimizations. I believe
that distros want to enable this feature, so it seems important to optimize
the case where the "nohz_full=" parameter is empty, i.e. I'm trying to remove
any performance regression that comes with NO_HZ_FULL=y when the feature is
not used.

This patchset improves the current situation a lot: the off-case appears to
be around 11% faster with hackbench. I guess that may vary depending on the
configuration, but it should be significantly faster in any case. There is
still some work to do, though: I can still observe a remaining loss of 1.6%
throughput with hackbench compared to CONFIG_NO_HZ_FULL=n.

Thanks,
Frederic
---

Frederic Weisbecker (23):
  sched: Consolidate open coded preemptible() checks
  context_tracing: Fix guest accounting with native vtime
  vtime: Update a few comments
  context_tracking: Fix runtime CPU off-case
  nohz: Only enable context tracking on full dynticks CPUs
  context_tracking: Remove full dynticks' hacky dependency on wide context tracking
  context_tracking: Ground setup for static key use
  context_tracking: Optimize main APIs off case with static key
  context_tracking: Optimize guest APIs off case with static key
  context_tracking: Optimize context switch off case with static keys
  context_tracking: User/kernel broundary cross trace events
  vtime: Remove a few unneeded generic vtime state checks
  vtime: Fix racy cputime delta update
  context_tracking: Split low level state headers
  hardirq: Split preempt count mask definitions
  m68k: hardirq_count() only need preempt_mask.h
  vtime: Describe overriden functions in dedicated arch headers
  vtime: Optimize full dynticks accounting off case with static keys
  vtime: Always scale generic vtime accounting results
  vtime: Always debug check snapshot source _before_ updating it
  nohz: Rename a few state variables
  nohz: Optimize full dynticks state checks with static keys
  nohz: Optimize full dynticks's sched hooks with static keys


 arch/ia64/include/asm/Kbuild            |    1 +
 arch/m68k/include/asm/irqflags.h        |    2 +-
 arch/powerpc/include/asm/Kbuild         |    1 +
 arch/s390/include/asm/cputime.h         |    3 -
 arch/s390/include/asm/vtime.h           |    7 ++
 arch/s390/kernel/vtime.c                |    1 +
 include/linux/context_tracking.h        |  120 +++--
 include/linux/context_tracking_state.h  |   39 +
 include/linux/hardirq.h                 |  117 +
 include/linux/preempt_mask.h            |  122 +
 include/linux/tick.h                    |   45 +--
 include/linux/vtime.h                   |   74 --
 include/trace/events/context_tracking.h |   58 ++
 init/Kconfig                            |   28 +--
 kernel/context_tracking.c               |  128 ++-
 kernel/sched/core.c                     |    4 +-
 kernel/sched/cputime.c                  |   53 -
 kernel/time/Kconfig                     |    1 -
 kernel/time/tick-sched.c                |   56 ++
 19 files changed, 534 insertions(+), 326 deletions(-)
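
[Archive note] The "off case" optimization the pull request describes follows
a common shape: an inline wrapper tests a cheap enable flag and only calls the
out-of-line slow path when the feature is in use. The sketch below is a
userspace analogy only. A plain bool stands in for the kernel's static key
(which patches the branch itself rather than testing a variable), and all
names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a static key; false means the feature is unused. */
static bool tracking_enabled;

/* Counts how often the expensive path actually ran. */
static int slow_path_calls;

/* Out-of-line slow path: only meaningful when tracking is enabled. */
static void tracking_user_enter_slow(void)
{
    assert(tracking_enabled);
    slow_path_calls++;
}

/* Inline fast path: in the off case this is a single test and nothing else. */
static inline void tracking_user_enter(void)
{
    if (tracking_enabled)
        tracking_user_enter_slow();
}
```

With tracking_enabled left false, callers of tracking_user_enter() never reach
the slow path, which is the behaviour the series aims for when "nohz_full=" is
empty.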

Re: [PATCH 4/4] staging: ozwpan: Return correct hub status.

2013-08-05 Thread Rupesh Gujare


On 05/08/13 19:56, Dan Carpenter wrote:
> On Mon, Aug 05, 2013 at 06:40:15PM +0100, Rupesh Gujare wrote:
>> Fix a bug where we were not returning correct hub status
>> for 8th port.
>
> What are the user visible effects of this bug?

The 8th port is never assigned to a new device, so we lose one port.

> Style nits below.
>
>> Signed-off-by: Rupesh Gujare
>> ---
>>  drivers/staging/ozwpan/ozhcd.c |   11 +--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
>> index b060e43..2f93a00 100644
>> --- a/drivers/staging/ozwpan/ozhcd.c
>> +++ b/drivers/staging/ozwpan/ozhcd.c
>> @@ -1871,17 +1871,24 @@ static int oz_hcd_hub_status_data(struct usb_hcd *hcd, char *buf)
>>      int i;
>>
>>      buf[0] = 0;
>> +    buf[1] = 0;
>>
>>      spin_lock_bh(&ozhcd->hcd_lock);
>>      for (i = 0; i < OZ_NB_PORTS; i++) {
>>          if (ozhcd->ports[i].flags & OZ_PORT_F_CHANGED) {
>>              oz_dbg(HUB, "Port %d changed\n", i);
>>              ozhcd->ports[i].flags &= ~OZ_PORT_F_CHANGED;
>> -            buf[0] |= 1<<(i+1);
>> +            if (i < 7)
>> +                buf[0] |= 1 << (i+1);
>
> Put spaces around math operations:
>     buf[0] |= 1 << (i + 1);
>
>> +            else
>> +                buf[1] |= 1 << (i-7);
>
>     buf[1] |= 1 << (i - 7);

Ok. I will rework the style nits.

--
Regards,
Rupesh Gujare
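
[Archive note] Why the 8th port was lost can be shown with a small standalone
sketch. In the hub status bitmap, bit 0 stands for the hub itself and port
index i maps to bit i + 1, so index 7 needs bit 8, which does not fit in the
first byte. The code below is illustrative only: the function names are
hypothetical, uint8_t replaces the driver's char buffer, and it assumes eight
ports (port indices 0 to 7).

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch logic: every port bit is OR-ed into the first byte. */
static void mark_changed_old(uint8_t buf[2], int i)
{
    assert(i >= 0 && i < 8);
    /* For i == 7 the value is 0x100; storing it in one byte drops the bit. */
    buf[0] = (uint8_t)(buf[0] | (1 << (i + 1)));
}

/* Patched logic: port index 7 spills into bit 0 of the second byte. */
static void mark_changed_new(uint8_t buf[2], int i)
{
    assert(i >= 0 && i < 8);
    if (i < 7)
        buf[0] |= (uint8_t)(1 << (i + 1));
    else
        buf[1] |= (uint8_t)(1 << (i - 7));
}
```

With the old logic a change on port index 7 leaves both bytes zero, so the USB
core never sees it; with the patched logic buf[1] becomes 0x01.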


Re: [PATCH 3/4] staging: ozwpan: Reset port configuration number.

2013-08-05 Thread Rupesh Gujare


On 05/08/13 19:21, Dan Carpenter wrote:
> On Mon, Aug 05, 2013 at 06:40:14PM +0100, Rupesh Gujare wrote:
>> Make sure that we reset port configuration no. when PD departs.
>
> What happens if we don't do this?  What is the user visible effect
> of this patch?

There is no user-visible effect at the moment. The variable is cleared as a
safeguard.


--
Regards,
Rupesh Gujare


Re: [PATCH cgroup/for-3.12] cgroup: make css_for_each_descendant() and friends include the origin css in the iteration

2013-08-05 Thread Aristeu Rozanski

On Sun, Aug 04, 2013 at 07:07:03PM -0400, Tejun Heo wrote:
> I've been wanting to do this for some time and given all the recent
> API updates now seems like a pretty good opportunity.  Verified
> freezer and blkcg.  The conversions are mostly straight forward but
> I'd much appreciate acks from controller maintainers.
> 
> The patch is on top of
> 
>   cgroup/for-3.12 61584e3f4 ("cgroup: Merge branch 'for-3.11-fixes' into for-3.12")
> + [1] cgroup: use cgroup_subsys_state as the primary subsystem interface handle
> + [2] cgroup: make cgroup_event specific to memcg
> 
> and available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-descendant-update
> 
> Thanks.
> 
> [1] https://lkml.org/lkml/2013/8/1/722
> [2] http://thread.gmane.org/gmane.linux.kernel.cgroups/8726
> 
> --- 8< ---
> 
> From 0e84b0865ab8a87f1c1443e4777c20c7f14e13b6 Mon Sep 17 00:00:00 2001
> From: Tejun Heo 
> Date: Sun, 4 Aug 2013 19:01:23 -0400
> 
> Previously, all css descendant iterators didn't include the origin
> (root of subtree) css in the iteration.  The reasons were maintaining
> consistency with css_for_each_child() and that at the time of
> introduction more use cases needed skipping the origin anyway;
> however, given that css_is_descendant() considers self to be a
> descendant, omitting the origin css has become more confusing and
> looking at the accumulated use cases rather clearly indicates that
> including origin would result in simpler code overall.
> 
> While this is a change which can easily lead to subtle bugs, cgroup
> API including the iterators has recently gone through major
> restructuring and no out-of-tree changes will be applicable without
> adjustments making this a relatively acceptable opportunity for this
> type of change.
> 
> The conversions are mostly straight-forward.  If the iteration block
> had explicit origin handling before or after, it's moved inside the
> iteration.  If not, if (pos == origin) continue; is added.  Some
> conversions add extra reference get/put around origin handling by
> consolidating origin handling and the rest.  While the extra ref
> operations aren't strictly necessary, this shouldn't cause any
> noticeable difference.

Looks good. Thanks for the heads up, saved some hours of head scratching
:)

Acked-by: Aristeu Rozanski 

-- 
Aristeu
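
[Archive note] The conversion rule in the quoted changelog ("if (pos == origin)
continue; is added") can be sketched on a toy tree. This is not the cgroup
implementation: struct node, next_pre() and count_subtree() are hypothetical
stand-ins, used only to show a pre-order walk that starts at the origin and a
caller that opts out of visiting it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy tree node; a stand-in for a css, not the kernel structure. */
struct node {
    struct node *parent;
    struct node *first_child;
    struct node *next_sibling;
};

/* Next node after pos in a pre-order walk of origin's subtree, or NULL. */
static struct node *next_pre(struct node *pos, struct node *origin)
{
    if (pos->first_child)
        return pos->first_child;
    /* No children: climb until a node below origin has a sibling. */
    while (pos != origin) {
        if (pos->next_sibling)
            return pos->next_sibling;
        pos = pos->parent;
    }
    return NULL;
}

/* Count visited nodes; the walk now starts at origin itself. */
static int count_subtree(struct node *origin, int skip_origin)
{
    struct node *pos;
    int count = 0;

    assert(origin != NULL);
    for (pos = origin; pos; pos = next_pre(pos, origin)) {
        if (skip_origin && pos == origin)
            continue;   /* old behaviour, now requested explicitly */
        count++;
    }
    return count;
}
```

Callers that previously handled the origin before or after the loop can drop
that special case and let the loop body see it; callers that must not touch
the origin keep the old result by skipping it as shown.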


Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)

2013-08-05 Thread Oleg Nesterov

On 08/05, Steven Rostedt wrote:
>
> On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:
>
> > Signed-off-by: Oleg Nesterov 
> > Tested-by: David Ahern 
> > Reviewed-and-Acked-by: Steven Rostedt 
>
> Just so you know. The standard that we now want to live by is only one
> tag per line. I know I gave you that tag in my email, but when adding it
> to a patch it needs to be:
>
> Reviewed-by: Steven Rostedt 
> Acked-by: Steven Rostedt 
>
> I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
> both, it needs to be like that.

Sorry... should I resend once again ?

Oleg.


Re: [PATCH 4/4] staging: ozwpan: Return correct hub status.

2013-08-05 Thread Dan Carpenter

On Mon, Aug 05, 2013 at 06:40:15PM +0100, Rupesh Gujare wrote:
> Fix a bug where we were not returning correct hub status
> for 8th port.

What are the user visible effects of this bug?

Style nits below.

> 
> Signed-off-by: Rupesh Gujare 
> ---
>  drivers/staging/ozwpan/ozhcd.c |   11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
> index b060e43..2f93a00 100644
> --- a/drivers/staging/ozwpan/ozhcd.c
> +++ b/drivers/staging/ozwpan/ozhcd.c
> @@ -1871,17 +1871,24 @@ static int oz_hcd_hub_status_data(struct usb_hcd *hcd, char *buf)
>   int i;
>  
>   buf[0] = 0;
> + buf[1] = 0;
>  
>   spin_lock_bh(&ozhcd->hcd_lock);
>   for (i = 0; i < OZ_NB_PORTS; i++) {
>   if (ozhcd->ports[i].flags & OZ_PORT_F_CHANGED) {
>   oz_dbg(HUB, "Port %d changed\n", i);
>   ozhcd->ports[i].flags &= ~OZ_PORT_F_CHANGED;
> - buf[0] |= 1<<(i+1);
> + if (i < 7)
> + buf[0] |= 1 << (i+1);

Put spaces around math operations:
buf[0] |= 1 << (i + 1);

> + else
> + buf[1] |= 1 << (i-7);

buf[1] |= 1 << (i - 7);

regards,
dan carpenter


Re: Linux 3.11-rc4

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:46 AM, Oleg Nesterov  wrote:
>
> Heh. I pulled wine-git.
>
> set_thread_context() does a lot of PTRACE_POKEUSER requests and then
> it calls resume_after_ptrace() which simply does PTRACE_DETACH.
>
> I'll recheck tomorrow, but it really looks as if it _wants_ to leak
> the debug registers after detach. And more, it does PTRACE_ATTACH
> only to set these regs.
>
> And this is exactly what fab840f tries to prevent.

Ok, so I guess it's effectively the ABI, and we should just make the
rule be that "if you don't want stale breakpoints, then remove the
breakpoints when you detach".

And thus reverting it is the right thing to do. Agreed?

   Linus

Re: Linux 3.11-rc4

2013-08-05 Thread Oleg Nesterov

On 08/04, Felipe Contreras wrote:
>
> I found a regression while running all v3.11-rcX kernels; StarCraft II
> through wine crashes. The culprit is fab840f (ptrace: PTRACE_DETACH
> should do flush_ptrace_hw_breakpoint(child)), I revert that commit and
> there's no crash.

Heh. I pulled wine-git.

set_thread_context() does a lot of PTRACE_POKEUSER requests and then
it calls resume_after_ptrace() which simply does PTRACE_DETACH.

I'll recheck tomorrow, but it really looks as if it _wants_ to leak
the debug registers after detach. And more, it does PTRACE_ATTACH
only to set these regs.

And this is exactly what fab840f tries to prevent.

Oleg.


Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 11:49 AM, Steven Rostedt wrote:
> On Mon, 2013-08-05 at 11:29 -0700, H. Peter Anvin wrote:
> 
>> Traps nest, that's why there is a stack.  (OK, so you don't want to take
>> the same trap inside the trap handler, but that code should be very
>> limited.)  The trap instruction just becomes very short, but rather
>> slow, call-return.
>>
>> However, when you consider the cost you have to consider that the
>> tracepoint is doing other work, so it may very well amortize out.
> 
> Also, how would you pass the parameters? Every tracepoint has its own
> parameters to pass to it. How would a trap know where to get "prev"
> and "next"?
> 

How do you do that now?

You have to do an IP lookup to find out what you are doing.

(Note: I wonder how much the parameter generation costs the tracepoints.)

-hpa



Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:39 AM, Steven Rostedt  wrote:
>
> I had patches that did exactly this:
>
>  https://lkml.org/lkml/2012/3/8/461
>
> But it got dropped for some reason. I don't remember why. Maybe because
> of the complexity?

Ugh. Why the crazy update_jump_label script stuff? I'd go "Eww" at
that too, it looks crazy. The assembler already knows to make short
2-byte "jmp" instructions for near jumps, and you can just look at the
opcode itself to determine size, why is all that other stuff required?

IOW, 5/7 looks sane, but 4/7 makes me go "there's something wrong with
that series".

 Linus
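
[Archive note] The point that the opcode alone tells you the site size can be
sketched as follows. site_size() is a hypothetical helper, not code from the
patch series. The jump opcodes are the standard x86 encodings; the NOP
patterns are an assumption (one common 2-byte and one common 5-byte form),
since the kernel chooses its NOPs per CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the size in bytes of the instruction at a
 * jump-label site, judged only by its leading bytes, or 0 if the bytes
 * are not one of the expected forms. For the NOP forms the caller must
 * guarantee that the inspected bytes are readable (up to insn[2]).
 */
static size_t site_size(const uint8_t *insn)
{
    assert(insn != NULL);

    switch (insn[0]) {
    case 0xEB:          /* jmp rel8 */
        return 2;
    case 0xE9:          /* jmp rel32 */
        return 5;
    case 0x66:          /* 66 90: 2-byte nop */
        return insn[1] == 0x90 ? 2 : 0;
    case 0x0F:          /* 0F 1F 44 00 00: 5-byte nop */
        return (insn[1] == 0x1F && insn[2] == 0x44) ? 5 : 0;
    default:
        return 0;
    }
}
```

A rewriter built this way would decode the size when it patches a site and
emit the matching nop or jump, with no extra build-time bookkeeping.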

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 11:29 -0700, H. Peter Anvin wrote:

> Traps nest, that's why there is a stack.  (OK, so you don't want to take
> the same trap inside the trap handler, but that code should be very
> limited.)  The trap instruction just becomes very short, but rather
> slow, call-return.
> 
> However, when you consider the cost you have to consider that the
> tracepoint is doing other work, so it may very well amortize out.

Also, how would you pass the parameters? Every tracepoint has its own
parameters to pass to it. How would a trap know where to get "prev"
and "next"?

-- Steve



[PATCH 1/1] Squashfs: Optimized uncompressed buffer loop

2013-08-05 Thread Manish Sharma

Merged the two for loops. We might get a little gain by overlapping
wait_on_bh and the memcpy operations.

Signed-off-by: Manish Sharma 
---
 fs/squashfs/block.c |9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index fb50652..5012f98 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -169,12 +169,6 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 */
int i, in, pg_offset = 0;

-   for (i = 0; i < b; i++) {
-   wait_on_buffer(bh[i]);
-   if (!buffer_uptodate(bh[i]))
-   goto block_release;
-   }
-
for (bytes = length; k < b; k++) {
in = min(bytes, msblk->devblksize - offset);
bytes -= in;
@@ -185,6 +179,9 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
}
avail = min_t(int, in, PAGE_CACHE_SIZE -
pg_offset);
+   wait_on_buffer(bh[k]);
+   if (!buffer_uptodate(bh[k]))
+   goto block_release;
memcpy(buffer[page] + pg_offset,
bh[k]->b_data + offset, avail);
in -= avail;
--
1.7.9.5

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 11:34 AM, Linus Torvalds wrote:
> On Mon, Aug 5, 2013 at 11:24 AM, Linus Torvalds
>  wrote:
>>
>> Ugh. I can see the attraction of your section thing for that case, I
>> just get the feeling that we should be able to do better somehow.
> 
> Hmm.. Quite frankly, Steven, for your use case I think you actually
> want the C goto *labels* associated with a section. Which sounds like
> it might be a cleaner syntax than making it about the basic block
> anyway.
> 

A label wouldn't have an endpoint, though...

-hpa



Re: [PATCH 2/4] staging: ozwpan: Increment port number for new device.

2013-08-05 Thread Rupesh Gujare


On 05/08/13 18:53, Dan Carpenter wrote:
> On Mon, Aug 05, 2013 at 06:40:13PM +0100, Rupesh Gujare wrote:
>> This patch fixes crash issue when there is quick cycle of
>> de-enumeration & enumeration due to loss of wireless link.
>>
>> It is found that sometimes new device (or coming back device)
>> returns very fast, even before USB core read out hub status,
>> resulting in allocation of same port, which results in unstable
>> system & crash.
>>
>> Above issue is resolved by making sure that we always assign
>> new port to new device, making sure that USB core reads correct
>> hub status.
>
> This feels like papering over the problem.  Surely the real fix
> would be to improve the reference counting.
>
> This patch is probably effective but it makes the code more subtle
> and it shows that we don't really understand what we are doing with
> regards to reference counting.

This is probably the easier way to fix the issue, since we don't have a
reference count for ports and rely on flags to check port status.

Any suggestions are appreciated.

--
Regards,
Rupesh Gujare


Re: BUG cxgb3: Check and handle the dma mapping errors

2013-08-05 Thread Jay Fenlason

On Mon, Aug 05, 2013 at 12:59:04PM +1000, Alexey Kardashevskiy wrote:
> Hi!
> 
> Recently I started getting multiple errors like this:
> 
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c3067980 vaddr
> c01fbdaaa882 npages 1
> ... and so on
> 
> This is all happening on a PPC64 "powernv" platform machine. To trigger the
> error state, it is enough to _flood_ ping CXGB3 card from another machine
> (which has Emulex 10Gb NIC + Cisco switch). Just do "ping -f 172.20.1.2"
> and wait 10-15 seconds.
> 
> 
> The messages are coming from arch/powerpc/kernel/iommu.c and basically
> mean that the driver requested more pages than the DMA window has which is
> normally 1GB (there could be another possible source of errors -
> ppc_md.tce_build callback - but on powernv platform it always succeeds).
> 
> 
> The patch after which it broke is:
> commit f83331bab149e29fa2c49cf102c0cd8c3f1ce9f9
> Author: Santosh Rastapur 
> Date:   Tue May 21 04:21:29 2013 +
> cxgb3: Check and handle the dma mapping errors
> 
> Any quick ideas? Thanks!

That patch adds error checking to detect failed dma mapping requests.
Before it, the code always assumed that dma mapping requests succeeded,
whether they actually did or not, so the fact that the older kernel
does not log errors only means that the failures are being ignored,
and any appearance of working is through pure luck.  The machine could
have just crashed at that point.

What behavior does the machine initiating the ping flood observe?
Do the older and newer kernels differ in the
percentage of pings that do not receive replies?  On the newer kernel,
when the mapping errors are detected, the packet that it is trying to
transmit is dropped, but I'm not at all sure what happens on the older
kernel after the dma mapping fails.  As I mentioned earlier, I'm
surprised it does not crash.  Perhaps the folks from Chelsio have a
better idea what happens after a dma mapping error is ignored?

-- JF

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 11:20 -0700, Linus Torvalds wrote:

> Of course, it would be good to optimize static_key_false() itself -
> right now those static key jumps are always five bytes, and while they
> get nopped out, it would still be nice if there was some way to have
> just a two-byte nop (turning into a short branch) *if* we can reach
> another jump that way..For small functions that would be lovely. Oh
> well.

I had patches that did exactly this:

 https://lkml.org/lkml/2012/3/8/461

But it got dropped for some reason. I don't remember why. Maybe because
of the complexity?

-- Steve



Re: [PATCH] usb: fix some scripts/kernel-doc warnings

2013-08-05 Thread Yacine Belkadi

On 08/04/2013 11:29 PM, Greg Kroah-Hartman wrote:
> On Sun, Aug 04, 2013 at 10:05:36PM +0200, Yacine Belkadi wrote:
>> On 08/03/2013 05:29 AM, Greg Kroah-Hartman wrote:
>>> On Fri, Aug 02, 2013 at 08:10:04PM +0200, Yacine Belkadi wrote:
>>>> When building the htmldocs (in verbose mode), scripts/kernel-doc reports the
>>>> following type of warnings:
>>>>
>>>> Warning(drivers/usb/core/usb.c:76): No description found for return value of
>>>> 'usb_find_alt_setting'
>>>>
>>>> Fix them by:
>>>> - adding some missing descriptions of return values
>>>> - using "Return" sections for those descriptions
>>>>
>>>> Signed-off-by: Yacine Belkadi
>>>> ---
>>>>
>>>>  Applied to b3a3a9c441e2c8f6b6760de9331023a7906a4ac6
>>>
>>> What does this line mean?
>>
>> It's the commit on which I created and applied the patch:
>>
>> commit b3a3a9c441e2c8f6b6760de9331023a7906a4ac6
>> Merge: a582e5f e70e78e
>> Author: Linus Torvalds 
>> Date:   Mon Jul 22 19:07:24 2013 -0700
> 
> Odd, I've never seen anyone use that before.  It's really not needed, so
> you don't have to do that in the future.
> 

In hindsight, I should probably have used something like:
"Patch against commit b3a3a9c441e2c8f6b6760de9331023a7906a4ac6".

I thought this information may prove useful in some cases, because of the
nature of the patch, which only modifies comments and may get out of sync
with the code.

Here is an example:
- My local HEAD is at commit c1 when I start creating the patch.
- Some function f doesn't have a description for its return value. I look
into the code and deduce the description. So the description I add is based
on the code at the commit c1.
- Someone else submits a patch that changes the code of the function f, but
I don't see it.
- I send my patch to the maintainer.
- My patch may apply cleanly on top of the other patch (mine only touched
the comments), but the description now doesn't match the function's code,
which is a problem.

I assumed that if I specify the commit on which I worked, it may help the
maintainer decide whether my patch got invalidated by some other patch that
was applied first. Continuing with the example: The maintainer sees that I
worked based on c1, but knows that a patch was applied in the mean time, so
he/she asks me to update my patch first.

What do you think?

Thanks,
Yacine
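
[Archive note] For readers unfamiliar with the "Return" sections the patch
adds, a kernel-doc comment with one looks roughly like this. The function is a
made-up example, not one of the USB functions touched by the patch; only the
comment layout (short description, @parameter lines, then a "Return:" section)
is the point.

```c
#include <assert.h>
#include <stddef.h>

/**
 * find_first_zero() - find the first zero byte in a buffer
 * @buf: buffer to search
 * @len: number of bytes in @buf
 *
 * Return: The index of the first zero byte, or -1 if there is none.
 */
static long find_first_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == 0)
            return (long)i;
    }
    return -1;
}
```

This also shows the concern raised above: if someone later changes the
function to return, say, an error code instead of -1, the comment still
applies cleanly as a patch but no longer describes the code.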

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:24 AM, Linus Torvalds
 wrote:
>
> Ugh. I can see the attraction of your section thing for that case, I
> just get the feeling that we should be able to do better somehow.

Hmm.. Quite frankly, Steven, for your use case I think you actually
want the C goto *labels* associated with a section. Which sounds like
it might be a cleaner syntax than making it about the basic block
anyway.

 Linus

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 11:20 AM, Linus Torvalds wrote:
> 
> Of course, it would be good to optimize static_key_false() itself -
> right now those static key jumps are always five bytes, and while they
> get nopped out, it would still be nice if there was some way to have
> just a two-byte nop (turning into a short branch) *if* we can reach
> another jump that way..For small functions that would be lovely. Oh
> well.
> 

That would definitely require gcc support.  It would be useful, but
probably requires a lot of machinery.

-hpa



Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic from 3.9.*

2013-08-05 Thread Jeff Layton

On Mon, 5 Aug 2013 18:18:03 +
"Myklebust, Trond"  wrote:

> On Mon, 2013-08-05 at 13:37 -0400, Jeff Layton wrote:
> > On Mon, 5 Aug 2013 16:15:01 +
> > "Myklebust, Trond"  wrote:
> > 
> > > From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> > > From: Trond Myklebust 
> > > Date: Mon, 5 Aug 2013 12:06:12 -0400
> > > Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
> > >  nlmclnt_setlockargs
> > > MIME-Version: 1.0
> > > Content-Type: text/plain; charset=UTF-8
> > > Content-Transfer-Encoding: 8bit
> > > 
> > > Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
> > > which case we're in entirely the wrong namespace.
> > > Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
> > > exit_task_namespaces() outside of exit_notify()) now means that
> > > exit_task_work() is called after exit_task_namespaces(), which
> > > triggers an Oops when we're freeing up the locks.
> > > 
> > > Signed-off-by: Trond Myklebust 
> > > Cc: Toralf Förster 
> > > Cc: Oleg Nesterov 
> > > Cc: Nix 
> > > Cc: Jeff Layton 
> > > ---
> > >  fs/lockd/clntproc.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> > > index 9760ecb..acd3947 100644
> > > --- a/fs/lockd/clntproc.c
> > > +++ b/fs/lockd/clntproc.c
> > > @@ -125,14 +125,15 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
> > >  {
> > >   struct nlm_args *argp = &req->a_args;
> > >   struct nlm_lock *lock = &argp->lock;
> > > + char *nodename = req->a_host->h_rpcclnt->cl_nodename;
> > >  
> > >   nlmclnt_next_cookie(&argp->cookie);
> > >   memcpy(&lock->fh, NFS_FH(file_inode(fl->fl_file)), sizeof(struct nfs_fh));
> > > - lock->caller  = utsname()->nodename;
> > > + lock->caller  = nodename;
> > >   lock->oh.data = req->a_owner;
> > >   lock->oh.len  = snprintf(req->a_owner, sizeof(req->a_owner), "%u@%s",
> > >   (unsigned int)fl->fl_u.nfs_fl.owner->pid,
> > > - utsname()->nodename);
> > > + nodename);
> > >   lock->svid = fl->fl_u.nfs_fl.owner->pid;
> > >   lock->fl.fl_start = fl->fl_start;
> > >   lock->fl.fl_end = fl->fl_end;
> > 
> > Looks good to me...
> > 
> > Reviewed-by: Jeff Layton 
> > 
> > Trond, any thoughts on the other oops that Nix posted? The issue there
> > seems to be that we're trying to do the pathwalk to the rpcbind unix
> > socket from exit_task_work(), but that's happening after we've already
> > called exit_fs().
> > 
> > The trivial answer seems to be to simply call exit_task_work() before
> > exit_fs() there, but it seems like we ought to be doing the upcall to
> > rpcbind in a mount namespace from which we know we can reach the
> > socket...
> 
> Isn't it enough to just do the same thing as we did for gss proxy? i.e.
> set the RPC_CLNT_CREATE_NO_IDLE_TIMEOUT flag.
> 
> See attachment.

Yeah, that looks like a reasonable thing to do...

OTOH, is there any other way for a unix socket to end up disconnected
other than if we were to close it? Maybe if rpcbind were stopped, the
socket unlinked and recreated, and rpcbind then started again?

If so then you still could potentially end up in this situation even if
you didn't autoclose it.

-- 
Jeff Layton 

Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic from 3.9.*

2013-08-05 Thread Nix

On 5 Aug 2013, Trond Myklebust told this:
> Does the attached patch fix the problem?

> From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust 
> Date: Mon, 5 Aug 2013 12:06:12 -0400
> Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
>  nlmclnt_setlockargs
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit

It makes it worse. Much, much worse. From a crash every so often when
I'm doing compilations over NFS, I get an immediate panic on startx,
long long before I even try to replicate the earlier panic:

[   83.432358] task: 88041aaa5ac0 ti: 8804199e2000 task.ti: 
8804199e2000
[   83.432428] RIP: 0010:[] [] 
encode_nlm4_lock+0x26/0xbe
[   83.432512] RSP: 0018:8804199e3a78  EFLAGS: 00010286
[   83.432564] RAX:  RBX: 88041a577038 RCX: 
[   83.432630] RDX: 8804193b3098 RSI: 88041a577038 RDI: 008c
[   83.432697] RBP: 8804199e3aa8 R08: 8804193b3098 R09: 0001
[   83.432763] R10: 88042fa12980 R11: 88042fa12980 R12: 8804199e3ae8
[   83.432830] R13: 008c R14: 8804199e3fd8 R15: 815de80e
[   83.432898] FS:  7f594b40c740() GS:88042fa0() 
knlGS:
[   83.432974] CS:  0010 DS:  ES:  CR0: 80050033
[   83.433028] CR2: 008c CR3: 00041ab3d000 CR4: 001407f0
[   83.433095] DR0:  DR1:  DR2: 
[   83.433176] DR3:  DR6: 0ff0 DR7: 0400
[   83.433255] Stack:
[   83.433276]  88041a44fb70 88040004 8804199e3ae8 
88041a577010 
[   83.433360]  8804188e0e00 8804199e3fd8 8804199e3ac8 
8124b0d7 
[   83.433443]  8804188e0e00 8124b086 8804199e3b38 
815e6032 
[   83.433616] Call Trace:
[   83.433646]  [] nlm4_xdr_enc_lockargs+0x51/0x76
[   83.433707]  [] ? nlm4_xdr_enc_cancargs+0x56/0x56
[   83.433769]  [] rpcauth_wrap_req+0x57/0x62
[   83.433826]  [] call_transmit+0x17c/0x1f9
[   83.433880]  [] __rpc_execute+0xe8/0x2ca
[   83.433935]  [] rpc_execute+0x76/0x9d
[   83.433986]  [] rpc_run_task+0x78/0x80
[   83.434039]  [] rpc_call_sync+0x88/0x9e
[   83.434092]  [] nlmclnt_call+0xb5/0x240
[   83.434146]  [] nlmclnt_proc+0x226/0x5fb
[   83.434226]  [] nfs3_proc_lock+0x21/0x23
[   83.434280]  [] do_setlk+0x65/0xee
[   83.434329]  [] nfs_lock+0x14e/0x162
[   83.434382]  [] vfs_lock_file+0x29/0x35
[   83.434435]  [] fcntl_setlk+0x139/0x2c5
[   83.434490]  [] SyS_fcntl+0x2b6/0x47d
[   83.434543]  [] system_call_fastpath+0x16/0x1b
[   83.434600] Code: 5b 41 5c 5d c3 0f 1f 44 00 00 55 31 c0 48 83 c9 ff 48 89 
e5 41 56 41 55 41 54 49 89 fc 53 48 89 f3 48 83 ec 10 4c 8b 2e 4c 89 ef  ae 
4c 89 e7 48 f7 d1 4c 8d 71 ff 41 8d 76 04 e8 9f 16 3a 00 
[   83.435077] RIP [] encode_nlm4_lock+0x26/0xbe
[   83.435140]  RSP 
[   83.435197] CR2: 008c

That's here:

(gdb) list *(encode_nlm4_lock+0x26)
0x8124af69 is in encode_nlm4_lock (fs/lockd/clnt4xdr.c:329).
324  *  string caller_name;
325  */
326 static void encode_caller_name(struct xdr_stream *xdr, const char *name)
327 {
328 /* NB: client-side does not set lock->len */
329 u32 length = strlen(name);
330 __be32 *p;
331
332 p = xdr_reserve_space(xdr, 4 + length);
333 xdr_encode_opaque(p, name, length);

   0x8124af69 <+38>:repnz scas %es:(%rdi),%al

Pretty clearly, "name" can be NULL after this patch...
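
A minimal userspace sketch of the failure mode, not the kernel code:
strlen() on a NULL pointer faults just as in the trace above, so any
guard has to treat a missing name as empty. The helper name
caller_name_len() is hypothetical, and whether the real fix belongs in
the encoder or in making sure the nodename is always set is not settled
in this thread.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length to encode for a caller name.
 * A NULL name is treated like the empty string instead of being
 * handed to strlen(), which would dereference it.
 */
size_t caller_name_len(const char *name)
{
	return name ? strlen(name) : 0;
}
```

With a guard like this the encoder would emit a zero-length name rather
than oops, which hides the bug instead of fixing its cause.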

-- 
NULL && (void)

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 11:23 AM, Steven Rostedt wrote:
> On Mon, 2013-08-05 at 11:17 -0700, H. Peter Anvin wrote:
>> On 08/05/2013 10:55 AM, Steven Rostedt wrote:
>>>
>>> Well, as tracepoints are being added quite a bit in Linux, my concern is
>>> with the inlined functions that they bring. With jump labels they are
>>> disabled in a very unlikely way (the static_key_false() is a nop to skip
>>> the code, and is dynamically enabled to a jump).
>>>
>>
>> Have you considered using traps for tracepoints?  A trapping instruction
>> can be as small as a single byte.  The downside, of course, is that it
>> is extremely suppressed -- the trap is always expensive -- and you then
>> have to do a lookup to find the target based on the originating IP.
> 
> No, never considered it, nor would I. Those that use tracepoints, do use
> them extensively, and adding traps like this would probably cause
> heisenbugs and make tracepoints useless.
> 
> Not to mention, how would we add a tracepoint to a trap handler?
> 

Traps nest, that's why there is a stack.  (OK, so you don't want to take
the same trap inside the trap handler, but that code should be very
limited.)  The trap instruction just becomes very short, but rather
slow, call-return.

However, when you consider the cost you have to consider that the
tracepoint is doing other work, so it may very well amortize out.
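
To illustrate the lookup step, here is a rough userspace sketch,
assuming the trap sites are kept in a table sorted by address. The
struct trap_site layout, the demo_sites table and lookup_tracepoint()
are made up for illustration and are not existing kernel interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: address of a trapping instruction and the
 * tracepoint it stands for. */
struct trap_site {
	uintptr_t ip;
	int tracepoint_id;
};

/* Example table, sorted by ip so that bsearch() can be used. */
const struct trap_site demo_sites[] = {
	{ 0x1000, 1 },
	{ 0x1040, 2 },
	{ 0x2000, 3 },
};

/* bsearch() comparator: key is a uintptr_t, element a trap_site. */
static int cmp_site(const void *key, const void *elem)
{
	uintptr_t ip = *(const uintptr_t *)key;
	const struct trap_site *site = elem;

	return (ip > site->ip) - (ip < site->ip);
}

/* Map the originating IP to a tracepoint id, or -1 if unknown. */
int lookup_tracepoint(const struct trap_site *sites, size_t n, uintptr_t ip)
{
	const struct trap_site *hit;

	hit = bsearch(&ip, sites, n, sizeof(*sites), cmp_site);
	return hit ? hit->tracepoint_id : -1;
}
```

The search is logarithmic in the number of sites, which is the extra
per-hit cost on top of the trap itself.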

-hpa



[PATCH v2] cpufreq: loongson2: fix broken cpufreq

2013-08-05 Thread Aaro Koskinen

Commit 42913c799 (MIPS: Loongson2: Use clk API instead of direct
dereferences) broke the cpufreq functionality on Loongson2 boards:
clk_set_rate() is called before the CPU frequency table is initialized,
and therefore will always fail.

Fix by moving the clk_set_rate() after the table initialization.
Tested on Lemote FuLoong mini-PC.
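
The ordering problem can be modelled in a few lines of userspace C.
This is only a sketch under an assumption: that setting a rate succeeds
only when the rate is listed in the frequency table. The names
cpu_model, model_table_init() and model_set_rate() are hypothetical and
are not the driver's or the clk API's.

```c
#include <stddef.h>

#define TABLE_LEN 9	/* hypothetical: indices 0..8, entries 2..8 used */

struct cpu_model {
	unsigned int table[TABLE_LEN];	/* 0 means "not initialised" */
};

/* Same arithmetic as the patch's loop: entry i holds rate * i / 8. */
void model_table_init(struct cpu_model *m, unsigned int rate)
{
	for (size_t i = 2; i < TABLE_LEN; i++)
		m->table[i] = (unsigned int)((rate * i) / 8);
}

/* Assumed behaviour: a rate is accepted only if the table lists it. */
int model_set_rate(const struct cpu_model *m, unsigned int rate)
{
	for (size_t i = 2; i < TABLE_LEN; i++)
		if (m->table[i] != 0 && m->table[i] == rate)
			return 0;
	return -1;
}

/* Old order: set the rate first, fill the table afterwards. */
int set_rate_before_init(unsigned int rate)
{
	struct cpu_model m = { { 0 } };
	int ret = model_set_rate(&m, rate);

	model_table_init(&m, rate);
	return ret;
}

/* New order: fill the table first, then set the rate. */
int set_rate_after_init(unsigned int rate)
{
	struct cpu_model m = { { 0 } };

	model_table_init(&m, rate);
	return model_set_rate(&m, rate);
}
```

In this model the old order always fails because the table is still
empty, while the new order finds the full rate at index 8.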

Signed-off-by: Aaro Koskinen 
Acked-by: Viresh Kumar 
Cc: sta...@vger.kernel.org
---

Changes since the first version
(http://marc.info/?l=linux-kernel&m=137357177225034&w=2):

- Changed the subject prefix. I guess this should be merged through
  the cpufreq/PM instead of MIPS tree?

- Added ACK from Viresh Kumar.

 drivers/cpufreq/loongson2_cpufreq.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/loongson2_cpufreq.c 
b/drivers/cpufreq/loongson2_cpufreq.c
index bb838b9..9536852 100644
--- a/drivers/cpufreq/loongson2_cpufreq.c
+++ b/drivers/cpufreq/loongson2_cpufreq.c
@@ -118,11 +118,6 @@ static int loongson2_cpufreq_cpu_init(struct 
cpufreq_policy *policy)
clk_put(cpuclk);
return -EINVAL;
}
-   ret = clk_set_rate(cpuclk, rate);
-   if (ret) {
-   clk_put(cpuclk);
-   return ret;
-   }
 
/* clock table init */
for (i = 2;
@@ -130,6 +125,12 @@ static int loongson2_cpufreq_cpu_init(struct 
cpufreq_policy *policy)
 i++)
loongson2_clockmod_table[i].frequency = (rate * i) / 8;
 
+   ret = clk_set_rate(cpuclk, rate);
+   if (ret) {
+   clk_put(cpuclk);
+   return ret;
+   }
+
policy->cur = loongson2_cpufreq_get(policy->cpu);
 
cpufreq_frequency_table_get_attr(&loongson2_clockmod_table[0],
-- 
1.8.3.2


Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:20 AM, Linus Torvalds
 wrote:
>
> The static_key_false() approach with minimal inlining sounds like a
> much better approach overall.

Sorry, I misunderstood your thing. That's actually what you want that
section thing for, because right now you cannot generate the argument
expansion otherwise.

Ugh. I can see the attraction of your section thing for that case, I
just get the feeling that we should be able to do better somehow.

  Linus

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 11:17 -0700, H. Peter Anvin wrote:
> On 08/05/2013 10:55 AM, Steven Rostedt wrote:
> > 
> > Well, as tracepoints are being added quite a bit in Linux, my concern is
> > with the inlined functions that they bring. With jump labels they are
> > disabled in a very unlikely way (the static_key_false() is a nop to skip
> > the code, and is dynamically enabled to a jump).
> > 
> 
> Have you considered using traps for tracepoints?  A trapping instruction
> can be as small as a single byte.  The downside, of course, is that it
> is extremely suppressed -- the trap is always expensive -- and you then
> have to do a lookup to find the target based on the originating IP.

No, never considered it, nor would I. Those that use tracepoints, do use
them extensively, and adding traps like this would probably cause
heisenbugs and make tracepoints useless.

Not to mention, how would we add a tracepoint to a trap handler?

-- Steve



Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck

See attachment for what I actually applied - I think I got what you
suggested (I added a declaration for "total_len").

Forcing a panic worked: some things were logged to pstore.

But on reboot with your patches applied I'm still seeing a GP fault
when pstore is mounted and we find compressed records and inflate them
and install them into the pstore filesystem.  Here's the oops:

general protection fault:  [#1] SMP
Modules linked in:
CPU: 29 PID: 10252 Comm: mount Not tainted 3.11.0-rc3-12-g73bec18 #2
Hardware name: Intel Corporation LH Pass ../SVRBD-ROW_T, BIOS
SE5C600.86B.99.99.x059.091020121352 09/10/2012
task: 88082e934040 ti: 88082e2ec000 task.ti: 88082e2ec000
RIP: 0010:[]  [] pstore_mkfile+0x84/0x410
RSP: 0018:88082e2edc70  EFLAGS: 00010007
RAX: 0246 RBX: 81ca7b20 RCX: 625f6963703e373c
RDX: 00040004 RSI: 0004 RDI: 820aa7e8
RBP: 88082e2edd10 R08: 881026a48000 R09: 
R10: 88102d21efb8 R11:  R12: 881026a48000
R13: 51ffe3560003 R14:  R15: 4450
FS:  7fbd37a2d7e0() GS:88103fca() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fbd37a47000 CR3: 00103dc78000 CR4: 000407e0
Stack:
 881026a4c450 5227 81a3703d 881026a48000
 2e2edd70 88103db34140 0001abaf 36383039
 003a0fb8 881026a48000 88102d21e000 448a
Call Trace:
 [] pstore_get_records+0xed/0x2c0
 [] ? pstore_get_inode+0x50/0x50
 [] pstore_fill_super+0xa2/0xc0
 [] mount_single+0xa2/0xd0
 [] pstore_mount+0x18/0x20
 [] mount_fs+0x43/0x1b0
 [] ? __alloc_percpu+0x10/0x20
 [] vfs_kern_mount+0x6f/0x100
 [] do_mount+0x259/0xa10
 [] ? strndup_user+0x5b/0x80
 [] SyS_mount+0x8e/0xe0
 [] system_call_fastpath+0x16/0x1b
Code: 88 e8 f1 0f 39 00 48 8b 0d 0a 3a a2 00 48 81 f9 00 0d c9 81 75
15 eb 67 0f 1f 80 00 00 00 00 48 8b 09 48 81 f9 00 0d c9 81 74 54 <44>
39 71 18 75 ee 4c 39 69 20 75 e8 48 39 59 10 75 e2 48 89 c6
RIP  [] pstore_mkfile+0x84/0x410
 RSP 
---[ end trace 0e1dd8e3ccfa3dcc ]---
/etc/init.d/functions: line 530: 10252 Segmentation fault  "$@"

Here's the start of my pstore_mkfile() code where the GP fault occurred:

8126d290 :
8126d290:   e8 2b 91 39 00          callq  816063c0 <__fentry__>
8126d295:   55                      push   %rbp
8126d296:   48 89 e5                mov    %rsp,%rbp
8126d299:   41 57                   push   %r15
8126d29b:   41 56                   push   %r14
8126d29d:   41 89 fe                mov    %edi,%r14d
8126d2a0:   48 c7 c7 e8 a7 0a 82    mov    $0x820aa7e8,%rdi
8126d2a7:   41 55                   push   %r13
8126d2a9:   49 89 d5                mov    %rdx,%r13
8126d2ac:   41 54                   push   %r12
8126d2ae:   53                      push   %rbx
8126d2af:   48 83 ec 78             sub    $0x78,%rsp
8126d2b3:   89 4d 84                mov    %ecx,-0x7c(%rbp)
8126d2b6:   48 89 b5 70 ff ff ff    mov    %rsi,-0x90(%rbp)
8126d2bd:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
8126d2c4:   00 00
8126d2c6:   48 89 45 d0             mov    %rax,-0x30(%rbp)
8126d2ca:   31 c0                   xor    %eax,%eax
8126d2cc:   48 8b 05 0d d5 e3 00    mov    0xe3d50d(%rip),%rax        # 820aa7e0
8126d2d3:   4c 89 85 78 ff ff ff    mov    %r8,-0x88(%rbp)
8126d2da:   44 89 4d 80             mov    %r9d,-0x80(%rbp)
8126d2de:   48 8b 5d 28             mov    0x28(%rbp),%rbx
8126d2e2:   48 8b 40 60             mov    0x60(%rax),%rax
8126d2e6:   48 89 45 88             mov    %rax,-0x78(%rbp)
8126d2ea:   e8 f1 0f 39 00          callq  815fe2e0 <_raw_spin_lock_irqsave>
8126d2ef:   48 8b 0d 0a 3a a2 00    mov    0xa23a0a(%rip),%rcx        # 81c90d00
8126d2f6:   48 81 f9 00 0d c9 81    cmp    $0x81c90d00,%rcx
8126d2fd:   75 15                   jne    8126d314
8126d2ff:   eb 67                   jmp    8126d368
8126d301:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
8126d308:   48 8b 09                mov    (%rcx),%rcx
8126d30b:   48 81 f9 00 0d c9 81    cmp    $0x81c90d00,%rcx
8126d312:   74 54                   je     8126d368
8126d314:   44 39 71 18             cmp    %r14d,0x18(%rcx)   << GP fault here
8126d318:   75 ee                   jne    8126d308
8126d31a:   4c 39 69 20             cmp    %r13,0x20(%rcx)
8126d31e:   75 e8                   jne    8126d308
8126d320:   48 39 59 10             cmp

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov

On Mon, Aug 05, 2013 at 08:50:17AM -0700, Andrew Fish wrote:
> AFAICT EFI pre-dates kexec merge into mainline by a number of years as
> SetVirtualAddressMap() was part of EFI 1.0 (previous millennium)

Ok, fair enough.

> The EFI to UEFI conversion was placing EFI 1.10 into an industry
> standard, UEFI 2.0. UEFI is an industry standard so some one just
> needs to make a proposal to update the spec. The edk2 open source
> project is not part of the standards body so complaining on this
> mailing list is not going to get anything changed.

Right, I don't think that even changing the spec would help - it would
actually make things worse because then we'd have to differentiate
between UEFI versions: those which can do SetVirtualAddressMap() more
than once and the older ones.

So let's drop the discussion here - it is what it is, it is too late to
change anything. At least we talked about it. :-)

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH 3/4] staging: ozwpan: Reset port configuration number.

2013-08-05 Thread Dan Carpenter

On Mon, Aug 05, 2013 at 06:40:14PM +0100, Rupesh Gujare wrote:
> Make sure that we reset port configuration no. when PD departs.
> 

What happens if we don't do this?  What is the user visible effect
of this patch?

regards,
dan carpenter


Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 10:55 AM, Steven Rostedt  wrote:
>
> My main concern is with tracepoints. Which on 90% (or more) of systems
> running Linux, is completely off, and basically just dead code, until
> someone wants to see what's happening and enables them.

The static_key_false() approach with minimal inlining sounds like a
much better approach overall. Sure, it might add a call/ret, but it
adds it to just the unlikely tracepoint taken path.

Of course, it would be good to optimize static_key_false() itself -
right now those static key jumps are always five bytes, and while they
get nopped out, it would still be nice if there was some way to have
just a two-byte nop (turning into a short branch) *if* we can reach
another jump that way. For small functions that would be lovely. Oh
well.
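
A sketch of the size choice being discussed, in plain userspace C: on
x86 a jump with an 8-bit displacement (opcode 0xEB) takes two bytes,
one with a 32-bit displacement (opcode 0xE9) takes five, and the
displacement is counted from the end of the jump instruction. The
helper names jmp_len() and encode_jmp() are hypothetical; this is not
how the kernel's jump label code is written.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes needed for a jump placed at `from` that targets `to`:
 * 2 if a rel8 displacement reaches, 5 if a rel32 one does, else 0.
 */
size_t jmp_len(int64_t from, int64_t to)
{
	int64_t rel8 = to - (from + 2);
	int64_t rel32 = to - (from + 5);

	if (rel8 >= INT8_MIN && rel8 <= INT8_MAX)
		return 2;
	if (rel32 >= INT32_MIN && rel32 <= INT32_MAX)
		return 5;
	return 0;
}

/* Write the jump into buf (at least 5 bytes); returns the length used. */
size_t encode_jmp(uint8_t *buf, int64_t from, int64_t to)
{
	size_t len = jmp_len(from, to);

	if (len == 2) {
		buf[0] = 0xEB;			/* jmp rel8 */
		buf[1] = (uint8_t)(to - (from + 2));
	} else if (len == 5) {
		uint32_t rel = (uint32_t)(to - (from + 5));

		buf[0] = 0xE9;			/* jmp rel32 */
		buf[1] = (uint8_t)(rel & 0xff);	/* little-endian bytes */
		buf[2] = (uint8_t)((rel >> 8) & 0xff);
		buf[3] = (uint8_t)((rel >> 16) & 0xff);
		buf[4] = (uint8_t)((rel >> 24) & 0xff);
	}
	return len;
}
```

The nop left in place while the key is disabled would have to be the
same length as the jump that replaces it, so the patching code needs to
know which of the two sizes each site uses.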

Linus

Re: [PATCH 1/4] staging: ozwpan: Fixes crash due to invalid port aceess.

2013-08-05 Thread Dan Carpenter

On Mon, Aug 05, 2013 at 06:40:12PM +0100, Rupesh Gujare wrote:
> This patch fixes kernel crash issue, when we receive URB request
> after de-enumerating device.
> 

In other words we were getting a NULL dereference dereferencing
"ep".  There is an existing check already, which should be cleaned
up.

drivers/staging/ozwpan/ozhcd.c
   498  
   499  if (ep && port->hpd) {
^^
This useless existing check should be removed.

   500  list_add_tail(&urbl->link, &ep->urb_list);
   501  if (!in_dir && ep_addr && (ep->credit < 0)) {
   502  getrawmonotonic(&ep->timestamp);
   503  ep->credit = 0;
   504  }
   505  } else {
   506  err = -EPIPE;
   507  }

I'm not sure that -ENOMEM is the correct error code, but I
also don't know what else to use.

I had a style nit pick as well, below.

> Signed-off-by: Rupesh Gujare 
> ---
>  drivers/staging/ozwpan/ozhcd.c |9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
> index ed63868..d313a63 100644
> --- a/drivers/staging/ozwpan/ozhcd.c
> +++ b/drivers/staging/ozwpan/ozhcd.c
> @@ -480,10 +480,14 @@ static int oz_enqueue_ep_urb(struct oz_port *port, u8 
> ep_addr, int in_dir,
>   oz_free_urb_link(urbl);
>   return 0;
>   }
> - if (in_dir)
> + if (in_dir && port->in_ep[ep_addr])
>   ep = port->in_ep[ep_addr];
> - else
> + else if (!in_dir && port->out_ep[ep_addr])
>   ep = port->out_ep[ep_addr];

In the future, use kernel braces style.  If one side of the if else
statement gets a brace then they both get one.  So it's like this:

	if (in_dir && port->in_ep[ep_addr]) {
		ep = port->in_ep[ep_addr];
	} else if (!in_dir && port->out_ep[ep_addr]) {
		ep = port->out_ep[ep_addr];
	} else {
		err = -ENOMEM;
		goto out;
	}

Or another simpler way to write this would be:

	ep = NULL;
	if (in_dir)
		ep = port->in_ep[ep_addr];
	else
		ep = port->out_ep[ep_addr];
	if (!ep) {
		err = -ENOMEM;
		goto unlock;
	}

regards,
dan carpenter


[PATCH RESEND] ARM: dts: Add USBPHY nodes to Exynos4x12

2013-08-05 Thread Dongjin Kim

This patch adds device nodes for USBPHY to Exynos4x12.

CC: Sachin Kamat 
Signed-off-by: Dongjin Kim 
---
 arch/arm/boot/dts/exynos4x12.dtsi |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm/boot/dts/exynos4x12.dtsi 
b/arch/arm/boot/dts/exynos4x12.dtsi
index 01da194..9c3335b 100644
--- a/arch/arm/boot/dts/exynos4x12.dtsi
+++ b/arch/arm/boot/dts/exynos4x12.dtsi
@@ -73,4 +73,22 @@
clock-names = "sclk_fimg2d", "fimg2d";
status = "disabled";
};
+
+   usbphy@125B0 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "samsung,exynos4x12-usb2phy";
+   reg = <0x125B 0x100>;
+   ranges;
+
+   clocks = <&clock 2>, <&clock 305>;
+   clock-names = "xusbxti", "otg";
+   status = "disabled";
+
+   usbphy-sys {
+   /* USB device and host PHY_CONTROL registers */
+   reg = <0x10020704 0xc>,
+ <0x1001021c 0x4>;
+   };
+   };
 };
-- 
1.7.9.5


Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic from 3.9.*

2013-08-05 Thread Myklebust, Trond

On Mon, 2013-08-05 at 13:37 -0400, Jeff Layton wrote:
> On Mon, 5 Aug 2013 16:15:01 +
> "Myklebust, Trond"  wrote:
> 
> > From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> > From: Trond Myklebust 
> > Date: Mon, 5 Aug 2013 12:06:12 -0400
> > Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
> >  nlmclnt_setlockargs
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=UTF-8
> > Content-Transfer-Encoding: 8bit
> > 
> > Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
> > which case we're in entirely the wrong namespace.
> > Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
> > exit_task_namespaces() outside of exit_notify()) now means that
> > exit_task_work() is called after exit_task_namespaces(), which
> > triggers an Oops when we're freeing up the locks.
> > 
> > Signed-off-by: Trond Myklebust 
> > Cc: Toralf Förster 
> > Cc: Oleg Nesterov 
> > Cc: Nix 
> > Cc: Jeff Layton 
> > ---
> >  fs/lockd/clntproc.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> > index 9760ecb..acd3947 100644
> > --- a/fs/lockd/clntproc.c
> > +++ b/fs/lockd/clntproc.c
> > @@ -125,14 +125,15 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, 
> > struct file_lock *fl)
> >  {
> > struct nlm_args *argp = &req->a_args;
> > struct nlm_lock *lock = &argp->lock;
> > +   char *nodename = req->a_host->h_rpcclnt->cl_nodename;
> >  
> > nlmclnt_next_cookie(&argp->cookie);
> > memcpy(&lock->fh, NFS_FH(file_inode(fl->fl_file)), sizeof(struct 
> > nfs_fh));
> > -   lock->caller  = utsname()->nodename;
> > +   lock->caller  = nodename;
> > lock->oh.data = req->a_owner;
> > lock->oh.len  = snprintf(req->a_owner, sizeof(req->a_owner), "%u@%s",
> > (unsigned int)fl->fl_u.nfs_fl.owner->pid,
> > -   utsname()->nodename);
> > +   nodename);
> > lock->svid = fl->fl_u.nfs_fl.owner->pid;
> > lock->fl.fl_start = fl->fl_start;
> > lock->fl.fl_end = fl->fl_end;
> 
> Looks good to me...
> 
> Reviewed-by: Jeff Layton 
> 
> Trond, any thoughts on the other oops that Nix posted? The issue there
> seems to be that we're trying to do the pathwalk to the rpcbind unix
> socket from exit_task_work(), but that's happening after we've already
> called exit_fs().
> 
> The trivial answer seems to be to simply call exit_task_work() before
> exit_fs() there, but it seems like we ought to be doing the upcall to
> rpcbind in a mount namespace from which we know we can reach the
> socket...

Isn't it enough to just do the same thing as we did for gss proxy? i.e.
set the RPC_CLNT_CREATE_NO_IDLE_TIMEOUT flag.

See attachment.
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
trond.mykleb...@netapp.com
www.netapp.com
From ab56d77893815b1b9f0aaa7a89cee7c832a31cff Mon Sep 17 00:00:00 2001
From: Trond Myklebust 
Date: Mon, 5 Aug 2013 14:10:43 -0400
Subject: [PATCH] SUNRPC: Don't auto-disconnect from the local rpcbind socket

There is no need for the kernel to time out the AF_LOCAL connection to
the rpcbind socket, and doing so is problematic because when it is
time to reconnect, our process may no longer be using the same mount
namespace.

Reported-by: Nix 
Signed-off-by: Trond Myklebust 
Cc: Jeff Layton 
Cc: sta...@vger.kernel.org # 3.9.x
---
 net/sunrpc/rpcb_clnt.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 3df764d..4b00555 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -238,6 +238,15 @@ static int rpcb_create_local_unix(struct net *net)
 		.program	= &rpcb_program,
 		.version	= RPCBVERS_2,
 		.authflavor	= RPC_AUTH_NULL,
+		/*
+		 * We turn off the idle timeout to prevent the kernel
+		 * from automatically disconnecting the socket.
+		 * Otherwise, we'd have to cache the mount namespace
+		 * of the caller and somehow pass that to the socket
+		 * reconnect code.
+		 */
+		.flags		= RPC_CLNT_CREATE_NOPING |
+  RPC_CLNT_CREATE_NO_IDLE_TIMEOUT,
 	};
 	struct rpc_clnt *clnt, *clnt4;
 	int result = 0;
-- 
1.8.3.1

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 10:55 AM, Steven Rostedt wrote:
> 
> Well, as tracepoints are being added quite a bit in Linux, my concern is
> with the inlined functions that they bring. With jump labels they are
> disabled in a very unlikely way (the static_key_false() is a nop to skip
> the code, and is dynamically enabled to a jump).
> 

Have you considered using traps for tracepoints?  A trapping instruction
can be as small as a single byte.  The downside, of course, is that it
is extremely suppressed -- the trap is always expensive -- and you then
have to do a lookup to find the target based on the originating IP.

-hpa


Re: [PATCH] net/vmw_vsock/af_vsock.c: drop unneeded semicolon

2013-08-05 Thread David Miller

From: Julia Lawall 
Date: Mon,  5 Aug 2013 16:47:38 +0200

> From: Julia Lawall 
> 
> Drop the semicolon at the end of the list_for_each_entry loop header.
> 
> Signed-off-by: Julia Lawall 
> 
> ---
> Not tested, but I can't imagine how the current code could work, since vsk
> should end up pointing to a dummy value.

This bug has been there since the code was first checked in, and indeed
it's going to work on garbage since it's going to pass in the list
head transformed into a vsock structure.

Applied, thanks Julia.
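
For readers who have not hit this class of bug, a small userspace
sketch of what the stray semicolon does. The list type, the
for_each_item() macro and the function names are simplified stand-ins
invented for this example, not the kernel's list.h or the vsock code.
The list head is embedded in a sentinel item here so that the demo stays
well defined; in the real bug the head is not part of such a structure,
so the loop body sees garbage.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next;
};

struct item {
	int id;
	struct list_head link;
};

/* Recover the item that embeds a given list_head. */
#define item_of(ptr) \
	((struct item *)((char *)(ptr) - offsetof(struct item, link)))

/* Simplified stand-in for list_for_each_entry(). */
#define for_each_item(pos, head) \
	for (pos = item_of((head)->next); &(pos)->link != (head); \
	     pos = item_of((pos)->link.next))

/* Correct: the block is the loop body and runs once per entry. */
int sum_ids(struct list_head *head)
{
	struct item *pos;
	int sum = 0;

	for_each_item(pos, head) {
		sum += pos->id;
	}
	return sum;
}

/* Buggy: the ';' is the whole loop body, so the block below runs
 * exactly once, after the loop, with pos == item_of(head). */
int sum_ids_buggy(struct list_head *head)
{
	struct item *pos;
	int sum = 0;

	for_each_item(pos, head);
	{
		sum += pos->id;
	}
	return sum;
}

/* Build sentinel -> 1 -> 2 -> 3 -> sentinel and run one variant. */
int run_demo(int buggy)
{
	struct item sentinel = { .id = -1 };
	struct item a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

	sentinel.link.next = &a.link;
	a.link.next = &b.link;
	b.link.next = &c.link;
	c.link.next = &sentinel.link;

	return buggy ? sum_ids_buggy(&sentinel.link)
		     : sum_ids(&sentinel.link);
}
```

The correct loop sums the three real entries, while the buggy one only
ever reads the sentinel's id, which is the "list head transformed into
a structure" effect described above.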

PROBLEM: Kernel causing 5minute delay mounting USB memory stick

2013-08-05 Thread Vern Clark

The problem started with kernel 3.8.0.28-generic.

syslog output:
Aug  2 09:39:15 u kernel: [ 2268.769492] usb 1-8: new high-speed USB
device number 11 using ehci-pci
...
Aug  2 09:44:26 u udevd[3606]: timeout: killing '/sbin/blkid -o udev -p 
/dev/sdb' [3717]
Aug  2 09:44:57  udevd[3606]: last message repeated 30 times
Aug  2 09:44:57 u kernel: [ 2610.106976] usb 1-8: reset high-speed USB device 
number 11 using ehci-pci
Aug  2 09:44:57 u udevd[3606]: timeout: killing '/sbin/blkid -o udev -p 
/dev/sdb' [3717]
Aug  2 09:45:28  udevd[3606]: last message repeated 30 times
Aug  2 09:45:28 u kernel: [ 2641.011767] usb 1-8: reset high-speed USB device 
number 11 using ehci-pci
Aug  2 09:45:28 u kernel: [ 2641.146199] sd 11:0:0:0: [sdb] Attached SCSI 
removable disk
Aug  2 09:45:28 u udevd[3606]: '/sbin/blkid -o udev -p /dev/sdb' [3717] 
terminated by signal 9 (Killed)
Aug  2 09:45:28 u udevd[3606]: timeout 'udisks-part-id /dev/sdb'


Kernel 3.5.0-38-generic is the last known one on which USB works.

Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:

> Signed-off-by: Oleg Nesterov 
> Tested-by: David Ahern 
> Reviewed-and-Acked-by: Steven Rostedt 

Just so you know. The standard that we now want to live by is only one
tag per line. I know I gave you that tag in my email, but when adding it
to a patch it needs to be:

Reviewed-by: Steven Rostedt 
Acked-by: Steven Rostedt 

I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
both, it needs to be like that.

-- Steve


Re: [PATCH v3 07/22] ARM: dts: Remove '0x's from Exynos4210 DTSI file

2013-08-05 Thread Kukjin Kim


On 07/25/13 00:09, Lee Jones wrote:

... for the sake of consistency and assumed convention.

Cc: Kukjin Kim
Cc: linux-samsung-...@vger.kernel.org
Signed-off-by: Lee Jones

diff --git a/arch/arm/boot/dts/exynos4210.dtsi 
b/arch/arm/boot/dts/exynos4210.dtsi
index b7f358a..53e2527 100644
--- a/arch/arm/boot/dts/exynos4210.dtsi
+++ b/arch/arm/boot/dts/exynos4210.dtsi
@@ -72,7 +72,7 @@
 };
 };

-   clock: clock-controller@0x1003 {
+   clock: clock-controller@1003 {
 compatible = "samsung,exynos4210-clock";
 reg =<0x1003 0x2>;
 #clock-cells =<1>;

BTW, there should be a tab at the mark '^', not white space :(

- Kukjin

[PATCH RESEND] ARM: dts: Add USB host node for Exynos4

2013-08-05 Thread Dongjin Kim

This patch adds EHCI and OHCI host device nodes for Exynos4.

CC: Jingoo Han 
Signed-off-by: Dongjin Kim 
---
 arch/arm/boot/dts/exynos4.dtsi |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm/boot/dts/exynos4.dtsi b/arch/arm/boot/dts/exynos4.dtsi
index 3f94fe8..cbe5219 100644
--- a/arch/arm/boot/dts/exynos4.dtsi
+++ b/arch/arm/boot/dts/exynos4.dtsi
@@ -155,6 +155,24 @@
status = "disabled";
};
 
+   ehci@1258 {
+   compatible = "samsung,exynos4210-ehci";
+   reg = <0x1258 0x100>;
+   interrupts = <0 70 0>;
+   clocks = <&clock 304>;
+   clock-names = "usbhost";
+   status = "disabled";
+   };
+
+   ohci@1259 {
+   compatible = "samsung,exynos4210-ohci";
+   reg = <0x1259 0x100>;
+   interrupts = <0 70 0>;
+   clocks = <&clock 304>;
+   clock-names = "usbhost";
+   status = "disabled";
+   };
+
mfc: codec@1340 {
compatible = "samsung,mfc-v5";
reg = <0x1340 0x1>;
-- 
1.7.9.5


Re: [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup

2013-08-05 Thread Aristeu Rozanski

On Thu, Aug 01, 2013 at 05:49:53PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using css
> (cgroup_subsys_state) as the primary handle instead of cgroup in
> subsystem API.  For hierarchy iterators, this is beneficial because
> 
> * In most cases, css is the only thing subsystems care about anyway.
> 
> * On the planned unified hierarchy, iterations for different
>   subsystems will need to skip over different subtrees of the
>   hierarchy depending on which subsystems are enabled on each cgroup.
>   Passing around css makes it unnecessary to explicitly specify the
>   subsystem in question as css is intersection between cgroup and
>   subsystem
> 
> * For the planned unified hierarchy, css's would need to be created
>   and destroyed dynamically independent from cgroup hierarchy.  Having
>   cgroup core manage css iteration makes enforcing deref rules a lot
>   easier.
> 
> Most subsystem conversions are straight-forward.  Noteworthy changes
> are
> 
> * blkio: cgroup_to_blkcg() is no longer used.  Removed.
> 
> * freezer: cgroup_freezer() is no longer used.  Removed.
> 
> * devices: cgroup_to_devcgroup() is no longer used.  Removed.

Acked-by: Aristeu Rozanski 

-- 
Aristeu


Re: Linux 3.11-rc4

2013-08-05 Thread Felipe Contreras

On Mon, Aug 5, 2013 at 12:43 PM, Felipe Contreras
 wrote:
> On Mon, Aug 5, 2013 at 12:39 PM, Linus Torvalds
>  wrote:
>
>> That said, Felipe, can you double-check that it's not timing-related
>> in some subtle way, and test multiple times with just that commit
>> reverted (and not reverted) to make sure that it's 100% that one
>> single line by that particular commit? Because it does seem very
>> benign..
>
> I tested perhaps dozens of times with the patch, and every one of them
> failed. I tested at least 6 times with the patch reverted, every one
> of them worked.
>
> I'm fairly certain that it's 100% reproducible, so it doesn't seem to be a race.
>
> But I'll double-check.

Yeah, I just tested 5 times; with the patch all 5 times failed,
without the patch all 5 times worked.

-- 
Felipe Contreras

Re: [PATCH] net/ipv4: fix the conditions of entering TCP_CA_Disorder state

2013-08-05 Thread Eric Dumazet

On Mon, 2013-08-05 at 20:45 -0400, Dong Fang wrote:
> if have some packets loss by network, the kernel can't reach here, we can see
> the tcp_time_to_recover() function:
> 
> static bool tcp_time_to_recover(struct sock *sk, int flag)
> {
>   struct tcp_sock *tp = tcp_sk(sk);
>   __u32 packets_out;
> 
>   /* Trick#1: The loss is proven. */
>   if (tp->lost_out)
>   return true;
>   //...
> }
> 
> when it return true, the following condition will be failed:
> 
> //...
> if (!tcp_time_to_recover(sk, flag)) {
>   tcp_try_to_open(sk, flag, prior_unsacked);
>   return;
> }
> //...
> 

I honestly do not understand this changelog, and how it is related to
the patch.

Also, it's not a 'net/ipv4:' issue but a 'tcp:' one.


Could you please explain the issue again ?

Thanks !



Re: [Xen-devel] [PATCH 2/3] MAINTAINERS: Update the Xen subsystem's with proper mailing list.

2013-08-05 Thread Ian Campbell

On Mon, 2013-08-05 at 14:05 -0400, Konrad Rzeszutek Wilk wrote:
> And also drop the virtualization one since we don't really use it.
> 
> CC: stefano.stabell...@eu.citrix.com
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 

The whole thing but especially:

>  XEN NETWORK BACKEND DRIVER
>  M:   Ian Campbell 
> -L:   xen-de...@lists.xensource.com (moderated for non-subscribers)
> +L:   xen-de...@lists.xenproject.org (moderated for non-subscribers)
>  L:   net...@vger.kernel.org
>  S:   Supported
>  F:   drivers/net/xen-netback/*

Ian.


Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 13:55 -0400, Steven Rostedt wrote:
>  The difference between this and the
> "section" hack I suggested, is that this would use a "call"/"ret" when
> enabled instead of a "jmp"/"jmp".

I wonder if this is what Kris Kross meant in their song?

/me goes back to work...

-- Steve



[PATCH 1/3] MAINTAINERS: Remove Jeremy from the Xen subsystem.

2013-08-05 Thread Konrad Rzeszutek Wilk

Jeremy has been a key person in making Linux work with Xen.
He has been enjoying the last year working on something
different so reflect that in the maintainers file.

CC: Jeremy Fitzhardinge 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 CREDITS | 1 +
 MAINTAINERS | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/CREDITS b/CREDITS
index 206d0fc..646a0a9 100644
--- a/CREDITS
+++ b/CREDITS
@@ -1120,6 +1120,7 @@ D: author of userfs filesystem
 D: Improved mmap and munmap handling
 D: General mm minor tidyups
 D: autofs v4 maintainer
+D: Xen subsystem
 S: 987 Alabama St
 S: San Francisco
 S: CA, 94110
diff --git a/MAINTAINERS b/MAINTAINERS
index defc053..440af74 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9237,7 +9237,6 @@ F:drivers/media/tuners/tuner-xc2028.*
 
 XEN HYPERVISOR INTERFACE
 M: Konrad Rzeszutek Wilk 
-M: Jeremy Fitzhardinge 
 L: xen-de...@lists.xensource.com (moderated for non-subscribers)
 L: virtualizat...@lists.linux-foundation.org
 S: Supported
-- 
1.8.3.1


[PATCH] Update MAINTAINERS file in Linux. (v1)

2013-08-05 Thread Konrad Rzeszutek Wilk

Please see the three patches that update the MAINTAINERS file.

They do need Acks so please provide them if you are comfortable.


[PATCH 3/3] MAINTAINERS: Add in two extra co-maintainers of the Xen tree.

2013-08-05 Thread Konrad Rzeszutek Wilk

Both Boris and David have graciously volunteered to help in
maintaining the Xen subsystem tree. Cementing this in the
MAINTAINERS file so they are copied on Xen related patches.

CC: Boris Ostrovsky 
CC: David Vrabel 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 785f56a..0161dad 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9237,6 +9237,8 @@ F:drivers/media/tuners/tuner-xc2028.*
 
 XEN HYPERVISOR INTERFACE
 M: Konrad Rzeszutek Wilk 
+M: Boris Ostrovsky 
+M: David Vrabel 
 L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 S: Supported
 F: arch/x86/xen/
-- 
1.8.3.1


[PATCH 2/3] MAINTAINERS: Update the Xen subsystem's with proper mailing list.

2013-08-05 Thread Konrad Rzeszutek Wilk

And also drop the virtualization one since we don't really use it.

CC: stefano.stabell...@eu.citrix.com
Signed-off-by: Konrad Rzeszutek Wilk 
---
 MAINTAINERS | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 440af74..785f56a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9237,8 +9237,7 @@ F:drivers/media/tuners/tuner-xc2028.*
 
 XEN HYPERVISOR INTERFACE
 M: Konrad Rzeszutek Wilk 
-L: xen-de...@lists.xensource.com (moderated for non-subscribers)
-L: virtualizat...@lists.linux-foundation.org
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 S: Supported
 F: arch/x86/xen/
 F: drivers/*/xen-*front.c
@@ -9249,35 +9248,35 @@ F:  include/uapi/xen/
 
 XEN HYPERVISOR ARM
 M: Stefano Stabellini 
-L: xen-de...@lists.xensource.com (moderated for non-subscribers)
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 S: Supported
 F: arch/arm/xen/
 F: arch/arm/include/asm/xen/
 
 XEN HYPERVISOR ARM64
 M: Stefano Stabellini 
-L: xen-de...@lists.xensource.com (moderated for non-subscribers)
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 S: Supported
 F: arch/arm64/xen/
 F: arch/arm64/include/asm/xen/
 
 XEN NETWORK BACKEND DRIVER
 M: Ian Campbell 
-L: xen-de...@lists.xensource.com (moderated for non-subscribers)
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 L: net...@vger.kernel.org
 S: Supported
 F: drivers/net/xen-netback/*
 
 XEN PCI SUBSYSTEM
 M: Konrad Rzeszutek Wilk 
-L: xen-de...@lists.xensource.com (moderated for non-subscribers)
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 S: Supported
 F: arch/x86/pci/*xen*
 F: drivers/pci/*xen*
 
 XEN SWIOTLB SUBSYSTEM
 M: Konrad Rzeszutek Wilk 
-L: xen-de...@lists.xensource.com (moderated for non-subscribers)
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 S: Supported
 F: arch/x86/xen/*swiotlb*
 F: drivers/xen/*swiotlb*
-- 
1.8.3.1


Re: [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods

2013-08-05 Thread Aristeu Rozanski

On Thu, Aug 01, 2013 at 05:49:50PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup.
> Please see the previous commit which converts the subsystem methods
> for rationale.
> 
> This patch converts all cftype file operations to take @css instead of
> @cgroup.  cftypes for the cgroup core files don't have their subsytem
> pointer set.  These will automatically use the dummy_css added by the
> previous patch and can be converted the same way.
> 
> Most subsystem conversions are straight forwards but there are some
> interesting ones.
> 
> * freezer: update_if_frozen() is also converted to take @css instead
>   of @cgroup for consistency.  This will make the code look simpler
>   too once iterators are converted to use css.
> 
> * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
>   vmpressure while mem_cgroup_from_cont() can be made static.
>   Updated accordingly.
> 
> * cpu: cgroup_tg() doesn't have any user left.  Removed.
> 
> * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
> 
> * hugetlb: hugetlb_cgroup_form_cgroup() doesn't have any user left.
>   Removed.
> 
> * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.

Also looks good on devcg part

Acked-by: Aristeu Rozanski 

-- 
Aristeu


Re: [PATCH v3] macvlan: validate flags

2013-08-05 Thread David Miller

From: "Michael S. Tsirkin" 
Date: Mon, 5 Aug 2013 18:25:54 +0300

> commit df8ef8f3aaa6692970a436204c4429210addb23a
> macvlan: add FDB bridge ops and macvlan flags
> added a flags field to macvlan, which can be
> controlled from userspace.
> The idea is to make the interface future-proof
> so we can add flags and not new fields.
> 
> However, flags value isn't validated, as a result,
> userspace can't detect which flags are supported.
> 
> Cc: "David S. Miller" 
> Cc: John Fastabend 
> Signed-off-by: Michael S. Tsirkin 

Applied and queued up for -stable, thanks.

Re: Linux 3.11-rc4

2013-08-05 Thread Oleg Nesterov

On 08/05, Felipe Contreras wrote:
>
> Would it be possible to just revert that patch for v3.11, and fix it later?

Sure, but it would be nice to investigate.

I think we have the time for revert, this patch was added after 3.10
so I hope we can always revert it before 3.11.

Felipe, I'll try to make a stupid debugging patch tomorrow, perhaps
you can test it...

Oleg.


Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods

2013-08-05 Thread Aristeu Rozanski

On Thu, Aug 01, 2013 at 05:49:46PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup *
> in subsystem implementations for the following reasons.
> 
> * With unified hierarchy, subsystems will be dynamically bound and
>   unbound from cgroups and thus css's (cgroup_subsys_state) may be
>   created and destroyed dynamically over the lifetime of a cgroup,
>   which is different from the current state where all css's are
>   allocated and destroyed together with the associated cgroup.  This
>   in turn means that cgroup_css() should be synchronized and may
>   return NULL, making it more cumbersome to use.
> 
> * Differing levels of per-subsystem granularity in the unified
>   hierarchy means that the task and descendant iterators should behave
>   differently depending on the specific subsystem the iteration is
>   being performed for.
> 
> * In majority of the cases, subsystems only care about its part in the
>   cgroup hierarchy - ie. the hierarchy of css's.  Subsystem methods
>   often obtain the matching css pointer from the cgroup and don't
>   bother with the cgroup pointer itself.  Passing around css fits
>   much better.
> 
> This patch converts all cgroup_subsys methods to take @css instead of
> @cgroup.  The conversions are mostly straight-forward.  A few
> noteworthy changes are
> 
> * ->css_alloc() now takes css of the parent cgroup rather than the
>   pointer to the new cgroup as the css for the new cgroup doesn't
>   exist yet.  Knowing the parent css is enough for all the existing
>   subsystems.
> 
> * In kernel/cgroup.c::offline_css(), unnecessary open coded css
>   dereference is replaced with local variable access.
> 
> This patch shouldn't cause any behavior differences.

looks fine on device_cgroup.c bit

Acked-by: Aristeu Rozanski 

-- 
Aristeu
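
For readers following the css conversion, here is a minimal user-space sketch of why passing a css pointer is usually enough for a subsystem: the subsystem embeds the css in its own state structure and recovers that structure from the pointer. All `demo_*` names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup_subsys_state. */
struct demo_css {
    int refcnt;
};

/* Hypothetical per-subsystem state that embeds its css. */
struct demo_freezer {
    unsigned int state;
    struct demo_css css;
};

/* Same idea as the kernel's container_of(), written out for user space. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Map a css pointer back to the subsystem state that contains it. */
static struct demo_freezer *demo_css_freezer(struct demo_css *css)
{
    return css ? demo_container_of(css, struct demo_freezer, css) : NULL;
}
```

With this layout a method that receives only the css never needs to look at the cgroup to find its own data.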


Re: [PATCH v3 07/22] ARM: dts: Remove '0x's from Exynos4210 DTSI file

2013-08-05 Thread Kukjin Kim


On 07/25/13 16:52, Lee Jones wrote:
> On Wed, 24 Jul 2013, Tomasz Figa wrote:
>
>> On Wednesday 24 of July 2013 16:09:37 Lee Jones wrote:
>>> ... for the sake of consistency and assumed convention.
>>>
>>> Cc: Kukjin Kim
>>> Cc: linux-samsung-...@vger.kernel.org
>>> Signed-off-by: Lee Jones
>>>
>>> diff --git a/arch/arm/boot/dts/exynos4210.dtsi
>>> b/arch/arm/boot/dts/exynos4210.dtsi index b7f358a..53e2527 100644
>>> --- a/arch/arm/boot/dts/exynos4210.dtsi
>>> +++ b/arch/arm/boot/dts/exynos4210.dtsi
>>> @@ -72,7 +72,7 @@
>>>  };
>>>  };
>>>
>>> -   clock: clock-controller@0x1003 {
>>> +   clock: clock-controller@1003 {
>>>  compatible = "samsung,exynos4210-clock";
>>>  reg =<0x1003 0x2>;
>>>  #clock-cells =<1>;
>>
>> Acked-by: Tomasz Figa
>
> Thanks Tomasz.

Thanks, Lee Jones and Tomasz.

Applied #7 ~ #11 into the cleanup of samsung tree.

- Kukjin

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 10:12 -0700, Linus Torvalds wrote:
> On Mon, Aug 5, 2013 at 9:55 AM, Steven Rostedt  wrote:

> First off, we have very few things that are *so* unlikely that they
> never get executed. Putting things in a separate section would
> actually be really bad.

My main concern is with tracepoints. Which on 90% (or more) of systems
running Linux, is completely off, and basically just dead code, until
someone wants to see what's happening and enables them.

> 
> Secondly, you don't want a separate section anyway for any normal
> kernel code, since you want short jumps if possible (pretty much every
> single architecture out there has a concept of shorter jumps that are
> noticeably cheaper than long ones). You want the unlikely code to be
> out-of-line, but still *close*. Which is largely what gcc already does
> (except if you use "-Os", which disables all the basic block movement
> and thus makes "likely/unlikely" pointless to begin with).
> 
> There are some situations where you'd want extremely unlikely code to
> really be elsewhere, but they are rare as hell, and mostly in user
> code where you might try to avoid demand-loading such code entirely.

Well, as tracepoints are being added quite a bit in Linux, my concern is
with the inlined functions that they bring. With jump labels they are
disabled in a very unlikely way (the static_key_false() is a nop to skip
the code, and is dynamically enabled to a jump).
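
As a rough user-space illustration of that shape (not kernel code): the hot path carries only a test that is almost always false, and everything else sits in a cold, out-of-line function. The names `trace_enabled`, `trace_hits`, `slow_trace_switch` and `trace_switch` are hypothetical, and an ordinary flag stands in for the nop/jmp patching that a real static key does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a tracepoint's static key. In the kernel the
 * test is a patched nop/jmp; here it is a plain flag read. */
static bool trace_enabled;
static int trace_hits;

/* Cold, out-of-line slow path: the argument setup and the real work
 * live here, away from the hot code. */
__attribute__((noinline, cold))
static void slow_trace_switch(int prev, int next)
{
    (void)prev;
    (void)next;
    trace_hits++;
}

/* Hot-path wrapper: a single branch hinted as unlikely. */
static inline void trace_switch(int prev, int next)
{
    if (__builtin_expect(trace_enabled, 0))
        slow_trace_switch(prev, next);
}
```

Calling `trace_switch()` with the flag clear does nothing; once the flag is set, each call reaches the slow path once.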

I did a make kernel/sched/core.i to get what we have in the current
sched_switch code:

static inline __attribute__((no_instrument_function)) void
trace_sched_switch (struct task_struct *prev, struct task_struct *next) {
    if (static_key_false(& __tracepoint_sched_switch .key)) do {
        struct tracepoint_func *it_func_ptr;
        void *it_func;
        void *__data;
        rcu_read_lock_sched_notrace();
        it_func_ptr = ({
            typeof(*((&__tracepoint_sched_switch)->funcs)) *_p1 =
                (typeof(*((&__tracepoint_sched_switch)->funcs))* )
                (*(volatile typeof(((&__tracepoint_sched_switch)->funcs)) *)
                &(((&__tracepoint_sched_switch)->funcs)));
            do {
                static bool __attribute__ ((__section__(".data.unlikely"))) __warned;
                if (debug_lockdep_rcu_enabled() && !__warned &&
                    !(rcu_read_lock_sched_held() || (0))) {
                    __warned = true;
                    lockdep_rcu_suspicious( , 153 ,
                        "suspicious rcu_dereference_check()" " usage");
                }
            } while (0);
            ((typeof(*((&__tracepoint_sched_switch)->funcs)) *)(_p1));
        });
        if (it_func_ptr) {
            do {
                it_func = (it_func_ptr)->func;
                __data = (it_func_ptr)->data;
                ((void(*)(void *__data, struct task_struct *prev,
                    struct task_struct *next))(it_func))(__data, prev, next);
            } while ((++it_func_ptr)->func);
        }
        rcu_read_unlock_sched_notrace();
    } while (0);
}

I massaged it to look more readable. This is inlined right at the
beginning of the prepare_task_switch(). Now, most of this code should be
moved to the end of the function by gcc (well, as you stated -Os may not
play nice here). And perhaps it's not that bad of an issue. That is, how
much of the icache does this actually take up? Maybe we are lucky and it
sits outside the icache of the hot path.

I still need to start running a bunch of benchmarks to see how much
overhead these tracepoints cause. Herbert Xu brought up the concern
about various latencies in the kernel, including tracing, in his ATTEND
request on the kernel-discuss mailing list.

> 
> So give up on sections. They are a bad idea for anything except the
> things we already use them for. Sure, you can try to fix the problems
> with sections with link-time optimization work and a *lot* of small
> individual sections (the way per-function sections work already), but
> that's basically just undoing the stupidity of using sections to begin
> with.

OK, this was just a suggestion. Perhaps my original patch that just
moves this code into a real function where the trace_sched_switch() only
contains the jump_label and a call to another function that does all the
work when enabled, is still a better idea. That is, if benchmarks prove
that it's worth it.

Instead of the above, my patches would make the code into:

static inline __attribute__((no_instrument_function)) void
trace_sched_switch (struct task_struct *prev, struct task_struct *next)
{
if (static_key_false(& __tracepoint_sched_switch .key))
__trace_sched_switch(prev, ne

Re: [PATCH 2/4] staging: ozwpan: Increment port number for new device.

2013-08-05 Thread Dan Carpenter

On Mon, Aug 05, 2013 at 06:40:13PM +0100, Rupesh Gujare wrote:
> This patch fixes crash issue when there is quick cycle of
> de-enumeration & enumeration due to loss of wireless link.
> 
> It is found that sometimes new device (or coming back device)
> returns very fast, even before USB core read out hub status,
> resulting in allocation of same port, which results in unstable
> system & crash.
> 
> Above issue is resolved by making sure that we always assign
> new port to new device, making sure that USB core reads correct
> hub status.
> 

This feels like papering over the problem.  Surely the real fix
would be to improve the reference counting.

This patch is probably effective but it makes the code more subtle
and it shows that we don't really understand what we are doing with
regards to reference counting.

regards,
dan carpenter


Re: Linux 3.11-rc4

2013-08-05 Thread Oleg Nesterov

On 08/05, Linus Torvalds wrote:
>
> On Mon, Aug 5, 2013 at 6:29 AM, Oleg Nesterov  wrote:
> >
> > I never used wine, but I am puzzled anyway. This patch really looks
> > like a simple and minor bugfix.
>
> The patch is indeed trivial, but.. What's the locking here?
>
> Afaik, ptrace_detach() by the parent can race with do_exit() by the
> child, and they now _both_ do flush_ptrace_hw_breakpoint().

That would be bad. And that is why exit_ptrace() doesn't do this.

But we rely on ptrace_freeze_traced(). If the child can exit (or
even run), we have other problems which were hopefully fixed by
9899d11f "ensure arch_ptrace/ptrace_request can never race with
SIGKILL"

> We have that whole "get tasklist_lock for writing and then
> check child->ptrace" logic there exactly due to that race, no?

Exactly. But note that this code is very old. We can remove the
"This child can be already killed" logic, and we can do more
simplifications in ptrace paths.

In fact, some recent changes already rely on the fact the tracee
can't go away, say ptrace_peek_siginfo()->spin_lock_irq(siglock)
is not safe without ptrace_freeze_traced().

Oleg.


Re: Linux 3.11-rc4

2013-08-05 Thread Felipe Contreras

On Mon, Aug 5, 2013 at 12:39 PM, Linus Torvalds
 wrote:

> That said, Felipe, can you double-check that it's not timing-related
> in some subtle way, and test multiple times with just that commit
> reverted (and not reverted) to make sure that it's 100% that one
> single line by that particular commit? Because it does seem very
> benign..

I tested perhaps dozens of times with the patch, and every one of them
failed. I tested at least 6 times with the patch reverted, every one
of them worked.

I'm fairly certain that it's 100% reproducible, so it doesn't seem to be a race.

But I'll double-check.

-- 
Felipe Contreras

Re: Linux 3.11-rc4

2013-08-05 Thread Felipe Contreras

On Mon, Aug 5, 2013 at 12:11 PM, Oleg Nesterov  wrote:
> On 08/05, Felipe Contreras wrote:
>>
>> On Mon, Aug 5, 2013 at 9:39 AM, Oleg Nesterov  wrote:
>> >
>> > Hmm. It should not crash under strace... please see below.
>> >
>> >> 953   ptrace(PTRACE_ATTACH, 1035, 0, 0) = -1 EPERM (Operation not permitted)
>> >
>> > OK, so it actually uses ptrace ;)
>> >
>> > PTRACE_ATTACH fails because this child is already traced by strace, I guess.
>> >
>> > So does Starcraft crash this way? Or does it fail in some other way?
>>
>> It's crashing just the same.
>
> But then it is not clear how fab840f can make any difference.

Yeah, it's very strange.

> wine can not use ptrace when it runs after "strace -f". But, to remind,
> I know nothing about wine. Perhaps wine uses some daemons which actually
> run/ptrace the workload?

There's this thing called wineserver; I'm not exactly sure how it would affect this.

But I found this:

http://askubuntu.com/questions/146160/what-is-the-ptrace-scope-workaround-for-wine-programs-and-are-there-any-risks

Would it be possible to just revert that patch for v3.11, and fix it later?

-- 
Felipe Contreras

[PATCH 3/4] staging: ozwpan: Reset port configuration number.

2013-08-05 Thread Rupesh Gujare

Make sure that we reset port configuration no. when PD departs.

Signed-off-by: Rupesh Gujare 
---
 drivers/staging/ozwpan/ozhcd.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
index a739986..b060e43 100644
--- a/drivers/staging/ozwpan/ozhcd.c
+++ b/drivers/staging/ozwpan/ozhcd.c
@@ -721,6 +721,7 @@ void oz_hcd_pd_departed(void *hport)
hpd = port->hpd;
port->hpd = NULL;
port->bus_addr = 0xff;
+   port->config_num = 0;
port->flags &= ~(OZ_PORT_F_PRESENT | OZ_PORT_F_DYING);
port->flags |= OZ_PORT_F_CHANGED;
port->status &= ~USB_PORT_STAT_CONNECTION;
-- 
1.7.9.5


Re: [PATCH 1/2] input: ti_tsc: Enable shared IRQ for TSC

2013-08-05 Thread Dmitry Torokhov

On Mon, Aug 05, 2013 at 06:02:02PM +0100, Zubair Lutfullah : wrote:
> On Mon, Aug 05, 2013 at 09:12:56AM -0700, Dmitry Torokhov wrote:
> > > > Touchscreen and ADC share the same IRQ line from parent MFD core.
> > > > Previously only Touchscreen was interrupt based.
> > > > With continuous mode support added in ADC driver, driver requires
> > > > interrupt to process the ADC samples, so enable shared IRQ flag bit for
> > > > touchscreen.
> > > > 
> > > > @@ -260,8 +260,18 @@ static irqreturn_t titsc_irq(int irq, void *dev)
> > > > unsigned int fsm;
> > > >  
> > > > +   /*
> > > > +* ADC and touchscreen share the IRQ line.
> > > > +* FIFO1 threshold, FIFO1 Overrun and FIFO1 underflow
> > > > +* interrupts are used by ADC,
> > > > +* hence return from touchscreen IRQ handler if FIFO1
> > > > +* related interrupts occurred.
> > > > +*/
> > > > +   if ((status & IRQENB_FIFO1THRES) ||
> > > > +   (status & IRQENB_FIFO1OVRRUN) ||
> > > > +   (status & IRQENB_FIFO1UNDRFLW))
> > > > +   return IRQ_NONE;
> > > > +   else if (status & IRQENB_FIFO0THRES) {
> > 
> > What happens if both parts have data at the same time? Can both
> > IRQENB_FIFO1THRES and IRQENB_FIFO0THRES be signalled? What will happen
> > in this case?
> 
> If ADC is sampling and someone is touching the TSC, both interrupts
> can signal so closely that for the purpose of the kernel,
> they can be seen as signaled together.
> 
> FIFO 1 used only by ADC and FIFO1THRES handler is inside the iio/adc driver
> FIFO 0 used only by TSC and FIFO0THRES handler is inside the input/touchscreen
> 
> Note: These are level interrupts.
> 
> I would like some input on how to handle such a situation. 

It looks like you need to have smart demultiplexing in MFD core of your
driver instead of relying on shared interrupt handler.

Another option would be to check "your" bits, handle the data, clear the
status and then check bits again and return IRQ_NONE instead of
IRQ_HANDLED if other guys bits are set, but it is way too ugly.
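
To make the first option concrete, here is a small user-space sketch of a parent handler that reads the status once and dispatches to each child by its own bits, so neither child has to guess about the other's interrupts. Everything prefixed `demo_`/`DEMO_` is hypothetical, including the bit positions; the real IRQENB_* values and the MFD core's actual structure are not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical status bits; the real register layout is an assumption. */
#define DEMO_FIFO0THRES   (1u << 2)
#define DEMO_FIFO1THRES   (1u << 5)
#define DEMO_FIFO1OVRRUN  (1u << 6)
#define DEMO_FIFO1UNDRFLW (1u << 7)

#define DEMO_TSC_BITS (DEMO_FIFO0THRES)
#define DEMO_ADC_BITS (DEMO_FIFO1THRES | DEMO_FIFO1OVRRUN | DEMO_FIFO1UNDRFLW)

enum demo_irqreturn { DEMO_IRQ_NONE = 0, DEMO_IRQ_HANDLED = 1 };

struct demo_mfd {
    uint32_t status;   /* simulated interrupt status register */
    int tsc_events;    /* times the touchscreen child was called */
    int adc_events;    /* times the ADC child was called */
};

static void demo_tsc_handle(struct demo_mfd *m, uint32_t bits)
{
    (void)bits;
    m->tsc_events++;
}

static void demo_adc_handle(struct demo_mfd *m, uint32_t bits)
{
    (void)bits;
    m->adc_events++;
}

/* Parent handler: one status read, one dispatch per child, then
 * acknowledge exactly the bits that were handed out. */
static enum demo_irqreturn demo_mfd_irq(struct demo_mfd *m)
{
    uint32_t status = m->status;
    enum demo_irqreturn ret = DEMO_IRQ_NONE;

    if (status & DEMO_TSC_BITS) {
        demo_tsc_handle(m, status & DEMO_TSC_BITS);
        ret = DEMO_IRQ_HANDLED;
    }
    if (status & DEMO_ADC_BITS) {
        demo_adc_handle(m, status & DEMO_ADC_BITS);
        ret = DEMO_IRQ_HANDLED;
    }
    m->status &= ~(status & (DEMO_TSC_BITS | DEMO_ADC_BITS));
    return ret;
}
```

When both FIFO0 and FIFO1 bits are pending, both children run in the same pass and the handler reports the interrupt as handled once.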

Thanks.

-- 
Dmitry

[PATCH 1/4] staging: ozwpan: Fixes crash due to invalid port aceess.

2013-08-05 Thread Rupesh Gujare

This patch fixes kernel crash issue, when we receive URB request
after de-enumerating device.

Signed-off-by: Rupesh Gujare 
---
 drivers/staging/ozwpan/ozhcd.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
index ed63868..d313a63 100644
--- a/drivers/staging/ozwpan/ozhcd.c
+++ b/drivers/staging/ozwpan/ozhcd.c
@@ -480,10 +480,14 @@ static int oz_enqueue_ep_urb(struct oz_port *port, u8 ep_addr, int in_dir,
oz_free_urb_link(urbl);
return 0;
}
-   if (in_dir)
+   if (in_dir && port->in_ep[ep_addr])
ep = port->in_ep[ep_addr];
-   else
+   else if (!in_dir && port->out_ep[ep_addr])
ep = port->out_ep[ep_addr];
+   else {
+   err = -ENOMEM;
+   goto out;
+   }
 
/*For interrupt endpoint check for buffered data
* & complete urb
@@ -505,6 +509,7 @@ static int oz_enqueue_ep_urb(struct oz_port *port, u8 ep_addr, int in_dir,
} else {
err = -EPIPE;
}
+out:
spin_unlock_bh(&port->ozhcd->hcd_lock);
if (err)
oz_free_urb_link(urbl);
-- 
1.7.9.5


[PATCH 4/4] staging: ozwpan: Return correct hub status.

2013-08-05 Thread Rupesh Gujare

Fix a bug where we were not returning correct hub status
for 8th port.

Signed-off-by: Rupesh Gujare 
---
 drivers/staging/ozwpan/ozhcd.c |   11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
index b060e43..2f93a00 100644
--- a/drivers/staging/ozwpan/ozhcd.c
+++ b/drivers/staging/ozwpan/ozhcd.c
@@ -1871,17 +1871,24 @@ static int oz_hcd_hub_status_data(struct usb_hcd *hcd, char *buf)
int i;
 
buf[0] = 0;
+   buf[1] = 0;
 
spin_lock_bh(&ozhcd->hcd_lock);
for (i = 0; i < OZ_NB_PORTS; i++) {
if (ozhcd->ports[i].flags & OZ_PORT_F_CHANGED) {
oz_dbg(HUB, "Port %d changed\n", i);
ozhcd->ports[i].flags &= ~OZ_PORT_F_CHANGED;
-   buf[0] |= 1<<(i+1);
+   if (i < 7)
+   buf[0] |= 1 << (i+1);
+   else
+   buf[1] |= 1 << (i-7);
}
}
spin_unlock_bh(&ozhcd->hcd_lock);
-   return buf[0] ? 1 : 0;
+   if (buf[0] != 0 || buf[1] != 0)
+   return 2;
+   else
+   return 0;
 }
 
/*--
  * Context: process
-- 
1.7.9.5
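
A note on the bit layout behind this fix, as a hedged sketch rather than the driver's code: in the hub status-change bitmap, bit 0 belongs to the hub itself and port i (0-based) uses bit i + 1, so the 8th port falls into the second byte. The generic form below computes byte and bit from the index instead of special-casing the 8th port. `DEMO_NB_PORTS` and `demo_hub_status` are hypothetical names, and the port count of 8 is an assumption.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NB_PORTS 8   /* assumed port count */

/* Fill buf with the change bitmap. Returns the number of bytes used,
 * 0 if no port changed, or -1 if buf is too small. */
static int demo_hub_status(const unsigned char changed[DEMO_NB_PORTS],
                           unsigned char *buf, size_t len)
{
    size_t need = (DEMO_NB_PORTS + 1 + 7) / 8;   /* hub bit plus port bits */
    int any = 0;
    int i;

    if (len < need)
        return -1;
    memset(buf, 0, need);
    for (i = 0; i < DEMO_NB_PORTS; i++) {
        if (changed[i]) {
            int bit = i + 1;                     /* bit 0 is the hub itself */
            buf[bit / 8] |= (unsigned char)(1u << (bit % 8));
            any = 1;
        }
    }
    return any ? (int)need : 0;
}
```

For the last port (index 7) this sets bit 0 of the second byte, which is the same position the patch reaches with `1 << (i-7)`.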


[PATCH 0/4] staging: ozwpan: Fix crash issues.

2013-08-05 Thread Rupesh Gujare

This patch series fixes crash issues observed,
& fix a bug in hub status code.

Rupesh Gujare (4):
  staging: ozwpan: Fixes crash due to invalid port aceess.
  staging: ozwpan: Increment port number for new device.
  staging: ozwpan: Reset port configuration number.
  staging: ozwpan: Return correct hub status.

 drivers/staging/ozwpan/ozhcd.c |   37 -
 1 file changed, 28 insertions(+), 9 deletions(-)

-- 
1.7.9.5


[PATCH 2/4] staging: ozwpan: Increment port number for new device.

2013-08-05 Thread Rupesh Gujare

This patch fixes a crash that occurs when there is a quick cycle of
de-enumeration and enumeration due to loss of the wireless link.

Sometimes a new (or returning) device comes back very quickly,
even before the USB core has read out the hub status, and is
allocated the same port, which results in an unstable system
and a crash.

This is resolved by always assigning a new port to a new device,
which ensures that the USB core reads the correct hub status.

Signed-off-by: Rupesh Gujare 
---
 drivers/staging/ozwpan/ozhcd.c |   16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
index d313a63..a739986 100644
--- a/drivers/staging/ozwpan/ozhcd.c
+++ b/drivers/staging/ozwpan/ozhcd.c
@@ -127,6 +127,7 @@ struct oz_hcd {
struct list_head urb_cancel_list;
struct list_head orphanage;
int conn_port; /* Port that is currently connecting, -1 if none.*/
+   int last_port;
struct oz_port ports[OZ_NB_PORTS];
uint flags;
struct usb_hcd *hcd;
@@ -645,7 +646,9 @@ void *oz_hcd_pd_arrived(void *hpd)
goto out;
}
for (i = 0; i < OZ_NB_PORTS; i++) {
-   struct oz_port *port = &ozhcd->ports[i];
+   struct oz_port *port = &ozhcd->ports[ozhcd->last_port++];
+   if (ozhcd->last_port >= OZ_NB_PORTS)
+   ozhcd->last_port = 0;
spin_lock(&port->port_lock);
if ((port->flags & OZ_PORT_F_PRESENT) == 0) {
oz_acquire_port(port, hpd);
@@ -655,13 +658,16 @@ void *oz_hcd_pd_arrived(void *hpd)
spin_unlock(&port->port_lock);
}
if (i < OZ_NB_PORTS) {
-   oz_dbg(ON, "Setting conn_port = %d\n", i);
-   ozhcd->conn_port = i;
+   if (!ozhcd->last_port)
+   ozhcd->conn_port = OZ_NB_PORTS - 1;
+   else
+   ozhcd->conn_port = ozhcd->last_port - 1;
+   oz_dbg(ON, "Setting conn_port = %d\n", ozhcd->conn_port);
/* Attach out endpoint 0.
 */
-   ozhcd->ports[i].out_ep[0] = ep;
+   ozhcd->ports[ozhcd->conn_port].out_ep[0] = ep;
ep = NULL;
-   hport = &ozhcd->ports[i];
+   hport = &ozhcd->ports[ozhcd->conn_port];
spin_unlock_bh(&ozhcd->hcd_lock);
if (ozhcd->flags & OZ_HDC_F_SUSPENDED) {
oz_dbg(ON, "Resuming root hub\n");
-- 
1.7.9.5
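
The allocation policy described in this patch can be sketched as a small
round-robin search. This is a userspace model under stated assumptions:
hub_model, claim_next_port() and NB_PORTS are made-up names, and locking
is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_PORTS 8	/* assumed, standing in for OZ_NB_PORTS */

struct hub_model {
	bool present[NB_PORTS];	/* stands in for OZ_PORT_F_PRESENT */
	int last_port;		/* next index to try, wraps at NB_PORTS */
};

/*
 * Claim a free port, starting the search where the previous one stopped,
 * so a device that reconnects quickly does not get the port it just left.
 * Returns the claimed index, or -1 if every port is in use.
 */
static int claim_next_port(struct hub_model *hub)
{
	int tries;

	assert(hub != NULL);
	for (tries = 0; tries < NB_PORTS; tries++) {
		int idx = hub->last_port;

		hub->last_port = (hub->last_port + 1) % NB_PORTS;
		if (!hub->present[idx]) {
			hub->present[idx] = true;
			return idx;
		}
	}
	return -1;
}
```

Returning the claimed index directly avoids having to recompute it from
last_port afterwards, which is what the "last_port - 1" handling in the
patch does.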


Re: Linux 3.11-rc4

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 6:29 AM, Oleg Nesterov  wrote:
>
> I never used wine, but I am puzzled anyway. This patch really looks
> like a simple and minor bugfix.

The patch is indeed trivial, but.. What's the locking here?

Afaik, ptrace_detach() by the parent can race with do_exit() by the
child, and they now _both_ do flush_ptrace_hw_breakpoint(). Or am I
wrong? We have that whole "get tasklist_lock for writing and then
check child->ptrace" logic there exactly due to that race, no?

That said, Felipe, can you double-check that it's not timing-related
in some subtle way, and test multiple times with just that commit
reverted (and not reverted) to make sure that it's 100% that one
single line by that particular commit? Because it does seem very
benign..

   Linus

Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic from 3.9.*

2013-08-05 Thread Jeff Layton

On Mon, 5 Aug 2013 16:15:01 +
"Myklebust, Trond"  wrote:

> From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust 
> Date: Mon, 5 Aug 2013 12:06:12 -0400
> Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
>  nlmclnt_setlockargs
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
> which case we're in entirely the wrong namespace.
> Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
> exit_task_namespaces() outside of exit_notify()) now means that
> exit_task_work() is called after exit_task_namespaces(), which
> triggers an Oops when we're freeing up the locks.
> 
> Signed-off-by: Trond Myklebust 
> Cc: Toralf Förster 
> Cc: Oleg Nesterov 
> Cc: Nix 
> Cc: Jeff Layton 
> ---
>  fs/lockd/clntproc.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> index 9760ecb..acd3947 100644
> --- a/fs/lockd/clntproc.c
> +++ b/fs/lockd/clntproc.c
> @@ -125,14 +125,15 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
>  {
>   struct nlm_args *argp = &req->a_args;
>   struct nlm_lock *lock = &argp->lock;
> + char *nodename = req->a_host->h_rpcclnt->cl_nodename;
>  
>   nlmclnt_next_cookie(&argp->cookie);
>   memcpy(&lock->fh, NFS_FH(file_inode(fl->fl_file)), sizeof(struct nfs_fh));
> - lock->caller  = utsname()->nodename;
> + lock->caller  = nodename;
>   lock->oh.data = req->a_owner;
>   lock->oh.len  = snprintf(req->a_owner, sizeof(req->a_owner), "%u@%s",
>   (unsigned int)fl->fl_u.nfs_fl.owner->pid,
> - utsname()->nodename);
> + nodename);
>   lock->svid = fl->fl_u.nfs_fl.owner->pid;
>   lock->fl.fl_start = fl->fl_start;
>   lock->fl.fl_end = fl->fl_end;

Looks good to me...

Reviewed-by: Jeff Layton 
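
To illustrate the idea behind the patch: the node name is captured once,
in the context that created the client, and later users read the cached
copy instead of asking "who am I" from whatever task happens to be
running. The sketch is a userspace model; rpc_client_model and both
helpers are made-up names, and it assumes the name is recorded at client
creation time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NODENAME_MAX 65	/* assumed size for the cached name */

struct rpc_client_model {
	char nodename[NODENAME_MAX];	/* captured once, by the creator */
};

/* Record the node name while we are still in the right namespace. */
static void client_init(struct rpc_client_model *clnt, const char *nodename)
{
	assert(clnt != NULL && nodename != NULL);
	snprintf(clnt->nodename, sizeof(clnt->nodename), "%s", nodename);
}

/*
 * Build the "pid@nodename" lock owner string from the cached name, so the
 * result does not depend on which task (or namespace) runs this later.
 * Returns the length reported by snprintf().
 */
static int build_lock_owner(const struct rpc_client_model *clnt,
			    unsigned int pid, char *out, size_t outlen)
{
	return snprintf(out, outlen, "%u@%s", pid, clnt->nodename);
}
```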

Trond, any thoughts on the other oops that Nix posted? The issue there
seems to be that we're trying to do the pathwalk to the rpcbind unix
socket from exit_task_work(), but that's happening after we've already
called exit_fs().

The trivial answer seems to be to simply call exit_task_work() before
exit_fs() there, but it seems like we ought to be doing the upcall to
rpcbind in a mount namespace from which we know we can reach the
socket...

Re: O_TMPFILE fs corruption (Re: Linux 3.11-rc4)

2013-08-05 Thread Jörn Engel

On Mon, 5 August 2013 01:26:46 -0700, Christoph Hellwig wrote:
> On Sun, Aug 04, 2013 at 08:45:16PM -0700, Linus Torvalds wrote:
> > The patch looks right to me - we should pass in similar flags for the
> > create case as for tmpfile to the filesystem.
> > 
> > But let's make sure we're all on the same page. Al?
> 
> Given all the problems and very limited fs support I'd much prefer
> disabling O_TMPFILE for this release.  That'd give it the needed
> exposure it was missing by being merged without any previous public
> review.

Agreed.  This has not been in -next at all.  It is not an urgent
security thing or regression fix, so there is no good excuse for
avoiding the normal process.

Jörn

--
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
-- Richard Feynman

[GIT PULL] x86/mce fix to queue for 3.12

2013-08-05 Thread Luck, Tony

The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f:

  Linux 3.11-rc4 (2013-08-04 13:46:46 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mce-f-bit

for you to fetch changes up to 0ca06c0857aee11911f91621db14498496f2c2cd:

  x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors (2013-08-05 10:09:40 -0700)


Bit 12 may or may not be set in MCi_STATUS.MCACOD when
an uncorrected error is reported. Ignore it when checking
error signatures.


Tony Luck (1):
  x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors

 arch/x86/include/asm/mce.h | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)
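
A minimal sketch of what "ignore bit 12 when checking error signatures"
means in practice, assuming MCACOD is the low 16 bits of MCi_STATUS. The
macro and function names here are made up, and the signature value used
to exercise it (0x0134) is only an illustrative constant; this is not the
code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MCACOD_F_BIT	(1u << 12)	/* bit 12, per the pull request text */
#define MCACOD_FIELD	0xffffu		/* assumed: MCACOD is bits 15:0 */

/* Extract MCACOD from a status value with the F bit cleared. */
static uint16_t mcacod_ignoring_f(uint64_t status)
{
	return (uint16_t)(status & MCACOD_FIELD & ~MCACOD_F_BIT);
}

/* Compare against a signature, treating the F bit as "don't care". */
static bool mcacod_matches(uint64_t status, uint16_t signature)
{
	return mcacod_ignoring_f(status) == (signature & ~MCACOD_F_BIT);
}
```

With this, a status whose MCACOD reads 0x1134 and one that reads 0x0134
both match the same signature.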

Re: [PATCH 03/10] vfio: add external user support

2013-08-05 Thread Alex Williamson

On Tue, 2013-07-23 at 19:07 +1000, Alexey Kardashevskiy wrote:
> On 07/23/2013 12:23 PM, Alex Williamson wrote:
> > On Tue, 2013-07-16 at 10:53 +1000, Alexey Kardashevskiy wrote:
> >> VFIO is designed to be used via ioctls on file descriptors
> >> returned by VFIO.
> >>
> >> However in some situations support for an external user is required.
> >> The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
> >> use the existing VFIO groups for exclusive access in real/virtual mode
> >> on a host to avoid passing map/unmap requests to the user space which
> >> would make things pretty slow.
> >>
> >> The protocol includes:
> >>
> >> 1. do normal VFIO init operation:
> >>- opening a new container;
> >>- attaching group(s) to it;
> >>- setting an IOMMU driver for a container.
> >> When IOMMU is set for a container, all groups in it are
> >> considered ready to use by an external user.
> >>
> >> 2. User space passes a group fd to an external user.
> >> The external user calls vfio_group_get_external_user()
> >> to verify that:
> >>- the group is initialized;
> >>- IOMMU is set for it.
> >> If both checks passed, vfio_group_get_external_user()
> >> increments the container user counter to prevent
> >> the VFIO group from disposal before KVM exits.
> >>
> >> 3. The external user calls vfio_external_user_iommu_id()
> >> to know an IOMMU ID. PPC64 KVM uses it to link logical bus
> >> number (LIOBN) with IOMMU ID.
> >>
> >> 4. When the external KVM finishes, it calls
> >> vfio_group_put_external_user() to release the VFIO group.
> >> This call decrements the container user counter.
> >> Everything gets released.
> >>
> >> The "vfio: Limit group opens" patch is also required for the consistency.
> >>
> >> Signed-off-by: Alexey Kardashevskiy 
> > 
> > This looks fine to me.  Is the plan to add this through the ppc tree
> > again?  Thanks,
> 
> 
> Nope, better to add this through your tree. And faster for sure :) Thanks!

Applied to my next branch for v3.12.  Thanks,

Alex
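
For readers following the get/put protocol quoted above, here is a small
userspace model of steps 2 to 4. It only mirrors the reference counting
described in the commit message; the *_model types and the function
bodies are made up for illustration and do not reflect the kernel's
actual signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these types are not the kernel's. */
struct container_model {
	bool iommu_set;	/* an IOMMU driver has been chosen */
	int users;	/* external users currently holding a reference */
};

struct group_model {
	struct container_model *container;	/* NULL until attached */
	int iommu_id;
};

/* Step 2: hand out a reference only if the group is fully initialized. */
static struct group_model *group_get_external_user(struct group_model *grp)
{
	if (!grp || !grp->container || !grp->container->iommu_set)
		return NULL;
	grp->container->users++;
	return grp;
}

/* Step 3: look up the IOMMU ID for a group we hold a reference on. */
static int external_user_iommu_id(const struct group_model *grp)
{
	return grp->iommu_id;
}

/* Step 4: drop the reference taken in step 2. */
static void group_put_external_user(struct group_model *grp)
{
	assert(grp->container->users > 0);
	grp->container->users--;
}
```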



Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

On Mon, 2013-08-05 at 10:02 -0700, H. Peter Anvin wrote:

> > if (x) __attribute__((section(".foo"))) {
> > /* do something */
> > }
> > 
> 
> One concern I have is how this kind of code would work when embedded
> inside a function which already has a section attribute.  This could
> easily cause really weird bugs when someone "optimizes" an inline or
> macro and breaks a single call site...

I would say that it overrides the section it is embedded in. Basically
like a .pushsection and .popsection would work.

What bugs do you think would happen? Sure, this used in an .init section
would have this code sit around after boot up. I'm sure modules could
handle this properly. What other uses of attribute section are there for
code? I'm aware of locks and sched using it but that's more for
debugging purposes and even there, the worst thing I see is that a debug
report won't say that the code is in the section.

We do a lot of tricks with sections in the Linux kernel, so I too share
your concern. But even with that, if we audit all use cases, we may
still be able to safely do this. This is why I'm asking for comments :-)

-- Steve
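
For comparison, the closest thing that can be written today is to move the
block into its own function and put the section attribute on that
function; the block-level form discussed in this thread does not exist.
This sketch assumes GCC or Clang on an ELF target, and ".text.demo_cold"
and both function names are made up.

```c
#include <assert.h>

/*
 * A section attribute on a whole function. The body ends up in the named
 * section regardless of where the caller lives, much like a
 * .pushsection/.popsection pair around it would do in assembly.
 */
static int __attribute__((noinline, section(".text.demo_cold")))
handle_rare_case(int x)
{
	return -x;
}

static int process(int x)
{
	if (x < 0)
		return handle_rare_case(x);	/* out of line, own section */
	return x * 2;
}
```

The cost is a real call and return on the rare path, which the proposed
block-level attribute would presumably avoid.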


Re: [PATCH 17/18] ARM: SAMSUNG: Switch to sched_clock_register()

2013-08-05 Thread Kukjin Kim


On 08/01/13 07:31, Stephen Boyd wrote:

The 32 bit sched_clock interface now supports 64 bits. Upgrade to
the 64 bit function to allow us to remove the 32 bit registration
interface.

Cc: Ben Dooks
Cc: Kukjin Kim


Acked-by: Kukjin Kim 

Thanks,
Kukjin


Signed-off-by: Stephen Boyd
---
  arch/arm/plat-samsung/samsung-time.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/plat-samsung/samsung-time.c b/arch/arm/plat-samsung/samsung-time.c
index 2957075..1e2119b 100644
--- a/arch/arm/plat-samsung/samsung-time.c
+++ b/arch/arm/plat-samsung/samsung-time.c
@@ -312,7 +312,7 @@ static void __iomem *samsung_timer_reg(void)
   * this wraps around for now, since it is just a relative time
   * stamp. (Inspired by U300 implementation.)
   */
-static u32 notrace samsung_read_sched_clock(void)
+static u64 notrace samsung_read_sched_clock(void)
  {
void __iomem *reg = samsung_timer_reg();

@@ -337,7 +337,7 @@ static void __init samsung_clocksource_init(void)
samsung_time_setup(timer_source.source_id, TCNT_MAX);
samsung_time_start(timer_source.source_id, PERIODIC);

-   setup_sched_clock(samsung_read_sched_clock, TSIZE, clock_rate);
+   sched_clock_register(samsung_read_sched_clock, TSIZE, clock_rate);

if (clocksource_mmio_init(samsung_timer_reg(), 
"samsung_clocksource_timer",
clock_rate, 250, TSIZE, clocksource_mmio_readl_down))
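
The registration call takes a read function, a counter width (TSIZE here)
and a rate. A rough sketch of why the width matters even when the read
function returns 64 bits: as I understand it, elapsed cycles are computed
modulo the counter width, so a wrap of a narrow counter is harmless. The
helper names are made up, and the sketch assumes an up-counting timer.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `bits` bits of the counter (bits in 1..64). */
static uint64_t counter_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Cycles elapsed between two reads of a timer that is only `bits` wide.
 * Masking the difference makes a single wraparound harmless.
 */
static uint64_t cycles_elapsed(uint64_t now, uint64_t last, unsigned int bits)
{
	return (now - last) & counter_mask(bits);
}
```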


Re: [PATCH 1/3] mips/kvm: Improve code formatting in arch/mips/kvm/kvm_locore.S

2013-08-05 Thread Gleb Natapov

On Mon, Aug 05, 2013 at 07:06:10PM +0200, Ralf Baechle wrote:
> On Mon, Aug 05, 2013 at 04:43:27PM +0300, Gleb Natapov wrote:
> > Date:   Mon, 5 Aug 2013 16:43:27 +0300
> > From: Gleb Natapov 
> > To: Ralf Baechle 
> > Cc: James Hogan , David Daney
> >  , linux-m...@linux-mips.org, k...@vger.kernel.org,
> >  Sanjay Lal , linux-kernel@vger.kernel.org, David
> >  Daney 
> > Subject: Re: [PATCH 1/3] mips/kvm: Improve code formatting in
> >  arch/mips/kvm/kvm_locore.S
> > Content-Type: text/plain; charset=us-ascii
> > 
> > On Mon, Aug 05, 2013 at 03:21:57PM +0200, Ralf Baechle wrote:
> > > On Mon, Aug 05, 2013 at 02:17:01PM +0100, James Hogan wrote:
> > > 
> > > > 
> > > > On 01/08/13 21:22, David Daney wrote:
> > > > > From: David Daney 
> > > > > 
> > > > > No code changes, just reflowing some comments and consistently using
> > > > > tabs and spaces.  Object code is verified to be unchanged.
> > > > > 
> > > > > Signed-off-by: David Daney 
> > > > > Acked-by: Ralf Baechle 
> > > > 
> > > > 
> > > > > +  /* Put the saved pointer to vcpu (s1) back into the DDATA_LO 
> > > > > Register */
> > > > 
> > > > git am detects a whitespace error here ("space before tab in indent").
> > > > It's got spaces before and after the tab actually.
> > > > 
> > > > >  /* load the guest context from VCPU and return */
> > > > 
> > > > this comment could have its indentation fixed too
> > > > 
> > > > Otherwise, for all 3 patches:
> > > > 
> > > > Reviewed-by: James Hogan 
> > > 
> > > I'm happy with the patch series as well and will fix this issue when
> > > applying the patch.
> > > 
> > kvm fixes usually go through kvm.git tree for all arches. Any special
> > reasons you want to get those through mips tree?
> 
> MIPS fixes usually go through the MIPS tree ;-)
> 
arch/*/kvm/ fixes usually go through kvm.git though :) KVM arch
code, once it is reasonably stable, usually depends more on kvm common
code than on arch code, and kvm development is supposed to happen against
kvm.git, otherwise APIs can go out of sync. I need to get acks from the
MIPS people before taking patches, of course.

When a patch series touches code outside of arch/*/kvm, like David says
the next one will, it makes sense to merge it through the MIPS tree; just
please get the KVM maintainers' ACK for the kvm part.

> I don't care which tree this stuff goes through - but a general experience
> is that things that affect MIPS systems receive most testing if going
> through the MIPS tree.
> 
>   Ralf

--
Gleb.

[PATCH aio-next] aio: fix error handling and rcu usage in "convert the ioctx list to table lookup v3"

2013-08-05 Thread Benjamin LaHaise

On Mon, Aug 05, 2013 at 12:08:28PM -0400, Benjamin LaHaise wrote:
> Hi Sasha,
> 
> On Mon, Aug 05, 2013 at 09:57:08AM -0400, Sasha Levin wrote:
> > Hi all,
> > 
> > While fuzzing with trinity inside a KVM tools guest running latest -next 
> > kernel,
> > I've stumbled on the following spew caused by a new BUG() added in "aio: fix
> > io_destroy() regression by using call_rcu()".
> 
> I did some investigating, and it looks like there is a problem with 
> db446a08c23d5475e6b08c87acca79ebb20f283c (aio: convert the ioctx list to 
> table lookup v3).  Can you confirm if reverting this patch eliminates 
> the BUG() you're hitting?  In my testing, I wasn't able to trigger the 
> BUG(), but I was able to trip up slab corruption with debugging on.  

And here is a patch that should fix the problems introduced in the table 
lookup patch without reverting.  I will add this to the aio-next.git tree.  
This bug is not present in Linus' tree.

-ben

 aio.c |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 588aff9..3bc068c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -475,7 +475,7 @@ static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
struct aio_ring *ring;
 
spin_lock(&mm->ioctx_lock);
-   table = rcu_dereference(mm->ioctx_table);
+   table = mm->ioctx_table;
 
while (1) {
if (table)
@@ -503,7 +503,7 @@ static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
table->nr = new_nr;
 
spin_lock(&mm->ioctx_lock);
-   old = rcu_dereference(mm->ioctx_table);
+   old = mm->ioctx_table;
 
if (!old) {
rcu_assign_pointer(mm->ioctx_table, table);
@@ -579,10 +579,6 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
if (ctx->req_batch < 1)
ctx->req_batch = 1;
 
-   err = ioctx_add_table(ctx, mm);
-   if (err)
-   goto out_cleanup_noerr;
-
/* limit the number of system wide aios */
spin_lock(&aio_nr_lock);
if (aio_nr + nr_events > (aio_max_nr * 2UL) ||
@@ -595,13 +591,18 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 
percpu_ref_get(&ctx->users); /* io_setup() will drop this ref */
 
+   err = ioctx_add_table(ctx, mm);
+   if (err)
+   goto out_cleanup_put;
+
pr_debug("allocated ioctx %p[%ld]: mm=%p mask=0x%x\n",
 ctx, ctx->user_id, mm, ctx->nr_events);
return ctx;
 
+out_cleanup_put:
+   percpu_ref_put(&ctx->users);
 out_cleanup:
err = -EAGAIN;
-out_cleanup_noerr:
aio_free_ring(ctx);
 out_freepcpu:
free_percpu(ctx->cpu);
@@ -626,7 +627,7 @@ static void kill_ioctx(struct mm_struct *mm, struct kioctx *ctx)
struct kioctx_table *table;
 
spin_lock(&mm->ioctx_lock);
-   table = rcu_dereference(mm->ioctx_table);
+   table = mm->ioctx_table;
 
WARN_ON(ctx != table->table[ctx->id]);
table->table[ctx->id] = NULL;
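
The reordering in ioctx_alloc() follows a common pattern: do the step that
publishes the object last, and unwind earlier steps in reverse order via
goto labels. The following is a toy model of that pattern, not the aio
code; ctx_model and ctx_setup() are made-up names, and reading the
reorder as "publish last" is my interpretation of the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state standing in for the ring, the reference and the table slot. */
struct ctx_model {
	bool ring_allocated;
	int refs;
	bool in_table;
};

/*
 * Set up a context in three steps. The step that makes the context
 * visible to others (the table insert) comes last, so every failure
 * before it only has to undo private state, in reverse order.
 */
static int ctx_setup(struct ctx_model *ctx, bool quota_ok, bool table_ok)
{
	ctx->ring_allocated = true;

	if (!quota_ok)
		goto out_free_ring;

	ctx->refs++;		/* reference handed to the caller */

	if (!table_ok)
		goto out_put_ref;
	ctx->in_table = true;
	return 0;

out_put_ref:
	ctx->refs--;
out_free_ring:
	ctx->ring_allocated = false;
	return -1;
}
```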

Re: [QUERY] lguest64

2013-08-05 Thread Konrad Rzeszutek Wilk

> >>>  struct pv_cpu_ops pv_cpu_ops;
> >>>
> >>>   [only end up using cpuid. This one is a tricky one. We could
> >>>arguably remove it but it does do some filtering - for example
> >>>THERM is turned off, or MWAIT if a certain hypercall tells us to
> >>>disable that. Since this is now a trapped operation this could be
> >>>handled in the hypervisor - but then it would be in charge of
> >>>filtering certain CPUID - and this is at bootup - so there is no
> >>>user interaction. This needs a bit more of thinking]
> >>>
> >> read_msr/write_msr in this one make all msr accesses safe. IIRC there
> >> are MSRs that Linux uses without checking cpuid bits.
> >> IA32_PERF_CAPABILITIES for instance is used without checking PDCM bit.
> > 
> > Right, those are needed as well. Completely forgot about them.
> 
> CPUID is not too bad.  RDMSR/WRMSR is actually worse since there are
> some MSRs which are performance-critical.  The really messy pvops are
> the memory-related ones, as they don't match the hardware behavior.

Would you by any chance have a nice test case that demonstrates the
rdmsr/wrmsr paths which are performance-critical on bare metal?
> 
> Similarly, beyond pvops, what new assumptions does this code add to the
> code base?

We have not yet narrowed down how to "negotiate" the GDT values - as
the VMX code in the hypervisor has set those up before it loads the kernel.
I think Mukesh was thinking of extending the .Xen.note to enumerate some of
the ones that are needed so that the hypervisor can somehow slurp them in.


Re: Linux 3.11-rc4

2013-08-05 Thread Oleg Nesterov

On 08/05, Felipe Contreras wrote:
>
> On Mon, Aug 5, 2013 at 9:39 AM, Oleg Nesterov  wrote:
> >
> > Hmm. It should not crash under strace... please see below.
> >
> >> 953   ptrace(PTRACE_ATTACH, 1035, 0, 0) = -1 EPERM (Operation not 
> >> permitted)
> >
> > OK, so it actually uses ptrace ;)
> >
> > PTRACE_ATTACH fails because this child is already traced by strace, I guess.
> >
> > So does Starcraft crash this way? Or does it fail in some other way?
>
> It's crashing just the same.

But then it is not clear how fab840f can make any difference.

wine can not use ptrace when it runs after "strace -f". But, to remind,
I know nothing about wine. Perhaps wine uses some daemons which actually
run/ptrace the workload?

> > And just in case... perhaps wine does some logging too?
>
> Yeah, but there doesn't seem to be anything interesting:
>
> http://bugs.winehq.org/attachment.cgi?id=45489

at least there is certainly nothing interesting for me since
I don't understand this ;) Thanks.

Oleg.


Re: [PATCH v2 3/3] dma: Add Freescale eDMA engine driver support

2013-08-05 Thread Vinod Koul

On Mon, Aug 05, 2013 at 02:07:04PM +0800, Jingchang Lu wrote:
> Add Freescale enhanced direct memory(eDMA) controller support.
> The eDMA controller deploys DMAMUXs routing DMA request sources(slot)
> to eDMA channels.
> This module can be found on Vybrid and LS-1 SoCs.
> 
> Signed-off-by: Alison Wang 
> Signed-off-by: Xiaochun Li 
> Signed-off-by: Jingchang Lu 
> ---

> +
> +static void fsl_edma_free_desc(struct virt_dma_desc *vdesc)
> +{
> + struct fsl_edma_desc *fsl_desc;
> + int i;
> +
> + fsl_desc = to_fsl_edma_desc(vdesc);
> + for (i = 0; i < fsl_desc->n_tcds; i++)
> + dma_pool_free(fsl_desc->echan->tcd_pool,
> + fsl_desc->tcd[i].vtcd,
> + fsl_desc->tcd[i].ptcd);
> + kfree(fsl_desc);
should this be called with lock held or not?

> +}
> +
> +static int fsl_edma_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
> + unsigned long arg)
> +{
> + struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> + struct dma_slave_config *cfg = (void *)arg;
> +
> + switch (cmd) {
> + case DMA_TERMINATE_ALL:
> + fsl_edma_disable_request(fsl_chan);
> + fsl_chan->edesc = NULL;
> + vchan_free_chan_resources(&fsl_chan->vchan);
> + return 0;
empty line here pls

> + case DMA_SLAVE_CONFIG:
> + fsl_chan->fsc.dir = cfg->direction;
> + if (cfg->direction == DMA_DEV_TO_MEM) {
> + fsl_chan->fsc.dev_addr = cfg->src_addr;
> + fsl_chan->fsc.addr_width = cfg->src_addr_width;
> + fsl_chan->fsc.burst = cfg->src_maxburst;
> + } else {
I would prefer that you check for DMA_MEM_TO_DEV here and reject the rest of
the cases with an error

> + fsl_chan->fsc.dev_addr = cfg->dst_addr;
> + fsl_chan->fsc.addr_width = cfg->dst_addr_width;
> + fsl_chan->fsc.burst = cfg->dst_maxburst;
> + }
> + fsl_chan->fsc.attr = fsl_edma_get_tcd_attr(fsl_chan->fsc.addr_width);
> + return 0;
> + default:
> + return -ENOSYS;
-ENXIO perhaps...

> + }
> +}
> +

> +static struct fsl_edma_desc *fsl_edma_alloc_desc(struct fsl_edma_chan *fsl_chan,
> + int sg_len)
> +{
> + struct fsl_edma_desc *fsl_desc;
> + int size, i;
> +
> + size = sizeof(struct fsl_edma_desc);
> + size += sizeof(struct fsl_edma_sw_tcd) * sg_len;
> + fsl_desc = kzalloc(size, GFP_KERNEL);
how about, kzalloc(sizeof(*fsl_desc) * sg_len, GFP_NOWAIT)
> + if (!fsl_desc)
> + return NULL;
> +
> + fsl_desc->echan = fsl_chan;
> + fsl_desc->n_tcds = sg_len;
> + for (i = 0; i < sg_len; i++) {
> + fsl_desc->tcd[i].vtcd = dma_pool_alloc(fsl_chan->tcd_pool,
> + GFP_ATOMIC, &fsl_desc->tcd[i].ptcd);
> + if (!fsl_desc->tcd[i].vtcd)
> + goto free_on_err;
> + }
> + return fsl_desc;
empty line here

> +free_on_err:
> + while (--i >= 0)
> + dma_pool_free(fsl_chan->tcd_pool, fsl_desc->tcd[i].vtcd,
> + fsl_desc->tcd[i].ptcd);
> + kfree(fsl_desc);
> + return NULL;
> +}
> +
> +static struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> + struct dma_chan *chan, dma_addr_t dma_addr, size_t buf_len,
> + size_t period_len, enum dma_transfer_direction direction,
> + unsigned long flags, void *context)
> +{
> + struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> + struct fsl_edma_desc *fsl_desc;
> + dma_addr_t dma_buf_next;
> + int sg_len, i;
> + u32 src_addr, dst_addr, last_sg, nbytes;
> + u16 soff, doff, iter;
> +
> + sg_len = buf_len / period_len;
> + fsl_desc = fsl_edma_alloc_desc(fsl_chan, sg_len);
> + if (!fsl_desc)
> + return NULL;
> + fsl_desc->iscyclic = true;
> +
> + dma_buf_next = dma_addr;
> + nbytes = fsl_chan->fsc.addr_width * fsl_chan->fsc.burst;
> + iter = period_len / nbytes;
> + for (i = 0; i < sg_len; i++) {
> + if (dma_buf_next >= dma_addr + buf_len)
> + dma_buf_next = dma_addr;
> +
> + /* get next sg's physical address */
> + last_sg = fsl_desc->tcd[(i + 1) % sg_len].ptcd;
> +
> + if (fsl_chan->fsc.dir == DMA_MEM_TO_DEV) {
> + src_addr = dma_buf_next;
> + dst_addr = fsl_chan->fsc.dev_addr;
> + soff = fsl_chan->fsc.addr_width;
> + doff = 0;
> + } else {
again, a check for direction would be apt

> + src_addr = fsl_chan->fsc.dev_addr;
> + dst_addr = dma_buf_next;
> + soff = 0;
> + doff = fsl_chan->fsc.addr_width;
> + }
> +
> + fill_tcd_par

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 10:12 AM, Linus Torvalds
 wrote:
>
> Secondly, you don't want a separate section anyway for any normal
> kernel code, since you want short jumps if possible

Just to clarify: the short jump is important regardless of how
unlikely the code you're jumping to is, since even if you'd be jumping to
very unlikely ("never executed") code, the branch to that code is
itself in the hot path.

And the difference between a two-byte short jump to the end of a short
function, and a five-byte long jump (to pick the x86 case) is quite
noticeable.

Other cases do long jumps by jumping to a thunk, and so the "hot case"
is unaffected, but at least one common architecture very much sees the
difference in the likely code.

  Linus
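
To put numbers on the two-byte versus five-byte case: on x86 the short
jump is opcode EB plus an 8-bit displacement, the near jump is opcode E9
plus a 32-bit displacement, and the displacement is counted from the end
of the jump instruction. A small sketch (jmp_len() is a made-up helper)
that picks the smallest encoding able to reach a target:

```c
#include <assert.h>
#include <stdint.h>

#define JMP_SHORT_LEN 2	/* opcode EB + 8-bit displacement */
#define JMP_NEAR_LEN  5	/* opcode E9 + 32-bit displacement */

/*
 * Size of the smallest x86 direct jump that reaches `target` from a jump
 * placed at `site`. The short form covers target - (site + 2) in
 * the range [-128, 127]; anything else needs the near form.
 */
static int jmp_len(uint64_t site, uint64_t target)
{
	int64_t disp = (int64_t)(target - (site + JMP_SHORT_LEN));

	return (disp >= INT8_MIN && disp <= INT8_MAX) ? JMP_SHORT_LEN
						       : JMP_NEAR_LEN;
}
```

So out-of-line code only gets the short form if it sits within roughly
128 bytes of the branch, which is why "out of line but still close"
matters.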

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 9:55 AM, Steven Rostedt  wrote:
>
> Almost a full year ago, Mathieu suggested something like:
>
> if (unlikely(x)) __attribute__((section(".unlikely"))) {
> ...
> } else __attribute__((section(".likely"))) {
> ...
> }

It's almost certainly a horrible idea.

First off, we have very few things that are *so* unlikely that they
never get executed. Putting things in a separate section would
actually be really bad.

Secondly, you don't want a separate section anyway for any normal
kernel code, since you want short jumps if possible (pretty much every
single architecture out there has a concept of shorter jumps that are
noticeably cheaper than long ones). You want the unlikely code to be
out-of-line, but still *close*. Which is largely what gcc already does
(except if you use "-Os", which disables all the basic block movement
and thus makes "likely/unlikely" pointless to begin with).

There are some situations where you'd want extremely unlikely code to
really be elsewhere, but they are rare as hell, and mostly in user
code where you might try to avoid demand-loading such code entirely.

So give up on sections. They are a bad idea for anything except the
things we already use them for. Sure, you can try to fix the problems
with sections with link-time optimization work and a *lot* of small
individual sections (the way per-function sections work already), but
that's basically just undoing the stupidity of using sections to begin
with.

Linus
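
For reference, a minimal sketch of the likely/unlikely hints discussed
here, built on __builtin_expect (a GCC/Clang builtin). The macros have the
same shape as the kernel's; checked_div() and report_error() are made-up
examples. The hint only influences basic block layout within the
function; it does not move anything to another section.

```c
#include <assert.h>

/* Branch hints: tell the compiler which way the condition usually goes. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Rare path, kept out of line so the hot path stays short. */
static int __attribute__((noinline)) report_error(int code)
{
	return -code;
}

static int checked_div(int a, int b)
{
	if (unlikely(b == 0))
		return report_error(22);
	return a / b;
}
```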

[PATCH v2 5/5] memcg: rename cgroup_event to mem_cgroup_event

2013-08-05 Thread Tejun Heo

From 2d3340a32a52602ec4cda348b26affeae52d1964 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 5 Aug 2013 12:00:24 -0400

cgroup_event is only available in memcg now.  Let's brand it that way.
While at it, add a comment encouraging deprecation of the feature and
remove the respective section from cgroup documentation.

This patch is cosmetic.

v2: Index in cgroups.txt updated accordingly as suggested by Li Zefan.

Signed-off-by: Tejun Heo 
Cc: Li Zefan 
---
 Documentation/cgroups/cgroups.txt | 20 --
 mm/memcontrol.c   | 57 +--
 2 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 638bf17..821de56 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -24,7 +24,6 @@ CONTENTS:
   2.1 Basic Usage
   2.2 Attaching processes
   2.3 Mounting hierarchies by name
-  2.4 Notification API
 3. Kernel API
   3.1 Overview
   3.2 Synchronization
@@ -472,25 +471,6 @@ you give a subsystem a name.
 The name of the subsystem appears as part of the hierarchy description
 in /proc/mounts and /proc/<pid>/cgroups.
 
-2.4 Notification API
-
-
-There is mechanism which allows to get notifications about changing
-status of a cgroup.
-
-To register a new notification handler you need to:
- - create a file descriptor for event notification using eventfd(2);
- - open a control file to be monitored (e.g. memory.usage_in_bytes);
- - write "<event_fd> <control_fd> <args>" to cgroup.event_control.
-   Interpretation of args is defined by control file implementation;
-
-eventfd will be woken up by control file implementation or when the
-cgroup is removed.
-
-To unregister a notification handler just close eventfd.
-
-NOTE: Support of notifications should be implemented for the control
-file. See documentation for the subsystem.
 
 3. Kernel API
 =
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 023077d..24f5843 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -243,7 +243,7 @@ struct mem_cgroup_eventfd_list {
 /*
  * cgroup_event represents events which userspace want to receive.
  */
-struct cgroup_event {
+struct mem_cgroup_event {
/*
 * css which the event belongs to.
 */
@@ -5978,14 +5978,27 @@ static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 #endif
 
 /*
+ * DO NOT USE IN NEW FILES.
+ *
+ * "cgroup.event_control" implementation.
+ *
+ * This is way over-engineered.  It tries to support fully configureable
+ * events for each user.  Such level of flexibility is completely
+ * unnecessary especially in the light of the planned unified hierarchy.
+ *
+ * Please deprecate this and replace with something simpler if at all
+ * possible.
+ */
+
+/*
  * Unregister event and free resources.
  *
  * Gets called from workqueue.
  */
-static void cgroup_event_remove(struct work_struct *work)
+static void memcg_event_remove(struct work_struct *work)
 {
-   struct cgroup_event *event = container_of(work, struct cgroup_event,
-   remove);
+   struct mem_cgroup_event *event = container_of(work,
+   struct mem_cgroup_event, remove);
struct cgroup_subsys_state *css = event->css;
struct cgroup *cgrp = css->cgroup;
 
@@ -6006,11 +6019,11 @@ static void cgroup_event_remove(struct work_struct *work)
  *
  * Called with wqh->lock held and interrupts disabled.
  */
-static int cgroup_event_wake(wait_queue_t *wait, unsigned mode,
-   int sync, void *key)
+static int memcg_event_wake(wait_queue_t *wait, unsigned mode,
+   int sync, void *key)
 {
-   struct cgroup_event *event = container_of(wait,
-   struct cgroup_event, wait);
+   struct mem_cgroup_event *event =
+   container_of(wait, struct mem_cgroup_event, wait);
struct mem_cgroup *memcg = mem_cgroup_from_css(event->css);
unsigned long flags = (unsigned long)key;
 
@@ -6039,28 +6052,30 @@ static int cgroup_event_wake(wait_queue_t *wait, unsigned mode,
return 0;
 }
 
-static void cgroup_event_ptable_queue_proc(struct file *file,
+static void memcg_event_ptable_queue_proc(struct file *file,
wait_queue_head_t *wqh, poll_table *pt)
 {
-   struct cgroup_event *event = container_of(pt,
-   struct cgroup_event, pt);
+   struct mem_cgroup_event *event =
+   container_of(pt, struct mem_cgroup_event, pt);
 
event->wqh = wqh;
add_wait_queue(wqh, &event->wait);
 }
 
 /*
+ * DO NOT USE IN NEW FILES.
+ *
  * Parse input and register new memcg event handler.
  *
  * Input must be in format '<event_fd> <control_fd> <args>'.
  * Interpretation of args is defined by control file implementation.
  */
-static int cgroup_write_event_control(struct cgroup_subsys_state *css,
- struct cftype *cft, const char *buffer)
+static int memcg

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Aruna Balakrishnaiah


Hi Tony,

Thank you very much for testing my patches.

On Saturday 03 August 2013 03:42 AM, Tony Luck wrote:

> A quick experiment to use your patchset - but with compression
> disabled by tweaking this line in pstore_dump():
>
>  zipped_len = -1; //zip_data(dst, hsize + len);
>
> turned out well. This kernel dumps uncompressed dmesg blobs into pstore
> and gets them back out again.  So it seems likely that the problems are
> someplace in the compression/decompression code.


A quick look at my code suggests that the problem could be in this
part of the code.

In pstore_dump:

        if (zipped_len < 0) {
                dst = psinfo->buf;
                hsize = sprintf(dst, "%s#%d Part%d\n",
                                why, oopscount, part);
                size = psinfo->bufsize - hsize;
                dst += hsize;
                compressed = false;

                if (!kmsg_dump_get_buffer(dumper, true, dst,
                                          size, &len))
                        break;
        } else {
                compressed = true;
 --->           len = zipped_len;
        }

zipped_len is the length of the compressed data, and the header (hsize
bytes) is already part of what gets compressed. So passing hsize + len
to the pstore_write callback is wrong in the compressed case; it should
just have been zipped_len. The extra hsize bytes might be what adds the
junk characters.

Can you please replace this hunk with:

        if (zipped_len < 0) {
                pr_err("Compression failed\n");
                dst = psinfo->buf;
                hsize = sprintf(dst, "%s#%d Part%d\n",
                                why, oopscount, part);
                size = psinfo->bufsize - hsize;
                dst += hsize;
                compressed = false;

                if (!kmsg_dump_get_buffer(dumper, true, dst,
                                          size, &len))
                        break;
                total_len = hsize + len;
        } else {
                compressed = true;
                total_len = zipped_len;
        }

        ret = psinfo->write(PSTORE_TYPE_DMESG, reason, &id, part,
                            oopscount, compressed, total_len, psinfo);
        if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted())
                pstore_new_entry = 1;

        total += total_len;
        part++;

With the above hunk, at least I don't see junk characters at the end on power.
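
The length accounting in the hunk above can be sketched in isolation. This is
only an illustration, not pstore code: record_len() is a hypothetical helper,
and the assumption (taken from the explanation above) is that a negative
zipped_len means compression failed, while a non-negative zipped_len already
covers the compressed header and text.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of pstore: return the number of bytes
 * that should be handed to the backend's write callback.
 *
 * zipped_len < 0:  compression failed, the record is stored as a plain
 *                  header (hsize bytes) followed by len bytes of text.
 * zipped_len >= 0: the header was compressed together with the text,
 *                  so zipped_len alone is the record length.
 */
size_t record_len(long zipped_len, size_t hsize, size_t len)
{
    if (zipped_len < 0) {
        /* Guard against size_t wrap-around in the sum. */
        assert(hsize + len >= hsize);
        return hsize + len;
    }
    return (size_t)zipped_len;
}
```

With a 20-byte header and 1000 bytes of text, a failed compression gives a
1020-byte record, while a successful compression to 300 bytes gives a 300-byte
record rather than 320.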

I apologise: since I do not have a suitable machine to test this on, I am
not able to reproduce the scenarios you describe. I need your help
in testing this.

- Aruna


-Tony
___
Linuxppc-dev mailing list
linuxppc-...@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev




[PATCH v2 3/5] cgroup, memcg: move cgroup_event implementation to memcg

2013-08-05 Thread Tejun Heo

From ab9c79598563898b0af18aa26b8a218fe5cbfda6 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 5 Aug 2013 12:00:23 -0400

cgroup_event is way over-designed and tries to build a generic
flexible event mechanism into cgroup - fully customizable event
specification for each user of the interface.  This is utterly
unnecessary and overboard especially in the light of the planned
unified hierarchy as there's gonna be single agent.  Simply generating
events at fixed points, or if that's too restrictive, configurable
cadence or single set of configurable points should be enough.

Thankfully, memcg is the only user and gets to keep it.  Replacing it
with something simpler on sane_behavior is strongly recommended.

This patch moves cgroup_event and "cgroup.event_control"
implementation to mm/memcontrol.c.  Clearing of events on cgroup
destruction is moved from cgroup_destroy_locked() to
mem_cgroup_css_offline(), which shouldn't make any noticeable
difference.

Note that "cgroup.event_control" will now exist only on the hierarchy
with memcg attached to it.  While this change is visible to userland,
it is unlikely to be noticeable as the file has never been meaningful
outside memcg.

v2: Per Li Zefan's comments, init/Kconfig updated accordingly and
poll.h inclusion moved from cgroup.c to memcontrol.c.

Signed-off-by: Tejun Heo 
Cc: Li Zefan 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Balbir Singh 
---
 init/Kconfig|   3 +-
 kernel/cgroup.c | 238 ---
 mm/memcontrol.c | 239 
 3 files changed, 240 insertions(+), 240 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 54d3fa5..b806453 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -844,7 +844,6 @@ config NUMA_BALANCING
 
 menuconfig CGROUPS
boolean "Control Group support"
-   depends on EVENTFD
help
  This option adds support for grouping sets of processes together, for
  use with process control subsystems such as Cpusets, CFS, memory
@@ -911,6 +910,7 @@ config MEMCG
bool "Memory Resource Controller for Control Groups"
depends on RESOURCE_COUNTERS
select MM_OWNER
+   select EVENTFD
help
  Provides a memory resource controller that manages both anonymous
  memory and page cache. (See Documentation/cgroups/memory.txt)
@@ -1163,7 +1163,6 @@ config UIDGID_STRICT_TYPE_CHECKS
 
 config SCHED_AUTOGROUP
bool "Automatic process group scheduling"
-   select EVENTFD
select CGROUPS
select CGROUP_SCHED
select FAIR_GROUP_SCHED
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 568d031..b7c4696 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -56,8 +56,6 @@
 #include 
 #include 
 #include  /* TODO: replace with more sophisticated array */
-#include 
-#include 
 #include  /* used in cgroup_attach_task */
 #include 
 
@@ -154,36 +152,6 @@ struct css_id {
unsigned short stack[0]; /* Array of Length (depth+1) */
 };
 
-/*
- * cgroup_event represents events which userspace want to receive.
- */
-struct cgroup_event {
-   /*
-* css which the event belongs to.
-*/
-   struct cgroup_subsys_state *css;
-   /*
-* Control file which the event associated.
-*/
-   struct cftype *cft;
-   /*
-* eventfd to signal userspace about the event.
-*/
-   struct eventfd_ctx *eventfd;
-   /*
-* Each of these stored in a list by the cgroup.
-*/
-   struct list_head list;
-   /*
-* All fields below needed to unregister event when
-* userspace closes eventfd.
-*/
-   poll_table pt;
-   wait_queue_head_t *wqh;
-   wait_queue_t wait;
-   struct work_struct remove;
-};
-
 /* The list of hierarchy roots */
 
 static LIST_HEAD(cgroup_roots);
@@ -3964,194 +3932,6 @@ void __cgroup_dput(struct cgroup *cgrp)
deactivate_super(sb);
 }
 
-/*
- * Unregister event and free resources.
- *
- * Gets called from workqueue.
- */
-static void cgroup_event_remove(struct work_struct *work)
-{
-   struct cgroup_event *event = container_of(work, struct cgroup_event,
-   remove);
-   struct cgroup_subsys_state *css = event->css;
-   struct cgroup *cgrp = css->cgroup;
-
-   remove_wait_queue(event->wqh, &event->wait);
-
-   event->cft->unregister_event(css, event->cft, event->eventfd);
-
-   /* Notify userspace the event is going away. */
-   eventfd_signal(event->eventfd, 1);
-
-   eventfd_ctx_put(event->eventfd);
-   kfree(event);
-   __cgroup_dput(cgrp);
-}
-
-/*
- * Gets called on POLLHUP on eventfd when user closes it.
- *
- * Called with wqh->lock held and interrupts disabled.
- */
-static int cgroup_event_wake(wait_queue_t *wait, unsigned mode,
-   int sync, void *key)
-{
-   struct cgroup_event *event = container_of(wait,
-

[PATCH v2 2/5] cgroup: make __cgroup_from_dentry() and __cgroup_dput() global

2013-08-05 Thread Tejun Heo

From eb06a03636eb8477ea034780c37463a086112115 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 5 Aug 2013 12:00:23 -0400

cgroup_event will no longer be supported as cgroup generic mechanism
and be moved to memcg.  To enable the relocation, implement and expose
__cgroup_from_dentry() which combines cgroup file dentry -> cgroup
mapping and cft discovery, and prefix cgroup_dput() with __ and make
it global.

These functions exist and are exported only to enable moving
cgroup_event implementation to memcg and shouldn't grow any new users
and thus the __ prefix.

This patch is pure reorganization and doesn't introduce any functional
difference.

v2: The original patch had EXPORT_SYMBOL_GPL() for the two functions
because I for some reason thought that memcg could be built as
module.  Removed as suggested by Li Zefan.

Signed-off-by: Tejun Heo 
Cc: Li Zefan 
---
 include/linux/cgroup.h |  4 
 kernel/cgroup.c| 29 +++--
 2 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 30d6ec4..2ac1021 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -923,6 +923,10 @@ bool css_is_ancestor(struct cgroup_subsys_state *cg,
 unsigned short css_id(struct cgroup_subsys_state *css);
 struct cgroup_subsys_state *cgroup_css_from_dir(struct file *f, int id);
 
+/* do not add new users of the following two functions */
+struct cgroup *__cgroup_from_dentry(struct dentry *dentry, struct cftype **cftp);
+void __cgroup_dput(struct cgroup *cgrp);
+
 #else /* !CONFIG_CGROUPS */
 
 static inline int cgroup_init_early(void) { return 0; }
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1b87e2b..568d031 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2661,14 +2661,16 @@ static struct dentry *cgroup_lookup(struct inode *dir, struct dentry *dentry, un
return NULL;
 }
 
-/*
- * Check if a file is a control file
- */
-static inline struct cftype *__file_cft(struct file *file)
+/* do not add new users */
+struct cgroup *__cgroup_from_dentry(struct dentry *dentry, struct cftype **cftp)
 {
-   if (file_inode(file)->i_fop != &cgroup_file_operations)
-   return ERR_PTR(-EINVAL);
-   return __d_cft(file->f_dentry);
+   if (!dentry->d_inode ||
+   dentry->d_inode->i_op != &cgroup_file_inode_operations)
+   return NULL;
+
+   if (cftp)
+   *cftp = __d_cft(dentry);
+   return __d_cgrp(dentry->d_parent);
 }
 
 static int cgroup_create_file(struct dentry *dentry, umode_t mode,
@@ -3953,7 +3955,7 @@ static int cgroup_write_notify_on_release(struct cgroup_subsys_state *css,
  *
  * That's why we hold a reference before dput() and drop it right after.
  */
-static void cgroup_dput(struct cgroup *cgrp)
+void __cgroup_dput(struct cgroup *cgrp)
 {
struct super_block *sb = cgrp->root->sb;
 
@@ -3983,7 +3985,7 @@ static void cgroup_event_remove(struct work_struct *work)
 
eventfd_ctx_put(event->eventfd);
kfree(event);
-   cgroup_dput(cgrp);
+   __cgroup_dput(cgrp);
 }
 
 /*
@@ -4095,9 +4097,9 @@ static int cgroup_write_event_control(struct cgroup_subsys_state *css,
if (ret < 0)
goto out_put_cfile;
 
-   event->cft = __file_cft(cfile);
-   if (IS_ERR(event->cft)) {
-   ret = PTR_ERR(event->cft);
+   cgrp_cfile = __cgroup_from_dentry(cfile->f_dentry, &event->cft);
+   if (!cgrp_cfile) {
+   ret = -EINVAL;
goto out_put_cfile;
}
 
@@ -4105,7 +4107,6 @@ static int cgroup_write_event_control(struct cgroup_subsys_state *css,
 * The file to be monitored must be in the same cgroup as
 * cgroup.event_control is.
 */
-   cgrp_cfile = __d_cgrp(cfile->f_dentry->d_parent);
if (cgrp_cfile != cgrp) {
ret = -EINVAL;
goto out_put_cfile;
@@ -4272,7 +4273,7 @@ static void css_dput_fn(struct work_struct *work)
struct cgroup_subsys_state *css =
container_of(work, struct cgroup_subsys_state, dput_work);
 
-   cgroup_dput(css->cgroup);
+   __cgroup_dput(css->cgroup);
 }
 
 static void css_release(struct percpu_ref *ref)
-- 
1.8.3.1


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek

On 08/05/13 18:47, Borislav Petkov wrote:
> On Mon, Aug 05, 2013 at 06:41:20PM +0200, Laszlo Ersek wrote:
>> I didn't realize the timestamps survive kexec. (As far as I remember
>> the kernels I played with kexec on didn't have the automatic
>> timestamps yet in dmesg, but I might have messed up just as well...)
> 
> No, no, no, kexec is not involved at all.

I understand. I just explained why I could not derive that fact from the
timestamps. You said,

> No, kexec is not even involved yet. If you look at the timestamps,
> there's 0.005 seconds between the two dumps during the *same* kernel
> booting on the machine, baremetal, straight from grub.

There are four memmap dumps:

(1) first boot, initial dump,
(2) first boot, dump when entering virtual mode,
(3) kexec boot, initial dump,
(4) kexec boot, dump when entering virtual mode.

I was aware that we were discussing a problem either between (1) and
(2), *or* between (3) and (4); I just didn't know inside "which pair".

I misunderstood your reply and thought that you were implying the
(1)+(2) pair by the low absolute timestamps. I assumed that (3)+(4)
would print low timestamps as well (due to the time offset starting from
zero in the kexec kernel too) and took your message as a correction to
that idea. But, you didn't say anything about the magnitude of the
timestamps, only about the differences between them.

Sorry for the noise, it's clear now that we're looking at (1)->(2).

Thanks
Laszlo

Re: [PATCH] trivial: adjust code alignment

2013-08-05 Thread Dan Carpenter

On Mon, Aug 05, 2013 at 06:24:43PM +0200, walter harms wrote:
> Hello Julia,
> 
> IMHO keep the patch as it is.
> It does not change any code that is good.
> Suspicious code that comes up here can be addressed
> in a separate patch.
> 

Gar... No, if we silence static checker warnings without fixing the
bug then we are hiding real problems and making them more difficult
to find.

Just drop this chunk.

regards,
dan carpenter


Re: [PATCH 1/3] mips/kvm: Improve code formatting in arch/mips/kvm/kvm_locore.S

2013-08-05 Thread Ralf Baechle

On Mon, Aug 05, 2013 at 04:43:27PM +0300, Gleb Natapov wrote:
> Date:   Mon, 5 Aug 2013 16:43:27 +0300
> From: Gleb Natapov 
> To: Ralf Baechle 
> Cc: James Hogan , David Daney
>  , linux-m...@linux-mips.org, k...@vger.kernel.org,
>  Sanjay Lal , linux-kernel@vger.kernel.org, David
>  Daney 
> Subject: Re: [PATCH 1/3] mips/kvm: Improve code formatting in
>  arch/mips/kvm/kvm_locore.S
> Content-Type: text/plain; charset=us-ascii
> 
> On Mon, Aug 05, 2013 at 03:21:57PM +0200, Ralf Baechle wrote:
> > On Mon, Aug 05, 2013 at 02:17:01PM +0100, James Hogan wrote:
> > 
> > > 
> > > On 01/08/13 21:22, David Daney wrote:
> > > > From: David Daney 
> > > > 
> > > > No code changes, just reflowing some comments and consistently using
> > > > tabs and spaces.  Object code is verified to be unchanged.
> > > > 
> > > > Signed-off-by: David Daney 
> > > > Acked-by: Ralf Baechle 
> > > 
> > > 
> > > > +/* Put the saved pointer to vcpu (s1) back into the DDATA_LO 
> > > > Register */
> > > 
> > > git am detects a whitespace error here ("space before tab in indent").
> > > It's got spaces before and after the tab actually.
> > > 
> > > >  /* load the guest context from VCPU and return */
> > > 
> > > this comment could have its indentation fixed too
> > > 
> > > Otherwise, for all 3 patches:
> > > 
> > > Reviewed-by: James Hogan 
> > 
> > I'm happy with the patch series as well and will fix this issue when
> > applying the patch.
> > 
> kvm fixes usually go through kvm.git tree for all arches. Any special
> reasons you want to get those through mips tree?

MIPS fixes usually go through the MIPS tree ;-)

I don't care which tree this stuff goes through - but a general experience
is that things that affect MIPS systems receive most testing if going
through the MIPS tree.

  Ralf

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 09:55 AM, Steven Rostedt wrote:
> 
> Almost a full year ago, Mathieu suggested something like:
> 
> if (unlikely(x)) __attribute__((section(".unlikely"))) {
> ...
> } else __attribute__((section(".likely"))) {
> ...
> }
> 
> https://lkml.org/lkml/2012/8/9/658
> 
> Which got me thinking. How hard would it be to set a block in its own
> section. Like what Mathieu suggested, but it doesn't have to be
> ".unlikely".
> 
> if (x) __attribute__((section(".foo"))) {
>   /* do something */
> }
> 

One concern I have is how this kind of code would work when embedded
inside a function which already has a section attribute.  This could
easily cause really weird bugs when someone "optimizes" an inline or
macro and breaks a single call site...

-hpa



[PATCH V9 00/14] perf tools: some fixes and tweaks

2013-08-05 Thread Adrian Hunter

Hi

Here are some fixes and tweaks to perf tools (version 9).

Changes in V9:
perf: Update perf_event_type documentation
Dropped since it's been applied
perf tools: tidy up sample parsing overflow checking
Changed to use a single overflow function
Updated for PERF_SAMPLE_READ
perf: make events stream always parsable
Added more about sample parsing to the commit message
perf tools: add support for PERF_SAMPLE_IDENTIFIER
When selecting PERF_SAMPLE_IDENTIFIER, ensure PERF_SAMPLE_ID
is deselected
perf tools: expand perf_event__synthesize_sample()
Updated for PERF_SAMPLE_READ
perf tools: add a function to calculate sample event size
Updated for PERF_SAMPLE_READ
perf tools: add a sample parsing test
Updated for PERF_SAMPLE_READ

Changes in V8:
perf tools: add debug prints
Fixed Python link errors
perf tools: move perf_evlist__config() to a new source file
New Patch to avoid Python link errors
perf tools: add support for PERF_SAMPLE_IDENTIFIER
Adjustments due to patch above

Changes in V7:
perf: Update perf_event_type documentation
Proposed new patch from Peter Zijlstra
perf: make events stream always parsable
Adjustments due to patch above
perf tools: tidy up sample parsing overflow checking
Change to a single overflow function
Amend comment
perf tools: add a function to calculate sample event size
New patch
perf tools: add a sample parsing test
Amended to use sample event size calculation

Changes in V6:
Some checkpatch fixes

perf: make events stream always parsable
Add sample format comments

Changes in V5:
Re-based to Arnaldo's tree and dropped already applied patches:
perf tools: remove unused parameter
perf tools: fix missing tool parameter
perf tools: fix missing 'finished_round'
perf tools: fix parse_events_terms() segfault on error path
perf tools: fix new_term() missing free on error path
perf tools: add const specifier to perf_pmu__find name parameter
perf tools: tidy duplicated munmap code
perf tools: validate perf event header size

perf tools: add debug prints
Changed to perf_event_attr__fprintf()
perf tools: add pid to struct thread
Always set the pid, even if a pid is already set
perf tools: change machine__findnew_thread() to set thread pid
Replaces: perf tools: change "machine" functions to set thread 
pid
perf tools: add support for PERF_SAMPLE_IDENTIFIER
Only use PERF_SAMPLE_IDENTIFIER if sample types are different
perf tools: expand perf_event__synthesize_sample()
New patch in preparation of a sample parsing test
perf tools: add a sample parsing test
New patch

Changes in V4:
I added kernel support for matching sample types via
PERF_SAMPLE_IDENTIFIER.  perf tools support for that required
first fixing some other things.

perf tools: fix parse_events_terms() freeing local variable on error 
path
Dropped - covered by David Ahern
perf tools: struct thread has a tid not a pid
Added ack by David Ahern
perf tools: add pid to struct thread
Remove unused function
perf tools: fix missing increment in sample parsing
New patch
perf tools: tidy up sample parsing overflow checking
New patch
perf tools: remove unnecessary callchain validation
New patch
perf tools: remove references to struct ip_event
New patch
perf tools: move struct ip_event
New patch
perf: make events stream always parsable
New patch
perf tools: add support for PERF_SAMPLE_IDENTIFIER
New patch

Changes in V3:
perf tools: add pid to struct thread
Split into 2 patches
perf tools: fix ppid in thread__fork()
Dropped for now

Changes in V2:
perf tools: fix missing tool parameter
Fixed one extra occurrence
perf tools: fix parse_events_terms() freeing local variable on error 
path
Made "freeing" code into a new function
perf tools: validate perf event header size
Corrected byte-swapping
perf tools: allow non-matching sample types
Added comments
Fixed id_pos calculation
id_pos/is_pos updated whenever sample_type c

Re: Linux 3.11-rc4

2013-08-05 Thread Felipe Contreras

On Mon, Aug 5, 2013 at 9:39 AM, Oleg Nesterov  wrote:
> On 08/05, Felipe Contreras wrote:
>>
>> On Mon, Aug 5, 2013 at 8:29 AM, Oleg Nesterov  wrote:
>> >
>> > Could you please run wine under strace
>> >
>> > strace -f -e ptrace -o LOG wine ...
>> >
>> > and show the result?
>>
>> Sure.
>
> Thanks.
>
>> Note that the crash might have happened some time before the end
>> of the log.
>
> Hmm. It should not crash under strace... please see below.
>
>> 953   ptrace(PTRACE_ATTACH, 1035, 0, 0) = -1 EPERM (Operation not permitted)
>
> OK, so it actually uses ptrace ;)
>
> PTRACE_ATTACH fails because this child is already traced by strace, I guess.
>
> So does Starcraft crash this way? Or does it fail in some other way?

It's crashing just the same.

> And just in case... perhaps wine does some logging too?

Yeah, but there doesn't seem to be anything interesting:

http://bugs.winehq.org/attachment.cgi?id=45489

-- 
Felipe Contreras

Re: [PATCH 1/2] input: ti_tsc: Enable shared IRQ for TSC

2013-08-05 Thread Zubair Lutfullah

On Mon, Aug 05, 2013 at 09:12:56AM -0700, Dmitry Torokhov wrote:
> > > Touchscreen and ADC share the same IRQ line from parent MFD core.
> > > Previously only Touchscreen was interrupt based.
> > > With continuous mode support added in ADC driver, driver requires
> > > interrupt to process the ADC samples, so enable shared IRQ flag bit for
> > > touchscreen.
> > > 
> > > @@ -260,8 +260,18 @@ static irqreturn_t titsc_irq(int irq, void *dev)
> > >   unsigned int fsm;
> > >  
> > > + /*
> > > +  * ADC and touchscreen share the IRQ line.
> > > +  * FIFO1 threshold, FIFO1 Overrun and FIFO1 underflow
> > > +  * interrupts are used by ADC,
> > > +  * hence return from touchscreen IRQ handler if FIFO1
> > > +  * related interrupts occurred.
> > > +  */
> > > + if ((status & IRQENB_FIFO1THRES) ||
> > > + (status & IRQENB_FIFO1OVRRUN) ||
> > > + (status & IRQENB_FIFO1UNDRFLW))
> > > + return IRQ_NONE;
> > > + else if (status & IRQENB_FIFO0THRES) {
> 
> What happens if both parts have data at the same time? Can both
> IRQENB_FIFO1THRES and IRQENB_FIFO0THRES be signalled? What will happen
> in this case?

If ADC is sampling and someone is touching the TSC, both interrupts
can signal so closely that for the purpose of the kernel,
they can be seen as signaled together.

FIFO 1 is used only by the ADC, and the FIFO1THRES handler is inside the
iio/adc driver.
FIFO 0 is used only by the TSC, and the FIFO0THRES handler is inside the
input/touchscreen driver.

Note: These are level interrupts.

I would like some input on how to handle such a situation. 
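
To make the question concrete, here is a small userspace model of one possible
arrangement. It is not the patch's approach and not driver code: each handler
on the shared line looks only at its own status bit, acknowledges it, and
reports "not mine" otherwise, while dispatch_shared() stands in for the core
calling every handler registered on a shared line. The ST_* bit values and all
function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits; the real IRQENB_* values come from the MFD header. */
#define ST_FIFO0THRES (1u << 2) /* touchscreen FIFO threshold */
#define ST_FIFO1THRES (1u << 5) /* ADC FIFO threshold */

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED return values. */
enum irq_ret { SIM_IRQ_NONE = 0, SIM_IRQ_HANDLED = 1 };

/* Touchscreen side: service and acknowledge only the FIFO0 bit. */
enum irq_ret tsc_handler(uint32_t *status)
{
    assert(status != NULL);
    if (!(*status & ST_FIFO0THRES))
        return SIM_IRQ_NONE;
    *status &= ~ST_FIFO0THRES;
    return SIM_IRQ_HANDLED;
}

/* ADC side: service and acknowledge only the FIFO1 bit. */
enum irq_ret adc_handler(uint32_t *status)
{
    assert(status != NULL);
    if (!(*status & ST_FIFO1THRES))
        return SIM_IRQ_NONE;
    *status &= ~ST_FIFO1THRES;
    return SIM_IRQ_HANDLED;
}

/*
 * Model of a shared line: every registered handler is called in turn.
 * Returns how many handlers claimed the interrupt.
 */
int dispatch_shared(uint32_t *status)
{
    int handled = 0;

    handled += tsc_handler(status) == SIM_IRQ_HANDLED;
    handled += adc_handler(status) == SIM_IRQ_HANDLED;
    return handled;
}
```

In this model, when both FIFO0THRES and FIFO1THRES are pending, both handlers
claim the interrupt and the status word ends up clear. Whether the real
hardware and the MFD core allow the bits to be acknowledged independently like
this is exactly the open question.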

Thanks
Zubair Lutfullah

RE: [edk2] Corrupted EFI region

2013-08-05 Thread Kinney, Michael D

Boris,

A memory map entry with zero size does not look right to me.

The memory map passed into SetVirtualAddressMap() must contain the exact same 
set of memory map entries that existed when ExitBootServices() was called with 
a return result of EFI_SUCCESS.

When you are showing comparisons of memory maps, are you showing the
ExitBootServices() one and the SetVirtualAddressMap() one?  If the memory maps
are not identical, then somehow the memory map is being modified, and we need
to figure that out.

If the ExitBootServices() memory map has the zero sized entry, then we need to
see how GetMemoryMap() is returning a zero sized entry.  It is not clear that a
zero sized entry would actually break anything, but it is a good idea to root
cause that issue and make sure those types of memory map entries are not passed
from the FW to the OS.
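
The two checks described above (are the maps identical, and does either contain
a zero sized entry) can be sketched as follows. This is an illustration only:
struct mem_region is a simplified, hypothetical stand-in for the firmware's
memory descriptor that leaves out the virtual address, which the OS fills in
before SetVirtualAddressMap(), and both function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical memory map entry (no virtual address field). */
struct mem_region {
    uint32_t type;
    uint64_t start;     /* physical start address */
    uint64_t num_pages; /* size in 4 KiB pages */
    uint64_t attr;
};

/* True if both maps have the same entries in the same order. */
bool maps_identical(const struct mem_region *a, size_t na,
                    const struct mem_region *b, size_t nb)
{
    assert((a != NULL || na == 0) && (b != NULL || nb == 0));
    if (na != nb)
        return false;
    for (size_t i = 0; i < na; i++) {
        if (a[i].type != b[i].type || a[i].start != b[i].start ||
            a[i].num_pages != b[i].num_pages || a[i].attr != b[i].attr)
            return false;
    }
    return true;
}

/* Index of the first zero sized entry, or n if there is none. */
size_t first_zero_sized(const struct mem_region *m, size_t n)
{
    assert(m != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (m[i].num_pages == 0)
            return i;
    }
    return n;
}
```

Feeding it the mem11 entry quoted below, once as [0x7e0ad000-0x7e0cc000) (31
pages) and once as [0x7e0ad000-0x7e0ad000) (0 pages), reports the maps as
different and flags the second one as containing a zero sized entry.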

Thanks,

Mike


-Original Message-
From: Borislav Petkov [mailto:b...@alien8.de] 
Sent: Monday, August 05, 2013 9:48 AM
To: Laszlo Ersek
Cc: linux-...@vger.kernel.org; Gleb Natapov; edk2-de...@lists.sourceforge.net; 
lkml; David Woodhouse
Subject: Re: [edk2] Corrupted EFI region

On Mon, Aug 05, 2013 at 06:41:20PM +0200, Laszlo Ersek wrote:
> I didn't realize the timestamps survive kexec. (As far as I remember
> the kernels I played with kexec on didn't have the automatic
> timestamps yet in dmesg, but I might have messed up just as well...)

No, no, no, kexec is not involved at all.

Here's the whole dmesg up until efi_enter_virtual_mode. When we have entered
efi_enter_virtual_mode, the region has changed from

[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)

to

[0.023004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)


And yes, I still need to audit whether the kernel actually does that
change. I'm still looking...


early console in decompress_kernel

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.10.0-rc7+ (boris@nazgul) (gcc version 4.7.3 
(Debian 4.7.3-4) ) #9 SMP PREEMPT Mon Aug 5 16:27:00 CEST 2013
[0.00] Command line: root=/dev/sda1 debug ignore_loglevel 
log_buf_len=10M earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009] usable
[0.00] BIOS-e820: [mem 0x0010-0x7e667fff] usable
[0.00] BIOS-e820: [mem 0x7e668000-0x7e691fff] reserved
[0.00] BIOS-e820: [mem 0x7e692000-0x7fb11fff] usable
[0.00] BIOS-e820: [mem 0x7fb12000-0x7fb69fff] reserved
[0.00] BIOS-e820: [mem 0x7fb6a000-0x7fb71fff] ACPI data
[0.00] BIOS-e820: [mem 0x7fb72000-0x7fb75fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7fb76000-0x7ffd] usable
[0.00] BIOS-e820: [mem 0x7ffe-0x7fff] reserved
[0.00] debug: ignoring loglevel setting.
[0.00] bootconsole [earlyser0] enabled
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.31 by EDK II
[0.00] efi:  ACPI=0x7fb71000  ACPI 2.0=0x7fb71014 
[0.00] efi: mem00: type=7, attr=0xf, 
range=[0x-0x0009f000) (0MB)
[0.00] efi: mem01: type=2, attr=0xf, 
range=[0x0009f000-0x000a) (0MB)
[0.00] efi: mem02: type=7, attr=0xf, 
range=[0x0010-0x0080) (7MB)
[0.00] efi: mem03: type=4, attr=0xf, 
range=[0x0080-0x0100) (8MB)
[0.00] efi: mem04: type=7, attr=0xf, 
range=[0x0100-0x0200) (16MB)
[0.00] efi: mem05: type=2, attr=0xf, 
range=[0x0200-0x036e3000) (22MB)
[0.00] efi: mem06: type=7, attr=0xf, 
range=[0x036e3000-0x3fffb000) (969MB)
[0.00] efi: mem07: type=2, attr=0xf, 
range=[0x3fffb000-0x4000) (0MB)
[0.00] efi: mem08: type=7, attr=0xf, 
range=[0x4000-0x7c00) (960MB)
[0.00] efi: mem09: type=4, attr=0xf, 
range=[0x7c00-0x7c02) (0MB)
[0.00] efi: mem10: type=7, attr=0xf, 
range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, 
range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, 
range=[0x7e0cc000-0x7e0cd000) (0MB)
[0.00] efi: mem13: type=4, attr=0xf, 
range=[0x7e0cd000-0x7e55d000) (4MB)
[0.00] efi: mem14: type=3, attr=0xf, 
range=[0x0

Re: [QUERY] lguest64

2013-08-05 Thread H. Peter Anvin

On 08/05/2013 09:50 AM, Konrad Rzeszutek Wilk wrote:
>>>
>>> Let me iterate down what the experimental patch uses:
>>>
>>>  struct pv_init_ops pv_init_ops;
>>>  
>>> [still use xen_patch, but I think that is not needed anymore]
>>>
>>>  struct pv_time_ops pv_time_ops;
>>>  
>>> [we need that as we are using the PV clock source]
>>>
>>>  struct pv_cpu_ops pv_cpu_ops;  
>>>  
>>> [only end up using cpuid. This one is a tricky one. We could
>>>  arguably remove it but it does do some filtering - for example
>>>  THERM is turned off, or MWAIT if a certain hypercall tells us to
>>>  disable that. Since this is now a trapped operation this could be
>>>  handled in the hypervisor - but then it would be in charge of
>>>  filtering certain CPUID - and this is at bootup - so there is no
>>>  user interaction. This needs a bit more thinking]
>>>
>> read_msr/write_msr in this one make all msr accesses safe. IIRC there
>> are MSRs that Linux uses without checking cpuid bits.
>> IA32_PERF_CAPABILITIES for instance is used without checking PDCM bit.
> 
> Right, those are needed as well. Completely forgot about them.

CPUID is not too bad.  RDMSR/WRMSR is actually worse since there are
some MSRs which are performance-critical.  The really messy pvops are
the memory-related ones, as they don't match the hardware behavior.

Similarly, beyond pvops, what new assumptions does this code add to the
code base?

-hpa



Re: [PATCH v7] dmaengine: Add MOXA ART DMA engine driver

2013-08-05 Thread Mark Rutland

On Mon, Aug 05, 2013 at 03:37:37PM +0100, Jonas Jensen wrote:
> The MOXA ART SoC has a DMA controller capable of offloading expensive
> memory operations, such as large copies. This patch adds support for
> the controller including four channels. Two of these are used to
> handle MMC copy on the UC-7112-LX hardware. The remaining two can be
> used in a future audio driver or client application.
>
> Signed-off-by: Jonas Jensen 
> ---
>
> Notes:
> Thanks for the replies.
>
> Changes since v6:
>
> 1. move callback from interrupt context to tasklet
> 2. remove callback and callback_param, use those provided by tx_desc
> 3. don't rely on structs for register offsets
> 4. remove local bool "found" variable from moxart_alloc_chan_resources()
> 5. check return value of irq_of_parse_and_map
> 6. use devm_request_irq instead of setup_irq
> 7. elaborate commit message
>
> device tree bindings document:
> 8. in the example, change "#dma-cells" to "<2>"
>
> Applies to next-20130805
>
>  .../devicetree/bindings/dma/moxa,moxart-dma.txt|  21 +
>  drivers/dma/Kconfig|   7 +
>  drivers/dma/Makefile   |   1 +
>  drivers/dma/moxart-dma.c   | 614 +
>  4 files changed, 643 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/dma/moxa,moxart-dma.txt
>  create mode 100644 drivers/dma/moxart-dma.c
>
> diff --git a/Documentation/devicetree/bindings/dma/moxa,moxart-dma.txt b/Documentation/devicetree/bindings/dma/moxa,moxart-dma.txt
> new file mode 100644
> index 000..5b9f82c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/moxa,moxart-dma.txt
> @@ -0,0 +1,21 @@
> +MOXA ART DMA Controller
> +
> +See dma.txt first
> +
> +Required properties:
> +
> +- compatible : Must be "moxa,moxart-dma"
> +- reg :Should contain registers location and length
> +- interrupts : Should contain the interrupt number
> +- #dma-cells : Should be 2
> +   cell index 0: channel number between 0-3
> +   cell index 1: line request number
> +
> +Example:
> +
> +   dma: dma@9050 {
> +   compatible = "moxa,moxart-dma";
> +   reg = <0x9050 0x1000>;
> +   interrupts = <24 0>;
> +   #dma-cells = <2>;
> +   };

Thanks for the updates on this. :)

The binding and example look sensible to me; it would be nice if someone
familiar with the dma subsystem could check that this has the necessary
information.

[...]

> +static int moxart_alloc_chan_resources(struct dma_chan *chan)
> +{
> +   struct moxart_dma_chan *mchan = to_moxart_dma_chan(chan);
> +   int i;
> +
> +   for (i = 0; i < APB_DMA_MAX_CHANNEL; i++) {
> +   if (i == mchan->ch_num
> +   && !mchan->allocated) {
> +   dev_dbg(chan2dev(chan), "%s: allocating channel #%d\n",
> +   __func__, mchan->ch_num);
> +   mchan->allocated = true;
> +   return 0;
> +   }
> +   }

Come to think of it, why do you need to iterate over all of the channels
to handle a particular channel number that you already know, and already
have the struct for?

I'm not familiar with the dma subsystem, and I couldn't spot when the
dma channel is actually assigned/selected prior to this.
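
To make the question concrete, here is a minimal sketch of a loop-free form. It is self-contained and uses hypothetical stand-ins (struct demo_chan, demo_alloc_chan) rather than the driver's real types, and the -EBUSY return for an already-claimed channel is an assumption, since the quoted hunk does not show what the original returns on failure.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical stand-in for the driver's per-channel state. */
struct demo_chan {
	int ch_num;
	bool allocated;
};

/*
 * Claim the channel we were handed. The caller already holds the
 * struct for the channel it wants, so no scan over the other
 * channels is needed.
 */
static int demo_alloc_chan(struct demo_chan *mchan)
{
	if (mchan->allocated)
		return -EBUSY;	/* assumed error code, not from the patch */

	mchan->allocated = true;
	return 0;
}
```

Whether this is sufficient depends on how the dma core selects the channel before calling in, which is the open question above.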

[...]

> +static enum dma_status moxart_tx_status(struct dma_chan *chan,
> +   dma_cookie_t cookie,
> +   struct dma_tx_state *txstate)
> +{
> +   enum dma_status ret;
> +
> +   ret = dma_cookie_status(chan, cookie, txstate);
> +   if (ret == DMA_SUCCESS || !txstate)
> +   return ret;
> +
> +   return ret;

No special status handling?

This function is equivalent to:

return dma_cookie_status(chan, cookie, txstate);

[...]

> +static int moxart_probe(struct platform_device *pdev)
> +{
> +   struct device *dev = &pdev->dev;
> +   struct device_node *node = dev->of_node;
> +   struct resource *res;
> +   static void __iomem *dma_base_addr;
> +   int ret, i;
> +   unsigned int irq;
> +   struct moxart_dma_chan *mchan;
> +   struct moxart_dma_container *mdc;
> +
> +   mdc = devm_kzalloc(dev, sizeof(*mdc), GFP_KERNEL);
> +   if (!mdc) {
> +   dev_err(dev, "can't allocate DMA container\n");
>

[PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)

2013-08-05 Thread Oleg Nesterov

To simplify the review of the next patches:

1. We are going to reimplement __perf_task/counter and embed them
   into TP_ARGS(). Expand TRACE_EVENT(sched_stat_runtime) into
   DECLARE_EVENT_CLASS() + DEFINE_EVENT(), this way they can use
   different TP_ARGS's.

2. Change perf_trace_##call() macro to do perf_fetch_caller_regs()
   right before perf_trace_buf_prepare().

   This way it evaluates TP_ARGS() asap, the next patch exploits
   this fact.

   Note: after 87f44bbc perf_trace_buf_prepare() doesn't need
   "struct pt_regs *regs", perhaps it makes sense to remove this
   argument. And perhaps we can teach perf_trace_buf_submit()
   to accept regs == NULL and do fetch_caller_regs(CALLER_ADDR1)
   in this case.

3. Cosmetic, but the typecast from "void*" buys nothing. It just
   adds the noise, remove it.

Signed-off-by: Oleg Nesterov 
Tested-by: David Ahern 
Reviewed-and-Acked-by: Steven Rostedt 
---
 include/trace/events/sched.h |6 +-
 include/trace/ftrace.h   |7 +++
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index e5586ca..249c024 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -372,7 +372,7 @@ DEFINE_EVENT(sched_stat_template, sched_stat_blocked,
  * Tracepoint for accounting runtime (time the task is executing
  * on a CPU).
  */
-TRACE_EVENT(sched_stat_runtime,
+DECLARE_EVENT_CLASS(sched_stat_runtime,
 
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
 
@@ -401,6 +401,10 @@ TRACE_EVENT(sched_stat_runtime,
(unsigned long long)__entry->vruntime)
 );
 
+DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
+TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
+TP_ARGS(tsk, runtime, vruntime));
+
 /*
  * Tracepoint for showing priority inheritance modifying a tasks
  * priority.
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 41a6643..618af05 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -663,15 +663,14 @@ perf_trace_##call(void *__data, proto) \
int __data_size;\
int rctx;   \
\
-   perf_fetch_caller_regs(&__regs);\
-   \
__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
 sizeof(u64));  \
__entry_size -= sizeof(u32);\
\
-   entry = (struct ftrace_raw_##call *)perf_trace_buf_prepare( \
-   __entry_size, event_call->event.type, &__regs, &rctx);  \
+   perf_fetch_caller_regs(&__regs);\
+   entry = perf_trace_buf_prepare(__entry_size,\
+   event_call->event.type, &__regs, &rctx);\
if (!entry) \
return; \
\
-- 
1.5.5.1


[PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible

2013-08-05 Thread Oleg Nesterov

perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL)
make no sense if hlist_empty(head). Change perf_trace_##call()
to check ->perf_events beforehand and do nothing if it is empty.

This removes the overhead for tasks without events associated
with them. For example, "perf record -e sched:sched_switch -p1"
attaches the counter(s) to the single task, but every task in
system will do perf_trace_buf_prepare/submit() just to realize
that it was not attached to this event.

However, we can only do this if __task == NULL, so we also add
the __builtin_constant_p(__task) check.
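
A minimal user-space sketch of the shape of this check, assuming GCC or Clang for __builtin_constant_p(). All demo_* names are hypothetical stand-ins, and the boolean no_events stands in for hlist_empty(head):

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_task { int pid; };

static int demo_submit_calls;	/* counts how often the slow path ran */

/* Hypothetical stand-in for perf_trace_buf_prepare() + submit(). */
static void demo_prepare_and_submit(struct demo_task *task)
{
	(void)task;
	demo_submit_calls++;
}

static inline void demo_trace(struct demo_task *task, bool no_events)
{
	/*
	 * Bail out early only when the compiler can prove at build time
	 * that task is NULL and nothing is attached. If it cannot prove
	 * that, the condition is false and the slow path runs as before.
	 */
	if (__builtin_constant_p(!task) && !task && no_events)
		return;

	demo_prepare_and_submit(task);
}
```

A non-NULL task or a non-empty list always reaches the slow path. Whether the NULL-and-empty case is actually skipped depends on the compiler folding the test, so the check can only remove work, never change the result.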

With this patch "perf bench sched pipe" shows approximately 4%
improvement when "perf record -p1" runs in parallel, many thanks
to Steven for the testing.

Signed-off-by: Oleg Nesterov 
Tested-by: David Ahern 
Reviewed-and-Acked-by: Steven Rostedt 
---
 include/trace/ftrace.h |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 4163d93..5c7ab17 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -667,6 +667,12 @@ perf_trace_##call(void *__data, proto) \
int rctx;   \
\
__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
+   \
+   head = this_cpu_ptr(event_call->perf_events);   \
+   if (__builtin_constant_p(!__task) && !__task && \
+   hlist_empty(head))  \
+   return; \
+   \
__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
 sizeof(u64));  \
__entry_size -= sizeof(u32);\
@@ -681,7 +687,6 @@ perf_trace_##call(void *__data, proto) \
\
{ assign; } \
\
-   head = this_cpu_ptr(event_call->perf_events);   \
perf_trace_buf_submit(entry, __entry_size, rctx, __addr,\
__count, &__regs, head, __task);\
 }
-- 
1.5.5.1


[PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events)

2013-08-05 Thread Oleg Nesterov

Sorry for double post, forgot to cc lkml...

On 07/19, Ingo Molnar wrote:
>
> * Oleg Nesterov  wrote:
>
> > Hello.
> >
> > The patches are the same, I only tried to update the changelogs a bit.
> > I am also quoting my old email below, to explain what this hack tries
> > to do.
> >
> > Say, "perf record -e sched:sched_switch -p1".
> >
> > Every task except /sbin/init will do perf_trace_sched_switch() and
> > perf_trace_buf_prepare() + perf_trace_buf_submit() for no reason,
> > it doesn't have a counter.
> >
> > So it makes sense to add the fast-path check at the start of
> > perf_trace_##call(),
> >
> > if (hlist_empty(event_call->perf_events))
> > return;
> >
> > The problem is, we should not do this if __task != NULL (iow, if
> > DECLARE_EVENT_CLASS() uses __perf_task()), perf_tp_event() has the
> > additional code for this case.
> >
> > So we should do
> >
> > if (!__task && hlist_empty(event_call->perf_events))
> > return;
> >
> > But __task is changed by "{ assign; }" block right before
> > perf_trace_buf_submit(). Too late for the fast-path check,
> > we already called perf_trace_buf_prepare/fetch_regs.
> >
> > So. After 2/3 __perf_task() (and __perf_count/addr) is called
> > when ftrace_get_offsets_##call(args) evaluates the arguments,
> > and we can check !__task && hlist_empty() right after that.
> >
> > Oleg.
>
> Nice improvement.
>
> Peter, Steve, any objections?

Ingo,

It seems that everybody agrees with this hack but it was forgotten,
let me resend it.

The only change is that I added the following tags:

Tested-by: David Ahern 
Reviewed-and-Acked-by: Steven Rostedt 

Oleg.

 include/trace/events/sched.h |   22 --
 include/trace/ftrace.h   |   33 -
 2 files changed, 28 insertions(+), 27 deletions(-)


[PATCH 2/3] tracing/perf: Reimplement TP_perf_assign() logic

2013-08-05 Thread Oleg Nesterov

The next patch tries to avoid the costly perf_trace_buf_* calls
when possible but there is a problem. We can only do this if
__task == NULL, perf_tp_event(task != NULL) has the additional
code for this case.

Unfortunately, TP_perf_assign/__perf_xxx which changes the default
values of __count/__task variables for perf_trace_buf_submit() is
called "too late", after we already did perf_trace_buf_prepare(),
and the optimization above can't work.

So this patch simply embeds __perf_xxx() into TP_ARGS(), this way
DECLARE_EVENT_CLASS() can use the result of assignments hidden in
"args" right after ftrace_get_offsets_##call() which is mostly
trivial. This allows us to have the fast-path "__task != NULL"
check at the start, see the next patch.
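
The mechanism can be sketched in plain C. This is an illustration under assumptions, not the real macros: demo_perf_task(), demo_perf_count(), demo_get_offsets() and demo_probe() are hypothetical analogues. The point is that each wrapper is an assignment expression, so evaluating the argument list both records the value in a local and passes it on.

```c
#include <stddef.h>

struct demo_task { int pid; };

/*
 * Hypothetical analogues of __perf_task()/__perf_count(): each is an
 * assignment expression that stores into a local of the enclosing
 * function and still yields the value to the callee.
 */
#define demo_perf_task(t)	(task = (t))
#define demo_perf_count(c)	(count = (c))

/* Stand-in for ftrace_get_offsets_##call(): it only consumes the args. */
static int demo_get_offsets(struct demo_task *tsk, unsigned long long delay)
{
	(void)tsk;
	(void)delay;
	return 0;
}

/*
 * Evaluating the arguments sets task and count before any buffer
 * work happens, which is what an early fast-path check needs.
 */
static int demo_probe(struct demo_task *tsk, unsigned long long delay,
		      struct demo_task **task_out,
		      unsigned long long *count_out)
{
	struct demo_task *task = NULL;	/* default, as for __task */
	unsigned long long count = 1;	/* default, as for __count */
	int size;

	size = demo_get_offsets(demo_perf_task(tsk), demo_perf_count(delay));

	*task_out = task;
	*count_out = count;
	return size;
}
```

After demo_probe() returns, the out-parameters hold whatever the wrapped arguments assigned. An event class that wraps nothing would leave the defaults untouched.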

Signed-off-by: Oleg Nesterov 
Tested-by: David Ahern 
Reviewed-and-Acked-by: Steven Rostedt 
---
 include/trace/events/sched.h |   16 +++-
 include/trace/ftrace.h   |   19 +++
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 249c024..2e7d994 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -57,7 +57,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 
TP_PROTO(struct task_struct *p, int success),
 
-   TP_ARGS(p, success),
+   TP_ARGS(__perf_task(p), success),
 
TP_STRUCT__entry(
__array(char,   comm,   TASK_COMM_LEN   )
@@ -73,9 +73,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
__entry->prio   = p->prio;
__entry->success= success;
__entry->target_cpu = task_cpu(p);
-   )
-   TP_perf_assign(
-   __perf_task(p);
),
 
TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
@@ -313,7 +310,7 @@ DECLARE_EVENT_CLASS(sched_stat_template,
 
TP_PROTO(struct task_struct *tsk, u64 delay),
 
-   TP_ARGS(tsk, delay),
+   TP_ARGS(__perf_task(tsk), __perf_count(delay)),
 
TP_STRUCT__entry(
__array( char,  comm,   TASK_COMM_LEN   )
@@ -325,10 +322,6 @@ DECLARE_EVENT_CLASS(sched_stat_template,
memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
__entry->pid= tsk->pid;
__entry->delay  = delay;
-   )
-   TP_perf_assign(
-   __perf_count(delay);
-   __perf_task(tsk);
),
 
TP_printk("comm=%s pid=%d delay=%Lu [ns]",
@@ -376,7 +369,7 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
 
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
 
-   TP_ARGS(tsk, runtime, vruntime),
+   TP_ARGS(tsk, __perf_count(runtime), vruntime),
 
TP_STRUCT__entry(
__array( char,  comm,   TASK_COMM_LEN   )
@@ -390,9 +383,6 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
__entry->pid= tsk->pid;
__entry->runtime= runtime;
__entry->vruntime   = vruntime;
-   )
-   TP_perf_assign(
-   __perf_count(runtime);
),
 
TP_printk("comm=%s pid=%d runtime=%Lu [ns] vruntime=%Lu [ns]",
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 618af05..4163d93 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -507,8 +507,14 @@ static inline notrace int ftrace_get_offsets_##call( \
 #undef TP_fast_assign
 #define TP_fast_assign(args...) args
 
-#undef TP_perf_assign
-#define TP_perf_assign(args...)
+#undef __perf_addr
+#define __perf_addr(a) (a)
+
+#undef __perf_count
+#define __perf_count(c)(c)
+
+#undef __perf_task
+#define __perf_task(t) (t)
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@@ -636,16 +642,13 @@ __attribute__((section("_ftrace_events"))) *__event_##call = &event_##call
 #define __get_str(field) (char *)__get_dynamic_array(field)
 
 #undef __perf_addr
-#define __perf_addr(a) __addr = (a)
+#define __perf_addr(a) (__addr = (a))
 
 #undef __perf_count
-#define __perf_count(c) __count = (c)
+#define __perf_count(c)(__count = (c))
 
 #undef __perf_task
-#define __perf_task(t) __task = (t)
-
-#undef TP_perf_assign
-#define TP_perf_assign(args...) args
+#define __perf_task(t) (__task = (t))
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
-- 
1.5.5.1


[RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Steven Rostedt

[ sent to both Linux kernel mailing list and to gcc list ]

I was looking at some of the old code I still have marked in my TODO
list, that I never pushed to get mainlined. One of them is to move trace
point logic out of the fast path to get rid of the stress that it
imposes on the icache.

Almost a full year ago, Mathieu suggested something like:

if (unlikely(x)) __attribute__((section(".unlikely"))) {
...
} else __attribute__((section(".likely"))) {
...
}

https://lkml.org/lkml/2012/8/9/658

Which got me thinking. How hard would it be to put a block in its own
section? Like what Mathieu suggested, but it doesn't have to be
".unlikely".

if (x) __attribute__((section(".foo"))) {
/* do something */
}

Then have in the assembly, simply:

test x
beq 2f
1:
/* continue */
ret

2:
jmp foo1
3:
jmp 1b


Then in section ".foo":

foo1:
/* do something */
jmp 3b

Perhaps we can't use the section attribute. We could create a new
attribute. Perhaps a __jmp_section__ or whatever (I'm horrible with
names).

Is this a possibility?

If this is possible, we can get a lot of code out of the fast path.
Things like stats and tracing, which is mostly default off. I would
imagine that we would get better performance by doing this. Especially
as tracepoints are being added all over the place.
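
For comparison, here is a sketch of what can be done at function granularity with existing attributes, assuming GCC or Clang on an ELF target. It is not the block-level feature requested above: the cold code is reached by a call with its arguments passed, rather than by a jump into another section. The section name ".text.demo_slow" and all demo_* identifiers are made up for illustration.

```c
/* State touched only by the rarely-taken path. */
static long demo_stat_hits;

/*
 * Rarely-taken work lives in its own function, kept out of line and
 * placed in a separately named section, away from the hot text.
 */
__attribute__((noinline, cold, section(".text.demo_slow")))
static void demo_slow_path(int value)
{
	demo_stat_hits += value;
}

/* Hot path: only the test and a call remain inline. */
static int demo_fast_path(int x)
{
	if (__builtin_expect(x < 0, 0))
		demo_slow_path(x);

	return x + 1;
}
```

The cost relative to the proposal is the call sequence and argument setup left in the hot path, which is the part a per-block section attribute would move out.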

Thanks,

-- Steve



Re: [PATCH 1/3] mips/kvm: Improve code formatting in arch/mips/kvm/kvm_locore.S

2013-08-05 Thread David Daney


On 08/05/2013 06:43 AM, Gleb Natapov wrote:

On Mon, Aug 05, 2013 at 03:21:57PM +0200, Ralf Baechle wrote:

On Mon, Aug 05, 2013 at 02:17:01PM +0100, James Hogan wrote:



On 01/08/13 21:22, David Daney wrote:

From: David Daney 

No code changes, just reflowing some comments and consistently using
tabs and spaces.  Object code is verified to be unchanged.

Signed-off-by: David Daney 
Acked-by: Ralf Baechle 




+/* Put the saved pointer to vcpu (s1) back into the DDATA_LO Register */


git am detects a whitespace error here ("space before tab in indent").
It's got spaces before and after the tab actually.


  /* load the guest context from VCPU and return */


this comment could have its indentation fixed too

Otherwise, for all 3 patches:

Reviewed-by: James Hogan 


I'm happy with the patch series as well and will fix this issue when
applying the patch.


kvm fixes usually go through kvm.git tree for all arches. Any special
reasons you want to get those through mips tree?



I don't really care which tree takes this particular patch set.

However, in the near future, I will be sending revised versions of 
patches needed by MIPS/KVM that are in files outside of the 
arch/mips/kvm directory and it is possible that those may suffer patch 
ordering problems if merged through a 'foreign tree'.


In any event, there is the problem with the whitespace error in the 
comment.  I blame checkpatch.pl for not flagging it, but that is not 
really a good excuse.  If it goes by the KVM tree, do you want me to 
send a corrected patch?  Or can you fix it when you merge it?


David Daney


Re: [PATCH] trivial: adjust code alignment

2013-08-05 Thread Jonathan Corbet

On Mon, 5 Aug 2013 18:19:18 +0200 (CEST)
Julia Lawall  wrote:

> Oops, thanks for spotting that.  I'm not sure whether it is safe to abort 
> these calls as soon as the first one fails, but perhaps I could introduce 
> some more variables, and test them all afterwards.

Yes, it would be safe.  But it's hard to imagine a scenario where any of
those particular calls would fail that doesn't involve smoke.

The code is evidence of ancient laziness on my part.  I'll add fixing it
up to my list of things to do.

Thanks,

jon

Re: [QUERY] lguest64

2013-08-05 Thread Konrad Rzeszutek Wilk

On Sun, Aug 04, 2013 at 03:37:08PM +0300, Gleb Natapov wrote:
> On Fri, Aug 02, 2013 at 03:09:34PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 31, 2013 at 06:25:04AM -0700, H. Peter Anvin wrote:
> > > On 07/31/2013 06:17 AM, Konrad Rzeszutek Wilk wrote:
> > > >>
> > > >> The big problem with pvops is that they are a permanent tax on future
> > > >> development -- a classic case of "the hooks problem."  As such it is
> > > >> important that there be a real, significant, use case with enough users
> > > >> to make the pain worthwhile.  With Xen looking at sunsetting PV support
> > > >> with a long horizon, it might currently be possible to remove pvops some
> > > > 
> > > > PV MMU parts specifically.
> > > > 
> > > 
> > > Pretty much stuff that is driverized on plain hardware doesn't matter.
> > > What are you looking at with respect to the basic CPU control state?
> > 
> > 
> > CC-ing Mukesh here.
> > 
> > Let me iterate down what the experimental patch uses:
> > 
> >  struct pv_init_ops pv_init_ops;
> >  
> > [still use xen_patch, but I think that is not needed anymore]
> > 
> >  struct pv_time_ops pv_time_ops;
> >  
> > [we need that as we are using the PV clock source]
> > 
> >  struct pv_cpu_ops pv_cpu_ops;  
> >  
> > [only end up using cpuid. This one is a tricky one. We could
> >  arguably remove it but it does do some filtering - for example
> >  THERM is turned off, or MWAIT if a certain hypercall tells us to
> >  disable that. Since this is now a trapped operation this could be
> >  handled in the hypervisor - but then it would be in charge of
> >  filtering certain CPUID - and this is at bootup - so there is no
> >  user interaction. This needs a bit more thinking]
> > 
> read_msr/write_msr in this one make all msr accesses safe. IIRC there
> are MSRs that Linux uses without checking cpuid bits.
> IA32_PERF_CAPABILITIES for instance is used without checking PDCM bit.

Right, those are needed as well. Completely forgot about them.
> 
> 
> --
>   Gleb.