date:20150526

scripts/gdb: multi arch lx_current

2015-05-26 Thread Thiébaud Weksteen

Hi Jan,

I've been working on lx_current and cpus.py to support architectures other
than just x86. From my understanding, current/get_current are not available
with the default debug option (-g). We could either change that level so that
the inline functions/macros are available, or reimplement part of the logic
to retrieve the current task (sp masking, etc.).

Are there any other options to retrieve the current task? What do you 
recommend?
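
To make the "sp masking" option concrete, here is a minimal sketch. It
assumes an architecture that keeps struct thread_info at the base of a
THREAD_SIZE-aligned kernel stack; the THREAD_SIZE value, the struct layout
and the helper name thread_info_addr() are placeholders for illustration,
not values taken from a real kernel build.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size (two 4 KiB pages); real kernels define this per arch. */
#define THREAD_SIZE 8192UL

/* Stand-in for the kernel structures; only the field of interest is shown. */
struct task_struct;
struct thread_info {
	struct task_struct *task;
};

/* Round a stack pointer down to the base of its THREAD_SIZE-aligned stack. */
static uintptr_t thread_info_addr(uintptr_t sp)
{
	/* The mask trick only works when THREAD_SIZE is a power of two. */
	assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0);
	return sp & ~(uintptr_t)(THREAD_SIZE - 1);
}
```

A gdb script would do the same arithmetic on the saved stack pointer and then
read the task field of the thread_info found at the resulting address.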

Thiebaud
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] zram: check comp algorithm availability earlier

2015-05-26 Thread Minchan Kim

On Wed, May 27, 2015 at 02:53:20PM +0900, Sergey Senozhatsky wrote:
> On (05/27/15 12:51), Minchan Kim wrote:
> [..]
> > > @@ -378,6 +378,12 @@ static ssize_t comp_algorithm_store(struct device 
> > > *dev,
> > >   if (sz > 0 && zram->compressor[sz - 1] == '\n')
> > >   zram->compressor[sz - 1] = 0x00;
> > >  
> > > + if (!zcomp_available_algorithm(zram->compressor)) {
> > > + pr_err("Error: unavailable compression algorithm: %s\n",
> > > + zram->compressor);
> > > + len = -EINVAL;
> > > + }
> > > +
> > 
> > I'm not against this patch because it's better than old.
> > But let's think more about the pr_err part.
> > 
> > If user try to set wrong algo name, he can see EINVAL.
> > Isn't it enough?
> > 
> > I think every sane admin can think he passed wrong argument
> > if he sees -EINVAL.
> > So, I don't think we need to emit pr_err in here.
> > 
> 
> well, it's here simply to make failure investigation easier.
> one surely will know that supplied string was not recognized
> as a compression algorithm name, but what was it.. "$3 instead
> of $2... or, wait, did $i contain something wrong?". zram knew
> what was wrong.
> 
> /* and you asked to put this warn here in your previous email. */

Yes, sorry about that. At that time, you put the warning in find_backend
and I didn't like it. Instead, I wanted to move it here.
But after thinking about it more, I don't feel we need it.

> 
> 
> sure, can remove it.
> 
> 
> > The reason I am paranoid about that is that I really don't want
> > to argue with syslog info which is part of ABI or not in future.
> > If possible, I don't want to depend on pr_xxx.
> > 
> 
> just for the record...  I don't understand this part.

I meant that if we remove the pr_err in the future for some reason,
someone might shout

"No, it's ABI, so if you guys remove it, it will break the user interface's
semantics". Maybe he depends on parsing dmesg.
That is not what I want.
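
For illustration, the behaviour being argued for (reject an unknown name
with -EINVAL and emit nothing to the log) could look roughly like this in
plain C. It is only a sketch: known_algorithms[], algorithm_available() and
store_algorithm() are made-up names, and the list of names is an assumption
rather than the real zcomp backend table.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend list; the real one lives in zcomp. */
static const char *const known_algorithms[] = { "lzo", "lz4", NULL };

static int algorithm_available(const char *name)
{
	size_t i;

	for (i = 0; known_algorithms[i]; i++)
		if (strcmp(known_algorithms[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Validate the user-supplied name before committing it to dst.
 * Returns len on success and -EINVAL otherwise, without logging.
 */
static long store_algorithm(char *dst, size_t dst_size,
			    const char *buf, size_t len)
{
	char tmp[32];
	size_t sz;

	assert(dst && buf);
	if (len == 0 || len >= sizeof(tmp))
		return -EINVAL;

	memcpy(tmp, buf, len);
	tmp[len] = '\0';

	/* Strip the trailing newline that "echo" appends. */
	sz = strlen(tmp);
	if (sz > 0 && tmp[sz - 1] == '\n')
		tmp[sz - 1] = '\0';

	if (!algorithm_available(tmp) || strlen(tmp) >= dst_size)
		return -EINVAL;

	strcpy(dst, tmp);
	return (long)len;
}
```

The caller sees only the error code, so nothing in the log can later be
mistaken for part of the interface.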

> 
> 
> ok. I'll resend later today.
> 
>   -ss

-- 
Kind regards,
Minchan Kim

Re: [PATCH 2/2] fbcon: use the cursor blink interval provided by vt

2015-05-26 Thread Andrey Wagin

2015-02-27 22:15 GMT+03:00 Scot Doyle :
> vt now provides a cursor blink interval via vc_data. Use this
> interval instead of the currently hardcoded 200 msecs. Store it in
> fbcon_ops to avoid locking the console in cursor_timer_handler().

I regularly run criu tests on linux-next. For this, I use a virtual
machine from the DigitalOcean cloud. The current version of
linux-next hangs after a few seconds. I used git bisect to find the
commit where the problem appeared, and it looks like the problem is
in this patch.

When the kernel hangs, it doesn't report anything on the screen and
there is nothing suspicious in the logs after reboot.

I will try to reproduce the problem in my local environment to get more
information.

Here is my config file:
https://github.com/avagin/criu-jenkins-digitalocean/blob/d95d9e30a7da8755c47b290630bac7ee1fe7132d/jenkins-scripts/config

Thanks,
Andrew

Re: [PATCH V3 0/3] cdc-acm: fix incorrect runtime wakeup in acm_tty_write

2015-05-26 Thread Zhang, Yanmin

On 2015/5/27 12:13, Zhang, Yanmin wrote:
> Resend as V1/V2 have email format issue. Sorry for bothering.

Greg,

We have to abandon this patchset. Zhuang Jin Can, a USB expert,
reviewed the patches. He says acm_tty_write already handles this
case carefully.

acm_tty_write puts the urb on a delayed queue when acm->susp_count
is not 0. acm_suspend adds 1 to acm->susp_count and acm_resume
subtracts 1 from it.
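
The counting scheme described above can be sketched in user-space C as
follows. This is a simplified model, not the driver code: struct port,
MAX_DELAYED and the port_*() helpers are invented for illustration, and the
locking the real driver needs around these fields is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DELAYED 8

struct port {
	int susp_count;            /* non-zero while the device is suspended */
	int delayed[MAX_DELAYED];  /* requests parked while suspended */
	size_t nr_delayed;
	int submitted;             /* requests handed to the device so far */
};

static void port_suspend(struct port *p)
{
	p->susp_count++;
}

/* Queue the request if suspended, otherwise submit it right away. */
static int port_write(struct port *p, int req)
{
	assert(p);
	if (p->susp_count) {
		if (p->nr_delayed == MAX_DELAYED)
			return -1;
		p->delayed[p->nr_delayed++] = req;
		return 0;
	}
	(void)req;                 /* a real driver would submit the urb here */
	p->submitted++;
	return 0;
}

/* Flush the parked requests once the last suspend reference is dropped. */
static void port_resume(struct port *p)
{
	if (--p->susp_count)
		return;
	p->submitted += (int)p->nr_delayed;
	p->nr_delayed = 0;
}
```

With this model, a write issued between port_suspend() and port_resume()
never touches the device; it is only submitted from the resume path.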

Sorry for bothering you. Thanks also to Jin Can for the comments.

Yanmin

>
> I use Thunderbird. It has no a button to enable LKML email simply. :)
>
> V3: Change email config to resend.
>   Add a space in comment.
>
>  ---
>
> There is a scenario about cdc-acm utilization.Application opens
> n_gsm tty and cdc-acm tty. cdc-acm tty connects to xhci device.
> The application configures cdc-adm tty to n_gsm tty as ldisc tty.
>
> n_gsm=>cdc-acm=>xhci driver
>
> acm_tty_write can be called from n_gsm driver by ldisc connection,
> and from application when application opens cdc-acm tty directly.
> acm_tty_write wakes up the device by calling usb_autopm_get_interface_async,
> which calls pm_runtime_get. However, pm_runtime_get can't wake up
> the device before returning as it's an async wake up. Then, acm_tty_write
> might access the device when it is off.
>
> The patchset fixes it by:
> 1) add a new function usb_autopm_get_interface_upgrade to deal with
> above 2 requirements;
> 2) wake up device in n_gsm driver if n_gsm drivers calls cdc-acm driver;
>

Re: [PATCH] zram: check comp algorithm availability earlier

2015-05-26 Thread Sergey Senozhatsky

On (05/27/15 12:51), Minchan Kim wrote:
[..]
> > @@ -378,6 +378,12 @@ static ssize_t comp_algorithm_store(struct device *dev,
> > if (sz > 0 && zram->compressor[sz - 1] == '\n')
> > zram->compressor[sz - 1] = 0x00;
> >  
> > +   if (!zcomp_available_algorithm(zram->compressor)) {
> > +   pr_err("Error: unavailable compression algorithm: %s\n",
> > +   zram->compressor);
> > +   len = -EINVAL;
> > +   }
> > +
> 
> I'm not against this patch because it's better than old.
> But let's think more about the pr_err part.
> 
> If user try to set wrong algo name, he can see EINVAL.
> Isn't it enough?
> 
> I think every sane admin can think he passed wrong argument
> if he sees -EINVAL.
> So, I don't think we need to emit pr_err in here.
> 

well, it's here simply to make failure investigation easier.
one surely will know that supplied string was not recognized
as a compression algorithm name, but what was it.. "$3 instead
of $2... or, wait, did $i contain something wrong?". zram knew
what was wrong.

/* and you asked to put this warn here in your previous email. */


sure, can remove it.


> The reason I am paranoid about that is that I really don't want
> to argue with syslog info which is part of ABI or not in future.
> If possible, I don't want to depend on pr_xxx.
> 

just for the record...  I don't understand this part.


ok. I'll resend later today.

-ss

Re: [PATCH 05/36] HMM: introduce heterogeneous memory management v3.

2015-05-26 Thread Aneesh Kumar K.V

j.gli...@gmail.com writes:

> From: Jérôme Glisse 
>
> This patch only introduce core HMM functions for registering a new
> mirror and stopping a mirror as well as HMM device registering and
> unregistering.
>
> The lifecycle of HMM object is handled differently then the one of
> mmu_notifier because unlike mmu_notifier there can be concurrent
> call from both mm code to HMM code and/or from device driver code
> to HMM code. Moreover lifetime of HMM can be uncorrelated from the
> lifetime of the process that is being mirror (GPU might take longer
> time to cleanup).
>

..

> +struct hmm_device_ops {
> + /* release() - mirror must stop using the address space.
> +  *
> +  * @mirror: The mirror that link process address space with the device.
> +  *
> +  * When this is call, device driver must kill all device thread using

s/call/called, ?

> +  * this mirror. Also, this callback is the last thing call by HMM and
> +  * HMM will not access the mirror struct after this call (ie no more
> +  * dereference of it so it is safe for the device driver to free it).
> +  * It is call either from :
> +  *   - mm dying (all process using this mm exiting).
> +  *   - hmm_mirror_unregister() (if no other thread holds a reference)
> +  *   - outcome of some device error reported by any of the device
> +  * callback against that mirror.
> +  */
> + void (*release)(struct hmm_mirror *mirror);
> +};
> +
> +
> +/* struct hmm - per mm_struct HMM states.
> + *
> + * @mm: The mm struct this hmm is associated with.
> + * @mirrors: List of all mirror for this mm (one per device).
> + * @vm_end: Last valid address for this mm (exclusive).
> + * @kref: Reference counter.
> + * @rwsem: Serialize the mirror list modifications.
> + * @mmu_notifier: The mmu_notifier of this mm.
> + * @rcu: For delayed cleanup call from mmu_notifier.release() callback.
> + *
> + * For each process address space (mm_struct) there is one and only one hmm
> + * struct. hmm functions will redispatch to each devices the change made to
> + * the process address space.
> + *
> + * Device driver must not access this structure other than for getting the
> + * mm pointer.
> + */

.

>  #ifndef AT_VECTOR_SIZE_ARCH
>  #define AT_VECTOR_SIZE_ARCH 0
>  #endif
> @@ -451,6 +455,16 @@ struct mm_struct {
>  #ifdef CONFIG_MMU_NOTIFIER
>   struct mmu_notifier_mm *mmu_notifier_mm;
>  #endif
> +#ifdef CONFIG_HMM
> + /*
> +  * hmm always register an mmu_notifier we rely on mmu notifier to keep
> +  * refcount on mm struct as well as forbiding registering hmm on a
> +  * dying mm
> +  *
> +  * This field is set with mmap_sem old in write mode.

s/old/held/ ?


> +  */
> + struct hmm *hmm;
> +#endif
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
>   pgtable_t pmd_huge_pte; /* protected by page_table_lock */
>  #endif
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 0e0ae9a..4083be7 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -597,6 +598,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, 
> struct task_struct *p)
>   mm_init_aio(mm);
>   mm_init_owner(mm, p);
>   mmu_notifier_mm_init(mm);
> + hmm_mm_init(mm);
>   clear_tlb_flush_pending(mm);
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
>   mm->pmd_huge_pte = NULL;
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 52ffb86..189e48f 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -653,3 +653,18 @@ config DEFERRED_STRUCT_PAGE_INIT
> when kswapd starts. This has a potential performance impact on
> processes running early in the lifetime of the systemm until kswapd
> finishes the initialisation.
> +
> +if STAGING
> +config HMM
> + bool "Enable heterogeneous memory management (HMM)"
> + depends on MMU
> + select MMU_NOTIFIER
> + select GENERIC_PAGE_TABLE

What is GENERIC_PAGE_TABLE ?

> + default n
> + help
> +   Heterogeneous memory management provide infrastructure for a device
> +   to mirror a process address space into an hardware mmu or into any
> +   things supporting pagefault like event.
> +
> +   If unsure, say N to disable hmm.

-aneesh


linux-next: build failure after merge of the char-misc tree

2015-05-26 Thread Stephen Rothwell

Hi all,

After merging the char-misc tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

drivers/nfc/microread/mei.c:70:2: error: implicit declaration of function 
'__UUID_LE' [-Werror=implicit-function-declaration]
  { MICROREAD_DRIVER_NAME, MEI_NFC_UUID},
  ^
drivers/nfc/microread/mei.c:70:2: warning: missing braces around initializer 
[-Wmissing-braces]
drivers/nfc/microread/mei.c:70:2: warning: (near initialization for 
'microread_mei_tbl[0].uuid') [-Wmissing-braces]
drivers/nfc/microread/mei.c:70:2: error: initializer element is not constant
drivers/nfc/microread/mei.c:70:2: error: (near initialization for 
'microread_mei_tbl[0].uuid[0]')
cc1: some warnings being treated as errors

Caused by commit c93b76b34b4d ("mei: bus: report also uuid in module
alias").

I have used the char-misc tree from next-20150526 for today.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [patch 1/7] timers: Sanitize catchup_timer_jiffies() usage

2015-05-26 Thread Viresh Kumar

On 26-05-15, 22:50, Thomas Gleixner wrote:
> 3) __run_timers()
> 
>We only check on entry, which is silly, because base->timer_jiffies
>can be behind - especially on NOHZ kernels - and if there is a
>single deferrable timer somewhere between base->timer_jiffies and
>jiffies we expire it and then loop until base->timer_jiffies ==
>jiffies.

This may be incorrect. Once we expire that single deferrable timer, we
call detach_expired_timer(), which calls catchup_timer_jiffies() at
its end. So the following loop should end right away, shouldn't it?

while (time_after_eq(jiffies, base->timer_jiffies))
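
To illustrate the point, here is a small user-space model of that loop. It
is only a sketch under assumptions: time_after_eq() is re-implemented here
with the usual signed-difference comparison, run_loop() is an invented
helper, and the "catch up" step simply sets the counter to the current time
once, the way I expect the catch-up to behave when no timers remain.

```c
#include <assert.h>
#include <limits.h>

/* Wraparound-safe "a is at or after b", via a signed difference. */
static int time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Count how often the loop body runs. With catch_up set, the first pass
 * models expiring the only pending timer and jumping timer_jiffies to now.
 */
static unsigned long run_loop(unsigned long now, unsigned long timer_jiffies,
			      int catch_up)
{
	unsigned long iterations = 0;

	while (time_after_eq(now, timer_jiffies)) {
		++timer_jiffies;
		if (catch_up && iterations == 0)
			timer_jiffies = now;
		iterations++;
	}
	return iterations;
}

/* The comparison keeps working when the counter wraps around. */
static void check_wraparound(void)
{
	assert(time_after_eq(5, ULONG_MAX - 5));
	assert(!time_after_eq(ULONG_MAX - 5, 5));
}
```

In this model the loop runs once per elapsed tick without the catch-up, but
only twice with it, regardless of how far behind the counter was.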

> +++ tip/kernel/time/timer.c
> -static bool catchup_timer_jiffies(struct tvec_base *base)
> +static inline bool catchup_timer_jiffies(struct tvec_base *base)

There is only one user left for this routine now, i.e. __run_timers().
Should we drop this routine?

-- 
viresh

[PATCH 1/1] NILFS2: support NFSv2 export

2015-05-26 Thread Ryusuke Konishi

From: NeilBrown 

The "fh_len" passed to ->fh_to_* is not guaranteed to be the same as
that returned by encode_fh - it may be larger.

With NFSv2, the filehandle is fixed length, so it may appear longer
than expected and be zero-padded.

So we must test that fh_len is at least some value, not exactly equal
to it.

Signed-off-by: NeilBrown 
Signed-off-by: Ryusuke Konishi 
---
 fs/nilfs2/namei.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c
index 2218083..37dd6b0 100644
--- a/fs/nilfs2/namei.c
+++ b/fs/nilfs2/namei.c
@@ -496,8 +496,7 @@ static struct dentry *nilfs_fh_to_dentry(struct super_block 
*sb, struct fid *fh,
 {
struct nilfs_fid *fid = (struct nilfs_fid *)fh;
 
-   if ((fh_len != NILFS_FID_SIZE_NON_CONNECTABLE &&
-fh_len != NILFS_FID_SIZE_CONNECTABLE) ||
+   if (fh_len < NILFS_FID_SIZE_NON_CONNECTABLE ||
(fh_type != FILEID_NILFS_WITH_PARENT &&
 fh_type != FILEID_NILFS_WITHOUT_PARENT))
return NULL;
@@ -510,7 +509,7 @@ static struct dentry *nilfs_fh_to_parent(struct super_block 
*sb, struct fid *fh,
 {
struct nilfs_fid *fid = (struct nilfs_fid *)fh;
 
-   if (fh_len != NILFS_FID_SIZE_CONNECTABLE ||
+   if (fh_len < NILFS_FID_SIZE_CONNECTABLE ||
fh_type != FILEID_NILFS_WITH_PARENT)
return NULL;
 
-- 
1.8.3.1
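
The difference between the old and new length checks in the patch above can
be shown with a small stand-alone sketch. The two constants and the helper
names are placeholders chosen for illustration; they are not the real
NILFS_FID_SIZE_* values.

```c
#include <assert.h>

/* Hypothetical handle lengths, counted in 32-bit words. */
enum {
	FID_WORDS_NON_CONNECTABLE = 4,
	FID_WORDS_CONNECTABLE = 6,
};

/* Old behaviour: only the two exact lengths are accepted. */
static int fh_len_ok_exact(int fh_len)
{
	return fh_len == FID_WORDS_NON_CONNECTABLE ||
	       fh_len == FID_WORDS_CONNECTABLE;
}

/* New behaviour: any length that is at least the required minimum. */
static int fh_len_ok_min(int fh_len, int need_parent)
{
	assert(fh_len >= 0);
	return fh_len >= (need_parent ? FID_WORDS_CONNECTABLE
				      : FID_WORDS_NON_CONNECTABLE);
}
```

A zero-padded, fixed-length handle of 8 words fails the exact check but
passes the minimum check, which is the NFSv2 case the commit message
describes.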


RE: [PATCH v2 2/2] ALSA: set no sound proc fs for reduced memory footprint

2015-05-26 Thread Jie, Yang

> -Original Message-
> From: Sudip Mukherjee [mailto:sudipm.mukher...@gmail.com]
> Sent: Wednesday, May 27, 2015 12:50 PM
> To: Jie, Yang
> Cc: ti...@suse.de; broo...@kernel.org; alsa-de...@alsa-project.org; linux-
> ker...@vger.kernel.org; Girdwood, Liam R; Zhang, Vivian
> Subject: Re: [PATCH v2 2/2] ALSA: set no sound proc fs for reduced memory
> footprint
> 
> On Tue, May 26, 2015 at 09:13:57PM +0800, Jie Yang wrote:
> > Disable sound proc fs, when CONFIG_SND_NO_PROC_FS is selected, which
> > can save about 9KB memory size for reducing memory footprint purpose.
> > ---
> missing Signed-off-by.
 
Thanks for pointing that out.

Signed-off-by: Jie Yang 

Takashi, do I need to resend them?

Thanks,
~Keyon

> 
> regards
> sudip

[PATCH 0/1] NILFS2: support NFSv2 export

2015-05-26 Thread Ryusuke Konishi

Hi Andrew,

please queue the following patch for the next merge window.  It fixes
an NFSv2 related issue reported in:

[1] http://marc.info/?l=linux-fsdevel&m=143104630128997
"[PATCH 0/3] make BTRFS, UDF, NILFS2 work with NFSv2."

Thanks,
Ryusuke Konishi
--
NeilBrown (1):
  NILFS2: support NFSv2 export

 fs/nilfs2/namei.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)



Re: [Patch v3] apple-gmux: lock iGP IO to protect from vgaarb changes

2015-05-26 Thread Michael Marineau

On Tue, May 26, 2015 at 9:47 PM, Darren Hart  wrote:
> On Tue, May 26, 2015 at 12:10:48PM -0700, Michael Marineau wrote:
>> FYI, this actually broke backlight controls on my MBP11,3 because the
>> assumption the patch makes that gmux is always loaded before graphics
>> drivers didn't hold true. At least for me dracut included the nouveau
>> module in the initrd but not gmux, ensuring the ordering was wrong. No
>> errors were reporting, and gmux still offered the backlight device, it
>> just became inoperable. I worked around this for my kernel by building
>> gmux into vmlinuz instead of as a module but that isn't going to in
>> more general configs because there is an apple backlight driver which
>> cannot be built at all in that configuration.
>>
>
> Thank you for reporting this Michael,
>
> That is tough as nouveau doesn't have an explicit dependency on gmux, so we
> could do something like a passive request_module(), but if it isn't in the
> initrd image, it would still fail as you describe.
>
>> Is there a way to make the ordering between nouveau and gmux more
>> explicit/reliable? Can gmux complain loudly if the ordering is ever
>> wrong?
>
> It should print an error if the probe fails due to the IO already being in use
> or if it can't be allocated. The disabled IO case is only info level though,
> perhaps that should be higher priority. Printing something when failing to 
> probe
> seems like a reasonable thing to do.
>
> Michael, which message do you get if you boot with "debug" or "loglevel=6" 
> when
> apple-gmux is not built-in?

No error, gmux claims it worked:
[   13.693379] apple_gmux: Found gmux version 4.0.8 [indexed]
[   13.693400] vgaarb: device changed decodes:
PCI::01:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[   13.693404] apple_gmux: locked IO for PCI::01:00.0

Full dmesg: https://gist.github.com/marineam/0e5a23548e8b3b2e1d50

Re: [v4 4/4] arm: kvm: add stub implementation for kvm_cpu_reset()

2015-05-26 Thread AKASHI Takahiro


On 05/26/2015 06:36 PM, Marc Zyngier wrote:
> On 08/05/15 02:18, AKASHI Takahiro wrote:
>> Just to avoid compiling errors on arm.
>>
>> Signed-off-by: AKASHI Takahiro 
>> ---
>>   arch/arm/include/asm/kvm_asm.h  |1 +
>>   arch/arm/include/asm/kvm_host.h |   12 
>>   arch/arm/include/asm/kvm_mmu.h  |5 +
>>   arch/arm/kvm/init.S |6 ++
>>   4 files changed, 24 insertions(+)

(snip)

> So before this patch, KVM is broken on ARM. This is not acceptable.
> Please merge it with patch #1.

OK.

-Takahiro AKASHI

> Thanks,
>
> M.



Re: [v4 2/4] arm64: kvm: add kvm cpu hotplug

2015-05-26 Thread AKASHI Takahiro


On 05/26/2015 06:35 PM, Marc Zyngier wrote:
> On 08/05/15 02:18, AKASHI Takahiro wrote:
>> This patch allows cpu cores to be up and down by adding
>> kvm_arch_hardware_enable/isable(). This way, especially in kexec case,
>> cores are reset to initial states and kexec can gracefully shutdown the
>> system and reboot a new kernel from EL2.
>>
>> Signed-off-by: AKASHI Takahiro 
>> ---
>>   arch/arm/include/asm/kvm_host.h   |1 -
>>   arch/arm/kvm/arm.c|   29 +++--
>>   arch/arm64/include/asm/kvm_host.h |1 -
>>   3 files changed, 19 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 41008cd..ca97764 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -237,7 +237,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>>
>>   struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>>
>> -static inline void kvm_arch_hardware_disable(void) {}
>>   static inline void kvm_arch_hardware_unsetup(void) {}
>>   static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>>   static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 251ab9e..e989925 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -87,11 +87,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
>> return &kvm_arm_running_vcpu;
>>   }
>>
>> -int kvm_arch_hardware_enable(void)
>> -{
>> -   return 0;
>> -}
>> -
>>   int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>>   {
>> return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
>> @@ -885,6 +880,9 @@ static void cpu_init_hyp_mode(void *dummy)
>
> Since you removed the IPI, why do you keep the dummy argument?

Yeah, will remove it.

>> unsigned long stack_page;
>> unsigned long vector_ptr;
>>
>> +   if (__hyp_get_vectors() != hyp_default_vectors)
>> +   return;
>> +
>> /* Switch from the HYP stub to our own HYP init vector */
>> __hyp_set_vectors(kvm_get_idmap_vector());
>>
>> @@ -921,6 +919,10 @@ static int hyp_init_cpu_notify(struct notifier_block *self,
>> if (__hyp_get_vectors() == hyp_default_vectors)
>> cpu_init_hyp_mode(NULL);
>> break;
>> +   case CPU_DYING:
>> +   case CPU_DYING_FROZEN:
>> +   kvm_cpu_reset(NULL);
>> +   break;
>> }
>>
>> return NOTIFY_OK;
>> @@ -936,6 +938,7 @@ static int hyp_init_cpu_pm_notifier(struct notifier_block 
>> *self,
>> void *v)
>>   {
>> if (cmd == CPU_PM_EXIT &&
>> +   kvm_arm_get_running_vcpu() &&
>> __hyp_get_vectors() == hyp_default_vectors) {
>> cpu_init_hyp_mode(NULL);
>> return NOTIFY_OK;
>> @@ -1039,11 +1042,6 @@ static int init_hyp_mode(void)
>> }
>>
>> /*
>> -* Execute the init code on each CPU.
>> -*/
>> -   on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>> -
>> -   /*
>>  * Init HYP view of VGIC
>>  */
>> err = kvm_vgic_hyp_init();
>> @@ -1144,6 +1142,17 @@ out_err:
>> return err;
>>   }
>>
>> +int kvm_arch_hardware_enable(void)
>> +{
>> +   cpu_init_hyp_mode(NULL);
>> +   return 0;
>> +}
>> +
>> +void kvm_arch_hardware_disable(void)
>> +{
>> +   kvm_cpu_reset(NULL);
>> +}
>> +
>
> Bahhh... Just rename cpu_init_hyp_mode to kvm_arch_hardware_enable, and
> kvm_cpu_reset to kvm_arch_hardware_disable. I don't see the point of
> keeping this indirection.

Historical reason ...
Anyway, if we don't have to add anything else here, yes, I will rename them.

-Takahiro AKASHI

>>   /* NOP: Compiling as a module not supported */
>>   void kvm_arch_exit(void)
>>   {
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index 6a8da9c..831e6a4 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -262,7 +262,6 @@ static inline void vgic_arch_setup(const struct vgic_params 
>> *vgic)
>> }
>>   }
>>
>> -static inline void kvm_arch_hardware_disable(void) {}
>>   static inline void kvm_arch_hardware_unsetup(void) {}
>>   static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>>   static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}






Re: [PATCH-v2 2/4] target: Drop lun_sep_lock for se_lun->lun_se_dev RCU usage

2015-05-26 Thread Nicholas A. Bellinger

On Tue, 2015-05-26 at 16:30 +0200, Bart Van Assche wrote:
> On 05/26/15 08:57, Nicholas A. Bellinger wrote:
> > @@ -625,6 +626,7 @@ int core_dev_add_initiator_node_lun_acl(
> > u32 lun_access)
> >   {
> > struct se_node_acl *nacl = lacl->se_lun_nacl;
> > +   struct se_device *dev = lockless_dereference(lun->lun_se_dev);
> >   
> > if (!nacl)
> > return -EINVAL;
> 
> An attempt to run this code on a system with RCU debugging enabled
> resulted in the following complaint:
> 
> ===
> [ INFO: suspicious RCU usage. ]
> 4.1.0-rc1-lio-dbg+ #1 Not tainted
> ---
> drivers/target/target_core_device.c:617 suspicious rcu_dereference_check() 
> usage!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 1
> 2 locks held by ln/1497:
>  #0:  (sb_writers#11){.+.+.+}, at: [] 
> mnt_want_write+0x24/0x50
>  #1:  (>s_type->i_mutex_key#14/1){+.+.+.}, at: [] 
> filename_create+0xad/0x1a0
> 
> stack backtrace:
> CPU: 0 PID: 1497 Comm: ln Not tainted 4.1.0-rc1-lio-dbg+ #1
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>  0001 88005955bd68 814fa346 0011
>  880058bf1270 88005955bd98 810ab235 880050db9a68
>  880058ae2e68 0002 880058ae4120 88005955be08
> Call Trace:
>  [] dump_stack+0x4f/0x7b
>  [] lockdep_rcu_suspicious+0xd5/0x110
>  [] core_dev_add_initiator_node_lun_acl+0xec/0x190 
> [target_core_mod]
>  [] ? get_parent_ip+0x11/0x50
>  [] target_fabric_mappedlun_link+0x129/0x240 
> [target_core_mod]
>  [] ? target_fabric_mappedlun_link+0x9c/0x240 
> [target_core_mod]
>  [] configfs_symlink+0x13d/0x360 [configfs]
>  [] vfs_symlink+0x58/0xb0
>  [] SyS_symlink+0x65/0xc0
>  [] system_call_fastpath+0x16/0x7a
> 

In this particular case, the se_device behind se_lun->lun_se_dev
__rcu protected pointer can't be released without first releasing the
pre-existing se_lun->lun_group reference to se_device->dev_group.

And since se_lun->lun_group is the source of a configfs symlink to
se_lun_acl->se_lun_group here, the se_lun associated RCU pointer and
underlying se_device can't be released out from under the above
target_fabric_mappedlun_link() code accessing a __rcu protected pointer.

Paul, is lockless_dereference the correct notation for this type of
use-case..?

Thank you,

--nab


[RFC PATCH v4 19/29] bpf tools: Load eBPF programs in object files into kernel

2015-05-26 Thread Wang Nan

This patch uses the previously introduced bpf_load_program() to load
programs from the ELF file into the kernel. The result is stored in the
'fd' field of 'struct bpf_program'.

During loading, it allocates a log buffer and frees it before returning.
Note that the buffer is not passed to bpf_load_program() if the first
loading attempt is successful. A statically allocated log buffer is not
used, to avoid potential multi-threading problems.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 75 ++
 1 file changed, 75 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index f2071ae..02fc880 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -99,6 +99,8 @@ struct bpf_program {
int map_idx;
} *reloc_desc;
int nr_reloc;
+
+   int fd;
 };
 
 struct bpf_object {
@@ -130,11 +132,20 @@ struct bpf_object {
 };
 #define obj_elf_valid(o)   ((o)->efile.fd >= 0)
 
+static void bpf_program__unload(struct bpf_program *prog)
+{
+   if (!prog)
+   return;
+
+   zclose(prog->fd);
+}
+
 static void bpf_program__clear(struct bpf_program *prog)
 {
if (!prog)
return;
 
+   bpf_program__unload(prog);
zfree(&prog->section_name);
zfree(&prog->insns);
zfree(&prog->reloc_desc);
@@ -194,6 +205,7 @@ bpf_program__new(struct bpf_object *obj, void *data, size_t 
size,
memcpy(prog->insns, data,
   prog->insns_cnt * sizeof(struct bpf_insn));
prog->idx = idx;
+   prog->fd = -1;
 
return prog;
 out:
@@ -672,6 +684,64 @@ static int bpf_object__collect_reloc(struct bpf_object 
*obj)
return 0;
 }
 
+static int
+bpf_program__load(struct bpf_program *prog,
+ char *license, u32 kern_version)
+{
+   int fd, err;
+   char *log_buf;
+
+   log_buf = malloc(BPF_LOG_BUF_SIZE);
+   if (!log_buf)
+   pr_warning("Alloc log buffer for bpf loader error, continue 
without log\n");
+
+   fd = bpf_load_program(BPF_PROG_TYPE_KPROBE, prog->insns,
+ prog->insns_cnt, license,
+ kern_version, log_buf,
+ BPF_LOG_BUF_SIZE);
+
+   if (fd >= 0) {
+   prog->fd = fd;
+   pr_debug("load bpf program '%s': fd = %d\n",
+prog->section_name, prog->fd);
+   err = 0;
+   goto out;
+   }
+
+   err = -EINVAL;
+   pr_warning("load bpf program '%s' failed: %s\n",
+  prog->section_name, strerror(errno));
+
+   if (log_buf) {
+   pr_warning("bpf: load: failed to load program '%s':\n",
+  prog->section_name);
+   pr_warning("-- BEGIN DUMP LOG ---\n");
+   pr_warning("%s\n", log_buf);
+   pr_warning("-- END LOG --\n");
+   }
+
+out:
+   free(log_buf);
+   return err;
+}
+
+static int
+bpf_object__load_progs(struct bpf_object *obj)
+{
+   size_t i;
+   int err;
+
+   for (i = 0; i < obj->nr_programs; i++) {
+   err = bpf_program__load(&obj->programs[i],
+   obj->license,
+   obj->kern_version);
+   if (err)
+   return err;
+   }
+   return 0;
+}
+
+
 static int bpf_object__validate(struct bpf_object *obj)
 {
if (obj->kern_version == 0) {
@@ -731,6 +801,9 @@ int bpf_object__unload(struct bpf_object *obj)
zclose(obj->maps_fds[i]);
zfree(&obj->maps_fds);
 
+   for (i = 0; i < obj->nr_programs; i++)
+   bpf_program__unload(&obj->programs[i]);
+
return 0;
 }
 
@@ -743,6 +816,8 @@ int bpf_object__load(struct bpf_object *obj)
goto out;
if (bpf_object__relocate(obj))
goto out;
+   if (bpf_object__load_progs(obj))
+   goto out;
 
return 0;
 out:
-- 
1.8.3.4
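
The per-call log buffer handling described in the commit message can be
sketched on its own as follows. This is a user-space model under
assumptions: fake_load() is a made-up stand-in for the real loader,
LOG_BUF_SIZE is an arbitrary size, and load_with_log() is a hypothetical
name. Unlike the patch, the sketch passes a size of 0 when no buffer could
be allocated.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define LOG_BUF_SIZE 4096

/*
 * Hypothetical loader: succeeds with a pretend fd when there is at least
 * one instruction, otherwise fails and fills the log if one was provided.
 */
static int fake_load(size_t insn_cnt, char *log_buf, size_t log_size)
{
	if (insn_cnt > 0)
		return 3;
	if (log_buf && log_size)
		snprintf(log_buf, log_size, "empty program");
	return -1;
}

static int load_with_log(size_t insn_cnt, int *fd_out)
{
	char *log_buf;
	int fd, err;

	assert(fd_out);

	/* Allocated per call, so concurrent callers never share a buffer. */
	log_buf = malloc(LOG_BUF_SIZE);
	if (log_buf)
		log_buf[0] = '\0';
	else
		fprintf(stderr, "no log buffer, continuing without log\n");

	fd = fake_load(insn_cnt, log_buf, log_buf ? LOG_BUF_SIZE : 0);
	if (fd >= 0) {
		*fd_out = fd;
		err = 0;
	} else {
		err = -1;
		if (log_buf)
			fprintf(stderr, "load failed: %s\n", log_buf);
	}

	free(log_buf);	/* free(NULL) is a no-op */
	return err;
}
```

The log is only printed on failure, and a failed allocation degrades to
loading without diagnostics instead of aborting the load.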


[RFC PATCH v4 17/29] bpf tools: Relocate eBPF programs

2015-05-26 Thread Wang Nan

If an eBPF program accesses a map, LLVM generates a relocated load
instruction. To enable the usage of that map, relocation must be done
by replacing the original instructions with map loading instructions.

Based on the relocation descriptions collected during the 'opening' phase,
this patch replaces those instructions with map loading instructions that
carry the correct map fd.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fe4d282..f2071ae 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -587,6 +587,53 @@ bpf_object__create_maps(struct bpf_object *obj)
return 0;
 }
 
+static int
+bpf_program__relocate(struct bpf_program *prog, int *maps_fds)
+{
+   int i;
+
+   if (!prog || !prog->reloc_desc)
+   return 0;
+
+   for (i = 0; i < prog->nr_reloc; i++) {
+   int insn_idx, map_idx;
+   struct bpf_insn *insns = prog->insns;
+
+   insn_idx = prog->reloc_desc[i].insn_idx;
+   map_idx = prog->reloc_desc[i].map_idx;
+
+   if (insn_idx >= (int)prog->insns_cnt) {
+   pr_warning("relocation out of range: '%s'\n",
+  prog->section_name);
+   return -ERANGE;
+   }
+   insns[insn_idx].src_reg = BPF_PSEUDO_MAP_FD;
+   insns[insn_idx].imm = maps_fds[map_idx];
+   }
+
+   return 0;
+}
+
+
+static int
+bpf_object__relocate(struct bpf_object *obj)
+{
+   struct bpf_program *prog;
+   size_t i;
+   int err;
+
+   for (i = 0; i < obj->nr_programs; i++) {
+   prog = &obj->programs[i];
+
+   if ((err = bpf_program__relocate(prog, obj->maps_fds))) {
+   pr_warning("failed to relocate '%s'\n",
+  prog->section_name);
+   return err;
+   }
+   }
+   return 0;
+}
+
 static int bpf_object__collect_reloc(struct bpf_object *obj)
 {
int i, err;
@@ -694,6 +741,8 @@ int bpf_object__load(struct bpf_object *obj)
 
if (bpf_object__create_maps(obj))
goto out;
+   if (bpf_object__relocate(obj))
+   goto out;
 
return 0;
 out:
-- 
1.8.3.4
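
The relocation step in this patch can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the libbpf code: `struct demo_insn`, `struct demo_reloc`, `DEMO_PSEUDO_MAP_FD` and `demo_relocate()` are hypothetical stand-ins for `struct bpf_insn`, the `reloc_desc` entries, `BPF_PSEUDO_MAP_FD` and `bpf_program__relocate()`. Unlike the patch, the sketch also rejects negative indices and out-of-range map indices.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for BPF_PSEUDO_MAP_FD. */
#define DEMO_PSEUDO_MAP_FD 1

/* Simplified instruction layout, for illustration only. */
struct demo_insn {
    uint8_t code;
    uint8_t dst_reg;
    uint8_t src_reg;
    int16_t off;
    int32_t imm;
};

/* One relocation record: which instruction refers to which map. */
struct demo_reloc {
    int insn_idx;
    int map_idx;
};

/*
 * Patch every recorded instruction so that it carries the fd of the
 * map it refers to. Returns 0 on success or -ERANGE if a record
 * points outside the instruction or map arrays.
 */
static int demo_relocate(struct demo_insn *insns, size_t insns_cnt,
                         const struct demo_reloc *relocs, size_t nr_reloc,
                         const int *maps_fds, size_t nr_maps)
{
    size_t i;

    for (i = 0; i < nr_reloc; i++) {
        int insn_idx = relocs[i].insn_idx;
        int map_idx = relocs[i].map_idx;

        if (insn_idx < 0 || (size_t)insn_idx >= insns_cnt)
            return -ERANGE;
        if (map_idx < 0 || (size_t)map_idx >= nr_maps)
            return -ERANGE;

        /* Mark the immediate as a map fd and store the fd itself. */
        insns[insn_idx].src_reg = DEMO_PSEUDO_MAP_FD;
        insns[insn_idx].imm = maps_fds[map_idx];
    }
    return 0;
}
```

Keeping the records as plain index pairs is what allows the ELF data to be dropped after the 'opening' phase: the 'loading' phase only needs the indices and the freshly created map fds.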


[RFC PATCH v4 24/29] perf record: Enable passing bpf object file to --event

2015-05-26 Thread Wang Nan

By introducing new rules in tools/perf/util/parse-events.[ly], this
patch enables 'perf record --event bpf_file.o' to select events by
an eBPF object file. It calls parse_events_load_bpf() to load that
file, which uses bpf__prepare_load() and finally calls
bpf_object__open() for the object files.

Instead of adding evsels to the evlist during parsing, events
selected by eBPF object files are appended separately. The reasons
are:

 1. During parsing, the probing points have not been initialized.

 2. Currently we are unable to call add_perf_probe_events() twice,
therefore we have to wait until all such events are collected,
then probe all points by one call.

The real probing and selecting reside in the following patches.

Signed-off-by: Wang Nan 
---
 tools/perf/util/Build  |  1 +
 tools/perf/util/bpf-loader.c   | 60 ++
 tools/perf/util/bpf-loader.h   | 11 
 tools/perf/util/debug.c|  5 
 tools/perf/util/debug.h|  1 +
 tools/perf/util/parse-events.c | 16 +++
 tools/perf/util/parse-events.h |  2 ++
 tools/perf/util/parse-events.l |  5 +++-
 tools/perf/util/parse-events.y | 18 -
 9 files changed, 117 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/util/bpf-loader.c
 create mode 100644 tools/perf/util/bpf-loader.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 797490a..609f6d6 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -75,6 +75,7 @@ libperf-$(CONFIG_X86) += tsc.o
 libperf-y += cloexec.o
 libperf-y += thread-stack.o
 
+libperf-$(CONFIG_LIBELF) += bpf-loader.o
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-event.o
 
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
new file mode 100644
index 000..eef15f3
--- /dev/null
+++ b/tools/perf/util/bpf-loader.c
@@ -0,0 +1,60 @@
+/*
+ * bpf-loader.c
+ *
+ * Copyright (C) 2015 Wang Nan 
+ * Copyright (C) 2015 Huawei Inc.
+ */
+
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "bpf-loader.h"
+
+#define DEFINE_PRINT_FN(name, level) \
+static int libbpf_##name(const char *fmt, ...) \
+{  \
+   va_list args;   \
+   int ret;\
+   \
+   va_start(args, fmt);\
+   ret = veprintf(level, verbose, pr_fmt(fmt), args);\
+   va_end(args);   \
+   return ret; \
+}
+
+DEFINE_PRINT_FN(warning, 0)
+DEFINE_PRINT_FN(info, 0)
+DEFINE_PRINT_FN(debug, 1)
+
+static bool libbpf_initialized = false;
+
+int bpf__prepare_load(const char *filename)
+{
+   struct bpf_object *obj;
+
+   if (!libbpf_initialized)
+   libbpf_set_print(libbpf_warning,
+libbpf_info,
+libbpf_debug);
+   
+   obj = bpf_object__open(filename);
+   if (!obj) {
+   pr_err("bpf: failed to load %s\n", filename);
+   return -EINVAL;
+   }
+
+   /*
+* Throw object pointer away: it will be retrieved using
+* the bpf_objects iterator.
+*/
+
+   return 0;
+}
+
+void bpf__clear(void)
+{
+   struct bpf_object *obj, *tmp;
+
+   bpf_object__for_each(obj, tmp)
+   bpf_object__close(obj);
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
new file mode 100644
index 000..d5c22da
--- /dev/null
+++ b/tools/perf/util/bpf-loader.h
@@ -0,0 +1,11 @@
+/*
+ * Copyright (C) 2015, Wang Nan 
+ * Copyright (C) 2015, Huawei Inc.
+ */
+#ifndef __BPF_LOADER_H
+#define __BPF_LOADER_H
+
+int bpf__prepare_load(const char *filename);
+
+void bpf__clear(void);
+#endif
diff --git a/tools/perf/util/debug.c b/tools/perf/util/debug.c
index 2da5581..86d9c73 100644
--- a/tools/perf/util/debug.c
+++ b/tools/perf/util/debug.c
@@ -36,6 +36,11 @@ static int _eprintf(int level, int var, const char *fmt, va_list args)
return ret;
 }
 
+int veprintf(int level, int var, const char *fmt, va_list args)
+{
+   return _eprintf(level, var, fmt, args);
+}
+
 int eprintf(int level, int var, const char *fmt, ...)
 {
va_list args;
diff --git a/tools/perf/util/debug.h b/tools/perf/util/debug.h
index caac2fd..8b9a088 100644
--- a/tools/perf/util/debug.h
+++ b/tools/perf/util/debug.h
@@ -50,6 +50,7 @@ void pr_stat(const char *fmt, ...);
 
 int eprintf(int level, int var, const char *fmt, ...) __attribute__((format(printf, 3, 4)));
 int eprintf_time(int level, int var, u64 t, const char *fmt, ...) __attribute__((format(printf, 4, 5)));
+int veprintf(int level, int var, const char *fmt, va_list args);
 
 int perf_debug_option(const char *str);
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index be06553..6030ea3 100644
--- a/tools/perf/util/parse-events.c
+++
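
The DEFINE_PRINT_FN macro in bpf-loader.c relies on a common C pattern: a variadic wrapper packs its arguments into a `va_list` and hands them to a `v`-style function, which is why the patch adds veprintf() next to eprintf(). A minimal sketch of that pattern follows. All names here (`demo_vprint`, `demo_buf`, `DEMO_DEFINE_PRINT_FN`) are hypothetical, and the sketch formats into a buffer with vsnprintf() instead of calling perf's veprintf().

```c
#include <stdarg.h>
#include <stdio.h>

/* Buffer that receives the most recent message (illustration only). */
static char demo_buf[128];

/* Hypothetical v-variant: takes a va_list, like veprintf() in the patch. */
static int demo_vprint(char *buf, size_t len, const char *fmt, va_list args)
{
    return vsnprintf(buf, len, fmt, args);
}

/*
 * Generate one variadic print function per level. Each function writes
 * a fixed prefix, then forwards its arguments through a va_list.
 * The return value is the length of the formatted message body.
 */
#define DEMO_DEFINE_PRINT_FN(name, prefix)                          \
static int demo_##name(const char *fmt, ...)                        \
{                                                                   \
    va_list args;                                                   \
    int off, ret;                                                   \
                                                                    \
    off = snprintf(demo_buf, sizeof(demo_buf), "%s", prefix);       \
    va_start(args, fmt);                                            \
    ret = demo_vprint(demo_buf + off, sizeof(demo_buf) - off,       \
                      fmt, args);                                   \
    va_end(args);                                                   \
    return ret;                                                     \
}

DEMO_DEFINE_PRINT_FN(warning, "warning: ")
DEMO_DEFINE_PRINT_FN(debug, "debug: ")
```

A variadic function cannot pass its `...` arguments straight to another variadic function, so a `va_list` entry point is needed whenever such wrappers are generated.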

[RFC PATCH v4 20/29] bpf tools: Introduce accessors for struct bpf_program

2015-05-26 Thread Wang Nan

This patch introduces accessors that let users of libbpf retrieve the
section name and fd of an opened/loaded eBPF program. 'struct bpf_prog_handler'
is used for that purpose. Accessors for a program's section name and file
descriptor are provided. Setting and getting private data are also implemented.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 82 ++
 tools/lib/bpf/libbpf.h | 25 +++
 2 files changed, 107 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 02fc880..a577f3e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -101,6 +101,10 @@ struct bpf_program {
int nr_reloc;
 
int fd;
+
+   struct bpf_object *obj;
+   void *priv;
+   bpf_program_clear_priv_t clear_priv;
 };
 
 struct bpf_object {
@@ -145,6 +149,12 @@ static void bpf_program__clear(struct bpf_program *prog)
if (!prog)
return;
 
+   if (prog->clear_priv)
+   prog->clear_priv(prog, prog->priv);
+
+   prog->priv = NULL;
+   prog->clear_priv = NULL;
+
bpf_program__unload(prog);
zfree(&prog->section_name);
zfree(&prog->insns);
@@ -206,6 +216,7 @@ bpf_program__new(struct bpf_object *obj, void *data, size_t size,
   prog->insns_cnt * sizeof(struct bpf_insn));
prog->idx = idx;
prog->fd = -1;
+   prog->obj = obj;
 
return prog;
 out:
@@ -844,3 +855,74 @@ void bpf_object__close(struct bpf_object *obj)
 
free(obj);
 }
+
+struct bpf_program *
+bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
+{
+   size_t idx;
+
+   if (!obj->programs)
+   return NULL;
+   /* First handler */
+   if (prev == NULL)
+   return (&obj->programs[0]);
+
+   if (prev->obj != obj) {
+   pr_warning("error: program handler doesn't match object\n");
+   return NULL;
+   }
+
+   idx = (prev - obj->programs) + 1;
+   if (idx >= obj->nr_programs)
+   return NULL;
+   return &obj->programs[idx];
+}
+
+int bpf_program__set_private(struct bpf_program *prog,
+void *priv,
+bpf_program_clear_priv_t clear_priv)
+{
+   if (prog->priv && prog->clear_priv)
+   prog->clear_priv(prog, prog->priv);
+
+   prog->priv = priv;
+   prog->clear_priv = clear_priv;
+   return 0;
+}
+
+int bpf_program__get_private(struct bpf_program *prog, void **ppriv)
+{
+   *ppriv = prog->priv;
+   return 0;
+}
+
+int bpf_program__get_title(struct bpf_program *prog,
+  const char **ptitle, bool dup)
+{
+   const char *title;
+   
+   if (!ptitle)
+   return -EINVAL;
+
+   title = prog->section_name;
+   if (dup) {
+   title = strdup(title);
+   if (!title) {
+   pr_warning("failed to strdup program title\n");
+   *ptitle = NULL;
+   return -ENOMEM;
+   }
+   }
+
+   *ptitle = title;
+   return 0;
+}
+
+int bpf_program__get_fd(struct bpf_program *prog, int *pfd)
+{
+   if (!pfd)
+   return -EINVAL;
+
+   *pfd = prog->fd;
+   return 0;
+}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 716e6df..8276735 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -9,6 +9,7 @@
 #define __BPF_LIBBPF_H
 
 #include 
+#include 
 
 void libbpf_set_print(int (*warn)(const char *format, ...),
  int (*info)(const char *format, ...),
@@ -24,6 +25,30 @@ void bpf_object__close(struct bpf_object *object);
 int bpf_object__load(struct bpf_object *obj);
 int bpf_object__unload(struct bpf_object *obj);
 
+/* Accessors of bpf_program. */
+struct bpf_program;
+struct bpf_program *bpf_program__next(struct bpf_program *prog,
+ struct bpf_object *obj);
+
+#define bpf_object__for_each_program(pos, obj) \
+   for ((pos) = bpf_program__next(NULL, (obj));\
+(pos) != NULL; \
+(pos) = bpf_program__next((pos), (obj)))
+
+typedef void (*bpf_program_clear_priv_t)(struct bpf_program *,
+void *);
+
+int bpf_program__set_private(struct bpf_program *prog, void *priv,
+bpf_program_clear_priv_t clear_priv);
+
+int bpf_program__get_private(struct bpf_program *prog,
+void **ppriv);
+
+int bpf_program__get_title(struct bpf_program *prog,
+  const char **ptitle, bool dup);
+
+int bpf_program__get_fd(struct bpf_program *prog, int *pfd);
+
 /*
  * We don't need __attribute__((packed)) now since it is
  * unnecessary for 'bpf_map_def' because they are all aligned.
-- 
1.8.3.4
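
The private-data accessors in this patch follow an ownership pattern that can be shown on its own: the caller attaches a pointer together with a cleanup callback, and the callback runs when the data is replaced. The sketch below is illustrative only. `struct demo_prog`, `demo_prog__set_private()`, `demo_prog__get_private()` and `demo_free_priv()` are hypothetical names that mirror, but are not, the libbpf functions.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_prog;
typedef void (*demo_clear_priv_t)(struct demo_prog *prog, void *priv);

/* Hypothetical program handle carrying caller-owned private data. */
struct demo_prog {
    int fd;
    void *priv;
    demo_clear_priv_t clear_priv;
};

/* Attach private data; release previously attached data first. */
static int demo_prog__set_private(struct demo_prog *prog, void *priv,
                                  demo_clear_priv_t clear_priv)
{
    if (prog->priv && prog->clear_priv)
        prog->clear_priv(prog, prog->priv);

    prog->priv = priv;
    prog->clear_priv = clear_priv;
    return 0;
}

/* Hand the stored pointer back through an output parameter. */
static int demo_prog__get_private(struct demo_prog *prog, void **ppriv)
{
    *ppriv = prog->priv;
    return 0;
}

/* Example cleanup callback: frees the data and counts invocations. */
static int demo_clear_calls;

static void demo_free_priv(struct demo_prog *prog, void *priv)
{
    (void)prog;
    free(priv);
    demo_clear_calls++;
}
```

Passing the cleanup callback along with the data means the library never needs to know the type of what the caller stored.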


[RFC PATCH v4 00/29] perf tools: filtering events using eBPF programs

2015-05-26 Thread Wang Nan

This is the 4th version of the patch series that introduces eBPF
programs to perf. It is based on v4.1-rc3. The series improves
'perf record' and enables commands like

 # perf record --event bpf-file.o sleep 1

to select events defined in bpf-file.o and filter those events using
bpf programs inside it.

Unlike the previous version (perf tools: introduce 'perf bpf' command
to load eBPF programs.), the 4th version drops the 'perf bpf'
subcommand and merges the event filtering use case directly into
'perf record'.

Other improvements include:

 1. Simply return an error on a byte-order mismatch, instead of trying
to correct it.

 2. Introduce zfree() and zclose() to free memory and close files while
ensuring that pointers are set to NULL and fds are set to -1.

 3. Use OO-style naming. For example, bpf_object__open() instead of
bpf_open_object() in v3.

 4. libbpf uses a linked list to link all bpf_object instances together.
The caller doesn't need to store pointers to bpf_object and bpf_program.

 5. Don't treat the 'config' section specially.

 6. Remove the 'atexit' hook.

 7. Bugfix: if multiple perf events are created for one kprobe event,
only the first attempt to hook the eBPF program can succeed. Other
attempts return EEXIST. Such errors should be ignored.

 8. Coding style, makefile and license correction.
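
Item 2 in the list above can be illustrated with a short sketch. The definitions below are assumptions made for illustration and may differ from the macros in the series; `demo_zfree()` and `demo_zclose()` are hypothetical names.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Free the memory that *ptr points to and reset *ptr to NULL, so a
 * second call or a later NULL check is safe.
 */
#define demo_zfree(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)

/* Close fd if it is valid, then reset it to -1. */
#define demo_zclose(fd) do { if ((fd) >= 0) close(fd); (fd) = -1; } while (0)
```

Resetting the variable at the point of release makes cleanup paths idempotent, which matters when the same unload function can run both on error and on normal close.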

Patches 1/29 - 4/29 are preparation: they move some headers from the perf
internal include directory to tools/include so that libbpf can use them.

Patches 5/29 - 22/29 introduce libbpf. The design principle is similar
to v3, except that the caller can now iterate over objects using a macro.

Patches 23/29 - 29/29 improve the 'perf record' subcommand. In patch 24,
the event parsing syntax is extended so that strings like
'./bpf-file.o' and 'bpf-object.bpf' can be passed to '--event'.

To make it work, the following acked patches should be cherry-picked before
applying this series:

 tools: Change FEATURE_TESTS and FEATURE_DISPLAY to weak binding
 perf tools: Set vmlinux_path__nr_entries to 0 in vmlinux_path__exit
 perf/events/core: fix race in bpf program unregister

Wang Nan (29):
  tools: Add __aligned_u64 to types.h
  perf tools: Move linux/kernel.h to tools/include
  perf tools: Move linux/{list.h,poison.h} to tools/include
  bpf tools: Introduce 'bpf' library to tools
  bpf tools: Allow caller to set printing function
  bpf tools: Open eBPF object file and do basic validation
  bpf tools: Check endianess and make libbpf fail early
  bpf tools: Iterate over ELF sections to collect information
  bpf tools: Collect version and license from ELF sections
  bpf tools: Collect map definitions from 'maps' section
  bpf tools: Collect symbol table from SHT_SYMTAB section
  bpf tools: Collect eBPF programs from their own sections
  bpf tools: Collect relocation sections from SHT_REL sections
  bpf tools: Record map accessing instructions for each program
  bpf tools: Add bpf.c/h for common bpf operations
  bpf tools: Create eBPF maps defined in an object file
  bpf tools: Relocate eBPF programs
  bpf tools: Introduce bpf_load_program() to bpf.c
  bpf tools: Load eBPF programs in object files into kernel
  bpf tools: Introduce accessors for struct bpf_program
  bpf tools: Introduce accessors for struct bpf_object
  bpf tools: Link all bpf objects onto a list
  perf tools: Make perf depend on libbpf
  perf record: Enable passing bpf object file to --event
  perf tools: Parse probe points of eBPF programs during preparation
  perf record: Probe at kprobe points
  perf record: Load all eBPF object into kernel
  perf tools: Add bpf_fd field to evsel and config it
  perf tools: Attach eBPF program to perf event

 tools/{perf/util => }/include/linux/kernel.h |   0
 tools/{perf/util => }/include/linux/list.h   |   6 +-
 tools/include/linux/poison.h |   1 +
 tools/include/linux/types.h  |   5 +
 tools/lib/bpf/.gitignore |   2 +
 tools/lib/bpf/Build  |   1 +
 tools/lib/bpf/Makefile   | 190 ++
 tools/lib/bpf/bpf.c  |  90 +++
 tools/lib/bpf/bpf.h  |  23 +
 tools/lib/bpf/libbpf.c   | 969 +++
 tools/lib/bpf/libbpf.h   |  75 +++
 tools/perf/Makefile.perf |  19 +-
 tools/perf/builtin-record.c  |  32 +
 tools/perf/util/Build|   1 +
 tools/perf/util/bpf-loader.c | 284 
 tools/perf/util/bpf-loader.h |  32 +
 tools/perf/util/debug.c  |   5 +
 tools/perf/util/debug.h  |   1 +
 tools/perf/util/evlist.c |  32 +
 tools/perf/util/evlist.h |   1 +
 tools/perf/util/evsel.c  |  17 +
 tools/perf/util/evsel.h  |   1 +
 tools/perf/util/include/linux/poison.h   |   1 -
 tools/perf/util/parse-events.c   |  16 +

[RFC PATCH v4 16/29] bpf tools: Create eBPF maps defined in an object file

2015-05-26 Thread Wang Nan

This patch creates maps based on the 'map' section in the object file using
bpf_create_map(), and stores the fds into an array in
'struct bpf_object'. Since the byte order of the object may differ
from the host, swap map definition before processing.

This is the first patch in the 'loading' phase. Previous patches parse the ELF
object file and create the needed data structures, but don't interact with
the kernel. They belong to the 'opening' phase.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 84 ++
 tools/lib/bpf/libbpf.h |  4 +++
 2 files changed, 88 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 3a7ff7d..fe4d282 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -21,6 +21,7 @@
 #include 
 
 #include "libbpf.h"
+#include "bpf.h"
 
 #define __printf(a, b) __attribute__((format(printf, a, b)))
 
@@ -108,6 +109,7 @@ struct bpf_object {
 
struct bpf_program *programs;
size_t nr_programs;
+   int *maps_fds;
 
/*
 * Information when doing elf related work. Only valid if fd
@@ -534,6 +536,57 @@ bpf_program__collect_reloc(struct bpf_program *prog,
return 0;
 }
 
+static int
+bpf_object__create_maps(struct bpf_object *obj)
+{
+   unsigned int i;
+   size_t nr_maps;
+   int *pfd;
+
+   nr_maps = obj->maps_buf_sz / sizeof(struct bpf_map_def);
+   if (!obj->maps_buf || !nr_maps) {
+   pr_debug("don't need create maps for %s\n",
+obj->path);
+   return 0;
+   }
+
+   obj->maps_fds = malloc(sizeof(int) * nr_maps);
+   if (!obj->maps_fds) {
+   pr_warning("realloc perf_bpf_maps_fds failed\n");
+   return -ENOMEM;
+   }
+
+   /* fill all fd with -1 */
+   memset(obj->maps_fds, 0xff, sizeof(int) * nr_maps);
+   
+   pfd = obj->maps_fds;
+   for (i = 0; i < nr_maps; i++) {
+   struct bpf_map_def def;
+
+   def = *(struct bpf_map_def *)(obj->maps_buf +
+   i * sizeof(struct bpf_map_def));
+
+   *pfd = bpf_create_map(def.type,
+ def.key_size,
+ def.value_size,
+ def.max_entries);
+   if (*pfd < 0) {
+   size_t j;
+   int err = *pfd;
+
+   pr_warning("failed to create map: %s\n",
+  strerror(errno));
+   for (j = 0; j < i; j++)
+   zclose(obj->maps_fds[j]);
+   zfree(&obj->maps_fds);
+   return err;
+   }
+   pr_debug("create map: fd=%d\n", *pfd);
+   pfd ++;
+   }
+   return 0;
+}
+
 static int bpf_object__collect_reloc(struct bpf_object *obj)
 {
int i, err;
@@ -619,6 +672,36 @@ out:
return NULL;
 }
 
+int bpf_object__unload(struct bpf_object *obj)
+{
+   size_t i;
+   size_t sz = sizeof(struct bpf_map_def);
+
+   if (!obj)
+   return -EINVAL;
+
+   for (i = 0; i < obj->maps_buf_sz / sz; i++)
+   zclose(obj->maps_fds[i]);
+   zfree(&obj->maps_fds);
+
+   return 0;
+}
+
+int bpf_object__load(struct bpf_object *obj)
+{
+   if (!obj)
+   return -EINVAL;
+
+   if (bpf_object__create_maps(obj))
+   goto out;
+
+   return 0;
+out:
+   bpf_object__unload(obj);
+   pr_warning("failed to load object '%s'\n", obj->path);
+   return -EINVAL;
+}
+
 void bpf_object__close(struct bpf_object *obj)
 {
size_t i;
@@ -627,6 +710,7 @@ void bpf_object__close(struct bpf_object *obj)
return;
 
bpf_object__elf_finish(obj);
+   bpf_object__unload(obj);
 
zfree(&obj->maps_buf);
 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 73f796f..716e6df 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -20,6 +20,10 @@ struct bpf_object;
 struct bpf_object *bpf_object__open(const char *path);
 void bpf_object__close(struct bpf_object *object);
 
+/* Load/unload object into/from kernel */
+int bpf_object__load(struct bpf_object *obj);
+int bpf_object__unload(struct bpf_object *obj);
+
 /*
  * We don't need __attribute__((packed)) now since it is
  * unnecessary for 'bpf_map_def' because they are all aligned.
-- 
1.8.3.4
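
The map creation loop in this patch uses two small techniques: filling an int array with -1 via memset(..., 0xff, ...), and rolling back already created resources when a later creation fails. The following sketch shows both in a self-contained form. It is an illustration under assumptions, not the libbpf code: `demo_create_all()`, `demo_create()` and `demo_destroy()` are hypothetical, and the callbacks stand in for bpf_create_map() and close().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef int (*demo_create_fn)(size_t idx);
typedef void (*demo_destroy_fn)(int fd);

/*
 * Create nr resources. On failure, destroy the ones already created,
 * free the array and return the error, so the caller never sees a
 * half-initialized array.
 */
static int demo_create_all(int **pfds, size_t nr,
                           demo_create_fn create, demo_destroy_fn destroy)
{
    int *fds;
    size_t i, j;

    *pfds = NULL;
    if (!nr)
        return 0;

    fds = malloc(sizeof(int) * nr);
    if (!fds)
        return -ENOMEM;

    /* All bytes 0xff reads back as -1 for two's-complement int. */
    memset(fds, 0xff, sizeof(int) * nr);

    for (i = 0; i < nr; i++) {
        fds[i] = create(i);
        if (fds[i] < 0) {
            int err = fds[i];

            for (j = 0; j < i; j++)
                destroy(fds[j]);
            free(fds);
            return err;
        }
    }

    *pfds = fds;
    return 0;
}

/* Demo callbacks used to exercise the sketch: index 2 always fails. */
static int demo_destroyed;

static int demo_create(size_t idx)
{
    return idx == 2 ? -EINVAL : (int)idx + 10;
}

static void demo_destroy(int fd)
{
    (void)fd;
    demo_destroyed++;
}
```

With these callbacks, asking for two resources succeeds, while asking for three fails at index 2 and destroys the two that were already created.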


[RFC PATCH v4 12/29] bpf tools: Collect eBPF programs from their own sections

2015-05-26 Thread Wang Nan

This patch collects all programs in an object file into an array of
'struct bpf_program' for further processing. That structure
represents a single eBPF program. 'bpf_prog' would be a better name, but
it is already used by linux/filter.h. Although that is a kernel space name,
I still prefer to call it 'bpf_program' to prevent possible confusion.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 105 +
 1 file changed, 105 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index d89fd42..17b2aa1 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -81,12 +81,27 @@ void libbpf_set_print(int (*warn)(const char *format, ...),
 # define LIBBPF_ELF_C_READ_MMAP ELF_C_READ
 #endif
 
+/* 
+ * bpf_prog should be a better name but it has been used in
+ * linux/filter.h.
+ */
+struct bpf_program {
+   /* Index in elf obj file, for relocation use. */
+   int idx;
+   char *section_name;
+   struct bpf_insn *insns;
+   size_t insns_cnt;
+};
+
 struct bpf_object {
char license[64];
u32 kern_version;
void *maps_buf;
size_t maps_buf_sz;
 
+   struct bpf_program *programs;
+   size_t nr_programs;
+
/*
 * Information when doing elf related work. Only valid if fd
 * is valid.
@@ -101,6 +116,74 @@ struct bpf_object {
 };
 #define obj_elf_valid(o)   ((o)->efile.fd >= 0)
 
+static void bpf_program__clear(struct bpf_program *prog)
+{
+   if (!prog)
+   return;
+
+   zfree(&prog->section_name);
+   zfree(&prog->insns);
+   prog->insns_cnt = 0;
+   prog->idx = -1;
+}
+
+static struct bpf_program *
+bpf_program__new(struct bpf_object *obj, void *data, size_t size,
+char *name, int idx)
+{
+   struct bpf_program *prog, *progs;
+   int nr_progs;
+
+   if (size < sizeof(struct bpf_insn)) {
+   pr_warning("corrupted section '%s'\n", name);
+   return NULL;
+   }
+   
+   progs = obj->programs;
+   nr_progs = obj->nr_programs;
+
+   progs = realloc(progs, sizeof(*prog) * (nr_progs + 1));
+   if (!progs) {
+   /*
+* In this case the original obj->programs
+* is still valid, so don't need special treat for
+* bpf_close_object().
+*/
+   pr_warning("failed to alloc a new program '%s'\n",
+  name);
+   return NULL;
+   }
+
+   obj->programs = progs;
+
+   prog = &progs[nr_progs];
+   bzero(prog, sizeof(*prog));
+
+   obj->nr_programs = nr_progs + 1;
+
+   prog->section_name = strdup(name);
+   if (!prog->section_name) {
+   pr_warning("failed to alloc name for prog %s\n",
+  name);
+   goto out;
+   }
+
+   prog->insns = malloc(size);
+   if (!prog->insns) {
+   pr_warning("failed to alloc insns for %s\n", name);
+   goto out;
+   }
+   prog->insns_cnt = size / sizeof(struct bpf_insn);
+   memcpy(prog->insns, data,
+  prog->insns_cnt * sizeof(struct bpf_insn));
+   prog->idx = idx;
+
+   return prog;
+out:
+   bpf_program__clear(prog);
+   return NULL;
+}
+
 static struct bpf_object *bpf_object__new(const char *path)
 {
struct bpf_object *obj;
@@ -318,6 +401,21 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
err = -EEXIST;
} else
obj->efile.symbols = data;
+   } else if ((sh.sh_type == SHT_PROGBITS) &&
+  (sh.sh_flags & SHF_EXECINSTR) &&
+  (data->d_size > 0)) {
+   struct bpf_program *prog;
+
+   prog = bpf_program__new(obj, data->d_buf,
+   data->d_size, name,
+   idx);
+   if (!prog) {
+   pr_warning("failed to alloc program %s (%s)",
+  name, obj->path);
+   err = -ENOMEM;
+   } else
+   pr_debug("found program %s\n",
+prog->section_name);
}
if (err)
goto out;
@@ -373,11 +471,18 @@ out:
 
 void bpf_object__close(struct bpf_object *obj)
 {
+   size_t i;
+
if (!obj)
return;
 
bpf_object__elf_finish(obj);
 
zfree(&obj->maps_buf);
+
+   for (i = 0; i < obj->nr_programs; i++)
+   bpf_program__clear(&obj->programs[i]);
+   zfree(&obj->programs);
+
free(obj);
 }
-- 
1.8.3.4
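
bpf_program__new() in this patch grows the program array one element at a time with realloc(), keeping the old array valid if the allocation fails. A minimal sketch of that growth pattern follows. The names (`struct demo_item`, `struct demo_vec`, `demo_vec__add()`, `demo_vec__clear()`) are hypothetical. As a deliberate difference from the patch, the sketch copies the name before growing the array, so the element count only changes once every allocation has succeeded.

```c
#include <stdlib.h>
#include <string.h>

struct demo_item {
    int idx;
    char *name;
};

struct demo_vec {
    struct demo_item *items;
    size_t nr_items;
};

/* Append one item; returns it, or NULL with the vector left unchanged. */
static struct demo_item *demo_vec__add(struct demo_vec *vec,
                                       const char *name, int idx)
{
    struct demo_item *items, *item;
    size_t len = strlen(name) + 1;
    char *copy;

    copy = malloc(len);
    if (!copy)
        return NULL;
    memcpy(copy, name, len);

    /* Use a temporary: if realloc fails, vec->items is still valid. */
    items = realloc(vec->items, sizeof(*items) * (vec->nr_items + 1));
    if (!items) {
        free(copy);
        return NULL;
    }
    vec->items = items;

    item = &items[vec->nr_items];
    memset(item, 0, sizeof(*item));
    item->name = copy;
    item->idx = idx;
    vec->nr_items++;
    return item;
}

/* Release every item and reset the vector to its empty state. */
static void demo_vec__clear(struct demo_vec *vec)
{
    size_t i;

    for (i = 0; i < vec->nr_items; i++)
        free(vec->items[i].name);
    free(vec->items);
    vec->items = NULL;
    vec->nr_items = 0;
}
```

Note that pointers to earlier items may be invalidated by a later realloc(), so callers should keep indices rather than item pointers across additions.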


Re: [PATCH try #4] proc: fix PAGE_SIZE limit of /proc/$PID/cmdline

2015-05-26 Thread Jarod Wilson

On May 26, 2015, at 5:24 PM, Alexey Dobriyan  wrote:
> 
>> On Tue, May 26, 2015 at 04:42:36PM -0400, Jarod Wilson wrote:
>>> On 5/8/2015 8:28 AM, Alexey Dobriyan wrote:
>>> /proc/$PID/cmdline truncates output at PAGE_SIZE. It is easy to see with
>>> 
>>>$ cat /proc/self/cmdline $(seq 1037) 2>/dev/null
>>> 
>>> However, command line size was never limited to PAGE_SIZE but to 128 KB and
>>> relatively recently limitation was removed altogether.
>>> 
>>> People noticed and ask questions:
>>> http://stackoverflow.com/questions/199130/how-do-i-increase-the-proc-pid-cmdline-4096-byte-limit
>>> 
>>> seq file interface is not OK, because it kmalloc's for whole output and
>>> open + read(, 1) + sleep will pin arbitrary amounts of kernel memory.
>>> To not do that, limit must be imposed which is incompatible with
>>> arbitrary sized command lines.
>>> 
>>> I apologize for hairy code, but this it direct consequence of command line
>>> layout in memory and hacks to support things like "init [3]".
>>> 
>>> The loops are "unrolled" otherwise it is either macros which hide
>>> control flow or functions with 7-8 arguments with equal line count.
>>> 
>>> There should be real setproctitle(2) or something.
>>> 
>>> Signed-off-by: Alexey Dobriyan 
>>> Tested-by: Jarod Wilson 
>>> Acked-by: Jarod Wilson 
>> 
>> Should have tested on more than just x86, it appears. We've started 
>> hammering on this internally across all arches, and its exploded 
>> multiple times on ppc64 now:
>> 
>> [ 2717.074699] [ cut here ]
>> [ 2717.074787] kernel BUG at fs/proc/base.c:244!
> 
>> OE--   3.10.0-255.el7.ppc64.debug #1
> 
> Which BUG_ON is this?
> 
>BUG_ON(*pos < 0);
>BUG_ON(arg_start > arg_end);
>BUG_ON(env_start > env_end);

Ah, sorry, right, it might not be exactly the same in the backported
version... It was the env_start > env_end one.

-- 
Jarod Wilson

[RFC PATCH v4 14/29] bpf tools: Record map accessing instructions for each program

2015-05-26 Thread Wang Nan

This patch records the indices of instructions that need to be
relocated. That information is saved in the 'reloc_desc' field of
'struct bpf_program'. In the loading phase (this patch takes effect in
the opening phase), the collected instructions will be replaced by
map loading instructions.

Since we are going to close the ELF file and clear all data at the end
of the 'opening' phase, ELF information will no longer be valid in the
'loading' phase. We have to locate the instructions before maps are
loaded, instead of directly modifying them.

'struct bpf_map_def' is introduced in this patch to let us know how many
maps are defined in the object.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 124 +
 tools/lib/bpf/libbpf.h |  13 ++
 2 files changed, 137 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 560018e..3a7ff7d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -91,6 +92,12 @@ struct bpf_program {
char *section_name;
struct bpf_insn *insns;
size_t insns_cnt;
+
+   struct {
+   int insn_idx;
+   int map_idx;
+   } *reloc_desc;
+   int nr_reloc;
 };
 
 struct bpf_object {
@@ -128,6 +135,9 @@ static void bpf_program__clear(struct bpf_program *prog)
 
zfree(&prog->section_name);
zfree(&prog->insns);
+   zfree(&prog->reloc_desc);
+
+   prog->nr_reloc = 0;
prog->insns_cnt = 0;
prog->idx = -1;
 }
@@ -450,6 +460,118 @@ out:
return err;
 }
 
+static struct bpf_program *
+bpf_object__find_prog_by_idx(struct bpf_object *obj, int idx)
+{
+   struct bpf_program *prog;
+   size_t i;
+
+   for (i = 0; i < obj->nr_programs; i++) {
+   prog = &obj->programs[i];
+   if (prog->idx == idx)
+   return prog;
+   }
+   return NULL;
+}
+
+static int
+bpf_program__collect_reloc(struct bpf_program *prog,
+  size_t nr_maps, GElf_Shdr *shdr,
+  Elf_Data *data, Elf_Data *symbols)
+{
+   int i, nrels;
+
+   pr_debug("collecting relocating info for: '%s'\n",
+prog->section_name);
+   nrels = shdr->sh_size / shdr->sh_entsize;
+   
+   prog->reloc_desc = malloc(sizeof(*prog->reloc_desc) * nrels);
+   if (!prog->reloc_desc) {
+   pr_warning("failed to alloc memory in relocation\n");
+   return -ENOMEM;
+   }
+   prog->nr_reloc = nrels;
+
+   for (i = 0; i < nrels; i++) {
+   GElf_Sym sym;
+   GElf_Rel rel;
+   unsigned int insn_idx;
+   struct bpf_insn *insns = prog->insns;
+   size_t map_idx;
+
+   if (!gelf_getrel(data, i, &rel)) {
+   pr_warning("relocation: failed to get %d reloc\n", i);
+   return -EINVAL;
+   }
+
+   insn_idx = rel.r_offset / sizeof(struct bpf_insn);
+   pr_debug("relocation: insn_idx=%u\n", insn_idx);
+
+   if (!gelf_getsym(symbols,
+GELF_R_SYM(rel.r_info),
+&sym)) {
+   pr_warning("relocation: symbol %"PRIx64" not found\n",
+  GELF_R_SYM(rel.r_info));
+   return -EINVAL;
+   }
+
+   if (insns[insn_idx].code != (BPF_LD | BPF_IMM | BPF_DW)) {
+   pr_warning("bpf: relocation: invalid relo for insns[%d].code 0x%x\n",
+  insn_idx, insns[insn_idx].code);
+   return -EINVAL;
+   }
+
+   map_idx = sym.st_value / sizeof(struct bpf_map_def);
+   if (map_idx >= nr_maps) {
+   pr_warning("bpf relocation: map_idx %d large than %d\n",
+  (int)map_idx, (int)nr_maps - 1);
+   return -EINVAL;
+   }
+
+   prog->reloc_desc[i].insn_idx = insn_idx;
+   prog->reloc_desc[i].map_idx = map_idx;
+   }
+   return 0;
+}
+
+static int bpf_object__collect_reloc(struct bpf_object *obj)
+{
+   int i, err;
+
+   if (!obj_elf_valid(obj)) {
+   pr_warning("Internal error: elf object is closed\n");
+   return -EINVAL;
+   }
+
+   for (i = 0; i < obj->efile.nr_reloc; i++) {
+   GElf_Shdr *shdr = &obj->efile.reloc[i].shdr;
+   Elf_Data *data = obj->efile.reloc[i].data;
+   int idx = shdr->sh_info;
+   struct bpf_program *prog;
+   size_t nr_maps = obj->maps_buf_sz /
+sizeof(struct bpf_map_def);
+
+   if (shdr->sh_type != SHT_REL) {
+   pr_warning("internal error at %d\n", __LINE__);
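
The index arithmetic in bpf_program__collect_reloc() can be sketched without libelf: the relocation offset divided by the instruction size gives the instruction index, and the symbol value divided by the map definition size gives the map index. The code below is illustrative only. `demo_reloc_from_offsets()` and the two size constants are hypothetical; the 16-byte map definition size assumes four 32-bit fields, and the bound check on the instruction index is an addition that the patch does not perform at this point.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INSN_SIZE    8   /* one eBPF instruction is 8 bytes */
#define DEMO_MAP_DEF_SIZE 16  /* assumption: four 32-bit fields */

struct demo_reloc {
    unsigned int insn_idx;
    size_t map_idx;
};

/*
 * Turn byte offsets taken from the ELF relocation entry (r_offset)
 * and symbol (st_value) into array indices, rejecting anything that
 * falls outside the program or the map table.
 */
static int demo_reloc_from_offsets(uint64_t r_offset, uint64_t st_value,
                                   size_t insns_cnt, size_t nr_maps,
                                   struct demo_reloc *out)
{
    uint64_t insn_idx = r_offset / DEMO_INSN_SIZE;
    uint64_t map_idx = st_value / DEMO_MAP_DEF_SIZE;

    if (insn_idx >= insns_cnt || map_idx >= nr_maps)
        return -EINVAL;

    out->insn_idx = (unsigned int)insn_idx;
    out->map_idx = (size_t)map_idx;
    return 0;
}
```

For example, an offset of 24 bytes selects instruction 3, and a symbol value of 16 selects the second map definition.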
+

[RFC PATCH v4 02/29] perf tools: Move linux/kernel.h to tools/include

2015-05-26 Thread Wang Nan

This patch moves kernel.h from tools/perf/util/include/linux/kernel.h
to tools/include/linux/kernel.h so that other libraries can use the macros
in it, such as libbpf, which will be introduced by later patches.

Signed-off-by: Wang Nan 
---
 tools/{perf/util => }/include/linux/kernel.h | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename tools/{perf/util => }/include/linux/kernel.h (100%)

diff --git a/tools/perf/util/include/linux/kernel.h b/tools/include/linux/kernel.h
similarity index 100%
rename from tools/perf/util/include/linux/kernel.h
rename to tools/include/linux/kernel.h
-- 
1.8.3.4


[RFC PATCH v4 22/29] bpf tools: Link all bpf objects onto a list

2015-05-26 Thread Wang Nan

To prevent callers from creating additional structures to hold
pointers to 'struct bpf_object', this patch links all such
structures onto a list (hidden from the user). bpf_object__for_each() is
introduced to allow users to iterate over all objects.
bpf_object__for_each() is safe even if the user closes the object during
iteration.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 32 
 tools/lib/bpf/libbpf.h |  7 +++
 2 files changed, 39 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e5084fe..e2b41b7 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -107,6 +108,8 @@ struct bpf_program {
bpf_program_clear_priv_t clear_priv;
 };
 
+static LIST_HEAD(bpf_objects_list);
+
 struct bpf_object {
char license[64];
u32 kern_version;
@@ -132,6 +135,12 @@ struct bpf_object {
} *reloc;
int nr_reloc;
} efile;
+   /*
+* All loaded bpf_object is linked in a list, which is
+* hidden to caller. bpf_objects__ handlers deal with
+* all objects.
+*/
+   struct list_head list;
char path[];
 };
 #define obj_elf_valid(o)   ((o)->efile.fd >= 0)
@@ -236,6 +245,9 @@ static struct bpf_object *bpf_object__new(const char *path)
 
strcpy(obj->path, path);
obj->efile.fd = -1;
+
+   INIT_LIST_HEAD(&obj->list);
+   list_add(&obj->list, &bpf_objects_list);
return obj;
 }
 
@@ -853,6 +865,7 @@ void bpf_object__close(struct bpf_object *obj)
bpf_program__clear(&obj->programs[i]);
zfree(&obj->programs);
 
+   list_del(&obj->list);
free(obj);
 }
 
@@ -865,6 +878,25 @@ int bpf_object__get_prog_cnt(struct bpf_object *obj, size_t *pcnt)
return 0;
 }
 
+struct bpf_object *
+bpf_object__next(struct bpf_object *prev)
+{
+   struct bpf_object *next;
+
+   if (!prev)
+   next = list_first_entry(&bpf_objects_list,
+   struct bpf_object,
+   list);
+   else
+   next = list_next_entry(prev, list);
+
+   /* Empty list is noticed here so don't need checking on entry. */
+   if (&next->list == &bpf_objects_list)
+   return NULL;
+
+   return next;
+}
+
 struct bpf_program *
 bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 84c83d4..fa271ec 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -28,6 +28,13 @@ int bpf_object__unload(struct bpf_object *obj);
 /* Accessors of bpf_object */
 int bpf_object__get_prog_cnt(struct bpf_object *obj, size_t *pcnt);
 
+struct bpf_object *bpf_object__next(struct bpf_object *prev);
+#define bpf_object__for_each(pos, tmp) \
+   for ((pos) = bpf_object__next(NULL),\
+   (tmp) = bpf_object__next(pos);  \
+(pos) != NULL; \
+(pos) = (tmp), (tmp) = bpf_object__next(tmp))
+
 /* Accessors of bpf_program. */
 struct bpf_program;
 struct bpf_program *bpf_program__next(struct bpf_program *prog,
-- 
1.8.3.4


[RFC PATCH v4 07/29] bpf tools: Check endianess and make libbpf fail early

2015-05-26 Thread Wang Nan

Check endianness according to EHDR. Code is taken from
tools/perf/util/symbol-elf.c.

Libbpf doesn't magically convert mismatched endianness. See the discussion
at https://lkml.org/lkml/2015/5/18/650: even if we swap
eBPF instructions to the correct byte order, we are unable to deal with
endianness in the code logic generated by LLVM.

Therefore, libbpf should simply reject a mismatched ELF object and let
LLVM create good code.
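
For readers unfamiliar with the probing trick used below, a minimal sketch follows. It is independent of libelf: 'enum byte_order', host_byte_order() and check_byte_order() are hypothetical names for this example, not part of the patch.

```c
/* Hypothetical tags for the two byte orders an object file can declare. */
enum byte_order { ORDER_LITTLE = 1, ORDER_BIG = 2 };

/* Detect the byte order of the machine running this code. */
enum byte_order host_byte_order(void)
{
	const unsigned int probe = 1;

	/* On a little-endian host the least significant byte is stored first. */
	return *(const unsigned char *)&probe == 1 ? ORDER_LITTLE : ORDER_BIG;
}

/* Return 0 if an object tagged 'obj_order' matches the host, -1 otherwise. */
int check_byte_order(enum byte_order obj_order)
{
	if (obj_order != ORDER_LITTLE && obj_order != ORDER_BIG)
		return -1;	/* unknown encoding: reject */
	return obj_order == host_byte_order() ? 0 : -1;
}
```

As in the patch, a mismatch is only reported, never repaired.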

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e910fb8..dbbea0c 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -169,6 +169,34 @@ errout:
return err;
 }
 
+static int
+bpf_object__check_endianess(struct bpf_object *obj)
+{
+   static unsigned int const endian = 1;
+
+   switch (obj->efile.ehdr.e_ident[EI_DATA]) {
+   case ELFDATA2LSB:
+   /* We are big endian, BPF obj is little endian. */
+   if (*(unsigned char const *)&endian != 1)
+   goto mismatch;
+   break;
+
+   case ELFDATA2MSB:
+   /* We are little endian, BPF obj is big endian. */
+   if (*(unsigned char const *)&endian != 0)
+   goto mismatch;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   return 0;
+
+mismatch:
+   pr_warning("Error: endianess mismatch.\n");
+   return -EINVAL;
+}
+
 struct bpf_object *bpf_object__open(const char *path)
 {
struct bpf_object *obj;
@@ -190,6 +218,8 @@ struct bpf_object *bpf_object__open(const char *path)
 
if (bpf_object__elf_init(obj))
goto out;
+   if (bpf_object__check_endianess(obj))
+   goto out;
 
bpf_object__elf_finish(obj);
return obj;
-- 
1.8.3.4


[RFC PATCH v4 03/29] perf tools: Move linux/{list.h,poison.h} to tools/include

2015-05-26 Thread Wang Nan

This patch moves list.h from tools/perf/util/include/linux/list.h
to tools/include/linux/list.h so that other libraries can use the macros in
it, such as libbpf, which will be introduced by later patches. Since
list.h depends on poison.h, poison.h is moved as well.

Both files use relative paths, so one '..' is removed from each header
to suit the new directory.

Signed-off-by: Wang Nan 
---
 tools/{perf/util => }/include/linux/list.h | 6 +++---
 tools/include/linux/poison.h   | 1 +
 tools/perf/util/include/linux/poison.h | 1 -
 3 files changed, 4 insertions(+), 4 deletions(-)
 rename tools/{perf/util => }/include/linux/list.h (90%)
 create mode 100644 tools/include/linux/poison.h
 delete mode 100644 tools/perf/util/include/linux/poison.h

diff --git a/tools/perf/util/include/linux/list.h b/tools/include/linux/list.h
similarity index 90%
rename from tools/perf/util/include/linux/list.h
rename to tools/include/linux/list.h
index 76ddbc7..76b014c 100644
--- a/tools/perf/util/include/linux/list.h
+++ b/tools/include/linux/list.h
@@ -1,10 +1,10 @@
 #include 
 #include 
 
-#include "../../../../include/linux/list.h"
+#include "../../../include/linux/list.h"
 
-#ifndef PERF_LIST_H
-#define PERF_LIST_H
+#ifndef TOOLS_LIST_H
+#define TOOLS_LIST_H
 /**
  * list_del_range - deletes range of entries from list.
  * @begin: first element in the range to delete from the list.
diff --git a/tools/include/linux/poison.h b/tools/include/linux/poison.h
new file mode 100644
index 000..0c27bdf
--- /dev/null
+++ b/tools/include/linux/poison.h
@@ -0,0 +1 @@
+#include "../../../include/linux/poison.h"
diff --git a/tools/perf/util/include/linux/poison.h b/tools/perf/util/include/linux/poison.h
deleted file mode 100644
index fef6dbc..000
--- a/tools/perf/util/include/linux/poison.h
+++ /dev/null
@@ -1 +0,0 @@
-#include "../../../../include/linux/poison.h"
-- 
1.8.3.4


[RFC PATCH v4 15/29] bpf tools: Add bpf.c/h for common bpf operations

2015-05-26 Thread Wang Nan

This patch introduces bpf.c and bpf.h, which hold common functions
issuing the bpf syscall. The goal of these two files is to hide the syscall
completely from the user. Note that bpf.c and bpf.h only deal with the kernel
interface. Things like the structure of the 'map' section in the ELF object are
outside the scope of bpf.[ch].

We first introduce bpf_create_map().

Note that, since the functions in bpf.[ch] are wrappers of sys_bpf, they
don't use OO-style naming.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/Build |  2 +-
 tools/lib/bpf/bpf.c | 56 +
 tools/lib/bpf/bpf.h | 16 +++
 3 files changed, 73 insertions(+), 1 deletion(-)
 create mode 100644 tools/lib/bpf/bpf.c
 create mode 100644 tools/lib/bpf/bpf.h

diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
index a316484..d874975 100644
--- a/tools/lib/bpf/Build
+++ b/tools/lib/bpf/Build
@@ -1 +1 @@
-libbpf-y := libbpf.o
+libbpf-y := libbpf.o bpf.o
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
new file mode 100644
index 000..7481923
--- /dev/null
+++ b/tools/lib/bpf/bpf.c
@@ -0,0 +1,56 @@
+/*
+ * common eBPF ELF operations.
+ *
+ * Copyright (C) 2013-2015 Alexei Starovoitov 
+ * Copyright (C) 2015 Wang Nan 
+ * Copyright (C) 2015 Huawei Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "bpf.h"
+
+/* When building perf, unistd.h is overridden. Define __NR_bpf ourselves. */
+#if defined(__i386__)
+#ifndef __NR_bpf
+# define __NR_bpf 357
+#endif
+#endif
+
+#if defined(__x86_64__)
+#ifndef __NR_bpf
+# define __NR_bpf 321
+#endif
+#endif
+
+#if defined(__aarch64__)
+#ifndef __NR_bpf
+# define __NR_bpf 280
+#endif
+#endif
+
+#ifndef __NR_bpf
+# error __NR_bpf not defined. libbpf does not support your arch.
+#endif
+
+static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
+{
+   return syscall(__NR_bpf, cmd, attr, size);
+}
+
+int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
+  int max_entries)
+{
+   union bpf_attr attr;
+   memset(&attr, '\0', sizeof(attr));
+   
+   attr.map_type = map_type;
+   attr.key_size = key_size;
+   attr.value_size = value_size;
+   attr.max_entries = max_entries;
+
+   return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
new file mode 100644
index 000..28f7942
--- /dev/null
+++ b/tools/lib/bpf/bpf.h
@@ -0,0 +1,16 @@
+/*
+ * common eBPF ELF operations.
+ *
+ * Copyright (C) 2013-2015 Alexei Starovoitov 
+ * Copyright (C) 2015 Wang Nan 
+ * Copyright (C) 2015 Huawei Inc.
+ */
+#ifndef __BPF_BPF_H
+#define __BPF_BPF_H
+
+#include 
+
+int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
+  int max_entries);
+
+#endif
-- 
1.8.3.4


[RFC PATCH v4 13/29] bpf tools: Collect relocation sections from SHT_REL sections

2015-05-26 Thread Wang Nan

This patch collects relocation sections into 'struct bpf_object'.
Such sections are used for connecting maps to bpf programs.
The 'reloc' field in 'struct bpf_object' is introduced for storing
this information.

This patch simply stores the data in the 'reloc' field. A following
patch will parse it to find the exact instructions that
need to be relocated.

Note that the collected data will be invalid after the ELF object file
is closed.
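
The array growth used for this can be sketched on its own, assuming simplified types: 'struct reloc_entry', 'struct reloc_table', reloc_table_add() and reloc_table_free() are hypothetical names for this example, and an int plus an opaque pointer stand in for the GElf_Shdr / Elf_Data pair.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the section header / section data pair. */
struct reloc_entry {
	int shdr_index;
	const void *data;
};

struct reloc_table {
	struct reloc_entry *entries;
	int nr;
};

/* Append one entry; on failure the table is left untouched. */
int reloc_table_add(struct reloc_table *t, int shdr_index, const void *data)
{
	/*
	 * Keep the result in a temporary: when realloc() returns NULL the
	 * old block is still valid and must not be lost.
	 */
	struct reloc_entry *grown = realloc(t->entries,
					    sizeof(*grown) * (t->nr + 1));

	if (!grown)
		return -ENOMEM;

	grown[t->nr].shdr_index = shdr_index;
	grown[t->nr].data = data;
	t->entries = grown;
	t->nr++;
	return 0;
}

/* Release the array and reset the count. */
void reloc_table_free(struct reloc_table *t)
{
	free(t->entries);
	t->entries = NULL;
	t->nr = 0;
}
```

Growing by one element per section is simple rather than fast, which seems acceptable for the handful of relocation sections an object file is likely to carry.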

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 17b2aa1..560018e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -111,6 +111,11 @@ struct bpf_object {
Elf *elf;
GElf_Ehdr ehdr;
Elf_Data *symbols;
+   struct {
+   GElf_Shdr shdr;
+   Elf_Data *data;
+   } *reloc;
+   int nr_reloc;
} efile;
char path[];
 };
@@ -209,6 +214,9 @@ static void bpf_object__elf_finish(struct bpf_object *obj)
obj->efile.elf = NULL;
}
obj->efile.symbols = NULL;
+
+   zfree(&obj->efile.reloc);
+   obj->efile.nr_reloc = 0;
zclose(obj->efile.fd);
 }
 
@@ -416,6 +424,24 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
} else
pr_debug("found program %s\n",
 prog->section_name);
+   } else if (sh.sh_type == SHT_REL) {
+   void *reloc = obj->efile.reloc;
+   int nr_reloc = obj->efile.nr_reloc;
+
+   reloc = realloc(reloc,
+   sizeof(*obj->efile.reloc) * (++nr_reloc));
+   if (!reloc) {
+   pr_warning("realloc failed\n");
+   err = -ENOMEM;
+   } else {
+   int n = nr_reloc - 1;
+
+   obj->efile.reloc = reloc;
+   obj->efile.nr_reloc = nr_reloc;
+
+   obj->efile.reloc[n].shdr = sh;
+   obj->efile.reloc[n].data = data;
+   }
}
if (err)
goto out;
-- 
1.8.3.4


[RFC PATCH v4 21/29] bpf tools: Introduce accessors for struct bpf_object

2015-05-26 Thread Wang Nan

This patch adds an accessor which allows the caller to get the count of programs
in an object file.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 9 +
 tools/lib/bpf/libbpf.h | 3 +++
 2 files changed, 12 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index a577f3e..e5084fe 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -856,6 +856,15 @@ void bpf_object__close(struct bpf_object *obj)
free(obj);
 }
 
+int bpf_object__get_prog_cnt(struct bpf_object *obj, size_t *pcnt)
+{
+   if (!obj || !pcnt)
+   return -EINVAL;
+
+   *pcnt = obj->nr_programs;
+   return 0;
+}
+
 struct bpf_program *
 bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 8276735..84c83d4 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -25,6 +25,9 @@ void bpf_object__close(struct bpf_object *object);
 int bpf_object__load(struct bpf_object *obj);
 int bpf_object__unload(struct bpf_object *obj);
 
+/* Accessors of bpf_object */
+int bpf_object__get_prog_cnt(struct bpf_object *obj, size_t *pcnt);
+
 /* Accessors of bpf_program. */
 struct bpf_program;
 struct bpf_program *bpf_program__next(struct bpf_program *prog,
-- 
1.8.3.4


[RFC PATCH v4 01/29] tools: Add __aligned_u64 to types.h

2015-05-26 Thread Wang Nan

Following patches will introduce linux/bpf.h to a new libbpf library,
which requires the definition of __aligned_u64. This patch adds it to the
common types.h for tools.
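
As a small illustration of what the attribute buys, the sketch below shows a 64-bit member forced onto an 8-byte boundary regardless of the ABI default. It assumes GCC or Clang; 'demo_aligned_u64', 'struct demo_attr' and demo_ptr_offset() are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local name; the patch defines __aligned_u64 the same way. */
#define demo_aligned_u64 uint64_t __attribute__((aligned(8)))

struct demo_attr {
	uint32_t flags;
	demo_aligned_u64 ptr;	/* padded so that it starts at offset 8 */
};

/* Offset of the aligned member inside the structure. */
size_t demo_ptr_offset(void)
{
	return offsetof(struct demo_attr, ptr);
}
```

Without the attribute, a 32-bit x86 build would typically place 'ptr' at offset 4, so 32-bit and 64-bit builds would disagree on the layout.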

Signed-off-by: Wang Nan 
---
 tools/include/linux/types.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/include/linux/types.h b/tools/include/linux/types.h
index b5cf25e..10a2cdc 100644
--- a/tools/include/linux/types.h
+++ b/tools/include/linux/types.h
@@ -60,6 +60,11 @@ typedef __u32 __bitwise __be32;
 typedef __u64 __bitwise __le64;
 typedef __u64 __bitwise __be64;
 
+/* Taken from uapi/linux/types.h. Required by linux/bpf.h */
+#ifndef __aligned_u64
+# define __aligned_u64 __u64 __attribute__((aligned(8)))
+#endif
+
 struct list_head {
struct list_head *next, *prev;
 };
-- 
1.8.3.4


[RFC PATCH v4 10/29] bpf tools: Collect map definitions from 'maps' section

2015-05-26 Thread Wang Nan

If maps are used by eBPF programs, the corresponding object file(s) should
contain a section named 'maps', which contains the map definitions. This
patch copies the data of the whole section. Map data parsing should be
done just before map loading.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 15525ad..da83766 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -84,6 +84,9 @@ void libbpf_set_print(int (*warn)(const char *format, ...),
 struct bpf_object {
char license[64];
u32 kern_version;
+   void *maps_buf;
+   size_t maps_buf_sz;
+
/*
 * Information when doing elf related work. Only valid if fd
 * is valid.
@@ -226,6 +229,28 @@ bpf_object__init_kversion(struct bpf_object *obj,
return 0;
 }
 
+static int
+bpf_object__init_maps(struct bpf_object *obj, void *data,
+ size_t size)
+{
+   if (size == 0) {
+   pr_debug("%s doesn't need map definition\n",
+obj->path);
+   return 0;
+   }
+
+   obj->maps_buf = malloc(size);
+   if (!obj->maps_buf) {
+   pr_warning("malloc maps failed: %s\n", obj->path);
+   return -ENOMEM;
+   }
+
+   obj->maps_buf_sz = size;
+   memcpy(obj->maps_buf, data, size);
+   pr_debug("maps in %s: %ld bytes\n", obj->path, (long)size);
+   return 0;
+}
+
 static int bpf_object__elf_collect(struct bpf_object *obj)
 {
Elf *elf = obj->efile.elf;
@@ -281,6 +306,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
err = bpf_object__init_kversion(obj,
data->d_buf,
data->d_size);
+   else if (strcmp(name, "maps") == 0)
+   err = bpf_object__init_maps(obj, data->d_buf,
+   data->d_size);
if (err)
goto out;
}
@@ -340,5 +368,6 @@ void bpf_object__close(struct bpf_object *obj)
 
bpf_object__elf_finish(obj);
 
+   zfree(&obj->maps_buf);
free(obj);
 }
-- 
1.8.3.4


[RFC PATCH v4 06/29] bpf tools: Open eBPF object file and do basic validation

2015-05-26 Thread Wang Nan

This patch defines the basic interface of libbpf. 'struct bpf_object' will
be the handle of each object file. Its internal structure is hidden from the
user. eBPF object files are compiled by LLVM into ELF format. In this
patch, libelf is used to open those files, read the EHDR and do basic
validation according to e_type and e_machine.

All ELF-related state is grouped together and resides in the efile field of
'struct bpf_object'. bpf_object__elf_finish() is introduced to clear it.

After all eBPF programs in an object file are loaded, the related ELF
information is no longer needed. Close the object file and free that memory.

zfree() and zclose() are introduced to ensure that pointers are set to NULL and
file descriptors to a negative value after resources are released.
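
The release-and-reset idiom can be sketched without GNU statement expressions. demo_zfree() and demo_zclose() below are hypothetical, portable variants written for this example; the patch itself uses the zfree()/zclose() macros copied from perf.

```c
#include <stdlib.h>
#include <unistd.h>

/* Free the pointed-to pointer and reset it, so repeating the call is harmless. */
#define demo_zfree(pptr) do { free(*(pptr)); *(pptr) = NULL; } while (0)

/*
 * Close *fd if it is open, then mark it invalid.
 * Returns the result of close(), or 0 if nothing was open.
 */
int demo_zclose(int *fd)
{
	int err = 0;

	if (*fd >= 0)
		err = close(*fd);
	*fd = -1;
	return err;
}
```

Because the handle is reset in the same step as the release, a cleanup path can run twice without a double free or a stray close() on a reused descriptor.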

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 152 +
 tools/lib/bpf/libbpf.h |   8 +++
 2 files changed, 160 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 49091c3..e910fb8 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -11,8 +11,12 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
+#include 
+#include 
 
 #include "libbpf.h"
 
@@ -55,3 +59,151 @@ void libbpf_set_print(int (*warn)(const char *format, ...),
__pr_info = info;
__pr_debug = debug;
 }
+
+/* Copied from tools/perf/util/util.h */
+#ifndef zfree
+# define zfree(ptr) ({ free(*ptr); *ptr = NULL; })
+#endif
+
+#ifndef zclose
+# define zclose(fd) ({ \
+   int ___err = 0; \
+   if ((fd) >= 0)  \
+   ___err = close((fd));   \
+   fd = -1;\
+   ___err;})
+#endif
+
+#ifdef HAVE_LIBELF_MMAP_SUPPORT
+# define LIBBPF_ELF_C_READ_MMAP ELF_C_READ_MMAP
+#else
+# define LIBBPF_ELF_C_READ_MMAP ELF_C_READ
+#endif
+
+struct bpf_object {
+   /*
+* Information when doing elf related work. Only valid if fd
+* is valid.
+*/
+   struct {
+   int fd;
+   Elf *elf;
+   GElf_Ehdr ehdr;
+   } efile;
+   char path[];
+};
+#define obj_elf_valid(o)   ((o)->efile.fd >= 0)
+
+static struct bpf_object *bpf_object__new(const char *path)
+{
+   struct bpf_object *obj;
+
+   obj = calloc(1, sizeof(struct bpf_object) + strlen(path) + 1);
+   if (!obj) {
+   pr_warning("alloc memory failed for %s\n", path);
+   return NULL;
+   }
+
+   strcpy(obj->path, path);
+   obj->efile.fd = -1;
+   return obj;
+}
+
+static void bpf_object__elf_finish(struct bpf_object *obj)
+{
+   if (!obj_elf_valid(obj))
+   return;
+
+   if (obj->efile.elf) {
+   elf_end(obj->efile.elf);
+   obj->efile.elf = NULL;
+   }
+   zclose(obj->efile.fd);
+}
+
+static int bpf_object__elf_init(struct bpf_object *obj)
+{
+   int err = 0;
+   GElf_Ehdr *ep;
+
+   if (obj_elf_valid(obj)) {
+   pr_warning("elf init: internal error\n");
+   return -EEXIST;
+   }
+   
+   obj->efile.fd = open(obj->path, O_RDONLY);
+   if (obj->efile.fd < 0) {
+   pr_warning("failed to open %s: %s\n", obj->path,
+   strerror(errno));
+   return -errno;
+   }
+
+   obj->efile.elf = elf_begin(obj->efile.fd,
+LIBBPF_ELF_C_READ_MMAP,
+NULL);
+   if (!obj->efile.elf) {
+   pr_warning("failed to open %s as ELF file\n",
+   obj->path);
+   err = -EINVAL;
+   goto errout;
+   }
+
+   if (!gelf_getehdr(obj->efile.elf, &obj->efile.ehdr)) {
+   pr_warning("failed to get EHDR from %s\n",
+   obj->path);
+   err = -EINVAL;
+   goto errout;
+   }
+   ep = &obj->efile.ehdr;
+
+   if ((ep->e_type != ET_REL) || (ep->e_machine != 0)) {
+   pr_warning("%s is not an eBPF object file\n",
+   obj->path);
+   err = -EINVAL;
+   goto errout;
+   }
+
+   return 0;
+errout:
+   bpf_object__elf_finish(obj);
+   return err;
+}
+
+struct bpf_object *bpf_object__open(const char *path)
+{
+   struct bpf_object *obj;
+
+   /* param validation */
+   if (!path)
+   return NULL;
+
+   pr_debug("loading %s\n", path);
+
+   if (elf_version(EV_CURRENT) == EV_NONE) {
+   pr_warning("failed to init libelf for %s\n", path);
+   return NULL;
+   }
+
+   obj = bpf_object__new(path);
+   if (!obj)
+   return NULL;
+
+   if (bpf_object__elf_init(obj))
+   goto out;
+
+   bpf_object__elf_finish(obj);
+   return obj;
+out:
+   bpf_object__close(obj);
+   return NULL;
+}
+
+void bpf_object__close(struct bpf_object *obj)
+{
+   if (!obj)
+

[RFC PATCH v4 25/29] perf tools: Parse probe points of eBPF programs during preparation

2015-05-26 Thread Wang Nan

This patch parses the section name of each program and creates the
corresponding 'struct perf_probe_event' structure.

parse_perf_probe_command() is used to do the main parsing work.
The parsing result is stored in a global array. This is because
add_perf_probe_events() is not reentrant. In a following patch,
add_perf_probe_events() will be introduced to insert kprobes. It accepts
an array of 'struct perf_probe_event' and does all the work in one call.

Define PERF_BPF_PROBE_GROUP as "perf_bpf_probe", which will be used
as the group name of all eBPF probing points.

This patch uses bpf_program__set_private() to bind a perf_probe_event
to a bpf program through the private field.

Signed-off-by: Wang Nan 
---
 tools/perf/util/bpf-loader.c | 123 ++-
 tools/perf/util/bpf-loader.h |   2 +
 2 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index eef15f3..1022729 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -9,6 +9,8 @@
 #include "perf.h"
 #include "debug.h"
 #include "bpf-loader.h"
+#include "probe-event.h"
+#include "probe-finder.h"
 
 #define DEFINE_PRINT_FN(name, level) \
 static int libbpf_##name(const char *fmt, ...) \
@@ -28,9 +30,120 @@ DEFINE_PRINT_FN(debug, 1)
 
 static bool libbpf_initialized = false;
 
+static struct perf_probe_event probe_event_array[MAX_PROBES];
+static size_t nr_probe_events;
+
+static struct perf_probe_event *
+alloc_perf_probe_event(void)
+{
+   struct perf_probe_event *pev;
+   int n = nr_probe_events;
+   if (n >= MAX_PROBES) {
+   pr_err("bpf: too many events, increase MAX_PROBES\n");
+   return NULL;
+   }
+
+   nr_probe_events = n + 1;
+   pev = &probe_event_array[n];
+   bzero(pev, sizeof(*pev));
+   return pev;
+}
+
+struct bpf_prog_priv {
+   struct perf_probe_event *pev;
+};
+
+static void
+bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
+ void *_priv)
+{
+   struct bpf_prog_priv *priv = _priv;
+   if (priv->pev)
+   clear_perf_probe_event(priv->pev);
+   free(priv);
+}
+
+static int
+config_bpf_program(struct bpf_program *prog)
+{
+   struct perf_probe_event *pev = alloc_perf_probe_event();
+   struct bpf_prog_priv *priv = NULL;
+   const char *config_str;
+   int err;
+
+   /* pr_err has been done by alloc_perf_probe_event */
+   if (!pev)
+   return -ENOMEM;
+
+   err = bpf_program__get_title(prog, &config_str, false);
+   if (err || !config_str) {
+   pr_err("bpf: unable to get title for program\n");
+   return -EINVAL;
+   }
+
+   pr_debug("bpf: config program '%s'\n", config_str);
+   err = parse_perf_probe_command(config_str, pev);
+   if (err < 0) {
+   pr_err("bpf: '%s' is not a valid config string\n",
+  config_str);
+   /* parse failed, don't need clear pev. */
+   return -EINVAL;
+   }
+
+   if (pev->group && strcmp(pev->group, PERF_BPF_PROBE_GROUP)) {
+   pr_err("bpf: '%s': group for event is set and not '%s'.\n",
+  config_str, PERF_BPF_PROBE_GROUP);
+   err = -EINVAL;
+   goto errout;
+   } else if (!pev->group)
+   pev->group = strdup(PERF_BPF_PROBE_GROUP);
+
+   if (!pev->group) {
+   pr_err("bpf: strdup failed\n");
+   err = -ENOMEM;
+   goto errout;
+   }
+
+   if (!pev->event) {
+   pr_err("bpf: '%s': event name is missing\n",
+  config_str);
+   err = -EINVAL;
+   goto errout;
+   }
+
+   pr_debug("bpf: config '%s' is ok\n", config_str);
+
+   priv = calloc(1, sizeof(*priv));
+   if (!priv) {
+   pr_err("bpf: failed to alloc memory\n");
+   err = -ENOMEM;
+   goto errout;
+   }
+
+   priv->pev = pev;
+   
+   err = bpf_program__set_private(prog, priv,
+  bpf_prog_priv__clear);
+   if (err) {
+   pr_err("bpf: set program private failed\n");
+   err = -ENOMEM;
+   goto errout;
+   }
+   return 0;
+
+errout:
+   if (pev)
+   clear_perf_probe_event(pev);
+   if (priv)
+   free(priv);
+   return err;
+}
+
 int bpf__prepare_load(const char *filename)
 {
struct bpf_object *obj;
+   struct bpf_program *prog;
+   int err = 0;
 
if (!libbpf_initialized)
libbpf_set_print(libbpf_warning,
@@ -43,12 +156,20 @@ int bpf__prepare_load(const char *filename)
return -EINVAL;
}
 
+   bpf_object__for_each_program(prog, obj) {
+   err = config_bpf_program(prog);
+   if (err)
+   goto errout;
+   }
+
/*
 * Throw

[RFC PATCH v4 08/29] bpf tools: Iterate over ELF sections to collect information

2015-05-26 Thread Wang Nan

bpf_object__elf_collect() is introduced to iterate over each ELF section
and collect information from eBPF object files. This function will be
further enhanced to collect license, kernel version, programs, configs
and map information.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index dbbea0c..16e47a3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -197,6 +197,57 @@ mismatch:
return -EINVAL;
 }
 
+static int bpf_object__elf_collect(struct bpf_object *obj)
+{
+   Elf *elf = obj->efile.elf;
+   GElf_Ehdr *ep = &obj->efile.ehdr;
+   Elf_Scn *scn = NULL;
+   int idx = 0, err = 0;
+
+   /* Elf is corrupted/truncated, avoid calling elf_strptr. */
+   if (!elf_rawdata(elf_getscn(elf, ep->e_shstrndx), NULL)) {
+   pr_warning("failed to get e_shstrndx from %s\n",
+  obj->path);
+   return -EINVAL;
+   }
+
+   while ((scn = elf_nextscn(elf, scn)) != NULL) {
+   char *name;
+   GElf_Shdr sh;
+   Elf_Data *data;
+
+   idx++;
+   if (gelf_getshdr(scn, &sh) != &sh) {
+   pr_warning("failed to get section header from %s\n",
+  obj->path);
+   err = -EINVAL;
+   goto out;
+   }
+
+   name = elf_strptr(elf, ep->e_shstrndx, sh.sh_name);
+   if (!name) {
+   pr_warning("failed to get section name from %s\n",
+  obj->path);
+   err = -EINVAL;
+   goto out;
+   }
+
+   data = elf_getdata(scn, 0);
+   if (!data) {
+   pr_warning("failed to get section data from %s(%s)\n",
+  name, obj->path);
+   err = -EINVAL;
+   goto out;
+   }
+   pr_debug("section %s, size %ld, link %d, flags %lx, type=%d\n",
+name, (unsigned long)data->d_size,
+(int)sh.sh_link, (unsigned long)sh.sh_flags,
+(int)sh.sh_type);
+   }
+out:
+   return err;
+}
+
 struct bpf_object *bpf_object__open(const char *path)
 {
struct bpf_object *obj;
@@ -220,6 +271,8 @@ struct bpf_object *bpf_object__open(const char *path)
goto out;
if (bpf_object__check_endianess(obj))
goto out;
+   if (bpf_object__elf_collect(obj))
+   goto out;
 
bpf_object__elf_finish(obj);
return obj;
-- 
1.8.3.4


[RFC PATCH v4 04/29] bpf tools: Introduce 'bpf' library to tools

2015-05-26 Thread Wang Nan

This is the first patch of libbpf. The goal of libbpf is to create a
standard way of accessing eBPF object files. This patch creates the
Makefile and Build files for it, allowing 'make' to build libbpf.a and
libbpf.so and 'make install' to put them into the proper directories.
Most of the Makefile is borrowed from traceevent. Before building,
the Makefile checks for the existence of libelf and refuses to build if it
is not found. Instead of failing immediately when libelf is missing, the
error is raised in a phony target "elfdep". This design ensures that
'make clean' still works even if libelf is not found.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/.gitignore |   2 +
 tools/lib/bpf/Build  |   1 +
 tools/lib/bpf/Makefile   | 190 +++
 tools/lib/bpf/libbpf.c   |  14 
 tools/lib/bpf/libbpf.h   |  11 +++
 5 files changed, 218 insertions(+)
 create mode 100644 tools/lib/bpf/.gitignore
 create mode 100644 tools/lib/bpf/Build
 create mode 100644 tools/lib/bpf/Makefile
 create mode 100644 tools/lib/bpf/libbpf.c
 create mode 100644 tools/lib/bpf/libbpf.h

diff --git a/tools/lib/bpf/.gitignore b/tools/lib/bpf/.gitignore
new file mode 100644
index 000..812aeed
--- /dev/null
+++ b/tools/lib/bpf/.gitignore
@@ -0,0 +1,2 @@
+libbpf_version.h
+FEATURE-DUMP
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
new file mode 100644
index 000..a316484
--- /dev/null
+++ b/tools/lib/bpf/Build
@@ -0,0 +1 @@
+libbpf-y := libbpf.o
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
new file mode 100644
index 000..0715aaa
--- /dev/null
+++ b/tools/lib/bpf/Makefile
@@ -0,0 +1,190 @@
+# Most of this file is copied from tools/lib/traceevent/Makefile
+
+BPF_VERSION = 0
+BPF_PATCHLEVEL = 0
+BPF_EXTRAVERSION = 1
+
+MAKEFLAGS += --no-print-directory
+
+
+# Makefiles suck: This macro sets a default value of $(2) for the
+# variable named by $(1), unless the variable has been set by
+# environment or command line. This is necessary for CC and AR
+# because make sets default values, so the simpler ?= approach
+# won't work as expected.
+define allow-override
+  $(if $(or $(findstring environment,$(origin $(1))),\
+$(findstring command line,$(origin $(1)))),,\
+$(eval $(1) = $(2)))
+endef
+
+# Allow setting CC and AR, or setting CROSS_COMPILE as a prefix.
+$(call allow-override,CC,$(CROSS_COMPILE)gcc)
+$(call allow-override,AR,$(CROSS_COMPILE)ar)
+
+INSTALL = install
+
+# Use DESTDIR for installing into a different root directory.
+# This is useful for building a package. The program will be
+# installed in this directory as if it was the root directory.
+# Then the build tool can move it later.
+DESTDIR ?=
+DESTDIR_SQ = '$(subst ','\'',$(DESTDIR))'
+
+LP64 := $(shell echo __LP64__ | ${CC} ${CFLAGS} -E -x c - | tail -n 1)
+ifeq ($(LP64), 1)
+  libdir_relative = lib64
+else
+  libdir_relative = lib
+endif
+
+prefix ?= /usr/local
+libdir = $(prefix)/$(libdir_relative)
+man_dir = $(prefix)/share/man
+man_dir_SQ = '$(subst ','\'',$(man_dir))'
+
+export man_dir man_dir_SQ INSTALL
+export DESTDIR DESTDIR_SQ
+
+include ../../scripts/Makefile.include
+
+# copy a bit from Linux kbuild
+
+ifeq ("$(origin V)", "command line")
+  VERBOSE = $(V)
+endif
+ifndef VERBOSE
+  VERBOSE = 0
+endif
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+#$(info Determined 'srctree' to be $(srctree))
+endif
+
+FEATURE_DISPLAY = libelf libelf-getphdrnum libelf-mmap
+FEATURE_TESTS = libelf
+include $(srctree)/tools/build/Makefile.feature
+
+export prefix libdir src obj
+
+# Shell quotes
+libdir_SQ = $(subst ','\'',$(libdir))
+libdir_relative_SQ = $(subst ','\'',$(libdir_relative))
+plugin_dir_SQ = $(subst ','\'',$(plugin_dir))
+
+LIB_FILE = libbpf.a libbpf.so
+
+VERSION= $(BPF_VERSION)
+PATCHLEVEL = $(BPF_PATCHLEVEL)
+EXTRAVERSION   = $(BPF_EXTRAVERSION)
+
+OBJ= $@
+N  =
+
+LIBBPF_VERSION = $(BPF_VERSION).$(BPF_PATCHLEVEL).$(BPF_EXTRAVERSION)
+
+INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/arch/$(ARCH)/include/uapi -I$(srctree)/include/uapi
+
+# Set compile option CFLAGS
+ifdef EXTRA_CFLAGS
+  CFLAGS := $(EXTRA_CFLAGS)
+else
+  CFLAGS := -g -Wall
+endif
+
+ifeq ($(feature-libelf-mmap), 1)
+  override CFLAGS += -DHAVE_LIBELF_MMAP_SUPPORT
+endif
+
+ifeq ($(feature-libelf-getphdrnum), 1)
+  override CFLAGS += -DHAVE_ELF_GETPHDRNUM_SUPPORT
+endif
+
+# Append required CFLAGS
+override CFLAGS += $(EXTRA_WARNINGS)
+override CFLAGS += -Werror -Wall
+override CFLAGS += -fPIC
+override CFLAGS += $(INCLUDES)
+
+ifeq ($(VERBOSE),1)
+  Q =
+else
+  Q = @
+endif
+
+# Disable command line variables (CFLAGS) override from top
+# level Makefile (perf), otherwise build Makefile will get
+# the same command line setup.
+MAKEOVERRIDES=
+
+export srctree OUTPUT CC LD CFLAGS V
+build := -f $(srctree)/tools/build/Makefile.build dir=. obj
+
+BPF_IN:=

[RFC PATCH v4 29/29] perf tools: Attach eBPF program to perf event

2015-05-26 Thread Wang Nan

In this patch PERF_EVENT_IOC_SET_BPF ioctl is used to attach eBPF
program to a newly created perf event. The file descriptor of the
eBPF program is passed to perf record using previous patches, and
stored into evsel->bpf_fd.

It is possible that several perf events are created for one kprobe
event, one for each CPU. In this case the ioctl may fail with EEXIST.
This patch doesn't treat that as an error.
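
For readers following along, the EEXIST handling described above can be
sketched in isolation. This is only an illustration, not code from the
patch: attach_fn, attach_ignoring_eexist() and the fake_attach_*() helpers
are hypothetical names, and the callback stands in for the real ioctl().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical attach callback standing in for ioctl():
 * returns 0 on success, -1 with errno set on failure.
 */
typedef int (*attach_fn)(int event_fd, int prog_fd);

/*
 * Attach prog_fd to event_fd. EEXIST is treated as success because the
 * program may already be attached through another event of the same
 * kprobe. Any other failure is reported as -EINVAL.
 */
static int attach_ignoring_eexist(attach_fn attach, int event_fd, int prog_fd)
{
	assert(attach != NULL);

	if (attach(event_fd, prog_fd) != 0 && errno != EEXIST)
		return -EINVAL;
	return 0;
}

/* Stand-ins used only to exercise the helper (assumptions, not real APIs). */
static int fake_attach_ok(int event_fd, int prog_fd)
{
	(void)event_fd;
	(void)prog_fd;
	return 0;
}

static int fake_attach_exists(int event_fd, int prog_fd)
{
	(void)event_fd;
	(void)prog_fd;
	errno = EEXIST;
	return -1;
}

static int fake_attach_badf(int event_fd, int prog_fd)
{
	(void)event_fd;
	(void)prog_fd;
	errno = EBADF;
	return -1;
}
```

With these stand-ins, the "ok" and "exists" cases both return 0, and only
the EBADF case is turned into -EINVAL.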

Signed-off-by: Wang Nan 
---
 tools/perf/util/evsel.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 04d60a7..3b94c66 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1201,6 +1201,22 @@ retry_open:
  err);
goto try_fallback;
}
+
+   if (evsel->bpf_fd >= 0) {
+   int evt_fd = FD(evsel, cpu, thread);
+   int bpf_fd = evsel->bpf_fd;
+
+   err = ioctl(evt_fd,
+   PERF_EVENT_IOC_SET_BPF,
+   bpf_fd);
+   if (err && errno != EEXIST) {
+   pr_err("failed to attach bpf fd %d: %s\n",
+  bpf_fd, strerror(errno));
+   err = -EINVAL;
+   goto out_close;
+   }
+   }
+
set_rlimit = NO_CHANGE;
 
/*
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC PATCH v4 11/29] bpf tools: Collect symbol table from SHT_SYMTAB section

2015-05-26 Thread Wang Nan

This patch collects the symbol table section (SHT_SYMTAB). This section
is useful when linking ELF maps.

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index da83766..d89fd42 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -95,6 +95,7 @@ struct bpf_object {
int fd;
Elf *elf;
GElf_Ehdr ehdr;
+   Elf_Data *symbols;
} efile;
char path[];
 };
@@ -124,6 +125,7 @@ static void bpf_object__elf_finish(struct bpf_object *obj)
elf_end(obj->efile.elf);
obj->efile.elf = NULL;
}
+   obj->efile.symbols = NULL;
zclose(obj->efile.fd);
 }
 
@@ -309,6 +311,14 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
else if (strcmp(name, "maps") == 0)
err = bpf_object__init_maps(obj, data->d_buf,
data->d_size);
+   else if (sh.sh_type == SHT_SYMTAB) {
+   if (obj->efile.symbols) {
+   pr_warning("bpf: multiple SYMTAB in %s\n",
+  obj->path);
+   err = -EEXIST;
+   } else
+   obj->efile.symbols = data;
+   }
if (err)
goto out;
}
-- 
1.8.3.4


[RFC PATCH v4 18/29] bpf tools: Introduce bpf_load_program() to bpf.c

2015-05-26 Thread Wang Nan

bpf_load_program() can be used to load a bpf program into the kernel. To
make loading faster, it first tries to load without a log buffer, and
tries again with the log buffer only if the first attempt failed.
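
The two-pass idea can be sketched outside libbpf as follows. This is a
minimal illustration under stated assumptions: load_fn is a hypothetical
callback standing in for the BPF_PROG_LOAD syscall, and the demo_*()
loaders exist only to exercise the helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical loader callback: returns a non-negative fd on success
 * or a negative value on failure. When log_buf is non-NULL it may
 * write a diagnostic string into it.
 */
typedef int (*load_fn)(char *log_buf, size_t log_buf_sz);

static int load_with_retry(load_fn load, char *log_buf, size_t log_buf_sz)
{
	int fd;

	assert(load != NULL);

	/* Fast path: no log buffer is passed on the first attempt. */
	fd = load(NULL, 0);
	if (fd >= 0 || !log_buf || !log_buf_sz)
		return fd;

	/* Slow path: repeat the load only to collect diagnostics. */
	log_buf[0] = '\0';
	return load(log_buf, log_buf_sz);
}

/* Demo loaders (assumptions) that count how often they are called. */
static int demo_calls;

static int demo_ok_load(char *log_buf, size_t log_buf_sz)
{
	(void)log_buf;
	(void)log_buf_sz;
	demo_calls++;
	return 7;
}

static int demo_failing_load(char *log_buf, size_t log_buf_sz)
{
	demo_calls++;
	if (log_buf && log_buf_sz >= 4)
		memcpy(log_buf, "bad", 4);
	return -1;
}
```

A successful load costs one call; a failing load with a log buffer costs
two, and a failing load without one is not retried.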

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/bpf.c | 34 ++
 tools/lib/bpf/bpf.h |  7 +++
 2 files changed, 41 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 7481923..d2002da 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -36,6 +36,11 @@
 # error __NR_bpf not defined. libbpf does not support your arch.
 #endif
 
+static __u64 ptr_to_u64(void *ptr)
+{
+   return (__u64) (unsigned long) ptr;
+}
+
 static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
 {
return syscall(__NR_bpf, cmd, attr, size);
@@ -54,3 +59,32 @@ int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
 
return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
 }
+
+int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,
+size_t insns_cnt, char *license,
+u32 kern_version, char *log_buf, size_t log_buf_sz)
+{
+   int fd;
+   union bpf_attr attr;
+
+   bzero(&attr, sizeof(attr));
+   attr.prog_type = type;
+   attr.insn_cnt = (__u32)insns_cnt;
+   attr.insns = ptr_to_u64(insns);
+   attr.license = ptr_to_u64(license);
+   attr.log_buf = ptr_to_u64(NULL);
+   attr.log_size = 0;
+   attr.log_level = 0;
+   attr.kern_version = kern_version;
+
+   fd = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
+   if (fd >= 0 || !log_buf || !log_buf_sz)
+   return fd;
+
+   /* Try again with log */
+   attr.log_buf = ptr_to_u64(log_buf);
+   attr.log_size = log_buf_sz;
+   attr.log_level = 1;
+   log_buf[0] = 0;
+   return sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 28f7942..854b736 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -13,4 +13,11 @@
 int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
   int max_entries);
 
+/* Recommended log buffer size */
+#define BPF_LOG_BUF_SIZE 65536
+int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,
+size_t insns_cnt, char *license,
+u32 kern_version, char *log_buf,
+size_t log_buf_sz);
+
 #endif
-- 
1.8.3.4


[RFC PATCH v4 26/29] perf record: Probe at kprobe points

2015-05-26 Thread Wang Nan

In this patch, kprobe points are created using add_perf_probe_events.
Since all events are already grouped together in an array, calling
add_perf_probe_events() once creates all of them.

Signed-off-by: Wang Nan 
---
 tools/perf/builtin-record.c  | 14 ++
 tools/perf/util/bpf-loader.c | 45 
 tools/perf/util/bpf-loader.h |  2 ++
 3 files changed, 61 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c3efdfb..9297800 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -27,6 +27,7 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/data.h"
+#include "util/bpf-loader.h"
 
 #include 
 #include 
@@ -957,6 +958,18 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
usage_with_options(record_usage, record_options);
}
 
+   /*
+* bpf__probe must be called before symbol__init() because we
+* need init_symbol_maps. If called after symbol__init,
+* symbol_conf.sort_by_name won't take effect.
+*/
+   err = bpf__probe();
+   if (err) {
+   pr_err("Probing at events in BPF object failed.\n");
+   pr_err("Try perf probe -d '*' to remove existing probe events.\n");
+   return err;
+   }
+
symbol__init(NULL);
 
if (symbol_conf.kptr_restrict)
@@ -1011,5 +1024,6 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 out_symbol_exit:
perf_evlist__delete(rec->evlist);
symbol__exit();
+   bpf__unprobe();
return err;
 }
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 1022729..f5d5f3e 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -179,3 +179,48 @@ void bpf__clear(void)
bpf_object__for_each(obj, tmp)
bpf_object__close(obj);
 }
+
+static bool is_probing = false;
+
+int bpf__unprobe(void)
+{
+   struct strlist *dellist;
+   int ret;
+
+   if (!is_probing)
+   return 0;
+
+   dellist = strlist__new(true, PERF_BPF_PROBE_GROUP ":*");
+   if (!dellist) {
+   pr_err("Failed to create dellist when unprobing\n");
+   return -ENOMEM;
+   }
+
+   ret = del_perf_probe_events(dellist);
+   strlist__delete(dellist);
+   if (ret < 0 && is_probing)
+   pr_err("  Error: failed to delete events: %s\n",
+   strerror(-ret));
+   else
+   is_probing = false;
+   return ret < 0 ? ret : 0;
+}
+
+int bpf__probe(void)
+{
+   int err;
+
+   if (nr_probe_events <= 0)
+   return 0;
+
+   err = add_perf_probe_events(probe_event_array,
+   nr_probe_events,
+   MAX_PROBES, 0);
+   /* add_perf_probe_events() returns a negative value on failure */
+   if (err < 0)
+   pr_err("bpf probe: failed to probe events\n");
+   else
+   is_probing = true;
+
+   return err < 0 ? err : 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 43d7b99..1127880 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -8,6 +8,8 @@
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 int bpf__prepare_load(const char *filename);
+int bpf__probe(void);
+int bpf__unprobe(void);
 
 void bpf__clear(void);
 #endif
-- 
1.8.3.4


[RFC PATCH v4 23/29] perf tools: Make perf depend on libbpf

2015-05-26 Thread Wang Nan

By adding libbpf to perf's Makefile, this patch enables perf to build
libbpf as part of its own build if libelf is found and NO_LIBELF is not
set. The newly introduced code is similar to the libapi and libtraceevent
build rules in Makefile.perf.

Signed-off-by: Wang Nan 
---
 tools/perf/Makefile.perf | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index c43a205..45f10da 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -123,6 +123,7 @@ STRIP   = strip
 
 LIB_DIR  = $(srctree)/tools/lib/api/
 TRACE_EVENT_DIR = $(srctree)/tools/lib/traceevent/
+BPF_DIR = $(srctree)/tools/lib/bpf/
 
 # include config/Makefile by default and rule out
 # non-config cases
@@ -158,6 +159,7 @@ strip-libs = $(filter-out -l%,$(1))
 
 ifneq ($(OUTPUT),)
   TE_PATH=$(OUTPUT)
+  BPF_PATH=$(OUTPUT)
 ifneq ($(subdir),)
   LIB_PATH=$(OUTPUT)/../lib/api/
 else
@@ -166,6 +168,7 @@ endif
 else
   TE_PATH=$(TRACE_EVENT_DIR)
   LIB_PATH=$(LIB_DIR)
+  BPF_PATH=$(BPF_DIR)
 endif
 
 LIBTRACEEVENT = $(TE_PATH)libtraceevent.a
@@ -174,6 +177,8 @@ export LIBTRACEEVENT
 LIBAPI = $(LIB_PATH)libapi.a
 export LIBAPI
 
+LIBBPF = $(BPF_PATH)libbpf.a
+
 # python extension build directories
 PYTHON_EXTBUILD := $(OUTPUT)python_ext_build/
 PYTHON_EXTBUILD_LIB := $(PYTHON_EXTBUILD)lib/
@@ -225,6 +230,11 @@ export PERL_PATH
 LIB_FILE=$(OUTPUT)libperf.a
 
 PERFLIBS = $(LIB_FILE) $(LIBAPI) $(LIBTRACEEVENT)
+ifndef NO_LIBELF
+  ifeq ($(feature-libelf), 1)
+PERFLIBS += $(LIBBPF)
+  endif
+endif
 
 # We choose to avoid "if .. else if .. else .. endif endif"
 # because maintaining the nesting to match is a pain.  If
@@ -387,6 +397,13 @@ $(LIBAPI)-clean:
$(call QUIET_CLEAN, libapi)
$(Q)$(MAKE) -C $(LIB_DIR) O=$(OUTPUT) clean >/dev/null
 
+$(LIBBPF): FORCE
+   $(Q)$(MAKE) -C $(BPF_DIR) O=$(OUTPUT) $(OUTPUT)libbpf.a
+
+$(LIBBPF)-clean:
+   $(call QUIET_CLEAN, libbpf)
+   $(Q)$(MAKE) -C $(BPF_DIR) O=$(OUTPUT) clean >/dev/null
+
 help:
@echo 'Perf make targets:'
@echo '  doc- make *all* documentation (see below)'
@@ -525,7 +542,7 @@ config-clean:
$(call QUIET_CLEAN, config)
$(Q)$(MAKE) -C $(srctree)/tools/build/feature/ clean >/dev/null
 
-clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean config-clean
+clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean config-clean
$(call QUIET_CLEAN, core-objs)  $(RM) $(LIB_FILE) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
$(Q)find . -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
$(Q)$(RM) .config-detected
-- 
1.8.3.4


[RFC PATCH v4 27/29] perf record: Load all eBPF object into kernel

2015-05-26 Thread Wang Nan

This patch utilizes bpf_object__load() provided by libbpf to load all
objects into the kernel.
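
The load-everything-or-roll-back pattern used by bpf__load() in this patch
can be sketched on its own. The sketch below is not the libbpf API:
struct demo_object, demo_load(), demo_unload() and load_all() are
hypothetical stand-ins, and an array replaces libbpf's object list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object type; the real struct bpf_object is opaque. */
struct demo_object {
	int loaded;      /* 1 once demo_load() succeeded */
	int should_fail; /* forces demo_load() to fail, for illustration */
};

static int demo_load(struct demo_object *obj)
{
	if (obj->should_fail)
		return -1;
	obj->loaded = 1;
	return 0;
}

static void demo_unload(struct demo_object *obj)
{
	obj->loaded = 0;
}

/*
 * Load every object in order. If any load fails, unload all objects
 * (including the ones that were never loaded) and return the error.
 */
static int load_all(struct demo_object *objs, size_t n)
{
	size_t i, j;
	int err;

	assert(objs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		err = demo_load(&objs[i]);
		if (err) {
			for (j = 0; j < n; j++)
				demo_unload(&objs[j]);
			return err;
		}
	}
	return 0;
}
```

The rollback walks the whole set rather than only the objects loaded so
far, which assumes that unloading an object that was never loaded is
harmless.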

Signed-off-by: Wang Nan 
---
 tools/perf/builtin-record.c  | 12 
 tools/perf/util/bpf-loader.c | 19 +++
 tools/perf/util/bpf-loader.h |  1 +
 3 files changed, 32 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9297800..4b52399 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -970,6 +970,18 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
return err;
}
 
+   /*
+* bpf__probe() also calls symbol__init() if there are probe
+* events in bpf objects, so calling symbol_exit on failure
+* is safe. If there is no probe event, bpf__load() always
+* succeeds.
+*/
+   err = bpf__load();
+   if (err) {
+   pr_err("Loading BPF programs failed\n");
+   goto out_symbol_exit;
+   }
+
symbol__init(NULL);
 
if (symbol_conf.kptr_restrict)
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index f5d5f3e..f9a1ab9 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -224,3 +224,22 @@ int bpf__probe(void)
 
return err < 0 ? err : 0;
 }
+
+int bpf__load(void)
+{
+   struct bpf_object *obj, *tmp;
+   int err = 0;
+
+   bpf_object__for_each(obj, tmp) {
+   err = bpf_object__load(obj);
+   if (err) {
+   pr_err("bpf: load objects failed\n");
+   goto errout;
+   }
+   }
+   return 0;
+errout:
+   bpf_object__for_each(obj, tmp)
+   bpf_object__unload(obj);
+   return err;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 1127880..fcc775d 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -10,6 +10,7 @@
 int bpf__prepare_load(const char *filename);
 int bpf__probe(void);
 int bpf__unprobe(void);
+int bpf__load(void);
 
 void bpf__clear(void);
 #endif
-- 
1.8.3.4


Re: [v4 1/4] arm64: kvm: add a cpu tear-down function

2015-05-26 Thread AKASHI Takahiro


Marc,

Thank you for your reviews:

On 05/26/2015 06:26 PM, Marc Zyngier wrote:

Hi Takahiro,

On 08/05/15 02:18, AKASHI Takahiro wrote:

Cpu must be put back into its initial state, at least, in the
following cases in order to shutdown the system and/or re-initialize cpus
later on:
1) kexec/kdump
2) cpu hotplug (offline)
3) removing kvm as a module

To address those issues in later patches, this patch adds a tear-down
function, kvm_cpu_reset(), that disables D-cache & MMU and restores the
vector table to the initial stub at EL2.

Signed-off-by: AKASHI Takahiro 
---
  arch/arm/kvm/arm.c|   15 +++
  arch/arm/kvm/mmu.c|5 +
  arch/arm64/include/asm/kvm_asm.h  |1 +
  arch/arm64/include/asm/kvm_host.h |   11 +++
  arch/arm64/include/asm/kvm_mmu.h  |7 +++
  arch/arm64/include/asm/virt.h |   11 +++
  arch/arm64/kvm/hyp-init.S |   32 
  arch/arm64/kvm/hyp.S  |   16 +---
  8 files changed, 95 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 07e7eb1..251ab9e 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -897,6 +897,21 @@ static void cpu_init_hyp_mode(void *dummy)
__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
  }

+static void kvm_cpu_reset(void *dummy)


It looks like you can entirely lose the "dummy" parameter. The only
reason some functions have this is when they are used from an IPI call.


OK. I will remove all of them.


+{
+   phys_addr_t boot_pgd_ptr;
+   phys_addr_t phys_idmap_start;
+
+   if (__hyp_get_vectors() == hyp_default_vectors)
+   return;
+
+   boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+   phys_idmap_start = kvm_get_idmap_start();
+   __cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start,
+hyp_default_vectors,
+kvm_virt_to_trampoline(__kvm_hyp_reset));
+}
+
  static int hyp_init_cpu_notify(struct notifier_block *self,
   unsigned long action, void *cpu)
  {
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 3e6859b..3631a37 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1490,6 +1490,11 @@ phys_addr_t kvm_get_idmap_vector(void)
return hyp_idmap_vector;
  }

+phys_addr_t kvm_get_idmap_start(void)
+{
+   return hyp_idmap_start;
+}
+
  int kvm_mmu_init(void)
  {
int err;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4f7310f..f1c16e2 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -118,6 +118,7 @@ struct kvm_vcpu;

  extern char __kvm_hyp_init[];
  extern char __kvm_hyp_init_end[];
+extern char __kvm_hyp_reset[];

  extern char __kvm_hyp_vector[];

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 8ac3c70..6a8da9c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -199,6 +199,8 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
  struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);

  u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start,
+   unsigned long stub_vector_ptr, unsigned long reset_func);
  void force_vm_exit(const cpumask_t *mask);
  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);

@@ -223,6 +225,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 hyp_stack_ptr, vector_ptr);
  }

+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+   phys_addr_t phys_idmap_start,
+   unsigned long stub_vector_ptr,
+   unsigned long reset_func)
+{
+   kvm_call_reset(boot_pgd_ptr, phys_idmap_start, stub_vector_ptr,
+  reset_func);
+}
+
  struct vgic_sr_vectors {
void*save_vgic;
void*restore_vgic;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6458b53..facfd6d 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -96,6 +96,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
  phys_addr_t kvm_mmu_get_httbr(void);
  phys_addr_t kvm_mmu_get_boot_httbr(void);
  phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
  int kvm_mmu_init(void);
  void kvm_clear_hyp_idmap(void);

@@ -305,5 +306,11 @@ static inline void __kvm_flush_dcache_pud(pud_t pud)
  void kvm_set_way_flush(struct kvm_vcpu *vcpu);
  void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled);

+extern char __hyp_idmap_text_start[];


If you're defining it here, then it is worth considering removing the
similar declaration from mmu.c.


Yes. I will remove them from mmu.c.


+#define

[RFC PATCH v4 28/29] perf tools: Add bpf_fd field to evsel and config it

2015-05-26 Thread Wang Nan

This patch adds a bpf_fd field to 'struct evsel' and introduces a method
to configure it. In bpf-loader, a bpf__for_each_program() function is
added, which calls the callback function for each eBPF program with its
group name, event name and file descriptor. In evlist.c,
perf_evlist__add_bpf() is added to add all bpf events to the evlist.
'perf record' calls perf_evlist__add_bpf().

Since bpf-loader.c will not be built if libelf is not found, an empty
bpf__for_each_program() is defined in bpf-loader.h to avoid a compile
error.
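
The callback-driven iteration described above can be sketched in a
self-contained form. Only the callback signature follows the patch;
struct demo_prog, for_each_prog() and sum_fds() are hypothetical, and a
flat array stands in for libbpf's objects and programs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Callback type with the same shape as the one added by the patch. */
typedef int (*prog_iter_cb)(char *group, char *event, int fd, void *arg);

/* Hypothetical flat record standing in for a loaded eBPF program. */
struct demo_prog {
	char *group;
	char *event;
	int fd;
};

/*
 * Call cb once for each program. Iteration stops at the first program
 * without a valid fd (-EINVAL) or at the first callback error, which
 * is passed back to the caller unchanged.
 */
static int for_each_prog(struct demo_prog *progs, size_t n,
			 prog_iter_cb cb, void *arg)
{
	size_t i;
	int err;

	assert(cb != NULL);

	for (i = 0; i < n; i++) {
		if (progs[i].fd < 0)
			return -EINVAL;
		err = cb(progs[i].group, progs[i].event, progs[i].fd, arg);
		if (err)
			return err;
	}
	return 0;
}

/* Demo callback (assumption): sums fds into *arg and rejects fd 99. */
static int sum_fds(char *group, char *event, int fd, void *arg)
{
	(void)group;
	(void)event;
	if (fd == 99)
		return -5;
	*(int *)arg += fd;
	return 0;
}
```

Because the iterator returns on the first error, a caller only sees the
side effects of the callbacks that ran before the failing one.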
---
 tools/perf/builtin-record.c  |  6 ++
 tools/perf/util/bpf-loader.c | 39 +++
 tools/perf/util/bpf-loader.h | 16 
 tools/perf/util/evlist.c | 32 
 tools/perf/util/evlist.h |  1 +
 tools/perf/util/evsel.c  |  1 +
 tools/perf/util/evsel.h  |  1 +
 7 files changed, 96 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4b52399..692cfe5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -982,6 +982,12 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
goto out_symbol_exit;
}
 
+   err = perf_evlist__add_bpf(rec->evlist);
+   if (err < 0) {
+   pr_err("Failed to add events from BPF object(s)\n");
+   goto out_symbol_exit;
+   }
+
symbol__init(NULL);
 
if (symbol_conf.kptr_restrict)
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index f9a1ab9..5d23346 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -243,3 +243,42 @@ errout:
bpf_object__unload(obj);
return err;
 }
+
+int
+bpf__for_each_program(bpf_prog_iter_callback_t func,
+ void *arg)
+{
+   struct bpf_object *obj, *tmp;
+   struct bpf_program *prog;
+   int err;
+
+   bpf_object__for_each(obj, tmp) {
+   bpf_object__for_each_program(prog, obj) {
+   struct bpf_prog_priv *priv;
+   char *group, *event;
+   int fd;
+
+   err = bpf_program__get_private(prog,
+  (void **)&priv);
+   if (err || !priv) {
+   pr_err("bpf: failed to get private field\n");
+   return -EINVAL;
+   }
+   err = bpf_program__get_fd(prog, &fd);
+   if (err || fd < 0) {
+   pr_err("bpf: failed to get file descriptor\n");
+   return -EINVAL;
+   }
+
+   group = priv->pev->group;
+   event = priv->pev->event;
+   err = func(group, event, fd, arg);
+   if (err) {
+   pr_err("bpf: call back failed, stop iterate\n");
+   return err;
+   }
+   }
+   }
+
+   return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index fcc775d..4d5eec2 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -5,6 +5,8 @@
 #ifndef __BPF_LOADER_H
 #define __BPF_LOADER_H
 
+#include  // for __maybe_unused
+
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 int bpf__prepare_load(const char *filename);
@@ -13,4 +15,18 @@ int bpf__unprobe(void);
 int bpf__load(void);
 
 void bpf__clear(void);
+
+typedef int (*bpf_prog_iter_callback_t)(char *group, char *event,
+   int fd, void *arg);
+
+#ifdef HAVE_LIBELF_SUPPORT
+int bpf__for_each_program(bpf_prog_iter_callback_t func, void *arg);
+#else
+static inline int
+bpf__for_each_program(bpf_prog_iter_callback_t func __maybe_unused,
+ void *arg __maybe_unused)
+{
+   return 0;
+}
+#endif
 #endif
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 080be93..2b997d1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -14,6 +14,7 @@
 #include "target.h"
 #include "evlist.h"
 #include "evsel.h"
+#include "bpf-loader.h"
 #include "debug.h"
 #include 
 
@@ -194,6 +195,37 @@ error:
return -ENOMEM;
 }
 
+static int add_bpf_event(char *group, char *event, int fd,
+void *arg)
+{
+   struct perf_evlist *evlist = arg;
+   struct perf_evsel *pos;
+   struct list_head list;
+   int err, idx, entries;
+
+   pr_debug("add bpf event %s:%s and attach bpf program %d\n",
+group, event, fd);
+   INIT_LIST_HEAD(&list);
+   idx = evlist->nr_entries;
+   err = parse_events_add_tracepoint(&list, &idx, group, event);
+   
+   if (err) {
+   pr_err("Failed to add BPF event %s:%s\n",
+  group, event);
+   return err;
+   }
+

[RFC PATCH v4 05/29] bpf tools: Allow caller to set printing function

2015-05-26 Thread Wang Nan

With libbpf_set_print(), users of libbpf can register their own debug,
info and warning printing functions. Libbpf will use those functions to
print messages. If none are provided, the default info and warning
printing functions are fprintf(stderr, ...); the default debug printing
function is NULL.

This API is designed to be used by perf, enabling it to register its own
logging functions so that all logs are uniform, instead of having
separate logging level control.
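
As an illustration of the registration scheme, here is a reduced sketch
with only two levels. It is not the libbpf code: print_fn,
demo_set_print(), demo_warn(), demo_debug() and counting_print() are
hypothetical names, and messages are passed as plain strings to keep the
example short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

typedef int (*print_fn)(const char *format, ...);

/* Default sink: forwards to stderr, like the patch's __base_pr(). */
static int default_print(const char *format, ...)
{
	va_list args;
	int ret;

	va_start(args, format);
	ret = vfprintf(stderr, format, args);
	va_end(args);
	return ret;
}

/* Warnings print by default; debug output is off until registered. */
static print_fn warn_fn = default_print;
static print_fn debug_fn = NULL;

static void demo_set_print(print_fn warn, print_fn debug)
{
	warn_fn = warn;
	debug_fn = debug;
}

/* Each level is a no-op (returning 0) when no function is registered. */
static int demo_warn(const char *msg)
{
	return warn_fn ? warn_fn("demo: %s\n", msg) : 0;
}

static int demo_debug(const char *msg)
{
	return debug_fn ? debug_fn("demo: %s\n", msg) : 0;
}

/* Demo sink (assumption): counts calls instead of printing. */
static int captured;

static int counting_print(const char *format, ...)
{
	(void)format;
	captured++;
	return 0;
}
```

Passing NULL for a level silences it, which is how the debug level
behaves before any function has been registered.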

Acked-by: Alexei Starovoitov 
Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 43 +++
 tools/lib/bpf/libbpf.h |  4 
 2 files changed, 47 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index c08d6bc..49091c3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -7,8 +7,51 @@
  */
 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
 
 #include "libbpf.h"
+
+#define __printf(a, b) __attribute__((format(printf, a, b)))
+
+__printf(1, 2)
+static int __base_pr(const char *format, ...)
+{
+   va_list args;
+   int err;
+
+   va_start(args, format);
+   err = vfprintf(stderr, format, args);
+   va_end(args);
+   return err;
+}
+
+static __printf(1, 2) int (*__pr_warning)(const char *format, ...) =
+   __base_pr;
+static __printf(1, 2) int (*__pr_info)(const char *format, ...) =
+   __base_pr;
+static __printf(1, 2) int (*__pr_debug)(const char *format, ...) =
+   NULL;
+
+#define __pr(func, fmt, ...)   \
+do {   \
+   if ((func)) \
+   (func)("libbpf: " fmt, ##__VA_ARGS__); \
+} while(0)
+
+#define pr_warning(fmt, ...)   __pr(__pr_warning, fmt, ##__VA_ARGS__)
+#define pr_info(fmt, ...)  __pr(__pr_info, fmt, ##__VA_ARGS__)
+#define pr_debug(fmt, ...) __pr(__pr_debug, fmt, ##__VA_ARGS__)
+
+void libbpf_set_print(int (*warn)(const char *format, ...),
+ int (*info)(const char *format, ...),
+ int (*debug)(const char *format, ...))
+{
+   __pr_warning = warn;
+   __pr_info = info;
+   __pr_debug = debug;
+}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index a6f46d9..430b122 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -8,4 +8,8 @@
 #ifndef __BPF_LIBBPF_H
 #define __BPF_LIBBPF_H
 
+void libbpf_set_print(int (*warn)(const char *format, ...),
+ int (*info)(const char *format, ...),
+ int (*debug)(const char *format, ...));
+
 #endif
-- 
1.8.3.4


[RFC PATCH v4 09/29] bpf tools: Collect version and license from ELF sections

2015-05-26 Thread Wang Nan

Expand bpf_object__elf_collect() to collect license and kernel version
information from the eBPF object file. An eBPF object file should have a
section named 'license', which contains a string. It should also have a
section named 'version', which contains a u32 LINUX_VERSION_CODE.

bpf_object__validate() is introduced to validate the object file after it
is loaded. Currently it only checks for the existence of the 'version'
section.
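
The section handling can be sketched without libelf as below. This is a
simplified illustration, not the patch itself: struct demo_object and
the three helpers are hypothetical names, and unlike the patch the
license copy terminates the string explicitly instead of relying on a
zeroed structure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields added to struct bpf_object. */
struct demo_object {
	char license[64];
	uint32_t kern_version;
};

/* Copy at most sizeof(license) - 1 bytes and terminate the string. */
static int init_license(struct demo_object *obj, const void *data, size_t size)
{
	size_t n = sizeof(obj->license) - 1;

	assert(obj != NULL && data != NULL);
	if (size < n)
		n = size;
	memcpy(obj->license, data, n);
	obj->license[n] = '\0';
	return 0;
}

/* The 'version' section must hold at least one 32-bit value. */
static int init_kversion(struct demo_object *obj, const void *data, size_t size)
{
	uint32_t kver;

	assert(obj != NULL && data != NULL);
	if (size < sizeof(kver))
		return -EINVAL;
	/* memcpy avoids assuming the section data is 4-byte aligned. */
	memcpy(&kver, data, sizeof(kver));
	obj->kern_version = kver;
	return 0;
}

/* An object without a kernel version is rejected. */
static int validate(const struct demo_object *obj)
{
	return obj->kern_version == 0 ? -EINVAL : 0;
}
```

Validation uses kern_version == 0 as the "section missing" marker, so it
assumes the object was zero-initialised before the sections were read.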

Signed-off-by: Wang Nan 
---
 tools/lib/bpf/libbpf.c | 52 ++
 1 file changed, 52 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 16e47a3..15525ad 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -81,6 +82,8 @@ void libbpf_set_print(int (*warn)(const char *format, ...),
 #endif
 
 struct bpf_object {
+   char license[64];
+   u32 kern_version;
/*
 * Information when doing elf related work. Only valid if fd
 * is valid.
@@ -197,6 +200,32 @@ mismatch:
return -EINVAL;
 }
 
+static int
+bpf_object__init_license(struct bpf_object *obj,
+void *data, size_t size)
+{
+   memcpy(obj->license, data,
+  min(size, sizeof(obj->license) - 1));
+   pr_debug("license of %s is %s\n", obj->path, obj->license);
+   return 0;
+}
+
+static int
+bpf_object__init_kversion(struct bpf_object *obj,
+ void *data, size_t size)
+{
+   u32 kver;
+   if (size < sizeof(kver)) {
+   pr_warning("invalid kver section in %s\n", obj->path);
+   return -EINVAL;
+   }
+   memcpy(&kver, data, sizeof(kver));
+   obj->kern_version = kver;
+   pr_debug("kernel version of %s is %x\n", obj->path,
+obj->kern_version);
+   return 0;
+}
+
 static int bpf_object__elf_collect(struct bpf_object *obj)
 {
Elf *elf = obj->efile.elf;
@@ -243,11 +272,32 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 name, (unsigned long)data->d_size,
 (int)sh.sh_link, (unsigned long)sh.sh_flags,
 (int)sh.sh_type);
+
+   if (strcmp(name, "license") == 0)
+   err = bpf_object__init_license(obj,
+  data->d_buf,
+  data->d_size);
+   else if (strcmp(name, "version") == 0)
+   err = bpf_object__init_kversion(obj,
+   data->d_buf,
+   data->d_size);
+   if (err)
+   goto out;
}
 out:
return err;
 }
 
+static int bpf_object__validate(struct bpf_object *obj)
+{
+   if (obj->kern_version == 0) {
+   pr_warning("%s doesn't provide kernel version\n",
+  obj->path);
+   return -EINVAL;
+   }
+   return 0;
+}
+
 struct bpf_object *bpf_object__open(const char *path)
 {
struct bpf_object *obj;
@@ -273,6 +323,8 @@ struct bpf_object *bpf_object__open(const char *path)
goto out;
if (bpf_object__elf_collect(obj))
goto out;
+   if (bpf_object__validate(obj))
+   goto out;
 
bpf_object__elf_finish(obj);
return obj;
-- 
1.8.3.4


Re: [PATCH 03/36] mmu_notifier: pass page pointer to mmu_notifier_invalidate_page()

2015-05-26 Thread Aneesh Kumar K.V

j.gli...@gmail.com writes:

> From: Jérôme Glisse 
>
> Listeners of mm events might not have an easy way to get the struct page
> behind an address invalidated with the mmu_notifier_invalidate_page()
> function, as this happens after the cpu page table has been cleared or
> updated. This happens for instance if the listener is storing a dma
> mapping inside its secondary page table. To avoid a complex reverse
> dma mapping lookup, just pass along a pointer to the page being
> invalidated.

.

> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> index ada3ed1..283ad26 100644
> --- a/include/linux/mmu_notifier.h
> +++ b/include/linux/mmu_notifier.h
> @@ -172,6 +172,7 @@ struct mmu_notifier_ops {
>   void (*invalidate_page)(struct mmu_notifier *mn,
>   struct mm_struct *mm,
>   unsigned long address,
> + struct page *page,
>   enum mmu_event event);
>  

How do we handle this w.r.t invalidate_range ? 

-aneesh


RE: [PATCH 1/1] dw_mmc: insmod followed by rmmod will hung for eMMC

2015-05-26 Thread Prabu Thangamuthu

Hi Ulf, Jaehoon,

Thanks for your comments, I will update and send a new Patch.

> On  05/27/2015 7:20 AM, Jaehoon Chung Wrote:
>
> Hi, Pradu.
> 
> Sorry for late.
No Problem.
> I will wait for your next version, then I will check yours.
> To Ulf.
> 
> Thanks for review!
> 
> Best Regards,
> Jaehoon Chung
> 
> On 05/22/2015 10:21 PM, Ulf Hansson wrote:
> > On 18 May 2015 at 16:23, Prabu Thangamuthu wrote:
> >> Removing dw_mmc driver immediately after inserting the dw_mmc driver
> >> is
> >
> > I guess it hangs even if you remove it after a couple of days? :-)
> >
> > Perhaps makes this a bit more clear?
> >
> >> getting hung for eMMC device. Root cause for this issue is,
> >> dw_mci_remove will disable all the interrupts then it will call
> >> dw_mci_cleanup_slot.
> >> dw_mci_cleanup_slot is issuing CMD6 to disable boot partition access
> >> and it's waiting for command complete interrupt. Since INTMASK was
> >> already cleared by dw_mci_remove, command complete interrupt is not
> >> reaching the system. This leads to process hung.
> >
> > /s dw_mci_remove / dw_mci_remove()
> >
> >>
> >> Signed-off-by: Prabu Thangamuthu 
> >
> > This patch looks good overall, but please send a new version with
> > updated changelog. Moreover, please use "mmc: dw_mmc:" as prefix for
> > the commit message header.
> >
> > Kind regards
> > Uffe
> >
> >> ---
> >>  drivers/mmc/host/dw_mmc.c |6 +++---
> >>  1 files changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> >> index 5f5adaf..f0a0aff 100644
> >> --- a/drivers/mmc/host/dw_mmc.c
> >> +++ b/drivers/mmc/host/dw_mmc.c
> >> @@ -2941,15 +2941,15 @@ void dw_mci_remove(struct dw_mci *host)  {
> >> int i;
> >>
> >> -   mci_writel(host, RINTSTS, 0x);
> >> -   mci_writel(host, INTMASK, 0); /* disable all mmc interrupt first */
> >> -
> >> for (i = 0; i < host->num_slots; i++) {
> >> dev_dbg(host->dev, "remove slot %d\n", i);
> >> if (host->slot[i])
> >> dw_mci_cleanup_slot(host->slot[i], i);
> >> }
> >>
> >> +   mci_writel(host, RINTSTS, 0x);
> >> +   mci_writel(host, INTMASK, 0); /* disable all mmc interrupt first */
> >> +
> >> /* disable clock to CIU */
> >> mci_writel(host, CLKENA, 0);
> >> mci_writel(host, CLKSRC, 0);
> >> --
> >> 1.7.6.5
> >

Thanks & Regards,
Prabu Thangamuthu.

RE: [PATCH] MAINTAINERS: update Emulex ocrdma email addresses

2015-05-26 Thread Devesh Sharma

Thanks Laurent,

My earlier mail bounced back from Linux-kernel mailing list, thus
resending.

CC'ing Doug.

Acked-By: Devesh Sharma 


> -----Original Message-----
> From: Laurent Navet [mailto:laurent.na...@gmail.com]
> Sent: Wednesday, May 27, 2015 12:46 AM
> To: a...@linux-foundation.org; gre...@linuxfoundation.org;
> da...@davemloft.net; mche...@osg.samsung.com; a...@arndb.de;
> j...@perches.com; jingooh...@gmail.com; selvin.xav...@avagotech.com;
> devesh.sha...@avagotech.com; mitesh.ah...@avagotech.com
> Cc: linux-kernel@vger.kernel.org; Laurent Navet
> Subject: [PATCH] MAINTAINERS: update Emulex ocrdma email addresses
>
> @emulex.com addresses respond to use @avagotech.com.
>
> Signed-off-by: Laurent Navet 
> ---
>  MAINTAINERS | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f8e0afb..05766f7 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8846,9 +8846,9 @@ S:  Supported
>  F:   drivers/net/ethernet/emulex/benet/
>
>  EMULEX ONECONNECT ROCE DRIVER
> -M:   Selvin Xavier 
> -M:   Devesh Sharma 
> -M:   Mitesh Ahuja 
> +M:   Selvin Xavier 
> +M:   Devesh Sharma 
> +M:   Mitesh Ahuja 
>  L:   linux-r...@vger.kernel.org
>  W:   http://www.emulex.com
>  S:   Supported
> --
> 2.1.4

Re: [PATCH-v2 0/4] target: Eliminate se_port + t10_alua_tg_pt_gp_member

2015-05-26 Thread Nicholas A. Bellinger

On Tue, 2015-05-26 at 14:44 +0200, Bart Van Assche wrote:
> On 05/26/15 08:57, Nicholas A. Bellinger wrote:
> >- Add various rcu_dereference and lockless_dereference RCU notation
> 
> Hello Nic,
> 
> Feedback from an RCU expert (which I'm not) would be appreciated here. 
> But my understanding is that lockless_dereference(p) should be used for 
> a pointer p that has *not* been annotated as an RCU pointer. I think in 
> the for-next branch of the target repository that this macro is used to 
> access RCU-annotated pointers. Is that why sparse complains about how 
> lockless_dereference() is used in the target tree ?
> 

Was curious about this myself..  Thanks for raising the question!

The intention of lockless_dereference() in both this and preceding
series is for __rcu protected pointers that are accessed outside of
rcu_read_lock() protection, and whose lifetime is controlled by a:

  - struct kref
  - struct percpu_ref
  - struct config_group symlink
  - RCU updater path with some manner of mutex or spinlock held

This is supposed to be following Paul's comment in rcupdate.h:

 * Similar to rcu_dereference(), but for situations where the pointed-to
 * object's lifetime is managed by something other than RCU.  That
 * "something other" might be reference counting or simple immortality.

Paul, would you be so kind as to clarify the intention for us?

Thank you,

--nab


Re: [PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-05-26 Thread Aneesh Kumar K.V

j.gli...@gmail.com writes:

> From: Jérôme Glisse 
>
> The mmu_notifier_invalidate_range_start() and 
> mmu_notifier_invalidate_range_end()
> can be considered as forming an "atomic" section for the cpu page table update
> point of view. Between this two function the cpu page table content is 
> unreliable
> for the address range being invalidated.
>
> Current user such as kvm need to know when they can trust the content of the 
> cpu
> page table. This becomes even more important to new users of the mmu_notifier
> api (such as HMM or ODP).

I don't see kvm using the new APIs in this patch. Also, what does HMM use this
for: to protect walking of the mirror page table? I am sure you cover
that in the later patches, but you may want to mention the details here
too.

>
> This patch use a structure define at all call site to invalidate_range_start()
> that is added to a list for the duration of the invalidation. It adds two new
> helpers to allow querying if a range is being invalidated or to wait for a 
> range
> to become valid.
>
> For proper synchronization, user must block new range invalidation from inside
> there invalidate_range_start() callback, before calling the helper functions.
> Otherwise there is no garanty that a new range invalidation will not be added
> after the call to the helper function to query for existing range.
>
> Changed since v1:
>   - Fix a possible deadlock in mmu_notifier_range_wait_valid()
>
> Changed since v2:
>   - Add the range to invalid range list before calling ->range_start().
>   - Del the range from invalid range list after calling ->range_end().
>   - Remove useless list initialization.
>

-aneesh


Re: [PATCH v2 2/2] ALSA: set no sound proc fs for reduced memory footprint

2015-05-26 Thread Sudip Mukherjee

On Tue, May 26, 2015 at 09:13:57PM +0800, Jie Yang wrote:
> Disable sound proc fs, when CONFIG_SND_NO_PROC_FS is selected,
> which can save about 9KB memory size for reducing memory
> footprint purpose.
> ---
missing Signed-off-by.

regards
sudip

Re: [PATCH] ACPI / property: Define a symbol for PRP0001

2015-05-26 Thread Darren Hart

On Fri, May 22, 2015 at 04:24:34AM +0200, Rafael Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> Use a #defined symbol ACPI_DT_NAMESPACE_HID instead of the PRP0001
> string.
> 
> Signed-off-by: Rafael J. Wysocki 

That's a worthy improvement for both legibility and maintenance.

Reviewed-by: Darren Hart 

-- 
Darren Hart
Intel Open Source Technology Center

Re: [Patch v3] apple-gmux: lock iGP IO to protect from vgaarb changes

2015-05-26 Thread Darren Hart

On Tue, May 26, 2015 at 12:10:48PM -0700, Michael Marineau wrote:
> FYI, this actually broke backlight controls on my MBP11,3 because the
> assumption the patch makes that gmux is always loaded before graphics
> drivers didn't hold true. At least for me dracut included the nouveau
> module in the initrd but not gmux, ensuring the ordering was wrong. No
> errors were reporting, and gmux still offered the backlight device, it
> just became inoperable. I worked around this for my kernel by building
> gmux into vmlinuz instead of as a module but that isn't going to in
> more general configs because there is an apple backlight driver which
> cannot be built at all in that configuration.
> 

Thank you for reporting this Michael,

That is tough as nouveau doesn't have an explicit dependency on gmux, so we
could do something like a passive request_module(), but if it isn't in the
initrd image, it would still fail as you describe.

> Is there a way to make the ordering between nouveau and gmux more
> explicit/reliable? Can gmux complain loudly if the ordering is ever
> wrong?

It should print an error if the probe fails due to the IO already being in use
or if it can't be allocated. The disabled IO case is only info level though,
perhaps that should be higher priority. Printing something when failing to probe
seems like a reasonable thing to do.

Michael, which message do you get if you boot with "debug" or "loglevel=6" when
apple-gmux is not built-in?

-- 
Darren Hart
Intel Open Source Technology Center

Re: [PATCH] usb: ulpi: don't register drivers if bus doesn't exist

2015-05-26 Thread Sudip Mukherjee

On Tue, May 26, 2015 at 07:41:18PM -0700, Greg KH wrote:
> On Tue, May 26, 2015 at 10:54:01AM -0700, David Cohen wrote:
> > Hi,
> > 
> > On Mon, May 25, 2015 at 07:00:13PM +0200, Bjørn Mork wrote:
> > > Greg KH  writes:
> > > 
> Don't mess with bus->p.  I can rename it to
> "do_not_touch_this_isnt_for_you" if people think that would make it more
> obvious that a private data structure shouldn't be messed with in any
> way.  Outside of the driver core, you have no knowledge that even if it
> is a pointer, what that means with regards to anything.
Being a newbie I had a newbie kind of doubt that if a module is builtin
and the init fails then what happens to the functions exported by it.
And to test that I created a module:
int abcd(void)
{
	pr_err("test: in abcd\n");
	return 0;
}
EXPORT_SYMBOL(abcd);

static int __init test_init(void)
{
	return -ENOMEM;
}
module_init(test_init);

static void __exit test_exit(void)
{
}
module_exit(test_exit);

Compiled it as builtin, and created another module which calls abcd();
and as expected abcd() executed.

So the same thing can happen here:
if bus_register() in ulpi_init() fails, ulpi_unregister_driver() can still
be executed because the symbol has been exported. You are saying bus->p is
private and must not be used, but you are also saying that using another
variable to keep the status of bus registration is the wrong design.
Then what is the correct way?

regards
sudip

> 
> thanks,
> 
> greg k-h

Re: dell_rbtn - kernel panic at boot...

2015-05-26 Thread Darren Hart

On Tue, May 26, 2015 at 09:09:30PM -0700, Darren Hart wrote:
> On Mon, May 25, 2015 at 04:40:14PM +0200, Pali Rohár wrote:
> > On Sunday 24 May 2015 21:44:32 Darren Hart wrote:
> > > On Sat, May 23, 2015 at 03:05:36AM +0200, Pali Rohár wrote:
> > > > On Saturday 23 May 2015 00:53:16 Dmitry Torokhov wrote:
> > > > > On Thu, May 21, 2015 at 7:06 PM, Valdis Kletnieks
> > > > > 
> > > > >  wrote:
> > > > > > So after I made both config variables =y, the resulting kernel
> > > > > > built, but died a glorious death at boot.
> > > > > 
> > > > > I guess if both are built-in then, according to link order,
> > > > > dell-laptop starts first, before dell-rbtn, and dies in
> > > > > dell_rbtn_notifier_register() in call to
> > > > > > driver_for_each_device(&rbtn_driver.drv, ...) because rbtn_driver has
> > > > > > not been registered yet and is thus half-initialized.
> > > > > 
> > > > > Thanks.
> > > > 
> > > > pr_debug() messages could be useful... but no idea if we can get them.
> > > > 
> > > > Is there any way to fix that dependency race condition? Could 
> > > > driver_attach() function call help?
> > > 
> > > I believe you can avoid this by moving dell-rbtn earlier in the Makefile 
> > > than
> > > dell-laptop - but this is fragile and a hack to resolve a dependency 
> > > problem.
> > > 
> > 
> > And what about that late_initcall() instead module_init() in dell-laptop?
> > Will it fix this problem?
> > 
> 
> No, because late_initcall() for modules is module_init(). See
> include/linux/init.h.

Apologies, in this context we're concerned about built-in, not module.

This might function as desired.

module_init() is defined as device_initcall() (level 6) for built-in code;
late_initcall() is level 7.

There is precedent for this under drivers/ - although not in anything that stood
out to me as a good exemplar. See:

b233020 Input: gpio_keys - move to late_initcall

for an example with a similar purpose.
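
A rough userspace sketch of that ordering, for illustration only. It models
the level numbers above and the Makefile link order as plain integers; the
struct and function names are made up for this example and are not kernel
APIs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; not the kernel's real initcall machinery. */
struct initcall {
	const char *name;
	int level;      /* 6 = device_initcall, 7 = late_initcall */
	int link_order; /* position in the Makefile, breaks ties within a level */
};

static int cmp_initcall(const void *a, const void *b)
{
	const struct initcall *x = a, *y = b;

	if (x->level != y->level)
		return x->level - y->level;
	return x->link_order - y->link_order;
}

/* Sort the calls into the order a built-in kernel would run them. */
static void order_initcalls(struct initcall *calls, size_t n)
{
	qsort(calls, n, sizeof(*calls), cmp_initcall);
}
```

With dell-rbtn left at level 6 and dell-laptop moved to level 7, dell-rbtn
sorts first regardless of where each sits in the Makefile.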

-- 
Darren Hart
Intel Open Source Technology Center

Re: [RFC PATCH 2/2] arm64: Implement vmalloc based thread_info allocator

2015-05-26 Thread Minchan Kim

On Tue, May 26, 2015 at 09:10:11PM +0900, Jungseok Lee wrote:
> On May 25, 2015, at 11:58 PM, Minchan Kim wrote:
> > On Mon, May 25, 2015 at 07:01:33PM +0900, Jungseok Lee wrote:
> >> On May 25, 2015, at 2:49 AM, Arnd Bergmann wrote:
> >>> On Monday 25 May 2015 01:02:20 Jungseok Lee wrote:
> >>>> Fork-routine sometimes fails to get a physically contiguous region for
> >>>> thread_info on 4KB page system although free memory is enough. That is,
> >>>> a physically contiguous region, which is currently 16KB, is not available
> >>>> since system memory is fragmented.
> >>>>
> >>>> This patch tries to solve the problem as allocating thread_info memory
> >>>> from vmalloc space, not 1:1 mapping one. The downside is one additional
> >>>> page allocation in case of vmalloc. However, vmalloc space is large enough,
> >>>> around 240GB, under a combination of 39-bit VA and 4KB page. Thus, it is
> >>>> not a big tradeoff for fork-routine service.
> >>> 
> >>> vmalloc has a rather large runtime cost. I'd argue that failing to 
> >>> allocate
> >>> thread_info structures means something has gone very wrong.
> >> 
> >> That is why the feature is marked "N" by default.
> >> I focused on fork-routine stability rather than performance.
> > 
> > If VM has trouble with order-2 allocation, your system would be
> > trouble soon although fork at the moment manages to be successful
> > because such small high-order(ex, order <= PAGE_ALLOC_COSTLY_ORDER)
> > allocation is common in the kernel so VM should handle it smoothly.
> > If VM didn't, it means we should fix VM itself, not a specific
> > allocation site. Fork is one of victim by that.
> 
> A problem I observed is an user space, not a kernel side. As user applications
> fail to create threads in order to distribute their jobs properly, they are 
> getting
> in trouble slowly and then gone.
> 
> Yes, fork is one of victim, but damages user applications seriously.
> At this snapshot, free memory is enough.

Yes, that is what you found.

*Free memory is enough, so why did forking fail?*

You should find the exact reason for it rather than papering over it by
hiding the fork failure.

1. Investigate the movable/unmovable page ratio at that moment
2. Investigate why compaction doesn't work
3. Investigate why reclaim couldn't make an order-2 page


> 
> >> Could you give me an idea how to evaluate performance degradation?
> >> Running some benchmarks would be helpful, but I would like to try to
> >> gather data based on meaningful methodology.
> >> 
> >>> Can you describe the scenario that leads to fragmentation this bad?
> >> 
> >> Android, but I could not describe an exact reproduction procedure step
> >> by step since it's behaved and reproduced randomly. As reading the 
> >> following
> >> thread from mm mailing list, a similar symptom is observed on other 
> >> systems. 
> >> 
> >> https://lkml.org/lkml/2015/4/28/59
> >> 
> >> Although I do not know the details of a system mentioned in the thread,
> >> even order-2 page allocation is not smoothly operated due to fragmentation 
> >> on
> >> low memory system.
> > 
> > What Joonsoo have tackle is generic fragmentation problem, not *a* fork 
> > fail,
> > which is more right approach to handle small high-order allocation problem.
> 
> I totally agree with that point. One of the best ways is to figure out a 
> generic
> anti-fragmentation with VM system improvement. Reducing the stack size to 8KB 
> is also
> a really great approach. My intention is not to overlook them or figure out a 
> workaround.
> 
> IMHO, vmalloc would be a different option in case of ARM64 on low memory 
> systems since
> *fork failure from fragmentation* is a nontrivial issue.
> 
> Do you think the patch set doesn't need to be considered?

I don't know, because the changelog doesn't have a full description
of your problem. You just wrote "forking failed, so we want
to avoid that with vmalloc because forking is important".
It seems to me it is just a band-aid.

What you should provide as the description is

" Forking failed although there were lots of free pages,
  so I investigated it and found the root cause somewhere,
  so this patch fixes the problem"

Thanks.


> 
> Best Regards
> Jungseok Lee

-- 
Kind regards,
Minchan Kim

[v8 0/9] Add VT-d Posted-Interrupts support - IOMMU part

2015-05-26 Thread Feng Wu

VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.

You can find the VT-d Posted-Interrupts Spec. at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This series was part of http://thread.gmane.org/gmane.linux.kernel.iommu/7708. 
To make things clear, send out IOMMU part here.

This patch-set is based on the latest x86/apic branch of tip tree.

The whole series, which contains multiple components, is divided into three parts:
- Prerequisite changes to irq subsystem (already merged in tip tree x86/apic branch)
- IOMMU part (in this series)
- KVM and VFIO parts (will send out this part once the first two parts are accepted)

v7->v8:
* Save the irq mode (posted or remapped) of an IRTE in struct irq_2_iommu.
* Use this new mode to decide whether to update the hardware when
modifying irte in intel_ir_set_affinity().

v6->v7:
* Add a static inline helper function set_irq_posting_cap() to set
the PI capability.
* Add some comments for the new member "ir_data->irte_pi_entry".

v5->v6:
* Extend 'struct irte' for VT-d Posted-Interrupts, combine remapped
and posted mode into one irte structure.

v4->v5:
* Abstract modify_irte() to accept two formats of irte.

v3->v4:
* Change capability to an int flags variable instead of a function call.
* Add hotplug case for VT-d PI.

Feng Wu (8):
  iommu: Add new member capability to struct irq_remap_ops
  iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
  iommu, x86: Save the mode (posted or remapped) of an IRTE
  iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  iommu, x86: Add cap_pi_support() to detect VT-d PI capability
  iommu, x86: Setup Posted-Interrupts capability for Intel iommu
  iommu, x86: define irq_remapping_cap()
  iommu, x86: Properly handler PI for IOMMU hotplug

Thomas Gleixner (1):
  iommu: dmar: Extend struct irte for VT-d Posted-Interrupts

 arch/x86/include/asm/irq_remapping.h | 11 +
 drivers/iommu/intel_irq_remapping.c  | 93 +++-
 drivers/iommu/irq_remapping.c        | 11 +
 drivers/iommu/irq_remapping.h        |  6 +++
 include/linux/dmar.h                 | 70 +--
 include/linux/intel-iommu.h          |  1 +
 6 files changed, 176 insertions(+), 16 deletions(-)

-- 
2.1.0


[v8 3/9] iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip

2015-05-26 Thread Feng Wu

Implement irq_set_vcpu_affinity for intel_ir_chip.

Signed-off-by: Feng Wu 
Reviewed-by: Jiang Liu 
Acked-by: David Woodhouse 
---
 arch/x86/include/asm/irq_remapping.h |  5 
 drivers/iommu/intel_irq_remapping.c  | 46 
 2 files changed, 51 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h 
b/arch/x86/include/asm/irq_remapping.h
index 0953723..202e040 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -57,6 +57,11 @@ static inline struct irq_domain 
*arch_get_ir_parent_domain(void)
return x86_vector_domain;
 }
 
+struct vcpu_data {
+   u64 pi_desc_addr;   /* Physical address of PI Descriptor */
+   u32 vector; /* Guest vector of the interrupt */
+};
+
 #else  /* CONFIG_IRQ_REMAP */
 
 static inline void set_irq_remapping_broken(void) { }
diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index 8fad71c..1955b09 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -42,6 +42,7 @@ struct irq_2_iommu {
 struct intel_ir_data {
struct irq_2_iommu  irq_2_iommu;
struct irte irte_entry;
+   struct irte irte_pi_entry;
union {
struct msi_msg  msi_entry;
};
@@ -1013,10 +1014,55 @@ static void intel_ir_compose_msi_msg(struct irq_data 
*irq_data,
*msg = ir_data->msi_entry;
 }
 
+static int intel_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
+{
+   struct intel_ir_data *ir_data = data->chip_data;
+   struct irte *irte_pi = &ir_data->irte_pi_entry;
+   struct vcpu_data *vcpu_pi_info;
+
+   /* stop posting interrupts, back to remapping mode */
+   if (!vcpu_info) {
+   modify_irte(&ir_data->irq_2_iommu, &ir_data->irte_entry);
+   } else {
+   vcpu_pi_info = (struct vcpu_data *)vcpu_info;
+
+   /*
+* "ir_data->irte_entry" saves the remapped format of IRTE,
+* which being a cached irte is still updated when setting
+* the affinity even when we are in posted mode. So this make
+* it possible to switch back to remapped mode from posted mode,
+* we can just set "ir_data->irte_entry" to hardware for that
+* purpose. Here we store the posted format of IRTE in another
+* new member "ir_data->irte_pi_entry" to not corrupt
+* "ir_data->irte_entry".
+*/
+   memcpy(irte_pi, &ir_data->irte_entry, sizeof(struct irte));
+
+   irte_pi->p_urgent = 0;
+   irte_pi->p_vector = vcpu_pi_info->vector;
+   irte_pi->pda_l = (vcpu_pi_info->pi_desc_addr >>
+(32 - PDA_LOW_BIT)) & ~(-1UL << PDA_LOW_BIT);
+   irte_pi->pda_h = (vcpu_pi_info->pi_desc_addr >> 32) &
+~(-1UL << PDA_HIGH_BIT);
+
+   irte_pi->p_res0 = 0;
+   irte_pi->p_res1 = 0;
+   irte_pi->p_res2 = 0;
+   irte_pi->p_res3 = 0;
+
+   irte_pi->p_pst = 1;
+
+   modify_irte(&ir_data->irq_2_iommu, irte_pi);
+   }
+
+   return 0;
+}
+
 static struct irq_chip intel_ir_chip = {
.irq_ack = ir_ack_apic_edge,
.irq_set_affinity = intel_ir_set_affinity,
.irq_compose_msi_msg = intel_ir_compose_msi_msg,
+   .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity,
 };
 
 static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
-- 
2.1.0
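
For readers following the pda_l/pda_h arithmetic in the patch above, here is
a small userspace sketch of the same split. The values PDA_LOW_BIT = 26 and
PDA_HIGH_BIT = 32 are assumptions (the patch uses the names but does not show
their definitions), the helper names are made up, and ~0ULL stands in for
-1UL so the result does not depend on the width of long.

```c
#include <stdint.h>

/* Assumed values; the patch references these names without defining them. */
#define PDA_LOW_BIT   26
#define PDA_HIGH_BIT  32

/* Bits 6..31 of the descriptor address (the low 6 bits are dropped). */
static uint32_t pda_low(uint64_t pi_desc_addr)
{
	return (uint32_t)((pi_desc_addr >> (32 - PDA_LOW_BIT)) &
			  ~(~0ULL << PDA_LOW_BIT));
}

/* Bits 32..63 of the descriptor address. */
static uint32_t pda_high(uint64_t pi_desc_addr)
{
	return (uint32_t)((pi_desc_addr >> 32) & ~(~0ULL << PDA_HIGH_BIT));
}
```

Under these assumptions, a 64-byte aligned address can be rebuilt as
((u64)pda_h << 32) | ((u64)pda_l << 6), which is a quick way to sanity-check
the shifts and masks.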


Re: dell_rbtn - kernel panic at boot...

2015-05-26 Thread Darren Hart

On Mon, May 25, 2015 at 08:03:42AM +0200, Pali Rohár wrote:
> On Monday 25 May 2015 07:01:21 Matthew Garrett wrote:
> > On Sun, May 24, 2015 at 09:44:32PM -0700, Darren Hart wrote:
> > > Greg, Matthew, I'm tempted to recommend this 434 line driver be
> > > rolled into dell-laptop.c. Any strong opinions?
> > 
> > Mrm. It's slightly conceptually nasty in that one's an ACPI driver
> > and one's calling a Dell custom interface, but I think merging them
> > is probably the last bad answer.
> 
> I think merging does not fix our problem. dell laptop rfkill driver 
> needs to be initialized after dell-rbtn acpi driver register itself.

If they were the same driver, you could control this ordering.

> 
> And dell-laptop and dell-rbtn are two different devices (one dell smbios 
> and one acpi) and it for me it sounds like bad idea too...

We all agree it's a bad idea - the point Matthew and I made was it may be the
"least bad" idea (all the others may be worse).

I'm looking into this, but I don't have an easy answer for you. This one is
going to take some research on your part to get to the right answer.

-- 
Darren Hart
Intel Open Source Technology Center

[v8 5/9] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts

2015-05-26 Thread Feng Wu

We don't need to migrate the irqs for VT-d Posted-Interrupts here.
When 'pst' is set in IRTE, the associated irq will be posted to
guests instead of interrupt remapping. The destination of the
interrupt is set in Posted-Interrupts Descriptor, and the migration
happens during vCPU scheduling.

However, we still update the cached irte here, which can be used
when changing back to remapping mode.

Signed-off-by: Feng Wu 
Reviewed-by: Jiang Liu 
Acked-by: David Woodhouse 
---
 drivers/iommu/intel_irq_remapping.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index 70a1b79..d230edc 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1003,7 +1003,10 @@ intel_ir_set_affinity(struct irq_data *data, const 
struct cpumask *mask,
 */
irte->vector = cfg->vector;
irte->dest_id = IRTE_DEST(cfg->dest_apicid);
-   modify_irte(&ir_data->irq_2_iommu, irte);
+
+   /* Update the hardware only if the interrupt is in remapped mode. */
+   if (ir_data->irq_2_iommu.mode == IRQ_REMAPPING)
+   modify_irte(&ir_data->irq_2_iommu, irte);
 
/*
 * After this point, all the interrupts will start arriving
-- 
2.1.0
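
The behaviour described in the changelog above can be modelled in userspace
roughly as follows. This is only a sketch: the fake_* types, the hw_writes
counter and set_affinity() are invented for illustration, and only the enum
names come from the series.

```c
enum irq_mode { IRQ_REMAPPING, IRQ_POSTING };

struct fake_irte {
	unsigned int vector;
	unsigned int dest_id;
};

struct fake_ir_data {
	enum irq_mode mode;
	struct fake_irte cached;   /* remapped-format copy, always kept current */
	struct fake_irte hardware; /* stand-in for the live table entry */
	int hw_writes;             /* counts simulated modify_irte() calls */
};

/* Always refresh the cached entry; touch "hardware" only in remapped mode. */
static void set_affinity(struct fake_ir_data *d, unsigned int vector,
			 unsigned int dest_id)
{
	d->cached.vector = vector;
	d->cached.dest_id = dest_id;

	if (d->mode == IRQ_REMAPPING) {
		d->hardware = d->cached;
		d->hw_writes++;
	}
}
```

In posted mode the cached entry still tracks the latest affinity, so
switching back to remapped mode only needs the cached copy to be written out.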


[v8 6/9] iommu, x86: Add cap_pi_support() to detect VT-d PI capability

2015-05-26 Thread Feng Wu

Add helper function to detect VT-d Posted-Interrupts capability.

Signed-off-by: Feng Wu 
Reviewed-by: Jiang Liu 
Acked-by: David Woodhouse 
---
 include/linux/intel-iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 0af9b03..0c251be 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -87,6 +87,7 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 /*
  * Decoding Capability Register
  */
+#define cap_pi_support(c)  (((c) >> 59) & 1)
 #define cap_read_drain(c)  (((c) >> 55) & 1)
 #define cap_write_drain(c) (((c) >> 54) & 1)
 #define cap_max_amask_val(c)   (((c) >> 48) & 0x3f)
-- 
2.1.0


[v8 4/9] iommu, x86: Save the mode (posted or remapped) of an IRTE

2015-05-26 Thread Feng Wu

This patch adds a new field in struct irq_2_iommu, which can
capture whether the entry is in posted mode or remapped mode.

Signed-off-by: Feng Wu 
Suggested-by: Thomas Gleixner 
---
 drivers/iommu/intel_irq_remapping.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index 1955b09..70a1b79 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -18,6 +18,11 @@
 
 #include "irq_remapping.h"
 
+enum irq_mode {
+   IRQ_REMAPPING,
+   IRQ_POSTING,
+};
+
 struct ioapic_scope {
struct intel_iommu *iommu;
unsigned int id;
@@ -37,6 +42,7 @@ struct irq_2_iommu {
u16 irte_index;
u16 sub_handle;
u8  irte_mask;
+   enum irq_mode mode;
 };
 
 struct intel_ir_data {
@@ -105,6 +111,7 @@ static int alloc_irte(struct intel_iommu *iommu, int irq,
irq_iommu->irte_index =  index;
irq_iommu->sub_handle = 0;
irq_iommu->irte_mask = mask;
+   irq_iommu->mode = IRQ_REMAPPING;
}
raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
 
@@ -145,6 +152,8 @@ static int modify_irte(struct irq_2_iommu *irq_iommu,
__iommu_flush_cache(iommu, irte, sizeof(*irte));
 
rc = qi_flush_iec(iommu, index, 0);
+
+   irq_iommu->mode = irte->pst ? IRQ_POSTING : IRQ_REMAPPING;
raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
 
return rc;
-- 
2.1.0


[v8 9/9] iommu, x86: Properly handler PI for IOMMU hotplug

2015-05-26 Thread Feng Wu

Return error when inserting a new IOMMU which doesn't support PI
if PI is currently in use.

Signed-off-by: Feng Wu 
---
 drivers/iommu/intel_irq_remapping.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index c21a2a9..1f35d0b 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1363,6 +1363,9 @@ int dmar_ir_hotplug(struct dmar_drhd_unit *dmaru, bool 
insert)
return -EINVAL;
if (!ecap_ir_support(iommu->ecap))
return 0;
+   if (irq_remapping_cap(IRQ_POSTING_CAP) &&
+   !cap_pi_support(iommu->cap))
+   return -EBUSY;
 
if (insert) {
if (!iommu->ir_table)
-- 
2.1.0


[v8 7/9] iommu, x86: Setup Posted-Interrupts capability for Intel iommu

2015-05-26 Thread Feng Wu

Set Posted-Interrupts capability for Intel iommu when IR is enabled,
clear it when IR is disabled.

Signed-off-by: Feng Wu 
---
 drivers/iommu/intel_irq_remapping.c | 30 ++
 drivers/iommu/irq_remapping.c       |  2 ++
 drivers/iommu/irq_remapping.h       |  3 +++
 3 files changed, 35 insertions(+)

diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index d230edc..c21a2a9 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -581,6 +581,26 @@ error:
return -ENODEV;
 }
 
+/*
+ * Set Posted-Interrupts capability.
+ */
+static inline void set_irq_posting_cap(void)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+
+   if (!disable_irq_post) {
+   intel_irq_remap_ops.capability |= 1 << IRQ_POSTING_CAP;
+
+   for_each_iommu(iommu, drhd)
+   if (!cap_pi_support(iommu->cap)) {
+   intel_irq_remap_ops.capability &=
+   ~(1 << IRQ_POSTING_CAP);
+   break;
+   }
+   }
+}
+
 static int __init intel_enable_irq_remapping(void)
 {
struct dmar_drhd_unit *drhd;
@@ -656,6 +676,8 @@ static int __init intel_enable_irq_remapping(void)
 
irq_remapping_enabled = 1;
 
+   set_irq_posting_cap();
+
pr_info("Enabled IRQ remapping in %s mode\n", eim ? "x2apic" : "xapic");
 
return eim ? IRQ_REMAP_X2APIC_MODE : IRQ_REMAP_XAPIC_MODE;
@@ -856,6 +878,12 @@ static void disable_irq_remapping(void)
 
iommu_disable_irq_remapping(iommu);
}
+
+   /*
+* Clear Posted-Interrupts capability.
+*/
+   if (!disable_irq_post)
+   intel_irq_remap_ops.capability &= ~(1 << IRQ_POSTING_CAP);
 }
 
 static int reenable_irq_remapping(int eim)
@@ -883,6 +911,8 @@ static int reenable_irq_remapping(int eim)
if (!setup)
goto error;
 
+   set_irq_posting_cap();
+
return 0;
 
 error:
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index fc78b0d..ed605a9 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -22,6 +22,8 @@ int irq_remap_broken;
 int disable_sourceid_checking;
 int no_x2apic_optout;
 
+int disable_irq_post = 1;
+
 static int disable_irq_remap;
 static struct irq_remap_ops *remap_ops;
 
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index b6ca30d..039c7af 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -34,6 +34,8 @@ extern int disable_sourceid_checking;
 extern int no_x2apic_optout;
 extern int irq_remapping_enabled;
 
+extern int disable_irq_post;
+
 struct irq_remap_ops {
/* The supported capabilities */
int capability;
@@ -69,6 +71,7 @@ extern void ir_ack_apic_edge(struct irq_data *data);
 
 #define irq_remapping_enabled 0
 #define irq_remap_broken  0
+#define disable_irq_post  1
 
 #endif /* CONFIG_IRQ_REMAP */
 
-- 
2.1.0
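
Note on the patch above: the capability bit is set optimistically and then cleared again as soon as one IOMMU lacks Posted-Interrupts support. A minimal user-space sketch of that rule follows. It assumes nothing beyond what the diff shows; model_set_posting_cap() and the pi_support[] array are hypothetical stand-ins for set_irq_posting_cap() and cap_pi_support(iommu->cap), not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit position of the posting capability, as in enum irq_remap_cap. */
enum { IRQ_POSTING_CAP = 0 };

/*
 * Hypothetical model of set_irq_posting_cap(): pi_support[i] stands in
 * for cap_pi_support(iommu->cap) of the i-th IOMMU.
 * Returns the updated capability mask.
 */
static int model_set_posting_cap(int capability, bool disable_irq_post,
                                 const bool *pi_support, size_t n_iommus)
{
    size_t i;

    assert(pi_support != NULL || n_iommus == 0);

    if (disable_irq_post)
        return capability;      /* posting disabled: leave the mask alone */

    capability |= 1 << IRQ_POSTING_CAP;
    for (i = 0; i < n_iommus; i++) {
        if (!pi_support[i]) {
            /* one unit without PI support vetoes the capability */
            capability &= ~(1 << IRQ_POSTING_CAP);
            break;
        }
    }
    return capability;
}
```

Since the patch initializes disable_irq_post to 1, the early return is the path taken until something clears that flag.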

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[v8 2/9] iommu: dmar: Extend struct irte for VT-d Posted-Interrupts

2015-05-26 Thread Feng Wu

From: Thomas Gleixner 

The IRTE (Interrupt Remapping Table Entry) is either an entry for
remapped or for posted interrupts. The hardware distinguishes between
remapped and posted entries by bit 15 in the low 64 bits of the
IRTE. If cleared, the entry is remapped; if set, it is posted.

The entries share some fields, while other fields have different
meanings depending on the posted bit.

Extend struct irte to handle the differences between remap and posted
mode by having three structs in the unions:

- Shared
- Remapped
- Posted

Signed-off-by: Thomas Gleixner 
Signed-off-by: Feng Wu 
---
 include/linux/dmar.h | 70 +---
 1 file changed, 55 insertions(+), 15 deletions(-)

diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index 8473756..0dbcabc 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -185,33 +185,73 @@ static inline int dmar_device_remove(void *handle)
 
 struct irte {
union {
+   /* Shared between remapped and posted mode*/
struct {
-   __u64   present : 1,
-   fpd : 1,
-   dst_mode: 1,
-   redir_hint  : 1,
-   trigger_mode: 1,
-   dlvry_mode  : 3,
-   avail   : 4,
-   __reserved_1: 4,
-   vector  : 8,
-   __reserved_2: 8,
-   dest_id : 32;
+   __u64   present : 1,  /*  0  */
+   fpd : 1,  /*  1  */
+   __res0  : 6,  /*  2 -  7 */
+   avail   : 4,  /*  8 - 11 */
+   __res1  : 3,  /* 12 - 14 */
+   pst : 1,  /* 15  */
+   vector  : 8,  /* 16 - 23 */
+   __res2  : 40; /* 24 - 63 */
+   };
+
+   /* Remapped mode */
+   struct {
+   __u64   r_present   : 1,  /*  0  */
+   r_fpd   : 1,  /*  1  */
+   dst_mode: 1,  /*  2  */
+   redir_hint  : 1,  /*  3  */
+   trigger_mode: 1,  /*  4  */
+   dlvry_mode  : 3,  /*  5 -  7 */
+   r_avail : 4,  /*  8 - 11 */
+   r_res0  : 4,  /* 12 - 15 */
+   r_vector: 8,  /* 16 - 23 */
+   r_res1  : 8,  /* 24 - 31 */
+   dest_id : 32; /* 32 - 63 */
+   };
+
+   /* Posted mode */
+   struct {
+   __u64   p_present   : 1,  /*  0  */
+   p_fpd   : 1,  /*  1  */
+   p_res0  : 6,  /*  2 -  7 */
+   p_avail : 4,  /*  8 - 11 */
+   p_res1  : 2,  /* 12 - 13 */
+   p_urgent: 1,  /* 14  */
+   p_pst   : 1,  /* 15  */
+   p_vector: 8,  /* 16 - 23 */
+   p_res2  : 14, /* 24 - 37 */
+   pda_l   : 26; /* 38 - 63 */
};
__u64 low;
};
 
union {
+   /* Shared between remapped and posted mode*/
struct {
-   __u64   sid : 16,
-   sq  : 2,
-   svt : 2,
-   __reserved_3: 44;
+   __u64   sid : 16,  /* 64 - 79  */
+   sq  : 2,   /* 80 - 81  */
+   svt : 2,   /* 82 - 83  */
+   __res3  : 44;  /* 84 - 127 */
+   };
+
+   /* Posted mode*/
+   struct {
+   __u64   p_sid   : 16,  /* 64 - 79  */
+   p_sq: 2,   /* 80 - 81  */
+   p_svt   : 2,   /* 82 - 83  */
+   p_res3  : 12,  /* 84 - 95  */
+   pda_h   : 32;  /* 96 - 127 */
};
__u64 high;
};
 };
 
+#define PDA_LOW_BIT    26
+#define PDA_HIGH_BIT   32
+

[v8 1/9] iommu: Add new member capability to struct irq_remap_ops

2015-05-26 Thread Feng Wu

This patch adds a new member, capability, to struct irq_remap_ops.
The new member can be used to check whether certain features,
such as VT-d Posted-Interrupts, are supported.

Signed-off-by: Feng Wu 
Reviewed-by: Jiang Liu 
---
 arch/x86/include/asm/irq_remapping.h | 4 
 drivers/iommu/irq_remapping.h| 3 +++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 78974fb..0953723 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -31,6 +31,10 @@ struct irq_alloc_info;
 
 #ifdef CONFIG_IRQ_REMAP
 
+enum irq_remap_cap {
+   IRQ_POSTING_CAP = 0,
+};
+
 extern void set_irq_remapping_broken(void);
 extern int irq_remapping_prepare(void);
 extern int irq_remapping_enable(void);
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 91d5a11..b6ca30d 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -35,6 +35,9 @@ extern int no_x2apic_optout;
 extern int irq_remapping_enabled;
 
 struct irq_remap_ops {
+   /* The supported capabilities */
+   int capability;
+
/* Initializes hardware and makes it ready for remapping interrupts */
int  (*prepare)(void);
 
-- 
2.1.0


[v8 8/9] iommu, x86: define irq_remapping_cap()

2015-05-26 Thread Feng Wu

This patch adds a new interface, irq_remapping_cap(), to detect
whether irq remapping supports new features, such as VT-d
Posted-Interrupts. The function is exported so that KVM
code can check for the feature and use it properly.

Signed-off-by: Feng Wu 
Reviewed-by: Jiang Liu 
---
 arch/x86/include/asm/irq_remapping.h | 2 ++
 drivers/iommu/irq_remapping.c| 9 +
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 202e040..61aa8ad 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -35,6 +35,7 @@ enum irq_remap_cap {
IRQ_POSTING_CAP = 0,
 };
 
+extern bool irq_remapping_cap(enum irq_remap_cap cap);
 extern void set_irq_remapping_broken(void);
 extern int irq_remapping_prepare(void);
 extern int irq_remapping_enable(void);
@@ -64,6 +65,7 @@ struct vcpu_data {
 
 #else  /* CONFIG_IRQ_REMAP */
 
+static bool irq_remapping_cap(enum irq_remap_cap cap) { return 0; }
 static inline void set_irq_remapping_broken(void) { }
 static inline int irq_remapping_prepare(void) { return -ENODEV; }
 static inline int irq_remapping_enable(void) { return -ENODEV; }
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index ed605a9..2d99930 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -81,6 +81,15 @@ void set_irq_remapping_broken(void)
irq_remap_broken = 1;
 }
 
+bool irq_remapping_cap(enum irq_remap_cap cap)
+{
+   if (!remap_ops || disable_irq_post)
+   return 0;
+
+   return (remap_ops->capability & (1 << cap));
+}
+EXPORT_SYMBOL_GPL(irq_remapping_cap);
+
 int __init irq_remapping_prepare(void)
 {
if (disable_irq_remap)
-- 
2.1.0
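
Note on irq_remapping_cap(): the check reduces to "ops registered, posting not disabled, bit set". A minimal sketch of that test, assuming only what the diff shows. struct model_remap_ops and model_remapping_cap() are hypothetical user-space stand-ins; in the kernel, remap_ops and disable_irq_post are globals rather than parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum irq_remap_cap { IRQ_POSTING_CAP = 0 };

/* Trimmed-down stand-in for struct irq_remap_ops. */
struct model_remap_ops {
    int capability;
};

static bool model_remapping_cap(const struct model_remap_ops *ops,
                                bool disable_irq_post,
                                enum irq_remap_cap cap)
{
    /* keep the shift within the value bits of an int */
    assert((int)cap >= 0 && (int)cap < 31);

    if (!ops || disable_irq_post)
        return false;

    /* compare against 0 so the result is a clean true/false */
    return (ops->capability & (1 << cap)) != 0;
}
```

A caller such as KVM would then guard its posted-interrupt setup with a single call using IRQ_POSTING_CAP.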


[PATCH V3 1/3] usb: add function usb_autopm_get_interface_upgrade

2015-05-26 Thread Zhang, Yanmin

From: Zhang Yanmin 

Some USB drivers have a specific requirement: their critical functions
may be called in both atomic and non-atomic context.

In non-atomic context, the driver can wake up the device
by calling pm_runtime_get_sync directly.

In atomic context, the function's caller needs to wake
up the device before the function accesses the device.

The patch adds usb_autopm_get_interface_upgrade, a new function that
supports both cases.

Signed-off-by: Zhang Yanmin 
---
 drivers/usb/core/driver.c | 54 +++
 include/linux/usb.h   |  3 +++
 2 files changed, 57 insertions(+)

diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 818369a..5d6f9ee 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -1684,6 +1684,60 @@ int usb_autopm_get_interface(struct usb_interface *intf)
 EXPORT_SYMBOL_GPL(usb_autopm_get_interface);
 
 /**
+ * usb_autopm_get_interface_upgrade - increment a USB interface's PM-usage
+ * counter
+ * @intf: the usb_interface whose counter should be incremented
+ *
+ * This routine should be called by an interface driver when it wants to
+ * use @intf and needs to guarantee that it is not suspended.  In addition,
+ * the routine prevents @intf from being autosuspended subsequently.  (Note
+ * that this will not prevent suspend events originating in the PM core.)
+ * This prevention will persist until usb_autopm_put_interface() is called
+ * or @intf is unbound.  A typical example would be a character-device
+ * driver when its device file is opened.
+ *
+ * Compared with usb_autopm_get_interface, usb_autopm_get_interface_upgrade
+ * is more careful when resuming the device.
+ * 1) The caller's caller has already resumed the device and holds spinlocks.
+ * usb_autopm_get_interface_upgrade must not call pm_runtime_get_sync;
+ * 2) The caller's caller has not resumed the device.
+ * usb_autopm_get_interface_upgrade has to resume the device before going
+ * ahead.
+ *
+ * @intf's usage counter is incremented to prevent subsequent autosuspends.
+ * However if the autoresume fails then the counter is re-decremented.
+ *
+ * This routine can run only in process context.
+ *
+ * Return: 0 on success.
+ */
+int usb_autopm_get_interface_upgrade(struct usb_interface *intf)
+{
+   int status = 0;
+
+   pm_runtime_get(&intf->dev);
+   if (!pm_runtime_active(&intf->dev)) {
+   /* If not active, next _get_sync wakes device up*/
+   status = pm_runtime_get_sync(&intf->dev);
+   /*
+* If it's active, next _put_sync wouldn't
+* really put it to sleep as the 1st _get
+* keeps the device active.
+*/
+   pm_runtime_put_sync(&intf->dev);
+   if (status < 0)
+   pm_runtime_put(&intf->dev);
+   } else
+   atomic_inc(&intf->pm_usage_cnt);
+   dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
+   __func__, atomic_read(&intf->dev.power.usage_count),
+   status);
+   if (status > 0)
+   status = 0;
+   return status;
+}
+EXPORT_SYMBOL_GPL(usb_autopm_get_interface_upgrade);
+
+/**
  * usb_autopm_get_interface_async - increment a USB interface's PM-usage counter
  * @intf: the usb_interface whose counter should be incremented
  *
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 447fe29..0a8a44a 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -663,6 +663,7 @@ extern void usb_enable_autosuspend(struct usb_device *udev);
 extern void usb_disable_autosuspend(struct usb_device *udev);
 
 extern int usb_autopm_get_interface(struct usb_interface *intf);
+extern int usb_autopm_get_interface_upgrade(struct usb_interface *intf);
 extern void usb_autopm_put_interface(struct usb_interface *intf);
 extern int usb_autopm_get_interface_async(struct usb_interface *intf);
 extern void usb_autopm_put_interface_async(struct usb_interface *intf);
@@ -683,6 +684,8 @@ static inline int usb_disable_autosuspend(struct usb_device *udev)
 
 static inline int usb_autopm_get_interface(struct usb_interface *intf)
 { return 0; }
+static inline int usb_autopm_get_interface_upgrade(struct usb_interface *intf)
+{ return 0; }
 static inline int usb_autopm_get_interface_async(struct usb_interface *intf)
 { return 0; }
 
-- 
1.9.1



[PATCH V3 3/3] n_gsm: wake up ldisc tty before using it

2015-05-26 Thread Zhang, Yanmin

From: Zhang Yanmin 

Wake up ldisc device before calling its driver to access the device.

Signed-off-by: Zhang Yanmin 
---
 drivers/tty/n_gsm.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 2c34c32..f887df6 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -62,6 +62,7 @@
 #include 
 #include 
 #include 
+#include <linux/pm_runtime.h>
 
 static int debug;
 module_param(debug, int, 0600);
@@ -555,6 +556,27 @@ static int gsm_stuff_frame(const u8 *input, u8 *output, int len)
return olen;
 }
 
+static int pm_runtime_get_sync_tty(struct tty_struct *tty)
+{
+   int ret = 0;
+
+   /* Wakeup parent as tty itself doesn't enable runtime */
+   if (tty->dev->parent)
+   ret = pm_runtime_get_sync(tty->dev->parent);
+
+   return ret;
+}
+
+static int pm_runtime_put_tty(struct tty_struct *tty)
+{
+   int ret = 0;
+
+   if (tty->dev->parent)
+   ret = pm_runtime_put(tty->dev->parent);
+
+   return ret;
+}
+
 /**
  * gsm_send-   send a control frame
  * @gsm: our GSM mux
@@ -1511,7 +1533,9 @@ static void gsm_dlci_begin_open(struct gsm_dlci *dlci)
return;
dlci->retries = gsm->n2;
dlci->state = DLCI_OPENING;
+   pm_runtime_get_sync_tty(gsm->tty);
gsm_command(dlci->gsm, dlci->addr, SABM|PF);
+   pm_runtime_put_tty(gsm->tty);
mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
 }
 
@@ -1533,7 +1557,9 @@ static void gsm_dlci_begin_close(struct gsm_dlci *dlci)
return;
dlci->retries = gsm->n2;
dlci->state = DLCI_CLOSING;
+   pm_runtime_get_sync_tty(gsm->tty);
gsm_command(dlci->gsm, dlci->addr, DISC|PF);
+   pm_runtime_put_tty(gsm->tty);
mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
 }
 
@@ -2286,7 +2312,9 @@ static void gsmld_receive_buf(struct tty_struct *tty, const unsigned char *cp,
flags = *f++;
switch (flags) {
case TTY_NORMAL:
+   pm_runtime_get_sync_tty(gsm->tty);
gsm->receive(gsm, *dp);
+   pm_runtime_put_tty(gsm->tty);
break;
case TTY_OVERRUN:
case TTY_BREAK:
@@ -2957,6 +2985,7 @@ static int gsmtty_open(struct tty_struct *tty, struct file *filp)
 {
struct gsm_dlci *dlci = tty->driver_data;
struct tty_port *port = &dlci->port;
+   int ret;
 
port->count++;
tty_port_tty_set(port, tty);
@@ -2968,7 +2997,11 @@ static int gsmtty_open(struct tty_struct *tty, struct file *filp)
/* Start sending off SABM messages */
gsm_dlci_begin_open(dlci);
/* And wait for virtual carrier */
-   return tty_port_block_til_ready(port, tty, filp);
+   pm_runtime_get_sync_tty(dlci->gsm->tty);
+   ret = tty_port_block_til_ready(port, tty, filp);
+   pm_runtime_put_tty(dlci->gsm->tty);
+
+   return ret;
 }
 
 static void gsmtty_close(struct tty_struct *tty, struct file *filp)
@@ -2986,11 +3019,14 @@ static void gsmtty_close(struct tty_struct *tty, struct file *filp)
gsm = dlci->gsm;
if (tty_port_close_start(&dlci->port, tty, filp) == 0)
return;
+   pm_runtime_get_sync_tty(gsm->tty);
gsm_dlci_begin_close(dlci);
if (test_bit(ASYNCB_INITIALIZED, &dlci->port.flags)) {
if (C_HUPCL(tty))
tty_port_lower_dtr_rts(&dlci->port);
}
+   pm_runtime_put_tty(gsm->tty);
+
tty_port_close_end(&dlci->port, tty);
tty_port_tty_set(&dlci->port, NULL);
return;
@@ -3012,10 +3048,12 @@ static int gsmtty_write(struct tty_struct *tty, const unsigned char *buf,
struct gsm_dlci *dlci = tty->driver_data;
if (dlci->state == DLCI_CLOSED)
return -EINVAL;
+   pm_runtime_get_sync_tty(dlci->gsm->tty);
/* Stuff the bytes into the fifo queue */
sent = kfifo_in_locked(dlci->fifo, buf, len, &dlci->lock);
/* Need to kick the channel */
gsm_dlci_data_kick(dlci);
+   pm_runtime_put_tty(dlci->gsm->tty);
return sent;
 }
 
-- 
1.9.1



[PATCH V3 2/3] cdc-acm: call usb_autopm_get_interface_upgrade in acm_tty_write

2015-05-26 Thread Zhang, Yanmin

From: Zhang Yanmin 

An ACM device may be used as the ldisc device by the n_gsm driver.
gsmtty_write and other gsm functions call acm_tty_write
indirectly while they hold spinlocks.

Meanwhile, an application may access the ACM tty device directly.

Here we choose to call usb_autopm_get_interface_upgrade instead of
usb_autopm_get_interface_async so that both scenarios
work.

Signed-off-by: Zhang Yanmin 
---
 drivers/usb/class/cdc-acm.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 5c8f581..6ad85a3 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -689,10 +689,15 @@ static int acm_tty_write(struct tty_struct *tty,
 
dev_vdbg(&acm->data->dev, "%s - count %d\n", __func__, count);
 
+   stat = usb_autopm_get_interface_upgrade(acm->control);
+   if (stat)
+   return stat;
+
spin_lock_irqsave(&acm->write_lock, flags);
wbn = acm_wb_alloc(acm);
if (wbn < 0) {
spin_unlock_irqrestore(&acm->write_lock, flags);
+   usb_autopm_put_interface(acm->control);
return 0;
}
wb = &acm->wb[wbn];
@@ -700,6 +705,7 @@ static int acm_tty_write(struct tty_struct *tty,
if (!acm->dev) {
wb->use = 0;
spin_unlock_irqrestore(&acm->write_lock, flags);
+   usb_autopm_put_interface(acm->control);
return -ENODEV;
}
 
@@ -708,13 +714,6 @@ static int acm_tty_write(struct tty_struct *tty,
memcpy(wb->buf, buf, count);
wb->len = count;
 
-   stat = usb_autopm_get_interface_async(acm->control);
-   if (stat) {
-   wb->use = 0;
-   spin_unlock_irqrestore(&acm->write_lock, flags);
-   return stat;
-   }
-
if (acm->susp_count) {
usb_anchor_urb(wb->urb, &acm->delayed);
spin_unlock_irqrestore(&acm->write_lock, flags);
-- 
1.9.1



[PATCH V3 0/3] cdc-acm: fix incorrect runtime wakeup in acm_tty_write

2015-05-26 Thread Zhang, Yanmin

Resending because V1/V2 had email formatting issues. Sorry for the noise.

I use Thunderbird, which has no simple button to produce LKML-friendly email. :)

V3: Change email config to resend.
Add a space in comment.

 ---

Here is a scenario for cdc-acm usage. An application opens the
n_gsm tty and the cdc-acm tty. The cdc-acm tty connects to an xhci device.
The application configures the cdc-acm tty as the ldisc tty of the n_gsm tty.

n_gsm=>cdc-acm=>xhci driver

acm_tty_write can be called from the n_gsm driver through the ldisc
connection, and from the application when it opens the cdc-acm tty directly.
acm_tty_write wakes up the device by calling usb_autopm_get_interface_async,
which calls pm_runtime_get. However, pm_runtime_get is an asynchronous
wakeup and can return before the device is awake. acm_tty_write
might then access the device while it is off.

The patchset fixes it by:
1) adding a new function, usb_autopm_get_interface_upgrade, to handle
the two call paths above;
2) waking up the device in the n_gsm driver when n_gsm calls the cdc-acm driver.
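
The difference between the asynchronous get and the proposed "upgrade" variant can be illustrated with a toy model. This is only a sketch under stated assumptions: struct toy_dev and the toy_* functions are hypothetical and merely imitate the reference counting of pm_runtime_get(), pm_runtime_get_sync() and a put. They are not the kernel's runtime PM API, and the pm_usage_cnt bookkeeping of the real patch is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device state; the names are illustrative, not kernel structures. */
struct toy_dev {
    int  usage;          /* models dev.power.usage_count */
    bool active;         /* models pm_runtime_active() */
    bool resume_queued;  /* an async resume has been requested */
};

/* Models pm_runtime_get(): takes a reference, only queues the resume. */
static void toy_get_async(struct toy_dev *d)
{
    d->usage++;
    if (!d->active)
        d->resume_queued = true;
}

/* Models pm_runtime_get_sync(): takes a reference, resumes before returning. */
static int toy_get_sync(struct toy_dev *d)
{
    d->usage++;
    d->active = true;
    d->resume_queued = false;
    return 0;
}

/* Models dropping one reference. */
static void toy_put(struct toy_dev *d)
{
    assert(d->usage > 0);
    d->usage--;
}

/* Follows the control flow of usb_autopm_get_interface_upgrade(). */
static int toy_get_upgrade(struct toy_dev *d)
{
    int status = 0;

    toy_get_async(d);
    if (!d->active) {
        status = toy_get_sync(d);   /* may sleep in the real kernel */
        toy_put(d);                 /* drop the extra reference */
        if (status < 0)
            toy_put(d);             /* resume failed: drop the first one too */
    }
    return status > 0 ? 0 : status;
}
```

In this model, a plain toy_get_async() on a suspended device returns with the device still inactive, which is the window the cover letter describes. toy_get_upgrade() leaves exactly one reference held and the device active, whether or not the caller had already resumed it.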


Re: [RFC PATCH 2/2] arm64: Implement vmalloc based thread_info allocator

2015-05-26 Thread Minchan Kim

Hello Jungseok,

On Tue, May 26, 2015 at 08:29:59PM +0900, Jungseok Lee wrote:
> On May 25, 2015, at 11:40 PM, Minchan Kim wrote:
> > Hello Jungseok,
> 
> Hi, Minchan,
> 
> > On Mon, May 25, 2015 at 01:02:20AM +0900, Jungseok Lee wrote:
> >> Fork-routine sometimes fails to get a physically contiguous region for
> >> thread_info on 4KB page system although free memory is enough. That is,
> >> a physically contiguous region, which is currently 16KB, is not available
> >> since system memory is fragmented.
> > 
> > Order less than PAGE_ALLOC_COSTLY_ORDER should not fail in current
> > mm implementation. If you saw the order-2,3 high-order allocation fail
> > maybe your application received SIGKILL by someone. LMK?
> 
> Exactly right. The allocation is failed via the following path.
> 
> if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
>   goto nopage;
> 
> IMHO, a reclaim operation would be not needed in this context if memory is
> allocated from vmalloc space. It means there is no need to traverse shrinker 
> list. 

Using vmalloc just to make fork succeed is a band-aid.

> 
> >> This patch tries to solve the problem as allocating thread_info memory
> >> from vmalloc space, not 1:1 mapping one. The downside is one additional
> >> page allocation in case of vmalloc. However, vmalloc space is large enough,
> > 
> > The size you want to allocate is 16KB in here but additional 4K?
> > It increases 25% memory footprint, which is huge downside.
> 
> I agree with the point, and most people who try to use vmalloc might know the
> number. However, the interpretation of the number depends on the point of view.
> 
> Vmalloc is large enough and not fully utilized in case of ARM64.
> With the considerations, there is a room to do math as follows.
> 
> 4KB / 240GB = 1.5e-8 (4KB page + 3 level combo)
> 
> It would be not a huge downside if fork-routine is not damaged due to 
> fragmentation.

Okay, from the address-space point of view, it wouldn't be a significant problem.
Then let's look at it from the performance point of view.

If we use vmalloc, it needs an additional data structure for vmalloc
management, several additional allocation requests, page table handling
and a TLB flush.

Forking is normally a very frequent operation, so we shouldn't
make it slower or increase its memory consumption without a strong reason.

> 
> However, this is one of reasons to add "RFC" prefix in the patch set. How is
> the additional 4KB interpreted and considered?
> 
> Best Regards
> Jungseok Lee

-- 
Kind regards,
Minchan Kim

Re: dell_rbtn - kernel panic at boot...

2015-05-26 Thread Darren Hart

On Mon, May 25, 2015 at 04:40:14PM +0200, Pali Rohár wrote:
> On Sunday 24 May 2015 21:44:32 Darren Hart wrote:
> > On Sat, May 23, 2015 at 03:05:36AM +0200, Pali Rohár wrote:
> > > On Saturday 23 May 2015 00:53:16 Dmitry Torokhov wrote:
> > > > On Thu, May 21, 2015 at 7:06 PM, Valdis Kletnieks
> > > > 
> > > >  wrote:
> > > > > So after I made both config variables =y, the resulting kernel
> > > > > built, but died a glorious death at boot.
> > > > 
> > > > I guess if both are built-in then, according to link order,
> > > > dell-laptop starts first, before dell-rbtn, and dies in
> > > > dell_rbtn_notifier_register() in call to
> > > > > driver_for_each_device(&rbtn_driver.drv, ...) because rbtn_driver has
> > > > > not been registered yet and is thus half-initialized.
> > > > 
> > > > Thanks.
> > > 
> > > pr_debug() messages could be useful... but no idea if we can get them.
> > > 
> > > Is there any way to fix that dependency race condition? Could 
> > > driver_attach() function call help?
> > 
> > I believe you can avoid this by moving dell-rbtn earlier in the Makefile than
> > dell-laptop - but this is fragile and a hack to resolve a dependency problem.
> > 
> 
> And what about that late_initcall() instead module_init() in dell-laptop?
> Will it fix this problem?
> 

No, because late_initcall() for modules is module_init(). See
include/linux/init.h.
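
For background, a small user-space sketch of how built-in initcalls are ordered: by level first, and by link order within a level. It assumes the usual level numbers from include/linux/init.h (6 for device_initcall, which is what a built-in module_init() uses, and 7 for late_initcall). struct toy_initcall and toy_initcall_order() are hypothetical, and the table order stands in for the Makefile link order, which is itself a guess in the thread above. For loadable modules both macros resolve to module_init(), so no such ordering exists there.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative initcall record; level 6 = device, 7 = late. */
struct toy_initcall {
    const char *name;
    int level;
};

/*
 * Fill order[] with indices into calls[] in the order they would run:
 * ascending level, and table (link) order within one level.
 * Levels must lie in the range 0..7.
 */
static void toy_initcall_order(const struct toy_initcall *calls, size_t n,
                               size_t *order)
{
    size_t out = 0;
    size_t i;
    int level;

    assert(calls != NULL && order != NULL);

    for (level = 0; level <= 7; level++)
        for (i = 0; i < n; i++)
            if (calls[i].level == level)
                order[out++] = i;

    assert(out == n);   /* every entry had a valid level */
}
```

With both drivers at level 6, the one listed first runs first, which matches the failure described earlier in the thread. The sketch makes no claim about which fix is appropriate for the drivers themselves.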

-- 
Darren Hart
Intel Open Source Technology Center

Re: [PATCH 00/11] ARM64 PCI hostbridge init based on ACPI

2015-05-26 Thread Hanjun Guo


On 2015-05-27 08:30, Rafael J. Wysocki wrote:
> On Tuesday, May 26, 2015 08:49:13 PM Hanjun Guo wrote:
>> This patch set is introducing ARM64 PCI hostbridge init based on ACPI,
>> which is based on Jiang Liu's patch set "Consolidate ACPI PCI root common
>> code into ACPI core":
>>
>> https://lkml.org/lkml/2015/5/14/98
>
> I'll be regarding this patchset as an RFC until the one from Jiang Liu goes in.

Yes, please, Jiang Liu's patch set should go in first.

Thanks
Hanjun

Re: [PATCH 10/11] XEN / PCI: Remove the dependence on arch x86 when PCI_MMCONFIG=y

2015-05-26 Thread Hanjun Guo


On 2015-05-26 23:44, Boris Ostrovsky wrote:
> On 05/26/2015 10:54 AM, Tomasz Nowicki wrote:
>> On 26.05.2015 16:00, Boris Ostrovsky wrote:
>>> On 05/26/2015 09:54 AM, Boris Ostrovsky wrote:
>>>> On 05/26/2015 08:49 AM, Hanjun Guo wrote:
>>>>> In drivers/xen/pci.c, there is arch x86 dependent code when
>>>>> CONFIG_PCI_MMCONFIG is enabled. Since CONFIG_PCI_MMCONFIG
>>>>> depends on ACPI, this prevents XEN PCI from running on other
>>>>> architectures using ACPI with PCI_MMCONFIG enabled (such as ARM64).
>>>>>
>>>>> Fortunately, it can be solved in a simple way. In drivers/xen/pci.c,
>>>>> the only x86 dependent code is if ((pci_probe & PCI_PROBE_MMCONF) == 0),
>>>>> and it's defined in asm/pci_x86.h. The code means that
>>>>> if the PCI resource is not probed in the PCI_PROBE_MMCONF way, just
>>>>> ignore the xen mcfg init. Actually this is a duplicate, because
>>>>> if the PCI resource is not probed in the PCI_PROBE_MMCONF way, the
>>>>> pci_mmcfg_list will be empty, and the if (list_empty())
>>>>> after it will do the same job.
>>>>>
>>>>> So just remove the arch related code and the header file. This
>>>>> is no functional change for x86, and also makes xen/pci.c
>>>>> usable for other architectures.
>>>>>
>>>>> Signed-off-by: Hanjun Guo 
>>>>> CC: Konrad Rzeszutek Wilk 
>>>>> CC: Boris Ostrovsky 
>>>>> ---
>>>>>   drivers/xen/pci.c | 6 --
>>>>>   1 file changed, 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
>>>>> index 6785ebb..9a8dbe3 100644
>>>>> --- a/drivers/xen/pci.c
>>>>> +++ b/drivers/xen/pci.c
>>>>> @@ -28,9 +28,6 @@
>>>>>   #include 
>>>>>   #include 
>>>>>   #include "../pci/pci.h"
>>>>> -#ifdef CONFIG_PCI_MMCONFIG
>>>>> -#include <asm/pci_x86.h>
>>>>> -#endif
>>>>>
>>>>>   static bool __read_mostly pci_seg_supported = true;
>>>>>
>>>>> @@ -222,9 +219,6 @@ static int __init xen_mcfg_late(void)
>>>>>   if (!xen_initial_domain())
>>>>>   return 0;
>>>>>
>>>>> -if ((pci_probe & PCI_PROBE_MMCONF) == 0)
>>>>> -return 0;
>>>>> -
>>>>>   if (list_empty(&pci_mmcfg_list))
>>>>>   return 0;
>>>>>
>>>>
>>>> (+Stefano who is Xen ARM maintainer)
>>>>
>>>> This will not build on x86 since pci_mmcfg_list since, for example,
>>>> pci_mmcfg_list is declared in pci_x86.h.
>>>
>>> And now really with Stefano and with parsable first sentence, sorry:
>>>
>>> This will not build on x86 since pci_mmcfg_list, for example, is
>>> declared in pci_x86.h.
>>
>> With this patch set, not any more. Please see preceding patches.
>
> OK, I didn't notice this was part of a series.

Sorry, I didn't cc you all of those patches.

> Then if not having PCI_PROBE_MMCONF bit set is indeed equivalent to
> list_empty(&pci_mmcfg_list), is there any reason for this flag to
> (continue to) exist? (and also for pci_mmcfg_arch_init_failed.)


I think the PCI_PROBE_MMCONF bit is needed for early init of PCI mmconfig in
the x86 arch code. xen_mcfg_late(), however, is called after
acpi_init(), by which time mmconfig is ready for use if it is available
(that is, pci_mmcfg_list is either populated or empty). So "PCI_PROBE_MMCONF
not set" being equivalent to list_empty(&pci_mmcfg_list) does not hold in all
cases, but I think it holds once mmconfig has been initialized.

I think my change log is misleading and needs updating :)

Thanks
Hanjun

[PATCH net-next v6 1/2] pci: Add Cavium PCI vendor id

2015-05-26 Thread Aleksey Makarov

From: Sunil Goutham 

This vendor id will be used by network (vNIC), USB (xHCI),
SATA (AHCI), GPIO, I2C, MMC and maybe other drivers
for ThunderX SoC.

Acked-by: Bjorn Helgaas 
Signed-off-by: Sunil Goutham 
Signed-off-by: Aleksey Makarov 
---
 include/linux/pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 1fa99a3..80bd333 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2324,6 +2324,8 @@
 #define PCI_DEVICE_ID_ALTIMA_AC9100    0x03ea
 #define PCI_DEVICE_ID_ALTIMA_AC1003    0x03eb
 
+#define PCI_VENDOR_ID_CAVIUM   0x177d
+
 #define PCI_VENDOR_ID_BELKIN   0x1799
 #define PCI_DEVICE_ID_BELKIN_F5D7010V7 0x701f
 
-- 
2.4.1


Re: [PATCH] zram: check comp algorithm availability earlier

2015-05-26 Thread Minchan Kim

Hello Sergey,

On Tue, May 26, 2015 at 10:13:37PM +0900, Sergey Senozhatsky wrote:
> Improvement idea by Marcin Jabrzyk.
> 
> comp_algorithm_store() silently accepts any supplied algorithm
> name, because zram performs algorithm availability check later,
> during the device configuration phase in disksize_store() and
> prints
>   "zram: Cannot initialise %s compressing backend"
> to syslog. this error line is somewhat generic and, besides,
> can indicate a failed attempt to allocate compression backend's
> working buffers.
> 
> make algorithm availability check earlier, in comp_algorithm_store(),
> and be more verbose:
> 
>   echo lzz > /sys/block/zram0/comp_algorithm
>   -bash: echo: write error: Invalid argument
> 
> dmesg:
>   zram: Error: unavailable compression algorithm: lzz
> 
> Signed-off-by: Sergey Senozhatsky 
> Reported-by: Marcin Jabrzyk 
> ---
>  drivers/block/zram/zcomp.c| 5 +
>  drivers/block/zram/zcomp.h| 1 +
>  drivers/block/zram/zram_drv.c | 6 ++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index a1a8b8e..e10e2b4 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -320,6 +320,11 @@ void zcomp_destroy(struct zcomp *comp)
>   kfree(comp);
>  }
>  
> +bool zcomp_available_algorithm(const char *comp)
> +{
> + return find_backend(comp) != NULL;
> +}
> +
>  /*
>   * search available compressors for requested algorithm.
>   * allocate new zcomp and initialize it. return compressing
> diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
> index c59d1fc..46e2b9f 100644
> --- a/drivers/block/zram/zcomp.h
> +++ b/drivers/block/zram/zcomp.h
> @@ -51,6 +51,7 @@ struct zcomp {
>  };
>  
>  ssize_t zcomp_available_show(const char *comp, char *buf);
> +bool zcomp_available_algorithm(const char *comp);
>  
>  struct zcomp *zcomp_create(const char *comp, int max_strm);
>  void zcomp_destroy(struct zcomp *comp);
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 28f6e46..e17b73e 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -378,6 +378,12 @@ static ssize_t comp_algorithm_store(struct device *dev,
>   if (sz > 0 && zram->compressor[sz - 1] == '\n')
>   zram->compressor[sz - 1] = 0x00;
>  
> + if (!zcomp_available_algorithm(zram->compressor)) {
> + pr_err("Error: unavailable compression algorithm: %s\n",
> + zram->compressor);
> + len = -EINVAL;
> + }
> +

I'm not against this patch because it's better than the old behavior.
But let's think more about the pr_err part.

If the user tries to set a wrong algorithm name, he sees EINVAL.
Isn't that enough?

I think every sane admin will realize he passed a wrong argument
when he sees -EINVAL.
So I don't think we need to emit pr_err here.

The reason I am paranoid about this is that I really don't want
to argue in the future about whether syslog output is part of the ABI.
If possible, I don't want to depend on pr_xxx.


>   up_write(&zram->init_lock);
>   return len;
>  }
> -- 
> 2.4.1.314.g9532ead
> 

-- 
Kind regards,
Minchan Kim
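
The check under discussion can be illustrated with a small userspace sketch. It is not the zram code: the backend table, the function names and the buffer sizes are assumptions made for the example. Unlike the quoted patch, it validates the name before storing it, and it reports failure only through -EINVAL, which is the behavior argued for above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend table; the real driver keeps its own list. */
static const char *const backends[] = { "lzo", "lz4" };

/* Return true when `name` matches one of the known backends. */
static bool algorithm_available(const char *name)
{
	for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
		if (strcmp(name, backends[i]) == 0)
			return true;
	return false;
}

/*
 * Store a requested algorithm name into `dst`.
 * Unknown or oversized names are rejected with -EINVAL and `dst` is left
 * untouched; no log message is emitted. Returns `len` on success.
 */
static long store_algorithm(char *dst, size_t dst_size,
			    const char *buf, size_t len)
{
	char tmp[32];
	size_t sz = len;

	/* Drop one trailing newline, as written by `echo`. */
	if (sz > 0 && buf[sz - 1] == '\n')
		sz--;
	if (sz >= sizeof(tmp) || sz >= dst_size)
		return -EINVAL;

	memcpy(tmp, buf, sz);
	tmp[sz] = '\0';

	if (!algorithm_available(tmp))
		return -EINVAL;

	memcpy(dst, tmp, sz + 1);
	return (long)len;
}
```

With this shape the caller can tell from the return value alone that the string was not accepted, and the previously configured algorithm is preserved.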

Re: [RFC PATCH 1/2] kernel/fork.c: add a function to calculate page address from thread_info

2015-05-26 Thread KOSAKI Motohiro

On Sun, May 24, 2015 at 12:01 PM, Jungseok Lee  wrote:
> A current implementation assumes thread_info address is always correctly
> calculated via virt_to_page. It restricts a different approach, such as
> thread_info allocation from vmalloc space.
>
> This patch, thus, introduces an independent function to calculate page
> address from thread_info one.
>
> Suggested-by: Sungjinn Chung 
> Signed-off-by: Jungseok Lee 
> Cc: KOSAKI Motohiro 
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  kernel/fork.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

I haven't received patch [2/2] and haven't reviewed the whole series, but
this patch itself looks OK to me.
Acked-by: KOSAKI Motohiro 

Re: [PATCH V2 3/3] n_gsm: wake up ldisc tty before using it

2015-05-26 Thread Zhang, Yanmin

On 2015/5/27 11:02, Greg KH wrote:
> On Wed, May 27, 2015 at 10:50:01AM +0800, Zhang, Yanmin wrote:
>> Wake up ldisc device before calling its driver to access the device.
>>
>> Signed-off-by: Zhang Yanmin 
>>
>> ---
>>
>>  drivers/tty/n_gsm.c | 40 +++-
>>  1 file changed, 39 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
>> index 2c34c32..40671fa 100644
>> --- a/drivers/tty/n_gsm.c
>> +++ b/drivers/tty/n_gsm.c
>> @@ -62,6 +62,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  static int debug;
>>  module_param(debug, int, 0600);
>> @@ -555,6 +556,27 @@ static int gsm_stuff_frame(const u8 *input, u8 *output, 
>> int len)
>>  return olen;
>>  }
>>  
>> +static int pm_runtime_get_sync_tty(struct tty_struct *tty)
>> +{
>> +int ret = 0;
>> +
>> +/*Wakeup parent as tty itself doesn't enable runtime*/
> No spaces in your comment?
I will add it.

>
> Anyway, this is corrupted and can't be applied, please fix up your email
> client and try it again...
I checked the patch with scripts/checkpatch.pl and everything seemed fine.

I use Thunderbird 31.7.0, the latest version; it updates itself
automatically. Perhaps some new config option changed something.

It seems the email client converts some tabs to spaces automatically.

Yanmin


[PATCH] Intel, Dell, Ubuntu - i915/intel_dp.c

2015-05-26 Thread Dragan Stancevic

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

this is not a patch for mainline. It affects the Dell Precision M3800
laptop running Ubuntu with Intel graphics. The problem is that, after
some use, the laptop will no longer suspend: it attempts to suspend,
blanks the screen and comes back up. It's rather finicky, as it doesn't
happen consistently, but often enough that it became mildly annoying.
I tracked it down to a race condition between scheduled work and another
thread. Essentially, the driver would do a double put on a structure if
the race was hit. This patch fixes the problem, and my laptop has been
suspending flawlessly for about a month and a half.

From memory, going back 1.5 months, the race is between
_edp_panel_vdd_on, intel_edp_panel_off and edp_panel_vdd_off_sync.

Kernel and Distro info:
:~/src/Dell-M4800-Hardware/kernel$ uname -r
3.16.7-ckt8
:~/src/Dell-M4800-Hardware/kernel$ cat /etc/issue
Ubuntu 14.10 \n \l

Hope someone else finds the attached patch useful.

Thanks
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQIcBAEBAgAGBQJVZT1AAAoJEFvcFPjWWnki3NAP/Aq+o6HRWgJzjDPoHUnqoeuZ
xU/N4/aSjEMS2qBdOz++86B1nloT4ztKTRlaOJ45BCvCW2SeyHtMfHwyvIVw29hc
bo5+T4nOx0wHtX/iaULowM3++enEARH76SAFUcMsUJEX36nSJywyWSIM19h59L4z
JxjrPUVnbFn5K4FTBJ1GhG/3QDfDoD/i7bH/v+MIJCXSYWeV9S273nG9onMz5NhW
OxRx2avmat23zqwGDDTkLfOLCFVzByxPqbXU/muyo8yM9teHapfrMTOwYtWgIMDb
vFkfLJmDpa2NhPefSacNIi/L23oK4poGK4nw8w/U+VUaGwmConOtJq7tR517sewh
R481WfaZeJzCVt5IJtC076j+1YmtZQri7jdJJlxS1RW/2ZVGDNEWWFCMNo1g+ixl
kht0o2FSqjKjtSpzNuXjoKldpCH+v91ix8JqJSdfkjj4VOljKJh+xnPxOelbdU5a
aKnk6+hg8fc9CL37LHzV/nvIGaOht5VKPGSBZQQ0p7G+Oe169kPpE1gUfgsvzdw7
y0fkEaGV0iyuHMCrcY32dgEf0QrZyCqgmsts/u7q3bWoMSyWyYyE3gu8wQ1VT2Bb
+4sHgIXYjQZpVoEMs2i4gezPHL6oDhuOk6enJYah0dKpp9uefCRXbEiwitbZdnLE
P+l2M/QpOWM+qwvuV7Te
=HAXT
-END PGP SIGNATURE-
--- linux-3.16.0.ubt/drivers/gpu/drm/i915/intel_dp.c	2015-04-16 20:17:21.0 -0700
+++ linux-3.16.0/drivers/gpu/drm/i915/intel_dp.c	2015-04-16 20:29:45.040894002 -0700
@@ -1175,6 +1175,7 @@
 	if (!is_edp(intel_dp))
 		return false;
 
+	cancel_delayed_work(&intel_dp->panel_vdd_work);
 	intel_dp->want_panel_vdd = true;
 
 	if (edp_have_panel_vdd(intel_dp))

Re: [fuse-devel] fuse_get_context() and namespaces

2015-05-26 Thread Eric W. Biederman

Seth Forshee  writes:

> On Tue, May 26, 2015 at 05:21:38PM +0200, Miklos Szeredi wrote:
>> On Fri, May 22, 2015 at 01:59:32PM -0500, Seth Forshee wrote:
>> > On Fri, May 22, 2015 at 12:44:35PM -0500, Eric W. Biederman wrote:
>> > > Seth Forshee  writes:
>> > > 
>> > > > On Fri, May 22, 2015 at 04:23:55PM +0200, Miklos Szeredi wrote:
>> > > >> On Sat, May 2, 2015 at 5:56 PM,   wrote:
>> > > >> >
>> > > >> > 3.10.0-229 form Scientific Linux and native 4.0.1-1 (from elrepo).
>> > > >> > SL 7.1 on the host and SL 6.6 on the LXC guest. At least in 3.10
>> > > >> > the 499dcf2024092e5cce41d05599a5b51d1f92031a is present.
>> > > >> > Steps to reproduce:
>> > > >> >
>> > > >> > On first console:
>> > > >> > [root@sl7test ~]# lxc-start  -n test-2 /bin/su -
>> > > >> > [root@test-2 ~]# diff -u  hello.py 
>> > > >> > /usr/share/doc/fuse-python-0.2.1/example/hello.py
>> > > >> > --- hello.py2015-05-02 11:12:13.963093580 -0400
>> > > >> > +++ /usr/share/doc/fuse-python-0.2.1/example/hello.py   2010-04-14 
>> > > >> > 18:29:21.0 -0400
>> > > >> > @@ -41,8 +41,6 @@
>> > > >> >  class HelloFS(Fuse):
>> > > >> >
>> > > >> >  def getattr(self, path):
>> > > >> > -dic = Fuse.GetContext(self)
>> > > >> > -print dic
>> > > >> >  st = MyStat()
>> > > >> >  if path == '/':
>> > > >> >  st.st_mode = stat.S_IFDIR | 0755
>> > > >> > [root@test-2 ~]# python hello.py -f  /mnt/
>> > > >> >
>> > > >> > On second console:
>> > > >> > [root@test-2 ~]# echo $$
>> > > >> > 41
>> > > >> > [root@test-2 ~]# ls /mnt/
>> > > >> > hello
>> > > >> >
>> > > >> > Output of first console:
>> > > >> > {'gid': 0, 'pid': 12083, 'uid': 0}
>> > > >> 
>> > > >> Thanks.
>> > > >> 
>> > > >> Digging in mailbox...  There was a thread last year about adding
>> > > >> support for running fuse daemon in a container:
>> > > >> 
>> > > >>   http://thread.gmane.org/gmane.linux.kernel/1811658
>> > > >> 
>> > > >> Not sure what happened, but no updated patches have been posted or
>> > > >> maybe I just missed them.
>> > > >
>> > > > I haven't sent updated patches in a while. I still intend to, but I
>> > > > shifted focus to first getting general support for mounts from user
>> > > > namespaces into the vfs (which will give a clearer direction for some 
>> > > > of
>> > > > the concerns raised about the fuse patches).
>> > > >
>> > > > All of this code is available in the userns-mounts branch of
>> > > > git://kernel.ubuntu.com/sforshee/linux.git, and I don't think the fuse
>> > > > patches actually depend on any of the stuff that precedes them. I'm
>> > > > planning to start submitting some of the earlier patches from that
>> > > > branch soon, and eventually get back to resubmitting the fuse patches.
>> > > >
>> > > > This is about pid namespaces though, and the fuse pid namespace patch
>> > > > from that series (see below) should be more or less independent of the
>> > > > rest of the patches. Potentially that could be merged separately from
>> > > > the user namespae stuff.
>> > > 
>> > > [snip]
>> > > 
>> > > > @@ -2076,7 +2077,15 @@ static int convert_fuse_file_lock(const struct 
>> > > > fuse_file_lock *ffl,
>> > > >  
>> > > >fl->fl_start = ffl->start;
>> > > >fl->fl_end = ffl->end;
>> > > > -  fl->fl_pid = ffl->pid;
>> > > > +
>> > > > +  /*
>> > > > +   * Convert pid into the connection's pid namespace. If 
>> > > > the
>> > > > +   * pid does not map into the namespace fl_pid will get 
>> > > > set
>> > > > +   * to 0.
>> > > > +   */
>> > > > +  rcu_read_lock();
>> > > > +  fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));
>> > > > +  rcu_read_unlock();
>> > > 
>> > > Scratches my head.  This looks wrong.
>> > > 
>> > > I would have expected pid_nr_ns.  Am I missing something reading this
>> > > patch quickly?
>> > 
>> > Here we're in the context of a F_GETLK operation. We've requested the
>> > lock information from the fuse process, and ffl->pid is the pid number
>> > in that process's pid namespace so it needs to be translated into
>> > current's namespace. First we have to look up the struct pid, then
>> > pid_vnr is just a wrapper for pid_nr_ns in the current pid namespace:
>> > 
>> >   pid_t pid_vnr(struct pid *pid)
>> >   {
>> >   return pid_nr_ns(pid, task_active_pid_ns(current));
>> >   }
>> > 
>> > Oh, but the comment is wrong, so maybe that's what confused you.
>> > s/connection/caller/ there and it should make more sense.
>> 
>> Attaching updated patch against fuse.git for-next.  Check namespace in both
>> device read and write.  Check them at the start (doesn't matter if requests 
>> are
>> stuck in the queue, if server isn't playing by the rules, then all is lost
>> anyway).
>> 
>> One thing: we return error if current tgid isn't valid in server's namespace.
>> That's looks good.  However we silently succeed and set in.h.pid to

Re: [PATCH] mm/hugetlb: document the reserve map/region tracking routines

2015-05-26 Thread Mike Kravetz


On 05/26/2015 04:09 PM, Andrew Morton wrote:
> On Tue, 26 May 2015 14:27:10 -0700 Mike Kravetz  wrote:
>
>> This is a documentation only patch and does not modify any code.
>> Descriptions of the routines used for reserve map/region tracking
>> are added.
>
> Confused.  This adds comments which are similar to the ones which were
> added by
> mm-hugetlb-compute-return-the-number-of-regions-added-by-region_add-v2.patch
> and
> mm-hugetlb-handle-races-in-alloc_huge_page-and-hugetlb_reserve_pages-v2.patch.
> But the comments are a bit different.  And this patch madly conflicts
> with the two abovementioned patches.
>
> Maybe the thing to do is to start again, with a three-patch series:
>
> mm-hugetlb-document-the-reserve-map-region-tracking-routines.patch
> mm-hugetlb-compute-return-the-number-of-regions-added-by-region_add-v3.patch
> mm-hugetlb-handle-races-in-alloc_huge_page-and-hugetlb_reserve_pages-v3.patch
>
> while resolving the differences in the new code comments?

Sorry for the confusion.  Naoya and Davidlohr suggested changes to
the documentation and code.  One suggestion was to create a separate
documentation only patch.


I will create a new series as you suggest above.

--
Mike Kravetz

Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-26 Thread Mike Galbraith

On Tue, 2015-05-26 at 15:51 -0400, Chris Metcalf wrote:

> On balance I suspect it's still better to make command line arguments
> handle the common cases most succinctly.

I prefer user specifies precisely, but yeah, that entails more typing.  

Idle curiosity: can SGI monster from hell boot a NO_HZ_FULL_ALL kernel,
w/wo it implying isolcpus?  Readers having same and a reactor to power
it in their basement, please test.

-Mike


Re: [PATCH 0/3] GPIO support for BRCMSTB

2015-05-26 Thread Gregory Fong

Hi Linus,

On Wed, May 13, 2015 at 1:59 AM, Linus Walleij  wrote:
> On Tue, May 12, 2015 at 9:38 PM, Gregory Fong  wrote:
>> On Tue, May 12, 2015 at 3:59 AM, Linus Walleij  
>> wrote:
>>> On Wed, May 6, 2015 at 10:37 AM, Gregory Fong  
>>> wrote:
>>>
>>>> There is only one IRQ for each GIO IP block (i.e. several register banks
>>>> share an IRQ).  After briefly looking into the generic IRQ chip
>>>> implementation, it seemed like in this case that using it would result in
>>>> the driver being more complex than necessary because AFAICT it expects a
>>>> 1:1 mapping of irq_chip_generic to gpio_chip.  It seemed like less of a
>>>> pain to have a single irq_chip since we have a single IRQ for all register
>>>> banks (multiple gpio_chips).  I might be missing something, maybe using a
>>>> shared IRQ across multiple irq_chips is easier than I think?  Suggestions
>>>> welcome.
>>>
>>> What is needed is a 1:1 mapping between GPIO offsets and IRQ
>>> offsets.
>>>
>>> If you just number your GPIOs 0...n and your IRQs 0...n
>>> it should work just fine with one irqchip for all banks.
>>>
>>> What screws things up is likely that the hardware supports
>>> 32 lines per bank and not all are used.
>>>
>>> I suggest you enable 32 line and 32 IRQs per bank,
>>> so that hwirq maps nicely 1:1 on the GPIO offsets,
>>> then just use the width thing to NACK operations on
>>> GPIO lines you are not using. This way you can also
>>> decode and warn on spurious IRQs on the unused lines.
>>
>> For having 32 lines per bank, the big problem here is the upper limit
>> of 256 GPIOs.
>
> Which arch is this?
> Usually this limit comes from
> arch/*/include/asm/gpio.h
>
> For ARM that was bumped to 512 a while back. It is also possible
> to define a custom value for your system by defining
> ARCH_NR_GPIOS
>
>> Anyway, I don't think I understand IRQ domains and irq_chip_generic
>> very well.  One possibility _might_ be to use multiple irq_chips.
>
> That is probably not possible if there is just one IRQ for all
> banks.
>
> The task of the irqdomain is a 1-to-1 translation from one
> hardware numberspace to the Linux IRQ number space.
>
> In your case the hardware IRQ (hwirq) numberspace
> should be:
>
> bank0: 0..31
> bank1: 32..63
> 
> bankn: 32*n..32*n+31
>
> I think the gpiolib irqchip code can translate that properly
> as it is just a simple 0...x mapping, the irq handler need
> some magic to loop over all banks from 0..n though.

I've now actually attempted to use the gpiolib irqchip code.  This
driver can't directly use gpiochip_irqchip_add() because of the
multiple gpiochip : one irqchip map.  At first, I thought it might be
possible to simply add a new argument (or break things into a helper
function) to allow setting the associated IRQ domain, but then I can't
use the generic map and unmap functions which expect the irq_domain
host_data member to be struct gpio_chip *, which makes no sense in this
case.  That puts me right back to implementing a special version of
the map and unmap function.

Since there doesn't appear to be any benefit to using the gpiolib
irqchip code for this case, I'm going to stick with my implementation
from patch 3 of this patchset.  I've also added to it to allow for
using the GPIOs as a wakeup source, and will submit that as well with
V2.

Thanks,
Gregory
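
For readers following the numbering discussed above (bank n covers hwirqs 32*n..32*n+31, with one parent IRQ shared by all banks), here is a minimal userspace sketch of the arithmetic and of the "loop over all banks" step. It is not the driver code from the patchset; the function names and the array of pending words are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define LINES_PER_BANK 32u

/* Map (bank, offset) to a flat hwirq: bank n covers 32*n..32*n+31. */
static unsigned int hwirq_from_bank(unsigned int bank, unsigned int offset)
{
	return bank * LINES_PER_BANK + offset;
}

/*
 * Walk the per-bank pending words, as a handler for the single shared
 * parent IRQ would, and record the hwirq of every set bit.
 * Returns the number of hwirqs written to `out`.
 */
static size_t collect_pending(const uint32_t *pending, unsigned int nbanks,
			      unsigned int *out, size_t out_len)
{
	size_t n = 0;

	for (unsigned int bank = 0; bank < nbanks; bank++) {
		uint32_t status = pending[bank];

		for (unsigned int bit = 0;
		     status != 0 && bit < LINES_PER_BANK; bit++) {
			if (!(status & (UINT32_C(1) << bit)))
				continue;
			status &= ~(UINT32_C(1) << bit);
			if (n < out_len)
				out[n++] = hwirq_from_bank(bank, bit);
		}
	}
	return n;
}
```

A real handler would dispatch each hwirq through the IRQ domain instead of collecting numbers in an array; the sketch only shows that a flat numbering works with any number of banks behind one interrupt line.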

Re: [PATCH] scripts:checkpatch - Do not give error if static bool or global bool variables are assigned to false value.

2015-05-26 Thread Joe Perches

On Wed, 2015-05-27 at 08:43 +0530, Shailendra Verma wrote:
> Hello Joe,
> 
> Thanks for the clarification. So I will change the error message as
> suggested by you and will send the patch to you.

Hello Shailendra.

My humble apologies to you.

I totally misunderstood what you were writing.
I guess the first sentence threw me for a loop.

It makes sense to avoid emitting a warning message with

bool foo = false;

cheers, Joe


[PATCH] regulator: s2mps11: Fix GPIO suspend enable shift wrapping bug

2015-05-26 Thread Krzysztof Kozlowski

Status of enabling suspend mode for regulator was stored in bitmap-like
long integer.

However since adding support for S2MPU02 the number of regulators
exceeded 32 so on devices with more than 32 regulators (S2MPU02 and
S2MPS13) overflow happens when shifting the bit. This could lead to
enabling suspend mode for completely different regulator than intended
or to switching different regulator to other mode (e.g. from always
enabled to controlled by PWRHOLD pin). Both cases could result in larger
energy usage and issues when suspending to RAM.

Signed-off-by: Krzysztof Kozlowski 
Cc: 
Reported-by: Dan Carpenter 
Fixes: 00e2573d2c10 ("regulator: s2mps11: Add support S2MPU02 regulator device")
---
 drivers/regulator/s2mps11.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c
index 326ffb553371..72fc3c32db49 100644
--- a/drivers/regulator/s2mps11.c
+++ b/drivers/regulator/s2mps11.c
@@ -34,6 +34,8 @@
 #include 
 #include 
 
+/* The highest number of possible regulators for supported devices. */
+#define S2MPS_REGULATOR_MAX S2MPS13_REGULATOR_MAX
 struct s2mps11_info {
unsigned int rdev_num;
int ramp_delay2;
@@ -49,7 +51,7 @@ struct s2mps11_info {
 * One bit for each S2MPS13/S2MPS14/S2MPU02 regulator whether
 * the suspend mode was enabled.
 */
-   unsigned long long s2mps14_suspend_state:50;
+   DECLARE_BITMAP(suspend_state, S2MPS_REGULATOR_MAX);
 
/* Array of size rdev_num with GPIO-s for external sleep control */
int *ext_control_gpio;
@@ -500,7 +502,7 @@ static int s2mps14_regulator_enable(struct regulator_dev 
*rdev)
switch (s2mps11->dev_type) {
case S2MPS13X:
case S2MPS14X:
-   if (s2mps11->s2mps14_suspend_state & (1 << rdev_get_id(rdev)))
+   if (test_bit(rdev_get_id(rdev), s2mps11->suspend_state))
val = S2MPS14_ENABLE_SUSPEND;
else if 
(gpio_is_valid(s2mps11->ext_control_gpio[rdev_get_id(rdev)]))
val = S2MPS14_ENABLE_EXT_CONTROL;
@@ -508,7 +510,7 @@ static int s2mps14_regulator_enable(struct regulator_dev 
*rdev)
val = rdev->desc->enable_mask;
break;
case S2MPU02:
-   if (s2mps11->s2mps14_suspend_state & (1 << rdev_get_id(rdev)))
+   if (test_bit(rdev_get_id(rdev), s2mps11->suspend_state))
val = S2MPU02_ENABLE_SUSPEND;
else
val = rdev->desc->enable_mask;
@@ -562,7 +564,7 @@ static int s2mps14_regulator_set_suspend_disable(struct 
regulator_dev *rdev)
if (ret < 0)
return ret;
 
-   s2mps11->s2mps14_suspend_state |= (1 << rdev_get_id(rdev));
+   set_bit(rdev_get_id(rdev), s2mps11->suspend_state);
/*
 * Don't enable suspend mode if regulator is already disabled because
 * this would effectively for a short time turn on the regulator after
@@ -960,18 +962,22 @@ static int s2mps11_pmic_probe(struct platform_device 
*pdev)
case S2MPS11X:
s2mps11->rdev_num = ARRAY_SIZE(s2mps11_regulators);
regulators = s2mps11_regulators;
+   BUILD_BUG_ON(S2MPS_REGULATOR_MAX < s2mps11->rdev_num);
break;
case S2MPS13X:
s2mps11->rdev_num = ARRAY_SIZE(s2mps13_regulators);
regulators = s2mps13_regulators;
+   BUILD_BUG_ON(S2MPS_REGULATOR_MAX < s2mps11->rdev_num);
break;
case S2MPS14X:
s2mps11->rdev_num = ARRAY_SIZE(s2mps14_regulators);
regulators = s2mps14_regulators;
+   BUILD_BUG_ON(S2MPS_REGULATOR_MAX < s2mps11->rdev_num);
break;
case S2MPU02:
s2mps11->rdev_num = ARRAY_SIZE(s2mpu02_regulators);
regulators = s2mpu02_regulators;
+   BUILD_BUG_ON(S2MPS_REGULATOR_MAX < s2mps11->rdev_num);
break;
default:
dev_err(&pdev->dev, "Invalid device type: %u\n",
-- 
1.9.1
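
The problem fixed by the patch above can be shown outside the kernel. In C, `1 << id` shifts an int, so the result is undefined once id reaches the width of int (32 on the usual targets). A multi-word bitmap avoids that by picking a word and a bit within it. The sketch below is a plain userspace stand-in for DECLARE_BITMAP, set_bit and test_bit; the type, the helper names and the value of REGULATOR_MAX are assumptions made for the example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define REGULATOR_MAX 40u	/* hypothetical; only needs to exceed 32 */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((REGULATOR_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per regulator, spread over as many words as needed. */
struct suspend_state {
	unsigned long bits[BITMAP_WORDS];
};

/* Mark regulator `id` (must be below REGULATOR_MAX). */
static void state_set(struct suspend_state *s, unsigned int id)
{
	s->bits[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

/* Report whether regulator `id` (below REGULATOR_MAX) is marked. */
static bool state_test(const struct suspend_state *s, unsigned int id)
{
	return (s->bits[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1UL;
}
```

Because the shift count is always reduced modulo the word size, an id of 35 selects its own bit regardless of whether unsigned long is 32 or 64 bits wide.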


Re: [PATCH v3] net/unix: sk_socket can disappear when state is unlocked

2015-05-26 Thread David Miller

From: Hannes Frederic Sowa 
Date: Tue, 26 May 2015 23:24:59 +0200

> On Tue, May 26, 2015, at 17:22, Mark Salyzyn wrote:
>> got a rare NULL pointer dereference in clear_bit
>> 
>> Signed-off-by: Mark Salyzyn 
> 
> IMHO, this is the right approach, I didn't came up with something
> easier, thanks!
> 
> Acked-by: Hannes Frederic Sowa 

Applied and queued up for -stable, thanks!

Re: [PATCH v2 0/7] Smack namespace

2015-05-26 Thread Eric W. Biederman

Lukasz Pawelczyk  writes:

> Hello,
>
> Some time ago I sent a Smack namespace documentation and a preliminary
> LSM namespace for RFC. I've been suggested that there shouldn't be a
> separate LSM namespace and that it should live within user namespace.
> And this version does. This is a complete set of patches required for
> Smack namespace.
>
> This was designed with a collaboration of Smack maintainer Casey
> Schaufler.
>
> Smack namespace have been implemented using user namespace hooks added
> by one of the patches. To put some context to it I paste here a
> documentation on what Smack namespace wants to achieve.
>
> LSM hooks themselves are documented in the security.h file inside the
> patch.
>
> The patches are based on:
> https://github.com/cschaufler/smack-next/tree/smack-for-4.2-stacked
>
> The tree with them is avaiable here:
> https://github.com/Havner/smack-namespace/tree/smack-namespace-for-4.2-stacked-v2
>
> Changes from v1:
>  - "kernel/exit.c: make sure current's nsproxy != NULL while checking
>caps" patch has been dropped
>  - fixed the title of the user_ns operations patch

I have not completed a review, and I don't understand Smack well enough
to answer some questions, but I don't think I like the approach this patch
takes to get things done.

I am flattered that you are using a mapping as I did with uid map and
gid map.  Unfortunately I do not believe your mapping achieves what my
mapping of uids and gids achieved.

A technical goal is to give people the tools so that a sysadmin can set
up a system, can grant people subids and subgids, and then the user can
proceed to do what they need to do.  In particular there should be
little to no need to keep pestering the system administrator for more
identifiers.

The flip side of that was that the mapping would ensure all of the
existing permissions checks would work as expected, and the checks in
the kernel could be converted without much trouble.

Ranges of ids were chosen because they allow for a lot of possible ways
of using uids and gids in containers, are comparatively easy to
administer, are very fast to use, and don't need large data structures.

With a discrete mapping of labels I have the premonition that we now
have a large data structure that is not as flexible to use,
is comparatively slow and appears to require an interaction with the
system administrator for every label you use in a container.

As part of that there is added a void *security pointer in the user
namespace to apparently hang off anything anyone would like to use.
Connected to that are hooks that have failure codes (presumably memory
allocation failures), but the semantics are not clear.  My gut feel is
that I would rather extend struct user_namespace to hold the smack label
mapping table and remove all of the error codes because they would then
be unnecessary.

I am also concerned that several of the operations assume that setns
and the like are normally privileged operations and so require the
ability to perform other privileged operations.  Given that in the right
circumstances setns is not privileged that seems like a semantics
mismatch.

Or in short my gut feel says the semantics of this change are close to
something that would be very useful, but the details make this patchset
far less useful, usable and comprehensible than it should be.

Eric
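
To make the contrast with a per-label table concrete, here is a minimal sketch of the range-based id mapping described above. It is a simplified userspace model, not the kernel implementation; the struct layout and the function name are assumptions made for the example. Each entry describes a whole contiguous range, so a single entry can cover tens of thousands of ids.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ids [inside, inside + count) map to [outside, outside + count). */
struct id_extent {
	uint32_t inside;
	uint32_t outside;
	uint32_t count;
};

/*
 * Translate an id as seen inside the namespace to the parent's id.
 * Returns false when the id is not covered by any extent.
 */
static bool map_id_to_parent(const struct id_extent *map, size_t n,
			     uint32_t id, uint32_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (id >= map[i].inside && id - map[i].inside < map[i].count) {
			*out = map[i].outside + (id - map[i].inside);
			return true;
		}
	}
	return false;
}
```

A lookup is a handful of comparisons per extent, and granting a user another block of ids only adds one entry, which is the administrative property the mail emphasizes.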

> ===
>
> --- What is a Smack namespace ---
>
> Smack namespace was developed to make it possible for Smack to work
> nicely with Linux containers where there is a full operating system
> with its own init inside the namespace. Such a system working with
> Smack expects to have at least partially working SMACK_MAC_ADMIN to be
> able to change labels of processes and files. This is required to be
> able to securely start applications under the control of Smack and
> manage their access rights.
>
> It was implemented using new LSM hooks added to the user namespace
> that were developed together with Smack namespace.
>
>
> --- Design ideas ---
>
> "Smack namespace" is rather "Smack labels namespace" as not the whole
> MAC is namespaced, only the labels. There is a great analogy between
> Smack labels namespace and the user namespace part that remaps UIDs.
>
> The idea is to create a map of labels for a namespace so the namespace
> is only allowed to use those labels. Smack rules are always the same
> as in the init namespace (limited only by what labels are mapped) and
> cannot be manipulated from the child namespace. The map is actually
> only for labels' names. The underlying structures for labels remain
> the same. The filesystem also stores the "unmapped" labels from the
> init namespace.
>
> Let's say we have those labels in the init namespace:
> label1
> label2
> label3
>
> and those rules:
> label1 label2 rwx
> label1 label3 rwx
> label2 label3 rwx
>
> We create a map for a namespace:
> label1 -> mapped1
> label2 -> mapped2
>
> This means that

Re: [PATCH net-next v6 0/2] Adding support for Cavium ThunderX network controller

2015-05-26 Thread David Miller

From: Aleksey Makarov 
Date: Tue, 26 May 2015 19:20:13 -0700

> This patchset adds support for the Cavium ThunderX network controller.

I don't see patch #1 (yet).

Re: [PATCH v2 v3.18-rc4 1/4] drm: prime: Honour O_RDWR during prime-handle-to-fd

2015-05-26 Thread Damian Hobson-Garcia

Hello,
>On Wed, Nov 12, 2014 at 6:38 AM, Daniel Thompson  
>wrote:
>> Currently DRM_IOCTL_PRIME_HANDLE_TO_FD rejects all flags except
>> (DRM|O)_CLOEXEC making it hard for the userspace to generate a file
>> descriptor that can be used by mmap().
>>
>> It is easy to relax the restriction and allow read/write permissions.
>> This should be safe because the flags are seldom touched by drm; mostly
>> they are passed verbatim to dma_buf calls.
>>
>> Signed-off-by: Daniel Thompson 
>
> Reviewed-by: Rob Clark 

It's a little bit old by now, but I'm wondering if someone can tell me
whether this patch is likely to be merged sometime, or whether it has been
(or should be) abandoned.

Thank you,
Damian

>
>> ---
>>  drivers/gpu/drm/drm_prime.c | 9 +++--
>>  include/uapi/drm/drm.h  | 1 +
>>  2 files changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>> index 78ca30808422..8467b17c8053 100644
>> --- a/drivers/gpu/drm/drm_prime.c
>> +++ b/drivers/gpu/drm/drm_prime.c
>> @@ -331,7 +331,7 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops 
>> =  {
>>   * drm_gem_prime_export - helper library implemention of the export callback
>>   * @dev: drm_device to export from
>>   * @obj: GEM object to export
>> - * @flags: flags like DRM_CLOEXEC
>> + * @flags: flags like DRM_CLOEXEC and DRM_RDWR
>>   *
>>   * This is the implementation of the gem_prime_export functions for GEM 
>> drivers
>>   * using the PRIME helpers.
>> @@ -635,14 +635,11 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device 
>> *dev, void *data,
>> return -ENOSYS;
>>
>> /* check flags are valid */
>> -   if (args->flags & ~DRM_CLOEXEC)
>> +   if (args->flags & ~(DRM_CLOEXEC | DRM_RDWR))
>> return -EINVAL;
>>
>> -   /* we only want to pass DRM_CLOEXEC which is == O_CLOEXEC */
>> -   flags = args->flags & DRM_CLOEXEC;
>> -
>> return dev->driver->prime_handle_to_fd(dev, file_priv,
>> -   args->handle, flags, &args->fd);
>> +   args->handle, args->flags, &args->fd);
>>  }
>>
>>  int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data,
>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>> index b0b855613641..89c2b68ddc51 100644
>> --- a/include/uapi/drm/drm.h
>> +++ b/include/uapi/drm/drm.h
>> @@ -660,6 +660,7 @@ struct drm_set_client_cap {
>> __u64 value;
>>  };
>>
>> +#define DRM_RDWR O_RDWR
>>  #define DRM_CLOEXEC O_CLOEXEC
>>  struct drm_prime_handle {
>> __u32 handle;
>> --
>> 1.9.3
>>

linux-next: build failure after merge of the backlight tree

2015-05-26 Thread Stephen Rothwell

Hi Lee,

After merging the backlight tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

drivers/video/backlight/backlight.c: In function 'of_find_backlight_by_node':
drivers/video/backlight/backlight.c:563:2: error: implicit declaration of 
function 'of_platform_device_ensure' [-Werror=implicit-function-declaration]
  of_platform_device_ensure(node);
  ^

Caused by commit 67a6b10546ea ("backlight: Probe backlight devices on
demand").

I have used the backlight tree from next-20150526 for today.

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au



Re: [PATCH v3 6/6] ACPI: import watchdog info of GTDT into platform device

2015-05-26 Thread Timur Tabi


Hanjun Guo wrote:
>> I don't agree with this.  The GTDT should be parsed even if there's no
>> watchdog driver compiled for this kernel.  There are no other #ifdefs in
>> this file.
>
> So what's the point of parse GTDT and alloc memories for it if there
> is no watchdog driver compiled for the kernel?

I don't think it's normal policy to generate a platform only if one
specific driver is enabled.

> will the module insmod
> later even if the CONFIG_ARM_SBSA_WATCHDOG=n?

I think that actually can work, but it's not a good reason by itself.

> OK, that's good point. but what I proposed is some hint to which driver
> will use the data prepared in this file, we can easily understand it
> in this patchset, but if just review the code in this file, I think
> people will be confused without detail comments.

All anyone needs to do is

git grep "sbsa-gwdt"

And you'll find the driver.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

Re: [PATCHv6 05/10] ARM: mxs: fix in tree users of ssd1306

2015-05-26 Thread Shawn Guo

On Tue, May 26, 2015 at 10:08:23AM +0300, Tomi Valkeinen wrote:
> So should I take this via fbdev tree with the rest of the patches? If
> so, I want an ack from a relevant dts maintainer. get_maintainers.pl
> gives Shawn as the first contact for imx28-cfa10036.dts.

Acked-by: Shawn Guo 

Re: [PATCH V2 3/3] n_gsm: wake up ldisc tty before using it

2015-05-26 Thread Greg KH

On Wed, May 27, 2015 at 10:50:01AM +0800, Zhang, Yanmin wrote:
> Wake up ldisc device before calling its driver to access the device.
> 
> Signed-off-by: Zhang Yanmin 
> 
> ---
> 
>  drivers/tty/n_gsm.c | 40 +++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
> index 2c34c32..40671fa 100644
> --- a/drivers/tty/n_gsm.c
> +++ b/drivers/tty/n_gsm.c
> @@ -62,6 +62,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  static int debug;
>  module_param(debug, int, 0600);
> @@ -555,6 +556,27 @@ static int gsm_stuff_frame(const u8 *input, u8 *output, 
> int len)
>  return olen;
>  }
>  
> +static int pm_runtime_get_sync_tty(struct tty_struct *tty)
> +{
> +int ret = 0;
> +
> +/*Wakeup parent as tty itself doesn't enable runtime*/

No spaces in your comment?

Anyway, this is corrupted and can't be applied, please fix up your email
client and try it again...

greg k-h

Re: [PATCH v3 6/6] ACPI: import watchdog info of GTDT into platform device

2015-05-26 Thread Hanjun Guo


On 2015-05-27 00:35, Timur Tabi wrote:
> On 05/26/2015 03:28 AM, Hanjun Guo wrote:
>
>>>   early_acpi_os_unmap_memory((char *)table, tbl_size);
>>>   }
>>
>> please add
>>
>> #ifdef CONFIG_ARM_SBSA_WATCHDOG
>> (acpi gtdt code)
>> #endif
>
> I don't agree with this.  The GTDT should be parsed even if there's no
> watchdog driver compiled for this kernel.  There are no other #ifdefs in
> this file.

So what's the point of parse GTDT and alloc memories for it if there
is no watchdog driver compiled for the kernel? will the module insmod
later even if the CONFIG_ARM_SBSA_WATCHDOG=n?

>>> + * Add a platform device named "sbsa-gwdt" to match the platform driver.
>>> + * "sbsa-gwdt": SBSA(Server Base System Architecture) Generic Watchdog
>>> + * The platform driver can get device info below by matching this name.
>>
>> * The platform driver (drivers/watchdog/sbsa_gwdt.c) can get device info
>> below by matching this name.
>>
>> Adding the file name which will help for review and maintain in my
>> opinion.
>
> Except it will cause problems if the driver is renamed or moved.  I
> don't think this is a good idea, either (sorry!)

OK, that's good point. but what I proposed is some hint to which driver
will use the data prepared in this file, we can easily understand it
in this patchset, but if just review the code in this file, I think
people will be confused without detail comments.

Thanks
Hanjun

Re: [PATCH] atm:he - Change 0 to false for bool type variable initialization.

2015-05-26 Thread David Miller

From: Shailendra Verma 
Date: Wed, 27 May 2015 06:50:18 +0530

> The variable sdh is bool type so initializing it with false value
> instead of 0.
> 
> Signed-off-by: Shailendra Verma 
> ---
>  drivers/atm/he.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/atm/he.c b/drivers/atm/he.c
> index 93dca2e..eb5bebc 100644
> --- a/drivers/atm/he.c
> +++ b/drivers/atm/he.c
> @@ -116,8 +116,8 @@ static bool disable64;
>  static short nvpibits = -1;
>  static short nvcibits = -1;
>  static short rx_skb_reserve = 16;
> -static bool irq_coalesce = 1;
> -static bool sdh = 0;
> +static bool irq_coalesce = true;
> +static bool sdh = false;

You didn't understand my feedback.

I already applied your patch that handled the irq_coalesce issue,
so you have to submit a patch relative to that.

In fact, you should always test that your patch applies to my tree
before submitting it.
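
For context on what the quoted patch changes: for a `bool` variable, `0` and `false` produce the same value, so the change is about readability, not behaviour. A minimal user-space sketch follows, using `<stdbool.h>` and hypothetical variable names that are not the ones in drivers/atm/he.c:

```c
#include <stdbool.h>

/* Hypothetical variables; all three end up holding the same value. */
static bool flag_from_zero = 0;       /* integer constant, converted to bool */
static bool flag_from_false = false;  /* the spelling the patch prefers */
static bool flag_default;             /* static storage: implicitly false */

/* True when all three initialisation styles agree. */
static bool all_equal(void)
{
    return flag_from_zero == flag_from_false &&
           flag_from_false == flag_default;
}

/* Any nonzero value converts to true when stored in a bool. */
static bool to_bool(int v)
{
    bool b = v;
    return b;
}
```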

Re: [PATCH 1/2] KVM: MMU: fix SMAP virtualization

2015-05-26 Thread Xiao Guangrong




On 05/26/2015 10:48 PM, Paolo Bonzini wrote:
>
> On 26/05/2015 16:45, Edward Cree wrote:
>>>> This breaks older compilers that can't initialize anon structures.
>>>
>>> How old ? Even gcc 3.1 says you can use unnamed struct/union fields and
>>> 3.2 is the minimum version required to compile the kernel as mentioned
>>> in the README.
>>>
>>> We could simply just name the structure, but I doubt this is the
>>> only place in the kernel code where it's being used this way :)
>>
>> This appears to be GCC bug #10676, see 
>>
>> Says it was fixed in 4.6, but I believe the kernel supports GCCs much older
>> than that (back to 3.2).  I personally hit it on 4.4.7, the version shipped
>> with RHEL6.6.
>
> Yes, it will be fixed soon(ish).  Probably before you can get rid of the
> obnoxious disclaimer... :)


It has been fixed by Andrew:

From: Andrew Morton 
Subject: arch/x86/kvm/mmu.c: work around gcc-4.4.4 bug

arch/x86/kvm/mmu.c: In function 'kvm_mmu_pte_write':
arch/x86/kvm/mmu.c:4256: error: unknown field 'cr0_wp' specified in initializer
arch/x86/kvm/mmu.c:4257: error: unknown field 'cr4_pae' specified in initializer
arch/x86/kvm/mmu.c:4257: warning: excess elements in union initializer
...

gcc-4.4.4 (at least) has issues when using anonymous unions in
initializers.

Fixes: edc90b7dc4ceef6 ("KVM: MMU: fix SMAP virtualization")
Cc: Xiao Guangrong 
Cc: Paolo Bonzini 
Signed-off-by: Andrew Morton 

It should be in the -mm tree.
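
To make the compiler issue concrete, here is a minimal sketch of the two initialisation styles under discussion. It assumes a hypothetical `union page_role` that only borrows the field names from the error output above; it is not the kernel's definition, and the second function is just one way to avoid the construct, not necessarily what Andrew's patch does. The first function uses a designated initialiser that reaches into an anonymous struct, which is the form the thread reports old GCC rejecting, so on such a compiler it would fail to build. The second assigns the fields after a plain initialisation.

```c
#include <stdint.h>

/* Hypothetical union with an anonymous struct member; illustration only. */
union page_role {
    uint32_t word;
    struct {
        unsigned cr0_wp:1;
        unsigned cr4_pae:1;
    };
};

/* Designated initialiser naming fields of the anonymous struct. */
static union page_role role_by_initializer(void)
{
    union page_role r = { .cr0_wp = 1, .cr4_pae = 1 };
    return r;
}

/* Initialise a named member first, then assign the fields. */
static union page_role role_by_assignment(void)
{
    union page_role r = { .word = 0 };

    r.cr0_wp = 1;
    r.cr4_pae = 1;
    return r;
}
```

On a compiler that accepts both forms, the two functions yield the same field values.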

[PATCH net-next v6 0/2] Adding support for Cavium ThunderX network controller

2015-05-26 Thread Aleksey Makarov

This patchset adds support for the Cavium ThunderX network controller.

changes in v6:
 * unused preprocessor symbols were removed
 * reduce no of atomic operations in SQ maintenance
 * support for TCP segmentation at driver level
 * reset RBDR if fifo state is FAIL
 * fixed an issue with link state mailbox message

changes in v5:
 * __packed were removed.  now we rely on C language ABI
 * nic_dbg() -> netdev_dbg()
 * fixes for a typo, constant spelling and using BIT_ULL
 * use print_hex_dump()
 * unnecessary conditions in a long if() chain were removed

changes in v4:
 * the patch "pci: Add Cavium PCI vendor id" was attributed correctly
 * a note that Cavium id is used in many drivers was added
 * the license comments now match MODULE_LICENSE
 * a comment explaining usage of writeq_relaxed()/readq_relaxed() was added

changes in v3:
 * code cleanup
 * issues discovered by reviewers were addressed

changes in v2:
 * non-generic module parameters removed
 * ethtool support added (nicvf_set_rxnfc())
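
The v5 note about using BIT_ULL concerns 64-bit register masks: shifting a plain `1` is done in `int`, so bits at position 32 and above cannot be expressed that way. A minimal user-space sketch is below. The macro mirrors the shape of the kernel's `BIT_ULL()`; the bit position `EXAMPLE_ENABLE_BIT` and both helpers are hypothetical and not taken from the ThunderX driver.

```c
#include <stdint.h>

/* Bit nr of a 64-bit value; the constant is unsigned long long. */
#define BIT_ULL(nr) (1ULL << (nr))

/* Hypothetical field position above bit 31, for illustration only. */
#define EXAMPLE_ENABLE_BIT 40

/* Set the example bit in a 64-bit register image. */
static uint64_t set_enable(uint64_t reg)
{
    return reg | BIT_ULL(EXAMPLE_ENABLE_BIT);
}

/* Test the example bit in a 64-bit register image. */
static int is_enabled(uint64_t reg)
{
    return (reg & BIT_ULL(EXAMPLE_ENABLE_BIT)) != 0;
}
```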

v5: 
https://lkml.kernel.org/g/<1432344498-17131-1-git-send-email-aleksey.maka...@caviumnetworks.com>
v4: 
https://lkml.kernel.org/g/<1432000757-28700-1-git-send-email-aleksey.maka...@auriga.com>
v3: 
https://lkml.kernel.org/g/<1431747401-20847-1-git-send-email-aleksey.maka...@auriga.com>
v2: 
https://lkml.kernel.org/g/<1415596445-10061-1-git-send-email-r...@kernel.org>
v1: https://lkml.kernel.org/g/<20141030165434.GW20170@rric.localhost>

Sunil Goutham (2):
  pci: Add Cavium PCI vendor id
  net: Adding support for Cavium ThunderX network controller

 MAINTAINERS                                        |    7 +
 drivers/net/ethernet/Kconfig                       |    1 +
 drivers/net/ethernet/Makefile                      |    1 +
 drivers/net/ethernet/cavium/Kconfig                |   40 +
 drivers/net/ethernet/cavium/Makefile               |    5 +
 drivers/net/ethernet/cavium/thunder/Makefile       |   11 +
 drivers/net/ethernet/cavium/thunder/nic.h          |  414 ++
 drivers/net/ethernet/cavium/thunder/nic_main.c     |  940
 drivers/net/ethernet/cavium/thunder/nic_reg.h      |  213 +++
 .../net/ethernet/cavium/thunder/nicvf_ethtool.c    |  601
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 1332 +
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 1544
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  381 +
 drivers/net/ethernet/cavium/thunder/q_struct.h     |  701 +
 drivers/net/ethernet/cavium/thunder/thunder_bgx.c  |  966
 drivers/net/ethernet/cavium/thunder/thunder_bgx.h  |  223 +++
 include/linux/pci_ids.h                            |    2 +
 17 files changed, 7382 insertions(+)
 create mode 100644 drivers/net/ethernet/cavium/Kconfig
 create mode 100644 drivers/net/ethernet/cavium/Makefile
 create mode 100644 drivers/net/ethernet/cavium/thunder/Makefile
 create mode 100644 drivers/net/ethernet/cavium/thunder/nic.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/nic_main.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nic_reg.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_main.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_queues.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_queues.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/q_struct.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/thunder_bgx.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/thunder_bgx.h

-- 
2.4.1

