Re: [PATCH] pstore: Fail to unlink if a driver has not defined pstore_erase

2013-06-24 Thread Aruna Balakrishnaiah

Hi Kees,

On Monday 24 June 2013 10:33 PM, Kees Cook wrote:

On Mon, Jun 24, 2013 at 12:48 AM, Aruna Balakrishnaiah wrote:

pstore_erase is used to erase the record from the persistent store.
So if a driver has not defined the pstore_erase callback, return
-EINVAL instead of unlinking the file, as deleting the file without
erasing its record in the persistent store will give a wrong impression
to customers.

This is probably true -- I originally liked the idea of being able to
clean up the entries, regardless of their storage state, but you're
probably right. They shouldn't be deleted unless they can _actually_
be deleted.

So, I support this change, but I think the return needs to be
different. EINVAL isn't listed, for example, in unlink(2)'s man-page.
Perhaps EROFS, EACCES, or EPERM?


The filesystem (pstore) has privileges to unlink the file, but only if the
callback function is defined. Since the filesystem has privileges, I didn't
consider these error codes (EROFS, EACCES or EPERM).

When the callback function is not defined, unlinking the file would
be an invalid operation, hence EINVAL.

Since the unlink(2) man page does not list EINVAL, I think going with
EPERM makes more sense.
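
The behaviour under discussion can be sketched in userspace roughly as
follows. This is a minimal model, not the pstore code: demo_backend,
demo_record and demo_unlink are hypothetical names, and EPERM is used only
because that is the code the thread converges on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pstore backend: erase may be left NULL. */
struct demo_backend {
	int (*erase)(int record_id);
};

struct demo_record {
	int id;
	bool linked;	/* true while the file is still visible */
	const struct demo_backend *backend;
};

/* Example erase callback that always succeeds. */
static int demo_erase_ok(int record_id)
{
	(void)record_id;
	return 0;
}

/*
 * Refuse to remove the file when the backend cannot erase the record,
 * so the file never disappears while its data is still stored.
 */
static int demo_unlink(struct demo_record *rec)
{
	if (!rec->backend->erase)
		return -EPERM;

	rec->backend->erase(rec->id);
	rec->linked = false;
	return 0;
}
```

With a backend that has no erase callback, demo_unlink() leaves the record
linked and reports -EPERM, so the caller can see that nothing was removed.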




-Kees


Signed-off-by: Aruna Balakrishnaiah 
---
  fs/pstore/inode.c |2 ++
  1 file changed, 2 insertions(+)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index e4bcb2c..fa6339a 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -178,6 +178,8 @@ static int pstore_unlink(struct inode *dir, struct dentry 
*dentry)
 if (p->psi->erase)
 p->psi->erase(p->type, p->id, p->count,
   dentry->d_inode->i_ctime, p->psi);
+   else
+   return -EINVAL;

 return simple_unlink(dir, dentry);
  }




--
Kees Cook
Chrome OS Security



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


vmlinux symtab matching kallsyms fails: dso__find_symbol_by_name ---- end ----

2013-06-24 Thread Prabhat Kumar Ravi
Hi All,

On 3.4-rc49, I got the following failure running `perf test`:

/ # perf test -v 1
 1: vmlinux symtab matches kallsyms:
--- start ---
dso__find_symbol_by_name ---- end ----
vmlinux symtab matches kallsyms: FAILED!

Perf test is failing at dso__find_symbol_by_name

where

kallsyms_map = machine__kernel_map(, type);

 sym = map__find_symbol_by_name(kallsyms_map, ref_reloc_sym.name, NULL);
 if (sym == NULL) {
pr_debug("dso__find_symbol_by_name ");
goto out;
}

Here sym is looked up by the name "_stext", and the lookup returns NULL,
so perf test fails at this point.

On investigation I found that _stext has the same address as asm_do_IRQ
and __exception_text_start:

c00081c0 T asm_do_IRQ
c00081c0 T _stext
c00081c0 T __exception_text_start

so it is being deleted by symbols__fixup_duplicate in

if (choose_best_symbol(curr, next) == SYMBOL_A) {

rb_erase(&next->rb_node, symbols); --> symbol
getting erased here.
goto again;
} else {
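
To illustrate why the lookup by name can fail, here is a small userspace
sketch. It is not the perf implementation: the tie-break (keep the name with
the fewest leading underscores) is a simplifying assumption, and the demo_*
names and the "do_something" symbol are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_sym {
	unsigned long addr;
	const char *name;
};

/* Count leading underscores in a symbol name. */
static size_t leading_underscores(const char *name)
{
	size_t n = 0;

	while (name[n] == '_')
		n++;
	return n;
}

/*
 * Collapse runs of equal addresses in an address-sorted array, keeping the
 * name with the fewest leading underscores (assumed tie-break; the first
 * one seen wins on a draw). Returns the new element count.
 */
static size_t demo_fixup_duplicates(struct demo_sym *syms, size_t n)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (out > 0 && syms[out - 1].addr == syms[i].addr) {
			if (leading_underscores(syms[i].name) <
			    leading_underscores(syms[out - 1].name))
				syms[out - 1] = syms[i];
			continue;
		}
		syms[out++] = syms[i];
	}
	return out;
}

/* Linear lookup by name; returns NULL when the name was dropped. */
static const struct demo_sym *demo_find_by_name(const struct demo_sym *syms,
						 size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(syms[i].name, name) == 0)
			return &syms[i];
	return NULL;
}
```

Fed the three symbols at c00081c0 listed above, this sketch keeps asm_do_IRQ
and drops the other two, so a later search for "_stext" finds nothing.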


My question is: do we really need this commit?

commit 3f5a42722b9e78a434d5a4ee5e607dc33c69ac80
Author: Anton Blanchard 
Date:   Wed Aug 24 16:40:15 2011 +1000

perf symbols: /proc/kallsyms does not sort module symbols

kallsyms__parse assumes that /proc/kallsyms is sorted and sets the end
of the previous symbol to the start of the current one.

Unfortunately module symbols are not sorted, eg:

a0081f30 t e1000_clean_rx_irq   [e1000e]
a00817a0 t e1000_alloc_rx_buffers   [e1000e]

Some symbols end up with a negative length and others have a length
larger than they should. This results in confusing perf output.

We already have a function to fixup the end of zero length symbols so
use that instead.

Or can we search for some other string?


Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time

2013-06-24 Thread DuanZhenzhong

Stefano Stabellini wrote:

Trimming some of the people in CC

On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
  

On 2013-06-20 22:21, Stefano Stabellini wrote:


On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
  

On 2013-06-05 20:50, Stefano Stabellini wrote:


On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
  

Stefano Stabellini wrote:

On Tue, 21 May 2013, Stefano Stabellini wrote:
On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
  Looking at the hypervisor code I couldn't see anything obviously
wrong.
  I think the culprit is "physdev_unmap_pirq":

 if ( is_hvm_domain(d) )
  {
  spin_lock(&d->event_lock);
  gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
  d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
  domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
"unbound" :
"",
  domain_pirq_to_irq(d, pirq));

   if
( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
  ret = unmap_domain_pirq_emuirq(d, pirq);
  spin_unlock(&d->event_lock);
  if ( domid == DOMID_SELF || ret )
  goto free_domain;

It always tells me unbound:

(XEN) physdev.c:237:d14 14, pirq: 54 is 
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 53 is 
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 52 is 
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 51 is 
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 50 is 
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(a bit older debug code, so the 'unbound' does not show up here).

Which means that the call to unmap_domain_pirq_emuirq does not happen.
The checks in unmap_domain_pirq_emuirq also look to depend
on the code being IRQ_UNBOUND.

In other words, all of that code looks to only clear things when
they are !IRQ_UNBOUND.

But the other logic (IRQ_UNBOUND) looks to be missing a removal
in the radix tree:

if ( emuirq != IRQ_PT )
  radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
  And
I think that is what is causing the leak - the radix tree
needs to be pruned? Or perhaps the allocate_pirq should check
the radix tree for IRQ_UNBOUND ones and re-use them?
I think that you are looking in the wrong place.
The issue is that QEMU doesn't call pt_msi_disable in
pt_msgctrl_reg_write if !(val & PCI_MSI_FLAGS_ENABLE).

The code above is correct as is because it is trying to handle emulated
IRQs and MSIs, not real passthrough MSIs. The latter are not added to
that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.

This patch fixes the issue; I have only tested MSI (MSI-X is completely
untested).


diff --git a/hw/pass-through.c b/hw/pass-through.c
index 304c438..079e465 100644
--- a/hw/pass-through.c
+++ b/hw/pass-through.c
@@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
*ptdev,
   ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
   }
   else
-ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
+{
+if (ptdev->msi->flags & PT_MSI_MAPPED) {
+pt_msi_disable(ptdev);
+}
+}
 /* pass through MSI_ENABLE bit when no MSI-INTx translation
*/
   if (!ptdev->msi_trans_en) {
@@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
*ptdev,
   pt_disable_msi_translate(ptdev);
   }
   pt_msix_update(ptdev);
+} else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
+pt_msix_delete(ptdev);

Hi Stefano,
I tested this patch: the OS reboots when the driver is reloaded. If I use
pt_msix_disable instead of pt_msix_delete, the driver can be reloaded.
But I still see some errors in qemu.log and on the Xen console. It seems
four IRQs are not freed on unmap.
--first load---
pt_msix_update_one: pt_msix_update_one requested pirq = 103
pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 102
pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 101
pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 100
pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
- first unload---
pt_msix_disable: Unbind msix with pirq 67, gvec 0
pt_msix_disable: Unmap msix with pirq 67
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
pt_msix_disable: Unbind msix with pirq 66, gvec 0
pt_msix_disable: Unmap msix with pirq 66
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
pt_msix_disable: Unbind msix with pirq 65, gvec 0
pt_msix_disable: Unmap msix with pirq 65
pt_msix_disable: Error: 

Re: [PATCH V9 03/13] MIPS: Loongson: Introduce and use cpu_has_coherent_cache feature

2013-06-24 Thread chenhc
> On 06/22/2013 09:10 PM, Huacai Chen wrote:
>>
>> Is the 3rd patch of V10 OK to be accepted now? If so, could the
>> patchset of V10 be merged into 3.11?
>>
> The merge window for 3.11 is closed at this point. You should get it
> prepared for 3.12, so start tracking the 'mips-for-linux-next' branch
> with your patches.
OK, if the 3rd patch has no problems, I think the patches can easily be
applied on mips-for-linux-next in the future.

>
> Acked-by: Steven J. Hill 
>
>




[PATCH v2] power: charger-manager: regulator_get() never returns NULL.

2013-06-24 Thread Jonghwa Lee
This patch fixes return value checking of regulator_get() in charger-manager
driver. The API, regulator_get(), returns ERR_PTR() when it fails to get
regulator with given name, not NULL.

Signed-off-by: Jonghwa Lee 
Signed-off-by: Myungjoo Ham 
---
v2:
 - Fix return value to use API's directly with PTR_ERR().

 drivers/power/charger-manager.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/power/charger-manager.c b/drivers/power/charger-manager.c
index ba42029..0a4cce3 100644
--- a/drivers/power/charger-manager.c
+++ b/drivers/power/charger-manager.c
@@ -1239,11 +1239,10 @@ static int charger_manager_register_extcon(struct 
charger_manager *cm)
 
charger->consumer = regulator_get(cm->dev,
charger->regulator_name);
-   if (charger->consumer == NULL) {
+   if (IS_ERR(charger->consumer)) {
dev_err(cm->dev, "Cannot find charger(%s)\n",
charger->regulator_name);
-   ret = -EINVAL;
-   goto err;
+   return PTR_ERR(charger->consumer);
}
charger->cm = cm;
 
-- 
1.7.9.5



Re: f2fs tests

2013-06-24 Thread Jaegeuk Kim
Hi,

Thank you for the interest.

2013-06-21 (Fri), 21:36 +0200, Pavel Machek:
> Hi!
> 
> I played a bit with f2fs...
> 
> First, I had to compile the f2fstools... Unfortunately they require
> newer autoconf than is available on Debian. I tried to hack it, but then
> I decided that compiling it by hand is just simpler.
> 
> gcc -I include/ -I . lib/*.c mkfs/*.c /usr/lib/libuuid.so
> 
> does the trick.
> 
> I was quite surprised there's no fsck in the repository... Do you have
> fsck somewhere?

It'll be available soon via f2fs-tools.git. We're almost done cleaning
up the code.

> 
> I tested with copy of kernel and 4GB stick (with RedHat logo, thanks
> :-)... copying it from hdd took 46 minutes for VFAT and 19 minutes for
> F2FS. Good.
> 
> VFAT: time find . -name "not-here" took 26 seconds.
> F2FS: time find . -name "not-here" took 22-24 seconds.
> 
> Faster copy, same speed find, good. (Find is even slightly faster than
> HDD, with 27-30 seconds).
> 
> But now the strange stuff: the same data takes 861MB on ext3 and 1.3GB
> on f2fs. (It was even bigger than that on VFAT). I guess I should test
> the patch for inlining small files into inodes?

It is just caused by the different policies for reporting file system
utilization.

When you run "df",
1. F2FS shows all the consumed space, including its file system metadata.
In return, it tries to show a file system size as close to the device
partition size as possible.

2. EXT4 shows the amount of user-made data, excluding the reserved space.
Therefore, it shows a total size that is smaller than the underlying
partition size.

So, you need to compare all the entries from the "df" results together.

When I tested an 8GB partition,

F2FS shows:  [Size]  [Used]  [Avail]
 1. empty:   8.0G    497M    7.5G
 2. cp   :   8.0G    594M    7.4G
 3. untar:   8.0G    1.3G    6.7G

EXT4 shows:  [Size]  [Used]  [Avail]
 1. empty:   7.8G    18M     7.4G
 2. cp   :   7.8G    116M    7.3G
 3. untar:   7.8G    794M    6.6G

So, after untarring the kernel source, you can see that the
available sizes are not much different between F2FS and EXT4.

Thanks,

> 
>   Pavel

-- 
Jaegeuk Kim
Samsung



Re: [PATCH] ARM: at91/PMC: fix at91sam9n12 USB FS init

2013-06-24 Thread Bo Shen

Hi Nicolas,

On 06/24/2013 06:57 PM, Nicolas Ferre wrote:

at91sam9n12 has Full-speed only USB. So we should add
it to the list in at91_pllb_usbfs_clock_init() function.

Signed-off-by: Nicolas Ferre 
---
  arch/arm/mach-at91/clock.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-at91/clock.c b/arch/arm/mach-at91/clock.c
index 978de42..2f732d9 100644
--- a/arch/arm/mach-at91/clock.c
+++ b/arch/arm/mach-at91/clock.c
@@ -699,7 +699,7 @@ static void __init at91_pllb_usbfs_clock_init(unsigned long 
main_clock)
at91_pmc_write(AT91_PMC_SCER, AT91RM9200_PMC_MCKUDP);
} else if (cpu_is_at91sam9260() || cpu_is_at91sam9261() ||
   cpu_is_at91sam9263() || cpu_is_at91sam9g20() ||
-  cpu_is_at91sam9g10()) {
+  cpu_is_at91sam9g10() || cpu_is_at91sam9n12()) {
uhpck.pmc_mask = AT91SAM926x_PMC_UHP;
udpck.pmc_mask = AT91SAM926x_PMC_UDP;
}



Since you posted the following patch:
  ARM: at91/PMC: fix at91sam9n12 USB FS init
(https://patchwork.kernel.org/patch/2772301/)

this patch is no longer needed.

Best Regards,
Bo Shen


Re: [PATCH 1/2] power: charger-manager: regulator_get() never returns NULL.

2013-06-24 Thread jonghwa3 . lee
On 2013-06-25 14:07, Sachin Kamat wrote:

> On 25 June 2013 10:32, Jonghwa Lee  wrote:
>> This patch fixes return value checking of regulator_get() in charger-manager
>> driver. The API, regulator_get(), returns ERR_PTR() when it fails to get
>> regulator with given name, not NULL.
>>
>> Signed-off-by: Jonghwa Lee 
>> Signed-off-by: Myungjoo Ham 
>> ---
>>  drivers/power/charger-manager.c |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/power/charger-manager.c 
>> b/drivers/power/charger-manager.c
>> index ba42029..7d1bcde 100644
>> --- a/drivers/power/charger-manager.c
>> +++ b/drivers/power/charger-manager.c
>> @@ -1239,7 +1239,7 @@ static int charger_manager_register_extcon(struct 
>> charger_manager *cm)
>>
>> charger->consumer = regulator_get(cm->dev,
>> charger->regulator_name);
>> -   if (charger->consumer == NULL) {
>> +   if (IS_ERR(charger->consumer)) {
>> dev_err(cm->dev, "Cannot find charger(%s)\n",
>> charger->regulator_name);
>> ret = -EINVAL;
> 
>  You can as well make this ret = PTR_ERR(charger->consumer).


Yes, I'll fix it.

Thanks,
Jonghwa

> 
> ---
> With warm regards,
> Sachin
> 




Re: [PATCH 1/2] power: charger-manager: regulator_get() never returns NULL.

2013-06-24 Thread Sachin Kamat
On 25 June 2013 10:32, Jonghwa Lee  wrote:
> This patch fixes return value checking of regulator_get() in charger-manager
> driver. The API, regulator_get(), returns ERR_PTR() when it fails to get
> regulator with given name, not NULL.
>
> Signed-off-by: Jonghwa Lee 
> Signed-off-by: Myungjoo Ham 
> ---
>  drivers/power/charger-manager.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/power/charger-manager.c b/drivers/power/charger-manager.c
> index ba42029..7d1bcde 100644
> --- a/drivers/power/charger-manager.c
> +++ b/drivers/power/charger-manager.c
> @@ -1239,7 +1239,7 @@ static int charger_manager_register_extcon(struct 
> charger_manager *cm)
>
> charger->consumer = regulator_get(cm->dev,
> charger->regulator_name);
> -   if (charger->consumer == NULL) {
> +   if (IS_ERR(charger->consumer)) {
> dev_err(cm->dev, "Cannot find charger(%s)\n",
> charger->regulator_name);
> ret = -EINVAL;

 You can as well make this ret = PTR_ERR(charger->consumer).
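
For readers unfamiliar with the convention behind this suggestion, here is
a userspace sketch of how an errno value can travel inside a pointer. The
demo_* helpers only imitate the idea of ERR_PTR(), IS_ERR() and PTR_ERR();
demo_regulator_get() and the "vbus" name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when ptr lies in the top DEMO_MAX_ERRNO addresses. */
static inline bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_regulator;	/* stand-in object */

/* Hypothetical lookup: only the name "vbus" exists. */
static void *demo_regulator_get(const char *name)
{
	if (strcmp(name, "vbus") != 0)
		return demo_err_ptr(-ENODEV);
	return &demo_regulator;
}

/* Propagate the lookup's own error code instead of a fixed -EINVAL. */
static int demo_register(const char *name)
{
	void *consumer = demo_regulator_get(name);

	if (demo_is_err(consumer))
		return (int)demo_ptr_err(consumer);
	return 0;
}
```

A failed lookup here is a non-NULL pointer, which is why a comparison with
NULL never notices it, and returning the decoded value keeps the original
error code visible to the caller.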

---
With warm regards,
Sachin


Re: [PATCH 1/2] ARM: at91/PMC: fix at91sam9n12 USB FS init

2013-06-24 Thread Bo Shen

Hi Nicolas,

On 06/25/2013 12:37 AM, Nicolas Ferre wrote:

at91sam9n12 has Full-speed only USB. So we should add
it to the list in at91_pllb_usbfs_clock_init() function.
Moreover, at91sam9n12 has an unusual PMC in the sense that it
has a PLLB but also has a USB clock register.

Signed-off-by: Nicolas Ferre 
---
  arch/arm/mach-at91/clock.c | 25 +
  arch/arm/mach-at91/include/mach/at91_pmc.h |  3 +++
  2 files changed, 24 insertions(+), 4 deletions(-)


For this series, I tested on the at91sam9n12ek board and everything works.

Tested-by: Bo Shen 

Best Regards,
Bo Shen



Re: [PATCH] block:Remove extra condition in end of disk check

2013-06-24 Thread Raghavendra K T

On 06/25/2013 04:21 AM, Tejun Heo wrote:

On Mon, Jun 24, 2013 at 07:20:12PM +0530, Raghavendra K T wrote:

@@ -1656,7 +1656,7 @@ static inline int bio_check_eod(struct bio *bio, unsigned 
int nr_sectors)
if (maxsector) {
sector_t sector = bio->bi_sector;

-   if (maxsector < nr_sectors || maxsector - nr_sectors < sector) {
+   if (maxsector - nr_sectors < sector) {


If maxsector < nr_sectors, the subtraction will underflow making it a
very large number and fail to detect the invalid condition, no?



Hi Tejun,
Thanks for the reply and explanation. You are right: the underflow would
let an invalid request pass the check.

Considering that maxsector and sector are unsigned long, and nr_sectors is
unsigned int, probably the safer bet is
(maxsector < sector + nr_sectors), but that would still leave scope for
overflow.
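
The three variants discussed in this thread can be compared with a small
userspace sketch. The function names are made up for the comparison, and
sector_t is assumed to be 64 bits wide here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* assumption: 64-bit sector numbers */

/* Two-step check: neither comparison can wrap around. */
static bool beyond_eod(sector_t maxsector, sector_t sector,
		       unsigned int nr_sectors)
{
	return maxsector < nr_sectors || maxsector - nr_sectors < sector;
}

/* Subtraction only: wraps when nr_sectors is larger than maxsector. */
static bool beyond_eod_sub_only(sector_t maxsector, sector_t sector,
				unsigned int nr_sectors)
{
	return maxsector - nr_sectors < sector;
}

/* Addition form: wraps when sector + nr_sectors exceeds the type range. */
static bool beyond_eod_add(sector_t maxsector, sector_t sector,
			   unsigned int nr_sectors)
{
	return maxsector < sector + nr_sectors;
}
```

For a request of 16 sectors on an 8-sector device, only the two-step check
reports the problem; for a start sector close to the maximum value of the
type, the addition form wraps and also misses it.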


Thanks again,
Raghu.



[PATCH 2/2] power: charger-manager: Fix a bug when it unregisters notifier block of extcon.

2013-06-24 Thread Jonghwa Lee
This patch prevents a NULL pointer error caused by unregistering an
unregistered extcon notifier block. At probe time, charger manager tries to
remove the extcon notifier blocks when it fails to initialize them. This must
be done only for registered ones; otherwise, it causes a kernel panic. To make
it work right, it checks the extcon_dev node of extcon_specific_cable_nb. If
the extcon cable notifier block was registered successfully, it has a proper
extcon_dev pointer; if not, it has a NULL pointer.

Signed-off-by: Jonghwa Lee 
Signed-off-by: Myungjoo Ham 
---
 drivers/power/charger-manager.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/power/charger-manager.c b/drivers/power/charger-manager.c
index 7d1bcde..c55a7dc 100644
--- a/drivers/power/charger-manager.c
+++ b/drivers/power/charger-manager.c
@@ -1666,7 +1666,9 @@ err_reg_extcon:
charger = &desc->charger_regulators[i];
for (j = 0; j < charger->num_cables; j++) {
struct charger_cable *cable = &charger->cables[j];
-   extcon_unregister_interest(&cable->extcon_dev);
+   /* Remove notifier block if only edev exists */
+   if (cable->extcon_dev.edev)
+   extcon_unregister_interest(&cable->extcon_dev);
}
 
regulator_put(desc->charger_regulators[i].consumer);
-- 
1.7.9.5
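
The unwind rule in the patch above (release only what was actually
registered) can be sketched in userspace like this. The demo_cable structure
and both functions are hypothetical; edev merely plays the role of the
pointer that is set on successful registration.

```c
#include <assert.h>
#include <stddef.h>

struct demo_cable {
	void *edev;		/* non-NULL only after a successful registration */
	int unregistered;	/* counts unregister calls, for demonstration */
};

/* Stand-in for an unregister call that would dereference edev. */
static void demo_unregister(struct demo_cable *cable)
{
	cable->edev = NULL;
	cable->unregistered++;
}

/* Unwind helper: touch only the cables that were actually registered. */
static int demo_cleanup(struct demo_cable *cables, size_t n)
{
	int released = 0;

	for (size_t i = 0; i < n; i++) {
		if (!cables[i].edev)
			continue;
		demo_unregister(&cables[i]);
		released++;
	}
	return released;
}
```

Entries that never completed registration are skipped, so the error path
stays safe no matter at which cable the initialization failed.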



[PATCH 1/2] power: charger-manager: regulator_get() never returns NULL.

2013-06-24 Thread Jonghwa Lee
This patch fixes return value checking of regulator_get() in charger-manager
driver. The API, regulator_get(), returns ERR_PTR() when it fails to get
regulator with given name, not NULL.

Signed-off-by: Jonghwa Lee 
Signed-off-by: Myungjoo Ham 
---
 drivers/power/charger-manager.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/charger-manager.c b/drivers/power/charger-manager.c
index ba42029..7d1bcde 100644
--- a/drivers/power/charger-manager.c
+++ b/drivers/power/charger-manager.c
@@ -1239,7 +1239,7 @@ static int charger_manager_register_extcon(struct 
charger_manager *cm)
 
charger->consumer = regulator_get(cm->dev,
charger->regulator_name);
-   if (charger->consumer == NULL) {
+   if (IS_ERR(charger->consumer)) {
dev_err(cm->dev, "Cannot find charger(%s)\n",
charger->regulator_name);
ret = -EINVAL;
-- 
1.7.9.5



Re: [PATCH 3/3 v16] iommu/fsl: Freescale PAMU driver and iommu implementation.

2013-06-24 Thread Alex Williamson
On Thu, 2013-06-20 at 21:31 +0530, Varun Sethi wrote:

> +#define REQ_ACS_FLAGS (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
> +
> +static struct iommu_group *get_device_iommu_group(struct device *dev)
> +{
> + struct iommu_group *group;
> +
> + group = iommu_group_get(dev);
> + if (!group)
> + group = iommu_group_alloc();
> +
> + return group;
> +}
> +
[snip]
> +

This really gets parent or peer, right?

> +static struct iommu_group *get_peer_pci_device_group(struct pci_dev *pdev)
> +{
> + struct iommu_group *group = NULL;
> +
> + /* check if this is the first device on the bus*/
> + if (pdev->bus_list.next == pdev->bus_list.prev) {

It's a list_head, use list functions.  The list implementation should be
treated as opaque.

if (list_is_singular(&pdev->bus_list))

> + struct pci_bus *bus = pdev->bus->parent;
> + /* Traverese the parent bus list to get
> +  * pdev & dev for the sibling device.
> +  */
> + while (bus) {
> + if (!list_empty(&bus->devices)) {
> + pdev = container_of(bus->devices.next,
> + struct pci_dev, bus_list);

pdev = list_first_entry(&bus->devices, struct pci_dev, bus_list);

> + group = iommu_group_get(&pdev->dev);
> + break;
> + } else
> + bus = bus->parent;

Is this ever reached?  Don't you always have bus->self?

> + }
> + } else {
> + /*
> +  * Get the pdev & dev for the sibling device
> +  */
> + pdev = container_of(pdev->bus_list.prev,
> + struct pci_dev, bus_list);

How do you know if you're at the head or tail of the list?

struct pci_dev *tmp;
list_for_each_entry(tmp, &pdev->bus->devices, bus_list) {
if (tmp == pdev)
continue;

group = iommu_group_get(&tmp->dev);
break;
}
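
One way to see the concern about head and tail: on a circular list, the
first entry's prev pointer is the list head itself, which is not embedded in
any device. The following userspace sketch uses a tiny hand-written list
(the demo_* names are hypothetical, not the kernel list API) and picks a peer
by walking from the head instead.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list {
	struct demo_list *next, *prev;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty circular list points at itself. */
static void demo_list_init(struct demo_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry just before head, i.e. at the tail of the list. */
static void demo_list_add_tail(struct demo_list *entry, struct demo_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

struct demo_dev {
	int id;
	struct demo_list bus_list;
};

/* Return the first device on the bus other than self, or NULL if alone. */
static struct demo_dev *demo_first_peer(struct demo_list *devices,
					const struct demo_dev *self)
{
	for (struct demo_list *pos = devices->next; pos != devices;
	     pos = pos->next) {
		struct demo_dev *dev =
			demo_container_of(pos, struct demo_dev, bus_list);

		if (dev != self)
			return dev;
	}
	return NULL;
}
```

Because the walk starts at the head and stops when it returns to it, the
head is never converted into a device pointer, whichever position the
starting device occupies.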

> + group = iommu_group_get(&pdev->dev);
> + }
> +
> + return group;
> +}
> +
> +static struct iommu_group *get_pci_device_group(struct pci_dev *pdev)
> +{
> + struct iommu_group *group = NULL;
> + struct pci_dev *bridge, *dma_pdev = NULL;
> + struct pci_controller *pci_ctl;
> + bool pci_endpt_partioning;
> +
> + pci_ctl = pci_bus_to_host(pdev->bus);
> + pci_endpt_partioning = check_pci_ctl_endpt_part(pci_ctl);
> + /* We can partition PCIe devices so assign device group to the device */
> + if (pci_endpt_partioning) {
> + bridge = pci_find_upstream_pcie_bridge(pdev);
> + if (bridge) {
> + if (pci_is_pcie(bridge))
> + dma_pdev = pci_get_domain_bus_and_slot(
> + pci_domain_nr(pdev->bus),
> + bridge->subordinate->number, 0);
> + if (!dma_pdev)
> + dma_pdev = pci_dev_get(bridge);
> + } else
> + dma_pdev = pci_dev_get(pdev);
> +
> + /* Account for quirked devices */
> + swap_pci_ref(&dma_pdev, pci_get_dma_source(dma_pdev));
> +
> + /*
> +  * If it's a multifunction device that does not support our
> +  * required ACS flags, add to the same group as function 0.
> +  */

See c14d2690 in Joerg's next tree, using function 0 was a poor
assumption.

> + if (dma_pdev->multifunction &&
> + !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS))
> + swap_pci_ref(&dma_pdev,
> +  pci_get_slot(dma_pdev->bus,
> +   
> PCI_DEVFN(PCI_SLOT(dma_pdev->devfn),
> +   0)));
> +
> + group = get_device_iommu_group(&pdev->dev);
> + pci_dev_put(pdev);

What was the point of all the above if we use pdev here instead of
dma_pdev?  Wrong device and broken reference counting.  This also isn't
testing ACS all the way up to the root complex or controller.

> + /*
> +  * PCIe controller is not a paritionable entity
> +  * free the controller device iommu_group.
> +  */
> + if (pci_ctl->parent->iommu_group)
> + iommu_group_remove_device(pci_ctl->parent);
> + } else {
> + /*
> +  * All devices connected to the controller will share the
> +  * PCI controllers device group. If this is the first
> +  * device to be probed for the pci controller, copy the
> +  * device group information from the PCI controller device
> +  * node and remove the PCI controller iommu group.
> +  * For subsequent devices, the iommu group information can
> +  * be obtained 

[PATCH Resend] drivers: uio_pdrv_genirq: Use of_match_ptr() macro

2013-06-24 Thread Sachin Kamat
This eliminates having an #ifdef returning NULL for the case
when OF is disabled.

Signed-off-by: Sachin Kamat 
---
Rebased on latest char-misc-next branch of char-misc tree.
---
 drivers/uio/uio_pdrv_genirq.c |5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/uio/uio_pdrv_genirq.c b/drivers/uio/uio_pdrv_genirq.c
index bcd72f3..4eb8eaf 100644
--- a/drivers/uio/uio_pdrv_genirq.c
+++ b/drivers/uio/uio_pdrv_genirq.c
@@ -271,11 +271,8 @@ static struct of_device_id uio_of_genirq_match[] = {
{ /* Sentinel */ },
 };
 MODULE_DEVICE_TABLE(of, uio_of_genirq_match);
-
 module_param_string(of_id, uio_of_genirq_match[0].compatible, 128, 0);
 MODULE_PARM_DESC(of_id, "Openfirmware id of the device to be handled by uio");
-#else
-# define uio_of_genirq_match NULL
 #endif
 
 static struct platform_driver uio_pdrv_genirq = {
@@ -285,7 +282,7 @@ static struct platform_driver uio_pdrv_genirq = {
.name = DRIVER_NAME,
.owner = THIS_MODULE,
.pm = &uio_pdrv_genirq_dev_pm_ops,
-   .of_match_table = uio_of_genirq_match,
+   .of_match_table = of_match_ptr(uio_of_genirq_match),
},
 };
 
-- 
1.7.9.5
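
The idea behind the macro can be shown with a userspace sketch. The
DEMO_CONFIG_OF switch, demo_match_ptr() and the table contents are stand-ins
chosen for this example; the macro is assumed to expand to its argument when
the option is enabled and to NULL otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_OF; leave undefined to model OF disabled. */
/* #define DEMO_CONFIG_OF 1 */

#ifdef DEMO_CONFIG_OF
#define demo_match_ptr(ptr) (ptr)
static const char *const demo_match_table[] = { "demo,device", NULL };
#else
/* The argument is discarded, so the table does not need to exist. */
#define demo_match_ptr(ptr) NULL
#endif

struct demo_driver {
	const char *name;
	const char *const *match_table;
};

/* No #else branch defining the table as NULL is needed any more. */
static const struct demo_driver demo_drv = {
	.name = "uio_pdrv_genirq",
	.match_table = demo_match_ptr(demo_match_table),
};
```

With the switch left undefined, the initializer compiles even though the
table is never declared, and the driver simply ends up with a NULL match
table.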



Re: [PATCH 10/32] arc: delete __cpuinit usage from all arc files

2013-06-24 Thread Vineet Gupta
On 06/25/2013 01:00 AM, Paul Gortmaker wrote:
> The __cpuinit type of throwaway sections might have made sense
> some time ago when RAM was more constrained, but now the savings
> do not offset the cost and complications.  For example, the fix in
> commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
> is a good example of the nasty type of bugs that can be created
> with improper use of the various __init prefixes.
>
> After a discussion on LKML[1] it was decided that cpuinit should go
> the way of devinit and be phased out.  Once all the users are gone,
> we can then finally remove the macros themselves from linux/init.h.
>
> Note that some harmless section mismatch warnings may result, since
> notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
> are flagged as __cpuinit  -- so if we remove the __cpuinit from
> arch specific callers, we will also get section mismatch warnings.
> As an intermediate step, we intend to turn the linux/init.h cpuinit
> content into no-ops as early as possible, since that will get rid
> of these warnings.  In any case, they are temporary and harmless.
>
> This removes all the arch/arc uses of the __cpuinit macros from
> all C files.  Currently arc does not have any __CPUINIT used in
> assembly files.
>
> [1] https://lkml.org/lkml/2013/5/20/589
>
> Cc: Vineet Gupta 
> Signed-off-by: Paul Gortmaker 

Applied to ARC for-curr (for 3.11)

Thx a bunch,
-Vineet


Re: [PATCH V2] ahci: sata: add support for exynos5440 sata

2013-06-24 Thread Girish KS
On Mon, Jun 17, 2013 at 11:54 PM, Tejun Heo  wrote:
> On Tue, Apr 16, 2013 at 02:58:02PM +0530, Girish K S wrote:
>> This patch adds the compatible string of the exynos5440 sata controller
>> compliant with the ahci 1.3 and sata 3.0 specification.
>>
>> Signed-off-by: Girish K S 
>>
>> changes in v2:
>>   changed the compatible string by adding the actual IP
>> owners name instead of the SoC vendor name.
>
> Applied to libata/for-3.11.
>
> Can you please keep SOB at the end of the description after the
> revision history from the next time?

Sure. Thanks

>
> Thanks.
>
> --
> tejun


Re: [PATCH v8] ethernet/arc/arc_emac - Add new driver

2013-06-24 Thread Alexey Brodkin
On 06/24/2013 09:54 AM, Alexey Brodkin wrote:
> Driver for non-standard on-chip ethernet device ARC EMAC 10/100,
> instantiated in some legacy ARC (Synopsys) FPGA Boards such as
> ARCAngel4/ML50x.

Any comments on this patch?

If not please consider applying.

Regards,
Alexey


Re: [PATCH 25/32] cpufreq: delete __cpuinit usage from all cpufreq files

2013-06-24 Thread Joe Perches
On Tue, 2013-06-25 at 09:01 +0530, Viresh Kumar wrote:
> On 25 June 2013 01:00, Paul Gortmaker  wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
[]
> > -static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> > -   unsigned long action, void *hcpu)
> > +static int cpufreq_cpu_callback(struct notifier_block *nfb,
> > +   unsigned long action, void *hcpu)
> 
> You were not required to change second line here and also don't
> change its indentation level. Check this everywhere.

Paul, yes, thanks for doing that here,
but please, do it everywhere...

;)

> > diff --git a/drivers/cpufreq/cpufreq_stats.c 
> > b/drivers/cpufreq/cpufreq_stats.c
[]
> > @@ -306,7 +306,7 @@ static int cpufreq_stat_notifier_policy(struct 
> > notifier_block *nb,
> >  }
> >
> >  static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
> > -   unsigned long val, void *data)
> > +  unsigned long val, void *data)
> 
> See.. unnecessary change.

Or from another perspective, ideal change.

> > -static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> > +static int cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> >unsigned long action,
> >void *hcpu)

Hey Paul, you missed some too.

Viresh, there's no absolute "right" way to do this.




Re: [RFC 1/2] x86_64, mm: Delay initializing large portion of memory

2013-06-24 Thread Rob Landley

On 06/21/2013 11:25:33 AM, Nathan Zimmer wrote:
> On a 16TB system it can take upwards of two hours to boot the system with
> about 60% of the time being spent initializing memory.  This patch delays
> initializing a large portion of memory until after the system is booted.
> This can significantly reduce the time it takes to boot the system, down
> to the 15 to 30 minute range.


Why is this conditional? Initialize the minimum amount of memory to  
bring up each NUMA node, and then have each processor initialize its  
own memory. I would have thought it was already doing this...




> +	delay_mem_init=B:M:n:l:h
> +			This delays the initialization of a large portion of
> +			memory by inserting it into the "absent" memory list.
> +			This allows the system to boot up much faster at the
> +			expense of the time needed to add this absent memory
> +			after the system has booted.  That however can be done
> +			in parallel with other operations.


This seems like a giant advertisement primarily aimed at repeating why  
you think we need to merge the patch, not explaining what it is or how  
to use it.


I would rephrase:

			Defer memory initialization until after SMP init (so
			large memory ranges can be initialized in parallel) by
			moving memory not needed during boot to the "absent" list.


And I repeat: why do we need to micromanage this? It sounds like all  
NUMA systems should do something like this. (Single-threaded memory  
initialization in an SMP system is kind of weird.)



> +			Format: B:M:n:l:h
> +			(1 << B) is the block size (bsize)
> +				 ['0' indicates use the default 128M]
> +			(1 << M) is the address space per node
> +			(n * bsize) is minimum sized node memory to slice
> +			(l * bsize) is low memory to leave on node
> +			(h * bsize) is high memory to leave on node


I don't understand this in the slightest. I understand "low memory to  
leave on the node", I have no idea why there are four other parameters.
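
For readers trying to follow the format string, here is one possible reading of B:M:n:l:h, sketched as a small userspace C parser. This is an illustration only and not code from the patch: the struct, the function name and the DEFAULT_BLOCK_SHIFT macro are hypothetical, and treating B == 0 as a shift of 27 is inferred from the "default 128M" note in the quoted text.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the decoded delay_mem_init=B:M:n:l:h values. */
struct delay_mem_cfg {
	uint64_t bsize;        /* block size in bytes: 1 << B */
	uint64_t node_space;   /* address space per node: 1 << M */
	uint64_t min_node_mem; /* n * bsize: minimum sized node memory to slice */
	uint64_t low_keep;     /* l * bsize: low memory to leave on the node */
	uint64_t high_keep;    /* h * bsize: high memory to leave on the node */
};

/* Assumption: B == 0 selects the default 128M block, i.e. 1 << 27. */
#define DEFAULT_BLOCK_SHIFT 27

/* Returns 0 on success, -1 if the string is malformed or a shift is too big. */
int parse_delay_mem_init(const char *arg, struct delay_mem_cfg *cfg)
{
	unsigned int b, m, n, l, h;
	char trailing;

	/* Expect exactly five colon-separated fields and nothing after them. */
	if (sscanf(arg, "%u:%u:%u:%u:%u%c", &b, &m, &n, &l, &h, &trailing) != 5)
		return -1;
	if (b == 0)
		b = DEFAULT_BLOCK_SHIFT;
	if (b > 63 || m > 63)
		return -1;

	cfg->bsize = UINT64_C(1) << b;
	cfg->node_space = UINT64_C(1) << m;
	cfg->min_node_mem = (uint64_t)n * cfg->bsize;
	cfg->low_keep = (uint64_t)l * cfg->bsize;
	cfg->high_keep = (uint64_t)h * cfg->bsize;
	return 0;
}
```

Under this reading, delay_mem_init=0:40:4:2:1 would mean 128M blocks, 1TB of address space per node, nodes smaller than four blocks left alone, and two low blocks plus one high block kept on each node. Whether the patch interprets the fields exactly this way is not confirmed by the quoted text.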




> +config DELAY_MEM_INIT
> +	bool "Delay memory initialization"
> +	depends on EFI && MEMORY_HOTPLUG_SPARSE
> +	---help---
> +	  This option delays initializing a large portion of memory
> +	  until after the system is booted.  This can significantly
> +	  reduce the time it takes to boot the system when there
> +	  is a significant amount of memory present.  Systems with
> +	  8TB or more of memory benefit the most.


I can see an SMP phone wanting to use this to shave a quarter second  
off its boot time. Your "large portion of memory" description is a bit  
myopic.


Rob


Re: cgroup: status-quo and userland efforts

2013-06-24 Thread Tim Hockin
On Mon, Jun 24, 2013 at 5:01 PM, Tejun Heo  wrote:
> Hello, Tim.
>
> On Sat, Jun 22, 2013 at 04:13:41PM -0700, Tim Hockin wrote:
>> I'm very sorry I let this fall off my plate.  I was pointed at a
>> systemd-devel message indicating that this is done.  Is it so?  It
>
> It's progressing pretty fast.
>
>> seems so completely ass-backwards to me. Below is one of our use-cases
>> that I just don't see how we can reproduce in a single hierarchy.
>
> Configurations which depend on orthogonal multiple hierarchies of
> course won't be replicated under unified hierarchy.  It's unfortunate
> but those just have to go.  More on this later.

I really want to understand why this is SO IMPORTANT that you have to
break userspace compatibility?  I mean, isn't Linux supposed to be the
OS with the stable kernel interface?  I've seen Linus rant time and
time again about this - why is it OK now?

>> We're also long into the model that users can control their own
>> sub-cgroups (moderated by permissions decided by admin SW up front).
>
> If you're in control of the base system, nothing prevents you from
> doing so.  It's utterly broken security and policy-enforcement point
> of view but if you can trust each software running on your system to
> do the right thing, it's gonna be fine.

Examples?  we obviously don't grant full access, but our kernel gang
and security gang seem to trust the bits we're enabling well enough...

>> This gives us 4 combinations:
>>   1) { production, DTF }
>>   2) { production, non-DTF }
>>   3) { batch, DTF }
>>   4) { batch, non-DTF }
>>
>> Of these, (3) is sort of nonsense, but the others are actually used
>> and needed.  This is only
>> possible because of split hierarchies.  In fact, we undertook a very painful
>> process to move from a unified cgroup hierarchy to split hierarchies in large
>> part _because of_ these examples.
>
> You can create three sibling cgroups and configure cpuset and blkio
> accordingly.  For cpuset, the setup wouldn't make any difference.  For
> blkio, the two non-DTFs would now belong to different cgroups and
> compete with each other as two groups, which won't matter at all as
> non-DTFs are given what's left over after serving DTFs anyway, IIRC.

The non-DTF jobs have a combined share that is small but non-trivial.
If we cut that share in half, giving one slice to prod and one slice
to batch, we get bad sharing under contention.  We tried this.  We
could add control loops in userspace code which try to balance the
shares in proportion to the load.  We did that with CPU, and it's sort
of horrible.  We're moving AWAY from all this craziness in favor of
well-defined hierarchical behaviors.

>> Making cgroups composable allows us to build a higher level abstraction that
>> is very powerful and flexible.  Moving back to unified hierarchies goes
>> against everything that we're doing here, and will cause us REAL pain.
>
> Categorizing processes into hierarchical groups of tasks is a
> fundamental idea and a fundamental idea is something to base things on
> top of as it's something people can agree upon relatively easily and
> establish a structure by.  I'd go as far as saying that it's the
> failure on the part of workload design if they in general can't be
> categorized hierarchically.

It's a bit naive to think that this is some absolute truth, don't you
think?  It just isn't so.  You should know better than most what
craziness our users do, and what (legit) rationales they can produce.
I have $large_number of machines running $huge_number of jobs from
thousands of developers running for years upon years backing up my
worldview.

> Even at the practical level, the orthogonal hierarchy encouraged, at
> the very least, the blkcg writeback support which can't be upstreamed
> in any reasonable manner because it is impossible to say that a
> resource can't be said to belong to a cgroup irrespective of who's
> looking at it.

I'm not sure I really grok that statement.  I'm OK with defining new
rules that bring some order to the chaos.  Give us new rules to live
by.  All-or-nothing would be fine.  What if mounting cgroupfs gives me
N sub-dirs, one for each compiled-in controller?  You could make THAT
the mount option - you can have either a unified hierarchy of all
controllers or fully disjoint hierarchies.  Or some other rule.

> It's something fundamentally broken and I have very difficult time
> believing google's workload is so different that it can't be
> categorized in a single hierarchy for the purpose of resource
> distribution.  I'm sure there are cases where some compromises are
> necessary but the alternative is much worse here.  As I wrote multiple
> times now, multiple orthogonal hierarchy support is gonna be around
> for some time, so I don't think there's any reason for panic; that
> said, please at least plan to move on.

The time frame you talk about IS reason for panic.  If I know that
you're going to completely screw me in a year and a half, I have to
start 

Re: [PATCH] PCI: avoid NULL deref in alloc_pcie_link_state

2013-06-24 Thread Alex Williamson
On Mon, 2013-06-24 at 21:35 -0600, Bjorn Helgaas wrote:
> On Mon, Jun 24, 2013 at 8:58 PM, Alex Williamson
>  wrote:
> > On Mon, 2013-06-24 at 19:38 -0600, Bjorn Helgaas wrote:
> >> [+cc Michael, Alex, Isaku]
> >>
> >> On Wed, Jun 19, 2013 at 12:56 PM, Radim Krčmář  wrote:
> >> > PCIe switch upstream port can be connected directly to the PCIe root bus
> >> > in QEMU; ASPM does not expect this topology and dereferences NULL pointer
> >> > when initializing.
> >> >
> >> > I have not confirmed this can happen on real hardware, but it is 
> >> > presented
> >> > as a feature in QEMU, so there is no reason to panic if we can recover.
> >>
> >> This doesn't seem like a valid hardware topology to me.  If this *can*
> >> occur on real hardware, we should fix it in Linux.  If not, maybe QEMU
> >> should be changed to disallow it.
> >
> > I think a quad-port 82576 plugged into an express slot is likely the
> > same topology.
> 
> I don't think that would be the same topology Radim described.  In
> Radim's case, we have this:
> 
>   00:03.0 upstream port
>   01:00.0 downstream port
> 
> and when we call alloc_pcie_link_state() for 01:00.0,
> 
>   pdev is 01:00.0
>   pdev->bus is bus 01
>   pdev->bus->parent is bus 00
>   pdev->bus->parent->self (the bridge device leading to bus 00) is NULL
> 
> But in the case of a quad 82576 plugged into a slot, there would be a
> root port or a downstream port leading to the slot's link, so my guess
> is we'd have something like this (based on lspci output I found at
> [1]):
> 
>   00:05.0 root port leading to slot (bridge to [bus 01-08])
>   01:00.0 upstream port of switch on card (bridge to [bus 02-08])
>   02:02.0 downstream port (bridge to [bus 03-05])
>   02:04.0 downstream port (bridge to [bus 06-08])
>   03:00.0 82576 port 0
>   03:00.1 82576 port 1
>   06:00.0 82576 port 2
>   06:00.1 82576 port 3
> 
> So when we call alloc_pcie_link_state() for 02:02.0,
> 
>   pdev is 02:02.0
>   pdev->bus is bus 02
>   pdev->bus->parent is bus 01
>   pdev->bus->parent->self (the bridge leading to bus 01) is 00:05.0


Oops, I misread his statement.  I don't think an upstream port connected
directly to a root complex is valid.  I thought he was having problems
with an upstream port connected to a root port, which is obviously
valid.  QEMU lets us attach nearly anything to the root complex, most of
them aren't valid.  Thanks,

Alex



RE: [PATCH] RFC: mmc: dw_mmc: Always go to STATE_DATA_BUSY from STATE_DATA_ERROR

2013-06-24 Thread Bing Zhao

> I think the proposal on the table is to take Seungwon's patches
> instead of mine.  Assuming they solve your problems, I'm OK with that.
>  I think he was requesting testing the first of his two patches alone
> and then both of his two patches together.

Test #1: Seungwon's patch #1 alone [1]
Test #2: Seungwon's patch #2 alone [1]
Test #3: Seungwon's patch #1 and #2 [1]
Test #4: Doug's original patch [2]

Test #1 and #3: it doesn't work; system reboots due to kernel hung_task
Test #2 and #4: it works; instead of hung_task driver gets CRC error (which is 
expected)

Thanks,
Bing

[1] https://lkml.org/lkml/2013/4/8/316
[2] https://lkml.org/lkml/2013/3/15/583



[PATCH 0/2 v3] tracing/uprobes: Support multibuffer and soft-mode disabling

2013-06-24 Thread zhangwei(Jovi)
This is the third version of the uprobes-based dynamic events
multibuffer and soft-mode disabling support work.

[v3]:
1. separate the soft-mode disabling patch from the multibuffer support patch.
2. fix some comments
3. trivial coding style fixes
4. fix flags race in probe_event_disable


zhangwei(Jovi) (2):
  tracing/uprobes: Support ftrace_event_file base multibuffer
  tracing/uprobes: Support soft-mode disabling

 kernel/trace/trace_uprobe.c |  121 +++
 1 file changed, 100 insertions(+), 21 deletions(-)

-- 
1.7.9.7




Re: [PATCH] PCI: avoid NULL deref in alloc_pcie_link_state

2013-06-24 Thread Bjorn Helgaas
On Mon, Jun 24, 2013 at 8:58 PM, Alex Williamson
 wrote:
> On Mon, 2013-06-24 at 19:38 -0600, Bjorn Helgaas wrote:
>> [+cc Michael, Alex, Isaku]
>>
>> On Wed, Jun 19, 2013 at 12:56 PM, Radim Krčmář  wrote:
>> > PCIe switch upstream port can be connected directly to the PCIe root bus
>> > in QEMU; ASPM does not expect this topology and dereferences NULL pointer
>> > when initializing.
>> >
>> > I have not confirmed this can happen on real hardware, but it is presented
>> > as a feature in QEMU, so there is no reason to panic if we can recover.
>>
>> This doesn't seem like a valid hardware topology to me.  If this *can*
>> occur on real hardware, we should fix it in Linux.  If not, maybe QEMU
>> should be changed to disallow it.
>
> I think a quad-port 82576 plugged into an express slot is likely the
> same topology.

I don't think that would be the same topology Radim described.  In
Radim's case, we have this:

  00:03.0 upstream port
  01:00.0 downstream port

and when we call alloc_pcie_link_state() for 01:00.0,

  pdev is 01:00.0
  pdev->bus is bus 01
  pdev->bus->parent is bus 00
  pdev->bus->parent->self (the bridge device leading to bus 00) is NULL

But in the case of a quad 82576 plugged into a slot, there would be a
root port or a downstream port leading to the slot's link, so my guess
is we'd have something like this (based on lspci output I found at
[1]):

  00:05.0 root port leading to slot (bridge to [bus 01-08])
  01:00.0 upstream port of switch on card (bridge to [bus 02-08])
  02:02.0 downstream port (bridge to [bus 03-05])
  02:04.0 downstream port (bridge to [bus 06-08])
  03:00.0 82576 port 0
  03:00.1 82576 port 1
  06:00.0 82576 port 2
  06:00.1 82576 port 3

So when we call alloc_pcie_link_state() for 02:02.0,

  pdev is 02:02.0
  pdev->bus is bus 02
  pdev->bus->parent is bus 01
  pdev->bus->parent->self (the bridge leading to bus 01) is 00:05.0
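
The two topologies above can be modelled in a few lines of userspace C. This is only a sketch: the structs are cut-down stand-ins for the kernel's struct pci_dev and struct pci_bus, and parent_bus_bridge() is a hypothetical helper, not the fix that was proposed in the patch. It shows why the pdev->bus->parent->self walk yields NULL in Radim's case and a valid bridge in the 82576 case.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's struct pci_dev / struct pci_bus. */
struct pci_bus;

struct pci_dev {
	const char *name;     /* e.g. "02:02.0", for illustration only */
	struct pci_bus *bus;  /* bus this device sits on */
};

struct pci_bus {
	struct pci_bus *parent; /* NULL for a root bus */
	struct pci_dev *self;   /* bridge leading to this bus, NULL for a root bus */
};

/*
 * Hypothetical helper: return pdev->bus->parent->self, i.e. the bridge
 * leading to the parent of pdev's bus, or NULL when a link in the chain
 * is missing. A caller must still check the result before using it.
 */
struct pci_dev *parent_bus_bridge(const struct pci_dev *pdev)
{
	if (!pdev || !pdev->bus || !pdev->bus->parent)
		return NULL;
	return pdev->bus->parent->self;
}
```

For 01:00.0 in Radim's topology the parent bus is root bus 00, which has no bridge leading to it, so the helper returns NULL. For 02:02.0 on the quad-port card the parent bus is bus 01, whose bridge is the root port 00:05.0.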

Bjorn

[1] http://sourceforge.net/p/e1000/bugs/112/


Re: [PATCH V2] USB: initialize or shutdown PHY when add or remove host controller

2013-06-24 Thread Felipe Balbi
Hi,

On Tue, Jun 25, 2013 at 09:25:30AM +0800, Chao Xie wrote:
> >> It is same as clk, irq requested by ehci-xxx driver.
> >
> > clocks could be handled generically in some cases, we have pm_clk_add()
> > for a reason ;-)
> >
> > Also, clock handling can be hidden under pm_runtime callbacks (say,
> > clk_enable() on ->runtime_resume(), clk_disable() on
> > ->runtime_suspend()). IRQ is actually handled by usbcore, you just pass
> > a handler which, in most cases, is the normal ehci_irq() handler.
> >
> > But we'll get to those later, let's focus on PHY for now.
> >
> clock is another story, and i know that OMAP has full system to handle
> the clock with PM runtime,
> i would like to discuss it when one day you want to do it.

sure, anytime.

> >> So i think add a flag and use usb_get_phy() is not very good.
> >
> > Alan was talking about use hcd->phy as that flag, no flag would be
> > added. But why isn't it very good ? you didn't mention your reasoning.
> >
> I maybe understand something wrong.
> Using hcd->phy as a flag to indicate whether the glue driver needs
> EHCI HCD to help with
> phy operations, such as initialization and shutdown, i think it is fine.
> If we add another member as a flag in EHCI HCD to indicate the PHY
> differences of each ehci-xxx.c driver,
> and handle them in EHCI HCD, i think that is not very good. Because as

no argument there :-)

> you said that make
> common part into EHCI HCD is the target, but this member will import
> all the differences to EHCI HCD.

oh no, by 'flag' I meant something to tell ehci-hcd that we want to
handle PHY by ourselves, but as Alan pointed out, we don't need a
separate flag.

IOW, I didn't mean to cater for OMAP's peculiarities in the generic code
:-)

> It is better to let the ehci-xxx.c driver to handle the differences if
> it does not fit EHCI HCD's requirement
> for common PHY handling just as this patch did.

right :-)

> >> It is better to make ehci-xxx to do the phy getting and EHCI HCD
> >> initialize it and shut down as the patch did, or let ehci-xxx to
> >> handle the PHY as Roger said.
> >
> > right, so this is what Alan suggested:
> >
> > ehci-xxx.c does usb_get_phy() (or any of those variants) and sets the
> > returned pointer to hcd->phy. From that point on, ehci-hcd will play
> > with the phy, resuming and suspending at the proper locations, asking
> > the phy to enable wakeup capabilities and the like.
> >
> > In fact, because of that, I was just considering if I should protect
> > usb_phy* against NULL pointers, just to make EHCI's life easier, I mean:
> >
> > static inline int usb_phy_set_suspend(struct usb_phy *phy, int suspend)
> > {
> > if (!phy)
> > return 0;
> >
> > return phy->suspend(phy, suspend);
> > }
> >
> This patch does not include the suspending/resuming. It is great that you are
> working on it.

yeah, I'll add that part so that ehci-hcd doesn't have to add if
(hcd->phy) all over the place.

> >> Based on the generic work is not too much, and does not look so
> >> meaningful. I suggest letting ehci-xxx
> >> do it.
> >
> > we'll end up with a boilerplate code in every single ehci-xxx doing
> > exactly the same thing. By building the common case in ehci-hcd, we can
> > make sure to focus efforts wrt power consumption, proper use of the phy
> > layer, etc in a single location which (almost) everybody shares.
> >
> > The other bits which are non-generic, can use ehci-hcd as a reference to
> > build their own stuff.
> >
> > my 2 cents
> >
> OK. I understand. I am not very familiar with PHY suspending/resuming.
> I hope that i can see the patch move all PHY handling to EHCI HCD
> including suspending/resuming, so
> i can change our ehci driver to fit it and continuing to push the USB
> patches ;-)

suspend/resume is usually very tricky, so I'd rather leave it for later.

For now, let's just build enough ground-work as to make it easier to
think about suspend/resume later :-)

Meaning that we can just add the bare minimum (init on probe and
shutdown on remove) and add more support as we go :-)
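
As a rough illustration of that bare minimum, here is a userspace C sketch of the idea discussed above: the glue driver either sets hcd->phy or leaves it NULL, and the common code initializes the PHY on add and shuts it down on remove without a separate flag. The structs are cut-down stand-ins for the kernel's struct usb_phy and struct usb_hcd, and every function name here (phy_init_if_present, sketch_add_hcd and so on) is hypothetical, not an existing kernel API.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct usb_phy. */
struct usb_phy {
	int (*init)(struct usb_phy *phy);
	void (*shutdown)(struct usb_phy *phy);
	int powered; /* illustration only: tracks the modelled PHY state */
};

/* Simplified stand-in for struct usb_hcd: only the phy pointer matters here. */
struct usb_hcd {
	struct usb_phy *phy; /* set by the glue driver, or left NULL */
};

/* NULL-tolerant wrappers, so callers need no "if (hcd->phy)" checks. */
int phy_init_if_present(struct usb_phy *phy)
{
	if (!phy || !phy->init)
		return 0;
	return phy->init(phy);
}

void phy_shutdown_if_present(struct usb_phy *phy)
{
	if (phy && phy->shutdown)
		phy->shutdown(phy);
}

/* Hypothetical common-code hooks: init on add, shutdown on remove. */
int sketch_add_hcd(struct usb_hcd *hcd)
{
	return phy_init_if_present(hcd->phy);
}

void sketch_remove_hcd(struct usb_hcd *hcd)
{
	phy_shutdown_if_present(hcd->phy);
}

/* Example PHY callbacks a glue driver might provide. */
int demo_phy_init(struct usb_phy *phy)
{
	phy->powered = 1;
	return 0;
}

void demo_phy_shutdown(struct usb_phy *phy)
{
	phy->powered = 0;
}
```

A glue driver that wants to manage its PHY itself simply leaves hcd->phy as NULL, and both hooks become no-ops. Suspend/resume is deliberately left out of the sketch, matching the suggestion to handle it later.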

cheers

-- 
balbi




[PATCH 2/2 v3] tracing/uprobes: Support soft-mode disabling

2013-06-24 Thread zhangwei(Jovi)
Support soft-mode disabling on uprobe-based dynamic events.
Soft-disabling is just ignoring recording if the soft disabled
flag is set.

Signed-off-by: zhangwei(Jovi) 
Cc: Masami Hiramatsu 
Cc: Frederic Weisbecker 
Cc: Oleg Nesterov 
Cc: Srikar Dronamraju 
---
 kernel/trace/trace_uprobe.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index dbbb4a9..d2da3ea 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -530,6 +530,9 @@ static void uprobe_trace_print(struct trace_uprobe *tu,

 	WARN_ON(call != ftrace_file->event_call);

+	if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
+		return;
+
 	size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
 	event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
 						call->event.type,
-- 
1.7.9.7




Re: [PATCH 25/32] cpufreq: delete __cpuinit usage from all cpufreq files

2013-06-24 Thread Viresh Kumar
On 25 June 2013 01:00, Paul Gortmaker  wrote:

> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index f8c2860..5687d28 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1900,8 +1900,8 @@ no_policy:
>  }
>  EXPORT_SYMBOL(cpufreq_update_policy);
>
> -static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> -   unsigned long action, void *hcpu)
> +static int cpufreq_cpu_callback(struct notifier_block *nfb,
> +   unsigned long action, void *hcpu)

You were not required to change second line here and also don't
change its indentation level. Check this everywhere.

>  {
> unsigned int cpu = (unsigned long)hcpu;
> struct device *dev;
> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> index fb65dec..ff50b0c 100644
> --- a/drivers/cpufreq/cpufreq_stats.c
> +++ b/drivers/cpufreq/cpufreq_stats.c
> @@ -306,7 +306,7 @@ static int cpufreq_stat_notifier_policy(struct 
> notifier_block *nb,
>  }
>
>  static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
> -   unsigned long val, void *data)
> +  unsigned long val, void *data)

See.. unnecessary change.

>  {
> struct cpufreq_freqs *freq = data;
> struct cpufreq_stats *stat;
> @@ -341,7 +341,7 @@ static int cpufreq_stat_notifier_trans(struct 
> notifier_block *nb,
> return 0;
>  }
>
> -static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> +static int cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>unsigned long action,
>void *hcpu)
>  {
> diff --git a/drivers/cpufreq/dbx500-cpufreq.c 
> b/drivers/cpufreq/dbx500-cpufreq.c
> index 6ec6539..8c005ac 100644
> --- a/drivers/cpufreq/dbx500-cpufreq.c
> +++ b/drivers/cpufreq/dbx500-cpufreq.c
> @@ -82,7 +82,7 @@ static unsigned int dbx500_cpufreq_getspeed(unsigned int 
> cpu)
> return freq_table[i].frequency;
>  }
>
> -static int __cpuinit dbx500_cpufreq_init(struct cpufreq_policy *policy)
> +static int dbx500_cpufreq_init(struct cpufreq_policy *policy)
>  {
> int res;
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 07f2840..b012d76 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -617,7 +617,7 @@ static int intel_pstate_verify_policy(struct 
> cpufreq_policy *policy)
> return 0;
>  }
>
> -static int __cpuinit intel_pstate_cpu_exit(struct cpufreq_policy *policy)
> +static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
>  {
> int cpu = policy->cpu;
>
> @@ -627,7 +627,7 @@ static int __cpuinit intel_pstate_cpu_exit(struct 
> cpufreq_policy *policy)
> return 0;
>  }
>
> -static int __cpuinit intel_pstate_cpu_init(struct cpufreq_policy *policy)
> +static int intel_pstate_cpu_init(struct cpufreq_policy *policy)
>  {
> int rc, min_pstate, max_pstate;
> struct cpudata *cpu;
> diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c
> index b6a0a7a..8c49261 100644
> --- a/drivers/cpufreq/longhaul.c
> +++ b/drivers/cpufreq/longhaul.c
> @@ -422,7 +422,7 @@ static int guess_fsb(int mult)
>  }
>
>
> -static int __cpuinit longhaul_get_ranges(void)
> +static int longhaul_get_ranges(void)
>  {
> unsigned int i, j, k = 0;
> unsigned int ratio;
> @@ -526,7 +526,7 @@ static int __cpuinit longhaul_get_ranges(void)
>  }
>
>
> -static void __cpuinit longhaul_setup_voltagescaling(void)
> +static void longhaul_setup_voltagescaling(void)
>  {
> union msr_longhaul longhaul;
> struct mV_pos minvid, maxvid, vid;
> @@ -780,7 +780,7 @@ static int longhaul_setup_southbridge(void)
> return 0;
>  }
>
> -static int __cpuinit longhaul_cpu_init(struct cpufreq_policy *policy)
> +static int longhaul_cpu_init(struct cpufreq_policy *policy)
>  {
> struct cpuinfo_x86 *c = &cpu_data(0);
> char *cpuname = NULL;
> diff --git a/drivers/cpufreq/longhaul.h b/drivers/cpufreq/longhaul.h
> index e2dc436..1928b92 100644
> --- a/drivers/cpufreq/longhaul.h
> +++ b/drivers/cpufreq/longhaul.h
> @@ -56,7 +56,7 @@ union msr_longhaul {
>  /*
>   * VIA C3 Samuel 1  & Samuel 2 (stepping 0)
>   */
> -static const int __cpuinitconst samuel1_mults[16] = {
> +static const int samuel1_mults[16] = {
> -1, /*  -> RESERVED */
> 30, /* 0001 ->  3.0x */
> 40, /* 0010 ->  4.0x */
> @@ -75,7 +75,7 @@ static const int __cpuinitconst samuel1_mults[16] = {
> -1, /*  -> RESERVED */
>  };
>
> -static const int __cpuinitconst samuel1_eblcr[16] = {
> +static const int samuel1_eblcr[16] = {
> 50, /*  -> RESERVED */
> 30, /* 0001 ->  3.0x */
> 40, /* 0010 ->  4.0x */
> @@ -97,7 +97,7 @@ static const int __cpuinitconst samuel1_eblcr[16] = {
>  /*
>   * 

[PATCH 1/2 v3] tracing/uprobes: Support ftrace_event_file base multibuffer

2013-06-24 Thread zhangwei(Jovi)
Support multi-buffer on uprobe-based dynamic events by
using ftrace_event_file.

This patch is based on the kprobe-based dynamic events multibuffer
support work, initially committed by Masami (commit 41a7dd420c),
but revised as below:

Oleg changed the kprobe-based multibuffer design from
array-pointers of ftrace_event_file into a simple list,
so this patch also changes to the list design.

rcu_read_lock/unlock are added into uprobe_trace_func/uretprobe_trace_func
to synchronize with ftrace_event_file list add and delete.

Even though we allow multiple uprobe instances now,
TP_FLAG_PROFILE/TP_FLAG_TRACE are still mutually exclusive
in probe_event_enable currently. This means we cannot allow
one user to use uprobe-tracer while another user is using
perf-probe on the same uprobe concurrently.
(Perhaps this will be fixed in the future; kprobe doesn't have this
limitation now.)

Signed-off-by: zhangwei(Jovi) 
Cc: Masami Hiramatsu 
Cc: Frederic Weisbecker 
Cc: Oleg Nesterov 
Cc: Srikar Dronamraju 
---
 kernel/trace/trace_uprobe.c |  118 +++
 1 file changed, 97 insertions(+), 21 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 32494fb0..dbbb4a9 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -53,6 +53,7 @@ struct trace_uprobe {
 	struct list_head		list;
 	struct ftrace_event_class	class;
 	struct ftrace_event_call	call;
+	struct list_head		files;
 	struct trace_uprobe_filter	filter;
 	struct uprobe_consumer		consumer;
 	struct inode			*inode;
@@ -65,6 +66,11 @@ struct trace_uprobe {
 	struct probe_arg		args[];
 };

+struct event_file_link {
+	struct ftrace_event_file	*file;
+	struct list_head		list;
+};
+
 #define SIZEOF_TRACE_UPROBE(n)			\
 	(offsetof(struct trace_uprobe, args) +	\
 	(sizeof(struct probe_arg) * (n)))
@@ -124,6 +130,7 @@ alloc_trace_uprobe(const char *group, const char *event, 
int nargs, bool is_ret)
 		goto error;

 	INIT_LIST_HEAD(&tu->list);
+	INIT_LIST_HEAD(&tu->files);
 	tu->consumer.handler = uprobe_dispatcher;
 	if (is_ret)
 		tu->consumer.ret_handler = uretprobe_dispatcher;
@@ -511,7 +518,8 @@ static const struct file_operations uprobe_profile_ops = {
 };

 static void uprobe_trace_print(struct trace_uprobe *tu,
-				unsigned long func, struct pt_regs *regs)
+				unsigned long func, struct pt_regs *regs,
+				struct ftrace_event_file *ftrace_file)
 {
 	struct uprobe_trace_entry_head *entry;
 	struct ring_buffer_event *event;
@@ -520,9 +528,12 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
 	int size, i;
 	struct ftrace_event_call *call = &tu->call;

+	WARN_ON(call != ftrace_file->event_call);
+
 	size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
-	event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
-						  size + tu->size, 0, 0);
+	event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
+						call->event.type,
+						size + tu->size, 0, 0);
 	if (!event)
 		return;

@@ -546,15 +557,28 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
 /* uprobe handler */
 static int uprobe_trace_func(struct trace_uprobe *tu, struct pt_regs *regs)
 {
-	if (!is_ret_probe(tu))
-		uprobe_trace_print(tu, 0, regs);
+	struct event_file_link *link;
+
+	if (is_ret_probe(tu))
+		return 0;
+
+	rcu_read_lock();
+	list_for_each_entry(link, &tu->files, list)
+		uprobe_trace_print(tu, 0, regs, link->file);
+	rcu_read_unlock();
+
 	return 0;
 }

 static void uretprobe_trace_func(struct trace_uprobe *tu, unsigned long func,
 				struct pt_regs *regs)
 {
-	uprobe_trace_print(tu, func, regs);
+	struct event_file_link *link;
+
+	rcu_read_lock();
+	list_for_each_entry(link, &tu->files, list)
+		uprobe_trace_print(tu, func, regs, link->file);
+	rcu_read_unlock();
 }

 /* Event entry printers */
@@ -605,33 +629,84 @@ typedef bool (*filter_func_t)(struct uprobe_consumer 
*self,
 				struct mm_struct *mm);

 static int
-probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
+probe_event_enable(struct trace_uprobe *tu, struct ftrace_event_file *file,
+		   filter_func_t filter)
 {
+	int enabled = 0;
 	int ret = 0;

+	/* we cannot call uprobe_register twice for same tu */
 	if (is_trace_uprobe_enabled(tu))
-		return -EINTR;
+		enabled = 1;
+
+	if (file) {
+

Re: [PATCH v2] tracing/uprobes: Support ftrace_event_file base multibuffer

2013-06-24 Thread zhangwei(Jovi)
On 2013/6/25 2:05, Oleg Nesterov wrote:
> Hi Jovi,
> 
> I'll try to read this patch carefully tomorrow.
> 
> Looks fine at first glance, but some nits below.
> 
> On 06/24, zhangwei(Jovi) wrote:
>>
>>  static int uprobe_trace_func(struct trace_uprobe *tu, struct pt_regs *regs)
>>  {
>> -if (!is_ret_probe(tu))
>> -uprobe_trace_print(tu, 0, regs);
>> +struct event_file_link *link;
>> +
>> +if (is_ret_probe(tu))
>> +return 0;
>> +
>> +rcu_read_lock();
>> +
>> +list_for_each_entry(link, &tu->files, list)
>> +uprobe_trace_print(tu, 0, regs, link->file);
>> +
>> +rcu_read_unlock();
> 
> Purely cosmetic and I won't argue, but why the empty lines around
> list_for_each_entry() ?
> 
>>  static int
>> -probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
>> +probe_event_enable(struct trace_uprobe *tu, struct ftrace_event_file *file,
>> +   filter_func_t filter)
>>  {
>> +int enabled = 0;
>>  int ret = 0;
>>
>> -if (is_trace_uprobe_enabled(tu))
>> +/*
>> + * Currently TP_FLAG_TRACE/TP_FLAG_PROFILE are mutually exclusive
>> + * for uprobe(filter argument issue), this need to fix in future.
>> + */
>> +if ((file && (tu->flags & TP_FLAG_PROFILE)) ||
>> +(!file && (tu->flags & TP_FLAG_TRACE)))
>>  return -EINTR;
> 
> Well, this looks confusing and overcomplicated, see below.
> 
>> +/* Currently we cannot call uprobe_register twice for same tu */
>> +if (is_trace_uprobe_enabled(tu))
>> +enabled = 1;
> 
> The comment is wrong. It is not that we can't do this "Currently".
> 
> We must not do uprobe_register(..., consumer) twice, consumer/uprobe
> are linked together.
> 
>> +if (file) {
>> +struct event_file_link *link;
>> +
> 
> Just add
>   if (TP_FLAG_PROFILE)
>   return -EINTR;
> 
> here and kill the complicated check below. Same for the "else" branch.
> 
>> +static void
>> +probe_event_disable(struct trace_uprobe *tu, struct ftrace_event_file *file)
>> +{
>> +if (file) {
>> +struct event_file_link *link;
>> +
>> +link = find_event_file_link(tu, file);
>> +if (!link)
>> +return;
>> +
>> +list_del_rcu(&link->list);
>> +/* synchronize with uprobe_trace_func/uretprobe_trace_func */
>> +synchronize_sched();
>> +kfree(link);
>> +
>> +if (!list_empty(&tu->files))
>> +return;
>> +
>> +tu->flags &= ~TP_FLAG_TRACE;
>> +} else
>> +tu->flags &= ~TP_FLAG_PROFILE;
>> +
>>
>>  WARN_ON(!uprobe_filter_is_empty(&tu->filter));
>>
>> -uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
>> -tu->flags &= ~flag;
>> +if (!is_trace_uprobe_enabled(tu))
>> +uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
> 
> Well, this is not exactly right... Currently this is fine, but still.
> 
> It would be better to clear TP_FLAG_TRACE/TP_FLAG_PROFILE after
> uprobe_unregister(), when we can't race with the running handler
> which can check ->flags.
> 
> And I'd suggest you to send the soft-enable/disable change in a
> separate (and trivial) patch.
> 
> Oleg.
Thanks Oleg, you are right. Please check the v3 patch.

.jovi
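
For illustration, the per-branch check suggested above (reject the request inside the "file" branch if profiling is active, and the reverse in the "else" branch) could look roughly like this in plain userspace C. This is only a sketch under stated assumptions: struct probe, the flag values and probe_enable() are stand-ins invented for the example, not the trace_uprobe code, and registering the consumer is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values; the real TP_FLAG_* constants live in the kernel. */
#define FLAG_TRACE   0x1u
#define FLAG_PROFILE 0x2u

/* Hypothetical stand-in for struct trace_uprobe. */
struct probe {
	unsigned int flags;
	int nr_files;       /* number of linked event files (trace mode) */
	int registrations;  /* how often the consumer was registered */
};

static bool probe_enabled(const struct probe *p)
{
	return (p->flags & (FLAG_TRACE | FLAG_PROFILE)) != 0;
}

/*
 * has_file == true  -> enable for tracing (one more event file)
 * has_file == false -> enable for profiling
 */
static int probe_enable(struct probe *p, bool has_file)
{
	bool was_enabled = probe_enabled(p);

	if (has_file) {
		/* Trace and profile are mutually exclusive in this sketch. */
		if (p->flags & FLAG_PROFILE)
			return -EINTR;
		p->nr_files++;
		p->flags |= FLAG_TRACE;
	} else {
		if (p->flags & FLAG_TRACE)
			return -EINTR;
		p->flags |= FLAG_PROFILE;
	}

	/* The consumer must be registered only once per probe. */
	if (!was_enabled)
		p->registrations++;

	assert(p->registrations == 1);
	return 0;
}
```

Enabling a second event file on an already traced probe succeeds without a second registration, while mixing the two modes returns -EINTR, as in the quoted patch.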


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH V10 1/4] pci: Add PCIe driver for Samsung Exynos

2013-06-24 Thread Kukjin Kim
Bjorn Helgaas wrote:
> 
> On Fri, Jun 21, 2013 at 04:24:54PM +0900, Jingoo Han wrote:
> > Exynos5440 has a PCIe controller which can be used as Root Complex.
> > This driver supports a PCIe controller as Root Complex mode.
> >
> > Signed-off-by: Surendranath Gurivireddy Balla 
> > Signed-off-by: Siva Reddy Kallam 
> > Signed-off-by: Jingoo Han 
> > Acked-by: Arnd Bergmann 
> 
> Acked-by: Bjorn Helgaas 
> 
> Please merge this through arm-soc as you discussed.
> 
Sounds great.

Arnd, please take this series into arm-soc tree directly by yourself with my
ack on arch/exynos stuff if you want ;)

Thanks,
- Kukjin



Re: RFC: Allow block drivers to poll for I/O instead of sleeping

2013-06-24 Thread Matthew Wilcox
On Mon, Jun 24, 2013 at 10:07:51AM +0200, Ingo Molnar wrote:
> I'm wondering, how will this scheme work if the IO completion latency is a 
> lot more than the 5 usecs in the testcase? What if it takes 20 usecs or 
> 100 usecs or more?

There's clearly a threshold at which it stops making sense, and our
current NAND-based SSDs are almost certainly on the wrong side of that
threshold!  I can't wait for one of the "post-NAND" technologies to make
it to market in some form that makes it economical to use in an SSD.

The problem is that some of the people who are looking at those
technologies are crazy.  They want to "bypass the kernel" and "do user
space I/O" because "the kernel is too slow".  This patch is part of an
effort to show them how crazy they are.  And even if it doesn't convince
them, at least users who refuse to rewrite their applications to take
advantage of magical userspace I/O libraries will see real performance
benefits.



[Suggestion] arch: avr32: compiler: compiling tools issue

2013-06-24 Thread Chen Gang
Hello Maintainers:

With allmodconfig and "avr32-linux-gnu-" set as the cross compiler prefix,
the build reports these errors:
  avr32-linux-gnu-gcc: error: unrecognized command line option ‘-mno-pic’
  avr32-linux-gnu-gcc: error: unrecognized command line option ‘-march=ap’

The related gcc version:
  [root@dhcp122 linux-next]# rpm -qf /usr/bin/avr32-linux-gnu-gcc
  gcc-avr32-linux-gnu-4.7.1-0.1.20120606.fc17.x86_64

Can we say this is a compiler issue, and that I need to build the cross
compiler myself to test it again?

Any additional suggestions or corrections are welcome.

Thanks.
--
Chen Gang

Asianux Corporation 


Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 12:58 +1000, Michael Ellerman wrote:
> On Tue, Jun 25, 2013 at 12:13:04PM +1000, Benjamin Herrenschmidt wrote:
> > On Tue, 2013-06-25 at 12:08 +1000, Michael Ellerman wrote:
> > > We're not checking for allocation failure, which we should be.
> > > 
> > > But this code is only used on powermac and 85xx, so it should probably
> > > just be a TODO to fix this up to handle the failure.
> > 
> > And what can we do if they fail ?
> 
> Fail up the chain and not unplug the CPU presumably.

BTW, isn't Srivatsa's series removing the need for stop_machine() on
unplug? That should mean we can use GFP_KERNEL, no?

Cheers,
Ben.




Re: RFC: Allow block drivers to poll for I/O instead of sleeping

2013-06-24 Thread Matthew Wilcox
On Mon, Jun 24, 2013 at 08:11:02PM -0400, Steven Rostedt wrote:
> What about hooking into the idle_balance code? That happens if we are
> about to go to idle but before the full schedule switch to the idle
> task.
> 
> 
> In __schedule(void):
> 
>   if (unlikely(!rq->nr_running))
>   idle_balance(cpu, rq);

That may be a great place to try it from the PoV of the scheduler, but are
you OK with me threading a struct backing_dev_info * all the way through
the scheduler to idle_balance()?  :-)


Re: RFC: Allow block drivers to poll for I/O instead of sleeping

2013-06-24 Thread Matthew Wilcox
On Mon, Jun 24, 2013 at 09:15:45AM +0200, Jens Axboe wrote:
> Willy, I think the general design is fine, hooking in via the bdi is the
> only way to get back to the right place from where you need to sleep.
> Some thoughts:
> 
> - This should be hooked in via blk-iopoll, both of them should call into
>   the same driver hook for polling completions.

I actually started working on this, then I realised that it's actually
a bad idea.  blk-iopoll's poll function is to poll the single I/O queue
closest to this CPU.  The iowait poll function is to poll all queues
that the I/O for this address_space might complete on.

I'm reluctant to ask drivers to define two poll functions, but I'm even
more reluctant to ask them to define one function with two purposes.

> - It needs to be more intelligent in when you want to poll and when you
>   want regular irq driven IO.

Oh yeah, absolutely.  While the example patch didn't show it, I wouldn't
enable it for all NVMe devices; only ones with sufficiently low latency.
There's also the ability for the driver to look at the number of
outstanding I/Os and return an error (eg -EBUSY) to stop spinning.

> - With the former note, the app either needs to opt in (and hence
>   willingly sacrifice CPU cycles of its scheduling slice) or it needs to
>   be nicer in when it gives up and goes back to irq driven IO.

Yup.  I like the way you framed it.  If the task *wants* to spend its
CPU cycles on polling for I/O instead of giving up the remainder of its
time slice, then it should be able to do that.  After all, it already can;
it can submit an I/O request via AIO, and then call io_getevents in a
tight loop.
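
As a rough userspace sketch of the two points above (the task opts in to polling, and the driver can return an error such as -EBUSY to stop the spinning), something like the following could model the decision. Everything here is hypothetical: struct fake_dev, dev_poll() and wait_for_io() are names made up for the example and are not part of the patch or of any driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; not the NVMe driver's real structures. */
struct fake_dev {
	int outstanding;      /* I/Os currently in flight */
	int busy_threshold;   /* above this, polling is not worthwhile */
	int polls_until_done; /* simulated completion latency, in polls */
};

/*
 * Hypothetical driver poll hook:
 *   1      -> the I/O we wait for has completed
 *   0      -> nothing yet, caller may poll again
 *   -EBUSY -> too much outstanding I/O, caller should stop spinning
 */
static int dev_poll(struct fake_dev *dev)
{
	if (dev->outstanding > dev->busy_threshold)
		return -EBUSY;
	if (dev->polls_until_done > 0)
		dev->polls_until_done--;
	return dev->polls_until_done == 0 ? 1 : 0;
}

enum wait_result { WAIT_POLLED, WAIT_SLEPT };

/* Spin for at most max_polls attempts, then fall back to sleeping. */
static enum wait_result wait_for_io(struct fake_dev *dev, bool task_wants_poll,
				    int max_polls)
{
	assert(max_polls >= 0);

	/* Tasks that did not opt in never burn their time slice. */
	if (!task_wants_poll)
		return WAIT_SLEPT;

	for (int i = 0; i < max_polls; i++) {
		int ret = dev_poll(dev);

		if (ret > 0)
			return WAIT_POLLED;
		if (ret < 0)
			break; /* driver asked us to stop spinning */
	}
	return WAIT_SLEPT; /* stands in for the irq driven path */
}
```

The max_polls bound is an assumption of the sketch; it stands in for whatever heuristic decides when to give up and go back to irq driven I/O.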

So maybe the right way to do this is with a task flag?  If we go that
route, I'd like to further develop this option to allow I/Os to be
designated as "low latency" vs "normal".  Taking a page fault would be
"low latency" for all tasks, not just ones that choose to spin for I/O.


Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Michael Ellerman
On Tue, Jun 25, 2013 at 12:13:04PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-06-25 at 12:08 +1000, Michael Ellerman wrote:
> > We're not checking for allocation failure, which we should be.
> > 
> > But this code is only used on powermac and 85xx, so it should probably
> > just be a TODO to fix this up to handle the failure.
> 
> And what can we do if they fail ?

Fail up the chain and not unplug the CPU presumably.

cheers


Re: [PATCH] PCI: avoid NULL deref in alloc_pcie_link_state

2013-06-24 Thread Alex Williamson
On Mon, 2013-06-24 at 19:38 -0600, Bjorn Helgaas wrote:
> [+cc Michael, Alex, Isaku]
> 
> On Wed, Jun 19, 2013 at 12:56 PM, Radim Krčmář  wrote:
> > PCIe switch upstream port can be connected directly to the PCIe root bus
> > in QEMU; ASPM does not expect this topology and dereferences NULL pointer
> > when initializing.
> >
> > I have not confirmed this can happen on real hardware, but it is presented
> > as a feature in QEMU, so there is no reason to panic if we can recover.
> 
> This doesn't seem like a valid hardware topology to me.  If this *can*
> occur on real hardware, we should fix it in Linux.  If not, maybe QEMU
> should be changed to disallow it.

I think a quad-port 82576 plugged into an express slot is likely the
same topology.

> > The dereference happens with topology defined by
> >   -M q35 -device x3130-upstream,bus=pcie.0,id=upstream \
> >   -device xio3130-downstream,bus=upstream,id=downstream,chassis=1
> > where on line drivers/pci/pcie/aspm.c:530 (alloc_pcie_link_state+13):
> > parent = pdev->bus->parent->self->link_state;
> > "pdev->bus->parent->self == NULL", because "pdev->bus->parent" has no
> > "->parent", hence no "->self".
> >
> > Even though discouraged by QEMU documentation, one can set up even
> > topology without the upstream port
> >   -M q35 -device xio3130-downstream,bus=pcie.0,id=downstream,chassis=1
> > so "pdev->bus->parent == NULL", because "pdev->bus" is the root bus.
> > The patch checks for this too, because I do not like *NULL.

I don't think this is legal on real hardware.

> > Right now, PCIe switch has to connect to the root port
> >   -M q35 -device ioh3420,bus=pcie.0,id=root.0 \
> >   -device x3130-upstream,bus=root.0,id=upstream \
> >   -device xio3130-downstream,bus=upstream,id=downstream,chassis=1
> >
> > Signed-off-by: Radim Krčmář 
> > ---
> >  drivers/pci/pcie/aspm.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 403a443..1ad1514 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -527,8 +527,8 @@ static struct pcie_link_state *alloc_pcie_link_state(struct pci_dev *pdev)
> > link->pdev = pdev;
> > if (pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM) {
> > struct pcie_link_state *parent;
> > -   parent = pdev->bus->parent->self->link_state;
> > -   if (!parent) {
> > +   if (!pdev->bus->parent || !pdev->bus->parent->self ||

I think there's an extra test in here.  Elsewhere we seem to assume that
if parent exists, then so does parent->self.  So this could be
simplified to just add:

if (pci_is_root_bus(pdev->bus)) {
kfree(link);
return NULL;
}

> > +   !(parent = pdev->bus->parent->self->link_state)) {
> > kfree(link);
> > return NULL;
> > }
> > --
> > 1.8.1.4
> 
> I don't really want to further complicate the "if" statement you're
> changing.  The link state allocation is pretty obtuse already, and if
> this situation only occurs in QEMU, we're likely to break it again
> when somebody refactors this code.

Maybe the above plus a common exit to avoid duplicate free/return.
Thanks,

Alex
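
To make the "common exit" idea concrete, here is a small userspace sketch with mock structures. It is not the aspm.c code: struct dev, struct bus, is_root_bus() and alloc_link_state() are simplified stand-ins invented for the example. It also keeps the NULL check on the parent bridge from the original patch rather than relying on the assumption that ->self always exists, so both QEMU topologies from the report fail cleanly through one label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal stand-ins for the PCI structures named in the thread. */
struct bus;
struct link_state;

struct dev {
	struct bus *bus;               /* bus this device sits on */
	struct link_state *link_state;
	bool is_downstream;            /* stands in for PCI_EXP_TYPE_DOWNSTREAM */
};

struct bus {
	struct bus *parent;            /* NULL for a root bus */
	struct dev *self;              /* bridge device leading to this bus */
};

struct link_state {
	struct dev *pdev;
	struct link_state *parent;
};

static bool is_root_bus(const struct bus *bus)
{
	return bus->parent == NULL;
}

static struct link_state *alloc_link_state(struct dev *pdev)
{
	struct link_state *link;

	assert(pdev && pdev->bus);

	link = calloc(1, sizeof(*link));
	if (!link)
		return NULL;
	link->pdev = pdev;

	if (pdev->is_downstream) {
		struct bus *up;

		/* Downstream port sitting directly on the root bus. */
		if (is_root_bus(pdev->bus))
			goto fail;

		/* Upstream port on the root bus: that bus has no ->self. */
		up = pdev->bus->parent;
		if (!up->self || !up->self->link_state)
			goto fail;

		link->parent = up->self->link_state;
	}

	pdev->link_state = link;
	return link;

fail:
	/* Single exit for every error path, so the free is not duplicated. */
	free(link);
	return NULL;
}
```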




[PATCH] arch: mn10300: Kconfig: remove 'KPROBES' options

2013-06-24 Thread Chen Gang
Currently, mn10300 does not implement 'KPROBES', so remove the option;
otherwise the build fails with allmodconfig.

The related error (with allmodconfig):
  In file included from arch/mn10300/kernel/kprobes.c:20:0:
  include/linux/kprobes.h: In function ‘get_kprobe_ctlblk’:
  include/linux/kprobes.h:332:11: error: dereferencing pointer to incomplete type

Signed-off-by: Chen Gang 
---
 arch/mn10300/Kconfig.debug |   10 ----------
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/arch/mn10300/Kconfig.debug b/arch/mn10300/Kconfig.debug
index bdbfd44..624de7d 100644
--- a/arch/mn10300/Kconfig.debug
+++ b/arch/mn10300/Kconfig.debug
@@ -24,16 +24,6 @@ config TEST_MISALIGNMENT_HANDLER
  accesses to make sure the misalignment handler deals them with
  correctly.  If it does not, the kernel will throw a BUG.
 
-config KPROBES
-   bool "Kprobes"
-   depends on DEBUG_KERNEL
-   help
- Kprobes allows you to trap at almost any kernel address and
- execute a callback function.  register_kprobe() establishes
- a probepoint and specifies the callback.  Kprobes is useful
- for kernel debugging, non-intrusive instrumentation and testing.
- If in doubt, say "N".
-
 config GDBSTUB
bool "Remote GDB kernel debugging"
depends on DEBUG_KERNEL && DEPRECATED
-- 
1.7.7.6


linux-next: manual merge of the net-next tree with the net tree

2013-06-24 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in
drivers/net/ethernet/renesas/sh_eth.c between commit ca8c35852138
("sh_eth: fix unhandled RFE interrupt") from the net tree and commit
8f80899665c4 ("sh_eth: remove 'tx_error_check' field of 'struct
sh_eth_cpu_data'") from the net-next tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/net/ethernet/renesas/sh_eth.c
index e29fe8d,7732f11..000
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@@ -380,10 -382,8 +382,9 @@@ static struct sh_eth_cpu_data r8a777x_d
.eesipr_value   = 0x01ff009f,
  
.tx_check   = EESR_FTC | EESR_CND | EESR_DLC | EESR_CD | EESR_RTO,
 -  .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RDE |
 -EESR_RFRMER | EESR_TFE | EESR_TDE | EESR_ECI,
 +  .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RFE |
 +EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE |
 +EESR_ECI,
-   .tx_error_check = EESR_TWB | EESR_TABT | EESR_TDE | EESR_TFE,
  
.apr= 1,
.mpr= 1,
@@@ -425,13 -414,11 +415,12 @@@ static struct sh_eth_cpu_data sh7724_da
  
.ecsr_value = ECSR_PSRTO | ECSR_LCHNG | ECSR_ICD,
.ecsipr_value   = ECSIPR_PSRTOIP | ECSIPR_LCHNGIP | ECSIPR_ICDIP,
-   .eesipr_value   = DMAC_M_RFRMER | DMAC_M_ECI | 0x01ff009f,
+   .eesipr_value   = 0x01ff009f,
  
.tx_check   = EESR_FTC | EESR_CND | EESR_DLC | EESR_CD | EESR_RTO,
 -  .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RDE |
 -EESR_RFRMER | EESR_TFE | EESR_TDE | EESR_ECI,
 +  .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RFE |
 +EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE |
 +EESR_ECI,
-   .tx_error_check = EESR_TWB | EESR_TABT | EESR_TDE | EESR_TFE,
  
.apr= 1,
.mpr= 1,
@@@ -480,11 -453,10 +455,11 @@@ static struct sh_eth_cpu_data sh7757_da
.rmcr_value = 0x0001,
  
.tx_check   = EESR_FTC | EESR_CND | EESR_DLC | EESR_CD | EESR_RTO,
 -  .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RDE |
 -EESR_RFRMER | EESR_TFE | EESR_TDE | EESR_ECI,
 +  .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RFE |
 +EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE |
 +EESR_ECI,
-   .tx_error_check = EESR_TWB | EESR_TABT | EESR_TDE | EESR_TFE,
  
+   .irq_flags  = IRQF_SHARED,
.apr= 1,
.mpr= 1,
.tpauser= 1,
@@@ -595,11 -521,9 +524,9 @@@ static struct sh_eth_cpu_data sh7757_da
.eesipr_value   = DMAC_M_RFRMER | DMAC_M_ECI | 0x003f,
  
.tx_check   = EESR_TC1 | EESR_FTC,
 -  .eesr_err_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT | \
 -EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE | \
 -EESR_ECI,
 +  .eesr_err_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT |
 +EESR_RFE | EESR_RDE | EESR_RFRMER | EESR_TFE |
 +EESR_TDE | EESR_ECI,
-   .tx_error_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_TDE | \
- EESR_TFE,
.fdr_value  = 0x072f,
.rmcr_value = 0x0001,
  
@@@ -677,11 -579,9 +582,9 @@@ static struct sh_eth_cpu_data sh7734_da
.eesipr_value   = DMAC_M_RFRMER | DMAC_M_ECI | 0x003f,
  
.tx_check   = EESR_TC1 | EESR_FTC,
 -  .eesr_err_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT | \
 -EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE | \
 -EESR_ECI,
 +  .eesr_err_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT |
 +EESR_RFE | EESR_RDE | EESR_RFRMER | EESR_TFE |
 +EESR_TDE | EESR_ECI,
-   .tx_error_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_TDE | \
- EESR_TFE,
  
.apr= 1,
.mpr= 1,
@@@ -814,11 -643,9 +646,9 @@@ static struct sh_eth_cpu_data r8a7740_d
.eesipr_value   = DMAC_M_RFRMER | DMAC_M_ECI | 0x003f,
  
.tx_check   = EESR_TC1 | EESR_FTC,
 -  .eesr_err_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT | \
 -EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE | \
 -EESR_ECI,
 +  .eesr_err_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT |
 +EESR_RFE | EESR_RDE | EESR_RFRMER | EESR_TFE |
 +EESR_TDE | EESR_ECI,
-   .tx_error_check = EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_TDE | \

Re: [PATCH] ARM: mach-clps711x: common: Use linux/sched_clock.h

2013-06-24 Thread Fabio Estevam
On Mon, Jun 24, 2013 at 9:37 PM, Stephen Boyd  wrote:

> Is this one in the arm-soc tree? I plan to make a sweep after 3.11-rc1

I haven't seen a fix for this on the linux-arm-kernel list.

> and fix up all the stragglers (looks like just this one) and remove the
> dummy asm header that just got merged into the tip tree.

There was another one and Thomas has already applied it:
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=2699339361a9bacb3fa663e6b8981a040cfca4ee


Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92

2013-06-24 Thread Dave Chinner
On Sun, Jun 23, 2013 at 03:51:29PM +0400, Glauber Costa wrote:
> On Fri, Jun 21, 2013 at 11:00:21AM +0200, Michal Hocko wrote:
> > On Thu 20-06-13 17:12:01, Michal Hocko wrote:
> > > I am bisecting it again. It is quite tedious, though, because good case
> > > is hard to be sure about.
> > 
> > > OK, so now I converged to 2d4fc052 (inode: convert inode lru list to generic lru
> > > list code.) in my tree and I have double checked it matches what is in
> > the linux-next. This doesn't help much to pin point the issue I am
> > afraid :/
> > 
> Can you revert this patch (easiest way ATM is to rewind your tree to a point
> right before it) and apply the following patch?
> 
> As Dave has mentioned, it is very likely that this bug was already there, we
> were just not ever checking imbalances. The attached patch would tell us at
> least if the imbalance was there before. If this is the case, I would suggest
> turning the BUG condition into a WARN_ON_ONCE since we would be officially
> not introducing any regression. It is no less of a bug, though, and we should
> keep looking for it.

We probably should do that BUG->WARN change anyway. BUG_ON is pretty
obnoxious in places where we can probably continue on without much
impact.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
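
A minimal userspace sketch of the BUG -> WARN idea discussed above, assuming a simple item counter: the imbalance is reported once and execution continues instead of stopping. struct lru_counter, warn_on_once() and lru_del() are hypothetical names for this example and do not mirror mm/list_lru.c; clamping the count back to zero is an extra assumption of the sketch, not something proposed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical miniature of an LRU item counter; not mm/list_lru.c. */
struct lru_counter {
	long nr_items;
	bool warned;   /* set once the imbalance has been reported */
	int warnings;  /* how many reports were actually printed */
};

/* Report a condition the first time only and return whether it held. */
static bool warn_on_once(struct lru_counter *lru, bool cond, const char *what)
{
	if (cond && !lru->warned) {
		lru->warned = true;
		lru->warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Remove one item; on imbalance, warn and clamp instead of aborting. */
static void lru_del(struct lru_counter *lru)
{
	lru->nr_items--;
	if (warn_on_once(lru, lru->nr_items < 0, "lru count went negative"))
		lru->nr_items = 0;

	assert(lru->nr_items >= 0);
}
```

With this shape a repeated imbalance produces one log line rather than one per occurrence, which is the usual reason to prefer a warn-once check in a hot path.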


Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92

2013-06-24 Thread Dave Chinner
On Tue, Jun 18, 2013 at 03:50:25PM +0200, Michal Hocko wrote:
> And again, another hang. It looks like the inode deletion never
> finishes. The good thing is that I do not see any LRU related BUG_ONs
> anymore. I am going to test with the other patch in the thread.
> 
> 2476 [] __wait_on_freeing_inode+0x9e/0xc0   <<< waiting for an inode to go away
> [] find_inode_fast+0xa1/0xc0
> [] iget_locked+0x4f/0x180
> [] ext4_iget+0x33/0x9f0
> [] ext4_lookup+0xbc/0x160
> [] lookup_real+0x20/0x60
> [] lookup_open+0x175/0x1d0
> [] do_last+0x2de/0x780  <<< holds i_mutex
> [] path_openat+0xda/0x400
> [] do_filp_open+0x43/0xa0
> [] do_sys_open+0x160/0x1e0
> [] sys_open+0x1c/0x20
> [] system_call_fastpath+0x16/0x1b
> [] 0x

I don't think this has anything to do with LRUs.

__wait_on_freeing_inode() only blocks once the inode is being freed
(i.e. I_FREEING is set), and that happens when a lookup is done when
the inode is still in the inode hash.

I_FREEING is set on the inode at the same time it is removed from
the LRU, and from that point onwards the LRUs play no part in the
inode being freed and anyone waiting on the inode being freed
getting woken.

The only way I can see this happening is if there is a dispose list
that is not getting processed properly, e.g. we move a bunch of
inodes to the dispose list setting I_FREEING, then for some reason
it gets dropped on the ground and so the wakeup call doesn't happen
when the inode has been removed from the hash.

I can't see anywhere in the code that this happens, though, but it
might be some pre-existing race in the inode hash that you are now
triggering because freeing will be happening in parallel on multiple
nodes rather than serialising on a global lock...

I won't have seen this on XFS stress testing, because it doesn't use
the VFS inode hashes for inode lookups. Given that XFS is not
triggering either problem you are seeing, that makes me think
that it might be a pre-existing inode hash lookup/reclaim race
condition, not a LRU problem.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 12:08 +1000, Michael Ellerman wrote:
> We're not checking for allocation failure, which we should be.
> 
> But this code is only used on powermac and 85xx, so it should probably
> just be a TODO to fix this up to handle the failure.

And what can we do if they fail ?

Cheers,
Ben.




[PATCH -next] ASoC: mid-x86: Convert to use devm_* APIs

2013-06-24 Thread Wei Yongjun
From: Wei Yongjun 

devm_* APIs are device managed and make code simpler.

Signed-off-by: Wei Yongjun 
---
 sound/soc/mid-x86/mfld_machine.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/sound/soc/mid-x86/mfld_machine.c b/sound/soc/mid-x86/mfld_machine.c
index 78d5825..e16bcd1 100644
--- a/sound/soc/mid-x86/mfld_machine.c
+++ b/sound/soc/mid-x86/mfld_machine.c
@@ -371,7 +371,7 @@ static int snd_mfld_mc_probe(struct platform_device *pdev)
 
/* audio interrupt base of SRAM location where
 * interrupts are stored by System FW */
-   mc_drv_ctx = kzalloc(sizeof(*mc_drv_ctx), GFP_ATOMIC);
+   mc_drv_ctx = devm_kzalloc(&pdev->dev, sizeof(*mc_drv_ctx), GFP_ATOMIC);
if (!mc_drv_ctx) {
pr_err("allocation failed\n");
return -ENOMEM;
@@ -381,40 +381,33 @@ static int snd_mfld_mc_probe(struct platform_device *pdev)
pdev, IORESOURCE_MEM, "IRQ_BASE");
if (!irq_mem) {
pr_err("no mem resource given\n");
-   ret_val = -ENODEV;
-   goto unalloc;
+   return -ENODEV;
}
-   mc_drv_ctx->int_base = ioremap_nocache(irq_mem->start,
-   resource_size(irq_mem));
+   mc_drv_ctx->int_base = devm_ioremap_nocache(&pdev->dev, irq_mem->start,
+   resource_size(irq_mem));
if (!mc_drv_ctx->int_base) {
pr_err("Mapping of cache failed\n");
-   ret_val = -ENOMEM;
-   goto unalloc;
+   return -ENOMEM;
}
/* register for interrupt */
-   ret_val = request_threaded_irq(irq, snd_mfld_jack_intr_handler,
+   ret_val = devm_request_threaded_irq(&pdev->dev, irq,
+   snd_mfld_jack_intr_handler,
snd_mfld_jack_detection,
IRQF_SHARED, pdev->dev.driver->name, mc_drv_ctx);
if (ret_val) {
pr_err("cannot register IRQ\n");
-   goto unalloc;
+   return ret_val;
}
/* register the soc card */
snd_soc_card_mfld.dev = &pdev->dev;
ret_val = snd_soc_register_card(&snd_soc_card_mfld);
if (ret_val) {
pr_debug("snd_soc_register_card failed %d\n", ret_val);
-   goto freeirq;
+   return ret_val;
}
platform_set_drvdata(pdev, mc_drv_ctx);
pr_debug("successfully exited probe\n");
-   return ret_val;
-
-freeirq:
-   free_irq(irq, mc_drv_ctx);
-unalloc:
-   kfree(mc_drv_ctx);
-   return ret_val;
+   return 0;
 }
 
 static int snd_mfld_mc_remove(struct platform_device *pdev)
@@ -422,9 +415,7 @@ static int snd_mfld_mc_remove(struct platform_device *pdev)
struct mfld_mc_private *mc_drv_ctx = platform_get_drvdata(pdev);
 
pr_debug("snd_mfld_mc_remove called\n");
-   free_irq(platform_get_irq(pdev, 0), mc_drv_ctx);
snd_soc_unregister_card(&snd_soc_card_mfld);
-   kfree(mc_drv_ctx);
return 0;
 }
 



Re: [PATCH 23/32] x86: delete __cpuinit usage from all x86 files

2013-06-24 Thread Paul Gortmaker
[Re: [PATCH 23/32] x86: delete __cpuinit usage from all x86 files] On 24/06/2013 (Mon 16:12) H. Peter Anvin wrote:

> On 06/24/2013 12:30 PM, Paul Gortmaker wrote:
> > The __cpuinit type of throwaway sections might have made sense
> > some time ago when RAM was more constrained, but now the savings
> > do not offset the cost and complications.  For example, the fix in
> > commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
> > is a good example of the nasty type of bugs that can be created
> > with improper use of the various __init prefixes.
> > 
> > After a discussion on LKML[1] it was decided that cpuinit should go
> > the way of devinit and be phased out.  Once all the users are gone,
> > we can then finally remove the macros themselves from linux/init.h.
> > 
> > Note that some harmless section mismatch warnings may result, since
> > notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
> > are flagged as __cpuinit  -- so if we remove the __cpuinit from
> > arch specific callers, we will also get section mismatch warnings.
> > As an intermediate step, we intend to turn the linux/init.h cpuinit
> > content into no-ops as early as possible, since that will get rid
> > of these warnings.  In any case, they are temporary and harmless.
> > 
> > This removes all the arch/x86 uses of the __cpuinit macros from
> > all C files.  x86 only had the one __CPUINIT used in assembly files,
> > and it wasn't paired off with a .previous or a __FINIT, so we can
> > delete it directly w/o any corresponding additional change there.
> > 
> > [1] https://lkml.org/lkml/2013/5/20/589
> > 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: x...@kernel.org
> > Signed-off-by: Paul Gortmaker 
> > ---
> > 
> 
> Acked-by: H. Peter Anvin 
> 
> Do you want me to carry this or are you planning to push the entire
> thing as a single patchset?

Short answer -- I'll carry it unless you expect massive changes
still pending to arch/x86 (which I highly doubt) and really want
to carry it.

I'm fine with carrying most/all of it as a patch queue, however
some folks expected significant churn in their tree and wanted to
handle the conflicts/refreshes themselves.  But I'm fine with keeping
things up to date with the latest linux-next and doing the trivial
refreshes on the series until the merge window closes out and the
remaining 99% of it goes in tree.

Paul.
--

> 
>   -hpa
> 
> 


Re: [PATCH V2] regulator: 88pm800: add regulator driver for 88pm800

2013-06-24 Thread Chao Xie
On Mon, Jun 24, 2013 at 6:14 PM, Mark Brown  wrote:
> On Mon, Jun 24, 2013 at 10:01:39AM +0800, Chao Xie wrote:
>> On Fri, Jun 21, 2013 at 11:24 PM, Mark Brown  wrote:
>
>> > Just provide get_voltage_sel(), the core will do the mapping to voltages
>> > using list_voltage().
>
>> I am a little confused.
>> The BUCK voltage is not linear, and it contains a lot of voltages.
>> If we use map_voltage_ascend, it will looking a a suitable voltage
>> from begin to end one by one.
>
> What does this have to do with get_voltage()?  And note that you can
> write your own mapping function for a reason...
>
>> If we directly make use of set_voltage and get_voltage, we can
>> directly calculates the voltage which is
>> suitable, and do not go through all the voltags.
>> for example, now BUCK voltage table is
>> range 1 from 60 to 1587500, each step is 12500
>> range 2 from 160 to 180, each step is 5
>
> No, this is nothing at all to do with using the selector versions of the
> API.  Think about what the API is doing and take a look at the code.
>
OK. I will try to define my own mapping function and use the sel APIs.

>> >> + } else if (pdata->num_regulators) {
>> >> + /* Check whether num_regulator is valid. */
>> >> + unsigned int count = 0;
>> >> + for (i = 0; pdata->regulators[i]; i++)
>> >> + count++;
>> >> + if (count != pdata->num_regulators)
>> >> + return -EINVAL;
>
>> > This looks...  odd.
>
>> It is just make sure that pdata has correct number of regulators.
>> It you think that it is redundant, i can remove it.
>
> If you really need to have platform data for all the regulators then
> just embed the array inside the platform data so there's no possibility
> of any confusion.
>
Some regulators will not be exported to the kernel, depending on the board,
because they are reserved for special uses. So I do not use an array inside
the platform data.

>> > With deferred probing you should just be able to use
>> > module_platform_driver().
>
>> The regulator controlles some BUCK regulators.
>> These regulators may be used by application CPU or CP(communication
>> CPU) for telephony.
>> The CP may need different voltages if it goes deep initialization. if
>> we defer the setting later, it is too late for
>> CP initialization, and will impact the performance.
>
> If your kernel startup is taking long enough for this to be an issue it
> seems like there's much bigger problems here and things are going to be
> very fragile anyway, it's going to be better to figure out what the root
> issue is.
I see. I will use module_platform_driver().


Re: linux-next: build failure after merge of the final tree (staging tree related)

2013-06-24 Thread Greg KH
On Tue, Jun 25, 2013 at 08:22:49AM +0800, Peng Tao wrote:
> On 06/25/2013 07:50 AM, Greg KH wrote:
> >On Tue, Jun 25, 2013 at 09:40:51AM +1000, Stephen Rothwell wrote:
> >>Hi Greg,
> >>
> >>On Mon, 24 Jun 2013 15:40:35 -0700 Greg KH  wrote:
> >>>We are running out of time, my tree is pretty much closed for 3.11 now,
> >>>should I just disable the build of this module for 3.11?
> >>That's what I've been doing - it has never been enabled in a final
> >>linux-next release.  So, it should probably just be disabled properly.
> >I agree, now disabled.
> >
> >greg k-h
> Greg,
> 
> Sorry for the delay. I've been caught up by other build failures
> that were found during my tests. I'm sending you the latest patches
> that should fix all the build errors. I've verified that we can now
> pass build on sparc/mips/s390/powerpc/x86, both 32bits and 64bits.
> Please help to queue them.
> 
> I see that you have already disabled Lustre build. Can we re-enable
> it or do we have to wait until 3.10 is released?

My trees are now closed for 3.11, given that 3.10 will be released in
less than a week.  I'll keep patches in my to-apply mailbox, which I
will then apply after 3.11-rc1 is out, so I'll not be accepting anything
for the next 3 weeks or so.

thanks,

greg k-h


Re: [PATCH V10 1/4] pci: Add PCIe driver for Samsung Exynos

2013-06-24 Thread Bjorn Helgaas
On Fri, Jun 21, 2013 at 04:24:54PM +0900, Jingoo Han wrote:
> Exynos5440 has a PCIe controller which can be used as Root Complex.
> This driver supports a PCIe controller as Root Complex mode.
> 
> Signed-off-by: Surendranath Gurivireddy Balla 
> Signed-off-by: Siva Reddy Kallam 
> Signed-off-by: Jingoo Han 
> Acked-by: Arnd Bergmann 

Acked-by: Bjorn Helgaas 

Please merge this through arm-soc as you discussed.

> ---
>  .../devicetree/bindings/pci/designware-pcie.txt|   73 ++
>  drivers/pci/host/Kconfig   |9 +
>  drivers/pci/host/Makefile  |1 +
>  drivers/pci/host/pcie-designware.c | 1057 
> 
>  4 files changed, 1140 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/pci/designware-pcie.txt
>  create mode 100644 drivers/pci/host/pcie-designware.c
> 
> diff --git a/Documentation/devicetree/bindings/pci/designware-pcie.txt 
> b/Documentation/devicetree/bindings/pci/designware-pcie.txt
> new file mode 100644
> index 000..e2371f5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/pci/designware-pcie.txt
> @@ -0,0 +1,73 @@
> +* Synopsis Designware PCIe interface
> +
> +Required properties:
> +- compatible: should contain "snps,dw-pcie" to identify the
> + core, plus an identifier for the specific instance, such
> + as "samsung,exynos5440-pcie".
> +- reg: base addresses and lengths of the pcie controller,
> + the phy controller, additional register for the phy controller.
> +- interrupts: interrupt values for level interrupt,
> + pulse interrupt, special interrupt.
> +- clocks: from common clock binding: handle to pci clock.
> +- clock-names: from common clock binding: should be "pcie" and "pcie_bus".
> +- #address-cells: set to <3>
> +- #size-cells: set to <2>
> +- device_type: set to "pci"
> +- ranges: ranges for the PCI memory and I/O regions
> +- #interrupt-cells: set to <1>
> +- interrupt-map-mask and interrupt-map: standard PCI properties
> + to define the mapping of the PCIe interface to interrupt
> + numbers.
> +- reset-gpio: gpio pin number of power good signal
> +
> +Example:
> +
> +SoC specific DT Entry:
> +
> + pcie@29 {
> + compatible = "samsung,exynos5440-pcie", "snps,dw-pcie";
> + reg = <0x29 0x1000
> + 0x27 0x1000
> + 0x271000 0x40>;
> + interrupts = <0 20 0>, <0 21 0>, <0 22 0>;
> + clocks = < 28>, < 27>;
> + clock-names = "pcie", "pcie_bus";
> + #address-cells = <3>;
> + #size-cells = <2>;
> + device_type = "pci";
> + ranges = <0x0800 0 0x4000 0x4000 0 0x1000   /* 
> configuration space */
> +   0x8100 0 0  0x40001000 0 0x0001   /* 
> downstream I/O */
> +   0x8200 0 0x40011000 0x40011000 0 0x1ffef000>; /* 
> non-prefetchable memory */
> + #interrupt-cells = <1>;
> + interrupt-map-mask = <0 0 0 0>;
> + interrupt-map = <0x0 0  53>;
> + };
> +
> + pcie@2a {
> + compatible = "samsung,exynos5440-pcie", "snps,dw-pcie";
> + reg = <0x2a 0x1000
> + 0x272000 0x1000
> + 0x271040 0x40>;
> + interrupts = <0 23 0>, <0 24 0>, <0 25 0>;
> + clocks = < 29>, < 27>;
> + clock-names = "pcie", "pcie_bus";
> + #address-cells = <3>;
> + #size-cells = <2>;
> + device_type = "pci";
> + ranges = <0x0800 0 0x6000 0x6000 0 0x1000   /* 
> configuration space */
> +   0x8100 0 0  0x60001000 0 0x0001   /* 
> downstream I/O */
> +   0x8200 0 0x60011000 0x60011000 0 0x1ffef000>; /* 
> non-prefetchable memory */
> + #interrupt-cells = <1>;
> + interrupt-map-mask = <0 0 0 0>;
> + interrupt-map = <0x0 0  56>;
> + };
> +
> +Board specific DT Entry:
> +
> + pcie@29 {
> + reset-gpio = <_ctrl 5 0>;
> + };
> +
> + pcie@2a {
> + reset-gpio = <_ctrl 22 0>;
> + };
> diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
> index 1f1d67f..1184ff6 100644
> --- a/drivers/pci/host/Kconfig
> +++ b/drivers/pci/host/Kconfig
> @@ -5,4 +5,13 @@ config PCI_MVEBU
>   bool "Marvell EBU PCIe controller"
>   depends on ARCH_MVEBU || ARCH_KIRKWOOD
>  
> +config PCIE_DW
> + bool
> +
> +config PCI_EXYNOS
> + bool "Samsung Exynos PCIe controller"
> + depends on SOC_EXYNOS5440
> + select PCIEPORTBUS
> + select PCIE_DW
> +
>  endmenu
> diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
> index 5ea2d8b..086d850 100644
> --- a/drivers/pci/host/Makefile
> +++ b/drivers/pci/host/Makefile
> @@ -1 +1,2 @@
>  obj-$(CONFIG_PCI_MVEBU) += pci-mvebu.o
> 

Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Michael Ellerman
On Sun, Jun 23, 2013 at 07:17:00PM +0530, Srivatsa S. Bhat wrote:
> The function migrate_irqs() is called with interrupts disabled
> and hence it's not safe to do GFP_KERNEL allocations inside it,
> because they can sleep. So change the gfp mask to GFP_ATOMIC.

OK so it gets there via:
  __stop_machine()
    take_cpu_down()
      __cpu_disable()
        smp_ops->cpu_disable()
          generic_cpu_disable()
            migrate_irqs()

> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index ea185e0..ca39bac 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -412,7 +412,7 @@ void migrate_irqs(void)
>   cpumask_var_t mask;
>   const struct cpumask *map = cpu_online_mask;
>  
> - alloc_cpumask_var(&mask, GFP_KERNEL);
> + alloc_cpumask_var(&mask, GFP_ATOMIC);

We're not checking for allocation failure, which we should be.

But this code is only used on powermac and 85xx, so it should probably
just be a TODO to fix this up to handle the failure.

cheers


[PATCH] arch: c6x: platforms: include "asm/special_insns.h" to pass compiling

2013-06-24 Thread Chen Gang
Include "asm/special_insns.h" to pass compiling.

The related error (with allmodconfig):
  arch/c6x/platforms/plldata.c: In function ‘c6472_setup_clocks’:
  arch/c6x/platforms/plldata.c:279:2: error: implicit declaration of function 
‘get_coreid’ [-Werror=implicit-function-declaration]

Signed-off-by: Chen Gang 
---
 arch/c6x/platforms/plldata.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/c6x/platforms/plldata.c b/arch/c6x/platforms/plldata.c
index 755359e..ddd1441 100644
--- a/arch/c6x/platforms/plldata.c
+++ b/arch/c6x/platforms/plldata.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include <asm/special_insns.h>
 
 /*
  * Common SoC clock support.
-- 
1.7.7.6


[PATCH] arch: c6x: mm: include "asm/uaccess.h" to pass compiling

2013-06-24 Thread Chen Gang
Include "asm/uaccess.h" so that the build passes.

The related error (with allmodconfig):
  arch/c6x/mm/init.c: In function ‘paging_init’:
  arch/c6x/mm/init.c:46:2: error: implicit declaration of function ‘set_fs’ 
[-Werror=implicit-function-declaration]
  arch/c6x/mm/init.c:46:9: error: ‘KERNEL_DS’ undeclared (first use in this 
function)
  arch/c6x/mm/init.c:46:9: note: each undeclared identifier is reported only 
once for each function it appears in

Signed-off-by: Chen Gang 
---
 arch/c6x/mm/init.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/c6x/mm/init.c b/arch/c6x/mm/init.c
index e524fde..63f5560 100644
--- a/arch/c6x/mm/init.c
+++ b/arch/c6x/mm/init.c
@@ -18,6 +18,7 @@
 #include 
 
 #include 
+#include <asm/uaccess.h>
 
 /*
  * ZERO_PAGE is a special page that is used for zero-initialized
-- 
1.7.7.6


Re: v3.10-rc7 oops soon after boot

2013-06-24 Thread Gao feng
On 06/25/2013 06:17 AM, George Spelvin wrote:
>>> Reported-by: Borislav Petkov 
> 
>> This should be:
>>
>> Reported-by: George Spelvin 
>>
>> I only connected the dots...
> 
> Well, you did a whole lot more than me!  I just lobbed a "d'oh, it
> crashes" into the seething ocean of lkml.  (Admittedly, I had reason
> to act fast: we're very close to release.)
> 
> You figured out what subsystem was at fault and got the right people
> involved.  Definitely a valuable contribution.
> 
> Me, personally, I don't give a flying f*** about such credit; I had
> an itch and was trolling for someone to scratch it.
> 
> So feel free to take Reported-by (you are the one who reported it *to
> someone who could fix it*), Triaged-by, or whatever.

It's my mistake; the Reported-by should be you.
Of course we should also thank Borislav; I didn't notice this bug report
mail until he forwarded it.

Thanks, all of you! Sorry for my mistake.



[PATCH] arch: c6x: kernel: include "linux/console.h" when 'VT' and 'DUMMY_CONSOLE' enabled.

2013-06-24 Thread Chen Gang
"linux/console.h" needs to be included when 'VT' and 'DUMMY_CONSOLE' are
enabled (e.g. with allmodconfig).

The related error:
  arch/c6x/kernel/setup.c: In function ‘setup_arch’:
  arch/c6x/kernel/setup.c:442:2: error: ‘conswitchp’ undeclared (first use in 
this function)
  arch/c6x/kernel/setup.c:442:2: note: each undeclared identifier is reported 
only once for each function it appears in
  arch/c6x/kernel/setup.c:442:16: error: ‘dummy_con’ undeclared (first use in 
this function)

Signed-off-by: Chen Gang 
---
 arch/c6x/kernel/setup.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/c6x/kernel/setup.c b/arch/c6x/kernel/setup.c
index f4e72bd..1411141 100644
--- a/arch/c6x/kernel/setup.c
+++ b/arch/c6x/kernel/setup.c
@@ -26,6 +26,9 @@
 #include 
 #include 
 #include 
+#if defined(CONFIG_VT) && defined(CONFIG_DUMMY_CONSOLE)
+#include <linux/console.h>
+#endif
 
 
 #include 
-- 
1.7.7.6


Re: [-stable 3.8.1 performance regression] madvise POSIX_FADV_DONTNEED

2013-06-24 Thread Dave Chinner
On Thu, Jun 20, 2013 at 08:20:16AM -0400, Mathieu Desnoyers wrote:
> * Rob van der Heij (rvdh...@gmail.com) wrote:
> > Wouldn't you batch the calls to drop the pages from cache rather than drop
> > one packet at a time?
> 
> By default for kernel tracing, lttng's trace packets are 1MB, so I
> consider the call to fadvise to be already batched by applying it to 1MB
> packets rather than individual pages. Even there, it seems that the
> extra overhead added by the lru drain on each CPU is noticeable.
> 
> Another reason for not batching this in larger chunks is to limit the
> impact of the tracer on the kernel page cache. LTTng limits itself to
> its own set of buffers, and uses the page cache for what is absolutely
> needed to perform I/O, but no more.

I think you are doing it wrong. This is a poster child case for
using Direct IO and completely avoiding the page cache altogether

> > Your effort to help Linux mm seems a bit overkill,
> 
> Without performing this, I have a situation similar to yours, where
> LTTng fills up the page cache very quickly, until it gets to a point
> where the memory pressure level increases enough that the consumerd is
> blocked until some pages are reclaimed. I really don't care about making
> the consumerd "as fast as possible for a while" if it means its
> throughput will drop when the page cache is filled. I prefer a constant
> slower pace to a short burst followed by slower throughput.
> 
> > and you don't want every application to do it like that himself.
> 
> Indeed, tracing has always been slightly odd in the sense that it's not
> the workload the system is meant to run, but rather a tool that should
> have the smallest impact on the usual system's run when it is used.
> 
> > The
> > fadvise will not even work when the page is still to be flushed out.
> > Without the patch that started the thread, it would 'at random' not work
> > due to SMP race condition (not multi-threading).
> 
> This is why the lttng consumerd calls:
> 
> sync_file_range with flags:
> SYNC_FILE_RANGE_WAIT_BEFORE
> SYNC_FILE_RANGE_WRITE
> SYNC_FILE_RANGE_WAIT_AFTER
> 
> on the page range. The purpose of this call is to flush the pages to
> disk before calling fadvise(POSIX_FADV_DONTNEED) on the page range.

Yup, you're emulating direct IO semantics with buffered IO.

This seems to be an emerging trend I'm seeing a lot of over the past
few months - I'm hearing about it because of all the weird corner
case behaviours it causes because sync_file_range() doesn't provide
data integrity guarantees and fadvise(DONTNEED) can randomly issue
lots of IO, block for long periods of time, silently do nothing,
remove pages from the page cache and/or some or all of the above.

Direct IO is a model of sanity compared to that mess
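
For readers unfamiliar with the pattern being criticized, the following is a
minimal sketch of the flush-then-drop sequence described in the quoted text.
It is not LTTng's code: flush_and_drop() is a hypothetical helper name, and the
sketch only strings together the two calls named above. As noted in the reply,
sync_file_range() gives no data integrity guarantee and
fadvise(DONTNEED) may silently do nothing, so a zero return value here does
not prove that the pages were written durably or dropped.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Write back one byte range of fd, then ask the kernel to drop that range
 * from the page cache. Returns 0 on success or a negative errno value.
 */
static int flush_and_drop(int fd, off_t offset, off_t len)
{
	int rc;

	/* Start writeback for the range and wait until it has completed. */
	if (sync_file_range(fd, offset, len,
			    SYNC_FILE_RANGE_WAIT_BEFORE |
			    SYNC_FILE_RANGE_WRITE |
			    SYNC_FILE_RANGE_WAIT_AFTER) < 0)
		return -errno;

	/* posix_fadvise() returns the error number itself, not via errno. */
	rc = posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
	return rc ? -rc : 0;
}
```

A caller would invoke this once per finished packet, for example
flush_and_drop(fd, packet_offset, packet_len). The Direct IO alternative
suggested above avoids both calls by opening the file with O_DIRECT, at the
cost of alignment requirements on buffers, offsets and lengths.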

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[Suggestion] arch: s390: mm: the warnings with allmodconfig and "EXTRA_CFLAGS=-W"

2013-06-24 Thread Chen Gang
Hello Maintainers:

When building allmodconfig for "IBM zSeries model z800 and z900", the
following warnings are reported (with "EXTRA_CFLAGS=-W"):
  mm/slub.c:1875:1: warning: ‘deactivate_slab’ uses dynamic stack allocation 
[enabled by default]
  mm/slub.c:1941:1: warning: ‘unfreeze_partials.isra.32’ uses dynamic stack 
allocation [enabled by default]
  mm/slub.c:2575:1: warning: ‘__slab_free’ uses dynamic stack allocation 
[enabled by default]
  mm/slub.c:1582:1: warning: ‘get_partial_node.isra.34’ uses dynamic stack 
allocation [enabled by default]
  mm/slub.c:2311:1: warning: ‘__slab_alloc.constprop.42’ uses dynamic stack 
allocation [enabled by default]

Is it OK?


Thanks.
--
Chen Gang

Asianux Corporation 


Re: [PATCH] gpio MIPS/OCTEON: Add a driver for OCTEON's on-chip GPIO pins.

2013-06-24 Thread David Daney

Thanks for looking at this again.

I will be away from my office until the middle of July, so I will not be 
able to generate and test a revised patch until then.


David Daney



On 06/24/2013 03:06 PM, Linus Walleij wrote:

On Thu, Jun 20, 2013 at 8:10 PM, David Daney  wrote:

On 06/17/2013 01:51 AM, Linus Walleij wrote:



+#include 
+#include 

I cannot find this in my tree.


Weird, I see them here:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/mips/include/asm/octeon/cvmx-gpio-defs.h
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/mips/include/asm/octeon/octeon.h

Do you not have these?


Yeah no problem, I must have misgrepped.
Sorry for the fuss...


depend on OF as well right? Or does the CAVIUM_OCTEON_SOC already
imply that?


We already have 'select USE_OF', so I think adding OF here would be
redundant.


OK.


+/*
+ * The address offset of the GPIO configuration register for a given
+ * line.
+ */
+static unsigned int bit_cfg_reg(unsigned int gpio)


Maybe the passed variable shall be named "offset" here, as it is named
offset on all call sites, and it is surely local to this instance?


Well it is the gpio line, so perhaps it should universally be changed to
"line" or "pin"


We use "offset" to signify line enumerators in drivers/gpio/*,
well, at least if they are local to a piece of hardware.
(Check the GPIO siblings.)


+{
+   if (gpio < 16)
+   return 8 * gpio;
+   else
+   return 8 * (gpio - 16) + 0x100;



Put this 0x100 in the #defines above with the name something like
STRIDE.


But it is not a 'STRIDE', it is a discontinuity compensation and used in
exactly one place.


OK what about a comment or something, because it isn't
exactly intuitive right?
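
One way to address the request for a comment is sketched below. The arithmetic
is the same as in the quoted function; the macro names are hypothetical, and
the description of the register layout is inferred only from that arithmetic
(an 8-byte spacing and a second block starting at 0x100), not from hardware
documentation.

```c
/* Spacing between the configuration registers of adjacent lines. */
#define BIT_CFG_SPACING        8
/* Number of lines whose registers sit in the first, contiguous block. */
#define BIT_CFG_FIRST_BLOCK    16
/*
 * The registers for lines 16 and up do not follow line 15 directly.
 * They restart at this offset, so the address map has a discontinuity.
 */
#define BIT_CFG_SECOND_BASE    0x100

/*
 * The address offset of the GPIO configuration register for a given
 * line, named "offset" to match the drivers/gpio convention.
 */
static unsigned int bit_cfg_reg(unsigned int offset)
{
	if (offset < BIT_CFG_FIRST_BLOCK)
		return BIT_CFG_SPACING * offset;

	return BIT_CFG_SPACING * (offset - BIT_CFG_FIRST_BLOCK) +
	       BIT_CFG_SECOND_BASE;
}
```

Line 15 maps to 0x78 and line 16 to 0x100, which makes the jump visible
without calling the constant a stride.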


+struct octeon_gpio {
+   struct gpio_chip chip;
+   u64 register_base;
+};


OMG everything is 64 bit. Well has to come to this I guess.


Not everything.  This is custom logic in an SoC with 64-bit wide internal
address buses, what would you suggest?


Yep that's what I meant, no big deal. Just first time
I really see it in driver bases.


I'm not a fan of packed bitfields like this, I prefer if you just
OR | and AND & the bits together in the driver.


I see you disregarded this comment, and looking at the header
files it seems the MIPS arch is a big fan of packed bitfields, so
I will live with it for this arch...


+static int octeon_gpio_get(struct gpio_chip *chip, unsigned offset)
+{
+   struct octeon_gpio *gpio = container_of(chip, struct octeon_gpio,
chip);
+   u64 read_bits = cvmx_read_csr(gpio->register_base + RX_DAT);
+
+   return ((1ull << offset) & read_bits) != 0;


A common idiom we use for this is:

return !!(read_bits & (1ull << offset));


I hate that idiom, but if its use is a condition of accepting the patch, I
will change it.


Nah. If a good rational reason like "hate" is given for not using a coding
idiom I will accept it as it stands ;-)
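
For reference, the two spellings under discussion are equivalent. The short
sketch below is standalone C rather than driver code, and the function names
are made up for the comparison. Both forms normalize the masked 64-bit value
to 0 or 1 before it is returned as an int, which matters because returning
the masked value itself would lose any bit above the width of int.

```c
#include <stdint.h>

/* Form used in the patch: compare the masked value against zero. */
static int bit_is_set_compare(uint64_t bits, unsigned int offset)
{
	return ((1ull << offset) & bits) != 0;
}

/* Form suggested in review: double logical negation. */
static int bit_is_set_double_not(uint64_t bits, unsigned int offset)
{
	return !!(bits & (1ull << offset));
}
```

Either version returns 1 for bit 40 of a value that has only bit 40 set, and 0
for every other bit position, so the choice is purely one of style.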


+   dev_info(>dev, "OCTEON GPIO\n");



This is like shouting "REAL MADRID!" in the bootlog, be a bit more
precise: "octeon GPIO driver probed\n" or something so we know what
is happening.


No, more akin to 'Real Madrid', as 'OCTEON' is the correct spelling of its
given name.

I will happily add "driver probed", and grudgingly switch to lower case if
it is a necessary condition of patch acceptance.


I don't know, do the rest of the MIPS drivers emit similar messages
such that the bootlog will say

OCTEON clocks
OCTEON irqchip
OCTEON I2C
OCTEON GPIO

then I guess it's convention and it can stay like this.

Yours,
Linus Walleij






[PATCH] arch: s390: kernel: reset 'c->hotpluggable' when failure occurs

2013-06-24 Thread Chen Gang
When smp_add_present_cpu() fails, it resets everything except
'c->hotpluggable', so reset that too, restoring the original state completely.

Signed-off-by: Chen Gang 
---
 arch/s390/kernel/smp.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 15a016c..c4c6f42 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -1016,6 +1016,7 @@ out_cpu:
unregister_cpu(c);
 #endif
 out:
+   c->hotpluggable = 0;
return rc;
 }
 
-- 
1.7.7.6


Re: [tracing/rcu] WARNING: at kernel/lockdep.c:3537 check_flags()

2013-06-24 Thread Steven Rostedt
On Sun, 2013-06-23 at 12:19 +0800, Fengguang Wu wrote:
> Greetings,
> 
> I find the below dmesg in upstream and linux-next.
> 
> [2.456884] Testing tracer branch: 
> [2.458281] [ cut here ]
> [2.459813] WARNING: at /c/kernel-tests/src/tip/kernel/lockdep.c:3537 
> check_flags+0xb7/0x1b0()

Hmm, I bet lockdep and the branch tracer probably don't play well
together. They both are bullies, and want to beat up the same kid. The
problem is, they want sole access to beat up that kid, and don't want
help.


> [2.46] Hardware name: Bochs
> [2.46] Pid: 3, comm: ksoftirqd/0 Not tainted 3.9.0-rc4-03252-g8b473e1 
> #58
> [2.46] Call Trace:
> 
> [2.46]  [] warn_slowpath_common+0xaf/0xd0
> [2.46]  [] warn_slowpath_null+0x1a/0x20
> [2.46]  [] check_flags+0xb7/0x1b0
> [2.46]  [] lock_is_held+0x62/0xc0
> [2.46]  [] __might_sleep+0x3c/0x3b0
> [2.46]  [] run_ksoftirqd+0xd4/0x130
> [2.46]  [] smpboot_thread_fn+0x25c/0x2e0
> [2.46]  [] ? lg_global_unlock+0x40/0x40
> [2.46]  [] kthread+0xfb/0x110
> [2.46]  [] ? insert_kthread_work+0x120/0x120
> [2.46]  [] ret_from_fork+0x7a/0xb0
> [2.46]  [] ? insert_kthread_work+0x120/0x120
> [2.46] ---[ end trace 3af7e87d98c6254d ]---
> 
> Bisecting for "__might_sleep" and the first bad commit is
> 
> commit 965a002b4f1a458c5dcb334ec29f48a0046faa25
> Author: Paul E. McKenney 
> Date:   Sat Jun 18 09:55:39 2011 -0700
> 
> rcu: Make TINY_RCU also use softirq for RCU_BOOST=n
> 
> This patch #ifdefs TINY_RCU kthreads out of the kernel unless RCU_BOOST=y,
> thus eliminating context-switch overhead if RCU priority boosting has
> not been configured.
> 
> Signed-off-by: Paul E. McKenney 
> Signed-off-by: Paul E. McKenney 
> 
> But note that its parent commit 385680a9487d2f85382ad6d74e2a15837e47bfd9
> is not really clean and has this dmesg instead:
> 
> [2.592748] Testing tracer wakeup_rt: PASSED
> [2.936495] Testing tracer branch: 
> [2.940281] [ cut here ]
> [2.941194] WARNING: at /c/wfg/mm/kernel/lockdep.c:3363 
> check_flags.part.31+0xaf/0x1c0()
> [2.942593] Hardware name: Bochs
> [2.943199] Pid: 0, comm: swapper Not tainted 3.1.0-rc8-00019-g385680a #99
> [2.944234] Call Trace:
> [2.944234][] warn_slowpath_common+0x9e/0xd0
> [2.944234]  [] warn_slowpath_null+0x1a/0x20
> [2.944234]  [] check_flags.part.31+0xaf/0x1c0
> [2.944234]  [] lock_acquire+0x119/0x230
> [2.944234]  [] run_timer_softirq+0x217/0x8a0
> [2.944234]  [] ? run_timer_softirq+0x1a1/0x8a0
> [2.944234]  [] ? 
> ftrace_raw_output_itimer_expire+0x160/0x160
> [2.944234]  [] __do_softirq+0x1c0/0x5c0
> [2.944234]  [] call_softirq+0x1a/0x30
> [2.944234]  [] do_softirq+0x165/0x290
> [2.944234]  [] irq_exit+0xb7/0x130
> [2.944234]  [] smp_apic_timer_interrupt+0x77/0xb0
> [2.944234]  [] apic_timer_interrupt+0x71/0x80
> [2.944234][] ? ftrace_likely_update+0xc5/0x230
> [2.944234]  [] ? trace_hardirqs_off+0xd/0x10
> [2.944234]  [] ? native_safe_halt+0xb/0x10
> [2.944234]  [] default_idle+0x7d3/0x810
> [2.944234]  [] cpu_idle+0x14c/0x160
> [2.944234]  [] rest_init+0xe7/0xf4
> [2.944234]  [] ? csum_partial_copy_generic+0x16c/0x16c
> [2.944234]  [] start_kernel+0x4f4/0x4ff
> [2.944234]  [] ? vsyscall_gtod_data+0xf80/0xf80
> [2.944234]  [] ? vsyscall_gtod_data+0xf80/0xf80
> [2.944234]  [] x86_64_start_reservations+0x166/0x16a
> [2.944234]  [] x86_64_start_kernel+0x270/0x27f
> [2.944234] ---[ end trace 6d450e935ee1897c ]---
> [2.944234] possible reason: unannotated irqs-on.
> [2.944234] irq event stamp: 10085
> [2.944234] hardirqs last  enabled at (10084): [] 
> _raw_spin_unlock_irq+0x32/0x80
> [2.944234] hardirqs last disabled at (10085): [] 
> ftrace_likely_update+0x87/0x230

irqs were last disabled at ftrace_likely_update(), perhaps the branch
tracer called something in the wrong place.

I took your config, and I'm unable to reproduce this. Does this only
happen on virt boxes?

-- Steve


> [2.944234] softirqs last  enabled at (10076): [] 
> irq_enter+0x87/0x90
> [2.944234] softirqs last disabled at (10077): [] 
> call_softirq+0x1a/0x30
> [3.040274] PASSED
> [3.041998] HugeTLB registered 2 MB page size, pre-allocated 0 pages
> 
> 
> git bisect start v3.2 v3.1 --
> git bisect  bad 68d99b2c8efcb6ed3807a55569300c53b5f88be5  # 10:10  0-  
> Merge branch 'for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good efb8d21b2c6db3497655cc6a033ae8a9883e4063  # 10:18 27+  
> Merge branch 'tty-next' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> git bisect  bad 8686a0e200419322654a75155e2e6f80346a1297  # 10:22  0-  
> Merge branch 'perf-urgent-for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Re: [patch] mm, memcg: add oom killer delay

2013-06-24 Thread Kamezawa Hiroyuki

(2013/06/14 19:12), David Rientjes wrote:

On Fri, 14 Jun 2013, Kamezawa Hiroyuki wrote:


Reading your discussion, I think I understand your requirements.
The problem is that I don't think you took all the options into
account before concluding that this new oom_delay is the best way. IOW, I'm
not convinced that oom-delay is the best way to handle your issue.



Ok, let's talk about it.



I'm sorry that my RTT is so long these days.


Your requirement is
  - Allowing userland oom-handler within local memcg.



Another requirement:

  - Allow userland oom handler for global oom conditions.

Hopefully that's hooked into memcg because the functionality is already
there, we can simply duplicate all of the oom functionality that we'll be
adding for the root memcg.



At mm-summit, it was discussed, and people seem to think a user-land-oom-handler
is impossible. Hm, and in-kernel scripting was discussed, as far as I remember.




Considered straightforwardly, the answer should be
  - Allowing the oom-handler daemon to escape memcg's control by its limit.
(For example, a flag/capability for a task can achieve this.)
Or attaching some *fixed* resource to the task rather than cgroup.

Allow to set task->secret_saving=20M.



Exactly!

First of all, thanks very much for taking an interest in our usecase and
discussing it with us.

I didn't propose what I referred to earlier in the thread as "memcg
reserves" because I thought it was going to be a more difficult battle.
The fact that you brought it up first actually makes me think it's less
insane :)

We do indeed want memcg reserves and I have patches to add it if you'd
like to see that first.  It ensures that this userspace oom handler can
actually do some work in determining which process to kill.  The reserve
is a fraction of true memory reserves (the space below the per-zone min
watermarks) which is dependent on min_free_kbytes.  This does indeed
become more difficult with true and complete kmem charging.  That "work"
could be opening the tasks file (which allocates the pidlist within the
kernel), checking /proc/pid/status for rss, checking for how long a
process has been running, checking for tid, sending a signal to drop
caches, etc.




Considering only memcg, bypassing all charge-limit-check will work.
But as you say, that will not work against global-oom.
Then, in-kernel scripting was discussed.



We'd also like to do this for global oom conditions, which makes it even
more interesting.  I was thinking of using a fraction of memory reserves
as the oom killer currently does (that memory below the min watermark) for
these purposes.

Memory charging is simply bypassed for these oom handlers (we only grant
access to those waiting on the memory.oom_control eventfd) up to
memory.limit_in_bytes + (min_free_kbytes / 4), for example.  I don't think
this is entirely insane because these oom handlers should lead to future
memory freeing, just like TIF_MEMDIE processes.



I think that kind of bypassing is acceptable.



Going back to your patch, what's confusing is your approach.
Why should a problem caused by the amount of memory be solved by
some delay, i.e. an amount of time?

This exchanging sounds confusing to me.



Even with all of the above (which is not actually that invasive of a
patch), I still think we need memory.oom_delay_millisecs.  I probably made
a mistake in describing what that is addressing if it seems like it's
trying to address any of the above.

If a userspace oom handler fails to respond even with access to those
"memcg reserves",


How does this happen?


 the kernel needs to kill within that memcg.  Do we do
that above a set time period (this patch) or when the reserves are
completely exhausted?  That's debatable, but if we are to allow it for
global oom conditions as well then my opinion was to make it as safe as
possible; today, we can't disable the global oom killer from userspace and
I don't think we should ever allow it to be disabled.  I think we should
allow userspace a reasonable amount of time to respond and then kill if it
is exceeded.

For the global oom case, we want to have a priority-based memcg selection.
Select the lowest priority top-level memcg and kill within it.  If it has
an oom notifier, send it a signal to kill something.  If it fails to
react, kill something after memory.oom_delay_millisecs has elapsed.  If
there isn't a userspace oom notifier, kill something within that lowest
priority memcg.
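
The selection policy described in the paragraph above can be sketched as a few
pure functions. This is only an illustration of the proposal as worded in this
thread, not kernel code and not the patch itself. Every name below is
hypothetical, and it assumes that a smaller number means a lower priority.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a top-level memcg. */
struct memcg_sketch {
	int priority;           /* assumed: smaller value = lower priority */
	bool has_oom_notifier;  /* a userspace handler is registered */
};

enum oom_action {
	OOM_KILL_NOW,          /* no handler: the kernel kills within the memcg */
	OOM_NOTIFY_THEN_WAIT,  /* handler present: notify it and start the delay */
};

/* Pick the lowest-priority top-level memcg, or NULL if there is none. */
static const struct memcg_sketch *
pick_victim_memcg(const struct memcg_sketch *cgs, size_t n)
{
	const struct memcg_sketch *victim = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!victim || cgs[i].priority < victim->priority)
			victim = &cgs[i];
	return victim;
}

/* First step once a victim memcg has been chosen. */
static enum oom_action first_step(const struct memcg_sketch *cg)
{
	return cg->has_oom_notifier ? OOM_NOTIFY_THEN_WAIT : OOM_KILL_NOW;
}

/* After notifying: fall back to a kernel kill only once the delay expires. */
static bool must_kill_after_notify(bool handler_reacted,
				   unsigned int elapsed_ms,
				   unsigned int delay_ms)
{
	return !handler_reacted && elapsed_ms >= delay_ms;
}
```

Here delay_ms plays the role of memory.oom_delay_millisecs: the handler gets a
bounded window to react, and the kernel kills only if that window passes
without a reaction.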



Someone may be against that kind of control and say "Hey, I have a better idea".
That was another reason that oom-scripting was discussed. No one can implement
general-purpose victim-selection logic.


The bottomline with my approach is that I don't believe there is ever a
reason for an oom memcg to remain oom indefinitely.  That's why I hate
memory.oom_control == 1 and I think for the global notification it would
be deemed a nonstarter since you couldn't even login to the machine.


I'm not against what you finally want to do, but I 

Re: [PATCH] PCI: avoid NULL deref in alloc_pcie_link_state

2013-06-24 Thread Bjorn Helgaas
[+cc Michael, Alex, Isaku]

On Wed, Jun 19, 2013 at 12:56 PM, Radim Krčmář  wrote:
> PCIe switch upstream port can be connected directly to the PCIe root bus
> in QEMU; ASPM does not expect this topology and dereferences NULL pointer
> when initializing.
>
> I have not confirmed this can happen on real hardware, but it is presented
> as a feature in QEMU, so there is no reason to panic if we can recover.

This doesn't seem like a valid hardware topology to me.  If this *can*
occur on real hardware, we should fix it in Linux.  If not, maybe QEMU
should be changed to disallow it.

> The dereference happens with topology defined by
>   -M q35 -device x3130-upstream,bus=pcie.0,id=upstream \
>   -device xio3130-downstream,bus=upstream,id=downstream,chassis=1
> where on line drivers/pci/pcie/aspm.c:530 (alloc_pcie_link_state+13):
> parent = pdev->bus->parent->self->link_state;
> "pdev->bus->parent->self == NULL", because "pdev->bus->parent" has no
> "->parent", hence no "->self".
>
> Even though discouraged by QEMU documentation, one can set up even
> topology without the upstream port
>   -M q35 -device xio3130-downstream,bus=pcie.0,id=downstream,chassis=1
> so "pdev->bus->parent == NULL", because "pdev->bus" is the root bus.
> The patch checks for this too, because I do not like *NULL.
>
> Right now, PCIe switch has to connect to the root port
>   -M q35 -device ioh3420,bus=pcie.0,id=root.0 \
>   -device x3130-upstream,bus=root.0,id=upstream \
>   -device xio3130-downstream,bus=upstream,id=downstream,chassis=1
>
> Signed-off-by: Radim Krčmář 
> ---
>  drivers/pci/pcie/aspm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 403a443..1ad1514 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -527,8 +527,8 @@ static struct pcie_link_state 
> *alloc_pcie_link_state(struct pci_dev *pdev)
> link->pdev = pdev;
> if (pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM) {
> struct pcie_link_state *parent;
> -   parent = pdev->bus->parent->self->link_state;
> -   if (!parent) {
> +   if (!pdev->bus->parent || !pdev->bus->parent->self ||
> +   !(parent = pdev->bus->parent->self->link_state)) {
> kfree(link);
> return NULL;
> }
> --
> 1.8.1.4

I don't really want to further complicate the "if" statement you're
changing.  The link state allocation is pretty obtuse already, and if
this situation only occurs in QEMU, we're likely to break it again
when somebody refactors this code.

Bjorn
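
One way to keep the "if" statement simple would be to move the pointer walk into a small helper. The sketch below is not something proposed in this thread: the helper name pcie_parent_link_state() is hypothetical, and the structs are minimal userspace stand-ins that carry only the fields discussed above.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel structures; only the fields used here. */
struct pcie_link_state {
	int dummy;
};

struct pci_bus;

struct pci_dev {
	struct pci_bus *bus;
	struct pcie_link_state *link_state;
};

struct pci_bus {
	struct pci_bus *parent; /* NULL for a root bus */
	struct pci_dev *self;   /* bridge leading to this bus, may be NULL */
};

/* Hypothetical helper: not part of drivers/pci/pcie/aspm.c. */
struct pcie_link_state *pcie_parent_link_state(const struct pci_dev *pdev)
{
	const struct pci_bus *parent_bus;

	if (!pdev || !pdev->bus)
		return NULL;

	/* pdev->bus is the root bus: there is no parent bus at all. */
	parent_bus = pdev->bus->parent;
	if (!parent_bus)
		return NULL;

	/* The parent bus is the root bus: it has no bridge device. */
	if (!parent_bus->self)
		return NULL;

	return parent_bus->self->link_state;
}
```

With such a helper the caller would keep a single "if (!parent)" check, and both QEMU topologies from the patch description would simply yield NULL.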
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 06/14] locks: encapsulate the fl_link list handling

2013-06-24 Thread Stephen Rothwell
Hi Jeff,

Thanks for doing all this work!

Trivial comments below.

On Fri, 21 Jun 2013 08:58:14 -0400 Jeff Layton  wrote:
>
> +static inline void
> +locks_insert_global_locks(struct file_lock *fl)
> +{
> + list_add_tail(&fl->fl_link, &file_lock_list);
> +}

We generally do not use "inline" in C files any more and leave it to the
compiler to do that.  Also, without the "inline" these function headers
should all be able to fit on single lines like the others here i.e.

static void locks_insert_global_locks(struct file_lock *fl)
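
For illustration, the suggested form can be tried outside the kernel with a simplified list implementation. This is a sketch only: the list_head helpers below are userspace stand-ins rather than the kernel's <linux/list.h>, and the global list name file_lock_list is assumed from the quoted patch context.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel's list helpers (simplified). */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Insert "new" just before "head", i.e. at the tail of the list. */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Only the member relevant to this example. */
struct file_lock {
	struct list_head fl_link;
};

/* Assumed name of the global lock list. */
static struct list_head file_lock_list = LIST_HEAD_INIT(file_lock_list);

/* No "inline": the compiler decides, and the header fits on one line. */
static void locks_insert_global_locks(struct file_lock *fl)
{
	list_add_tail(&fl->fl_link, &file_lock_list);
}
```

A compiler is free to inline such a one-line static function at its call sites, so dropping the keyword should not normally change the generated code.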

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [Suggestion] arch: arm64: xen: "ln -s" the paravirt.h from arm.

2013-06-24 Thread Chen Gang
On 06/25/2013 12:50 AM, Stefano Stabellini wrote:
> On Mon, 24 Jun 2013, Stefano Stabellini wrote:
>> > On Mon, 24 Jun 2013, Chen Gang wrote:
>>> > > Hello Maintainers:
>>> > > 
>>> > > if 'CONFIG_XEN'
>>> > > 
>>> > >   CC  arch/arm64/xen/../../arm/xen/enlighten.o
>>> > > arch/arm64/xen/../../arm/xen/enlighten.c:19:26: fatal error: 
>>> > > asm/paravirt.h: No such file or directory
>>> > > 
>>> > > The related .config file for next-20130624 is in attachment.
>>> > > 
>>> > > 
>>> > > If "ln -s ../../../arm/include/asm/paravirt.h paravirt.h", it can pass
>>> > > compiling, but I do not know how to make a patch for it ("ln -s ..."),
>>> > > Do we have another more suitable ways for it (or another fixing ways) ?
>>> > > 
>>> > > Welcome any suggestions or completions.
>> > 
>> > The problem is caused by:
>> > 
>> > commit 3a885582a366caf868b0782041c44854ff4c3568
>> > Author: Stefano Stabellini 
>> > Date:   Wed May 29 10:56:34 2013 +
>> > 
>> > xen/arm: account for stolen ticks
>> > 
>> > that is in my tree for linux-next (even though I have not received any
>> > replies from the ARM maintainers so I don't know when and if it is going
>> > to go upstream).
>> > 
>> > I think that the best thing to do would be to add a couple of ifdef
>> > CONFIG_PARAVIRT in arch/arm/xen/enlighten.c. I'll add them to my tree.
> Actually now that we have XEN support under arm64, we can just introduce the
> same pv_time_op struct and paravirt header that this patch
> 
> http://marc.info/?l=linux-arm-kernel&m=136992435301890&w=2
> 
> is introducing under arm.
> Any opinions?
> 

OK, thanks. Excuse me, but I cannot open the link above (perhaps it
cannot be accessed from China).

But at least, the patch below is better than "ln -s" (which I provided).

Thanks.

> ---
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 19c1cde..47bd26b 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -207,6 +207,25 @@ config FORCE_MAX_ZONEORDER
>   default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
>   default "11"
>  
> +config PARAVIRT
> + bool "Enable paravirtualization code"
> + ---help---
> +   This changes the kernel so it can modify itself when it is run
> +   under a hypervisor, potentially improving performance significantly
> +   over full virtualization.
> +
> +config PARAVIRT_TIME_ACCOUNTING
> + bool "Paravirtual steal time accounting"
> + select PARAVIRT
> + default n
> + ---help---
> +   Select this option to enable fine granularity task steal time
> +   accounting. Time spent executing other tasks in parallel with
> +   the current vCPU is discounted from the vCPU power. To account for
> +   that, there can be a small performance impact.
> +
> +   If in doubt, say N here.
> +
>  config XEN_DOM0
>   def_bool y
>   depends on XEN
> @@ -214,6 +233,7 @@ config XEN_DOM0
>  config XEN
>   bool "Xen guest support on ARM64 (EXPERIMENTAL)"
>   depends on ARM64 && OF
> + select PARAVIRT
>   help
> Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64.
>  
> diff --git a/arch/arm64/include/asm/paravirt.h 
> b/arch/arm64/include/asm/paravirt.h
> new file mode 100644
> index 000..54e895b
> --- /dev/null
> +++ b/arch/arm64/include/asm/paravirt.h
> @@ -0,0 +1,19 @@
> +#ifndef _ASM_ARM64_PARAVIRT_H
> +#define _ASM_ARM64_PARAVIRT_H
> +
> +struct static_key;
> +extern struct static_key paravirt_steal_enabled;
> +extern struct static_key paravirt_steal_rq_enabled;
> +
> +struct pv_time_ops {
> + unsigned long long (*steal_clock)(int cpu);
> +};
> +extern struct pv_time_ops pv_time_ops;
> +
> +static inline u64 paravirt_steal_clock(int cpu)
> +{
> + return pv_time_ops.steal_clock(cpu);
> +}
> +
> +
> +#endif
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 7b4b564..17613ad 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP) += smp.o 
> smp_spin_table.o smp_psci.o
>  arm64-obj-$(CONFIG_HW_PERF_EVENTS)   += perf_event.o
>  arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
>  arm64-obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
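
The pv_time_ops indirection introduced by the quoted header can be exercised in a small userspace sketch. The struct and the paravirt_steal_clock() wrapper mirror the quoted patch; the two backend functions and the default of reporting zero stolen time are assumptions made for this example.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Same shape as the struct in the quoted header. */
struct pv_time_ops {
	unsigned long long (*steal_clock)(int cpu);
};

/* Assumed default backend: no hypervisor, so no stolen time. */
static unsigned long long native_steal_clock(int cpu)
{
	(void)cpu;
	return 0;
}

struct pv_time_ops pv_time_ops = { .steal_clock = native_steal_clock };

/* Callers go through the ops table and never name a backend directly. */
static inline u64 paravirt_steal_clock(int cpu)
{
	return pv_time_ops.steal_clock(cpu);
}

/* Hypothetical hypervisor backend returning made-up values in ns. */
static unsigned long long fake_hv_steal_clock(int cpu)
{
	return 1000ULL * (unsigned long long)(cpu + 1);
}
```

A hypervisor-specific backend would install its own callback by assigning to pv_time_ops.steal_clock during early setup, which is the reason for routing the call through a function pointer.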

Re: [PATCH V2] USB: initialize or shutdown PHY when add or remove host controller

2013-06-24 Thread Chao Xie
On Tue, Jun 25, 2013 at 3:45 AM, Felipe Balbi  wrote:
> Hi,
>
> On Fri, Jun 21, 2013 at 09:07:59AM +0800, Chao Xie wrote:
>> On Fri, Jun 21, 2013 at 1:25 AM, Alan Stern  
>> wrote:
>> > On Thu, 20 Jun 2013, Felipe Balbi wrote:
>> >
>> >> > In fact, the PHY setting and handling is related to the platform or SoC,
>> >> > and different SoCs can
>> >> > have the same EHCI HCD while their PHY handling differs.
>> >> > OMAP's case is an example, and I think some other vendors may have
>> >> > similar cases.
>> >> > Given the above, it is better to leave the PHY initialization and
>> >> > shutdown to be done by each ehci-xxx driver.
>> >> >
>> >> > So Alan and Felipe
>> >> > What are your ideas about it?
>> >>
>> >> If we have so many exceptions, then sure. But eventually, the common
>> >> case should be added generically with a flag so that non-generic cases
>> >> (like OMAP) can request to handle the PHY by themselves.
>> >>
>> >> Alan ?
>> >
>> > I don't have very strong feelings about this; Felipe has much more
>> > experience with these things.
>> >
>> > However, when the common case is added into the core, the simplest way
>> > to indicate that the HCD wants to handle the PHY(s) by itself will be
>> > to leave hcd->phy set to NULL or an ERR_PTR value.
>> >
>> > One important thing that hasn't been pointed out yet: When we move
>> > these calls into the core, the same patch must also remove those calls
>> > from the glue drivers that currently do set hcd->phy.  And it must make
>> > sure that the glue drivers which handle the PHY by themselves do not
>> > set hcd->phy.
>> >
>>
>> From the device point of view, EHCI is a standalone component. It has a
>> standard specification, so each
>> SoC vendor that has an EHCI HCD needs to follow the standard. That is why
>> we have a common EHCI HCD driver.
>> The PHY is outside of the EHCI component, and each SoC vendor may have a
>> different PHY implementation. That is why
>> we have PHY drivers.
>> The EHCI glue driver ehci-xxx works like an SoC-dependent driver. It is
>> its duty to handle the
>> relationship between the EHCI HCD driver and the PHY driver.
>
> that's not entirely true. We build abstractions layers so that the
> commonalities can be written generically. Just look at the amount of
> code I removed on v3.10 merge window by moving all other UDC drivers to
> use generic constructs I introduced earlier.
>
> It just so happens that OMAP's EHCI has two different working modes
> which mandates different ways to handle the PHY, one is pretty much the
> generic way (power up EHCI, then power up PHY) the other is inverted
> (PHY, then EHCI), that's the only reason (as of today) we're having this
> thread.
>
>> It is same as clk, irq requested by ehci-xxx driver.
>
> clocks could be handled generically in some cases, we have pm_clk_add()
> for a reason ;-)
>
> Also, clock handling can be hidden under pm_runtime callbacks (say,
> clk_enable() on ->runtime_resume(), clk_disable() on
> ->runtime_suspend()). IRQ is actually handled by usbcore, you just pass
> a handler which, in most cases, is the normal ehci_irq() handler.
>
> But we'll get to those later, let's focus on PHY for now.
>
Clock handling is another story. I know that OMAP has a full system to
handle clocks with PM runtime, and
I would like to discuss it when one day you want to do it.

>> So i think add a flag and use usb_get_phy() is not very good.
>
> Alan was talking about use hcd->phy as that flag, no flag would be
> added. But why isn't it very good? You didn't mention your reasoning.
>
I may have misunderstood something.
Using hcd->phy as a flag to indicate whether the glue driver needs
EHCI HCD to help with
PHY operations, such as initialization and shutdown, is fine, I think.
Adding another member as a flag in EHCI HCD to indicate the PHY
differences of each ehci-xxx.c driver,
and handling them in EHCI HCD, is not very good, I think. As
you said, moving the
common part into EHCI HCD is the target, but this member would import
all the differences into EHCI HCD.
It is better to let the ehci-xxx.c driver handle the differences if
it does not fit EHCI HCD's requirements
for common PHY handling, just as this patch did.
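
The "hcd->phy as a flag" idea can be sketched in userspace as follows. Everything here is a model under stated assumptions: the struct and function names are hypothetical, is_err_or_null() re-implements the idea behind the kernel's IS_ERR_OR_NULL(), and the real usbcore code may be organized differently.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace version of the IS_ERR_OR_NULL() idea: error codes are
 * encoded as pointers in the last MAX_ERRNO bytes of the address space. */
static int is_err_or_null(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, reduced PHY and HCD models. */
struct usb_phy_model {
	int (*init)(struct usb_phy_model *phy);
	void (*shutdown)(struct usb_phy_model *phy);
	int powered;
};

struct usb_hcd_model {
	/* NULL or an error pointer: the glue driver handles the PHY itself. */
	struct usb_phy_model *phy;
};

/* Core-side add path: touch the PHY only if the glue driver set it. */
int hcd_model_add(struct usb_hcd_model *hcd)
{
	if (is_err_or_null(hcd->phy))
		return 0;
	return hcd->phy->init(hcd->phy);
}

/* Core-side remove path, symmetric to hcd_model_add(). */
void hcd_model_remove(struct usb_hcd_model *hcd)
{
	if (!is_err_or_null(hcd->phy))
		hcd->phy->shutdown(hcd->phy);
}

/* Demo PHY callbacks that only record the power state. */
static int demo_phy_init(struct usb_phy_model *phy)
{
	phy->powered = 1;
	return 0;
}

static void demo_phy_shutdown(struct usb_phy_model *phy)
{
	phy->powered = 0;
}
```

In this model no extra flag member is needed: a glue driver with special ordering requirements simply leaves the phy pointer unset and keeps its own init and shutdown calls.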


>> It is better to make ehci-xxx to do the phy getting and EHCI HCD
>> initialize it and shut down as the patch did, or let ehci-xxx to
>> handle the PHY as Roger said.
>
> right, so this is what Alan suggested:
>
> ehci-xxx.c does usb_get_phy() (or any of those variants) and sets the
> returned pointer to hcd->phy. From that point on, ehci-hcd will play
> with the phy, resuming and suspending at the proper locations, asking
> the phy to enable wakeup capabilities and the like.
>
> In fact, because of that, I was just considering if I should protect
> usb_phy* against NULL pointers, just to make EHCI's life easier, I mean:
>
> static inline int usb_phy_set_suspend(struct usb_phy *phy, int suspend)
> {
> if (!phy)
> return 0;
>
> return phy->suspend(phy, suspend);
> }
>
This patch does not 

[PATCH v3] net: Unmap fragment page once iterator is done

2013-06-24 Thread Wedson Almeida Filho
Callers of skb_seq_read() are currently forced to call skb_abort_seq_read()
even when consuming all the data because the last call to skb_seq_read (the
one that returns 0 to indicate the end) fails to unmap the last fragment page.

With this patch callers will be allowed to traverse the SKB data by calling
skb_prepare_seq_read() once and repeatedly calling skb_seq_read() as originally
intended (and documented in the original commit 677e90eda - "[NET]: Zerocopy
sequential reading of skb data"), that is, only call skb_abort_seq_read() if
the sequential read is actually aborted.

Signed-off-by: Wedson Almeida Filho 
---
 drivers/scsi/libiscsi_tcp.c |1 -
 net/batman-adv/main.c   |1 -
 net/core/skbuff.c   |7 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 552e8a2..448eae8 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -906,7 +906,6 @@ int iscsi_tcp_recv_skb(struct iscsi_conn *conn, struct 
sk_buff *skb,
ISCSI_DBG_TCP(conn, "no more data avail. Consumed %d\n",
  consumed);
*status = ISCSI_TCP_SKB_DONE;
-   skb_abort_seq_read(&seq);
goto skb_done;
}
BUG_ON(segment->copied >= segment->size);
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index 51aafd6..08125f3 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -473,7 +473,6 @@ __be32 batadv_skb_crc32(struct sk_buff *skb, u8 
*payload_ptr)
crc = crc32c(crc, data, len);
consumed += len;
}
-   skb_abort_seq_read(&st);
 
return htonl(crc);
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index cfd777b..26ea1cf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2554,8 +2554,13 @@ unsigned int skb_seq_read(unsigned int consumed, const 
u8 **data,
unsigned int block_limit, abs_offset = consumed + st->lower_offset;
skb_frag_t *frag;
 
-   if (unlikely(abs_offset >= st->upper_offset))
+   if (unlikely(abs_offset >= st->upper_offset)) {
+   if (st->frag_data) {
+   kunmap_atomic(st->frag_data);
+   st->frag_data = NULL;
+   }
return 0;
+   }
 
 next_skb:
block_limit = skb_headlen(st->cur_skb) + st->stepped_offset;
-- 
1.7.9.5
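
The calling convention this patch restores (prepare once, read until 0, abort only on early exit) can be modelled in userspace. This is a simplified sketch and not the skbuff code: the seq_* names and structs are hypothetical, the "mapped" flag merely stands in for an outstanding kmap_atomic(), and callers are assumed to pass a "consumed" value that never decreases.

```c
#include <assert.h>
#include <stddef.h>

struct frag {
	const unsigned char *buf;
	unsigned int len;
};

struct seq_state {
	const struct frag *frags;
	unsigned int idx;     /* current fragment */
	unsigned int stepped; /* bytes in fragments before idx */
	unsigned int total;   /* total bytes across all fragments */
	int mapped;           /* models an outstanding kmap_atomic() */
};

void seq_prepare(struct seq_state *st, const struct frag *frags,
		 unsigned int nr_frags)
{
	unsigned int i;

	st->frags = frags;
	st->idx = 0;
	st->stepped = 0;
	st->total = 0;
	st->mapped = 0;
	for (i = 0; i < nr_frags; i++)
		st->total += frags[i].len;
}

/* Returns the length of the next chunk, or 0 at the end of the data. */
unsigned int seq_read(unsigned int consumed, const unsigned char **data,
		      struct seq_state *st)
{
	assert(consumed >= st->stepped);

	if (consumed >= st->total) {
		st->mapped = 0; /* the unmap this patch adds at end of data */
		return 0;
	}

	for (;;) {
		const struct frag *f = &st->frags[st->idx];

		if (consumed < st->stepped + f->len) {
			st->mapped = 1; /* map, or keep mapped, this fragment */
			*data = f->buf + (consumed - st->stepped);
			return st->stepped + f->len - consumed;
		}
		/* Fragment fully consumed: unmap it and step to the next. */
		st->mapped = 0;
		st->stepped += f->len;
		st->idx++;
	}
}

/* Only needed when the caller stops before seq_read() returned 0. */
void seq_abort(struct seq_state *st)
{
	st->mapped = 0;
}
```

A caller that loops until seq_read() returns 0 ends with nothing mapped and never needs seq_abort(), which matches the behaviour the commit message describes.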



[PATCH v4] itimers: Remove bogus NULL pointer check in sys_getitimer()

2013-06-24 Thread Chen Gang
People might be tricked into assuming that the return value for a
failed NULL pointer check should be -EINVAL instead of -EFAULT.

Remove the misleading NULL pointer check to fix this nuisance.

Aside of that this patch fixes the problem of NOMMU kernels, where
a NULL pointer dereference is a valid operation. This allows to
boot NOMMU kernels without working around the shortcomings of the
getitimer() system call, which have been ignored since this NULL
pointer check was introduced in Linux 0.96a.

Signed-off-by: Chen Gang 
---
 kernel/itimer.c |   13 ++---
 1 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/kernel/itimer.c b/kernel/itimer.c
index 8d262b4..3b12271 100644
--- a/kernel/itimer.c
+++ b/kernel/itimer.c
@@ -102,15 +102,14 @@ int do_getitimer(int which, struct itimerval *value)
 
 SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, value)
 {
-   int error = -EFAULT;
+   int error;
struct itimerval get_buffer;
 
-   if (value) {
-   error = do_getitimer(which, &get_buffer);
-   if (!error &&
-   copy_to_user(value, &get_buffer, sizeof(get_buffer)))
-   error = -EFAULT;
-   }
+   error = do_getitimer(which, &get_buffer);
+   if (!error &&
+   copy_to_user(value, &get_buffer, sizeof(get_buffer)))
+   error = -EFAULT;
+
return error;
 }
 
-- 
1.7.7.6
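
The control flow after this patch can be modelled in userspace. This is a sketch with mocks, not kernel code: mock_copy_to_user() and mock_do_getitimer() are hypothetical stand-ins, and the mock treats only NULL as an unwritable destination, which corresponds to an MMU system rather than the NOMMU case mentioned in the changelog.

```c
#include <errno.h>
#include <string.h>

struct itimerval_model {
	long interval_usec;
	long value_usec;
};

/* Mock: returns the number of bytes NOT copied, like copy_to_user().
 * Assumption: only NULL counts as a bad user pointer here. */
static unsigned long mock_copy_to_user(void *to, const void *from,
				       unsigned long n)
{
	if (!to)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Mock: accepts timers 0..2 and fills in made-up values. */
static int mock_do_getitimer(int which, struct itimerval_model *value)
{
	if (which < 0 || which > 2)
		return -EINVAL;
	value->interval_usec = 0;
	value->value_usec = 250000;
	return 0;
}

/* Same shape as the patched syscall body: no explicit NULL check. */
int getitimer_model(int which, struct itimerval_model *value)
{
	int error;
	struct itimerval_model get_buffer;

	error = mock_do_getitimer(which, &get_buffer);
	if (!error &&
	    mock_copy_to_user(value, &get_buffer, sizeof(get_buffer)))
		error = -EFAULT;

	return error;
}
```

In this model a bad destination still produces -EFAULT through the copy step, while an invalid timer number is now reported as -EINVAL even when the pointer is also bad, because the timer lookup runs first.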





Re: Tux3 Report: Meet Shardmap, the designated successor of HTree

2013-06-24 Thread David Lang

On Tue, 25 Jun 2013, Christian Stroetmann wrote:


Dear Mr. Richard Weinberger:

Thank you very much for the reminder and the proof again that a profound 
discussion seems not to be possible. Even more important is the point that 
the discussion related with the ReiserFS was different than this discussion, 
because this time I have not presented the LogHashFS to the open source 
community, but another person has taken copyright descriptions from my 
websites and wanted to make it an open source project and this even by the 
support of another company, which by the way has its very own business 
strategy.


unless they copied your descriptions pretty close to word for word it's not a 
copyright violation.


It's perfectly legal to take the ideas from one document and write a new 
document that explains those ideas. The copyright is on the exact text, not on 
the ideas.


Patents give you the right to the idea, Copyright only gives you the right to 
the particular expression of the idea.


If you published a paper explaining the concept of a LogHashFS that contained no 
code, then anyone who actually wrote a filesystem implementing the ideas in your 
paper could not possibly be violating your copyright (unless they included too 
much of your paper in comments), because they wrote code, not a paper.


David Lang


Re: [PATCH 29/32] rcu: delete __cpuinit usage from all rcu files

2013-06-24 Thread Josh Triplett
On Mon, Jun 24, 2013 at 03:30:34PM -0400, Paul Gortmaker wrote:
> The __cpuinit type of throwaway sections might have made sense
> some time ago when RAM was more constrained, but now the savings
> do not offset the cost and complications.  For example, the fix in
> commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
> is a good example of the nasty type of bugs that can be created
> with improper use of the various __init prefixes.
> 
> After a discussion on LKML[1] it was decided that cpuinit should go
> the way of devinit and be phased out.  Once all the users are gone,
> we can then finally remove the macros themselves from linux/init.h.
> 
> This removes all the drivers/rcu uses of the __cpuinit macros
> from all C files.
> 
> [1] https://lkml.org/lkml/2013/5/20/589
> 
> Cc: "Paul E. McKenney" 
> Cc: Josh Triplett 
> Cc: Dipankar Sarma 
> Signed-off-by: Paul Gortmaker 
> ---

Reviewed-by: Josh Triplett 


[PATCH v2] regulator: max77693: Add max77693 regulator driver.

2013-06-24 Thread Jonghwa Lee
This patch adds new regulator driver to support max77693 chip's regulators.
max77693 has two linear voltage regulators and one current regulator which
can be controlled through I2C bus. This driver also supports device tree.

Signed-off-by: Jonghwa Lee 
Signed-off-by: Myungjoo Ham 
---
changes in v2:
 - Add comments for charger regulator's operations.
 - Add a binding document under Documentation/devicetree/binding/mfd/

 Documentation/devicetree/bindings/mfd/max77693.txt |   55 
 drivers/regulator/Kconfig  |9 +
 drivers/regulator/Makefile |1 +
 drivers/regulator/max77693.c   |  324 
 include/linux/mfd/max77693-private.h   |   13 +
 include/linux/mfd/max77693.h   |   18 ++
 6 files changed, 420 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mfd/max77693.txt
 create mode 100644 drivers/regulator/max77693.c

diff --git a/Documentation/devicetree/bindings/mfd/max77693.txt 
b/Documentation/devicetree/bindings/mfd/max77693.txt
new file mode 100644
index 000..11921cc
--- /dev/null
+++ b/Documentation/devicetree/bindings/mfd/max77693.txt
@@ -0,0 +1,55 @@
+Maxim MAX77693 multi-function device
+
+MAX77693 is a Multifunction device with the following submodules:
+- PMIC,
+- CHARGER,
+- LED,
+- MUIC,
+- HAPTIC
+
+It is interfaced to host controller using i2c.
+This document describes the bindings for the mfd device.
+
+Required properties:
+- compatible : Must be "maxim,max77693".
+- reg : Specifies the i2c slave address of PMIC block.
+- interrupts : This i2c device has an IRQ line connected to the main SoC.
+- interrupt-parent :  The parent interrupt controller.
+
+Optional properties:
+- regulators : The regulators of max77693 have to be instantiated under subnode
+  named "regulators" using the following format.
+
+   regulators {
+   regulator-compatible = ESAFEOUT1/ESAFEOUT2/CHARGER
+   standard regulator constraints[*].
+   };
+
+   [*] refer Documentation/devicetree/bindings/regulator/regulator.txt
+
+Example:
+   max77693@66 {
+   compatible = "maxim,max77693";
+   reg = <0x66>;
+   interrupt-parent = <>;
+   interrupts = <5 2>;
+
+   regulators {
+   esafeout@1 {
+   regulator-compatible = "ESAFEOUT1";
+   regulator-name = "ESAFEOUT1";
+   regulator-boot-on;
+   };
+   esafeout@2 {
+   regulator-compatible = "ESAFEOUT2";
+   regulator-name = "ESAFEOUT2";
+   };
+   charger@0 {
+   regulator-compatible = "CHARGER";
+   regulator-name = "CHARGER";
+   regulator-min-microamp = <6>;
+   regulator-max-microamp = <258>;
+   regulator-boot-on;
+   };
+   };
+   };
diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 9296425..f1e6ad9 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -250,6 +250,15 @@ config REGULATOR_MAX77686
  via I2C bus. The provided regulator is suitable for
  Exynos-4 chips to control VARM and VINT voltages.
 
+config REGULATOR_MAX77693
+   tristate "Maxim MAX77693 regulator"
+   depends on MFD_MAX77693
+   help
+ This driver controls a Maxim 77693 regulator via I2C bus.
+ The regulators include two LDOs, 'SAFEOUT1', 'SAFEOUT2'
+ and one current regulator 'CHARGER'. This is suitable for
+ Exynos-4x12 chips.
+
 config REGULATOR_PCAP
tristate "Motorola PCAP2 regulator driver"
depends on EZX_PCAP
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index 26e6c4a..ba4a3cf 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_REGULATOR_MAX8973) += max8973-regulator.o
 obj-$(CONFIG_REGULATOR_MAX8997) += max8997.o
 obj-$(CONFIG_REGULATOR_MAX8998) += max8998.o
 obj-$(CONFIG_REGULATOR_MAX77686) += max77686.o
+obj-$(CONFIG_REGULATOR_MAX77693) += max77693.o
 obj-$(CONFIG_REGULATOR_MC13783) += mc13783-regulator.o
 obj-$(CONFIG_REGULATOR_MC13892) += mc13892-regulator.o
 obj-$(CONFIG_REGULATOR_MC13XXX_CORE) +=  mc13xxx-regulator-core.o
diff --git a/drivers/regulator/max77693.c b/drivers/regulator/max77693.c
new file mode 100644
index 000..674ece7
--- /dev/null
+++ b/drivers/regulator/max77693.c
@@ -0,0 +1,324 @@
+/*
+ * max77693.c - Regulator driver for the Maxim 77693
+ *
+ * Copyright (C) 2013 Samsung Electronics
+ * Jonghwa Lee 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the 

Re: [tip:perf/core] perf sort: Separate out memory-specific sort keys

2013-06-24 Thread Namhyung Kim

2013-06-25 AM 10:02, Andi Kleen wrote:

I'm not sure it should move to the common keys as normal perf
session won't have those.


Why not? If I enable weight sampling i get weights perfectly
fine in any session using the right PEBS events.


I guess you need to set up a couple of
TSX-specific sort keys like perf mem, if so what about moving the
two weights to your table?


There's no TSX table and no separate perf TSX session.
It's just some additional attributes.


Okay then, I have no objection for moving them. :)

Thanks,
Namhyung



Re: [RFC 3/6] drm: add SimpleDRM driver

2013-06-24 Thread Andy Lutomirski
On 06/24/2013 03:27 PM, David Herrmann wrote:
> + sdrm->fb_map = ioremap(sdrm->fb_base, sdrm->fb_size);

This should probably be ioremap_wc.  Otherwise it will be *really* slow
if used in legacy mode and it may cause conflicts with the
pgprot_writecombine mode for mmap.

(Watching boot messages go by on fbcon on efifb was like using an old
2400 baud modem before I made the corresponding change to efifb.)

--Andy


Re: [tip:perf/core] perf sort: Separate out memory-specific sort keys

2013-06-24 Thread Andi Kleen
> I'm not sure it should move to the common keys as normal perf
> session won't have those.  

Why not? If I enable weight sampling i get weights perfectly
fine in any session using the right PEBS events.

> I guess you need to set up a couple of
> TSX-specific sort keys like perf mem, if so what about moving the
> two weights to your table?

There's no TSX table and no separate perf TSX session.
It's just some additional attributes.

-Andi


Re: [tip:perf/core] perf sort: Separate out memory-specific sort keys

2013-06-24 Thread Namhyung Kim

Hi Andi,

2013-06-25 AM 9:30, Andi Kleen wrote:

On Fri, May 31, 2013 at 04:20:20AM -0700, tip-bot for Namhyung Kim wrote:

perf sort: Separate out memory-specific sort keys

Since they're used only for perf mem, separate out them to a different
dimension so that normal user cannot access them by any chance.

For global/local weights, I'm not entirely sure to place them into the
memory dimension.  But it's the only user at this time.


So I was finally able to test with this patch, but
I found it completely breaks TSX weight abort profiling.

It uses weight (global/local), but it's not
running in memory mode.

I'll send a patch to move weight back into
the common keys.


I'm not sure it should move to the common keys as normal perf session 
won't have those.  I guess you need to set up a couple of TSX-specific 
sort keys like perf mem, if so what about moving the two weights to your 
table?


Thanks,
Namhyung


Re: [PATCH v2] kernel/itimer.c: beautify code, not need check 'value', so save one instruction, simpler and easier for readers.t

2013-06-24 Thread Chen Gang
On 06/25/2013 07:28 AM, Thomas Gleixner wrote:
> On Fri, 21 Jun 2013, Chen Gang wrote:
> > >> > Also can let code simpler and easier for readers: if checking 
> > >> > parameter
> > >> > 'value', it will easily lead readers to think about why not return
> > >> > -EINVAL instead of -EFAULT, when checking parameter failed.
>>> > > So you are seriously claiming, that the check for !value makes people
>>> > > think that the return value should be -EINVAL?
>>> > > 
>>> > > That's hilarious.
>>> > > 
>> > That seems not a quite polite word, is it ?  ;-)
> My apologies for being so impolite. Let me rephrase it. Here is a
> "sample" changelog for your patch:
> 

It doesn't matter; I really don't (and shouldn't) mind.

Next time I will prepare patches more carefully, to save the
maintainers' time.

And please excuse my poor English and my unfamiliarity with the kernel; I
am trying to improve both, and will keep improving.


>   Subject: itimers: Remove bogus NULL pointer check in sys_getitimer()
> 
> People might be tricked into assuming that the return value for a
> failed NULL pointer check should be -EINVAL instead of -EFAULT.
> 
> Remove the misleading NULL pointer check to fix this nuisance.
> 
> Aside of that this patch fixes the problem of NOMMU kernels, where
> a NULL pointer dereference is a valid operation. This allows to
> boot NOMMU kernels without working around the shortcomings of the
> getitimer() system call, which have been ignored since this NULL
> pointer check was introduced in Linux 0.96a.
> 

Really very good comments, at least for me now, I really can not write a
comment like that.

> 
> Please resubmit.

I will send patch v4 (patch v3 has already been sent and should be considered obsolete).

Thanks.
-- 
Chen Gang

Asianux Corporation


Re: [PATCH] ARM: mach-clps711x: common: Use linux/sched_clock.h

2013-06-24 Thread Stephen Boyd
On 06/24/13 17:03, Fabio Estevam wrote:
> From: Fabio Estevam 
>
> Commit 38ff87f7 (sched_clock: Make ARM's sched_clock generic for all
> architectures) changed the header to <linux/sched_clock.h>, so adapt it in
> order to fix the following build error:
>
> arch/arm/mach-clps711x/common.c:37:29: fatal error: asm/sched_clock.h: No 
> such file or directory
>
> Signed-off-by: Fabio Estevam 
> ---

Is this one in the arm-soc tree? I plan to make a sweep after 3.11-rc1
and fix up all the stragglers (looks like just this one) and remove the
dummy asm header that just got merged into the tip tree.

>  arch/arm/mach-clps711x/common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mach-clps711x/common.c b/arch/arm/mach-clps711x/common.c
> index 4ca2f3c..134641d 100644
> --- a/arch/arm/mach-clps711x/common.c
> +++ b/arch/arm/mach-clps711x/common.c
> @@ -29,12 +29,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include 


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: Tux3 Report: Meet Shardmap, the designated successor of HTree

2013-06-24 Thread Christian Stroetmann

Dear Mr. Richard Weinberger:

Thank you very much for the reminder and the proof again that a profound 
discussion seems not to be possible. Even more important is the point 
that the discussion related with the ReiserFS was different than this 
discussion, because this time I have not presented the LogHashFS to the 
open source community, but another person has taken copyright 
descriptions from my websites and wanted to make it an open source 
project and this even by the support of another company, which by the 
way has its very own business strategy.


Now, others and I have heard what we wanted: the moment you have no
more arguments, you become offensive and begin to mob and to intrigue.
Besides this, ReiserFS is virtually dead. Furthermore, that journalist
from the Linux Magazin said it for other political and economic reasons
in the B.R.D. as well, and most probably never did anything that is
important for the open source community. Sooner or later he will get a
letter from my attorney for this offense, with the demand to apologize
publicly on the ReiserFS and Tux3 mailing lists.


That said, I will not send any more e-mails to this Chuck Nonsense thread.
It was a mistake to try it again at all.




Sincerely
Christian Stroetmann


Let's do the same as in 2009[1] and finish this thread.

[1] http://www.spinics.net/lists/reiserfs-devel/msg01543.html

--
Thanks,
//richard


Re: [tip:perf/core] perf sort: Separate out memory-specific sort keys

2013-06-24 Thread Andi Kleen
On Fri, May 31, 2013 at 04:20:20AM -0700, tip-bot for Namhyung Kim wrote:
> perf sort: Separate out memory-specific sort keys
> 
> Since they're used only for perf mem, separate out them to a different
> dimension so that normal user cannot access them by any chance.
> 
> For global/local weights, I'm not entirely sure to place them into the
> memory dimension.  But it's the only user at this time.

So I was finally able to test with this patch, but 
I found it completely breaks TSX weight abort profiling.

It uses weight (global/local), but it's not 
running in memory mode.

I'll send a patch to move weight back into
the common keys.

-Andi



[PATCH 1/5] mm,fs: introduce helpers around i_mmap_mutex

2013-06-24 Thread Davidlohr Bueso
Various parts of the kernel acquire and release this mutex,
so add i_mmap_lock_write() and i_mmap_unlock_write() helper
functions that will encapsulate this logic. The next patch
will make use of these.

Signed-off-by: Davidlohr Bueso 
---
 include/linux/fs.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 65c2be2..1ea6c68 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -475,6 +475,16 @@ struct block_device {
 
 int mapping_tagged(struct address_space *mapping, int tag);
 
+static inline void i_mmap_lock_write(struct address_space *mapping)
+{
+   mutex_lock(&mapping->i_mmap_mutex);
+}
+
+static inline void i_mmap_unlock_write(struct address_space *mapping)
+{
+   mutex_unlock(&mapping->i_mmap_mutex);
+}
+
 /*
  * Might pages of this file be mapped into userspace?
  */
-- 
1.7.11.7



Re: linux-next: build failure after merge of the final tree (staging tree related)

2013-06-24 Thread Peng Tao

On 06/25/2013 07:50 AM, Greg KH wrote:

> On Tue, Jun 25, 2013 at 09:40:51AM +1000, Stephen Rothwell wrote:
>> Hi Greg,
>>
>> On Mon, 24 Jun 2013 15:40:35 -0700 Greg KH  wrote:
>>>
>>> We are running out of time, my tree is pretty much closed for 3.11 now,
>>> should I just disable the build of this module for 3.11?
>>
>> That's what I've been doing - it has never been enabled in a final
>> linux-next release.  So, it should probably just be disabled properly.
>
> I agree, now disabled.
>
> greg k-h

Greg,

Sorry for the delay. I've been caught up by other build failures that
were found during my tests. I'm sending you the latest patches, which
should fix all the build errors. I've verified that the build now passes
on sparc/mips/s390/powerpc/x86, both 32-bit and 64-bit. Please help to
queue them.


I see that you have already disabled the Lustre build. Can we re-enable it
or do we have to wait until 3.10 is released?


Thanks,
Tao


[PATCH 4/5] mm/rmap: share the i_mmap_rwsem

2013-06-24 Thread Davidlohr Bueso
Similar to commit 4fc3f1d6, which optimized the anon-vma rwsem, we can share
the i_mmap_rwsem among multiple readers for rmap_walk_file(),
try_to_unmap_file() and collect_procs_file().

With this change, and the rwsem optimizations discussed in
http://lkml.org/lkml/2013/6/16/38 we can see performance improvements.
On an 8-socket, 80-core DL980, when compared to a vanilla 3.10-rc5, aim7
benefits in throughput, with the following workloads (beyond 500 users):

- alltests (+14.5%)
- custom (+17%)
- disk (+11%)
- high_systime (+5%)
- shared (+15%)
- short (+4%)

For lower amounts of users, there are no significant differences as all numbers
are within the 0-2% noise range.

Signed-off-by: Davidlohr Bueso 
---
 include/linux/fs.h  | 10 ++
 mm/memory-failure.c |  7 +++
 mm/rmap.c   | 12 ++--
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 79b8548..5646641 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -485,6 +485,16 @@ static inline void i_mmap_unlock_write(struct 
address_space *mapping)
up_write(&mapping->i_mmap_rwsem);
 }
 
+static inline void i_mmap_lock_read(struct address_space *mapping)
+{
+   down_read(&mapping->i_mmap_rwsem);
+}
+
+static inline void i_mmap_unlock_read(struct address_space *mapping)
+{
+   up_read(&mapping->i_mmap_rwsem);
+}
+
 /*
  * Might pages of this file be mapped into userspace?
  */
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e7e0f90..6db44eb 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -436,7 +436,7 @@ static void collect_procs_file(struct page *page, struct 
list_head *to_kill,
struct task_struct *tsk;
struct address_space *mapping = page->mapping;
 
-   i_mmap_lock_write(mapping);
+   i_mmap_lock_read(mapping);
read_lock(&tasklist_lock);
for_each_process(tsk) {
pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -444,8 +444,7 @@ static void collect_procs_file(struct page *page, struct 
list_head *to_kill,
if (!task_early_kill(tsk))
continue;
 
-   vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff,
- pgoff) {
+   vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
/*
 * Send early kill signal to tasks where a vma covers
 * the page but the corrupted page is not necessarily
@@ -458,7 +457,7 @@ static void collect_procs_file(struct page *page, struct 
list_head *to_kill,
}
}
read_unlock(&tasklist_lock);
-   i_mmap_unlock_write(mapping);
+   i_mmap_unlock_read(mapping);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index bc8eeb5..98b986d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -808,7 +808,7 @@ static int page_referenced_file(struct page *page,
 */
BUG_ON(!PageLocked(page));
 
-   i_mmap_lock_write(mapping);
+   i_mmap_lock_read(mapping);
 
/*
 * i_mmap_mutex does not stabilize mapcount at all, but mapcount
@@ -831,7 +831,7 @@ static int page_referenced_file(struct page *page,
break;
}
 
-   i_mmap_unlock_write(mapping);
+   i_mmap_unlock_read(mapping);
return referenced;
 }
 
@@ -1516,7 +1516,7 @@ static int try_to_unmap_file(struct page *page, enum 
ttu_flags flags)
if (PageHuge(page))
pgoff = page->index << compound_order(page);
 
-   i_mmap_lock_write(mapping);
+   i_mmap_lock_read(mapping);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
unsigned long address = vma_address(page, vma);
ret = try_to_unmap_one(page, vma, address, flags);
@@ -1594,7 +1594,7 @@ static int try_to_unmap_file(struct page *page, enum 
ttu_flags flags)
list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.nonlinear)
vma->vm_private_data = NULL;
 out:
-   i_mmap_unlock_write(mapping);
+   i_mmap_unlock_read(mapping);
return ret;
 }
 
@@ -1711,7 +1711,7 @@ static int rmap_walk_file(struct page *page, int 
(*rmap_one)(struct page *,
 
if (!mapping)
return ret;
-   i_mmap_lock_write(mapping);
+   i_mmap_lock_read(mapping);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
unsigned long address = vma_address(page, vma);
ret = rmap_one(page, vma, address, arg);
@@ -1723,7 +1723,7 @@ static int rmap_walk_file(struct page *page, int 
(*rmap_one)(struct page *,
 * never contain migration ptes.  Decide what to do about this
 * limitation to linear when we need rmap_walk() on nonlinear.
 */
-   i_mmap_unlock_write(mapping);
+   i_mmap_unlock_read(mapping);
return ret;
 }
 
-- 
1.7.11.7


[PATCH 3/5] mm: convert i_mmap_mutex to rwsem

2013-06-24 Thread Davidlohr Bueso
This conversion is straightforward. All users take the write
lock, so there is really not much difference with the previous
mutex lock.

Signed-off-by: Davidlohr Bueso 
---
 fs/inode.c | 2 +-
 include/linux/fs.h | 6 +++---
 mm/mmap.c  | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 00d5fc3..af5f0ea 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -345,7 +345,7 @@ void address_space_init_once(struct address_space *mapping)
memset(mapping, 0, sizeof(*mapping));
INIT_RADIX_TREE(&mapping->page_tree, GFP_ATOMIC);
spin_lock_init(&mapping->tree_lock);
-   mutex_init(&mapping->i_mmap_mutex);
+   init_rwsem(&mapping->i_mmap_rwsem);
INIT_LIST_HEAD(&mapping->private_list);
spin_lock_init(&mapping->private_lock);
mapping->i_mmap = RB_ROOT;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1ea6c68..79b8548 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -410,7 +410,7 @@ struct address_space {
    unsigned int        i_mmap_writable;  /* count VM_SHARED mappings */
    struct rb_root      i_mmap;           /* tree of private and shared mappings */
    struct list_head    i_mmap_nonlinear; /*list VM_NONLINEAR mappings */
-   struct mutex        i_mmap_mutex;     /* protect tree, count, list */
+   struct rw_semaphore i_mmap_rwsem;     /* protect tree, count, list */
    /* Protected by tree_lock together with the radix tree */
    unsigned long       nrpages;          /* number of total pages */
    pgoff_t             writeback_index;  /* writeback starts here */
@@ -477,12 +477,12 @@ int mapping_tagged(struct address_space *mapping, int 
tag);
 
 static inline void i_mmap_lock_write(struct address_space *mapping)
 {
-   mutex_lock(&mapping->i_mmap_mutex);
+   down_write(&mapping->i_mmap_rwsem);
 }
 
 static inline void i_mmap_unlock_write(struct address_space *mapping)
 {
-   mutex_unlock(&mapping->i_mmap_mutex);
+   up_write(&mapping->i_mmap_rwsem);
 }
 
 /*
diff --git a/mm/mmap.c b/mm/mmap.c
index 01a9876..b4e142a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3016,7 +3016,7 @@ static void vm_lock_mapping(struct mm_struct *mm, struct 
address_space *mapping)
 */
if (test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags))
BUG();
-   mutex_lock_nest_lock(&mapping->i_mmap_mutex, &mm->mmap_sem);
+   down_write_nest_lock(&mapping->i_mmap_rwsem, &mm->mmap_sem);
}
 }
 
-- 
1.7.11.7



[PATCH 2/5] mm: use new helper functions around the i_mmap_mutex

2013-06-24 Thread Davidlohr Bueso
Convert all open coded mutex_lock/unlock calls to the
i_mmap_[lock/unlock]_write() helpers.

Signed-off-by: Davidlohr Bueso 
---
 arch/x86/mm/hugetlbpage.c |  4 ++--
 fs/hugetlbfs/inode.c  |  4 ++--
 kernel/events/uprobes.c   |  4 ++--
 kernel/fork.c |  4 ++--
 mm/filemap_xip.c  |  4 ++--
 mm/fremap.c   |  4 ++--
 mm/hugetlb.c  |  8 
 mm/memory-failure.c   |  4 ++--
 mm/memory.c   |  8 
 mm/mmap.c | 14 +++---
 mm/mremap.c   |  4 ++--
 mm/nommu.c| 14 +++---
 mm/rmap.c | 16 
 13 files changed, 46 insertions(+), 46 deletions(-)

diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index ae1aa71..9c61a1e 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -79,7 +79,7 @@ huge_pmd_share(struct mm_struct *mm, unsigned long addr, 
pud_t *pud)
if (!vma_shareable(vma, addr))
return (pte_t *)pmd_alloc(mm, pud, addr);
 
-   mutex_lock(&mapping->i_mmap_mutex);
+   i_mmap_lock_write(mapping);
vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
if (svma == vma)
continue;
@@ -105,7 +105,7 @@ huge_pmd_share(struct mm_struct *mm, unsigned long addr, 
pud_t *pud)
spin_unlock(&mm->page_table_lock);
 out:
pte = (pte_t *)pmd_alloc(mm, pud, addr);
-   mutex_unlock(&mapping->i_mmap_mutex);
+   i_mmap_unlock_write(mapping);
return pte;
 }
 
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index a3f868a..35eebfc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -404,10 +404,10 @@ static int hugetlb_vmtruncate(struct inode *inode, loff_t 
offset)
pgoff = offset >> PAGE_SHIFT;
 
i_size_write(inode, offset);
-   mutex_lock(&mapping->i_mmap_mutex);
+   i_mmap_lock_write(mapping);
if (!RB_EMPTY_ROOT(&mapping->i_mmap))
hugetlb_vmtruncate_list(&mapping->i_mmap, pgoff);
-   mutex_unlock(&mapping->i_mmap_mutex);
+   i_mmap_unlock_write(mapping);
truncate_hugepages(inode, offset);
return 0;
 }
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index f356974..c7b9f45 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -693,7 +693,7 @@ build_map_info(struct address_space *mapping, loff_t 
offset, bool is_register)
int more = 0;
 
  again:
-   mutex_lock(&mapping->i_mmap_mutex);
+   i_mmap_lock_write(mapping);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
if (!valid_vma(vma, is_register))
continue;
@@ -724,7 +724,7 @@ build_map_info(struct address_space *mapping, loff_t 
offset, bool is_register)
info->mm = vma->vm_mm;
info->vaddr = offset_to_vaddr(vma, offset);
}
-   mutex_unlock(&mapping->i_mmap_mutex);
+   i_mmap_unlock_write(mapping);
 
if (!more)
goto out;
diff --git a/kernel/fork.c b/kernel/fork.c
index 987b28a..13226f1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -420,7 +420,7 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct 
*oldmm)
get_file(file);
if (tmp->vm_flags & VM_DENYWRITE)
atomic_dec(&inode->i_writecount);
-   mutex_lock(&mapping->i_mmap_mutex);
+   i_mmap_lock_write(mapping);
if (tmp->vm_flags & VM_SHARED)
mapping->i_mmap_writable++;
flush_dcache_mmap_lock(mapping);
@@ -432,7 +432,7 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct 
*oldmm)
vma_interval_tree_insert_after(tmp, mpnt,
&mapping->i_mmap);
flush_dcache_mmap_unlock(mapping);
-   mutex_unlock(&mapping->i_mmap_mutex);
+   i_mmap_unlock_write(mapping);
}
 
/*
diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
index 28fe26b..c851586 100644
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -182,7 +182,7 @@ __xip_unmap (struct address_space * mapping,
return;
 
 retry:
-   mutex_lock(&mapping->i_mmap_mutex);
+   i_mmap_lock_write(mapping);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
mm = vma->vm_mm;
address = vma->vm_start +
@@ -202,7 +202,7 @@ retry:
page_cache_release(page);
}
}
-   mutex_unlock(&mapping->i_mmap_mutex);
+   i_mmap_unlock_write(mapping);
 
if (locked) {
mutex_unlock(&xip_sparse_mutex);
diff --git a/mm/fremap.c b/mm/fremap.c
index 87da359..fa49f3d 100644
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -215,13 +215,13 @@ get_write_lock:
}
goto out;
}
-   mutex_lock(&mapping->i_mmap_mutex);
+   

[PATCH 0/5] mm: i_mmap_mutex to rwsem

2013-06-24 Thread Davidlohr Bueso
This patchset extends the work started by Ingo Molnar in late 2012,
optimizing the anon-vma mutex lock, converting it from an exclusive mutex
to a rwsem, and sharing the lock for read-only paths when walking the
vma-interval tree. More specifically commits 5a505085 and 4fc3f1d6.

The i_mmap mutex has similar responsibilities with the anon-vma, protecting
file backed pages. Therefore we can use similar locking techniques: convert
the mutex to a rwsem and share the lock when possible.

With these changes, and the rwsem optimizations discussed in
http://lkml.org/lkml/2013/6/16/38 we can see performance improvements.
For instance, on an 8-socket, 80-core DL980, when compared to a vanilla
3.10-rc5, aim7 benefits in throughput, with the following workloads
(beyond 500 users):

- alltests (+14.5%)
- custom (+17%)
- disk (+11%)
- high_systime (+5%)
- shared (+15%)
- short (+4%)

For lower amounts of users, there are no significant differences as all numbers
are within the 0-2% noise range.

Davidlohr Bueso (5):
  mm,fs: introduce helpers around i_mmap_mutex
  mm: use new helper functions around the i_mmap_mutex
  mm: convert i_mmap_mutex to rwsem
  mm/rmap: share the i_mmap_rwsem
  mm: rename leftover i_mmap_mutex

 Documentation/lockstat.txt   |  2 +-
 Documentation/vm/locking |  2 +-
 arch/x86/mm/hugetlbpage.c|  6 +++---
 fs/hugetlbfs/inode.c |  4 ++--
 fs/inode.c   |  2 +-
 include/linux/fs.h   | 22 +-
 include/linux/mmu_notifier.h |  2 +-
 kernel/events/uprobes.c  |  6 +++---
 kernel/fork.c|  4 ++--
 mm/filemap.c | 10 +-
 mm/filemap_xip.c |  4 ++--
 mm/fremap.c  |  4 ++--
 mm/hugetlb.c | 16 
 mm/memory-failure.c  |  7 +++
 mm/memory.c  |  8 
 mm/mmap.c| 22 +++---
 mm/mremap.c  |  6 +++---
 mm/nommu.c   | 14 +++---
 mm/rmap.c| 24 
 19 files changed, 92 insertions(+), 73 deletions(-)

-- 
1.7.11.7



[PATCH 5/5] mm: rename leftover i_mmap_mutex

2013-06-24 Thread Davidlohr Bueso
Update the lock to i_mmap_rwsem throughout the kernel.
All changes are in comments and documentation.

Signed-off-by: Davidlohr Bueso 
---
 Documentation/lockstat.txt   |  2 +-
 Documentation/vm/locking |  2 +-
 arch/x86/mm/hugetlbpage.c|  2 +-
 include/linux/mmu_notifier.h |  2 +-
 kernel/events/uprobes.c  |  2 +-
 mm/filemap.c | 10 +-
 mm/hugetlb.c |  8 
 mm/mmap.c|  6 +++---
 mm/mremap.c  |  2 +-
 mm/rmap.c|  8 
 10 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/Documentation/lockstat.txt b/Documentation/lockstat.txt
index dd2f7b2..96b8233 100644
--- a/Documentation/lockstat.txt
+++ b/Documentation/lockstat.txt
@@ -168,7 +168,7 @@ View the top contending locks:
                  dcache_lock:  1037  1161            0.38     45.32      774.51   6611  243371            0.15    306.48    77387.24
              &inode->i_mutex:   161   286  18446744073709  62882.54  1244614.55   3653   20598  18446744073709  62318.60  1693822.74
              &zone->lru_lock:    94    94            0.53      7.33       92.10   4366   32690            0.29     59.81    16350.06
-  &inode->i_data.i_mmap_mutex:    79    79            0.40      3.77       53.03  11779   87755            0.28    116.93    29898.44
+  &inode->i_data.i_mmap_rwsem:    79    79            0.40      3.77       53.03  11779   87755            0.28    116.93    29898.44
             &q->__queue_lock:    48    50            0.52     31.62       86.31    774   13131            0.17    113.08    12277.52
             &rq->rq_lock_key:    43    47            0.74     68.50      170.63   3706   33929            0.22    107.99    17460.62
           &rq->rq_lock_key#2:    39    46            0.75      6.68       49.03   2979   32292            0.17    125.17    17137.63
diff --git a/Documentation/vm/locking b/Documentation/vm/locking
index f61228b..fb64028 100644
--- a/Documentation/vm/locking
+++ b/Documentation/vm/locking
@@ -66,7 +66,7 @@ in some cases it is not really needed. Eg, vm_start is 
modified by
 expand_stack(), it is hard to come up with a destructive scenario without 
 having the vmlist protection in this case.
 
-The page_table_lock nests with the inode i_mmap_mutex and the kmem cache
+The page_table_lock nests with the inode i_mmap_rwsem and the kmem cache
 c_spinlock spinlocks.  This is okay, since the kmem code asks for pages after
 dropping c_spinlock.  The page_table_lock also nests with pagecache_lock and
 pagemap_lru_lock spinlocks, and no code asks for memory with these locks
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 9c61a1e..df68d13 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -60,7 +60,7 @@ static int vma_shareable(struct vm_area_struct *vma, unsigned 
long addr)
  * and returns the corresponding pte. While this is not necessary for the
  * !shared pmd case because we can allocate the pmd later as well, it makes the
  * code much cleaner. pmd allocation is essential for the shared case because
- * pud has to be populated inside the same i_mmap_mutex section - otherwise
+ * pud has to be populated inside the same i_mmap_rwsem section - otherwise
  * racing tasks could either miss the sharing (see huge_pte_offset) or select a
  * bad pmd for sharing.
  */
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index deca874..f9c11ab 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -151,7 +151,7 @@ struct mmu_notifier_ops {
  * Therefore notifier chains can only be traversed when either
  *
  * 1. mmap_sem is held.
- * 2. One of the reverse map locks is held (i_mmap_mutex or anon_vma->rwsem).
+ * 2. One of the reverse map locks is held (i_mmap_rwsem or anon_vma->rwsem).
  * 3. No other concurrent thread can access the list (release)
  */
 struct mmu_notifier {
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c7b9f45..4ca146e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -700,7 +700,7 @@ build_map_info(struct address_space *mapping, loff_t 
offset, bool is_register)
 
if (!prev && !more) {
/*
-* Needs GFP_NOWAIT to avoid i_mmap_mutex recursion 
through
+* Needs GFP_NOWAIT to avoid i_mmap_rwsem recursion 
through
 * reclaim. This is optimistic, no harm done if it 
fails.
 */
   

[PATCH] ARM: mach-clps711x: common: Use linux/sched_clock.h

2013-06-24 Thread Fabio Estevam
From: Fabio Estevam 

Commit 38ff87f7 (sched_clock: Make ARM's sched_clock generic for all
architectures) changed the header to <linux/sched_clock.h>, so adapt it in
order to fix the following build error:

arch/arm/mach-clps711x/common.c:37:29: fatal error: asm/sched_clock.h: No such 
file or directory

Signed-off-by: Fabio Estevam 
---
 arch/arm/mach-clps711x/common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-clps711x/common.c b/arch/arm/mach-clps711x/common.c
index 4ca2f3c..134641d 100644
--- a/arch/arm/mach-clps711x/common.c
+++ b/arch/arm/mach-clps711x/common.c
@@ -29,12 +29,12 @@
 #include 
 #include 
 #include 
+#include <linux/sched_clock.h>
 
 #include 
 #include 
 #include 
 #include 
-#include <asm/sched_clock.h>
 #include 
 
 #include 
-- 
1.8.1.2



Re: RFC: Allow block drivers to poll for I/O instead of sleeping

2013-06-24 Thread Steven Rostedt
On Mon, Jun 24, 2013 at 09:17:18AM +0200, Jens Axboe wrote:
> On Sun, Jun 23 2013, Linus Torvalds wrote:
> > 
> > You could try to do that either *in* the idle thread (which would take
> > the context switch overhead - maybe negating some of the advantages),
> > or alternatively hook into the scheduler idle logic before actually
> > doing the switch.
> 
> It can't happen in the idle thread. If you need to take the context
> switch, then you've negated pretty much all of the gain of the polled
> approach.

What about hooking into the idle_balance code? That happens if we are
about to go to idle but before the full schedule switch to the idle
task.


In __schedule(void):

if (unlikely(!rq->nr_running))
idle_balance(cpu, rq);
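
A rough userspace sketch of that idea, under the assumption that a bounded
poll loop runs at the point where the run queue is found empty and before
the switch to the idle task. Everything here (rq_sketch, poll_io_sketch,
schedule_sketch, the max_polls budget) is hypothetical and only models the
control flow; none of it is kernel API.

```c
#include <stdbool.h>

/* Hypothetical run queue: only the fields this sketch needs. */
struct rq_sketch {
	unsigned int nr_running;	/* runnable tasks on this CPU */
	bool io_outstanding;		/* a task is waiting on one I/O */
	unsigned int polls_until_done;	/* polls left before it completes */
};

/* Hypothetical poll hook: returns true once the I/O has completed. */
static inline bool poll_io_sketch(struct rq_sketch *rq)
{
	if (!rq->io_outstanding)
		return false;
	if (rq->polls_until_done > 0) {
		rq->polls_until_done--;
		return false;
	}
	rq->io_outstanding = false;
	rq->nr_running++;	/* the waiting task becomes runnable */
	return true;
}

/*
 * Models the spot where idle_balance() is called: the run queue is
 * empty, so poll a bounded number of times before giving up.
 * Returns true if the CPU would go on to switch to the idle task.
 */
static inline bool schedule_sketch(struct rq_sketch *rq, unsigned int max_polls)
{
	if (rq->nr_running == 0) {
		for (unsigned int i = 0; i < max_polls && rq->nr_running == 0; i++)
			poll_io_sketch(rq);
	}
	return rq->nr_running == 0;
}
```

If the completion arrives within the poll budget, the waiting task runs
again without the CPU ever switching to the idle task; otherwise the
sketch falls through to idle as before.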

-- Steve



Re: [PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.

2013-06-24 Thread Dave Airlie
On Tue, Jun 25, 2013 at 9:18 AM, Konrad Rzeszutek Wilk
 wrote:
> Dave Airlie  wrote:
>
>>On Tue, Jun 25, 2013 at 4:34 AM, Konrad Rzeszutek Wilk
>> wrote:
>>> On Mon, Jun 24, 2013 at 08:26:18PM +0200, Daniel Vetter wrote:
>>>> On Mon, Jun 24, 2013 at 7:32 PM, Konrad Rzeszutek Wilk
>>>>  wrote:
>>>> > On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote:
>>>> >> On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk
>>wrote:
>>>> >> > Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4
>>>> >> > ("drm/i915: create compact dma scatter lists for gem objects")
>>makes
>>>> >> > certain assumptions about the under laying DMA API that are not
>>always
>>>> >> > correct.
>>>> >> >
>>>> >> > On a ThinkPad X230 with an Intel HD 4000 with Xen during the
>>bootup
>>>> >> > I see:
>>>> >> >
>>>> >> > [drm:intel_pipe_set_base] *ERROR* pin & fence failed
>>>> >> > [drm:intel_crtc_set_config] *ERROR* failed to set mode on
>>[CRTC:3], err = -28
>>>> >> >
>>>> >> > Bit of debugging traced it down to dma_map_sg failing (in
>>>> >> > i915_gem_gtt_prepare_object) as some of the SG entries were
>>huge (3MB).
>>>> >> >
>>>> >> > That unfortunately are sizes that the SWIOTLB is incapable of
>>handling -
>>>> >> > the maximum it can handle is a an entry of 512KB of virtual
>>contiguous
>>>> >> > memory for its bounce buffer. (See IO_TLB_SEGSIZE).
>>>> >> >
>>>> >> > Previous to the above mention git commit the SG entries were of
>>4KB, and
>>>> >> > the code introduced by above git commit squashed the CPU
>>contiguous PFNs
>>>> >> > in one big virtual address provided to DMA API.
>>>> >> >
>>>> >> > This patch is a simple semi-revert - were we emulate the old
>>behavior
>>>> >> > if we detect that SWIOTLB is online. If it is not online then
>>we continue
>>>> >> > on with the new compact scatter gather mechanism.
>>>> >> >
>>>> >> > An alternative solution would be for the the '.get_pages' and
>>the
>>>> >> > i915_gem_gtt_prepare_object to retry with smaller max gap of
>>the
>>>> >> > amount of PFNs that can be combined together - but with this
>>issue
>>>> >> > discovered during rc7 that might be too risky.
>>>> >> >
>>>> >> > Reported-and-Tested-by: Konrad Rzeszutek Wilk
>>
>>>> >> > CC: Chris Wilson 
>>>> >> > CC: Imre Deak 
>>>> >> > CC: Daniel Vetter 
>>>> >> > CC: David Airlie 
>>>> >> > CC: 
>>>> >> > Signed-off-by: Konrad Rzeszutek Wilk 
>>>> >>
>>>> >> Two things:
>>>> >
>>>> > Hey Daniel,
>>>> >
>>>> >>
>>>> >> - SWIOTLB usage should seriously blow up all over the place in
>>drm/i915.
>>>> >>   We really rely on the everywhere else true fact that the pages
>>and their
>>>> >>   dma mapping point at the same backing storage.
>>>> >
>>>> > It works. As in, it seems to work for just a normal desktop user.
>>I don't
>>>> > see much of dma_sync_* sprinkled around the drm/i915 so I would
>>think that
>>>> > there are some issues would be hit as well - but at the first
>>glance
>>>> > when using it on a laptop it looks OK.
>>>>
>>>> Yeah, we have a pretty serious case of "roll our own coherency
>>stuff".
>>>> The biggest reason is that for a long time i915.ko didn't care one
>>bit
>>>> about iommus, and the thing we care about (flushing cpu caches for
>>>> dma) isn't supported on x86 since x86 every dma is coherent (well,
>>not
>>>> quite, but we don't have support for it). I think longer-term it
>>would
>>>> make sense to move the clfushing we're doing into the dma layer.
>>>>
>>>> >> - How is this solved elsewhere when constructing sg tables? Or
>>are we
>>>> >>   really the only guys who try to construct such big sg entries?
>>I
>>>> >>   expected somewhat that the dma mapping backed would fill in the
>>segment
>>>> >>   limits accordingly, but I haven't found anything really on a
>>quick
>>>> >>   search.
>>>> >
>>>> > The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which
>>will
>>>> > construct the dma mapped pages. That allows it to construct
>>"SWIOTLB-approved"
>>>> > pages that won't need to go through dma_map/dma_unmap as they are
>>>> > already mapped and ready to go.
>>>> >
>>>> > Coming back to your question - I think that i915 is the one that
>>I've
>>>> > encountered.
>>>>
>>>> That's a bit surprising. With dma_buf graphics people will use sg
>>>> tables much more (there's even a nice sg_alloc_table_from_pages
>>helper
>>>> to construct them), and those sg tables tend to have large segments.
>>I
>>>> guess we need some more generic solution here ...
>>>
>>> Yes. I don't grok the full picture yet so I am not sure how to help
>>with
>>> this right now. Is there a roadmap or Wiki on how this was
>>envisioned?
>>>>
>>>> For now I guess we can live with your CONFIG_SWIOTLB hack.
>>>> -Daniel
>>>
>>> OK, I read that as an Ack-ed-by. Should I send the patch to Dave
>>Airlie
>>> in a GIT PULL or some other way to make it on the v3.10-rc7 train?
>>
>>I don't like this at all, I'll accept the patch on the condition you
>>investigate further :-)
>>
>>If you are using swiotlb on i915 things 
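
The alternative mentioned in the quoted patch description (combining
contiguous PFNs only up to a maximum segment size) can be sketched in
plain C as follows. This is an illustration under stated assumptions, not
the i915 code: seg_sketch, build_segments_sketch and SKETCH_PAGE_SIZE are
hypothetical names, pages are assumed to be 4KB, and the cap is passed in
by the caller rather than derived from any real SWIOTLB constant.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4KB pages */

struct seg_sketch {
	unsigned long first_pfn;	/* first page frame in the run */
	unsigned long nr_pages;		/* length of the run in pages */
};

/*
 * Merge physically contiguous PFNs into segments, never letting one
 * segment grow beyond max_seg_bytes. Returns the number of segments
 * written to out, or (size_t)-1 if the cap is below one page or out
 * has too few slots.
 */
static inline size_t build_segments_sketch(const unsigned long *pfns, size_t n,
					   unsigned long max_seg_bytes,
					   struct seg_sketch *out, size_t out_cap)
{
	unsigned long max_pages = max_seg_bytes / SKETCH_PAGE_SIZE;
	size_t nseg = 0;

	if (max_pages == 0)
		return (size_t)-1;

	for (size_t i = 0; i < n; i++) {
		struct seg_sketch *last = nseg ? &out[nseg - 1] : NULL;

		/* Extend the current run if contiguous and still under the cap. */
		if (last && last->first_pfn + last->nr_pages == pfns[i] &&
		    last->nr_pages < max_pages) {
			last->nr_pages++;
			continue;
		}
		if (nseg == out_cap)
			return (size_t)-1;
		out[nseg].first_pfn = pfns[i];
		out[nseg].nr_pages = 1;
		nseg++;
	}
	return nseg;
}
```

With a 512KB cap, a run of 300 contiguous 4KB pages comes out as three
segments (128, 128 and 44 pages); with a cap of one page it degenerates
to the old one-entry-per-page layout.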

Re: cgroup: status-quo and userland efforts

2013-06-24 Thread Tejun Heo
Hello, Tim.

On Sat, Jun 22, 2013 at 04:13:41PM -0700, Tim Hockin wrote:
> I'm very sorry I let this fall off my plate.  I was pointed at a
> systemd-devel message indicating that this is done.  Is it so?  It

It's progressing pretty fast.

> seems so completely ass-backwards to me. Below is one of our use-cases
> that I just don't see how we can reproduce in a single-heierarchy.

Configurations which depend on orthogonal multiple hierarchies of
course won't be replicated under unified hierarchy.  It's unfortunate
but those just have to go.  More on this later.

> We're also long into the model that users can control their own
> sub-cgroups (moderated by permissions decided by admin SW up front).

If you're in control of the base system, nothing prevents you from
doing so.  It's utterly broken from a security and policy-enforcement
point of view, but if you can trust each piece of software running on
your system to do the right thing, it's gonna be fine.

> This gives us 4 combinations:
>   1) { production, DTF }
>   2) { production, non-DTF }
>   3) { batch, DTF }
>   4) { batch non-DTF }
> 
> Of these, (3) is sort of nonsense, but the others are actually used
> and needed.  This is only
> possible because of split hierarchies.  In fact, we undertook a very painful
> process to move from a unified cgroup hierarchy to split hierarchies in large
> part _because of_ these examples.

You can create three sibling cgroups and configure cpuset and blkio
accordingly.  For cpuset, the setup wouldn't make any difference.  For
blkio, the two non-DTFs would now belong to different cgroups and
compete with each other as two groups, which won't matter at all as
non-DTFs are given what's left over after serving DTFs anyway, IIRC.

> Making cgroups composable allows us to build a higher level abstraction that
> is very powerful and flexible.  Moving back to unified hierarchies goes
> against everything that we're doing here, and will cause us REAL pain.

Categorizing processes into hierarchical groups of tasks is a
fundamental idea and a fundamental idea is something to base things on
top of as it's something people can agree upon relatively easily and
establish a structure by.  I'd go as far as saying that it's the
failure on the part of workload design if they in general can't be
categorized hierarchically.

Even at the practical level, the orthogonal hierarchies encouraged, at
the very least, the blkcg writeback support which can't be upstreamed
in any reasonable manner because it is impossible to say that a
resource belongs to a cgroup irrespective of who's looking at it.

It's something fundamentally broken and I have a very difficult time
believing google's workload is so different that it can't be
categorized in a single hierarchy for the purpose of resource
distribution.  I'm sure there are cases where some compromises are
necessary but the alternative is much worse here.  As I wrote multiple
times now, multiple orthogonal hierarchy support is gonna be around
for some time, so I don't think there's any reason for panic; that
said, please at least plan to move on.

Thanks.

-- 
tejun


Re: linux-next: error fetching the cifs tree

2013-06-24 Thread Steve French
Yes, that is fine. I am updating the tree with two minor changes and
will repost within three hours.

On Mon, Jun 24, 2013 at 6:51 PM, Stephen Rothwell  wrote:
> Hi all,
>
> Attempting to fetch the cifs tree
> (git://git.samba.org/sfrench/cifs-2.6.git#for-next) produces this error:
>
> fatal: Couldn't find remote ref refs/heads/for-next
>
> I am using whatever I have previously fetched.
>
> --
> Cheers,
> Stephen Rothwell                    s...@canb.auug.org.au



-- 
Thanks,

Steve


linux-next: error fetching the cifs tree

2013-06-24 Thread Stephen Rothwell
Hi all,

Attempting to fetch the cifs tree
(git://git.samba.org/sfrench/cifs-2.6.git#for-next) produces this error:

fatal: Couldn't find remote ref refs/heads/for-next

I am using whatever I have previously fetched.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: linux-next: build failure after merge of the final tree (staging tree related)

2013-06-24 Thread Greg KH
On Tue, Jun 25, 2013 at 09:40:51AM +1000, Stephen Rothwell wrote:
> Hi Greg,
> 
> On Mon, 24 Jun 2013 15:40:35 -0700 Greg KH  wrote:
> >
> > We are running out of time, my tree is pretty much closed for 3.11 now,
> > should I just disable the build of this module for 3.11?
> 
> That's what I've been doing - it has never been enabled in a final
> linux-next release.  So, it should probably just be disabled properly.

I agree, now disabled.

greg k-h


linux-next: problem trying to fetch the c6x tree

2013-06-24 Thread Stephen Rothwell
Hi Mark,

Attempting to fetch the c6x tree
(git://linux-c6x.org/git/projects/linux-c6x-upstreaming.git#for-linux-next)
for the past two days has just produced hangs.  I am using whatever I
have previously fetched.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[tip:timers/core] clocksource: vf_pit_timer: Use linux/sched_clock.h

2013-06-24 Thread tip-bot for Fabio Estevam
Commit-ID:  2699339361a9bacb3fa663e6b8981a040cfca4ee
Gitweb: http://git.kernel.org/tip/2699339361a9bacb3fa663e6b8981a040cfca4ee
Author: Fabio Estevam 
AuthorDate: Mon, 24 Jun 2013 20:20:08 -0300
Committer:  Thomas Gleixner 
CommitDate: Tue, 25 Jun 2013 01:41:48 +0200

clocksource: vf_pit_timer: Use linux/sched_clock.h

Commit 38ff87f7 (sched_clock: Make ARM's sched_clock generic for all
architectures) changed the header to <linux/sched_clock.h>, so adapt
it in order to fix the following build error:

drivers/clocksource/vf_pit_timer.c:15:29: fatal error: asm/sched_clock.h: No such file or directory

Signed-off-by: Fabio Estevam 
Cc: shawn@linaro.org
Cc: sb...@codeaurora.org
Cc: john.stu...@linaro.org
Link: http://lkml.kernel.org/r/1372116008-2323-1-git-send-email-feste...@gmail.com
Signed-off-by: Thomas Gleixner 
---
 drivers/clocksource/vf_pit_timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clocksource/vf_pit_timer.c b/drivers/clocksource/vf_pit_timer.c
index 598399d..587e020 100644
--- a/drivers/clocksource/vf_pit_timer.c
+++ b/drivers/clocksource/vf_pit_timer.c
@@ -12,7 +12,7 @@
 #include 
 #include 
 #include 
-#include <asm/sched_clock.h>
+#include <linux/sched_clock.h>
 
 /*
  * Each pit takes 0x10 Bytes register space


Re: RFC: named anonymous vmas

2013-06-24 Thread John Stultz
On Mon, Jun 24, 2013 at 10:26 AM, Colin Cross  wrote:
> On Mon, Jun 24, 2013 at 4:48 AM, Christoph Hellwig  wrote:
>> On Sat, Jun 22, 2013 at 12:47:29PM -0700, Alex Elsayed wrote:
>>> Couldn't this be done by having a root-only tmpfs, and having a userspace
>>> component that creates per-app directories with restrictive permissions on
>>> startup/app install? Then each app creates files in its own directory, and
>>> can pass the fds around.
>
> If each app gets its own writable directory that's not really
> different than a world writable tmpfs.  It requires something that
> watches for apps to exit for any reason and cleans up their
> directories, and it requires each app to come up with an unused name
> when it wants to create a file, and the kernel can give you both very
> cleanly.

Though, I believe having a daemon that has exclusive access to tmpfs,
and creates, unlinks and passes the fd to the requesting application
would provide a userspace-only implementation of the second feature
requirement ("without having a world-writable tmpfs that untrusted
apps could fill with files").  However, I'm not sure what the
/proc/<pid>/maps naming would look like for the unlinked file, so it
might not solve the third issue, naming.
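
To make the create-and-unlink step concrete, here is a minimal userspace
sketch of what such a daemon might do before handing the descriptor out.
The helper name create_unlinked_fd() and the template path are purely
illustrative, and the hand-off itself (typically an SCM_RIGHTS message
over a Unix domain socket) is not shown:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict -std= modes */
#endif
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file from a mkstemp() template, unlink
 * it right away, and return the still-open descriptor.  The name is
 * gone from the directory, but the storage lives until the last fd
 * (or mapping) referring to it goes away.  Returns -1 on failure.
 */
static int create_unlinked_fd(char *path_template)
{
    int fd = mkstemp(path_template);   /* fills in the XXXXXX suffix */

    if (fd < 0)
        return -1;

    if (unlink(path_template) != 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

With this shape, only the daemon ever needs write access to the
directory; the application only sees the descriptor it is handed.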

thanks
-john


Re: [PATCH] build some drivers only when compile-testing

2013-06-24 Thread Greg Kroah-Hartman
On Wed, Jun 19, 2013 at 08:50:08AM +0200, Jiri Slaby wrote:
> On 06/18/2013 06:04 PM, Greg Kroah-Hartman wrote:
> >> So currently I have what is attached... Comments?
> > 
> > Looks good to me, want me to queue it up through my char/misc driver
> > tree for 3.11?
> 
> If there are no objections... Whoever picks that up, I would be happy 8-).

I've taken it, without the chipidea portion, through my driver-core
tree.

thanks,

greg k-h


Re: linux-next: build failure after merge of the final tree (staging tree related)

2013-06-24 Thread Stephen Rothwell
Hi Greg,

On Mon, 24 Jun 2013 15:40:35 -0700 Greg KH  wrote:
>
> We are running out of time, my tree is pretty much closed for 3.11 now,
> should I just disable the build of this module for 3.11?

That's what I've been doing - it has never been enabled in a final
linux-next release.  So, it should probably just be disabled properly.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH v5] serial:st-asc: Add ST ASC driver.

2013-06-24 Thread Greg Kroah-Hartman
On Mon, Jun 24, 2013 at 08:21:43AM +0100, Srinivas KANDAGATLA wrote:
> From: Srinivas Kandagatla 
> 
> This patch adds support to ASC (asynchronous serial controller)
> driver, which is basically a standard serial driver. This IP is common
> across all the ST parts for settop box platforms.
> 
> ASC is embedded in ST COMMS IP block. It supports Rx & Tx functionality.
> It supports all industry standard baud rates.
> 
> Signed-off-by: Srinivas Kandagatla 
> CC: Stephen Gallimore 
> CC: Stuart Menefy 
> CC: Arnd Bergmann 
> ---
> Hi Greg,
> 
> This patch is part of the driver support for STi SoCs.
> It has undergone 3-4 cycles of review on the arm-kernel mailing list.
> As Arnd preferred to take only SoC support patches via arm-soc, I am
> sending this patch separately.
> 
> If it's not too late, can you consider this patch for 3.11 via the tty tree?

I would have taken it, but it breaks the build on my machine:

drivers/tty/serial/st-asc.c: In function ‘asc_serial_resume’:
drivers/tty/serial/st-asc.c:774:15: error: ‘struct device’ has no member named ‘pins’
drivers/tty/serial/st-asc.c:775:3: error: implicit declaration of function ‘pinctrl_select_state’ [-Werror=implicit-function-declaration]
drivers/tty/serial/st-asc.c:775:37: error: ‘struct device’ has no member named ‘pins’
drivers/tty/serial/st-asc.c:776:16: error: ‘struct device’ has no member named ‘pins’

Please test your patches out on a "normal" Linux system.

Please feel free to resend this after 3.11-rc1 is out, for inclusion in
3.12, after you have fixed the build problems.

greg k-h


Re: [PATCH v3 4/4] scsi_debug: fix do_device_access() with wrap around range

2013-06-24 Thread Douglas Gilbert

On 13-06-23 02:37 PM, Akinobu Mita wrote:

do_device_access() is a function that abstracts copying SG list from/to
ramdisk storage (fake_storep).

It must deal with ranges exceeding the actual fake_storep size, because
such ranges are valid if virtual_gb is set greater than zero, and they
should be treated as if fake_storep were repeatedly mirrored up to the
virtual size.

Unfortunately, it can't deal with the range which wraps around the end of
fake_storep. A wrap around range is copied by two sg_copy_{from,to}_buffer()
calls, but sg_copy_{from,to}_buffer() can't copy from/to in the middle of
SG list, therefore the second call can't copy correctly.

This fixes it by using sg_pcopy_{from,to}_buffer() that can copy from/to
the middle of SG list.

This also simplifies the assignment of sdb->resid in fill_from_dev_buffer():
because fill_from_dev_buffer() is now called only once per command
execution cycle, it is no longer necessary to take care to decrease
sdb->resid when fill_from_dev_buffer() is called more than once.

Signed-off-by: Akinobu Mita 
Cc: "James E.J. Bottomley" 
Cc: Douglas Gilbert 
Cc: linux-s...@vger.kernel.org
---

* No change from v2


Acked-by: Douglas Gilbert 
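
For readers following along, the wrap-around behaviour described in the
changelog can be modelled in plain userspace C.  This is only a sketch
under assumptions: wrap_read() is a hypothetical name, a flat buffer
stands in for the SG list, and the destination offset `done` plays the
role of the extra offset that the sg_pcopy_*() variants accept.  It is
not the scsi_debug code itself:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: read `len` bytes starting at logical offset `off`
 * from a backing store of `store_size` bytes (must be non-zero) that is
 * treated as repeating beyond its end.
 */
static void wrap_read(const unsigned char *store, size_t store_size,
                      size_t off, unsigned char *dst, size_t len)
{
    size_t start = off % store_size;   /* position inside the real store */
    size_t done = 0;                   /* bytes already placed in dst */

    while (done < len) {
        /* Copy up to the end of the store, or whatever is still needed. */
        size_t chunk = store_size - start;

        if (chunk > len - done)
            chunk = len - done;

        /* Resume in the middle of dst, not at its beginning. */
        memcpy(dst + done, store + start, chunk);

        done += chunk;
        start = 0;                     /* later chunks begin at the start */
    }
}
```

The point of the model is the `dst + done` term: a second copy that
always started at the beginning of the destination would overwrite the
first chunk instead of continuing after it.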



Re: [PATCH v2] [NET]: Unmap fragment page once iterator is done

2013-06-24 Thread David Miller
From: Wedson Almeida Filho 
Date: Mon, 24 Jun 2013 15:47:18 -0700

> The summary line of the original commit is "[NET]: Zerocopy sequential
> reading of skb data".

He's telling you to put this in the commit message, and resubmit
the patch.


Re: [PATCH v2] kernel/itimer.c: beautify code, not need check 'value', so save one instruction, simpler and easier for readers.t

2013-06-24 Thread Thomas Gleixner
On Fri, 21 Jun 2013, Chen Gang wrote:
> >> > Also can let code simpler and easier for readers: if checking parameter
> >> > 'value', it will easily lead readers to think about why not return
> >> > -EINVAL instead of -EFAULT, when checking parameter failed.
> > So you are seriously claiming, that the check for !value makes people
> > think that the return value should be -EINVAL?
> > 
> > That's hilarious.
> > 
> That seems not a quite polite word, is it ?  ;-)

My apologies for being so impolite. Let me rephrase it. Here is a
"sample" changelog for your patch:

  Subject: itimers: Remove bogus NULL pointer check in sys_getitimer()

People might be tricked into assuming that the return value for a
failed NULL pointer check should be -EINVAL instead of -EFAULT.

Remove the misleading NULL pointer check to fix this nuisance.

Aside from that, this patch fixes the problem of NOMMU kernels, where
a NULL pointer dereference is a valid operation. This allows booting
NOMMU kernels without working around the shortcomings of the
getitimer() system call, which have been ignored since this NULL
pointer check was introduced in Linux 0.96a.


Please resubmit.

Thanks,

tglx

