Re: [PATCH 1/2] drivers: mtd: Mark functions as static and remove unused function in lpddr_cmds.c

2013-12-17 Thread Brian Norris
On Fri, Dec 13, 2013 at 12:44:07PM +0530, Rashika Kheria wrote:
> This patch marks the functions do_write_buffer() and do_erase_oneblock()
> as static because they are not used outside this file. It also
> removes the unused function word_program() in lpddr/lpddr_cmds.c.
> 
> Thus, it also removes the following warnings in lpddr/lpddr_cmds.c:
> drivers/mtd/lpddr/lpddr_cmds.c:391:5: warning: no previous prototype for 
> ‘do_write_buffer’ [-Wmissing-prototypes]
> drivers/mtd/lpddr/lpddr_cmds.c:472:5: warning: no previous prototype for 
> ‘do_erase_oneblock’ [-Wmissing-prototypes]
> drivers/mtd/lpddr/lpddr_cmds.c:751:5: warning: no previous prototype for 
> ‘word_program’ [-Wmissing-prototypes]
> 
> Signed-off-by: Rashika Kheria 
> Reviewed-by: Josh Triplett 

Edited the $subjects and pushed both to l2-mtd.git. Thanks!

Brian
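
For reference, the effect of the change can be reproduced outside the
kernel. A minimal sketch, assuming a standalone file built with
gcc -Wmissing-prototypes; the names do_write_demo() and demo_write() are
made up for illustration and are not taken from lpddr_cmds.c:

```c
/* Build with: gcc -Wmissing-prototypes -c demo.c */

/*
 * Hypothetical helper that is only used inside this file. 'static' gives
 * it internal linkage, so -Wmissing-prototypes does not apply to it.
 */
static int do_write_demo(int len)
{
	return len * 2;
}

/*
 * A function that really is used from other files needs a prototype
 * before its definition (normally provided by a header).
 */
int demo_write(int len);

int demo_write(int len)
{
	return do_write_demo(len);
}
```

Dropping the 'static' from do_write_demo() should bring back a "no previous
prototype" warning of the kind quoted in the patch description.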
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG: 3.13.0-rc4] inconsistent lock state

2013-12-17 Thread Knut Petersen

On 17.12.2013 19:59, Eric Dumazet wrote:
> On Tue, 2013-12-17 at 19:13 +0100, Knut Petersen wrote:
>> Hi Linus / everybody!
>>
>> Booting openSuSE 13.1 with kernel 3.13.0-rc4 triggers the attached
>> warning.
>>
>> cu,
>>    Knut
>
> Following patch should solve the issue.
>
> http://patchwork.ozlabs.org/patch/301382/

Indeed, it solves the problem.

cu,
 Knut



Re: [Devel] Race in memcg kmem?

2013-12-17 Thread Vladimir Davydov
On 12/12/2013 05:39 PM, Vladimir Davydov wrote:
> On 12/12/2013 05:21 PM, Michal Hocko wrote:
>> On Wed 11-12-13 10:22:06, Vladimir Davydov wrote:
>>> On 12/11/2013 03:13 AM, Glauber Costa wrote:
>>>> On Tue, Dec 10, 2013 at 5:59 PM, Vladimir Davydov
>> [...]
> -- memcg_update_cache_size(s, num_groups) --
> grows s->memcg_params to accommodate data for num_groups memcg's
> @s is the root cache whose memcg_params we want to grow
> @num_groups is the new number of kmem-active cgroups (defines the new
> size of memcg_params array).
>
> The function:
>
> B1) allocates and assigns a new cache:
> cur_params = s->memcg_params;
> s->memcg_params = kzalloc(size, GFP_KERNEL);
>
> B2) copies per-memcg cache ptrs from the old memcg_params array to the
> new one:
> for (i = 0; i < memcg_limited_groups_array_size; i++) {
> if (!cur_params->memcg_caches[i])
> continue;
> s->memcg_params->memcg_caches[i] =
> cur_params->memcg_caches[i];
> }
>
> B3) frees the old array:
> kfree(cur_params);
>
>
> Since these two functions do not share any mutexes, we can get the
>>>> They do share a mutex, the slab mutex.
>> Worth sticking in a lock_dep_assert?
> AFAIU, lockdep_assert_held() is not applicable here:
> memcg_create_kmem_cache() is called w/o the slab_mutex held, but it
> calls kmem_cache_create_kmemcg(), which takes and releases this mutex,
> working as a barrier. Placing lockdep_assert_held() into the latter
> won't make things any clearer. IMO, we need a big good comment in
> memcg_create_kmem_cache() proving its correctness.

After a bit of thinking on the comment explaining why the race is
impossible I seem to have found another one in these two functions.

Assume two threads schedule kmem_cache creation works for the same
kmem_cache of the same memcg from __memcg_kmem_get_cache(). One of the
works successfully creates it. Another work should fail then, but if it
interleaves with memcg_update_cache_size() as follows, it does not:

memcg_create_kmem_cache()                           memcg_update_cache_size()
(called w/o mutexes held)                           (called with slab_mutex held)
-------------------------                           -------------------------
mutex_lock(&memcg_cache_mutex)
                                                    s->memcg_params=kzalloc(...)
new_cachep=cache_from_memcg_idx(cachep,idx)
// new_cachep==NULL => proceed to creation
                                                    // initialize s->memcg_params;
                                                    // sets s->memcg_params
                                                    //       ->memcg_caches[idx]
new_cachep = kmem_cache_dup(memcg, cachep)
// nothing prevents kmem_cache_dup from
// succeeding so ...
cachep->memcg_params->memcg_caches[idx]=new_cachep
// we've overwritten an existing cache ptr!

slab_mutex won't help here...

Anyway, I'm going to move check and initialization of memcg_caches[idx]
from memcg_create_kmem_cache() to kmem_cache_create_memcg() under the
slab_mutex eliminating every possibility of race there. Will send the
patch soon.

Thanks.
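
The idea behind the planned fix (doing the check and the initialization of
memcg_caches[idx] under one mutex) can be sketched in userspace. This is
only an illustration with pthreads, not the kernel patch; slab_mutex_demo,
caches_demo and install_cache_demo() are hypothetical names:

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_GROUPS_DEMO 8

/* Stand-ins for slab_mutex and the memcg_caches[] array. */
static pthread_mutex_t slab_mutex_demo = PTHREAD_MUTEX_INITIALIZER;
static void *caches_demo[MAX_GROUPS_DEMO];

/*
 * Publish @candidate in slot @idx unless a cache is already there.
 * The check and the store happen under the same lock, so a concurrent
 * creator cannot overwrite an existing pointer. Returns whichever
 * pointer ends up in the slot.
 */
static void *install_cache_demo(size_t idx, void *candidate)
{
	void *winner;

	pthread_mutex_lock(&slab_mutex_demo);
	if (!caches_demo[idx])
		caches_demo[idx] = candidate;
	winner = caches_demo[idx];
	pthread_mutex_unlock(&slab_mutex_demo);

	return winner;
}
```

A caller whose candidate did not win would be expected to destroy its own
copy and use the returned pointer instead.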


Re: [cfg80211 / iwlwifi] setting wireless regulatory domain doesn't work.

2013-12-17 Thread Pontus Fuchs

On 2013-12-17 22:49, Sander Eikelenboom wrote:


> Indeed, I looked for a crda hook for initramfs-tools but didn't find it, so
> skipped that idea for the moment.
>
> So if I combine the two .. it's essentially just a very bad idea to compile the
> wireless stuff in.
> It needs access to a userland program at module load time, or it will block
> forever.


The canonical trick to have cfg80211 built in is to execute crda 
manually in your boot scripts. This will satisfy the initial request and 
resolve the block.


Cheers,

Pontus



tty: Removing the deprecated function tty_vhangup_locked()

2013-12-17 Thread Chuansheng Liu

The function tty_vhangup_locked() is deprecated, so remove its
declaration from tty.h as well.

Signed-off-by: Liu, Chuansheng 
---
 include/linux/tty.h |1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/tty.h b/include/linux/tty.h
index 97d660e..a98c85f 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -422,7 +422,6 @@ extern int is_ignored(int sig);
 extern int tty_signal(int sig, struct tty_struct *tty);
 extern void tty_hangup(struct tty_struct *tty);
 extern void tty_vhangup(struct tty_struct *tty);
-extern void tty_vhangup_locked(struct tty_struct *tty);
 extern void tty_unhangup(struct file *filp);
 extern int tty_hung_up_p(struct file *filp);
 extern void do_SAK(struct tty_struct *tty);
-- 
1.7.9.5





cirrusdrmfb broken with simplefb

2013-12-17 Thread Takashi Iwai
Hi,

with the recent enablement of simplefb on x86, cirrusdrmfb on QEMU/KVM
gets broken now, as reported at:
https://bugzilla.novell.com/show_bug.cgi?id=855821

The cirrus VGA resource is reserved at first as "BOOTFB" in
arch/x86/kernel/sysfb_simplefb.c, which is taken by simplefb platform
device.  This resource is, however, never released until the platform
device is destroyed, and the framebuffer switching doesn't trigger
it.  It calls fb's destroy callback, at most.  Then, cirrus driver
tries to assign the resource, fails and gives up, resulting in a
complete blank screen.

The same problem should exist on other KMS drivers like mgag200 or
ast, not only cirrus.  Intel graphics doesn't hit this problem just
because the reserved iomem by BOOTFB isn't required by i915 driver.

The patch below is a quick attempt to solve the issue.  It adds a new
API function for releasing the resources of a platform_device, and calls
it in the destroy op of simplefb.  But forcibly releasing the resources
of a parent device doesn't sound like a correct design.  We may take it
as a band-aid, but we definitely need a more fundamental fix.

Any thoughts?


thanks,

Takashi

---
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 3a94b79..f939236 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -267,6 +267,23 @@ int platform_device_add_data(struct platform_device *pdev, 
const void *data,
 }
 EXPORT_SYMBOL_GPL(platform_device_add_data);
 
+static void do_release_resources(struct platform_device *pdev, int nums)
+{
+   int i;
+
+   for (i = 0; i < nums; i++) {
+   struct resource *r = &pdev->resource[i];
+   unsigned long type = resource_type(r);
+
+   if (type == IORESOURCE_MEM || type == IORESOURCE_IO)
+   release_resource(r);
+   }
+
+   kfree(pdev->resource);
+   pdev->resource = NULL;
+   pdev->num_resources = 0;
+}
+
 /**
  * platform_device_add - add a platform device to device hierarchy
  * @pdev: platform device we're adding
@@ -342,13 +359,7 @@ int platform_device_add(struct platform_device *pdev)
pdev->id = PLATFORM_DEVID_AUTO;
}
 
-   while (--i >= 0) {
-   struct resource *r = &pdev->resource[i];
-   unsigned long type = resource_type(r);
-
-   if (type == IORESOURCE_MEM || type == IORESOURCE_IO)
-   release_resource(r);
-   }
+   do_release_resources(pdev, i - 1);
 
  err_out:
return ret;
@@ -365,8 +376,6 @@ EXPORT_SYMBOL_GPL(platform_device_add);
  */
 void platform_device_del(struct platform_device *pdev)
 {
-   int i;
-
if (pdev) {
device_del(&pdev->dev);
 
@@ -375,17 +384,17 @@ void platform_device_del(struct platform_device *pdev)
pdev->id = PLATFORM_DEVID_AUTO;
}
 
-   for (i = 0; i < pdev->num_resources; i++) {
-   struct resource *r = &pdev->resource[i];
-   unsigned long type = resource_type(r);
-
-   if (type == IORESOURCE_MEM || type == IORESOURCE_IO)
-   release_resource(r);
-   }
+   do_release_resources(pdev, pdev->num_resources);
}
 }
 EXPORT_SYMBOL_GPL(platform_device_del);
 
+void platform_device_release_resources(struct platform_device *pdev)
+{
+   do_release_resources(pdev, pdev->num_resources);
+}
+EXPORT_SYMBOL_GPL(platform_device_release_resources);
+
 /**
  * platform_device_register - add a platform-level device
  * @pdev: platform device we're adding
diff --git a/drivers/video/simplefb.c b/drivers/video/simplefb.c
index 210f3a0..fbf5e89 100644
--- a/drivers/video/simplefb.c
+++ b/drivers/video/simplefb.c
@@ -70,6 +70,7 @@ static void simplefb_destroy(struct fb_info *info)
 {
if (info->screen_base)
iounmap(info->screen_base);
+   platform_device_release_resources(to_platform_device(info->device));
 }
 
 static struct fb_ops simplefb_ops = {
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 16f6654..7cc1f54 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -42,6 +42,7 @@ struct platform_device {
 
 extern int platform_device_register(struct platform_device *);
 extern void platform_device_unregister(struct platform_device *);
+extern void platform_device_release_resources(struct platform_device *pdev);
 
 extern struct bus_type platform_bus_type;
 extern struct device platform_bus;
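
To illustrate the failure mode described in this mail, here is a toy model
of an exclusive region table. It is a sketch under stated assumptions: the
addresses are made up, and request_region_demo() / release_region_demo()
are hypothetical helpers, not the kernel's resource API.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS_DEMO 4

struct region_demo {
	unsigned long start;
	unsigned long end;	/* inclusive */
	bool busy;
};

static struct region_demo table_demo[NR_SLOTS_DEMO];

/*
 * Claim [start, end]. Returns 0 on success, -1 if the range overlaps a
 * region that is still claimed or if the table is full.
 */
static int request_region_demo(unsigned long start, unsigned long end)
{
	size_t i, free_slot = NR_SLOTS_DEMO;

	for (i = 0; i < NR_SLOTS_DEMO; i++) {
		if (!table_demo[i].busy) {
			if (free_slot == NR_SLOTS_DEMO)
				free_slot = i;
			continue;
		}
		if (start <= table_demo[i].end && table_demo[i].start <= end)
			return -1;
	}
	if (free_slot == NR_SLOTS_DEMO)
		return -1;

	table_demo[free_slot].start = start;
	table_demo[free_slot].end = end;
	table_demo[free_slot].busy = true;
	return 0;
}

/* Give back a region that was claimed with exactly these bounds. */
static void release_region_demo(unsigned long start, unsigned long end)
{
	size_t i;

	for (i = 0; i < NR_SLOTS_DEMO; i++) {
		if (table_demo[i].busy && table_demo[i].start == start &&
		    table_demo[i].end == end)
			table_demo[i].busy = false;
	}
}
```

In this model the second claimant (standing in for cirrus) keeps failing
until the first one (standing in for BOOTFB) releases its range, which is
the step that the framebuffer switch currently never triggers.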


linux-next: Tree for Dec 18

2013-12-17 Thread Stephen Rothwell
Hi all,

Changes since 20131217:

Linus' tree lost its build failure.

The powerpc tree still had its build failure for which I applied a
supplied patch.

The pm tree gained a build failure so I used the version from
next-20131217.

The net-next tree gained a conflict against Linus' tree and a build
failure so I used the version from next-20131217.

The drm-intel tree gained a conflict against Linus' tree.

The mmc tree still had its build failure so I used the version from
next-20131212.

The block tree gained a conflict against the f2fs tree.

The usb tree gained a conflict against the usb.current tree.

The usb-gadget tree gained a conflict against the usb.current tree, but
still had its build failure so I used the version from next-20131206.

The akpm-current tree still had its build failures for which I applied
patches.

Non-merge commits (relative to Linus' tree): -l4371
 4666 files changed, 211107 insertions(+), 116068 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 209 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell 

$ git checkout master
$ git reset --hard stable
Merging origin/master (b0031f227e47 Merge tag 's2mps11-build' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator)
Merging fixes/master (b0031f227e47 Merge tag 's2mps11-build' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" 
not depend on vmlinux)
Merging arc-current/for-curr (319e2e3f63c3 Linux 3.13-rc4)
Merging arm-current/fixes (b713aa0b1501 ARM: fix asm/memory.h build error)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated 
IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (803c2d2f84da powerpc/powernv: Fix OPAL LPC access 
in Little Endian)
Merging sparc/master (1de425c7b271 sparc64: Fix build regression)
Merging net/master (a7c12639bdf5 Merge branch 'qlcnic')
Merging ipsec/master (239c78db9c41 net: clear local_df when passing skb between 
namespaces)
Merging sound-current/for-linus (ed697e1aaf72 ALSA: Add SNDRV_PCM_STATE_PAUSED 
case in wait_for_avail function)
Merging pci-current/for-linus (f0b75693cbb2 MAINTAINERS: Add DesignWare, i.MX6, 
Armada, R-Car PCI host maintainers)
Merging wireless/master (73f0b56a1ff6 ath9k: Fix interrupt handling for the 
AR9002 family)
Merging driver-core.current/driver-core-linus (a8b14744429f sysfs: give 
different locking key to regular and bin files)
Merging tty.current/tty-linus (1075a6e2dc7e n_tty: Fix apparent order of echoed 
output)
Merging usb.current/usb-linus (fb5f1834c322 usb: ohci-at91: fix irq and iomem 
resource retrieval)
Merging staging.current/staging-linus (c6236c0ce39c staging: comedi: drivers: 
fix return value of comedi_load_firmware())
Merging char-misc.current/char-misc-linus (319e2e3f63c3 Linux 3.13-rc4)
Merging input-current/for-linus (2a4d81547b88 Input: define KEY_WWAN for 
Wireless WAN)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" 
stripe)
Merging crypto-current/master (389a5390583a crypto: scatterwalk - Use 
sg_chain_ptr on chain entries)
Merging ide/master (c2

Re: [PATCH v2] drm/bochs: new driver

2013-12-17 Thread Gerd Hoffmann
On Mi, 2013-12-18 at 11:52 +1000, Dave Airlie wrote:
> On Wed, Dec 18, 2013 at 3:04 AM, Gerd Hoffmann  wrote:
> > DRM driver for (virtual) vga cards using the bochs dispi
> > interface, such as the qemu standard vga (qemu -vga std).
> >
> > Don't bother supporting anything but 32bpp for now, even
> > though the virtual hardware is able to do that.
> 
> Hi Gerd,
> 
> just took a quick look over this and it seems in pretty good shape,
> the one worry I have is if you've tested with vesafb loaded, since you
> do some pci_request_regions and fail hard, if vesafb has some of the
> resources this can end up failing the driver load for no good reason.
> I haven't verified there is a problem here its just something we've
> had in the past.

Tested, works.  bochs_pci_probe handles it before calling
drm_get_pci_dev, similar to cirrus.

cheers,
  Gerd




[PATCH] a multiplication overflow in drivers/pps/pps.c

2013-12-17 Thread xqx12
there is an overflow in the following code :

ticks = fdata.timeout.sec * HZ;

ticks is a signed 64-bit value, but the result of fdata.timeout.sec *
HZ is computed as a 32-bit value first. So ticks will hold a wrong value
once the multiplication overflows.

Reported-by: Qixue Xiao 
Suggested-by: Yongjian Xu 
Suggested-by: Yu Chen 
Signed-off-by: Qixue Xiao 
---
 drivers/pps/pps.c |2 +-
 gentags.sh|4 +++
 memory_leak.txt   |   88 +
 3 files changed, 93 insertions(+), 1 deletion(-)
 create mode 100755 gentags.sh
 create mode 100644 memory_leak.txt

diff --git a/drivers/pps/pps.c b/drivers/pps/pps.c
index 2f07cd6..44ddd22 100644
--- a/drivers/pps/pps.c
+++ b/drivers/pps/pps.c
@@ -164,7 +164,7 @@ static long pps_cdev_ioctl(struct file *file,
dev_dbg(pps->dev, "timeout %lld.%09d\n",
(long long) fdata.timeout.sec,
fdata.timeout.nsec);
-   ticks = fdata.timeout.sec * HZ;
+   ticks = (s64)(fdata.timeout.sec) * HZ;
ticks += fdata.timeout.nsec / (NSEC_PER_SEC / HZ);
 
if (ticks != 0) {
-- 
1.7.9.5
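
The arithmetic behind the cast can be shown in plain C. A minimal sketch,
assuming a 32-bit seconds value and a hypothetical HZ of 1000 (the real
field types in pps.c may differ); ticks_narrow_demo() and ticks_wide_demo()
are invented for illustration:

```c
#include <stdint.h>

/*
 * Multiply in 32 bits and widen afterwards: the upper bits are lost
 * before the result reaches the 64-bit variable. Unsigned arithmetic
 * is used here so that the wraparound is well defined in standard C.
 */
static int64_t ticks_narrow_demo(int32_t sec, int32_t hz)
{
	uint32_t wrapped = (uint32_t)sec * (uint32_t)hz;

	return (int64_t)wrapped;
}

/* Widen one operand first, so the multiplication itself is 64-bit. */
static int64_t ticks_wide_demo(int32_t sec, int32_t hz)
{
	return (int64_t)sec * hz;
}
```

With sec = 5000000 and hz = 1000 the wide version returns 5000000000,
while the narrow one wraps modulo 2^32.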



Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2

2013-12-17 Thread Fengguang Wu
Hi Mel,

I'd like to share some test numbers with your patches applied on top of 
v3.13-rc3.

Basically there are

1) no big performance changes

  76628486   -0.7%   76107841   TOTAL vm-scalability.throughput
407038   +1.2% 412032   TOTAL hackbench.throughput
 50307   -1.5%  49549   TOTAL ebizzy.throughput

2) huge proc-vmstat.nr_tlb_* increases

  99986527     +3e+14%  2.988e+20   TOTAL proc-vmstat.nr_tlb_local_flush_one
 3.812e+08   +2.2e+13%  8.393e+19   TOTAL proc-vmstat.nr_tlb_remote_flush_received
 3.301e+08   +2.2e+13%  7.241e+19   TOTAL proc-vmstat.nr_tlb_remote_flush
   5990864   +1.2e+15%  7.032e+19   TOTAL proc-vmstat.nr_tlb_local_flush_all

Here are the detailed numbers. eabb1f89905a0c809d13 is the HEAD commit
with 4 patches applied. The "~ N%" notations are the stddev percent.
The "[+-] N%" notations are the increase/decrease percent. The
brickland2, lkp-snb01, lkp-ib03 etc. are testbox names.

      v3.13-rc3       eabb1f89905a0c809d13
---------------  -------------------------
   3345155 ~ 0%      -0.3%    3335172 ~ 0%  brickland2/micro/vm-scalability/16G-shm-pread-rand-mt
  33249939 ~ 0%      +3.3%   34336155 ~ 1%  brickland2/micro/vm-scalability/1T-shm-pread-seq
   4669392 ~ 0%      -0.2%    4660378 ~ 0%  brickland2/micro/vm-scalability/300s-anon-r-rand
  18822426 ~ 5%     -10.2%       1691 ~ 0%  brickland2/micro/vm-scalability/300s-anon-r-seq-mt
   4993937 ~ 1%      +4.6%    5221846 ~ 2%  brickland2/micro/vm-scalability/300s-anon-rx-rand-mt
   4010960 ~ 0%      +0.4%    4025880 ~ 0%  brickland2/micro/vm-scalability/300s-anon-rx-seq-mt
   7536676 ~ 0%      +1.1%    7617297 ~ 0%  brickland2/micro/vm-scalability/300s-lru-file-readtwice
  76628486           -0.7%   76107841       TOTAL vm-scalability.throughput

      v3.13-rc3       eabb1f89905a0c809d13
---------------  -------------------------
     88901 ~ 2%      -3.1%      86131 ~ 0%  brickland2/micro/hackbench/600%-process-pipe
    153250 ~ 2%      +3.1%     157931 ~ 1%  brickland2/micro/hackbench/600%-process-socket
    164886 ~ 1%      +1.9%     167969 ~ 0%  lkp-snb01/micro/hackbench/1600%-threads-pipe
    407038           +1.2%     412032       TOTAL hackbench.throughput

      v3.13-rc3       eabb1f89905a0c809d13
---------------  -------------------------
     50307 ~ 1%      -1.5%      49549 ~ 0%  lkp-ib03/micro/ebizzy/400%-5-30
     50307           -1.5%      49549       TOTAL ebizzy.throughput

      v3.13-rc3       eabb1f89905a0c809d13
---------------  -------------------------
    270328 ~ 0%    -100.0%          0 ~ 0%  avoton1/crypto/tcrypt/2s-505-509
    512691 ~ 0%  +4.7e+14%  2.412e+18 ~51%  brickland1/micro/will-it-scale/futex1
    510718 ~ 1%  +2.8e+14%  1.408e+18 ~83%  brickland1/micro/will-it-scale/futex2
    514847 ~ 0%  +1.5e+14%   7.66e+17 ~44%  brickland1/micro/will-it-scale/getppid1
    512854 ~ 0%  +1.4e+14%  7.159e+17 ~34%  brickland1/micro/will-it-scale/lock1
    516614 ~ 0%  +8.1e+13%  4.189e+17 ~82%  brickland1/micro/will-it-scale/lseek1
    514457 ~ 1%  +2.2e+14%   1.12e+18 ~71%  brickland1/micro/will-it-scale/lseek2
    533138 ~ 0%  +4.8e+14%  2.561e+18 ~33%  brickland1/micro/will-it-scale/malloc2
    518503 ~ 0%  +2.7e+14%  1.414e+18 ~74%  brickland1/micro/will-it-scale/open1
    512378 ~ 0%  +2.4e+14%  1.232e+18 ~56%  brickland1/micro/will-it-scale/open2
    515078 ~ 0%  +1.8e+14%  9.444e+17 ~23%  brickland1/micro/will-it-scale/page_fault1
    511034 ~ 0%  +1.1e+14%  5.572e+17 ~43%  brickland1/micro/will-it-scale/page_fault2
    516217 ~ 0%  +2.8e+14%  1.457e+18 ~57%  brickland1/micro/will-it-scale/page_fault3
    513735 ~ 0%  +4.5e+13%   2.32e+17 ~75%  brickland1/micro/will-it-scale/pipe1
    513640 ~ 1%  +7.3e+14%  3.766e+18 ~31%  brickland1/micro/will-it-scale/poll1
    515473 ~ 0%  +6.1e+14%  3.138e+18 ~24%  brickland1/micro/will-it-scale/poll2
    517039 ~ 0%    +2e+14%  1.032e+18 ~48%  brickland1/micro/will-it-scale/posix_semaphore1
    513686 ~ 0%    +2e+14%  1.045e+18 ~107%  brickland1/micro/will-it-scale/pread1
    517218 ~ 1%  +1.7e+14%  8.752e+17 ~57%  brickland1/micro/will-it-scale/pread2
    514904 ~ 0%  +1.2e+14%  6.399e+17 ~46%  brickland1/micro/will-it-scale/pthread_mutex1
    512881 ~ 0%  +2.6e+14%  1.314e+18 ~47%  brickland1/micro/will-it-scale/pthread_mutex2
    512844 ~ 0%  +3.1e+14%   1.57e+18 ~91%  brickland1/micro/will-it-scale/pwrite1
    516859 ~ 0%  +2.9e+14%  1.512e+18 ~37%  brickland1/micro/will-it-scale/pwrite2
    513227 ~ 0%  +6.9e+13%  3.518e+17 ~90%  brickland1/micro/will-it-scale/read1
    518291 ~ 0%  +3.6e+14%  1.875e+18 ~18%  brickland1/micro/will-it-scale/read2
    517795 ~ 0%  +4.5e+14%  2.306e+18 ~53%  brickland1/micro/will-it-scale/readseek
    521558 ~ 0%  +4.3e+14%  2.252e+18 ~41%  brickland1/micro/will-it-scale/sched_yield
    518017 ~ 1%  +1.5e+14%   7.85e+17 ~42%  brickland1/micro/will-it-scale/unlink2
514742 ~ 0%+4e+14% 

Re: possible regression on 3.13 when calling flush_dcache_page

2013-12-17 Thread Joonsoo Kim
On Mon, Dec 16, 2013 at 03:43:43PM +0100, Ludovic Desroches wrote:
> Hello,
> 
> On Fri, Dec 13, 2013 at 10:59:09AM +0900, Joonsoo Kim wrote:
> > On Thu, Dec 12, 2013 at 03:36:19PM +0100, Ludovic Desroches wrote:
> > > fix mmc mailing list address error
> > > 
> > > On Thu, Dec 12, 2013 at 03:31:50PM +0100, Ludovic Desroches wrote:
> > > > Hi,
> > > > 
> > > > With v3.13-rc3 I have an error when the atmel-mci driver calls
> > > > flush_dcache_page (log at the end of the message).
> > > > 
> > > > Since I didn't have it before, I did a git bisect and the commit 
> > > > introducing
> > > > the error is the following one:
> > > > 
> > > > 106a74e slab: replace free and inuse in struct slab with newly 
> > > > introduced active
> > > > 
> > > > I don't know if this commit has introduced a bug or if it has revealed 
> > > > a bug
> > > > in the atmel-mci driver.
> > 
> > Hello,
> > 
> > I think that this commit may not introduce a bug. This patch remove one
> > variable on slab management structure and replace variable name. So there
> > is no functional change.
> > 
> 
> I have reverted this patch and the other ones you did on top of it, and
> the issue disappears.

Hello,

Could you give me your '/proc/slabinfo' before/after this commit (106a74e)?

And how about testing with an artificially increased struct slab size on
top of this commit (106a74e)?

I really wonder why the problem happens, because this commit doesn't cause
any functional change as far as I know. The only side effect of this patch
is a smaller struct slab.

Thanks.

diff --git a/mm/slab.c b/mm/slab.c
index 2ec2336..d2240fd 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -174,6 +174,7 @@ struct slab {
struct {
struct list_head list;
void *s_mem;/* including colour offset */
+   unsigned int x;
unsigned int active;/* num of objs active in slab */
};
 };

> 
> > I doubt that side-effect of this patch reveals a bug in other place.
> > Side-effect is reduced memory usage for slab management structure. It would
> > makes some slabs have more objects with more density since slab management
> > structure is sometimes on the page for objects. So if it diminishes, more
> > objects can be in the page.
> > 
> > Anyway, I will look at it more. If you have any progress, please let me 
> > know.
> 
> No progress at the moment.
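
The density argument quoted above can be made concrete with a
back-of-the-envelope model. This is a sketch with made-up sizes that
ignores alignment, colouring and the per-object index array of the real
allocator; objs_per_slab_demo() is a hypothetical helper, not a function
from mm/slab.c:

```c
#include <stddef.h>

/*
 * Number of objects that fit when the management structure lives on the
 * same page(s) as the objects. All sizes are in bytes.
 */
static size_t objs_per_slab_demo(size_t slab_bytes, size_t mgmt_bytes,
				 size_t obj_bytes)
{
	if (obj_bytes == 0 || mgmt_bytes >= slab_bytes)
		return 0;

	return (slab_bytes - mgmt_bytes) / obj_bytes;
}
```

With a 4096-byte slab and 1008-byte objects, a 72-byte management
structure leaves room for 3 objects, while a 64-byte one leaves room for
4. A driver bug that only shows up for a particular object layout could be
exposed by that kind of shift.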


Re: [PATCH] iscsi: conn error (1020) each time iscsi session logout

2013-12-17 Thread Mike Christie
On 12/18/2013 12:47 AM, Vaughan Cao wrote:
> We do a normal login/logout process to iscsi server. iscsiadm report success,
> but we always see the following error just before conn shutdown in dmesg.
> 
> Oct 15 05:30:09 vmhodtest019 iscsid: Connection1:0 to [target:
> iqn.1986-03.com.sun:02:7b863a18-045a-cb04-c686-841f17df2f9c, portal:
> 10.182.32.162,3260] through [iface: default] is operational now
> Oct 15 05:30:42 vmhodtest019 kernel:  connection1:0: detected conn error
> (1020)
> Oct 15 05:30:42 vmhodtest019 iscsid: Connection1:0 to [target:
> iqn.1986-03.com.sun:02:7b863a18-045a-cb04-c686-841f17df2f9c, portal:
> 10.182.32.162,3260] through [iface: default] is shutdown.
> 
> It's because iscsi_tcp module evaluates socket state in data_ready() callback,
>  and that detect the socket close. However, this socket close on target peer 
> is in response to the logout request from initiator. So this is not an error 
> that should be reported out. I quiesce it by checking session state and err 
> value accordingly.
> 
> Signed-off-by: Vaughan Cao 
> ---
>  drivers/scsi/libiscsi.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 415f2c0..84171ef 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1360,6 +1360,12 @@ void iscsi_conn_failure(struct iscsi_conn *conn, enum 
> iscsi_err err)
>   spin_unlock_bh(&session->lock);
>   return;
>   }
> + /* Target closed the connection in response to logout */
> + if (session->state == ISCSI_STATE_LOGGING_OUT &&
> + err == ISCSI_ERR_TCP_CONN_CLOSE) {
> + spin_unlock_bh(&session->lock);
> + return;
> + }
>  
>   if (conn->stop_stage == 0)
>   session->state = ISCSI_STATE_FAILED;
> 


Someone just sent a patch for this.

commit c712495e687e221b00bddae96247dbf6ffbc6200
Author: Chris Leech 
Date:   Thu Sep 26 09:09:44 2013 -0700

[SCSI] iscsi_tcp: consider session state in iscsi_sw_sk_state_check
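
For readers following the thread, the condition that the quoted patch adds
boils down to a small predicate. A sketch with hypothetical enum and
function names (the real constants are ISCSI_STATE_LOGGING_OUT and
ISCSI_ERR_TCP_CONN_CLOSE in the quoted diff); it is not the libiscsi code:

```c
#include <stdbool.h>

/* Hypothetical, reduced versions of the session states and errors. */
enum session_state_demo {
	SESSION_LOGGED_IN_DEMO,
	SESSION_LOGGING_OUT_DEMO,
	SESSION_FAILED_DEMO,
};

enum conn_err_demo {
	CONN_ERR_TCP_CLOSE_DEMO,
	CONN_ERR_OTHER_DEMO,
};

/*
 * A TCP close that arrives while the initiator is logging out is the
 * expected answer from the target, so it is not reported as a failure.
 * Every other combination is still reported.
 */
static bool should_report_conn_error_demo(enum session_state_demo state,
					  enum conn_err_demo err)
{
	if (state == SESSION_LOGGING_OUT_DEMO &&
	    err == CONN_ERR_TCP_CLOSE_DEMO)
		return false;

	return true;
}
```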




Re: [PATCH v4 0/9] Reuse davinci-nand driver for Keystone arch

2013-12-17 Thread Brian Norris
On Tue, Dec 17, 2013 at 02:59:06PM +0200, Ivan Khoronzhuk wrote:
> This series contains fixes and updates of Davinci nand driver in
> order to reuse it for Keystone platform.
> 
> v3..v4:
> - mtd: nand: davinci: fix driver registration
>   dropped __init/__exit/__exit_p as module_platform_driver() is used
> - mtd: nand: davinci: adjust DT properties to MTD generic
>   used of_get_nand_bus_width() helper
> - mtd: nand: davinci: don't request AEMIF address range
>   added comment at code change

Pushed the series to l2-mtd.git. Thanks!

Brian


[PATCH v3 03/14] mm, hugetlb: protect region tracking via newly introduced resv_map lock

2013-12-17 Thread Joonsoo Kim
There is a race condition if we map the same file in different processes.
Region tracking is protected by mmap_sem and hugetlb_instantiation_mutex.
When we do mmap, we don't grab hugetlb_instantiation_mutex, but only
mmap_sem. This doesn't prevent another process from modifying the region
structure, so it can be modified by two processes concurrently.

To solve this, I introduce a lock in resv_map and make the region
manipulation functions grab the lock before they do the actual work. This
makes region tracking safe.

Signed-off-by: Joonsoo Kim 

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 317b0a6..ee304d1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -26,6 +26,7 @@ struct hugepage_subpool {
 
 struct resv_map {
struct kref refs;
+   spinlock_t lock;
struct list_head regions;
 };
 extern struct resv_map *resv_map_alloc(void);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3e7a44b..cf0eaff 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -135,15 +135,8 @@ static inline struct hugepage_subpool *subpool_vma(struct 
vm_area_struct *vma)
  * Region tracking -- allows tracking of reservations and instantiated pages
  *across the pages in a mapping.
  *
- * The region data structures are protected by a combination of the mmap_sem
- * and the hugetlb_instantiation_mutex.  To access or modify a region the 
caller
- * must either hold the mmap_sem for write, or the mmap_sem for read and
- * the hugetlb_instantiation_mutex:
- *
- * down_write(&mm->mmap_sem);
- * or
- * down_read(&mm->mmap_sem);
- * mutex_lock(&hugetlb_instantiation_mutex);
+ * The region data structures are embedded into a resv_map and
+ * protected by a resv_map's lock
  */
 struct file_region {
struct list_head link;
@@ -156,6 +149,7 @@ static long region_add(struct resv_map *resv, long f, long 
t)
struct list_head *head = &resv->regions;
struct file_region *rg, *nrg, *trg;
 
+   spin_lock(&resv->lock);
/* Locate the region we are either in or before. */
list_for_each_entry(rg, head, link)
if (f <= rg->to)
@@ -185,15 +179,18 @@ static long region_add(struct resv_map *resv, long f, 
long t)
}
nrg->from = f;
nrg->to = t;
+   spin_unlock(&resv->lock);
return 0;
 }
 
 static long region_chg(struct resv_map *resv, long f, long t)
 {
struct list_head *head = &resv->regions;
-   struct file_region *rg, *nrg;
+   struct file_region *rg, *nrg = NULL;
long chg = 0;
 
+retry:
+   spin_lock(&resv->lock);
/* Locate the region we are before or in. */
list_for_each_entry(rg, head, link)
if (f <= rg->to)
@@ -203,15 +200,23 @@ static long region_chg(struct resv_map *resv, long f, 
long t)
 * Subtle, allocate a new region at the position but make it zero
 * size such that we can guarantee to record the reservation. */
if (&rg->link == head || t < rg->from) {
-   nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
-   if (!nrg)
-   return -ENOMEM;
+   if (!nrg) {
+   spin_unlock(&resv->lock);
+   nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
+   if (!nrg)
+   return -ENOMEM;
+
+   goto retry;
+   }
+
nrg->from = f;
nrg->to   = f;
INIT_LIST_HEAD(&nrg->link);
list_add(&nrg->link, rg->link.prev);
+   nrg = NULL;
 
-   return t - f;
+   chg = t - f;
+   goto out_locked;
}
 
/* Round our left edge to the current segment if it encloses us. */
@@ -224,7 +229,7 @@ static long region_chg(struct resv_map *resv, long f, long t)
if (&rg->link == head)
break;
if (rg->from > t)
-   return chg;
+   goto out_locked;
 
/* We overlap with this area, if it extends further than
 * us then we must extend ourselves.  Account for its
@@ -235,6 +240,10 @@ static long region_chg(struct resv_map *resv, long f, long 
t)
}
chg -= rg->to - rg->from;
}
+
+out_locked:
+   spin_unlock(&resv->lock);
+   kfree(nrg);
return chg;
 }
 
@@ -244,12 +253,13 @@ static long region_truncate(struct resv_map *resv, long 
end)
struct file_region *rg, *trg;
long chg = 0;
 
+   spin_lock(&resv->lock);
/* Locate the region we are either in or before. */
list_for_each_entry(rg, head, link)
if (end <= rg->to)
break;
if (&rg->link == head)
-   return 0;
+   goto out;
 
/* If we are in the middle of a region then adjust it. */
if (end > rg->from) {
@@ -266,6 +276,9 @@ static long region_truncate(struct resv_map *resv, long end)
list_del(&rg->link);
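
The region_chg() hunk above takes the new spinlock, and because a GFP_KERNEL allocation may sleep, it drops the lock before calling kmalloc() and then jumps back to retry so the list is searched again. A rough userspace sketch of the same shape follows. It is only an illustration: struct map, struct node and map_insert() are hypothetical names, a pthread mutex stands in for the spinlock, and malloc() stands in for kmalloc().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for file_region / resv_map. */
struct node {
	long from, to;
	struct node *next;
};

struct map {
	pthread_mutex_t lock;
	struct node *head;	/* singly linked, sorted by 'from' */
};

/*
 * Insert [from, to) unless a node starting at 'from' already exists.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int map_insert(struct map *m, long from, long to)
{
	struct node *spare = NULL;
	struct node **pp;

retry:
	pthread_mutex_lock(&m->lock);
	for (pp = &m->head; *pp && (*pp)->from < from; pp = &(*pp)->next)
		;
	if (!*pp || (*pp)->from != from) {
		if (!spare) {
			/* Do not allocate with the lock held. */
			pthread_mutex_unlock(&m->lock);
			spare = malloc(sizeof(*spare));
			if (!spare)
				return -1;
			/* The list may have changed meanwhile: search again. */
			goto retry;
		}
		spare->from = from;
		spare->to = to;
		spare->next = *pp;
		*pp = spare;
		spare = NULL;	/* ownership moved to the list */
	}
	pthread_mutex_unlock(&m->lock);
	free(spare);	/* non-NULL only if someone else inserted first */
	return 0;
}
```

As in the patch, the preallocated object is released on the common exit path when it turned out not to be needed.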
 

[PATCH v3 05/14] mm, hugetlb: make vma_resv_map() works for all mapping type

2013-12-17 Thread Joonsoo Kim
Until now, we get a resv_map in two different ways depending on the mapping
type. This makes the code dirty and unreadable, so unify it.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ef70b6f..f394454 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -417,13 +417,24 @@ void resv_map_release(struct kref *ref)
kfree(resv_map);
 }
 
+static inline struct resv_map *inode_resv_map(struct inode *inode)
+{
+   return inode->i_mapping->private_data;
+}
+
 static struct resv_map *vma_resv_map(struct vm_area_struct *vma)
 {
VM_BUG_ON(!is_vm_hugetlb_page(vma));
-   if (!(vma->vm_flags & VM_MAYSHARE))
+   if (vma->vm_flags & VM_MAYSHARE) {
+   struct address_space *mapping = vma->vm_file->f_mapping;
+   struct inode *inode = mapping->host;
+
+   return inode_resv_map(inode);
+
+   } else {
return (struct resv_map *)(get_vma_private_data(vma) &
~HPAGE_RESV_MASK);
-   return NULL;
+   }
 }
 
 static void set_vma_resv_map(struct vm_area_struct *vma, struct resv_map *map)
@@ -1174,48 +1185,34 @@ static void return_unused_surplus_pages(struct hstate 
*h,
 static long vma_needs_reservation(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr)
 {
-   struct address_space *mapping = vma->vm_file->f_mapping;
-   struct inode *inode = mapping->host;
-
-   if (vma->vm_flags & VM_MAYSHARE) {
-   pgoff_t idx = vma_hugecache_offset(h, vma, addr);
-   struct resv_map *resv = inode->i_mapping->private_data;
-
-   return region_chg(resv, idx, idx + 1);
+   struct resv_map *resv;
+   pgoff_t idx;
+   long chg;
 
-   } else if (!is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+   resv = vma_resv_map(vma);
+   if (!resv)
return 1;
 
-   } else  {
-   long err;
-   pgoff_t idx = vma_hugecache_offset(h, vma, addr);
-   struct resv_map *resv = vma_resv_map(vma);
+   idx = vma_hugecache_offset(h, vma, addr);
+   chg = region_chg(resv, idx, idx + 1);
 
-   err = region_chg(resv, idx, idx + 1);
-   if (err < 0)
-   return err;
-   return 0;
-   }
+   if (vma->vm_flags & VM_MAYSHARE)
+   return chg;
+   else
+   return chg < 0 ? chg : 0;
 }
 static void vma_commit_reservation(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr)
 {
-   struct address_space *mapping = vma->vm_file->f_mapping;
-   struct inode *inode = mapping->host;
-
-   if (vma->vm_flags & VM_MAYSHARE) {
-   pgoff_t idx = vma_hugecache_offset(h, vma, addr);
-   struct resv_map *resv = inode->i_mapping->private_data;
-
-   region_add(resv, idx, idx + 1);
+   struct resv_map *resv;
+   pgoff_t idx;
 
-   } else if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
-   pgoff_t idx = vma_hugecache_offset(h, vma, addr);
-   struct resv_map *resv = vma_resv_map(vma);
+   resv = vma_resv_map(vma);
+   if (!resv)
+   return;
 
-   /* Mark this page used in the map. */
-   region_add(resv, idx, idx + 1);
-   }
+   idx = vma_hugecache_offset(h, vma, addr);
+   region_add(resv, idx, idx + 1);
 }
 
 static struct page *alloc_huge_page(struct vm_area_struct *vma,
@@ -2278,7 +2275,7 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 * after this open call completes.  It is therefore safe to take a
 * new reference here without additional locking.
 */
-   if (resv)
+   if (resv && is_vma_resv_set(vma, HPAGE_RESV_OWNER))
kref_get(&resv->refs);
 }
 
@@ -2291,7 +2288,10 @@ static void hugetlb_vm_op_close(struct vm_area_struct 
*vma)
unsigned long start;
unsigned long end;
 
-   if (resv) {
+   if (!resv)
+   return;
+
+   if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
start = vma_hugecache_offset(h, vma, vma->vm_start);
end = vma_hugecache_offset(h, vma, vma->vm_end);
 
@@ -3185,7 +3185,7 @@ int hugetlb_reserve_pages(struct inode *inode,
 * called to make the mapping read-write. Assume !vma is a shm mapping
 */
if (!vma || vma->vm_flags & VM_MAYSHARE) {
-   resv_map = inode->i_mapping->private_data;
+   resv_map = inode_resv_map(inode);
 
chg = region_chg(resv_map, from, to);
 
@@ -3244,7 +3244,7 @@ out_err:
 void hugetlb_unreserve_pages(struct inode *inode, long offset, long freed)
 {
struct hstate *h = hstate_inode(inode);
-   struct resv_map *resv_map = inode->i_mapping->private_data;
+   struct resv_map *resv_map = inode_resv_map(inode);
long chg = 0;
struct 

[RFC Patch v1 10/13] ACPI, i2c-hid: replace open-coded _DSM specific code with helper functions

2013-12-17 Thread Jiang Liu
Use helper functions to simplify _DSM related code in i2c-hid driver.

Signed-off-by: Jiang Liu 
---
 drivers/hid/i2c-hid/i2c-hid.c |   26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/drivers/hid/i2c-hid/i2c-hid.c b/drivers/hid/i2c-hid/i2c-hid.c
index 5f7e55f..d22668f 100644
--- a/drivers/hid/i2c-hid/i2c-hid.c
+++ b/drivers/hid/i2c-hid/i2c-hid.c
@@ -850,37 +850,23 @@ static int i2c_hid_acpi_pdata(struct i2c_client *client,
0xF7, 0xF6, 0xDF, 0x3C, 0x67, 0x42, 0x55, 0x45,
0xAD, 0x05, 0xB3, 0x0A, 0x3D, 0x89, 0x38, 0xDE,
};
-   union acpi_object params[4];
-   struct acpi_object_list input;
+   union acpi_object *obj;
struct acpi_device *adev;
-   unsigned long long value;
acpi_handle handle;
 
handle = ACPI_HANDLE(&client->dev);
if (!handle || acpi_bus_get_device(handle, &adev))
return -ENODEV;
 
-   input.count = ARRAY_SIZE(params);
-   input.pointer = params;
-
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = sizeof(i2c_hid_guid);
-   params[0].buffer.pointer = i2c_hid_guid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = 1;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = 1; /* HID function */
-   params[3].type = ACPI_TYPE_PACKAGE;
-   params[3].package.count = 0;
-   params[3].package.elements = NULL;
-
-   if (ACPI_FAILURE(acpi_evaluate_integer(handle, "_DSM", &input,
-   &value))) {
+   obj = acpi_evaluate_dsm_typed(handle, i2c_hid_guid, 1, 1, NULL,
+ ACPI_TYPE_INTEGER);
+   if (!obj) {
dev_err(&client->dev, "device _DSM execution failed\n");
return -ENODEV;
}
 
-   pdata->hid_descriptor_address = value;
+   pdata->hid_descriptor_address = obj->integer.value;
+   ACPI_FREE(obj);
 
return 0;
 }
-- 
1.7.10.4



[RFC Patch v1 12/13] nouveau: fix memory leak in ACPI _DSM related code

2013-12-17 Thread Jiang Liu
Fix memory leak in function nouveau_optimus_dsm() and nouveau_dsm().

Signed-off-by: Jiang Liu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 95c7404..03d4911 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -110,6 +110,7 @@ static int nouveau_optimus_dsm(acpi_handle handle, int 
func, int arg, uint32_t *
 
if (obj->type == ACPI_TYPE_INTEGER)
if (obj->integer.value == 0x8002) {
+   kfree(output.pointer);
return -ENODEV;
}
 
@@ -156,8 +157,10 @@ static int nouveau_dsm(acpi_handle handle, int func, int 
arg, uint32_t *result)
obj = (union acpi_object *)output.pointer;
 
if (obj->type == ACPI_TYPE_INTEGER)
-   if (obj->integer.value == 0x8002)
+   if (obj->integer.value == 0x8002) {
+   kfree(output.pointer);
return -ENODEV;
+   }
 
if (obj->type == ACPI_TYPE_BUFFER) {
if (obj->buffer.length == 4 && result) {
-- 
1.7.10.4



[RFC Patch v1 13/13] ACPI, nouveau: replace open-coded _DSM specific code with helper functions

2013-12-17 Thread Jiang Liu
Use helper functions to simplify _DSM related code in nouveau driver.
After analyzing the ACPI _DSM related code, I changed nouveau_optimus_dsm()
to expect a buffer and nouveau_dsm() to expect an integer only.

Signed-off-by: Jiang Liu 
---
 drivers/gpu/drm/nouveau/core/subdev/mxm/base.c |   48 +++-
 drivers/gpu/drm/nouveau/nouveau_acpi.c |  139 +++-
 2 files changed, 54 insertions(+), 133 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c 
b/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c
index 1291204..13c5af8 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c
@@ -87,55 +87,39 @@ mxm_shadow_dsm(struct nouveau_mxm *mxm, u8 version)
0xB8, 0x9C, 0x79, 0xB6, 0x2F, 0xD5, 0x56, 0x65
};
u32 mxms_args[] = { 0x };
-   union acpi_object args[4] = {
-   /* _DSM MUID */
-   { .buffer.type = 3,
- .buffer.length = sizeof(muid),
- .buffer.pointer = muid,
-   },
-   /* spec says this can be zero to mean "highest revision", but
-* of course there's at least one bios out there which fails
-* unless you pass in exactly the version it supports..
-*/
-   { .integer.type = ACPI_TYPE_INTEGER,
- .integer.value = (version & 0xf0) << 4 | (version & 0x0f),
-   },
-   /* MXMS function */
-   { .integer.type = ACPI_TYPE_INTEGER,
- .integer.value = 0x0010,
-   },
-   /* Pointer to MXMS arguments */
-   { .buffer.type = ACPI_TYPE_BUFFER,
- .buffer.length = sizeof(mxms_args),
- .buffer.pointer = (char *)mxms_args,
-   },
+   union acpi_object argv4 = {
+   .buffer.type = ACPI_TYPE_BUFFER,
+   .buffer.length = sizeof(mxms_args),
+   .buffer.pointer = (char *)mxms_args,
};
-   struct acpi_object_list list = { ARRAY_SIZE(args), args };
-   struct acpi_buffer retn = { ACPI_ALLOCATE_BUFFER, NULL };
union acpi_object *obj;
acpi_handle handle;
-   int ret;
+   int rev;
 
handle = ACPI_HANDLE(&device->pdev->dev);
if (!handle)
return false;
 
-   ret = acpi_evaluate_object(handle, "_DSM", &list, &retn);
-   if (ret) {
-   nv_debug(mxm, "DSM MXMS failed: %d\n", ret);
+   /*
+* spec says this can be zero to mean "highest revision", but
+* of course there's at least one bios out there which fails
+* unless you pass in exactly the version it supports..
+*/
+   rev = (version & 0xf0) << 4 | (version & 0x0f);
+   obj = acpi_evaluate_dsm(handle, muid, rev, 0x0010, &argv4);
+   if (!obj) {
+   nv_debug(mxm, "DSM MXMS failed\n");
return false;
}
 
-   obj = retn.pointer;
if (obj->type == ACPI_TYPE_BUFFER) {
mxm->mxms = kmemdup(obj->buffer.pointer,
 obj->buffer.length, GFP_KERNEL);
-   } else
-   if (obj->type == ACPI_TYPE_INTEGER) {
+   } else if (obj->type == ACPI_TYPE_INTEGER) {
nv_debug(mxm, "DSM MXMS returned 0x%llx\n", obj->integer.value);
}
 
-   kfree(obj);
+   ACPI_FREE(obj);
return mxm->mxms != NULL;
 }
 #endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 03d4911..7ac7234 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -77,127 +77,66 @@ static const char nouveau_op_dsm_muid[] = {
 
 static int nouveau_optimus_dsm(acpi_handle handle, int func, int arg, uint32_t 
*result)
 {
-   struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
-   struct acpi_object_list input;
-   union acpi_object params[4];
+   int i;
union acpi_object *obj;
-   int i, err;
char args_buff[4];
+   union acpi_object argv4 = {
+   .type = ACPI_TYPE_BUFFER,
+   .buffer.length = 4,
+   .buffer.pointer = args_buff
+   };
 
-   input.count = 4;
-   input.pointer = params;
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = sizeof(nouveau_op_dsm_muid);
-   params[0].buffer.pointer = (char *)nouveau_op_dsm_muid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = 0x0100;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = func;
-   params[3].type = ACPI_TYPE_BUFFER;
-   params[3].buffer.length = 4;
/* ACPI is little endian, AABBCCDD becomes {DD,CC,BB,AA} */
for (i = 0; i < 4; i++)
args_buff[i] = (arg >> i * 8) & 0xFF;
-   params[3].buffer.pointer = args_buff;
 
-   err = acpi_evaluate_object(handle, 
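
The loop kept in the hunk above serializes the integer argument byte by byte, least significant byte first, so the buffer layout does not depend on host endianness. A standalone sketch of that packing is shown here; pack_le32() is a hypothetical helper name used only for illustration, and it uses unsigned types where the driver uses int and char.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store 'arg' into out[0..3] in little-endian order:
 * 0xAABBCCDD becomes { 0xDD, 0xCC, 0xBB, 0xAA }.
 */
static void pack_le32(uint32_t arg, unsigned char out[4])
{
	int i;

	for (i = 0; i < 4; i++)
		out[i] = (arg >> (i * 8)) & 0xFF;
}
```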

[RFC Patch v1 07/13] ACPI, TPM: matching node name instead of full path when searching for TPM device

2013-12-17 Thread Jiang Liu
When searching the ACPI namespace for the TPM device, match against the
current ACPI object's name instead of its full path.

Signed-off-by: Jiang Liu 
---
 drivers/char/tpm/tpm_ppi.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_ppi.c b/drivers/char/tpm/tpm_ppi.c
index e1f3337..1e9cc11 100644
--- a/drivers/char/tpm/tpm_ppi.c
+++ b/drivers/char/tpm/tpm_ppi.c
@@ -30,7 +30,7 @@ static acpi_status ppi_callback(acpi_handle handle, u32 
level, void *context,
acpi_status status = AE_OK;
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 
-   if (ACPI_SUCCESS(acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer))) {
+   if (ACPI_SUCCESS(acpi_get_name(handle, ACPI_SINGLE_NAME, &buffer))) {
if (strstr(buffer.pointer, context) != NULL) {
*return_value = handle;
status = AE_CTRL_TERMINATE;
-- 
1.7.10.4



[RFC Patch v1 05/13] PCI, pci-label: treat PCI label with index 0 as valid label

2013-12-17 Thread Jiang Liu
The current pci-label driver detects an ACPI label by checking the label
index returned by the ACPI _DSM method, and treats it as valid only if the
label index is positive. According to ACPI Firmware specification 3.1,
zero is also a valid label index. So change the code to detect availability
of the ACPI slot label by checking availability of the ACPI _DSM function
for PCI label.
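
To make the problem concrete, here is a small standalone sketch. The struct and both helpers are hypothetical and exist only for illustration; they model "did firmware provide a label" separately from "what is the label index", which is the distinction this patch relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what the _DSM label query yields. */
struct label_result {
	bool present;			/* firmware returned a label package */
	unsigned long long index;	/* label index, 0 is legal */
};

/* Old logic: the index doubles as the "label exists" signal. */
static bool has_label_old(const struct label_result *r)
{
	return r->present && r->index > 0;	/* wrongly rejects index 0 */
}

/* New logic: existence is decided independently of the index value. */
static bool has_label_new(const struct label_result *r)
{
	return r->present;
}
```

With index 0, has_label_old() reports no label while has_label_new() reports one.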

Signed-off-by: Jiang Liu 
---
 drivers/pci/pci-label.c |   34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
index f12dcd1..0260b14 100644
--- a/drivers/pci/pci-label.c
+++ b/drivers/pci/pci-label.c
@@ -187,7 +187,6 @@ static const char device_label_dsm_uuid[] = {
 };
 
 enum acpi_attr_enum {
-   ACPI_ATTR_NONE = 0,
ACPI_ATTR_LABEL_SHOW,
ACPI_ATTR_INDEX_SHOW,
 };
@@ -222,20 +221,16 @@ dsm_get_label(struct device *dev, char *buf, enum 
acpi_attr_enum attr)
if (obj->type == ACPI_TYPE_PACKAGE && obj->package.count == 2 &&
tmp[0].type == ACPI_TYPE_INTEGER &&
tmp[1].type == ACPI_TYPE_STRING) {
-   len = tmp[0].integer.value;
-   if (buf) {
-   /*
-* This second string element is optional even when
-* this _DSM is implemented; when not implemented,
-* this entry must return a null string.
-*/
-   if (attr == ACPI_ATTR_INDEX_SHOW)
-   scnprintf(buf, PAGE_SIZE, "%llu\n",
-   tmp->integer.value);
-   else if (attr == ACPI_ATTR_LABEL_SHOW)
-   dsm_label_utf16s_to_utf8s(tmp + 1, buf);
-   len = strlen(buf) > 0 ? strlen(buf) : -1;
-   }
+   /*
+* The second string element is optional even when
+* this _DSM is implemented; when not implemented,
+* this entry must return a null string.
+*/
+   if (attr == ACPI_ATTR_INDEX_SHOW)
+   scnprintf(buf, PAGE_SIZE, "%llu\n", tmp->integer.value);
+   else if (attr == ACPI_ATTR_LABEL_SHOW)
+   dsm_label_utf16s_to_utf8s(tmp + 1, buf);
+   len = strlen(buf) > 0 ? strlen(buf) : -1;
}
 
ACPI_FREE(obj);
@@ -246,7 +241,14 @@ dsm_get_label(struct device *dev, char *buf, enum 
acpi_attr_enum attr)
 static bool
 device_has_dsm(struct device *dev)
 {
-   return dsm_get_label(dev, NULL, ACPI_ATTR_NONE) > 0;
+   acpi_handle handle;
+
+   handle = ACPI_HANDLE(dev);
+   if (!handle)
+   return false;
+
+   return !!acpi_check_dsm(handle, device_label_dsm_uuid, 0x2,
+   1 << DEVICE_LABEL_DSM);
 }
 
 static umode_t
-- 
1.7.10.4



[RFC Patch v1 09/13] ACPI, TPM: detecting PPI features by checking availability of _DSM functions

2013-12-17 Thread Jiang Liu
Detect physical presence interface features by checking the availability
of the corresponding ACPI _DSM functions, which should be more accurate
than checking the TPM version number.
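
The checks in this patch pass a mask of the form 1 << FN for the function of interest. A minimal sketch of such a mask test follows. Note that acpi_check_dsm() is introduced elsewhere in this series and its implementation is not shown here, so funcs_supported() below is a hypothetical stand-in and the mask values in the usage are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * 'supported' models a bitmask in which bit n set means _DSM
 * function n is implemented. Return true only if every function
 * requested in 'wanted' is present.
 */
static bool funcs_supported(uint64_t supported, uint64_t wanted)
{
	return (supported & wanted) == wanted;
}
```

For example, with supported = 0x87 (functions 0, 1, 2 and 7), a query for 1 << 2 succeeds and a query for 1 << 3 fails, regardless of what any version string says.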

Signed-off-by: Jiang Liu 
---
 drivers/char/tpm/tpm_ppi.c |   45 +++-
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/drivers/char/tpm/tpm_ppi.c b/drivers/char/tpm/tpm_ppi.c
index ad5143f..272c22d 100644
--- a/drivers/char/tpm/tpm_ppi.c
+++ b/drivers/char/tpm/tpm_ppi.c
@@ -23,39 +23,31 @@ static const u8 tpm_ppi_uuid[] = {
0x8D, 0x10, 0x08, 0x9D, 0x16, 0x53
 };
 
-static char *tpm_device_name = "TPM";
 static char tpm_ppi_version[PPI_VERSION_LEN + 1];
 static acpi_handle tpm_ppi_handle;
 
 static acpi_status ppi_callback(acpi_handle handle, u32 level, void *context,
void **return_value)
 {
-   acpi_status status = AE_OK;
-   struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
+   union acpi_object *obj;
 
-   status = acpi_get_name(handle, ACPI_SINGLE_NAME, &buffer);
-   if (ACPI_FAILURE(status))
+   if (!acpi_check_dsm(handle, tpm_ppi_uuid, TPM_PPI_REVISION_ID,
+   1 << TPM_PPI_FN_VERSION))
return AE_OK;
 
-   if (strstr(buffer.pointer, context) != NULL) {
-   union acpi_object *obj;
-
-   /* Cache version string */
-   obj = acpi_evaluate_dsm_typed(handle, tpm_ppi_uuid,
-   TPM_PPI_REVISION_ID, TPM_PPI_FN_VERSION,
-   NULL, ACPI_TYPE_STRING);
-   if (obj) {
-   strlcpy(tpm_ppi_version, obj->string.pointer,
-   PPI_VERSION_LEN + 1);
-   ACPI_FREE(obj);
-   }
-
-   *return_value = handle;
-   status = AE_CTRL_TERMINATE;
+   /* Cache version string */
+   obj = acpi_evaluate_dsm_typed(handle, tpm_ppi_uuid,
+ TPM_PPI_REVISION_ID, TPM_PPI_FN_VERSION,
+ NULL, ACPI_TYPE_STRING);
+   if (obj) {
+   strlcpy(tpm_ppi_version, obj->string.pointer,
+   PPI_VERSION_LEN + 1);
+   ACPI_FREE(obj);
}
-   kfree(buffer.pointer);
 
-   return status;
+   *return_value = handle;
+
+   return AE_CTRL_TERMINATE;
 }
 
 static inline union acpi_object *
@@ -118,7 +110,8 @@ static ssize_t tpm_store_ppi_request(struct device *dev,
 * is updated with function index from SUBREQ to SUBREQ2 since PPI
 * version 1.1
 */
-   if (strcmp(tpm_ppi_version, "1.1") >= 0)
+   if (acpi_check_dsm(tpm_ppi_handle, tpm_ppi_uuid, TPM_PPI_REVISION_ID,
+  1 << TPM_PPI_FN_SUBREQ2))
func = TPM_PPI_FN_SUBREQ2;
 
/*
@@ -272,7 +265,8 @@ static ssize_t show_ppi_operations(char *buf, u32 start, 
u32 end)
"User not required",
};
 
-   if (strcmp(tpm_ppi_version, "1.2") < 0)
+   if (!acpi_check_dsm(tpm_ppi_handle, tpm_ppi_uuid, TPM_PPI_REVISION_ID,
+   1 << TPM_PPI_FN_GETOPR))
return -EPERM;
 
tmp.type = ACPI_TYPE_INTEGER;
@@ -334,8 +328,7 @@ int tpm_add_ppi(struct kobject *parent)
 {
/* Cache TPM ACPI handle and version string */
acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT, ACPI_UINT32_MAX,
-   ppi_callback, NULL,
-   tpm_device_name, &tpm_ppi_handle);
+   ppi_callback, NULL, NULL, &tpm_ppi_handle);
if (tpm_ppi_handle == NULL)
return -ENODEV;
 
-- 
1.7.10.4



[RFC Patch v1 11/13] ACPI, i915: replace open-coded _DSM specific code with helper functions

2013-12-17 Thread Jiang Liu
Use helper functions to simplify _DSM related code in i915 driver.

Function intel_dsm() is used to check the functions supported by the ACPI _DSM
method, but it has a strange check for the special value 0x8002. After
digging into the nouveau driver, I think the check was copied from the nouveau
driver and is useless for the i915 driver, so remove it.

Signed-off-by: Jiang Liu 
---
 drivers/gpu/drm/i915/intel_acpi.c |  144 -
 1 file changed, 30 insertions(+), 114 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_acpi.c 
b/drivers/gpu/drm/i915/intel_acpi.c
index dfff090..1bfac94 100644
--- a/drivers/gpu/drm/i915/intel_acpi.c
+++ b/drivers/gpu/drm/i915/intel_acpi.c
@@ -12,8 +12,6 @@
 #include "i915_drv.h"
 
 #define INTEL_DSM_REVISION_ID 1 /* For Calpella anyway... */
-
-#define INTEL_DSM_FN_SUPPORTED_FUNCTIONS 0 /* No args */
 #define INTEL_DSM_FN_PLATFORM_MUX_INFO 1 /* No args */
 
 static struct intel_dsm_priv {
@@ -28,61 +26,6 @@ static const u8 intel_dsm_guid[] = {
0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c
 };
 
-static int intel_dsm(acpi_handle handle, int func)
-{
-   struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
-   struct acpi_object_list input;
-   union acpi_object params[4];
-   union acpi_object *obj;
-   u32 result;
-   int ret = 0;
-
-   input.count = 4;
-   input.pointer = params;
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = sizeof(intel_dsm_guid);
-   params[0].buffer.pointer = (char *)intel_dsm_guid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = INTEL_DSM_REVISION_ID;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = func;
-   params[3].type = ACPI_TYPE_PACKAGE;
-   params[3].package.count = 0;
-   params[3].package.elements = NULL;
-
-   ret = acpi_evaluate_object(handle, "_DSM", &input, &output);
-   if (ret) {
-   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret);
-   return ret;
-   }
-
-   obj = (union acpi_object *)output.pointer;
-
-   result = 0;
-   switch (obj->type) {
-   case ACPI_TYPE_INTEGER:
-   result = obj->integer.value;
-   break;
-
-   case ACPI_TYPE_BUFFER:
-   if (obj->buffer.length == 4) {
-   result = (obj->buffer.pointer[0] |
-   (obj->buffer.pointer[1] <<  8) |
-   (obj->buffer.pointer[2] << 16) |
-   (obj->buffer.pointer[3] << 24));
-   break;
-   }
-   default:
-   ret = -EINVAL;
-   break;
-   }
-   if (result == 0x8002)
-   ret = -ENODEV;
-
-   kfree(output.pointer);
-   return ret;
-}
-
 static char *intel_dsm_port_name(u8 id)
 {
switch (id) {
@@ -137,83 +80,56 @@ static char *intel_dsm_mux_type(u8 type)
 
 static void intel_dsm_platform_mux_info(void)
 {
-   struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
-   struct acpi_object_list input;
-   union acpi_object params[4];
-   union acpi_object *pkg;
-   int i, ret;
-
-   input.count = 4;
-   input.pointer = params;
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = sizeof(intel_dsm_guid);
-   params[0].buffer.pointer = (char *)intel_dsm_guid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = INTEL_DSM_REVISION_ID;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = INTEL_DSM_FN_PLATFORM_MUX_INFO;
-   params[3].type = ACPI_TYPE_PACKAGE;
-   params[3].package.count = 0;
-   params[3].package.elements = NULL;
-
-   ret = acpi_evaluate_object(intel_dsm_priv.dhandle, "_DSM", &input,
-  &output);
-   if (ret) {
-   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret);
-   goto out;
+   int i;
+   union acpi_object *pkg, *connector_count;
+
+   pkg = acpi_evaluate_dsm_typed(intel_dsm_priv.dhandle, intel_dsm_guid,
+   INTEL_DSM_REVISION_ID, INTEL_DSM_FN_PLATFORM_MUX_INFO,
+   NULL, ACPI_TYPE_PACKAGE);
+   if (!pkg) {
+   DRM_DEBUG_DRIVER("failed to evaluate _DSM\n");
+   return;
}
 
-   pkg = (union acpi_object *)output.pointer;
-
-   if (pkg->type == ACPI_TYPE_PACKAGE) {
-   union acpi_object *connector_count = &pkg->package.elements[0];
-   DRM_DEBUG_DRIVER("MUX info connectors: %lld\n",
- (unsigned long long)connector_count->integer.value);
-   for (i = 1; i < pkg->package.count; i++) {
-   union acpi_object *obj = &pkg->package.elements[i];
-   union acpi_object *connector_id =
-   &obj->package.elements[0];
-   union acpi_object *info = 

[RFC Patch v1 04/13] ACPI, PCI: replace open-coded _DSM specific code with helper functions

2013-12-17 Thread Jiang Liu
Use helper functions to simplify _DSM related code in pci-label driver.
Also enforce stricter checks on objects returned by the _DSM method.

Signed-off-by: Jiang Liu 
---
 drivers/pci/pci-label.c |  121 +--
 1 file changed, 34 insertions(+), 87 deletions(-)

diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
index f6e01a5..f12dcd1 100644
--- a/drivers/pci/pci-label.c
+++ b/drivers/pci/pci-label.c
@@ -195,80 +195,58 @@ enum acpi_attr_enum {
 static void dsm_label_utf16s_to_utf8s(union acpi_object *obj, char *buf)
 {
int len;
-   len = utf16s_to_utf8s((const wchar_t *)obj->
- package.elements[1].string.pointer,
- obj->package.elements[1].string.length,
+   len = utf16s_to_utf8s((const wchar_t *)obj->string.pointer,
+ obj->string.length,
  UTF16_LITTLE_ENDIAN,
  buf, PAGE_SIZE);
buf[len] = '\n';
 }
 
 static int
-dsm_get_label(acpi_handle handle, int func,
- struct acpi_buffer *output,
- char *buf, enum acpi_attr_enum attribute)
+dsm_get_label(struct device *dev, char *buf, enum acpi_attr_enum attr)
 {
-   struct acpi_object_list input;
-   union acpi_object params[4];
-   union acpi_object *obj;
-   int len = 0;
-
-   int err;
-
-   input.count = 4;
-   input.pointer = params;
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = sizeof(device_label_dsm_uuid);
-   params[0].buffer.pointer = (char *)device_label_dsm_uuid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = 0x02;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = func;
-   params[3].type = ACPI_TYPE_PACKAGE;
-   params[3].package.count = 0;
-   params[3].package.elements = NULL;
-
-   err = acpi_evaluate_object(handle, "_DSM", &input, output);
-   if (err)
+   acpi_handle handle;
+   union acpi_object *obj, *tmp;
+   int len = -1;
+
+   handle = ACPI_HANDLE(dev);
+   if (!handle)
return -1;
 
-   obj = (union acpi_object *)output->pointer;
-   if (obj->type == ACPI_TYPE_PACKAGE && obj->package.count == 2) {
-   len = obj->package.elements[0].integer.value;
+   obj = acpi_evaluate_dsm(handle, device_label_dsm_uuid, 0x2,
+   DEVICE_LABEL_DSM, NULL);
+   if (!obj)
+   return -1;
+
+   tmp = obj->package.elements;
+   if (obj->type == ACPI_TYPE_PACKAGE && obj->package.count == 2 &&
+   tmp[0].type == ACPI_TYPE_INTEGER &&
+   tmp[1].type == ACPI_TYPE_STRING) {
+   len = tmp[0].integer.value;
if (buf) {
-   if (attribute == ACPI_ATTR_INDEX_SHOW)
+   /*
+* This second string element is optional even when
+* this _DSM is implemented; when not implemented,
+* this entry must return a null string.
+*/
+   if (attr == ACPI_ATTR_INDEX_SHOW)
scnprintf(buf, PAGE_SIZE, "%llu\n",
-   obj->package.elements[0].integer.value);
-   else if (attribute == ACPI_ATTR_LABEL_SHOW)
-   dsm_label_utf16s_to_utf8s(obj, buf);
-   kfree(output->pointer);
-   return strlen(buf);
+   tmp->integer.value);
+   else if (attr == ACPI_ATTR_LABEL_SHOW)
+   dsm_label_utf16s_to_utf8s(tmp + 1, buf);
+   len = strlen(buf) > 0 ? strlen(buf) : -1;
}
-   kfree(output->pointer);
-   return len;
}
 
-   kfree(output->pointer);
+   ACPI_FREE(obj);
 
-   return -1;
+   return len;
 }
 
 static bool
 device_has_dsm(struct device *dev)
 {
-   acpi_handle handle;
-   struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
-
-   handle = ACPI_HANDLE(dev);
-
-   if (!handle)
-   return FALSE;
-
-   if (dsm_get_label(handle, DEVICE_LABEL_DSM, &output, NULL,
- ACPI_ATTR_NONE) > 0)
-   return TRUE;
-
-   return FALSE;
+   return dsm_get_label(dev, NULL, ACPI_ATTR_NONE) > 0;
 }
 
 static umode_t
@@ -287,44 +265,13 @@ acpi_index_string_exist(struct kobject *kobj, struct 
attribute *attr, int n)
 static ssize_t
 acpilabel_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
-   struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
-   acpi_handle handle;
-   int length;
-
-   handle = ACPI_HANDLE(dev);
-
-   if (!handle)
-   return -1;
-
-   length = dsm_get_label(handle, DEVICE_LABEL_DSM,
- 

[RFC Patch v1 06/13] ACPI, TPM: fix memory leak when walking ACPI namespace

2013-12-17 Thread Jiang Liu
In function ppi_callback(), memory allocated by acpi_get_name() is leaked
when the current device isn't the desired TPM device, so fix the memory
leak.
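
The shape of the fix is to free the name buffer on both the match and the non-match path, and only when it was actually obtained. A standalone userspace sketch of that shape is below; get_name(), put_name(), visit() and the live_buffers counter are hypothetical and only stand in for acpi_get_name(), kfree() and the callback.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_buffers;	/* bookkeeping so a leak is observable */

/* Hypothetical stand-in for a call that allocates a name buffer. */
static char *get_name(const char *src)
{
	char *p = malloc(strlen(src) + 1);

	if (p) {
		strcpy(p, src);
		live_buffers++;
	}
	return p;
}

static void put_name(char *p)
{
	if (p) {
		free(p);
		live_buffers--;
	}
}

/* Returns 1 to stop the walk (match found), 0 to keep walking. */
static int visit(const char *node_name, const char *wanted)
{
	int stop = 0;
	char *name = get_name(node_name);

	if (name) {
		if (strstr(name, wanted) != NULL)
			stop = 1;
		put_name(name);	/* released whether or not it matched */
	}
	return stop;
}
```

After visiting any number of non-matching nodes, live_buffers is back at zero, which is the property the old code lacked.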

Signed-off-by: Jiang Liu 
Cc:  # 3.6
---
 drivers/char/tpm/tpm_ppi.c |   15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/char/tpm/tpm_ppi.c b/drivers/char/tpm/tpm_ppi.c
index 8e562dc..e1f3337 100644
--- a/drivers/char/tpm/tpm_ppi.c
+++ b/drivers/char/tpm/tpm_ppi.c
@@ -27,15 +27,18 @@ static char *tpm_device_name = "TPM";
 static acpi_status ppi_callback(acpi_handle handle, u32 level, void *context,
void **return_value)
 {
-   acpi_status status;
+   acpi_status status = AE_OK;
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
-   status = acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
-   if (strstr(buffer.pointer, context) != NULL) {
-   *return_value = handle;
+
+   if (ACPI_SUCCESS(acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer))) {
+   if (strstr(buffer.pointer, context) != NULL) {
+   *return_value = handle;
+   status = AE_CTRL_TERMINATE;
+   }
kfree(buffer.pointer);
-   return AE_CTRL_TERMINATE;
}
-   return AE_OK;
+
+   return status;
 }
 
 static inline void ppi_assign_params(union acpi_object params[4],
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC Patch v1 08/13] ACPI, TPM: replace open-coded _DSM specific code with helper functions

2013-12-17 Thread Jiang Liu
Use helper functions to simplify _DSM related code in TPM driver.

Signed-off-by: Jiang Liu 
---
 drivers/char/tpm/tpm_ppi.c |  404 
 1 file changed, 145 insertions(+), 259 deletions(-)

diff --git a/drivers/char/tpm/tpm_ppi.c b/drivers/char/tpm/tpm_ppi.c
index 1e9cc11..ad5143f 100644
--- a/drivers/char/tpm/tpm_ppi.c
+++ b/drivers/char/tpm/tpm_ppi.c
@@ -2,15 +2,6 @@
 #include 
 #include "tpm.h"
 
-static const u8 tpm_ppi_uuid[] = {
-   0xA6, 0xFA, 0xDD, 0x3D,
-   0x1B, 0x36,
-   0xB4, 0x4E,
-   0xA4, 0x24,
-   0x8D, 0x10, 0x08, 0x9D, 0x16, 0x53
-};
-static char *tpm_device_name = "TPM";
-
 #define TPM_PPI_REVISION_ID1
 #define TPM_PPI_FN_VERSION 1
 #define TPM_PPI_FN_SUBREQ  2
@@ -24,250 +15,185 @@ static char *tpm_device_name = "TPM";
 #define PPI_VS_REQ_END 255
 #define PPI_VERSION_LEN3
 
+static const u8 tpm_ppi_uuid[] = {
+   0xA6, 0xFA, 0xDD, 0x3D,
+   0x1B, 0x36,
+   0xB4, 0x4E,
+   0xA4, 0x24,
+   0x8D, 0x10, 0x08, 0x9D, 0x16, 0x53
+};
+
+static char *tpm_device_name = "TPM";
+static char tpm_ppi_version[PPI_VERSION_LEN + 1];
+static acpi_handle tpm_ppi_handle;
+
 static acpi_status ppi_callback(acpi_handle handle, u32 level, void *context,
void **return_value)
 {
acpi_status status = AE_OK;
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 
-   if (ACPI_SUCCESS(acpi_get_name(handle, ACPI_SINGLE_NAME, &buffer))) {
-   if (strstr(buffer.pointer, context) != NULL) {
-   *return_value = handle;
-   status = AE_CTRL_TERMINATE;
+   status = acpi_get_name(handle, ACPI_SINGLE_NAME, &buffer);
+   if (ACPI_FAILURE(status))
+   return AE_OK;
+
+   if (strstr(buffer.pointer, context) != NULL) {
+   union acpi_object *obj;
+
+   /* Cache version string */
+   obj = acpi_evaluate_dsm_typed(handle, tpm_ppi_uuid,
+   TPM_PPI_REVISION_ID, TPM_PPI_FN_VERSION,
+   NULL, ACPI_TYPE_STRING);
+   if (obj) {
+   strlcpy(tpm_ppi_version, obj->string.pointer,
+   PPI_VERSION_LEN + 1);
+   ACPI_FREE(obj);
}
-   kfree(buffer.pointer);
+
+   *return_value = handle;
+   status = AE_CTRL_TERMINATE;
}
+   kfree(buffer.pointer);
 
return status;
 }
 
-static inline void ppi_assign_params(union acpi_object params[4],
-u64 function_num)
+static inline union acpi_object *
+tpm_eval_dsm(int func, acpi_object_type type, union acpi_object *argv4)
 {
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = sizeof(tpm_ppi_uuid);
-   params[0].buffer.pointer = (char *)tpm_ppi_uuid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = TPM_PPI_REVISION_ID;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = function_num;
-   params[3].type = ACPI_TYPE_PACKAGE;
-   params[3].package.count = 0;
-   params[3].package.elements = NULL;
+   BUG_ON(!tpm_ppi_handle);
+   return acpi_evaluate_dsm_typed(tpm_ppi_handle, tpm_ppi_uuid,
+  TPM_PPI_REVISION_ID, func, argv4, type);
 }
 
 static ssize_t tpm_show_ppi_version(struct device *dev,
struct device_attribute *attr, char *buf)
 {
-   acpi_handle handle;
-   acpi_status status;
-   struct acpi_object_list input;
-   struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
-   union acpi_object params[4];
-   union acpi_object *obj;
-
-   input.count = 4;
-   ppi_assign_params(params, TPM_PPI_FN_VERSION);
-   input.pointer = params;
-   status = acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
-ACPI_UINT32_MAX, ppi_callback, NULL,
-tpm_device_name, &handle);
-   if (ACPI_FAILURE(status))
-   return -ENXIO;
-
-   status = acpi_evaluate_object_typed(handle, "_DSM", &input, &output,
-ACPI_TYPE_STRING);
-   if (ACPI_FAILURE(status))
-   return -ENOMEM;
-   obj = (union acpi_object *)output.pointer;
-   status = scnprintf(buf, PAGE_SIZE, "%s\n", obj->string.pointer);
-   kfree(output.pointer);
-   return status;
+   return scnprintf(buf, PAGE_SIZE, "%s\n", tpm_ppi_version);
 }
 
 static ssize_t tpm_show_ppi_request(struct device *dev,
struct device_attribute *attr, char *buf)
 {
-   acpi_handle handle;
-   acpi_status status;
-   struct acpi_object_list input;
-   struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
-   union acpi_object params[4];
-   union 

[RFC Patch v1 02/13] ACPI, extlog: replace open-coded _DSM specific code with helper functions

2013-12-17 Thread Jiang Liu
Use helper functions to simplify _DSM related code in acpi_extlog driver.
Also mark initialization data and functions with __init and __initdata
to reduce memory footprint.

Signed-off-by: Jiang Liu 
---
 drivers/acpi/acpi_extlog.c |   61 +---
 1 file changed, 12 insertions(+), 49 deletions(-)

diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index a6869e1..928c4db 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -20,11 +20,9 @@
 #define EXT_ELOG_ENTRY_MASK GENMASK_ULL(51, 0) /* elog entry address mask */
 
 #define EXTLOG_DSM_REV 0x0
-#define EXTLOG_FN_QUERY 0x0
 #define EXTLOG_FN_ADDR  0x1
 
 #define FLAG_OS_OPTIN  BIT(0)
-#define EXTLOG_QUERY_L1_EXIST  BIT(1)
 #define ELOG_ENTRY_VALID   (1ULL<<63)
 #define ELOG_ENTRY_LEN 0x1000
 
@@ -43,7 +41,7 @@ struct extlog_l1_head {
u8  rev1[12];
 };
 
-static u8 extlog_dsm_uuid[] = "663E35AF-CC10-41A4-88EA-5470AF055295";
+static u8 extlog_dsm_uuid[] __initdata = "663E35AF-CC10-41A4-88EA-5470AF055295";
 
 /* L1 table related physical address */
 static u64 elog_base;
@@ -153,62 +151,27 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
return NOTIFY_DONE;
 }
 
-static int extlog_get_dsm(acpi_handle handle, int rev, int func, u64 *ret)
+static bool __init extlog_get_l1addr(void)
 {
-   struct acpi_buffer buf = {ACPI_ALLOCATE_BUFFER, NULL};
-   struct acpi_object_list input;
-   union acpi_object params[4], *obj;
u8 uuid[16];
-   int i;
+   acpi_handle handle;
+   union acpi_object *obj;
 
acpi_str_to_uuid(extlog_dsm_uuid, uuid);
-   input.count = 4;
-   input.pointer = params;
-   params[0].type = ACPI_TYPE_BUFFER;
-   params[0].buffer.length = 16;
-   params[0].buffer.pointer = uuid;
-   params[1].type = ACPI_TYPE_INTEGER;
-   params[1].integer.value = rev;
-   params[2].type = ACPI_TYPE_INTEGER;
-   params[2].integer.value = func;
-   params[3].type = ACPI_TYPE_PACKAGE;
-   params[3].package.count = 0;
-   params[3].package.elements = NULL;
-
-   if (ACPI_FAILURE(acpi_evaluate_object(handle, "_DSM", &input, &buf)))
-   return -1;
-
-   *ret = 0;
-   obj = (union acpi_object *)buf.pointer;
-   if (obj->type == ACPI_TYPE_INTEGER) {
-   *ret = obj->integer.value;
-   } else if (obj->type == ACPI_TYPE_BUFFER) {
-   if (obj->buffer.length <= 8) {
-   for (i = 0; i < obj->buffer.length; i++)
-   *ret |= (obj->buffer.pointer[i] << (i * 8));
-   }
-   }
-   kfree(buf.pointer);
-
-   return 0;
-}
-
-static bool extlog_get_l1addr(void)
-{
-   acpi_handle handle;
-   u64 ret;
 
if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
return false;
-
-   if (extlog_get_dsm(handle, EXTLOG_DSM_REV, EXTLOG_FN_QUERY, &ret) ||
-   !(ret & EXTLOG_QUERY_L1_EXIST))
+   if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR))
return false;
-
-   if (extlog_get_dsm(handle, EXTLOG_DSM_REV, EXTLOG_FN_ADDR, &ret))
+   obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV,
+ EXTLOG_FN_ADDR, NULL, ACPI_TYPE_INTEGER);
+   if (!obj) {
return false;
+   } else {
+   l1_dirbase = obj->integer.value;
+   ACPI_FREE(obj);
+   }
 
-   l1_dirbase = ret;
/* Spec says L1 directory must be 4K aligned, bail out if it isn't */
if (l1_dirbase & ((1 << 12) - 1)) {
pr_warn(FW_BUG "L1 Directory is invalid at physical %llx\n",
-- 
1.7.10.4
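
The alignment test at the end of extlog_get_l1addr() masks the low 12 bits of the address. A minimal userspace sketch of that check follows; the helper name is_4k_aligned() is hypothetical and is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: true when addr is 4 KiB aligned. */
bool is_4k_aligned(uint64_t addr)
{
	/* (1 << 12) - 1 == 0xfff selects the low 12 bits of the address. */
	return (addr & ((1ULL << 12) - 1)) == 0;
}
```

An address such as 0x1008 has low bits set and would take the FW_BUG warning path in the driver.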



Re: [PATCH v4 0/2] mfd: rtsx: decrease driver size and add new device

2013-12-17 Thread Dan Carpenter
On Wed, Dec 18, 2013 at 10:03:11AM +0800, micky_ch...@realsil.com.cn wrote:
> From: Micky Ching 
> 
> we add a macro to simplify setting pull control, and use a common init
> function to init the common params for 8411-like chips. at last we add
> support for rtl8402 chip.
> 
> Micky Ching (2):
>   mfd: rtsx: add set pull control macro and simplify rtl8411
>   mfd: rtsx: add support for card reader rtl8402
> 
>  drivers/mfd/rtl8411.c  |   95 ++--
>  drivers/mfd/rtsx_pcr.c |5 +++
>  drivers/mfd/rtsx_pcr.h |9 +
>  3 files changed, 66 insertions(+), 43 deletions(-)
> 

Great.  Looks good to me.

Reviewed-by: Dan Carpenter 

regards,
dan carpenter



[RFC Patch v1 01/13] ACPI: introduce helper interfaces to support ACPI _DSM method

2013-12-17 Thread Jiang Liu
Several drivers use the ACPI _DSM method to detect and invoke
device-specific functions, and each of them currently carries its own
private implementation. Introduce three helper functions,
acpi_evaluate_dsm(), acpi_check_dsm() and acpi_evaluate_dsm_typed(),
to replace the open-coded versions.

This simplifies the code and improves readability.

Signed-off-by: Jiang Liu 
---
 drivers/acpi/utils.c|   98 +++
 include/acpi/acpi_bus.h |   26 +
 2 files changed, 124 insertions(+)

diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
index 6d408bf..9517b0a 100644
--- a/drivers/acpi/utils.c
+++ b/drivers/acpi/utils.c
@@ -574,3 +574,101 @@ acpi_status acpi_evaluate_lck(acpi_handle handle, int lock)
 
return status;
 }
+
+/**
+ * acpi_evaluate_dsm: evaluate device's _DSM method
+ * @handle: ACPI device handle
+ * @uuid: UUID of requested functions, should be 16 bytes
+ * @rev: revision number of requested function
+ * @func: requested function number
+ * @argv4: the function specific parameter
+ *
+ * Evaluate device's _DSM method with specified UUID, revision id and
+ * function number. Caller needs to free the returned object.
+ *
+ * Though ACPI defines the fourth parameter for _DSM should be a package,
+ * some old BIOSes do expect a buffer or an integer etc.
+ */
+union acpi_object *
+acpi_evaluate_dsm(acpi_handle handle, const u8 *uuid, int rev, int func,
+ union acpi_object *argv4)
+{
+   acpi_status ret;
+   struct acpi_buffer buf = {ACPI_ALLOCATE_BUFFER, NULL};
+   union acpi_object params[4];
+   struct acpi_object_list input = {
+   .count = 4,
+   .pointer = params,
+   };
+
+   params[0].type = ACPI_TYPE_BUFFER;
+   params[0].buffer.length = 16;
+   params[0].buffer.pointer = (char *)uuid;
+   params[1].type = ACPI_TYPE_INTEGER;
+   params[1].integer.value = rev;
+   params[2].type = ACPI_TYPE_INTEGER;
+   params[2].integer.value = func;
+   if (argv4) {
+   params[3] = *argv4;
+   } else {
+   params[3].type = ACPI_TYPE_PACKAGE;
+   params[3].package.count = 0;
+   params[3].package.elements = NULL;
+   }
+
+   ret = acpi_evaluate_object(handle, "_DSM", &input, &buf);
+   if (ACPI_SUCCESS(ret))
+   return (union acpi_object *)buf.pointer;
+
+   if (ret != AE_NOT_FOUND)
+   acpi_handle_warn(handle,
+   "failed to evaluate _DSM method (0x%x)\n", ret);
+
+   return NULL;
+}
+EXPORT_SYMBOL(acpi_evaluate_dsm);
+
+/**
+ * acpi_check_dsm: check whether _DSM method under @handle supports
+ *requested functions.
+ * @handle: ACPI device handle
+ * @uuid: UUID of requested functions, should be 16 bytes at least
+ * @rev: revision number of requested functions
+ * @funcs: bitmap of requested functions
+ * @exclude: excluding special value, used to support i915 and nouveau
+ *
+ * Evaluate device's _DSM method to check whether it supports requested
+ * functions. Currently only support 64 functions at maximum, should be
+ * enough for now.
+ */
+bool acpi_check_dsm(acpi_handle handle, const u8 *uuid, int rev, u64 funcs)
+{
+   int i;
+   u64 mask = 0;
+   union acpi_object *obj;
+
+   if (funcs == 0)
+   return false;
+
+   obj = acpi_evaluate_dsm(handle, uuid, rev, 0, NULL);
+   if (!obj)
+   return false;
+
+   /* For compatibility, old BIOSes may return an integer */
+   if (obj->type == ACPI_TYPE_INTEGER)
+   mask = obj->integer.value;
+   else if (obj->type == ACPI_TYPE_BUFFER)
+   for (i = 0; i < obj->buffer.length && i < 8; i++)
+   mask |= (((u64)(u8)obj->buffer.pointer[i]) << (i * 8));
+   ACPI_FREE(obj);
+
+   /*
+* Bit 0 indicates whether there's support for any functions other than
+* function 0 for the specified UUID and revision.
+*/
+   if ((mask & 0x1) && (mask & funcs) == funcs)
+   return true;
+
+   return false;
+}
+EXPORT_SYMBOL(acpi_check_dsm);
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index c602c77..efccf39 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -66,6 +66,32 @@ bool acpi_ata_match(acpi_handle handle);
 bool acpi_bay_match(acpi_handle handle);
 bool acpi_dock_match(acpi_handle handle);
 
+bool acpi_check_dsm(acpi_handle handle, const u8 *uuid, int rev, u64 funcs);
+union acpi_object *acpi_evaluate_dsm(acpi_handle handle, const u8 *uuid,
+   int rev, int func, union acpi_object *argv4);
+
+static inline union acpi_object *
+acpi_evaluate_dsm_typed(acpi_handle handle, const u8 *uuid, int rev, int func,
+   union acpi_object *argv4, acpi_object_type type)
+{
+   union acpi_object *obj;
+
+   obj = acpi_evaluate_dsm(handle, uuid, 
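
The support check in acpi_check_dsm() can be tried out in userspace. The following is only a sketch under stated assumptions: the names dsm_mask_from_bytes() and dsm_supports() are hypothetical, no ACPI call is involved, and each byte is widened to 64 bits before the shift so that bytes 4 to 7 land in the upper half of the mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: assemble up to eight little-endian bytes, as a
 * _DSM function 0 buffer would carry them, into a 64-bit mask.
 */
uint64_t dsm_mask_from_bytes(const uint8_t *bytes, size_t len)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < len && i < 8; i++)
		mask |= (uint64_t)bytes[i] << (i * 8);
	return mask;
}

/*
 * Hypothetical helper mirroring the final test in acpi_check_dsm():
 * bit 0 must be set and every requested function bit must be present.
 */
bool dsm_supports(uint64_t mask, uint64_t funcs)
{
	if (funcs == 0)
		return false;
	return (mask & 0x1) && (mask & funcs) == funcs;
}
```

A caller interested in function 1 would pass 1 << 1 as funcs, which is how the extlog conversion in this series calls acpi_check_dsm().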

[RFC Patch v1 03/13] PCI, pci-label: release allocated ACPI object on error recovery path

2013-12-17 Thread Jiang Liu
Function dsm_get_label() leaks the returned ACPI object if
obj->package.count is not 2, so fix the possible memory leak.

Signed-off-by: Jiang Liu 
---
 drivers/pci/pci-label.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
index d51f45a..f6e01a5 100644
--- a/drivers/pci/pci-label.c
+++ b/drivers/pci/pci-label.c
@@ -233,11 +233,7 @@ dsm_get_label(acpi_handle handle, int func,
return -1;
 
obj = (union acpi_object *)output->pointer;
-
-   switch (obj->type) {
-   case ACPI_TYPE_PACKAGE:
-   if (obj->package.count != 2)
-   break;
+   if (obj->type == ACPI_TYPE_PACKAGE && obj->package.count == 2) {
len = obj->package.elements[0].integer.value;
if (buf) {
if (attribute == ACPI_ATTR_INDEX_SHOW)
@@ -250,10 +246,10 @@ dsm_get_label(acpi_handle handle, int func,
}
kfree(output->pointer);
return len;
-   break;
-   default:
-   kfree(output->pointer);
}
+
+   kfree(output->pointer);
+
return -1;
 }
 
-- 
1.7.10.4
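
The shape of this fix can be shown outside the kernel. This is a minimal sketch, assuming a hypothetical fake_package type and a live_objects counter in place of the ACPI object and kfree(); unlike the patch, it frees in a single place that both the success and the mismatch path reach.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the ACPI package returned by _DSM. */
struct fake_package {
	int count;
	int first_value;
};

/* Counts outstanding allocations so a leak would be visible. */
int live_objects;

struct fake_package *fake_eval(int count, int first_value)
{
	struct fake_package *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->count = count;
	p->first_value = first_value;
	live_objects++;
	return p;
}

void fake_free(struct fake_package *p)
{
	free(p);
	live_objects--;
}

/* Returns the first element of a two-element package, -1 otherwise. */
int get_label_len(int count, int first_value)
{
	struct fake_package *p = fake_eval(count, first_value);
	int len = -1;

	if (!p)
		return -1;
	if (p->count == 2)
		len = p->first_value;
	fake_free(p);	/* reached whether or not the count matched */
	return len;
}
```

With the pre-patch control flow, the count mismatch case would return without the free and live_objects would stay at 1.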



[PATCH v3 06/14] mm, hugetlb: remove vma_has_reserves()

2013-12-17 Thread Joonsoo Kim
vma_has_reserves() can be replaced by the return value of
vma_needs_reservation(). If the chg returned by vma_needs_reservation()
is 0, the vma has reserves. Otherwise, the vma does not have reserves
and needs a hugepage from outside the reserve pool. This is exactly what
vma_has_reserves() reports, so remove vma_has_reserves().

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f394454..9d456d4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -469,39 +469,6 @@ void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
vma->vm_private_data = (void *)0;
 }
 
-/* Returns true if the VMA has associated reserve pages */
-static int vma_has_reserves(struct vm_area_struct *vma, long chg)
-{
-   if (vma->vm_flags & VM_NORESERVE) {
-   /*
-* This address is already reserved by other process(chg == 0),
-* so, we should decrement reserved count. Without decrementing,
-* reserve count remains after releasing inode, because this
-* allocated page will go into page cache and is regarded as
-* coming from reserved pool in releasing step.  Currently, we
-* don't have any other solution to deal with this situation
-* properly, so add work-around here.
-*/
-   if (vma->vm_flags & VM_MAYSHARE && chg == 0)
-   return 1;
-   else
-   return 0;
-   }
-
-   /* Shared mappings always use reserves */
-   if (vma->vm_flags & VM_MAYSHARE)
-   return 1;
-
-   /*
-* Only the process that called mmap() has reserves for
-* private mappings.
-*/
-   if (is_vma_resv_set(vma, HPAGE_RESV_OWNER))
-   return 1;
-
-   return 0;
-}
-
 static void enqueue_huge_page(struct hstate *h, struct page *page)
 {
int nid = page_to_nid(page);
@@ -555,10 +522,11 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
/*
 * A child process with MAP_PRIVATE mappings created by their parent
 * have no page reserves. This check ensures that reservations are
-* not "stolen". The child may still get SIGKILLed
+* not "stolen". The child may still get SIGKILLed.
+* chg represents whether current user has a reserved hugepages or not,
+* so that we can use it to ensure that reservations are not "stolen".
 */
-   if (!vma_has_reserves(vma, chg) &&
-   h->free_huge_pages - h->resv_huge_pages == 0)
+   if (chg && h->free_huge_pages - h->resv_huge_pages == 0)
goto err;
 
/* If reserves cannot be used, ensure enough pages are in the pool */
@@ -577,7 +545,11 @@ retry_cpuset:
if (page) {
if (avoid_reserve)
break;
-   if (!vma_has_reserves(vma, chg))
+   /*
+* chg means whether current user allocates
+* a hugepage on the reserved pool or not
+*/
+   if (chg)
break;
 
SetPagePrivate(page);
-- 
1.7.9.5



[PATCH v3 00/14] mm, hugetlb: remove a hugetlb_instantiation_mutex

2013-12-17 Thread Joonsoo Kim
* NOTE for v3
- This update is late because of other work, not because of any problem
with this patchset.

- While reviewing v2, David Gibson, who tried to remove this mutex a long
time ago, suggested that the race between concurrent calls to
alloc_buddy_huge_page() in alloc_huge_page() should also be prevented[2],
since the *new* hugepage it returns is also a contended page for the last
allocation. I think that is not useful: if an application's success depends
on a *new* hugepage from alloc_buddy_huge_page() rather than a *reserved*
page, its successful run cannot be guaranteed every time anyway. So I did
not implement it. Apart from this, there are no open issues with this
patchset.

* Changes in v3 (No big difference)
- Slightly modify cover-letter since Part 1. is already merged.
- On patches 1-12, add Reviewed-by from "Aneesh Kumar K.V".
- Patches 1-12 and 14 are just rebased onto v3.13-rc4.
- Patch 13 is changed as follows.
add comment on alloc_huge_page()
add in-flight user handling in alloc_huge_page_noerr()
minor code position changes (Suggested by David)

* Changes in v2
- Re-order patches to make their relationship clear
- sleepable object allocation(kmalloc) without holding a spinlock
(Pointed out by Hillf)
- Remove vma_has_reserves, using vma_needs_reservation instead.
(Suggested by Aneesh and Naoya)
- Change the way of returning a hugepage back to the reserved pool
(Suggested by Naoya)

Without the hugetlb_instantiation_mutex, parallel faults can make
hugepage allocation fail, because many threads dequeue a hugepage
to handle a fault on the same address. This causes a brief shortage in
the reserved pool, and a faulting thread then gets a SIGBUS signal even
though there are enough hugepages.

We already have a nice solution to this problem: the
hugetlb_instantiation_mutex. It blocks other threads from entering
the fault handler. This solves the problem cleanly, but it degrades
performance because it serializes all fault handling.

Now, I try to remove the hugetlb_instantiation_mutex to get rid of the
performance problem reported by Davidlohr Bueso [1].

This patchset consists of roughly 4 parts.

Part 1. (Merged) Random fixes and clean-ups to enhance error handling.
These are already merged to mainline.

Part 2. (1-3) Introduce a new protection method for the region tracking
data structure, instead of the hugetlb_instantiation_mutex. There
is a race condition when we map the hugetlbfs file into two different
processes. To prevent it, we need a new protection method such as the
one in this patchset.

This can be merged into mainline separately.

Part 3. (4-7) Clean-up.

IMO, these make the code really simple, so they are worth merging into
mainline separately.

Part 4. (8-14) Remove the hugetlb_instantiation_mutex.

Most patches are just clean-ups of the error handling path.
Patch 13 implements a retry approach: if a faulting thread fails to
allocate a hugepage, it keeps re-running the fault handler until no
concurrent thread holds a hugepage. This serializes the threads that
want the last hugepage, so threads don't get a SIGBUS if enough
hugepages exist.
Patch 14 removes the hugetlb_instantiation_mutex.

These patches are based on v3.13-rc4.

With these applied, I passed the libhugetlbfs test suite cleanly; it
has allocation-instantiation race test cases.

If there is something I should consider, please let me know!
Thanks.

[1] http://lwn.net/Articles/558863/ 
"[PATCH] mm/hugetlb: per-vma instantiation mutexes"
[2] https://lkml.org/lkml/2013/9/4/630

Joonsoo Kim (14):
  mm, hugetlb: unify region structure handling
  mm, hugetlb: region manipulation functions take resv_map rather
list_head
  mm, hugetlb: protect region tracking via newly introduced resv_map
lock
  mm, hugetlb: remove resv_map_put()
  mm, hugetlb: make vma_resv_map() works for all mapping type
  mm, hugetlb: remove vma_has_reserves()
  mm, hugetlb: mm, hugetlb: unify chg and avoid_reserve to use_reserve
  mm, hugetlb: call vma_needs_reservation before entering
alloc_huge_page()
  mm, hugetlb: remove a check for return value of alloc_huge_page()
  mm, hugetlb: move down outside_reserve check
  mm, hugetlb: move up anon_vma_prepare()
  mm, hugetlb: clean-up error handling in hugetlb_cow()
  mm, hugetlb: retry if failed to allocate and there is concurrent user
  mm, hugetlb: remove a hugetlb_instantiation_mutex

 fs/hugetlbfs/inode.c|   17 +-
 include/linux/hugetlb.h |   11 ++
 mm/hugetlb.c|  401 +--
 3 files changed, 241 insertions(+), 188 deletions(-)

-- 
1.7.9.5
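
The retry rule described for patch 13 comes down to one decision on allocation failure. The sketch below is an illustration only: the function name and the FAULT_* values are hypothetical stand-ins (the kernel's VM_FAULT_* constants are not used), and nr_dequeue_users is taken to be the count of threads currently holding a dequeued hugepage.

```c
/* Stand-in values; the kernel's real VM_FAULT_* constants are not used. */
enum { FAULT_RETRY_LATER = 0, FAULT_SIGBUS = 2 };

/*
 * Hypothetical helper: decide what a fault handler reports after it
 * failed to get a hugepage. While another thread still holds a dequeued
 * page that may return to the pool, report 0 so the fault is taken again.
 */
int result_on_alloc_failure(unsigned long nr_dequeue_users)
{
	if (nr_dequeue_users > 0)
		return FAULT_RETRY_LATER;
	return FAULT_SIGBUS;
}
```

Returning 0 without installing a page means the faulting access is simply repeated, so SIGBUS is only delivered once no concurrent user remains.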


[PATCH v3 07/14] mm, hugetlb: mm, hugetlb: unify chg and avoid_reserve to use_reserve

2013-12-17 Thread Joonsoo Kim
Currently, we have two variables that represent whether we can use a
reserved page or not: chg and avoid_reserve. Aggregating them gives
cleaner code. This makes no functional difference.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d456d4..9927407 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -508,8 +508,7 @@ static inline gfp_t htlb_alloc_mask(struct hstate *h)
 
 static struct page *dequeue_huge_page_vma(struct hstate *h,
struct vm_area_struct *vma,
-   unsigned long address, int avoid_reserve,
-   long chg)
+   unsigned long address, bool use_reserve)
 {
struct page *page = NULL;
struct mempolicy *mpol;
@@ -523,14 +522,10 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 * A child process with MAP_PRIVATE mappings created by their parent
 * have no page reserves. This check ensures that reservations are
 * not "stolen". The child may still get SIGKILLed.
-* chg represents whether current user has a reserved hugepages or not,
-* so that we can use it to ensure that reservations are not "stolen".
+* Or, when parent process do COW, we cannot use reserved page.
+* In this case, ensure enough pages are in the pool.
 */
-   if (chg && h->free_huge_pages - h->resv_huge_pages == 0)
-   goto err;
-
-   /* If reserves cannot be used, ensure enough pages are in the pool */
-   if (avoid_reserve && h->free_huge_pages - h->resv_huge_pages == 0)
+   if (!use_reserve && h->free_huge_pages - h->resv_huge_pages == 0)
goto err;
 
 retry_cpuset:
@@ -543,13 +538,7 @@ retry_cpuset:
if (cpuset_zone_allowed_softwall(zone, htlb_alloc_mask(h))) {
page = dequeue_huge_page_node(h, zone_to_nid(zone));
if (page) {
-   if (avoid_reserve)
-   break;
-   /*
-* chg means whether current user allocates
-* a hugepage on the reserved pool or not
-*/
-   if (chg)
+   if (!use_reserve)
break;
 
SetPagePrivate(page);
@@ -1194,6 +1183,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
struct hstate *h = hstate_vma(vma);
struct page *page;
long chg;
+   bool use_reserve;
int ret, idx;
struct hugetlb_cgroup *h_cg;
 
@@ -1209,18 +1199,19 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
chg = vma_needs_reservation(h, vma, addr);
if (chg < 0)
return ERR_PTR(-ENOMEM);
-   if (chg || avoid_reserve)
+   use_reserve = (!chg && !avoid_reserve);
+   if (!use_reserve)
if (hugepage_subpool_get_pages(spool, 1))
return ERR_PTR(-ENOSPC);
 
ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
if (ret) {
-   if (chg || avoid_reserve)
+   if (!use_reserve)
hugepage_subpool_put_pages(spool, 1);
return ERR_PTR(-ENOSPC);
}
spin_lock(&hugetlb_lock);
-   page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve, chg);
+   page = dequeue_huge_page_vma(h, vma, addr, use_reserve);
if (!page) {
spin_unlock(&hugetlb_lock);
page = alloc_buddy_huge_page(h, NUMA_NO_NODE);
@@ -1228,7 +1219,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
hugetlb_cgroup_uncharge_cgroup(idx,
   pages_per_huge_page(h),
   h_cg);
-   if (chg || avoid_reserve)
+   if (!use_reserve)
hugepage_subpool_put_pages(spool, 1);
return ERR_PTR(-ENOSPC);
}
-- 
1.7.9.5
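
That the aggregation makes no functional difference follows from De Morgan's law: !(chg || avoid_reserve) equals (!chg && !avoid_reserve). A small sketch, assuming chg is non-negative here (the chg < 0 error case returns earlier in alloc_huge_page()); the two function names are hypothetical.

```c
#include <stdbool.h>

/*
 * chg: result of vma_needs_reservation(); 0 means a reservation exists.
 * avoid_reserve: the caller asked not to consume the reserve.
 */
bool compute_use_reserve(long chg, bool avoid_reserve)
{
	return !chg && !avoid_reserve;
}

/* The condition tested at each call site before the patch. */
bool old_needs_subpool_page(long chg, bool avoid_reserve)
{
	return chg || avoid_reserve;
}
```

For every combination of the two inputs, the old condition is the negation of use_reserve, which is why each `if (chg || avoid_reserve)` can become `if (!use_reserve)`.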



[PATCH v3 12/14] mm, hugetlb: clean-up error handling in hugetlb_cow()

2013-12-17 Thread Joonsoo Kim
The current code repeats 'Caller expects lock to be held' in every error
path. We can clean this up by doing the error handling in one place.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1817720..a9ae7d3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2577,6 +2577,7 @@ static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
int outside_reserve = 0;
long chg;
bool use_reserve = false;
+   int ret = 0;
unsigned long mmun_start;   /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
 
@@ -2601,10 +2602,8 @@ retry_avoidcopy:
 * anon_vma prepared.
 */
if (unlikely(anon_vma_prepare(vma))) {
-   page_cache_release(old_page);
-   /* Caller expects lock to be held */
-   spin_lock(ptl);
-   return VM_FAULT_OOM;
+   ret = VM_FAULT_OOM;
+   goto out_old_page;
}
 
/*
@@ -2623,11 +2622,8 @@ retry_avoidcopy:
if (!outside_reserve) {
chg = vma_needs_reservation(h, vma, address);
if (chg < 0) {
-   page_cache_release(old_page);
-
-   /* Caller expects lock to be held */
-   spin_lock(ptl);
-   return VM_FAULT_OOM;
+   ret = VM_FAULT_OOM;
+   goto out_old_page;
}
use_reserve = !chg;
}
@@ -2661,9 +2657,8 @@ retry_avoidcopy:
WARN_ON_ONCE(1);
}
 
-   /* Caller expects lock to be held */
-   spin_lock(ptl);
-   return VM_FAULT_SIGBUS;
+   ret = VM_FAULT_SIGBUS;
+   goto out_lock;
}
 
copy_user_huge_page(new_page, old_page, address, vma,
@@ -2694,11 +2689,12 @@ retry_avoidcopy:
spin_unlock(ptl);
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
page_cache_release(new_page);
+out_old_page:
page_cache_release(old_page);
-
+out_lock:
/* Caller expects lock to be held */
spin_lock(ptl);
-   return 0;
+   return ret;
 }
 
 /* Return the pagecache page at a given address within a VMA */
-- 
1.7.9.5
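
The label layout used by this patch can be modelled in userspace. This is a sketch under stated assumptions: old_page_refs and lock_held are hypothetical stand-ins for the old_page reference and the page table lock, the FAULT_* values are not the kernel's constants, and the allocation failure branch is assumed to drop its old_page reference itself before jumping to out_lock.

```c
#include <stdbool.h>

/* Hypothetical state standing in for the old_page reference and ptl. */
int old_page_refs;
bool lock_held;

/* Stand-in values; the kernel's VM_FAULT_* constants are not used here. */
enum { FAULT_OK = 0, FAULT_OOM = 1, FAULT_SIGBUS = 2 };

int cow_sketch(bool prepare_fails, bool alloc_fails)
{
	int ret = FAULT_OK;

	/* The slow part runs with the lock dropped and one reference held. */
	old_page_refs = 1;
	lock_held = false;

	if (prepare_fails) {
		ret = FAULT_OOM;
		goto out_old_page;
	}
	if (alloc_fails) {
		old_page_refs--;	/* this branch releases old_page itself */
		ret = FAULT_SIGBUS;
		goto out_lock;
	}
	/* ... copy the data and install the new page here ... */
out_old_page:
	old_page_refs--;
out_lock:
	lock_held = true;	/* caller expects lock to be held */
	return ret;
}
```

Every path leaves through out_lock, so the lock is re-taken in exactly one place and the return value travels in ret.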



[PATCH v3 08/14] mm, hugetlb: call vma_needs_reservation before entering alloc_huge_page()

2013-12-17 Thread Joonsoo Kim
In order to validate that an allocation failure is reasonable, the caller
needs to know whether the allocation request is for a reserved page or not.
So move vma_needs_reservation() up to the callers of alloc_huge_page().
There is no functional change in this patch; the following patch uses
this information.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9927407..d960f46 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1177,13 +1177,11 @@ static void vma_commit_reservation(struct hstate *h,
 }
 
 static struct page *alloc_huge_page(struct vm_area_struct *vma,
-   unsigned long addr, int avoid_reserve)
+   unsigned long addr, int use_reserve)
 {
struct hugepage_subpool *spool = subpool_vma(vma);
struct hstate *h = hstate_vma(vma);
struct page *page;
-   long chg;
-   bool use_reserve;
int ret, idx;
struct hugetlb_cgroup *h_cg;
 
@@ -1196,10 +1194,6 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 * need pages and subpool limit allocated allocated if no reserve
 * mapping overlaps.
 */
-   chg = vma_needs_reservation(h, vma, addr);
-   if (chg < 0)
-   return ERR_PTR(-ENOMEM);
-   use_reserve = (!chg && !avoid_reserve);
if (!use_reserve)
if (hugepage_subpool_get_pages(spool, 1))
return ERR_PTR(-ENOSPC);
@@ -1244,7 +1238,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 struct page *alloc_huge_page_noerr(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve)
 {
-   struct page *page = alloc_huge_page(vma, addr, avoid_reserve);
+   struct page *page = alloc_huge_page(vma, addr, !avoid_reserve);
if (IS_ERR(page))
page = NULL;
return page;
@@ -2581,6 +2575,8 @@ static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
struct hstate *h = hstate_vma(vma);
struct page *old_page, *new_page;
int outside_reserve = 0;
+   long chg;
+   bool use_reserve;
unsigned long mmun_start;   /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
 
@@ -2612,7 +2608,17 @@ retry_avoidcopy:
 
/* Drop page table lock as buddy allocator may be called */
spin_unlock(ptl);
-   new_page = alloc_huge_page(vma, address, outside_reserve);
+   chg = vma_needs_reservation(h, vma, address);
+   if (chg < 0) {
+   page_cache_release(old_page);
+
+   /* Caller expects lock to be held */
+   spin_lock(ptl);
+   return VM_FAULT_OOM;
+   }
+   use_reserve = !chg && !outside_reserve;
+
+   new_page = alloc_huge_page(vma, address, use_reserve);
 
if (IS_ERR(new_page)) {
long err = PTR_ERR(new_page);
@@ -2742,6 +2748,8 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
struct address_space *mapping;
pte_t new_pte;
spinlock_t *ptl;
+   long chg;
+   bool use_reserve;
 
/*
 * Currently, we are forced to kill the process in the event the
@@ -2767,7 +2775,15 @@ retry:
size = i_size_read(mapping->host) >> huge_page_shift(h);
if (idx >= size)
goto out;
-   page = alloc_huge_page(vma, address, 0);
+
+   chg = vma_needs_reservation(h, vma, address);
+   if (chg == -ENOMEM) {
+   ret = VM_FAULT_OOM;
+   goto out;
+   }
+   use_reserve = !chg;
+
+   page = alloc_huge_page(vma, address, use_reserve);
if (IS_ERR(page)) {
ret = PTR_ERR(page);
if (ret == -ENOMEM)
-- 
1.7.9.5



[PATCH v3 11/14] mm, hugetlb: move up anon_vma_prepare()

2013-12-17 Thread Joonsoo Kim
If we fail after a hugepage has been allocated, recovering properly takes
some effort. So it is better to avoid allocating a hugepage until the
steps that can fail have been done. Move up anon_vma_prepare(), which can
fail in an OOM situation.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 03ab285..1817720 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2597,6 +2597,17 @@ retry_avoidcopy:
spin_unlock(ptl);
 
/*
+* When the original hugepage is shared one, it does not have
+* anon_vma prepared.
+*/
+   if (unlikely(anon_vma_prepare(vma))) {
+   page_cache_release(old_page);
+   /* Caller expects lock to be held */
+   spin_lock(ptl);
+   return VM_FAULT_OOM;
+   }
+
+   /*
 * If the process that created a MAP_PRIVATE mapping is about to
 * perform a COW due to a shared page count, attempt to satisfy
 * the allocation without using the existing reserves. The pagecache
@@ -2655,18 +2666,6 @@ retry_avoidcopy:
return VM_FAULT_SIGBUS;
}
 
-   /*
-* When the original hugepage is shared one, it does not have
-* anon_vma prepared.
-*/
-   if (unlikely(anon_vma_prepare(vma))) {
-   page_cache_release(new_page);
-   page_cache_release(old_page);
-   /* Caller expects lock to be held */
-   spin_lock(ptl);
-   return VM_FAULT_OOM;
-   }
-
copy_user_huge_page(new_page, old_page, address, vma,
pages_per_huge_page(h));
__SetPageUptodate(new_page);
-- 
1.7.9.5



[PATCH v3 13/14] mm, hugetlb: retry if failed to allocate and there is concurrent user

2013-12-17 Thread Joonsoo Kim
If parallel faults occur, we can fail to allocate a hugepage, because
many threads dequeue a hugepage to handle a fault on the same address.
This leaves the reserved pool short for a little while, and a faulting
thread that could otherwise get a hugepage receives a SIGBUS signal.

We already have a nice solution to this problem: the
hugetlb_instantiation_mutex. It blocks other threads from entering the
fault handler. This solves the problem cleanly, but it degrades
performance because it serializes all fault handling.

Now I try to remove the hugetlb_instantiation_mutex to get rid of this
performance degradation. To achieve that, we must first ensure that
no one gets a SIGBUS if there are enough hugepages.

For this purpose, if we fail to allocate a new hugepage while there is a
concurrent user, we just return 0 instead of VM_FAULT_SIGBUS. With this,
these threads defer getting a SIGBUS signal until there is no concurrent
user, so we can ensure that no one gets a SIGBUS if there are enough
hugepages.

Signed-off-by: Joonsoo Kim 
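
The retry rule described above can be sketched in userspace as follows.
This is a simplified, single-threaded model under stated assumptions: the
struct pool, the helper names and the enum values are hypothetical, and
FAULT_RETRY stands in for "return 0 so the fault is retried". In the patch
the counter is read and updated under hugetlb_lock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes; FAULT_RETRY plays the role of "return 0". */
enum fault_result { FAULT_OK, FAULT_RETRY, FAULT_SIGBUS };

/* Simplified pool; in the kernel these fields are protected by a lock. */
struct pool {
	unsigned long free_pages;
	unsigned long nr_dequeue_users;	/* dequeued, outcome not yet known */
};

static bool pool_dequeue(struct pool *p)
{
	if (!p->free_pages)
		return false;
	p->free_pages--;
	p->nr_dequeue_users++;
	return true;
}

/* The dequeued page was really used: this user is done. */
static void pool_commit(struct pool *p)
{
	p->nr_dequeue_users--;
}

/* The dequeued page turned out to be unnecessary: give it back. */
static void pool_give_back(struct pool *p)
{
	p->free_pages++;
	p->nr_dequeue_users--;
}

static enum fault_result handle_fault(struct pool *p)
{
	/* Sample the counter before trying to dequeue, as the patch does. */
	unsigned long users = p->nr_dequeue_users;

	if (pool_dequeue(p))
		return FAULT_OK;	/* caller commits or gives back later */
	/* A concurrent user may still return its page: retry, don't kill. */
	return users ? FAULT_RETRY : FAULT_SIGBUS;
}
```

A second fault that finds the pool empty while the first user is still
undecided gets FAULT_RETRY; only a fault that finds the pool empty with no
pending users gets FAULT_SIGBUS.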

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ee304d1..daca347 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -255,6 +255,7 @@ struct hstate {
int next_nid_to_free;
unsigned int order;
unsigned long mask;
+   unsigned long nr_dequeue_users;
unsigned long max_huge_pages;
unsigned long nr_huge_pages;
unsigned long free_huge_pages;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a9ae7d3..843c554 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -538,6 +538,7 @@ retry_cpuset:
if (cpuset_zone_allowed_softwall(zone, htlb_alloc_mask(h))) {
page = dequeue_huge_page_node(h, zone_to_nid(zone));
if (page) {
+   h->nr_dequeue_users++;
if (!use_reserve)
break;
 
@@ -557,6 +558,15 @@ err:
return NULL;
 }
 
+static void commit_dequeued_huge_page(struct vm_area_struct *vma)
+{
+   struct hstate *h = hstate_vma(vma);
+
+   spin_lock(&hugetlb_lock);
+   h->nr_dequeue_users--;
+   spin_unlock(&hugetlb_lock);
+}
+
 static void update_and_free_page(struct hstate *h, struct page *page)
 {
int i;
@@ -1176,8 +1186,18 @@ static void vma_commit_reservation(struct hstate *h,
region_add(resv, idx, idx + 1);
 }
 
+/*
+ * alloc_huge_page() calls dequeue_huge_page_vma() and it would increase
+ * hstate's nr_dequeue_users if it gets a page from the queue. This
+ * nr_dequeue_users is used to prevent concurrent users who can get a page on
+ * the queue from being killed by SIGBUS. After determining if we actually use
+ * it or not, we should notify that we are done to hstate by calling
+ * commit_dequeued_huge_page().
+ */
 static struct page *alloc_huge_page(struct vm_area_struct *vma,
-   unsigned long addr, int use_reserve)
+   unsigned long addr, int use_reserve,
+   unsigned long *nr_dequeue_users,
+   bool *do_dequeue)
 {
struct hugepage_subpool *spool = subpool_vma(vma);
struct hstate *h = hstate_vma(vma);
@@ -1205,8 +1225,11 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
return ERR_PTR(-ENOSPC);
}
spin_lock(&hugetlb_lock);
+   *do_dequeue = true;
+   *nr_dequeue_users = h->nr_dequeue_users;
page = dequeue_huge_page_vma(h, vma, addr, use_reserve);
if (!page) {
+   *do_dequeue = false;
spin_unlock(&hugetlb_lock);
page = alloc_buddy_huge_page(h, NUMA_NO_NODE);
if (!page) {
@@ -1238,9 +1261,16 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 struct page *alloc_huge_page_noerr(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve)
 {
-   struct page *page = alloc_huge_page(vma, addr, !avoid_reserve);
+   struct page *page;
+   unsigned long nr_dequeue_users;
+   bool do_dequeue = false;
+
+   page = alloc_huge_page(vma, addr, !avoid_reserve,
+   &nr_dequeue_users, &do_dequeue);
if (IS_ERR(page))
page = NULL;
+   else if (do_dequeue)
+   commit_dequeued_huge_page(vma);
return page;
 }
 
@@ -1975,6 +2005,7 @@ void __init hugetlb_add_hstate(unsigned order)
h->mask = ~((1ULL << (order + PAGE_SHIFT)) - 1);
h->nr_huge_pages = 0;
h->free_huge_pages = 0;
+   h->nr_dequeue_users = 0;
for (i = 0; i < MAX_NUMNODES; ++i)
INIT_LIST_HEAD(&h->hugepage_freelists[i]);
INIT_LIST_HEAD(&h->hugepage_activelist);
@@ -2577,6 +2608,8 @@ static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
int outside_reserve = 0;
long chg;
bool 

[PATCH v3 14/14] mm, hugetlb: remove a hugetlb_instantiation_mutex

2013-12-17 Thread Joonsoo Kim
Now we have the infrastructure to remove this awkward mutex, which
serializes all faulting tasks, so remove it.

Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 843c554..6edf423 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2595,9 +2595,7 @@ static int unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
 
 /*
  * Hugetlb_cow() should be called with page lock of the original hugepage held.
- * Called with hugetlb_instantiation_mutex held and pte_page locked so we
- * cannot race with other handlers or page migration.
- * Keep the pte_same checks anyway to make transition from the mutex easier.
+ * Called with pte_page locked so we cannot race with page migration.
  */
 static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, pte_t *ptep, pte_t pte,
@@ -2941,7 +2939,6 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
int ret;
struct page *page = NULL;
struct page *pagecache_page = NULL;
-   static DEFINE_MUTEX(hugetlb_instantiation_mutex);
struct hstate *h = hstate_vma(vma);
 
address &= huge_page_mask(h);
@@ -2961,17 +2958,9 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
if (!ptep)
return VM_FAULT_OOM;
 
-   /*
-* Serialize hugepage allocation and instantiation, so that we don't
-* get spurious allocation failures if two CPUs race to instantiate
-* the same page in the page cache.
-*/
-   mutex_lock(&hugetlb_instantiation_mutex);
entry = huge_ptep_get(ptep);
-   if (huge_pte_none(entry)) {
-   ret = hugetlb_no_page(mm, vma, address, ptep, flags);
-   goto out_mutex;
-   }
+   if (huge_pte_none(entry))
+   return hugetlb_no_page(mm, vma, address, ptep, flags);
 
ret = 0;
 
@@ -2984,10 +2973,8 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 * consumed.
 */
if ((flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) {
-   if (vma_needs_reservation(h, vma, address) < 0) {
-   ret = VM_FAULT_OOM;
-   goto out_mutex;
-   }
+   if (vma_needs_reservation(h, vma, address) < 0)
+   return VM_FAULT_OOM;
 
if (!(vma->vm_flags & VM_MAYSHARE))
pagecache_page = hugetlbfs_pagecache_page(h,
@@ -3037,9 +3024,6 @@ out_ptl:
unlock_page(page);
put_page(page);
 
-out_mutex:
-   mutex_unlock(&hugetlb_instantiation_mutex);
-
return ret;
 }
 
-- 
1.7.9.5



[PATCH v3 01/14] mm, hugetlb: unify region structure handling

2013-12-17 Thread Joonsoo Kim
Currently, we track reserved and allocated regions in two different
ways for MAP_SHARED and MAP_PRIVATE. For MAP_SHARED, we use the
address_space's private_list and, for MAP_PRIVATE, we use a resv_map.
We are now preparing to change the coarse-grained lock that protects
the region structure into a fine-grained lock, and this difference
hinders that. So, before changing the lock, unify the region structure
handling.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 
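
The diff below relies on the kref pattern: the reference count is embedded
in the object, and the release callback recovers the enclosing object with
container_of(). A minimal userspace sketch of that pattern, assuming a
hypothetical non-atomic struct ref in place of the kernel's struct kref
(the real kref is atomic), and hypothetical names throughout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Non-atomic stand-in for struct kref; illustration only. */
struct ref {
	int count;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical object with an embedded reference count. */
struct resv_map_sketch {
	struct ref refs;
	int nr_regions;
};

static int released;	/* counts release calls, for the demo only */

/* Release callback: recover the enclosing object and free it. */
static void map_release(struct ref *r)
{
	struct resv_map_sketch *m =
		container_of(r, struct resv_map_sketch, refs);

	free(m);
	released++;
}

static void ref_get(struct ref *r)
{
	r->count++;
}

/* Returns 1 if this put dropped the last reference and released. */
static int ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

static struct resv_map_sketch *map_alloc(void)
{
	struct resv_map_sketch *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	m->refs.count = 1;	/* the creator holds the first reference */
	m->nr_regions = 0;
	return m;
}
```

The object is freed only when the last holder drops its reference, which
is why the patch pairs resv_map_alloc() with a put on the error path.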

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d19b30a..2040275 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -366,7 +366,13 @@ static void truncate_hugepages(struct inode *inode, loff_t lstart)
 
 static void hugetlbfs_evict_inode(struct inode *inode)
 {
+   struct resv_map *resv_map;
+
truncate_hugepages(inode, 0);
+   resv_map = (struct resv_map *)inode->i_mapping->private_data;
+   /* root inode doesn't have the resv_map, so we should check it */
+   if (resv_map)
+   resv_map_release(&resv_map->refs);
clear_inode(inode);
 }
 
@@ -476,6 +482,11 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
umode_t mode, dev_t dev)
 {
struct inode *inode;
+   struct resv_map *resv_map;
+
+   resv_map = resv_map_alloc();
+   if (!resv_map)
+   return NULL;
 
inode = new_inode(sb);
if (inode) {
@@ -487,7 +498,7 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
inode->i_mapping->a_ops = &hugetlbfs_aops;
inode->i_mapping->backing_dev_info =&hugetlbfs_backing_dev_info;
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
-   INIT_LIST_HEAD(&inode->i_mapping->private_list);
+   inode->i_mapping->private_data = resv_map;
info = HUGETLBFS_I(inode);
/*
 * The policy is initialized here even if we are creating a
@@ -517,7 +528,9 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
break;
}
lockdep_annotate_inode_mutex_key(inode);
-   }
+   } else
+   kref_put(&resv_map->refs, resv_map_release);
+
return inode;
 }
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index bd7e987..317b0a6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -5,6 +5,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 struct ctl_table;
 struct user_struct;
@@ -22,6 +24,13 @@ struct hugepage_subpool {
long max_hpages, used_hpages;
 };
 
+struct resv_map {
+   struct kref refs;
+   struct list_head regions;
+};
+extern struct resv_map *resv_map_alloc(void);
+void resv_map_release(struct kref *ref);
+
 extern spinlock_t hugetlb_lock;
 extern int hugetlb_max_hstate __read_mostly;
 #define for_each_hstate(h) \
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dee6cf4..2891902 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -376,12 +376,7 @@ static void set_vma_private_data(struct vm_area_struct *vma,
vma->vm_private_data = (void *)value;
 }
 
-struct resv_map {
-   struct kref refs;
-   struct list_head regions;
-};
-
-static struct resv_map *resv_map_alloc(void)
+struct resv_map *resv_map_alloc(void)
 {
struct resv_map *resv_map = kmalloc(sizeof(*resv_map), GFP_KERNEL);
if (!resv_map)
@@ -393,7 +388,7 @@ static struct resv_map *resv_map_alloc(void)
return resv_map;
 }
 
-static void resv_map_release(struct kref *ref)
+void resv_map_release(struct kref *ref)
 {
struct resv_map *resv_map = container_of(ref, struct resv_map, refs);
 
@@ -1164,8 +1159,9 @@ static long vma_needs_reservation(struct hstate *h,
 
if (vma->vm_flags & VM_MAYSHARE) {
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
-   return region_chg(&inode->i_mapping->private_list,
-   idx, idx + 1);
+   struct resv_map *resv = inode->i_mapping->private_data;
+
+   return region_chg(&resv->regions, idx, idx + 1);
 
} else if (!is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
return 1;
@@ -1189,7 +1185,9 @@ static void vma_commit_reservation(struct hstate *h,
 
if (vma->vm_flags & VM_MAYSHARE) {
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
-   region_add(&inode->i_mapping->private_list, idx, idx + 1);
+   struct resv_map *resv = inode->i_mapping->private_data;
+
+   region_add(&resv->regions, idx, idx + 1);
 
} else if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
@@ -3159,6 +3157,7 @@ int hugetlb_reserve_pages(struct inode *inode,
long ret, chg;
struct hstate *h = hstate_inode(inode);
struct hugepage_subpool *spool = subpool_inode(inode);
+   struct resv_map *resv_map;
 

[PATCH v3 10/14] mm, hugetlb: move down outside_reserve check

2013-12-17 Thread Joonsoo Kim
Move the outside_reserve check down and don't call
vma_needs_reservation() when outside_reserve is true. This is a slightly
more optimized implementation.

It also makes the code more readable.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0f56bbf..03ab285 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2576,7 +2576,7 @@ static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
struct page *old_page, *new_page;
int outside_reserve = 0;
long chg;
-   bool use_reserve;
+   bool use_reserve = false;
unsigned long mmun_start;   /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
 
@@ -2591,6 +2591,11 @@ retry_avoidcopy:
return 0;
}
 
+   page_cache_get(old_page);
+
+   /* Drop page table lock as buddy allocator may be called */
+   spin_unlock(ptl);
+
/*
 * If the process that created a MAP_PRIVATE mapping is about to
 * perform a COW due to a shared page count, attempt to satisfy
@@ -2604,19 +2609,17 @@ retry_avoidcopy:
old_page != pagecache_page)
outside_reserve = 1;
 
-   page_cache_get(old_page);
-
-   /* Drop page table lock as buddy allocator may be called */
-   spin_unlock(ptl);
-   chg = vma_needs_reservation(h, vma, address);
-   if (chg < 0) {
-   page_cache_release(old_page);
+   if (!outside_reserve) {
+   chg = vma_needs_reservation(h, vma, address);
+   if (chg < 0) {
+   page_cache_release(old_page);
 
-   /* Caller expects lock to be held */
-   spin_lock(ptl);
-   return VM_FAULT_OOM;
+   /* Caller expects lock to be held */
+   spin_lock(ptl);
+   return VM_FAULT_OOM;
+   }
+   use_reserve = !chg;
}
-   use_reserve = !chg && !outside_reserve;
 
new_page = alloc_huge_page(vma, address, use_reserve);
 
-- 
1.7.9.5
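
A small userspace check that the restructured logic computes the same
use_reserve value as before while skipping the lookup when
outside_reserve is set. This is only a sketch: needs_reservation() is a
hypothetical stand-in for vma_needs_reservation(), the chg < 0 error path
is left out, and the lookups counter exists only for the demonstration.

```c
#include <assert.h>
#include <stdbool.h>

static int lookups;	/* how often the stand-in lookup ran */

/* Hypothetical stand-in: just returns the value it is given. */
static long needs_reservation(long chg)
{
	lookups++;
	return chg;
}

/* Before: always look up the reservation, then combine both conditions. */
static bool use_reserve_before(long chg, bool outside_reserve)
{
	long c = needs_reservation(chg);

	return !c && !outside_reserve;
}

/* After: skip the lookup entirely when outside_reserve is set. */
static bool use_reserve_after(long chg, bool outside_reserve)
{
	bool use_reserve = false;

	if (!outside_reserve) {
		long c = needs_reservation(chg);

		use_reserve = !c;
	}
	return use_reserve;
}
```

Both versions agree for every combination of chg in {0, 1} and
outside_reserve in {false, true}; only the number of lookups differs.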



[PATCH v3 09/14] mm, hugetlb: remove a check for return value of alloc_huge_page()

2013-12-17 Thread Joonsoo Kim
Now alloc_huge_page() only returns -ENOSPC on failure,
so we don't need to handle any other return value.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d960f46..0f56bbf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2621,7 +2621,6 @@ retry_avoidcopy:
new_page = alloc_huge_page(vma, address, use_reserve);
 
if (IS_ERR(new_page)) {
-   long err = PTR_ERR(new_page);
page_cache_release(old_page);
 
/*
@@ -2650,10 +2649,7 @@ retry_avoidcopy:
 
/* Caller expects lock to be held */
spin_lock(ptl);
-   if (err == -ENOMEM)
-   return VM_FAULT_OOM;
-   else
-   return VM_FAULT_SIGBUS;
+   return VM_FAULT_SIGBUS;
}
 
/*
@@ -2785,11 +2781,7 @@ retry:
 
page = alloc_huge_page(vma, address, use_reserve);
if (IS_ERR(page)) {
-   ret = PTR_ERR(page);
-   if (ret == -ENOMEM)
-   ret = VM_FAULT_OOM;
-   else
-   ret = VM_FAULT_SIGBUS;
+   ret = VM_FAULT_SIGBUS;
goto out;
}
clear_huge_page(page, address, pages_per_huge_page(h));
-- 
1.7.9.5



[PATCH v3 04/14] mm, hugetlb: remove resv_map_put()

2013-12-17 Thread Joonsoo Kim
In a following patch, I change vma_resv_map() to return the resv_map
in all cases. This patch prepares for that by removing resv_map_put(),
which doesn't work properly with the following change because it works
only for HPAGE_RESV_OWNER's resv_map, not for all resv_maps.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cf0eaff..ef70b6f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2282,15 +2282,6 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
kref_get(&resv->refs);
 }
 
-static void resv_map_put(struct vm_area_struct *vma)
-{
-   struct resv_map *resv = vma_resv_map(vma);
-
-   if (!resv)
-   return;
-   kref_put(&resv->refs, resv_map_release);
-}
-
 static void hugetlb_vm_op_close(struct vm_area_struct *vma)
 {
struct hstate *h = hstate_vma(vma);
@@ -2307,7 +2298,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
reserve = (end - start) -
region_count(resv, start, end);
 
-   resv_map_put(vma);
+   kref_put(&resv->refs, resv_map_release);
 
if (reserve) {
hugetlb_acct_memory(h, -reserve);
@@ -3245,8 +3236,8 @@ int hugetlb_reserve_pages(struct inode *inode,
region_add(resv_map, from, to);
return 0;
 out_err:
-   if (vma)
-   resv_map_put(vma);
+   if (vma && is_vma_resv_set(vma, HPAGE_RESV_OWNER))
+   kref_put(&resv_map->refs, resv_map_release);
return ret;
 }
 
-- 
1.7.9.5



[PATCH v3 02/14] mm, hugetlb: region manipulation functions take resv_map rather list_head

2013-12-17 Thread Joonsoo Kim
To change the protection method for region tracking to a fine-grained one,
we pass the resv_map, instead of the list_head, to the region manipulation
functions. This doesn't introduce any functional change; it just
prepares for the next step.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 
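
To illustrate the interface change, here is a userspace sketch of a
region_count()-style helper that takes the enclosing map rather than its
bare region list. It assumes a hypothetical array-backed struct map_sketch
with sorted, non-overlapping half-open regions; the kernel version walks a
linked list of struct file_region instead. Taking the map as the argument
leaves room to add a per-map lock to the structure later without touching
the callers again.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [from, to); hypothetical, array-backed for brevity. */
struct region {
	long from;
	long to;
};

/* Hypothetical container; a lock could be added here later. */
struct map_sketch {
	const struct region *regions;	/* sorted, non-overlapping */
	size_t nr;
};

/* Count how many units of [f, t) are covered by the map's regions. */
static long region_count_sketch(const struct map_sketch *resv, long f, long t)
{
	long chg = 0;
	size_t i;

	for (i = 0; i < resv->nr; i++) {
		const struct region *rg = &resv->regions[i];
		long seg_from, seg_to;

		if (rg->to <= f)
			continue;	/* entirely before the range */
		if (rg->from >= t)
			break;		/* entirely after the range */

		seg_from = rg->from > f ? rg->from : f;
		seg_to = rg->to < t ? rg->to : t;
		chg += seg_to - seg_from;
	}
	return chg;
}
```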

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2891902..3e7a44b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -151,8 +151,9 @@ struct file_region {
long to;
 };
 
-static long region_add(struct list_head *head, long f, long t)
+static long region_add(struct resv_map *resv, long f, long t)
 {
+   struct list_head *head = &resv->regions;
struct file_region *rg, *nrg, *trg;
 
/* Locate the region we are either in or before. */
@@ -187,8 +188,9 @@ static long region_add(struct list_head *head, long f, long t)
return 0;
 }
 
-static long region_chg(struct list_head *head, long f, long t)
+static long region_chg(struct resv_map *resv, long f, long t)
 {
+   struct list_head *head = &resv->regions;
struct file_region *rg, *nrg;
long chg = 0;
 
@@ -236,8 +238,9 @@ static long region_chg(struct list_head *head, long f, long t)
return chg;
 }
 
-static long region_truncate(struct list_head *head, long end)
+static long region_truncate(struct resv_map *resv, long end)
 {
+   struct list_head *head = &resv->regions;
struct file_region *rg, *trg;
long chg = 0;
 
@@ -266,8 +269,9 @@ static long region_truncate(struct list_head *head, long end)
return chg;
 }
 
-static long region_count(struct list_head *head, long f, long t)
+static long region_count(struct resv_map *resv, long f, long t)
 {
+   struct list_head *head = &resv->regions;
struct file_region *rg;
long chg = 0;
 
@@ -393,7 +397,7 @@ void resv_map_release(struct kref *ref)
struct resv_map *resv_map = container_of(ref, struct resv_map, refs);
 
/* Clear out any active regions before we release the map. */
-   region_truncate(&resv_map->regions, 0);
+   region_truncate(resv_map, 0);
kfree(resv_map);
 }
 
@@ -1161,7 +1165,7 @@ static long vma_needs_reservation(struct hstate *h,
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
struct resv_map *resv = inode->i_mapping->private_data;
 
-   return region_chg(&resv->regions, idx, idx + 1);
+   return region_chg(resv, idx, idx + 1);
 
} else if (!is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
return 1;
@@ -1171,7 +1175,7 @@ static long vma_needs_reservation(struct hstate *h,
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
struct resv_map *resv = vma_resv_map(vma);
 
-   err = region_chg(&resv->regions, idx, idx + 1);
+   err = region_chg(resv, idx, idx + 1);
if (err < 0)
return err;
return 0;
@@ -1187,14 +1191,14 @@ static void vma_commit_reservation(struct hstate *h,
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
struct resv_map *resv = inode->i_mapping->private_data;
 
-   region_add(&resv->regions, idx, idx + 1);
+   region_add(resv, idx, idx + 1);
 
} else if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
pgoff_t idx = vma_hugecache_offset(h, vma, addr);
struct resv_map *resv = vma_resv_map(vma);
 
/* Mark this page used in the map. */
-   region_add(&resv->regions, idx, idx + 1);
+   region_add(resv, idx, idx + 1);
}
 }
 
@@ -2285,7 +2289,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
end = vma_hugecache_offset(h, vma, vma->vm_end);
 
reserve = (end - start) -
-   region_count(&resv->regions, start, end);
+   region_count(resv, start, end);
 
resv_map_put(vma);
 
@@ -3176,7 +3180,7 @@ int hugetlb_reserve_pages(struct inode *inode,
if (!vma || vma->vm_flags & VM_MAYSHARE) {
resv_map = inode->i_mapping->private_data;
 
-   chg = region_chg(&resv_map->regions, from, to);
+   chg = region_chg(resv_map, from, to);
 
} else {
resv_map = resv_map_alloc();
@@ -3222,7 +3226,7 @@ int hugetlb_reserve_pages(struct inode *inode,
 * else has to be done for private mappings here
 */
if (!vma || vma->vm_flags & VM_MAYSHARE)
-   region_add(&resv_map->regions, from, to);
+   region_add(resv_map, from, to);
return 0;
 out_err:
if (vma)
@@ -3238,7 +3242,7 @@ void hugetlb_unreserve_pages(struct inode *inode, long offset, long freed)
struct hugepage_subpool *spool = subpool_inode(inode);
 
if (resv_map)
-   chg = region_truncate(&resv_map->regions, offset);
+   chg = region_truncate(resv_map, offset);
spin_lock(&inode->i_lock);
inode->i_blocks -= 

[PATCH] drivers: firmware: Move prototype declarations to header file cper.h

2013-12-17 Thread Rashika Kheria
Move prototype declarations of function cper_estatus_print(),
cper_estatus_check_header() and cper_estatus_check() from file
drivers/acpi/apei/apei-internal.h to header file include/linux/cper.h
because these functions are used by both acpi driver and firmware
driver.
The header file include/linux/cper.h was chosen because it is included
in both the drivers.

This eliminates the following warnings in efi/cper.c:
drivers/firmware/efi/cper.c:346:6: warning: no previous prototype for ‘cper_estatus_print’ [-Wmissing-prototypes]
drivers/firmware/efi/cper.c:374:5: warning: no previous prototype for ‘cper_estatus_check_header’ [-Wmissing-prototypes]
drivers/firmware/efi/cper.c:387:5: warning: no previous prototype for ‘cper_estatus_check’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria 
Reviewed-by: Josh Triplett 
---
 drivers/acpi/apei/apei-internal.h |5 -
 include/linux/cper.h  |8 
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/apei-internal.h b/drivers/acpi/apei/apei-internal.h
index 21ba34a..4ae847d 100644
--- a/drivers/acpi/apei/apei-internal.h
+++ b/drivers/acpi/apei/apei-internal.h
@@ -135,10 +135,5 @@ static inline u32 cper_estatus_len(struct acpi_generic_status *estatus)
return sizeof(*estatus) + estatus->data_length;
 }
 
-void cper_estatus_print(const char *pfx,
-   const struct acpi_generic_status *estatus);
-int cper_estatus_check_header(const struct acpi_generic_status *estatus);
-int cper_estatus_check(const struct acpi_generic_status *estatus);
-
 int apei_osc_setup(void);
 #endif
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 2fc0ec3..080f5f7 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -22,6 +22,14 @@
 #define LINUX_CPER_H
 
 #include 
+#include 
+
+/* Prototype declaration of functions common between acpi and firmware driver*/
+void cper_estatus_print(const char *pfx,
+   const struct acpi_generic_status *estatus);
+int cper_estatus_check_header(const struct acpi_generic_status *estatus);
+int cper_estatus_check(const struct acpi_generic_status *estatus);
+
 
 /* CPER record signature and the size */
 #define CPER_SIG_RECORD"CPER"
-- 
1.7.9.5



Re: [PATCH v3 4/8] pciehp: Don't disable the link permanently, during removal

2013-12-17 Thread Yinghai Lu
On Tue, Dec 17, 2013 at 10:17 PM, Rajat Jain  wrote:
> Well, in that case I doubt the patch will solve the problem. I think
> most likely the "card present / not present" messages may be replaced
> by "link-up / link-down" messages. But I'd appreciate your testing.

I did have a debug patch at that time to report link status change.
And it did report link up and link down.

Thanks

Yinghai
Subject: [PATCH] PCI, pciehp: Report pcie link state change

Report PCIe link state changes, to see whether any occur.

Signed-off-by: Yinghai Lu 

---
 drivers/pci/hotplug/pciehp_hpc.c |   29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
===
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
@@ -603,6 +603,25 @@ int pciehp_power_off_slot(struct slot *
 	return 0;
 }
 
+static u8 pciehp_handle_linkstate_change(struct slot *slot)
+{
+	struct controller *ctrl = slot->ctrl;
+	u16 lnk_status;
+	int retval;
+
+/* LinkState Change */
+	ctrl_dbg(ctrl, "LinkState change\n");
+
+	retval = pciehp_readw(ctrl, PCI_EXP_LNKSTA, &lnk_status);
+	if (retval) {
+		ctrl_err(ctrl, "Cannot read LNKSTATUS register\n");
+		return 1;
+	}
+	ctrl_info(ctrl, "lnk_status = %x\n", lnk_status);
+
+	return 1;
+}
+
 static irqreturn_t pcie_isr(int irq, void *dev_id)
 {
 	struct controller *ctrl = (struct controller *)dev_id;
@@ -624,7 +643,7 @@ static irqreturn_t pcie_isr(int irq, voi
 
 		detected &= (PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD |
 			 PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_PDC |
-			 PCI_EXP_SLTSTA_CC);
+			 PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC);
 		detected &= ~intr_loc;
 		intr_loc |= detected;
 		if (!intr_loc)
@@ -648,6 +667,10 @@ static irqreturn_t pcie_isr(int irq, voi
 	if (!(intr_loc & ~PCI_EXP_SLTSTA_CC))
 		return IRQ_HANDLED;
 
+	/* Check Link State Changed */
+	if (intr_loc & PCI_EXP_SLTSTA_DLLSC)
+		pciehp_handle_linkstate_change(slot);
+
 	/* Check MRL Sensor Changed */
 	if (intr_loc & PCI_EXP_SLTSTA_MRLSC)
 		pciehp_handle_switch_change(slot);
@@ -689,10 +712,12 @@ int pcie_enable_notification(struct cont
 		cmd |= PCI_EXP_SLTCTL_MRLSCE;
 	if (!pciehp_poll_mode)
 		cmd |= PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE;
+	cmd |= PCI_EXP_SLTCTL_DLLSCE;
 
 	mask = (PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_ABPE |
 		PCI_EXP_SLTCTL_MRLSCE | PCI_EXP_SLTCTL_PFDE |
-		PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE);
+		PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE |
+		PCI_EXP_SLTCTL_DLLSCE);
 
 	if (pcie_write_cmd(ctrl, cmd, mask)) {
 		ctrl_err(ctrl, "Cannot enable software notification\n");
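
The debug patch above prints lnk_status as a raw hex value. As a reading
aid, here is a userspace sketch that decodes the main fields of the PCIe
Link Status register. The bit masks follow the PCIe specification layout
(current link speed in bits 3:0, negotiated width in bits 9:4, Data Link
Layer Link Active in bit 13); struct link_info and decode_lnksta() are
hypothetical names and not part of pciehp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Link Status register fields, per the PCIe specification layout. */
#define LNKSTA_CLS		0x000f	/* Current Link Speed */
#define LNKSTA_NLW		0x03f0	/* Negotiated Link Width */
#define LNKSTA_NLW_SHIFT	4
#define LNKSTA_DLLLA		0x2000	/* Data Link Layer Link Active */

/* Hypothetical decoded view of the register. */
struct link_info {
	unsigned int speed;	/* encoded speed, e.g. 1 = 2.5 GT/s */
	unsigned int width;	/* number of lanes */
	bool active;		/* data link layer reports link up */
};

static struct link_info decode_lnksta(uint16_t lnk_status)
{
	struct link_info info;

	info.speed = lnk_status & LNKSTA_CLS;
	info.width = (lnk_status & LNKSTA_NLW) >> LNKSTA_NLW_SHIFT;
	info.active = (lnk_status & LNKSTA_DLLLA) != 0;
	return info;
}
```

For example, a logged value of 0x2041 would decode to speed code 1, width
x4, link active.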


[PATCH] iscsi: conn error (1020) each time iscsi session logout

2013-12-17 Thread Vaughan Cao
We do a normal login/logout against an iSCSI server. iscsiadm reports success,
but we always see the following error in dmesg just before the connection
shuts down.

Oct 15 05:30:09 vmhodtest019 iscsid: Connection1:0 to [target:
iqn.1986-03.com.sun:02:7b863a18-045a-cb04-c686-841f17df2f9c, portal:
10.182.32.162,3260] through [iface: default] is operational now
Oct 15 05:30:42 vmhodtest019 kernel:  connection1:0: detected conn error
(1020)
Oct 15 05:30:42 vmhodtest019 iscsid: Connection1:0 to [target:
iqn.1986-03.com.sun:02:7b863a18-045a-cb04-c686-841f17df2f9c, portal:
10.182.32.162,3260] through [iface: default] is shutdown.

This happens because the iscsi_tcp module evaluates the socket state in the
data_ready() callback and detects the socket close. However, this socket
close on the target side is a response to the logout request from the
initiator, so it is not an error that should be reported. Suppress it by
checking the session state and the err value accordingly.

Signed-off-by: Vaughan Cao 
---
 drivers/scsi/libiscsi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 415f2c0..84171ef 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1360,6 +1360,12 @@ void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
spin_unlock_bh(&session->lock);
return;
}
+   /* Target closed the connection in response to logout */
+   if (session->state == ISCSI_STATE_LOGGING_OUT &&
+   err == ISCSI_ERR_TCP_CONN_CLOSE) {
+   spin_unlock_bh(&session->lock);
+   return;
+   }
 
if (conn->stop_stage == 0)
session->state = ISCSI_STATE_FAILED;
-- 
1.8.3.1
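
The new check can be read as a small predicate: a TCP close that arrives
while the session is logging out is expected and should not be reported.
A userspace sketch under stated assumptions: the enum values and
should_report_conn_failure() are hypothetical names that only mirror
ISCSI_STATE_LOGGING_OUT and ISCSI_ERR_TCP_CONN_CLOSE from the patch, and
locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced versions of the session state and error enums. */
enum sess_state { SESS_LOGGED_IN, SESS_LOGGING_OUT };
enum conn_err { CONN_ERR_TCP_CLOSE, CONN_ERR_OTHER };

/* Return false when the failure is just the target answering a logout. */
static bool should_report_conn_failure(enum sess_state state,
				       enum conn_err err)
{
	if (state == SESS_LOGGING_OUT && err == CONN_ERR_TCP_CLOSE)
		return false;	/* expected close, stay quiet */
	return true;
}
```

Both conditions must hold: a close during normal operation, or any other
error during logout, is still reported.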



linux-next: build warning after merge of the target-updates tree

2013-12-17 Thread Stephen Rothwell
Hi Nicholas,

After merging the target-updates tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/target/target_core_alua.c: In function 'core_alua_state_lba_dependent':
drivers/target/target_core_alua.c:473:6: warning: suggest parentheses around operand of '!' or change '&' to '&&' or '!' to '~' [-Wparentheses]
  if (!cmd->se_cmd_flags & SCF_SCSI_DATA_CDB)
  ^

Introduced by commit 923aacab87ba ("target_core_alua: Referrals
infrastructure").
-- 
Cheers,
Stephen Rothwell 




[PATCH] drivers: base: Add prototype declaration in memory.c

2013-12-17 Thread Rashika Kheria
Add the prototype declaration of function memory_block_size_bytes() in
memory.c.

This eliminates the following warning in memory.c:
drivers/base/memory.c:87:1: warning: no previous prototype for ‘memory_block_size_bytes’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria 
Reviewed-by: Josh Triplett 
---
 drivers/base/memory.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index bece691..cfa03de 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -83,6 +83,7 @@ static void memory_block_release(struct device *dev)
kfree(mem);
 }
 
+unsigned long __weak memory_block_size_bytes(void);
 unsigned long __weak memory_block_size_bytes(void)
 {
return MIN_MEMORY_BLOCK_SIZE;
-- 
1.7.9.5



[PATCH 3/3] drivers: iommu: Mark function eoi_ioapic_pin_remapped() as static in irq_remapping.c

2013-12-17 Thread Rashika Kheria
Mark function eoi_ioapic_pin_remapped() as static in irq_remapping.c
because it is not used outside this file.

This eliminates the following warning in irq_remapping.c:
drivers/iommu/irq_remapping.c:153:6: warning: no previous prototype for ‘eoi_ioapic_pin_remapped’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria 
Reviewed-by: Josh Triplett 
---
 drivers/iommu/irq_remapping.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 39f81ae..3b05d1b 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -150,7 +150,7 @@ static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
return do_setup_msix_irqs(dev, nvec);
 }
 
-void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
+static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
 {
/*
 * Intr-remapping uses pin number as the virtual vector
-- 
1.7.9.5
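
The two usual ways to silence -Wmissing-prototypes, shown as a small
standalone sketch. The function names are hypothetical; compiling this
file with "cc -Wmissing-prototypes -c" should produce no such warning.

```c
#include <assert.h>

/*
 * Used only in this file: "static" gives it internal linkage, so the
 * compiler does not expect a previous prototype.
 */
static int scale_private(int x)
{
	return x * 2;
}

/*
 * Used from other files: declare it before defining it. In real code
 * this prototype would normally live in a shared header.
 */
int scale_shared(int x);

int scale_shared(int x)
{
	return scale_private(x) + 1;
}
```

Marking a function static, as this series does, also lets the compiler
warn if the function later becomes unused.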



[PATCH 2/3] drivers: iommu: Mark functions as static in intel_irq_remapping.c

2013-12-17 Thread Rashika Kheria
Mark functions get_irte() and ir_dev_scope_init() as static in
intel_irq_remapping.c because they are not used outside this file.

This eliminates the following warnings in intel_irq_remapping.c:
drivers/iommu/intel_irq_remapping.c:49:5: warning: no previous prototype for ‘get_irte’ [-Wmissing-prototypes]
drivers/iommu/intel_irq_remapping.c:810:12: warning: no previous prototype for ‘ir_dev_scope_init’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria 
Reviewed-by: Josh Triplett 
---
 drivers/iommu/intel_irq_remapping.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index bab10b1..c988b8d 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -46,7 +46,7 @@ static struct irq_2_iommu *irq_2_iommu(unsigned int irq)
return cfg ? &cfg->irq_2_iommu : NULL;
 }
 
-int get_irte(int irq, struct irte *entry)
+static int get_irte(int irq, struct irte *entry)
 {
struct irq_2_iommu *irq_iommu = irq_2_iommu(irq);
unsigned long flags;
@@ -807,7 +807,7 @@ int __init parse_ioapics_under_ir(void)
return 1;
 }
 
-int __init ir_dev_scope_init(void)
+static int __init ir_dev_scope_init(void)
 {
if (!irq_remapping_enabled)
return 0;
-- 
1.7.9.5



[PATCH 1/3] drivers: iommu: Mark functions as static in dmar.c

2013-12-17 Thread Rashika Kheria
Mark the functions check_zero_address() and dmar_get_fault_reason() as
static in dmar.c because they are not used outside this file.

This eliminates the following warnings in dmar.c:
drivers/iommu/dmar.c:491:12: warning: no previous prototype for 
‘check_zero_address’ [-Wmissing-prototypes]
drivers/iommu/dmar.c:1116:13: warning: no previous prototype for 
‘dmar_get_fault_reason’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria 
Reviewed-by: Josh Triplett 
---
 drivers/iommu/dmar.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 8b452c9..fb35d1b 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -488,7 +488,7 @@ static void warn_invalid_dmar(u64 addr, const char *message)
dmi_get_system_info(DMI_PRODUCT_VERSION));
 }
 
-int __init check_zero_address(void)
+static int __init check_zero_address(void)
 {
struct acpi_table_dmar *dmar;
struct acpi_dmar_header *entry_header;
@@ -1113,7 +1113,7 @@ static const char *irq_remap_fault_reasons[] =
 
 #define MAX_FAULT_REASON_IDX   (ARRAY_SIZE(fault_reason_strings) - 1)
 
-const char *dmar_get_fault_reason(u8 fault_reason, int *fault_type)
+static const char *dmar_get_fault_reason(u8 fault_reason, int *fault_type)
 {
if (fault_reason >= 0x20 && (fault_reason - 0x20 <
ARRAY_SIZE(irq_remap_fault_reasons))) {
-- 
1.7.9.5



RE: [PATCH 06/11] drivers: acpi: Add appropriate ifdef conditions in exdump.c

2013-12-17 Thread Zheng, Lv
Hi,

I think this patch is not useful.
The dump code can be built into the Linux kernel for debugging purposes,
so we should clean up these functions in a different way.

Thanks
-Lv

> From: Rashika Kheria [mailto:rashika.khe...@gmail.com]
> Sent: Tuesday, December 17, 2013 5:24 PM
> 
> Enclose functions acpi_ex_dump_namespace_node() and
> acpi_ex_dump_object_descriptor() in appropriate ifdef condition of
> ACPI_FUTURE_USAGE in file acpica/exdump.c.
> 
> This eliminates the following warnings in exdump.c:
> drivers/acpi/acpica/exdump.c:809:6: warning: no previous prototype for 
> ‘acpi_ex_dump_namespace_node’ [-Wmissing-prototypes]
> drivers/acpi/acpica/exdump.c:995:1: warning: no previous prototype for 
> ‘acpi_ex_dump_object_descriptor’ [-Wmissing-prototypes]
> 
> Signed-off-by: Rashika Kheria 
> Reviewed-by: Josh Triplett 
> ---
>  drivers/acpi/acpica/exdump.c |8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/acpi/acpica/exdump.c b/drivers/acpi/acpica/exdump.c
> index 4d046fa..3cd2817 100644
> --- a/drivers/acpi/acpica/exdump.c
> +++ b/drivers/acpi/acpica/exdump.c
> @@ -54,6 +54,7 @@ ACPI_MODULE_NAME("exdump")
>   * The following routines are used for debug output only
>   */
>  #if defined(ACPI_DEBUG_OUTPUT) || defined(ACPI_DEBUGGER)
> +#ifdef  ACPI_FUTURE_USAGE
>  /* Local prototypes */
>  static void acpi_ex_out_string(char *title, char *value);
> 
> @@ -68,6 +69,7 @@ static void acpi_ex_dump_reference_obj(union 
> acpi_operand_object *obj_desc);
>  static void
>  acpi_ex_dump_package_obj(union acpi_operand_object *obj_desc,
>u32 level, u32 index);
> +#endif
> 
>  
> /***
>   *
> @@ -210,6 +212,7 @@ static struct acpi_exdump_info acpi_ex_dump_bank_field[5] 
> = {
>   {ACPI_EXD_POINTER, ACPI_EXD_OFFSET(bank_field.bank_obj), "Bank Object"}
>  };
> 
> +#ifdef  ACPI_FUTURE_USAGE
>  static struct acpi_exdump_info acpi_ex_dump_index_field[5] = {
>   {ACPI_EXD_INIT, ACPI_EXD_TABLE_SIZE(acpi_ex_dump_bank_field), NULL},
>   {ACPI_EXD_FIELD, 0, NULL},
> @@ -218,6 +221,7 @@ static struct acpi_exdump_info 
> acpi_ex_dump_index_field[5] = {
>"Index Object"},
>   {ACPI_EXD_POINTER, ACPI_EXD_OFFSET(index_field.data_obj), "Data Object"}
>  };
> +#endif
> 
>  static struct acpi_exdump_info acpi_ex_dump_reference[8] = {
>   {ACPI_EXD_INIT, ACPI_EXD_TABLE_SIZE(acpi_ex_dump_reference), NULL},
> @@ -287,6 +291,7 @@ static struct acpi_exdump_info acpi_ex_dump_node[5] = {
> 
>  /* Dispatch table, indexed by object type */
> 
> +#ifdef  ACPI_FUTURE_USAGE
>  static struct acpi_exdump_info *acpi_ex_dump_info[] = {
>   NULL,
>   acpi_ex_dump_integer,
> @@ -444,6 +449,7 @@ acpi_ex_dump_object(union acpi_operand_object *obj_desc,
>   count--;
>   }
>  }
> +#endif
> 
>  
> /***
>   *
> @@ -785,6 +791,7 @@ acpi_ex_dump_operands(union acpi_operand_object 
> **operands,
>   *
>   
> **/
> 
> +#ifdef  ACPI_FUTURE_USAGE
>  static void acpi_ex_out_string(char *title, char *value)
>  {
>   acpi_os_printf("%20s : %s\n", title, value);
> @@ -1042,5 +1049,6 @@ acpi_ex_dump_object_descriptor(union 
> acpi_operand_object *obj_desc, u32 flags)
>   acpi_ex_dump_object(obj_desc, acpi_ex_dump_info[obj_desc->common.type]);
>   return_VOID;
>  }
> +#endif
> 
>  #endif
> --
> 1.7.9.5



RE: [PATCHv7 1/4] pwm: Add Freescale FTM PWM driver support

2013-12-17 Thread li.xi...@freescale.com

> On Tue, Dec 17, 2013 at 01:00:10PM +0100, Tomasz Figa wrote:
> > On Tuesday 17 of December 2013 11:51:36 Russell King - ARM Linux wrote:
> > > On Tue, Dec 17, 2013 at 12:10:22PM +0100, Thierry Reding wrote:
> > > > On Fri, Dec 13, 2013 at 04:57:04PM +0800, Xiubo Li wrote:
> > > > > +static inline u32 fsl_pwm_readl(struct fsl_pwm_chip *fpc,
> > > > > + const void __iomem *addr)
> > > > > +{
> > > > > + u32 val;
> > > > > +
> > > > > + val = __raw_readl(addr);
> > > > > +
> > > > > + if (likely(fpc->big_endian))
> > > >
> > > > The likely() probably isn't very useful in this case. But if you
> > > > want to keep it, it should at least be reversed, since
> > > > little-endian is actually the default (you have to specify the
> > > > big-endian property to activate the big endian mode).
> > > >
> > > > > + val = be32_to_cpu(val);
> > > > > + else
> > > > > + val = le32_to_cpu(val);
> > >
> > > This will also cause sparse errors, because when sparse is enabled,
> > > these expect __le32 or __be32 arguments, not u32.
> >
> > My question is why can't you just create two sets of accessors, one
> > big endian and one little endian, add two function pointers to your
> > > fsl_pwm_chip struct and let the driver set the correct accessors in
> > probe?
> 
> I guess that would be one possibility.
>

Yes, that's one possibility.

If so, it must dereference the function pointers and do the C stack push/pop
work on every access. For some devices (USB, NET, DMA, ...) we need to do many
accesses each time in a frequently invoked interrupt handler, so I think inline
functions would be more efficient.

On LS-1 series platforms, many devices need to do the same kind of work.
Could these accessors be moved to some global files?


--
Xiubo
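
The function-pointer approach suggested in the quoted text can be sketched
roughly as follows. This is a userspace illustration under stated assumptions,
not the FTM PWM driver: struct demo_chip, reg_read_be(), reg_read_le() and the
byte-array "register" are all hypothetical stand-ins for the driver's MMIO
accessors.

```c
#include <stdint.h>

/* Hypothetical accessor type; the real driver would take an __iomem pointer. */
typedef uint32_t (*reg_read_fn)(const volatile uint8_t *addr);

/* Register bytes are stored most significant byte first. */
static uint32_t reg_read_be(const volatile uint8_t *addr)
{
	return ((uint32_t)addr[0] << 24) | ((uint32_t)addr[1] << 16) |
	       ((uint32_t)addr[2] << 8)  |  (uint32_t)addr[3];
}

/* Register bytes are stored least significant byte first. */
static uint32_t reg_read_le(const volatile uint8_t *addr)
{
	return ((uint32_t)addr[3] << 24) | ((uint32_t)addr[2] << 16) |
	       ((uint32_t)addr[1] << 8)  |  (uint32_t)addr[0];
}

struct demo_chip {
	reg_read_fn read;	/* chosen once, at probe time */
};

/* Stands in for the probe step: pick the accessor from the endianness flag. */
static void demo_chip_init(struct demo_chip *chip, int big_endian)
{
	chip->read = big_endian ? reg_read_be : reg_read_le;
}

/* Every later access is one indirect call, with no per-access branch. */
static uint32_t demo_chip_read(const struct demo_chip *chip,
			       const volatile uint8_t *addr)
{
	return chip->read(addr);
}
```

The trade-off raised in the reply above is visible here: the endianness test
moves out of the hot path, but each access now goes through an indirect call
that the compiler generally cannot inline.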


RE: [PATCH v3 4/8] pciehp: Don't disable the link permanently, during removal

2013-12-17 Thread Rajat Jain

> -Original Message-
> From: yhlu.ker...@gmail.com [mailto:yhlu.ker...@gmail.com] On Behalf Of
> 
> On Tue, Dec 17, 2013 at 7:20 PM, Rajat Jain 
> wrote:
> >
> > Actually I did not understand the original problem and the solution in
> > the first place (so I also do not understand how disabling the
> > presence detect notification might help). If you can give more details on
> > the original problem, that would be great. Here is what I understood
> > from the commit log:
> >
> > I believe the HW looks like this:
> >
> > PCIe port <> Repeater <> Device.
> >
> > In addition, there is the presence detect pin that is connected
> > directly from the port to the device. Now, when the device is plugged
> > out, the pin indicates no presence. But are you saying the PCIe link
> > from port to repeater is still up?
> 
> After the card is removed from the slot, the PCIe port tries to retrain the
> link to the repeater, so the link keeps going up and down.

I still do not understand why the PCIe port would try to do so. Are you
saying the repeater keeps flapping the link? Nevertheless, I assume it is
due to some repeater bug (that you mention below) that I do not need to
understand.

> 
> so the presence bit keeps alternating between card present and not present.
> That presence bit should be the OR of the in-band input and the out-of-band input.
> We checked the out-of-band input and it always reports correctly.

Just a suggestion for a workaround. I checked the manual of the PCIe switch
we use in our system (IDT 89HPES48H12G2), and at least in this switch it is
possible to control whether the in-band signal is considered. An internal
configuration register says:

Presence Detect Control. This field controls the manner in which
presence of an adapter in a slot is reported to the hot-plug controller
associated with a downstream switch port.
0x0 - (both) Presence of an adapter in the slot is reported as the
logical “OR” of the receiver detect mechanism and the hotplug
presence detect input (PxPDN).
0x1 - (signal) Presence of an adapter in the slot is reported as the
state of the hot-plug presence detect input (PxPDN).
0x2 - (always) When selected, this mode always informs the hotplug
controller that an adapter is present.
0x3 - (never) When selected, this mode always informs the hotplug
controller that an adapter is not present.
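
The four modes in the register description above can be modelled as below. This
is only a sketch of the quoted text: the enum and function names are
hypothetical, and pxpdn_present is assumed to be the already-decoded logical
state of the PxPDN input (the pin's electrical polarity is not stated in the
excerpt).

```c
#include <stdbool.h>

/* Values follow the quoted register description; names are hypothetical. */
enum pd_control {
	PD_BOTH   = 0x0,	/* receiver detect OR presence detect input */
	PD_SIGNAL = 0x1,	/* presence detect input only */
	PD_ALWAYS = 0x2,	/* always report an adapter */
	PD_NEVER  = 0x3,	/* never report an adapter */
};

/* What the hot-plug controller is told for a given mode and input state. */
static bool pd_reported(enum pd_control mode, bool receiver_detect,
			bool pxpdn_present)
{
	switch (mode) {
	case PD_BOTH:
		return receiver_detect || pxpdn_present;
	case PD_SIGNAL:
		return pxpdn_present;
	case PD_ALWAYS:
		return true;
	case PD_NEVER:
		return false;
	}
	return false;
}
```

With a flapping in-band signal and a stable out-of-band pin, mode 0x1 is the
one that would keep the reported presence steady, which is the point of the
suggestion.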

Maybe a similar setting is present in the switch or the CPU that you use.

Just a random thought.

> 
> According to the HW guys and Intel, that should be a bug in the repeater.

Well, in that case I doubt that the patch will solve the problem. I think
the "card present / not present" messages will most likely just be replaced
by "link-up / link-down" messages. But I'd appreciate your testing.

Thanks,

Rajat

> 
> Disable the link from pcie to repeater, likely to reset the repeater
> 
> Thanks
> 
> Yinghai
> 



Re: [RFC PATCH 0/6] Configurable fair allocation zone policy v3

2013-12-17 Thread Johannes Weiner
On Tue, Dec 17, 2013 at 03:02:10PM -0500, Johannes Weiner wrote:
> Hi Mel,
> 
> On Tue, Dec 17, 2013 at 04:48:18PM +, Mel Gorman wrote:
> > This series is currently untested and is being posted to sync up discussions
> > on the treatment of page cache pages, particularly the sysv part. I have
> > not thought it through in detail but posting patches is the easiest way
> > to highlight where I think a problem might be.
> >
> > Changelog since v2
> > o Drop an accounting patch, behaviour is deliberate
> > o Special case tmpfs and shmem pages for discussion
> > 
> > Changelog since v1
> > o Fix lot of brain damage in the configurable policy patch
> > o Yoink a page cache annotation patch
> > o Only account batch pages against allocations eligible for the fair policy
> > o Add patch that default distributes file pages on remote nodes
> > 
> > Commit 81c0a2bb ("mm: page_alloc: fair zone allocator policy") solved a
> > bug whereby new pages could be reclaimed before old pages because of how
> > the page allocator and kswapd interacted on the per-zone LRU lists.
> 
> Not just that, it was about ensuring predictable cache replacement and
> maximizing the cache's effectiveness.  This implicitly fixed the
> kswapd interaction bug, but that was not the sole reason (I realize
> that the original changelog is incomplete and I apologize for that).
> 
> I have had offline discussions with Andrea back then and his first
> suggestion was also to make this a zone fairness placement that is
> exclusive to the local node, but eventually he agreed that the problem
> applies just as much on the global level and that we should apply
> fairness throughout the system as long as we honor zone_reclaim_mode
> and hard bindings.  During our discussions now, it turned out that
> zone_reclaim_mode is a terrible predictor for preferred locality, but
> we also more or less agreed that the locality issues in the first
> place are not really applicable to cache loads dominated by IO cost.
> 
> So I think the main discrepancy between the original patch and what we
> truly want is that aging fairness is really only relevant for actual
> cache backed by secondary storage, because cache replacement is an
> ongoing operation that involves IO.  As opposed to memory types that
> involve IO only in extreme cases (anon, tmpfs, shmem) or no IO at all
> (slab, kernel allocations), in which case we prefer NUMA locality.
> 
> > Unfortunately a side-effect missed during review was that it's now very
> > easy to allocate remote memory on NUMA machines. The problem is that
> > it is not a simple case of just restoring local allocation policies as
> > there are genuine reasons why global page aging may be preferable. It's
> > still a major change to default behaviour so this patch makes the policy
> > configurable and sets what I think is a sensible default.
> > 
> > The patches are on top of some NUMA balancing patches currently in -mm.
> > It's untested and posted to discuss patches 4 and 6.
> 
> It might be easier in dealing with -stable if we start with the
> critical fix(es) to restore sane functionality as much and as compact
> as possible and then place the cleanups on top?
> 
> In my local tree, I have the following as the first patch:

Updated version with your tmpfs __GFP_PAGECACHE parts added and
documentation, changelog updated as necessary.  I remain unconvinced
that tmpfs pages should be round-robined, but I agree with you that it
is the conservative change to do for 3.12 and 3.12 and we can figure
out the rest later.  I sure hope that this doesn't drive most people
on NUMA to disable pagecache interleaving right away as I expect most
tmpfs workloads to see little to no reclaim and prefer locality... :/

---
From: Johannes Weiner 
Subject: [patch] mm: page_alloc: restrict fair allocator policy to pagecache

81c0a2bb515f ("mm: page_alloc: fair zone allocator policy") was merged
in order to ensure predictable pagecache replacement and to maximize
the cache's effectiveness of reducing IO regardless of zone or node
topology.

However, it was overzealous in round-robin placing every type of
allocation over all allowable nodes, instead of preferring locality,
which resulted in severe regressions on certain NUMA workloads that
have nothing to do with pagecache.

This patch drastically reduces the impact of the original change by
having the round-robin placement policy only apply to pagecache
allocations and no longer to anonymous memory, shmem, slab and other
types of kernel allocations.

This still changes the long-standing behavior of pagecache adhering to
the configured memory policy and preferring local allocations per
default, so make it configurable in case somebody relies on it.
However, we also expect the majority of users to prefer maximum cache
effectiveness and a predictable replacement behavior over memory
locality, so reflect this in the default setting of the sysctl.

No-signoff-without-Mel's
Cc:  # 3.12
---
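
As a rough illustration of the policy the changelog above describes (round-robin
placement for pagecache only, locality for everything else, with a switch to
turn the round-robin off), here is a simplified userspace sketch. It is not the
patch: the kernel works with zones, batches and GFP flags, and every name below
(alloc_kind, struct placement, pick_node(), interleave_cache) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical allocation classes; the kernel distinguishes these by GFP flags. */
enum alloc_kind { ALLOC_PAGECACHE, ALLOC_ANON, ALLOC_SLAB };

struct placement {
	int nr_nodes;		/* number of allowed nodes, must be > 0 */
	int next_node;		/* round-robin cursor */
	bool interleave_cache;	/* stands in for the sysctl mentioned above */
};

/* Return the node an allocation of the given kind would be placed on. */
static int pick_node(struct placement *p, enum alloc_kind kind, int local_node)
{
	int node;

	/* Everything except pagecache prefers the local node. */
	if (kind != ALLOC_PAGECACHE || !p->interleave_cache)
		return local_node;

	/* Pagecache is spread over all allowed nodes in turn. */
	node = p->next_node;
	p->next_node = (p->next_node + 1) % p->nr_nodes;
	return node;
}
```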
 

RE: [PATCH 05/11] drivers: acpi: Include appropriate header file in utstate.c

2013-12-17 Thread Zheng, Lv
Hi,

Currently we have not done much about ACPI_FUTURE_USAGE functions that are
automatically optimized out, as long as they do not affect the generated
vmlinux binary.

> From: Moore, Robert
> Sent: Wednesday, December 18, 2013 1:36 AM
> 
> I'm not sure what version of ACPICA you are looking at, but in the master git 
> tree for ACPICA, the file accommon.h includes "acutils.h".
> 
> > -Original Message-
> > From: Rashika Kheria [mailto:rashika.khe...@gmail.com]
> > Sent: Tuesday, December 17, 2013 1:22 AM
> > To: linux-kernel@vger.kernel.org
> > Cc: Moore, Robert; Zheng, Lv; Wysocki, Rafael J; Len Brown; linux-
> > a...@vger.kernel.org; j...@joshtriplett.org; de...@acpica.org
> > Subject: [PATCH 05/11] drivers: acpi: Include appropriate header file in
> > utstate.c
> >
> > Include appropriate header file acutils.h in acpica/utstate.c because
> > function acpi_ut_create_pkg_state_and_push() has its prototype declaration
> > in acutils.h. Also, encloses the function in acpica/utstate.c in ifdef
> > condition of ACPI_FUTURE_USAGE.
> >
> > This eliminates the following warning in utstate.c:
> > drivers/acpi/acpica/utstate.c:64:1: warning: no previous prototype for
> > ‘acpi_ut_create_pkg_state_and_push’ [-Wmissing-prototypes]
> >
> > Signed-off-by: Rashika Kheria 
> > Reviewed-by: Josh Triplett 
> > ---
> >  drivers/acpi/acpica/utstate.c |3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/acpi/acpica/utstate.c b/drivers/acpi/acpica/utstate.c
> > index 03c4c2f..0920d23 100644
> > --- a/drivers/acpi/acpica/utstate.c
> > +++ b/drivers/acpi/acpica/utstate.c
> > @@ -43,6 +43,7 @@
> >
> >  #include 
> >  #include "accommon.h"
> > +#include "acutils.h"

IMO, this line is useless.

> >
> >  #define _COMPONENT  ACPI_UTILITIES
> >  ACPI_MODULE_NAME("utstate")
> > @@ -60,6 +61,7 @@ ACPI_MODULE_NAME("utstate")
> >   * DESCRIPTION: Create a new state and push it
> >   *
> >
> > **
> > /
> > +#ifdef  ACPI_FUTURE_USAGE
> >  acpi_status
> >  acpi_ut_create_pkg_state_and_push(void *internal_object,
> >   void *external_object,
> > @@ -79,6 +81,7 @@ acpi_ut_create_pkg_state_and_push(void *internal_object,
> > acpi_ut_push_generic_state(state_list, state);
> > return (AE_OK);
> >  }
> > +#endif

IMO, these lines are useful.
I have a plan to do a cleanup on all ACPICA build warnings so that such 
inconsistencies can be sorted out.
Maybe it is time to do it right now.

Thanks
-Lv

> >
> >
> > /*
> > **
> >   *
> > --
> > 1.7.9.5



linux-next: manual merge of the usb-gadget tree with the usb.current tree

2013-12-17 Thread Stephen Rothwell
Hi Felipe,

Today's linux-next merge of the usb-gadget tree got a conflict in
drivers/usb/phy/Kconfig between commit 7cd0c298f6e0 ("usb: phy: fix
driver dependencies") from the usb.current tree and commit e1d2e31975e1
("usb: phy: Add OTG FSM configuration option") from the usb-gadget tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

diff --cc drivers/usb/phy/Kconfig
index 2b41c636a52a,54bebba39e91..
--- a/drivers/usb/phy/Kconfig
+++ b/drivers/usb/phy/Kconfig
@@@ -19,9 -27,8 +27,9 @@@ config AB8500_US
  in host mode, low speed.
  
  config FSL_USB2_OTG
 -  bool "Freescale USB OTG Transceiver Driver"
 +  tristate "Freescale USB OTG Transceiver Driver"
-   depends on USB_EHCI_FSL && USB_FSL_USB2 && PM_RUNTIME
+   depends on USB_EHCI_FSL && USB_FSL_USB2 && USB_OTG_FSM && PM_RUNTIME
 +  depends on USB
select USB_OTG
select USB_PHY
help




[PATCH] ssp/pxa2xx: add ssp support for mach-mmp

2013-12-17 Thread Qiao Zhou
mach-mmp also uses the ssp request/free APIs. Add mach-mmp support.
Otherwise there will be a redefinition error.

Signed-off-by: Qiao Zhou 
---
 include/linux/pxa2xx_ssp.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/pxa2xx_ssp.h b/include/linux/pxa2xx_ssp.h
index 4944420..8a2b87e 100644
--- a/include/linux/pxa2xx_ssp.h
+++ b/include/linux/pxa2xx_ssp.h
@@ -219,7 +219,7 @@ static inline u32 pxa_ssp_read_reg(struct ssp_device *dev, u32 reg)
return __raw_readl(dev->mmio_base + reg);
 }
 
-#ifdef CONFIG_ARCH_PXA
+#if defined(CONFIG_ARCH_PXA) || defined(CONFIG_ARCH_MMP)
 struct ssp_device *pxa_ssp_request(int port, const char *label);
 void pxa_ssp_free(struct ssp_device *);
 struct ssp_device *pxa_ssp_request_of(const struct device_node *of_node,
-- 
1.7.0.4



[PATCH] drivers/net/wireless/hostap: Integer overflow

2013-12-17 Thread Wenliang Fan
The local variable 'value' comes from 'extra', a parameter of function
'prism2_ioctl_priv_prism2_param'. If a large number is passed as 'value',
there will be an integer overflow in the following line:
local->passive_scan_timer.expires = jiffies +
local->passive_scan_interval * HZ

Signed-off-by: Wenliang Fan 
---
 drivers/net/wireless/hostap/hostap_ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/hostap/hostap_ioctl.c b/drivers/net/wireless/hostap/hostap_ioctl.c
index e509030..63e350a 100644
--- a/drivers/net/wireless/hostap/hostap_ioctl.c
+++ b/drivers/net/wireless/hostap/hostap_ioctl.c
@@ -2567,7 +2567,7 @@ static int prism2_ioctl_priv_prism2_param(struct net_device *dev,
local->passive_scan_interval = value;
if (timer_pending(&local->passive_scan_timer))
del_timer(&local->passive_scan_timer);
-   if (value > 0) {
+   if (value > 0 && value < INT_MAX / HZ) {
local->passive_scan_timer.expires = jiffies +
local->passive_scan_interval * HZ;
add_timer(&local->passive_scan_timer);
-- 
1.8.5.rc1.28.g7061504
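
The bound used in the patch can be checked in isolation with a small userspace
sketch. The helper name seconds_to_ticks() and its hz parameter are
hypothetical; in the driver the multiplier is the kernel's HZ constant and the
result is added to jiffies.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: convert an interval in seconds to timer ticks.
 * Uses the same bound as the patch, value < INT_MAX / hz, which guarantees
 * that value * hz stays below INT_MAX and so cannot overflow an int.
 * Returns false (and leaves *ticks untouched) for rejected input.
 */
static bool seconds_to_ticks(int value, int hz, int *ticks)
{
	if (hz <= 0 || value <= 0 || value >= INT_MAX / hz)
		return false;

	*ticks = value * hz;
	return true;
}
```

Dividing INT_MAX by hz before comparing avoids performing the multiplication
that might overflow, which is why the check is written this way round.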



linux-next: manual merge of the usb tree with the usb.current tree

2013-12-17 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the usb tree got a conflict in
drivers/usb/host/ohci-at91.c between commit fb5f1834c322 ("usb:
ohci-at91: fix irq and iomem resource retrieval") from the usb.current
tree and commit 3c9740a117d4 ("usb: hcd: move controller wakeup setting
initialization to individual driver") from the usb tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

diff --cc drivers/usb/host/ohci-at91.c
index 8c356af79409,29d2093e3cee..
--- a/drivers/usb/host/ohci-at91.c
+++ b/drivers/usb/host/ohci-at91.c
@@@ -203,9 -199,11 +203,11 @@@ static int usb_hcd_at91_probe(const str
ohci->num_ports = board->ports;
at91_start_hc(pdev);
  
 -  retval = usb_add_hcd(hcd, pdev->resource[1].start, IRQF_SHARED);
 +  retval = usb_add_hcd(hcd, irq, IRQF_SHARED);
-   if (retval == 0)
+   if (retval == 0) {
+   device_wakeup_enable(hcd->self.controller);
return retval;
+   }
  
/* Error handling */
at91_stop_hc(pdev);




Re: [RESEND][PATCH] scsi: esas2r: fix potential format string flaw

2013-12-17 Thread Joe Perches
On Tue, 2013-12-17 at 10:27 -0800, Kees Cook wrote:
> This makes sure format strings cannot leak into the printk call via the
> constructed buffer.
[]
> diff --git a/drivers/scsi/esas2r/esas2r_log.c b/drivers/scsi/esas2r/esas2r_log.c
[]
> @@ -171,7 +171,7 @@ static int esas2r_log_master(const long level,
>   if (strlen(event_buffer) < buflen)
>   strcat(buffer, "\n");
>  
> - printk(event_buffer);
> + printk("%s", event_buffer);

It's probably better to remove the

if (strlen(event_buffer) < buflen)
strcat(buffer, "\n");

and use

printk("%s\n", event_buffer);

so that the output is always newline terminated.
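
The same idea in a userspace sketch, using snprintf() in place of printk() so
the result can be inspected. The function name log_event() is hypothetical; the
point is only that the event text is passed as an argument to a fixed "%s\n"
format and never used as the format itself.

```c
#include <stdio.h>

/*
 * Hypothetical logger. Passing event_buffer as the format string, as in
 * snprintf(out, size, event_buffer), would make any "%x" or "%n" inside
 * the message be interpreted as a conversion. A constant format treats
 * the message as plain data and always appends the newline.
 */
static int log_event(char *out, size_t size, const char *event_buffer)
{
	return snprintf(out, size, "%s\n", event_buffer);
}
```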




[PATCH 2/3] [RESEND]sparc64: convert spinlock_t to raw_spinlock_t in mmu_context_t

2013-12-17 Thread Allen Pais
In an attempt to get PREEMPT_RT working on sparc64 using
linux-stable-rt version 3.10.22-rt19+, the kernel crashes
with the following trace:

[ 1487.027884] I7: 
[ 1487.027885] Call Trace:
[ 1487.027887]  [004967dc] rt_mutex_setprio+0x3c/0x2c0
[ 1487.027892]  [004afe20] task_blocks_on_rt_mutex+0x180/0x200
[ 1487.027895]  [00819114] rt_spin_lock_slowlock+0x94/0x300
[ 1487.027897]  [00817ebc] __schedule+0x39c/0x53c
[ 1487.027899]  [008185fc] schedule+0x1c/0xc0
[ 1487.027908]  [0048fff4] smpboot_thread_fn+0x154/0x2e0
[ 1487.027913]  [0048753c] kthread+0x7c/0xa0
[ 1487.027920]  [004060c4] ret_from_syscall+0x1c/0x2c
[ 1487.027922]  []   (null)

Thomas debugged this issue and pointed to this line in switch_mm():

spin_lock_irqsave(&mm->context.lock, flags);

context.lock needs to be a raw_spinlock.

Signed-off-by: Allen Pais 
---
 arch/sparc/Kconfig  |1 +
 arch/sparc/include/asm/mmu_64.h |2 +-
 arch/sparc/include/asm/mmu_context_64.h |8 
 arch/sparc/kernel/smp_64.c  |4 ++--
 arch/sparc/mm/init_64.c |4 ++--
 arch/sparc/mm/tsb.c |   16 
 6 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 554995d..aae5aa9 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -27,6 +27,7 @@ config SPARC
select HAVE_DMA_API_DEBUG
select HAVE_ARCH_JUMP_LABEL
select HAVE_GENERIC_HARDIRQS
+   select IRQ_FORCED_THREADING
select GENERIC_IRQ_SHOW
select ARCH_WANT_IPC_PARSE_VERSION
select USE_GENERIC_SMP_HELPERS if SMP
diff --git a/arch/sparc/include/asm/mmu_64.h b/arch/sparc/include/asm/mmu_64.h
index 76092c4..e945ddb 100644
--- a/arch/sparc/include/asm/mmu_64.h
+++ b/arch/sparc/include/asm/mmu_64.h
@@ -90,7 +90,7 @@ struct tsb_config {
 #endif
 
 typedef struct {
-   spinlock_t  lock;
+   raw_spinlock_t  lock;
unsigned long   sparc64_ctx_val;
unsigned long   huge_pte_count;
struct page *pgtable_page;
diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index 3d528f0..3a85624 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -77,7 +77,7 @@ static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, str
if (unlikely(mm == &init_mm))
return;
 
-   spin_lock_irqsave(&mm->context.lock, flags);
+   raw_spin_lock_irqsave(&mm->context.lock, flags);
ctx_valid = CTX_VALID(mm->context);
if (!ctx_valid)
get_new_mmu_context(mm);
@@ -125,7 +125,7 @@ static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, str
__flush_tlb_mm(CTX_HWBITS(mm->context),
   SECONDARY_CONTEXT);
}
-   spin_unlock_irqrestore(&mm->context.lock, flags);
+   raw_spin_unlock_irqrestore(&mm->context.lock, flags);
 }
 
 #define deactivate_mm(tsk,mm)  do { } while (0)
@@ -136,7 +136,7 @@ static inline void activate_mm(struct mm_struct *active_mm, struct mm_struct *mm
unsigned long flags;
int cpu;
 
-   spin_lock_irqsave(&mm->context.lock, flags);
+   raw_spin_lock_irqsave(&mm->context.lock, flags);
if (!CTX_VALID(mm->context))
get_new_mmu_context(mm);
cpu = smp_processor_id();
@@ -146,7 +146,7 @@ static inline void activate_mm(struct mm_struct *active_mm, struct mm_struct *mm
load_secondary_context(mm);
__flush_tlb_mm(CTX_HWBITS(mm->context), SECONDARY_CONTEXT);
tsb_context_switch(mm);
-   spin_unlock_irqrestore(&mm->context.lock, flags);
+   raw_spin_unlock_irqrestore(&mm->context.lock, flags);
 }
 
 #endif /* !(__ASSEMBLY__) */
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index 77539ed..f42e1a7 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -975,12 +975,12 @@ void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *reg
if (unlikely(!mm || (mm == &init_mm)))
return;
 
-   spin_lock_irqsave(&mm->context.lock, flags);
+   raw_spin_lock_irqsave(&mm->context.lock, flags);
 
if (unlikely(!CTX_VALID(mm->context)))
get_new_mmu_context(mm);
 
-   spin_unlock_irqrestore(&mm->context.lock, flags);
+   raw_spin_unlock_irqrestore(&mm->context.lock, flags);
 
load_secondary_context(mm);
__flush_tlb_mm(CTX_HWBITS(mm->context),
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 04fd55a..bd5253d 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -350,7 +350,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *
 
mm = vma->vm_mm;
 
-   spin_lock_irqsave(&mm->context.lock, flags);
+   

[PATCH 3/3] [RESEND]sparc64: convert ctx_alloc_lock to raw_spinlock_t

2013-12-17 Thread Allen Pais
This patch fixes the kernel crash encountered while
trying to run linux-stable-rt v3.10.22-rt19
on sparc64.

[ 2317.606015]  [008072f4] rt_spin_lock_slowlock+0x94/0x300
[ 2317.606020]  [00451d74] get_new_mmu_context+0x14/0x160
[ 2317.606026]  [00806394] switch_to_pc+0xd4/0x2a0
[ 2317.606029]  [008067dc] schedule+0x1c/0xc0
[ 2317.606031]  [00807364] rt_spin_lock_slowlock+0x104/0x300
[ 2317.606033]  [00450284] destroy_context+0x84/0x120
[ 2317.606036]  [0045c788] __mmdrop+0x28/0xe0
[ 2317.606045]  [004bf290] rcu_process_callbacks+0x450/0x760
[ 2317.606049]  [00466d48] do_current_softirqs+0x208/0x3c0
[ 2317.606051]  [00466f14] run_ksoftirqd+0x14/0x40
[ 2317.606057]  [0048c64c] smpboot_thread_fn+0x18c/0x2e0
[ 2317.606061]  [00483b5c] kthread+0x7c/0xa0
[ 2317.606069]  [004060c4] ret_from_syscall+0x1c/0x2c
[ 2317.606070]  []   (null)

Signed-off-by: Allen Pais 
---
 arch/sparc/include/asm/mmu_context_64.h |2 +-
 arch/sparc/mm/init_64.c |   10 +-
 arch/sparc/mm/tsb.c |4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index 3a85624..44e393b 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -13,7 +13,7 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 {
 }
 
-extern spinlock_t ctx_alloc_lock;
+extern raw_spinlock_t ctx_alloc_lock;
 extern unsigned long tlb_context_cache;
 extern unsigned long mmu_context_bmap[];
 
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index bd5253d..ac5ae7a 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -661,7 +661,7 @@ void __flush_dcache_range(unsigned long start, unsigned long end)
 EXPORT_SYMBOL(__flush_dcache_range);
 
 /* get_new_mmu_context() uses "cache + 1".  */
-DEFINE_SPINLOCK(ctx_alloc_lock);
+DEFINE_RAW_SPINLOCK(ctx_alloc_lock);
 unsigned long tlb_context_cache = CTX_FIRST_VERSION - 1;
 #define MAX_CTX_NR (1UL << CTX_NR_BITS)
 #define CTX_BMAP_SLOTS BITS_TO_LONGS(MAX_CTX_NR)
@@ -683,7 +683,7 @@ void get_new_mmu_context(struct mm_struct *mm)
unsigned long orig_pgsz_bits;
int new_version;
 
-   spin_lock(&ctx_alloc_lock);
+   raw_spin_lock(&ctx_alloc_lock);
orig_pgsz_bits = (mm->context.sparc64_ctx_val & CTX_PGSZ_MASK);
ctx = (tlb_context_cache + 1) & CTX_NR_MASK;
new_ctx = find_next_zero_bit(mmu_context_bmap, 1 << CTX_NR_BITS, ctx);
@@ -719,7 +719,7 @@ void get_new_mmu_context(struct mm_struct *mm)
 out:
tlb_context_cache = new_ctx;
mm->context.sparc64_ctx_val = new_ctx | orig_pgsz_bits;
-   spin_unlock(&ctx_alloc_lock);
+   raw_spin_unlock(&ctx_alloc_lock);
 
if (unlikely(new_version))
smp_new_mmu_context_version();
@@ -2739,7 +2739,7 @@ void hugetlb_setup(struct pt_regs *regs)
if (tlb_type == cheetah_plus) {
unsigned long ctx;
 
-   spin_lock(&ctx_alloc_lock);
+   raw_spin_lock(&ctx_alloc_lock);
ctx = mm->context.sparc64_ctx_val;
ctx &= ~CTX_PGSZ_MASK;
ctx |= CTX_PGSZ_BASE << CTX_PGSZ0_SHIFT;
@@ -2760,7 +2760,7 @@ void hugetlb_setup(struct pt_regs *regs)
mm->context.sparc64_ctx_val = ctx;
on_each_cpu(context_reload, mm, 0);
}
-   spin_unlock(&ctx_alloc_lock);
+   raw_spin_unlock(&ctx_alloc_lock);
}
 }
 #endif
diff --git a/arch/sparc/mm/tsb.c b/arch/sparc/mm/tsb.c
index d84d4ea..9eb10b4 100644
--- a/arch/sparc/mm/tsb.c
+++ b/arch/sparc/mm/tsb.c
@@ -523,12 +523,12 @@ void destroy_context(struct mm_struct *mm)
free_hot_cold_page(page, 0);
}
 
-   spin_lock_irqsave(&ctx_alloc_lock, flags);
+   raw_spin_lock_irqsave(&ctx_alloc_lock, flags);
 
if (CTX_VALID(mm->context)) {
unsigned long nr = CTX_NRBITS(mm->context);
mmu_context_bmap[nr>>6] &= ~(1UL << (nr & 63));
}
 
-   spin_unlock_irqrestore(&ctx_alloc_lock, flags);
+   raw_spin_unlock_irqrestore(&ctx_alloc_lock, flags);
 }
-- 
1.7.10.4



Re: [PATCH] checkpatch.pl: Fix wrong curly bracket position reporting

2013-12-17 Thread Joe Perches
On Tue, 2013-12-17 at 18:59 -0800, Jean-Baptiste Theou wrote:
> This patch fixes wrong curly bracket position reporting when function
> declarations have only one void argument.
> 
> Missing error (ERROR: space required before the open brace '{') on this
> situation :
> 
> int foo(void){
>   ...
> }

That's true for any declaration with { on the same line.

Perhaps this would be better:
---
 scripts/checkpatch.pl | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 8f3aecd..c4dbb8a 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2784,7 +2784,8 @@ sub process {
 
 # function brace can't be on same line, except for #defines of do while,
 # or if closed on same line
-   if (($line=~/$Type\s*$Ident\(.*\).*\s{/) and
+   if ($^V && $^V ge 5.10.0 &&
+   ($line=~/$Type\s*$Ident\s*$balanced_parens\s*{\s*$/) &&
!($line=~/\#\s*define.*do\s{/) and !($line=~/}/)) {
ERROR("OPEN_BRACE",
  "open brace '{' following function declarations 
go on the next line\n" . $herecurr);
@@ -3159,7 +3160,9 @@ sub process {
 ## }
 
 #need space before brace following if, while, etc
-   if (($line =~ /\(.*\){/ && $line !~ /\($Type\){/) ||
+   if ($^V && $^V ge 5.10.0 &&
+   ($line !~ /$Type\s*$Ident\s*$balanced_parens\s*{\s*$/) &&
+   ($line =~ /\(.*\){/ && $line !~ /\($Type\){/) ||
$line =~ /do{/) {
if (ERROR("SPACING",
  "space required before the open brace '{'\n" 
. $herecurr) &&




Re: [PATCH v3] ASoC: fsl: imx-wm8962: Grant hw_params/free() permission to control FLL

2013-12-17 Thread Nicolin Chen
On Tue, Dec 17, 2013 at 10:50:02PM +0000, Mark Brown wrote:
> On Thu, Dec 12, 2013 at 05:59:28PM +0800, Nicolin Chen wrote:
> 
> > +   mask = WM8962_MIXINL_TO_HPMIXL_MASK | WM8962_MIXINR_TO_HPMIXL_MASK |
> > +   WM8962_IN4L_TO_HPMIXL_MASK | WM8962_IN4R_TO_HPMIXL_MASK;
> > +   bypass |= snd_soc_read(codec, WM8962_HEADPHONE_MIXER_1) & mask;
> > +   bypass |= snd_soc_read(codec, WM8962_HEADPHONE_MIXER_2) & mask;
> > +   bypass |= snd_soc_read(codec, WM8962_SPEAKER_MIXER_1) & mask;
> > +   bypass |= snd_soc_read(codec, WM8962_SPEAKER_MIXER_2) & mask;
> > +
> > +   /* Don't diable FLL if running multi-substreams or analogue bypass */
> > +   if (codec_dai->active != 1 || bypass)
> > +   return 0;
> 
> I don't think this works with the power down delay we do on playback -
> the DAI will go inactive when closed but we'll still have the CODEC
> active and using its clocks until the power down time has elapsed if
> it's a playback DAI.  Trying to reclock the device while active is at
> best risky, even if it's muted.
>
> I do think refcounting from both here and the bias level changes is
> going to be the most robust thing, that'd also avoid the need to peer
> into the CODEC register map.

I tried the refcount approach for enabling/disabling the FLL before I
sent this version. The result was that the FLL would never be disabled
in hw_free(), because the refcount accumulates to 2: one from hw_params()
and the other from set_bias_level(PREPARE). That made this patch
meaningless to me.

So reclocking with the bypass check seems to be the last resort I can
figure out here, since the playback flow for 'aplay -Dhw:0 44k16bit.wav
48k24bit.wav' does need to reprogram the FLL while the CODEC is active.

Thank you,
Nicolin Chen



[PATCH] random: use the architectural HWRNG for the SHA's IV in extract_buf()

2013-12-17 Thread Theodore Ts'o
To help assuage the fears of those who think the NSA can introduce a
massive hack into the instruction decode and out of order execution
engine in the CPU without hundreds of Intel engineers knowing about
it (only one of which would need to have the conscience and courage of
Edward Snowden to spill the beans to the public), use the HWRNG to
initialize the SHA starting value, instead of xor'ing it in
afterwards.

Signed-off-by: "Theodore Ts'o" 
---
 drivers/char/random.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 8cc7d65..d07575c 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1012,23 +1012,23 @@ static void extract_buf(struct entropy_store *r, __u8 
*out)
__u8 extract[64];
unsigned long flags;
 
-   /* Generate a hash across the pool, 16 words (512 bits) at a time */
-   sha_init(hash.w);
-   spin_lock_irqsave(&r->lock, flags);
-   for (i = 0; i < r->poolinfo->poolwords; i += 16)
-   sha_transform(hash.w, (__u8 *)(r->pool + i), workspace);
-
/*
 * If we have an architectural hardware random number
-* generator, mix that in, too.
+* generator, use it for SHA's initial vector
 */
+   sha_init(hash.w);
for (i = 0; i < LONGS(20); i++) {
unsigned long v;
   if (!arch_get_random_long(&v))
break;
-   hash.l[i] ^= v;
+   hash.l[i] = v;
}
 
+   /* Generate a hash across the pool, 16 words (512 bits) at a time */
+   spin_lock_irqsave(&r->lock, flags);
+   for (i = 0; i < r->poolinfo->poolwords; i += 16)
+   sha_transform(hash.w, (__u8 *)(r->pool + i), workspace);
+
/*
 * We mix the hash back into the pool to prevent backtracking
 * attacks (where the attacker knows the state of the pool
-- 
1.8.5.rc3.362.gdf10213



[PATCH v3 3/3] ARM: dts: sama5d3xcm: add the regulator device node

2013-12-17 Thread Wenyou Yang
Signed-off-by: Wenyou Yang 
---
 arch/arm/boot/dts/sama5d3xcm.dtsi |   46 +
 1 file changed, 46 insertions(+)

diff --git a/arch/arm/boot/dts/sama5d3xcm.dtsi 
b/arch/arm/boot/dts/sama5d3xcm.dtsi
index 726a0f3..4571751 100644
--- a/arch/arm/boot/dts/sama5d3xcm.dtsi
+++ b/arch/arm/boot/dts/sama5d3xcm.dtsi
@@ -38,6 +38,52 @@
macb0: ethernet@f0028000 {
phy-mode = "rgmii";
};
+
+   i2c1: i2c@f0018000 {
+       pmic: act8865@5b {
+           compatible = "active-semi,act8865";
+           reg = <0x5b>;
+           status = "disabled";
+
+           regulators {
+               vcc_1v8_reg: DCDC_REG1 {
+                   regulator-name = "VCC_1V8";
+                   regulator-min-microvolt = <1800000>;
+                   regulator-max-microvolt = <1800000>;
+                   regulator-always-on;
+               };
+
+               vcc_1v2_reg: DCDC_REG2 {
+                   regulator-name = "VCC_1V2";
+                   regulator-min-microvolt = <1100000>;
+                   regulator-max-microvolt = <1300000>;
+                   regulator-suspend-mem-microvolt = <1150000>;
+                   regulator-suspend-standby-microvolt = <1150000>;
+                   regulator-always-on;
+               };
+
+               vcc_3v3_reg: DCDC_REG3 {
+                   regulator-name = "VCC_3V3";
+                   regulator-min-microvolt = <3300000>;
+                   regulator-max-microvolt = <3300000>;
+                   regulator-always-on;
+               };
+
+               vddfuse_reg: LDO_REG1 {
+                   regulator-name = "VDDANA";
+                   regulator-min-microvolt = <3300000>;
+                   regulator-max-microvolt = <3300000>;
+                   regulator-always-on;
+               };
+
+               vddana_reg: LDO_REG2 {
+                   regulator-name = "FUSE_2V5";
+                   regulator-min-microvolt = <2500000>;
+                   regulator-max-microvolt = <2500000>;
+               };
+           };
+       };
+   };
};
 
nand0: nand@60000000 {
-- 
1.7.9.5



[PATCH 02/18] perf sort: Do not compare dso again

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

The commit 09600e0f9ebb ("perf tools: Compare dso's also when
comparing symbols") added a comparison of dso when comparing symbols.
But if the sort key already includes dso, there is no need to do it
again, since entries with a different dso are already filtered out.

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/sort.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 68a4fd2f505e..635cd8f8b22e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -13,6 +13,7 @@ int   have_ignore_callees = 0;
 int sort__need_collapse = 0;
 int sort__has_parent = 0;
 int sort__has_sym = 0;
+int sort__has_dso = 0;
 enum sort_mode sort__mode = SORT_MODE__NORMAL;
 
 enum sort_type sort__first_dimension;
@@ -194,9 +195,11 @@ sort__sym_cmp(struct hist_entry *left, struct hist_entry 
*right)
 * comparing symbol address alone is not enough since it's a
 * relative address within a dso.
 */
-   ret = sort__dso_cmp(left, right);
-   if (ret != 0)
-   return ret;
+   if (!sort__has_dso) {
+   ret = sort__dso_cmp(left, right);
+   if (ret != 0)
+   return ret;
+   }
 
return _sort__sym_cmp(left->ms.sym, right->ms.sym);
 }
@@ -1061,6 +1064,8 @@ int sort_dimension__add(const char *tok)
sort__has_parent = 1;
   } else if (sd->entry == &sort_sym) {
   sort__has_sym = 1;
+   } else if (sd->entry == &sort_dso) {
+   sort__has_dso = 1;
}
 
__sort_dimension__add(sd, i);
-- 
1.7.11.7



[PATCH 03/18] perf tools: Do not pass period and weight to add_hist_entry()

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

The @entry argument already has the info so no need to pass them.

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/hist.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 822903eaa201..63234e37583c 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -342,15 +342,15 @@ static u8 symbol__parent_filter(const struct symbol 
*parent)
 }
 
 static struct hist_entry *add_hist_entry(struct hists *hists,
- struct hist_entry *entry,
- struct addr_location *al,
- u64 period,
- u64 weight)
+struct hist_entry *entry,
+struct addr_location *al)
 {
struct rb_node **p;
struct rb_node *parent = NULL;
struct hist_entry *he;
int64_t cmp;
+   u64 period = entry->stat.period;
+   u64 weight = entry->stat.weight;
 
   p = &hists->entries_in->rb_node;
 
@@ -437,7 +437,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.transaction = transaction,
};
 
-   return add_hist_entry(hists, &entry, al, period, weight);
+   return add_hist_entry(hists, &entry, al);
 }
 
 int64_t
-- 
1.7.11.7



[PATCH v3 1/3] regulator: act8865: add PMIC act8865 driver

2013-12-17 Thread Wenyou Yang
Signed-off-by: Wenyou Yang 
---
 drivers/regulator/Kconfig |8 +
 drivers/regulator/Makefile|1 +
 drivers/regulator/act8865-regulator.c |  381 +
 include/linux/regulator/act8865.h |   53 +
 4 files changed, 443 insertions(+)
 create mode 100644 drivers/regulator/act8865-regulator.c
 create mode 100644 include/linux/regulator/act8865.h

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index ce785f4..5a8ad84 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -70,6 +70,14 @@ config REGULATOR_88PM8607
help
  This driver supports 88PM8607 voltage regulator chips.
 
+config REGULATOR_ACT8865
+   bool "Active-semi act8865 voltage regulator"
+   depends on I2C
+   select REGMAP_I2C
+   help
+ This driver controls a active-semi act8865 voltage output
+ regulator via I2C bus.
+
 config REGULATOR_AD5398
tristate "Analog Devices AD5398/AD5821 regulators"
depends on I2C
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index 01c597e..3bb3a55 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_REGULATOR_88PM8607) += 88pm8607.o
 obj-$(CONFIG_REGULATOR_AAT2870) += aat2870-regulator.o
 obj-$(CONFIG_REGULATOR_AB3100) += ab3100.o
 obj-$(CONFIG_REGULATOR_AB8500) += ab8500-ext.o ab8500.o
+obj-$(CONFIG_REGULATOR_ACT8865) += act8865-regulator.o
 obj-$(CONFIG_REGULATOR_AD5398) += ad5398.o
 obj-$(CONFIG_REGULATOR_ANATOP) += anatop-regulator.o
 obj-$(CONFIG_REGULATOR_ARIZONA) += arizona-micsupp.o arizona-ldo1.o
diff --git a/drivers/regulator/act8865-regulator.c 
b/drivers/regulator/act8865-regulator.c
new file mode 100644
index 0000000..bd68b52
--- /dev/null
+++ b/drivers/regulator/act8865-regulator.c
@@ -0,0 +1,381 @@
+/*
+ * act8865-regulator.c - Voltage regulation for the active-semi ACT8865
+ * http://www.active-semi.com/sheets/ACT8865_Datasheet.pdf
+ *
+ * Copyright (C) 2013 Atmel Corporation
+ * Wenyou Yang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * ACT8865 Global Register Map.
+ */
+#define ACT8865_SYS_MODE       0x00
+#define ACT8865_SYS_CTRL       0x01
+#define ACT8865_DCDC1_VSET1    0x20
+#define ACT8865_DCDC1_VSET2    0x21
+#define ACT8865_DCDC1_CTRL     0x22
+#define ACT8865_DCDC2_VSET1    0x30
+#define ACT8865_DCDC2_VSET2    0x31
+#define ACT8865_DCDC2_CTRL     0x32
+#define ACT8865_DCDC3_VSET1    0x40
+#define ACT8865_DCDC3_VSET2    0x41
+#define ACT8865_DCDC3_CTRL     0x42
+#define ACT8865_LDO1_VSET      0x50
+#define ACT8865_LDO1_CTRL      0x51
+#define ACT8865_LDO2_VSET      0x54
+#define ACT8865_LDO2_CTRL      0x55
+#define ACT8865_LDO3_VSET      0x60
+#define ACT8865_LDO3_CTRL      0x61
+#define ACT8865_LDO4_VSET      0x64
+#define ACT8865_LDO4_CTRL      0x65
+
+/*
+ * Field Definitions.
+ */
+#define ACT8865_ENA            0x80    /* ON - [7] */
+#define ACT8865_VSEL_MASK      0x3F    /* VSET - [5:0] */
+
+struct act8865 {
+   struct regulator_dev *rdev[ACT8865_REG_NUM];
+   struct regmap *regmap;
+};
+
+static const struct regmap_config act8865_regmap_config = {
+   .reg_bits = 8,
+   .val_bits = 8,
+};
+
+/* ACt8865 voltage table */
+static const unsigned int act8865_voltages_table[] = {
+   60, 625000, 65, 675000,
+   70, 725000, 75, 775000,
+   80, 825000, 85, 875000,
+   90, 925000, 95, 975000,
+   100,1025000,105,1075000,
+   110,1125000,115,1175000,
+   120,125,130,135,
+   140,145,150,155,
+   160,165,170,175,
+   180,185,190,195,
+   200,205,201,215,
+   220,225,230,235,
+   240,250,260,270,
+   280,290,300,310,
+   

[RFC/PATCHSET 00/18] perf report: Add support to accumulate hist periods (v3)

2013-12-17 Thread Namhyung Kim
Hello,

This is my third attempt to implement cumulative hist period report.
This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely
rewrote it from scratch.

Please see the patch 04/18.  I refactored functions that add hist
entries with struct add_entry_iter.  While I converted all functions
carefully, it'd be better if anyone could test and confirm that I didn't
mess something up - especially for branch stack and mem stuff.

This patchset basically adds the period of a sample to every node in the
callchain.  A hist_entry now has additional fields to keep the
cumulative period if the --cumulate option is given to perf report.

I changed the option to a separate --cumulate and added a new "Total"
column (and renamed the default "Overhead" column to "Self").  The
output will be sorted by total (cumulative) overhead for now.  The
reason I changed to --cumulate is that I still think it's quite
different from the other --callchain options, and I plan to add support
for showing (remaining) callchains in cumulative entries too.  The
--callchain option will take care of that even with --cumulate.

I know that the UI should be changed also to be more flexible as Ingo
requested, but I'd like to do this first and then move to work on the
next.  I also added a new config option to enable it by default.

 * changes in v3:
  - change to --cumulate option
  - fix a couple of bugs (Jiri, Rodrigo)
  - rename some help functions (Arnaldo)
  - cache previous hist entries rather than just symbol and dso
  - add some preparatory cleanups
  - add report.cumulate config option


Let me show you an example:

  $ cat abc.c
  #define barrier() asm volatile("" ::: "memory")

  void a(void)
  {
int i;
for (i = 0; i < 100; i++)
barrier();
  }
  void b(void)
  {
a();
  }
  void c(void)
  {
b();
  }
  int main(void)
  {
c();
return 0;
  }

With this simple program I ran perf record and report:

  $ perf record -g -e cycles:u ./abc

  $ perf report --stdio
  88.29%  abc  abc[.] a  
  |
  --- a
  b
  c
  main
  __libc_start_main

   9.43%  abc  ld-2.17.so [.] _dl_relocate_object
  |
  --- _dl_relocate_object
  dl_main
  _dl_sysdep_start

   2.27%  abc  [kernel.kallsyms]  [k] page_fault 
  |
  --- page_fault
 |  
 |--95.94%-- _dl_sysdep_start
 |  _dl_start_user
 |  
  --4.06%-- _start

   0.00%  abc  ld-2.17.so [.] _start 
  |
  --- _start


When the --cumulate option is given, it'll be shown like this:

  $ perf report --cumulate --stdio

  #   Self    Total  Command  Shared Object      Symbol
  # ......  .......  .......  .................  .......................
  #
     0.00%   88.29%  abc      libc-2.17.so       [.] __libc_start_main
     0.00%   88.29%  abc      abc                [.] main
     0.00%   88.29%  abc      abc                [.] c
     0.00%   88.29%  abc      abc                [.] b
    88.29%   88.29%  abc      abc                [.] a
     0.00%   11.61%  abc      ld-2.17.so         [.] _dl_sysdep_start
     0.00%    9.43%  abc      ld-2.17.so         [.] dl_main
     9.43%    9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
     2.27%    2.27%  abc      [kernel.kallsyms]  [k] page_fault
     0.00%    2.18%  abc      ld-2.17.so         [.] _dl_start_user
     0.00%    0.10%  abc      ld-2.17.so         [.] _start

As you can see, the __libc_start_main -> main -> c -> b -> a callchain
shows up in the output.
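
The Self/Total split above can be sketched roughly as follows.  This is
only a minimal illustration under my own assumptions, not the code in the
series: struct toy_stat and toy_account_sample() are hypothetical names,
symbols are plain array indexes, and a symbol is assumed to appear at
most once in a chain (recursion is not handled here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-symbol counters; not the perf hist_entry layout. */
struct toy_stat {
	uint64_t self;   /* periods sampled in the symbol itself */
	uint64_t total;  /* self plus periods of everything it called */
};

/*
 * Account one sample.  chain[0] is the sampled (leaf) symbol and
 * chain[depth - 1] is the outermost caller; each element indexes stats[].
 */
void toy_account_sample(struct toy_stat *stats, const int *chain,
			size_t depth, uint64_t period)
{
	size_t i;

	assert(stats != NULL && chain != NULL);

	if (depth == 0)
		return;

	/* Only the leaf gets the "Self" period ... */
	stats[chain[0]].self += period;

	/* ... but every node in the callchain gets the "Total" period. */
	for (i = 0; i < depth; i++)
		stats[chain[i]].total += period;
}
```

With the chain a <- b <- c <- main <- __libc_start_main, only a ends up
with a non-zero self value while all five symbols share the same total,
which matches the shape of the report above.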

I know it has some rough edges or even bugs, but I really want to
release it and get reviews.  It does not handle event groups and
annotations yet.

You can also get this series on 'perf/cumulate-v3' branch in my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Arun Sharma 
Cc: Frederic Weisbecker 

[1] https://lkml.org/lkml/2012/3/31/6


Namhyung Kim (18):
  perf sort: Compare addresses if no symbol info
  perf sort: Do not compare dso again
  perf tools: Do not pass period and weight to add_hist_entry()
  perf tools: Introduce struct add_entry_iter
  perf hists: Convert hist entry functions to use struct he_stat
  perf hists: Add support for accumulated stat of hist entry
  perf hists: Check if accumulated when adding a hist entry
  perf hists: Accumulate hist entry stat based on the callchain
  perf tools: 

[PATCH 01/18] perf sort: Compare addresses if no symbol info

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

If a hist entry doesn't have symbol information, compare it with its
address.  Currently it only compares its level or whether it's NULL.
This can lead to an undesired result, such as an overhead exceeding 100%,
especially when callchain accumulation is enabled by a later patch.
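
To illustrate why the address matters, here is a small sketch of an
address comparison for entries without symbol info.  It is a variant
written for illustration, not the helper added by this patch: the names
toy_addr_cmp() and toy_same_unresolved_entry() are hypothetical, and it
uses an explicit three-way comparison instead of a subtraction so the
sign stays right even when the two addresses are more than 2^63 apart.

```c
#include <assert.h>  /* for assert() in the usage below */
#include <stdint.h>

/*
 * Three-way comparison of two addresses, oriented like (right - left):
 * positive when right is larger, negative when it is smaller.
 */
int64_t toy_addr_cmp(uint64_t left_ip, uint64_t right_ip)
{
	return (int64_t)(right_ip > left_ip) - (int64_t)(right_ip < left_ip);
}

/* Two entries without symbol info are "the same" only at the same address. */
int toy_same_unresolved_entry(uint64_t left_ip, uint64_t right_ip)
{
	return toy_addr_cmp(left_ip, right_ip) == 0;
}
```

For example, assert(!toy_same_unresolved_entry(0x400123, 0x400124))
holds, so two unresolved samples at different addresses are kept as
separate entries instead of being merged.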

Cc: Stephane Eranian 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/sort.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 8b0bb1f4494a..68a4fd2f505e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -161,6 +161,11 @@ struct sort_entry sort_dso = {
 
 /* --sort symbol */
 
+static int64_t _sort__addr_cmp(u64 left_ip, u64 right_ip)
+{
+   return (int64_t)(right_ip - left_ip);
+}
+
 static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
 {
u64 ip_l, ip_r;
@@ -183,7 +188,7 @@ sort__sym_cmp(struct hist_entry *left, struct hist_entry 
*right)
int64_t ret;
 
if (!left->ms.sym && !right->ms.sym)
-   return right->level - left->level;
+   return _sort__addr_cmp(left->ip, right->ip);
 
/*
 * comparing symbol address alone is not enough since it's a
@@ -372,7 +377,7 @@ sort__sym_from_cmp(struct hist_entry *left, struct 
hist_entry *right)
   struct addr_map_symbol *from_r = &right->branch_info->from;
 
if (!from_l->sym && !from_r->sym)
-   return right->level - left->level;
+   return _sort__addr_cmp(from_l->addr, from_r->addr);
 
return _sort__sym_cmp(from_l->sym, from_r->sym);
 }
@@ -384,7 +389,7 @@ sort__sym_to_cmp(struct hist_entry *left, struct hist_entry 
*right)
   struct addr_map_symbol *to_r = &right->branch_info->to;
 
if (!to_l->sym && !to_r->sym)
-   return right->level - left->level;
+   return _sort__addr_cmp(to_l->addr, to_r->addr);
 
return _sort__sym_cmp(to_l->sym, to_r->sym);
 }
-- 
1.7.11.7



TTY/n_gsm: Removing the wrong tty_unlock/lock() in gsm_dlci_release()

2013-12-17 Thread Chuansheng Liu

Commit 4d9b109060f690f5c835 (tty: Prevent deadlock in n_gsm driver)
tried to close all the virtual ports synchronously before closing the
physical ports, so tty_vhangup() is used.

But the tty_unlock/lock() is wrong:
tty_release
 tty_ldisc_release
  tty_lock_pair(tty, o_tty)  < == Here the tty is for physical port
  tty_ldisc_kill
gsmld_close
  gsm_cleanup_mux
gsm_dlci_release
      tty = tty_port_tty_get(&dlci->port)
< == Here the tty(s) are for virtual port

They are different ttys, so there is no need to call tty_unlock(virtual
tty) before tty_vhangup(virtual tty); doing so causes an unbalanced
unlock warning.

With the mutex debugging option enabled, we also hit the warning below:
[   99.276903] =
[   99.282172] [ BUG: bad unlock balance detected! ]
[   99.287442] 3.10.20-261976-gaec5ba0 #44 Tainted: G   O
[   99.293972] -
[   99.299240] mmgr/152 is trying to release lock (>legacy_mutex) at:
[   99.306693] [] mutex_unlock+0xd/0x10
[   99.311669] but there are no more locks to release!
[   99.317131]
[   99.317131] other info that might help us debug this:
[   99.324440] 3 locks held by mmgr/152:
[   99.328542]  #0:  (>legacy_mutex/1){..}, at: [] 
tty_lock_nested+0x40/0x90
[   99.338116]  #1:  (>ldisc_mutex){..}, at: [] 
tty_ldisc_kill+0x22/0xd0
[   99.347284]  #2:  (>mutex){..}, at: [] 
gsm_cleanup_mux+0x73/0x170
[   99.356060]
[   99.356060] stack backtrace:
[   99.360932] CPU: 0 PID: 152 Comm: mmgr Tainted: G   O 
3.10.20-261976-gaec5ba0 #44
[   99.370086]  ef4a4de0 ef4a4de0 ef4c1d98 c1b27b91 ef4c1db8 c1292655 c1dd10f5 
c1b2dcad
[   99.378921]  c1b2dcad ef4a4de0 ef4a528c  ef4c1dfc c12930dd 0246 

[   99.387754]    c15e1926  0001 ddfa7530 0003 
c1b2dcad
[   99.396588] Call Trace:
[   99.399326]  [] dump_stack+0x16/0x18
[   99.404307]  [] print_unlock_imbalance_bug+0xe5/0xf0
[   99.410840]  [] ? mutex_unlock+0xd/0x10
[   99.416110]  [] ? mutex_unlock+0xd/0x10
[   99.421382]  [] lock_release_non_nested+0x1cd/0x210
[   99.427818]  [] ? gsm_destroy_network+0x36/0x130
[   99.433964]  [] ? mutex_unlock+0xd/0x10
[   99.439235]  [] lock_release+0x82/0x1c0
[   99.444505]  [] ? mutex_unlock+0xd/0x10
[   99.449776]  [] ? mutex_unlock+0xd/0x10
[   99.455047]  [] __mutex_unlock_slowpath+0x5f/0xd0
[   99.461288]  [] mutex_unlock+0xd/0x10
[   99.466365]  [] tty_unlock+0x21/0x50
[   99.471345]  [] gsm_cleanup_mux+0xc1/0x170
[   99.476906]  [] gsmld_close+0x52/0x90
[   99.481983]  [] tty_ldisc_close.isra.1+0x35/0x50
[   99.488127]  [] tty_ldisc_kill+0x2c/0xd0
[   99.493494]  [] tty_ldisc_release+0x2f/0x50
[   99.499152]  [] tty_release+0x37c/0x4b0
[   99.504424]  [] ? mutex_unlock+0xd/0x10
[   99.509695]  [] ? mutex_unlock+0xd/0x10
[   99.514967]  [] ? eventpoll_release_file+0x7e/0x90
[   99.521307]  [] __fput+0xd9/0x200
[   99.525996]  [] fput+0xd/0x10
[   99.530685]  [] task_work_run+0x81/0xb0
[   99.535957]  [] do_notify_resume+0x49/0x70
[   99.541520]  [] work_notifysig+0x29/0x31
[   99.546897] [ cut here ]

So here we can call tty_vhangup() directly on the virtual port's tty.

Reviewed-by: Chao Bi 
Signed-off-by: Liu, Chuansheng 
---
 drivers/tty/n_gsm.c |5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index c0f76da..8187ae6 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -1704,11 +1704,8 @@ static void gsm_dlci_release(struct gsm_dlci *dlci)
gsm_destroy_network(dlci);
   mutex_unlock(&dlci->mutex);
 
-   /* tty_vhangup needs the tty_lock, so unlock and
-  relock after doing the hangup. */
-   tty_unlock(tty);
tty_vhangup(tty);
-   tty_lock(tty);
+
   tty_port_tty_set(&dlci->port, NULL);
tty_kref_put(tty);
}
-- 
1.7.9.5





[PATCH 05/18] perf hists: Convert hist entry functions to use struct he_stat

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

hist_entry__add_cpumode_period() and hist_entry__decay() are dealing
with hist_entry's stat fields only.  So make them use the struct
directly.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/hist.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 63234e37583c..1f84314546a2 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -182,21 +182,21 @@ void hists__output_recalc_col_len(struct hists *hists, 
int max_rows)
}
 }
 
-static void hist_entry__add_cpumode_period(struct hist_entry *he,
-  unsigned int cpumode, u64 period)
+static void he_stat__add_cpumode_period(struct he_stat *he_stat,
+   unsigned int cpumode, u64 period)
 {
switch (cpumode) {
case PERF_RECORD_MISC_KERNEL:
-   he->stat.period_sys += period;
+   he_stat->period_sys += period;
break;
case PERF_RECORD_MISC_USER:
-   he->stat.period_us += period;
+   he_stat->period_us += period;
break;
case PERF_RECORD_MISC_GUEST_KERNEL:
-   he->stat.period_guest_sys += period;
+   he_stat->period_guest_sys += period;
break;
case PERF_RECORD_MISC_GUEST_USER:
-   he->stat.period_guest_us += period;
+   he_stat->period_guest_us += period;
break;
default:
break;
@@ -223,10 +223,10 @@ static void he_stat__add_stat(struct he_stat *dest, 
struct he_stat *src)
dest->weight+= src->weight;
 }
 
-static void hist_entry__decay(struct hist_entry *he)
+static void he_stat__decay(struct he_stat *he_stat)
 {
-   he->stat.period = (he->stat.period * 7) / 8;
-   he->stat.nr_events = (he->stat.nr_events * 7) / 8;
+   he_stat->period = (he_stat->period * 7) / 8;
+   he_stat->nr_events = (he_stat->nr_events * 7) / 8;
/* XXX need decay for weight too? */
 }
 
@@ -237,7 +237,7 @@ static bool hists__decay_entry(struct hists *hists, struct 
hist_entry *he)
if (prev_period == 0)
return true;
 
-   hist_entry__decay(he);
+   he_stat__decay(&he->stat);
 
if (!he->filtered)
hists->stats.total_period -= prev_period - he->stat.period;
@@ -403,7 +403,7 @@ static struct hist_entry *add_hist_entry(struct hists 
*hists,
   rb_link_node(&he->rb_node_in, parent, p);
   rb_insert_color(&he->rb_node_in, hists->entries_in);
 out:
-   hist_entry__add_cpumode_period(he, al->cpumode, period);
+   he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
return he;
 }
 
-- 
1.7.11.7



[PATCH 10/18] perf report: Cache cumulative callchains

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

It is possible that a callchain has cycles or recursive calls.  In that
case it'll end up having entries with more than 100% overhead in the
output.  In order to prevent such entries, cache each callchain node
and skip it if the same entry was already cumulated.
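
The caching idea can be sketched like this.  It is a simplified model
under my own assumptions rather than the code in the diff below: symbols
are plain array indexes, toy_cumulate_chain() and TOY_MAX_DEPTH are
hypothetical names, and the cache is a small on-stack array scanned
linearly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DEPTH 127	/* hypothetical cap on callchain depth */

/*
 * Add @period to totals[] once per distinct symbol in @chain, so that
 * recursive or cyclic chains do not count the same symbol twice.
 * Returns the number of entries that were actually cumulated.
 */
size_t toy_cumulate_chain(uint64_t *totals, const int *chain,
			  size_t depth, uint64_t period)
{
	int cache[TOY_MAX_DEPTH];	/* symbols already cumulated for this sample */
	size_t curr = 0;
	size_t i, j;

	assert(totals != NULL && chain != NULL);

	if (depth > TOY_MAX_DEPTH)
		depth = TOY_MAX_DEPTH;

	for (i = 0; i < depth; i++) {
		/* Look for the same symbol earlier in this chain. */
		for (j = 0; j < curr; j++) {
			if (cache[j] == chain[i])
				break;
		}
		if (j < curr)
			continue;	/* duplicate: already cumulated, skip it */

		totals[chain[i]] += period;
		cache[curr++] = chain[i];
	}
	return curr;
}
```

For a recursive chain such as a <- a <- b <- a <- main, each of a, b and
main receives the period exactly once, so no entry can exceed the sum of
all sample periods.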

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 80c774615287..4ec1a090d1a3 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -406,8 +406,27 @@ iter_prepare_cumulative_entry(struct add_entry_iter *iter,
  struct addr_location *al __maybe_unused,
  struct perf_sample *sample)
 {
+   struct callchain_cursor_node *node;
+   struct hist_entry **he_cache;
+
   callchain_cursor_commit(&callchain_cursor);
 
+   /*
+* This is for detecting cycles or recursions so that they're
+* cumulated only one time to prevent entries more than 100%
+* overhead.
+*/
+   he_cache = malloc(sizeof(*he_cache) * (PERF_MAX_STACK_DEPTH + 1));
+   if (he_cache == NULL)
+   return -ENOMEM;
+
+   iter->priv = he_cache;
+   iter->curr = 0;
+
+   node = callchain_cursor_current(&callchain_cursor);
+   if (node == NULL)
+   return 0;
+
iter->evsel = evsel;
iter->sample = sample;
iter->machine = machine;
@@ -420,6 +439,7 @@ iter_add_single_cumulative_entry(struct add_entry_iter 
*iter,
 {
struct perf_evsel *evsel = iter->evsel;
struct perf_sample *sample = iter->sample;
+   struct hist_entry **he_cache = iter->priv;
struct hist_entry *he;
int err = 0;
 
@@ -429,6 +449,8 @@ iter_add_single_cumulative_entry(struct add_entry_iter 
*iter,
if (he == NULL)
return -ENOMEM;
 
+   he_cache[iter->curr++] = he;
+
/*
 * This is for putting parents upward during output resort iff
 * only a child gets sampled.  See hist_entry__sort_on_period().
@@ -510,8 +532,30 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
 {
struct perf_evsel *evsel = iter->evsel;
struct perf_sample *sample = iter->sample;
+   struct hist_entry **he_cache = iter->priv;
struct hist_entry *he;
+   struct hist_entry he_tmp = {
+   .cpu = al->cpu,
+   .thread = al->thread,
+   .comm = thread__comm(al->thread),
+   .ip = al->addr,
+   .ms = {
+   .map = al->map,
+   .sym = al->sym,
+   },
+   .parent = iter->parent,
+   };
int err = 0;
+   int i;
+
+   /*
+* Check if there's duplicate entries in the callchain.
+* It's possible that it has cycles or recursive calls.
+*/
+   for (i = 0; i < iter->curr; i++) {
+   if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
+   return 0;
+   }
 
   he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
sample->period, sample->weight,
@@ -519,6 +563,8 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
if (he == NULL)
return -ENOMEM;
 
+   he_cache[iter->curr++] = he;
+
/*
 * Only in the TUI browser we are doing integrated annotation,
 * so we don't allocated the extra space needed because the stdio
@@ -547,6 +593,8 @@ iter_finish_cumulative_entry(struct add_entry_iter *iter,
evsel->hists.stats.total_period += sample->period;
hists__inc_nr_events(>hists, PERF_RECORD_SAMPLE);
 
+   free(iter->priv);
+   iter->priv = NULL;
return 0;
 }
 
-- 
1.7.11.7



[PATCH 07/18] perf hists: Check if accumulated when adding a hist entry

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

To support callchain accumulation, add_hist_entry() needs to know
whether @entry is an accumulated entry or not.  The period of an
accumulated entry should be added to ->stat_acc but not to ->stat.
Add a @sample_self arg for that.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-annotate.c |  3 ++-
 tools/perf/builtin-diff.c |  2 +-
 tools/perf/builtin-report.c   |  6 +++---
 tools/perf/builtin-top.c  |  2 +-
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c| 23 +++
 tools/perf/util/hist.h|  3 ++-
 7 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 6fd52c8fa682..9c89bb2e3002 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -65,7 +65,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
 
-   he = __hists__add_entry(>hists, al, NULL, NULL, NULL, 1, 1, 0);
+   he = __hists__add_entry(>hists, al, NULL, NULL, NULL, 1, 1, 0,
+   true);
if (he == NULL)
return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 2a85cc9a2d09..4dbc14c33ab9 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -308,7 +308,7 @@ static int hists__add_entry(struct hists *hists,
u64 weight, u64 transaction)
 {
if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-  transaction) != NULL)
+  transaction, true) != NULL)
return 0;
return -ENOMEM;
 }
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 5830bf923955..4e4572b47e04 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -154,7 +154,7 @@ iter_add_single_mem_entry(struct add_entry_iter *iter, 
struct addr_location *al)
 * and the he_stat__add_period() function.
 */
he = __hists__add_entry(>evsel->hists, al, iter->parent, NULL, mi,
-   cost, cost, 0);
+   cost, cost, 0, true);
if (!he)
return -ENOMEM;
 
@@ -286,7 +286,7 @@ iter_add_next_branch_entry(struct add_entry_iter *iter, 
struct addr_location *al
 * and not events sampled. Thus we use a pseudo period of 1.
 */
he = __hists__add_entry(>hists, al, iter->parent, [i], NULL,
-   1, 1, 0);
+   1, 1, 0, true);
if (he == NULL)
return -ENOMEM;
 
@@ -351,7 +351,7 @@ iter_add_single_normal_entry(struct add_entry_iter *iter, 
struct addr_location *
 
he = __hists__add_entry(>hists, al, iter->parent, NULL, NULL,
sample->period, sample->weight,
-   sample->transaction);
+   sample->transaction, true);
if (he == NULL)
return -ENOMEM;
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 03d37a76c612..ef54e9d1468f 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -248,7 +248,7 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct 
perf_evsel *evsel,
pthread_mutex_lock(>hists.lock);
he = __hists__add_entry(>hists, al, NULL, NULL, NULL,
sample->period, sample->weight,
-   sample->transaction);
+   sample->transaction, true);
pthread_mutex_unlock(>hists.lock);
if (he == NULL)
return NULL;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 173bf42cc03e..b829c2a1a598 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -223,7 +223,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
goto out;
 
he = __hists__add_entry(>hists, , NULL,
-   NULL, NULL, 1, 1, 0);
+   NULL, NULL, 1, 1, 0, true);
if (he == NULL)
goto out;
 
@@ -246,7 +246,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
goto out;
 
he = __hists__add_entry(>hists, , NULL,
-   NULL, NULL, 1, 1, 0);
+   NULL, NULL, 1, 1, 0, true);
if (he == NULL)
goto out;
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6dfa1b48a1a9..22b80b509c85 100644
--- a/tools/perf/util/hist.c
+++ 

[PATCH 06/18] perf hists: Add support for accumulated stat of hist entry

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Maintain accumulated stat information in hist_entry->stat_acc if
symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc initially
have the same values as ->stat, and will be updated as the callchain is
processed later.
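
A minimal sketch of the allocation pattern described above, using
hypothetical cut-down types rather than the real hist_entry: the
accumulated stat is a separately allocated copy of the self stat, made
only when accumulation is enabled, and the entry is freed again if that
allocation fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for he_stat and hist_entry. */
struct stat_sketch {
	uint64_t period;
	uint64_t nr_events;
};

struct entry_sketch {
	struct stat_sketch stat;
	struct stat_sketch *stat_acc;	/* NULL unless accumulation is on */
};

/* Create an entry from a template, optionally with an accumulated stat. */
static struct entry_sketch *entry_new(const struct entry_sketch *tmpl,
				      bool cumulate)
{
	struct entry_sketch *he;

	assert(tmpl != NULL);

	he = malloc(sizeof(*he));
	if (he == NULL)
		return NULL;

	*he = *tmpl;
	he->stat_acc = NULL;

	if (cumulate) {
		he->stat_acc = malloc(sizeof(*he->stat_acc));
		if (he->stat_acc == NULL) {
			free(he);
			return NULL;
		}
		/* Start out with the same values as the self stat. */
		memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
	}
	return he;
}

static void entry_free(struct entry_sketch *he)
{
	if (he == NULL)
		return;
	free(he->stat_acc);
	free(he);
}
```

Keeping ->stat_acc behind a pointer means entries pay for the extra
stat only when accumulation is actually requested.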

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/hist.c   | 18 ++
 tools/perf/util/sort.h   |  1 +
 tools/perf/util/symbol.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 1f84314546a2..6dfa1b48a1a9 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -238,6 +238,8 @@ static bool hists__decay_entry(struct hists *hists, struct 
hist_entry *he)
return true;
 
he_stat__decay(>stat);
+   if (symbol_conf.cumulate_callchain)
+   he_stat__decay(he->stat_acc);
 
if (!he->filtered)
hists->stats.total_period -= prev_period - he->stat.period;
@@ -285,6 +287,15 @@ static struct hist_entry *hist_entry__new(struct 
hist_entry *template)
if (he != NULL) {
*he = *template;
 
+   if (symbol_conf.cumulate_callchain) {
+   he->stat_acc = malloc(sizeof(he->stat));
+   if (he->stat_acc == NULL) {
+   free(he);
+   return NULL;
+   }
+   memcpy(he->stat_acc, >stat, sizeof(he->stat));
+   }
+
if (he->ms.map)
he->ms.map->referenced = true;
 
@@ -296,6 +307,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry 
*template)
 */
he->branch_info = malloc(sizeof(*he->branch_info));
if (he->branch_info == NULL) {
+   free(he->stat_acc);
free(he);
return NULL;
}
@@ -368,6 +380,8 @@ static struct hist_entry *add_hist_entry(struct hists 
*hists,
 
if (!cmp) {
he_stat__add_period(>stat, period, weight);
+   if (symbol_conf.cumulate_callchain)
+   he_stat__add_period(he->stat_acc, period, 
weight);
 
/*
 * This mem info was allocated from machine__resolve_mem
@@ -404,6 +418,8 @@ static struct hist_entry *add_hist_entry(struct hists 
*hists,
rb_insert_color(>rb_node_in, hists->entries_in);
 out:
he_stat__add_cpumode_period(>stat, al->cpumode, period);
+   if (symbol_conf.cumulate_callchain)
+   he_stat__add_cpumode_period(he->stat_acc, al->cpumode, period);
return he;
 }
 
@@ -503,6 +519,8 @@ static bool hists__collapse_insert_entry(struct hists 
*hists __maybe_unused,
 
if (!cmp) {
he_stat__add_stat(>stat, >stat);
+   if (symbol_conf.cumulate_callchain)
+   he_stat__add_stat(iter->stat_acc, he->stat_acc);
 
if (symbol_conf.use_callchain) {
callchain_cursor_reset(_cursor);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 43e5ff42a609..309f2838a1b4 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -81,6 +81,7 @@ struct hist_entry {
struct list_head head;
} pairs;
struct he_stat  stat;
+   struct he_stat  *stat_acc;
struct map_symbol   ms;
struct thread   *thread;
struct comm *comm;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 8a9d910c5345..66f429633804 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -92,6 +92,7 @@ struct symbol_conf {
show_nr_samples,
show_total_period,
use_callchain,
+   cumulate_callchain,
exclude_other,
show_cpu_utilization,
initialized,
-- 
1.7.11.7



[PATCH 04/18] perf tools: Introduce struct add_entry_iter

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

There is some duplicated code for adding hist entries.  The copies
differ in that some have branch info or mem info, but they generally do
the same thing.  So introduce a new struct add_entry_iter and add
callbacks to customize each case in a general way.

The new perf_evsel__add_entry() function will look like:

  iter->prepare_entry();
  iter->add_single_entry();

  while (iter->next_entry())
iter->add_next_entry();

  iter->finish_entry();

This will help further work like the cumulative callchain patchset.
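
For readers who want to see the call sequence above as something that
compiles, here is a stand-alone sketch.  It is not the code from this
patch: struct entry_iter, add_entry() and the demo_* callbacks are
hypothetical, the callbacks take only the iterator, and they merely
count how many entries would be added.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct add_entry_iter. */
struct entry_iter {
	int remaining;	/* extra entries (e.g. callchain nodes) left to visit */
	int added;	/* number of entries added so far */

	int (*prepare_entry)(struct entry_iter *);
	int (*add_single_entry)(struct entry_iter *);
	int (*next_entry)(struct entry_iter *);	/* non-zero while entries remain */
	int (*add_next_entry)(struct entry_iter *);
	int (*finish_entry)(struct entry_iter *);
};

/* Generic driver: the same loop serves every kind of iterator. */
static int add_entry(struct entry_iter *iter)
{
	int err;

	assert(iter != NULL);

	err = iter->prepare_entry(iter);
	if (err)
		return err;

	err = iter->add_single_entry(iter);
	if (err)
		return err;

	while (iter->next_entry(iter)) {
		err = iter->add_next_entry(iter);
		if (err)
			return err;
	}

	return iter->finish_entry(iter);
}

/* Demo callbacks that only count how many entries would be added. */
static int demo_prepare(struct entry_iter *iter) { iter->added = 0; return 0; }
static int demo_add(struct entry_iter *iter) { iter->added++; return 0; }
static int demo_next(struct entry_iter *iter)
{
	if (iter->remaining == 0)
		return 0;
	iter->remaining--;
	return 1;
}
static int demo_finish(struct entry_iter *iter) { (void)iter; return 0; }

static struct entry_iter demo_iter(int remaining)
{
	struct entry_iter iter = {
		.remaining		= remaining,
		.prepare_entry		= demo_prepare,
		.add_single_entry	= demo_add,
		.next_entry		= demo_next,
		.add_next_entry		= demo_add,
		.finish_entry		= demo_finish,
	};
	return iter;
}
```

An iterator with nothing extra to walk can simply supply a next_entry
that always returns 0, which is the role the nop callbacks play in the
patch.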

Cc: Jiri Olsa 
Cc: Stephane Eranian 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c | 453 +---
 1 file changed, 300 insertions(+), 153 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3a14dbed387c..5830bf923955 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -75,38 +75,74 @@ static int perf_report_config(const char *var, const char 
*value, void *cb)
return perf_default_config(var, value, cb);
 }
 
-static int perf_report__add_mem_hist_entry(struct perf_tool *tool,
-  struct addr_location *al,
-  struct perf_sample *sample,
-  struct perf_evsel *evsel,
-  struct machine *machine,
-  union perf_event *event)
-{
-   struct perf_report *rep = container_of(tool, struct perf_report, tool);
-   struct symbol *parent = NULL;
-   u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
-   int err = 0;
+struct add_entry_iter {
+   int total;
+   int curr;
+
+   struct perf_report *rep;
+   struct perf_evsel *evsel;
+   struct perf_sample *sample;
struct hist_entry *he;
-   struct mem_info *mi, *mx;
-   uint64_t cost;
+   struct symbol *parent;
+   void *priv;
+
+   int (*prepare_entry)(struct add_entry_iter *, struct machine *,
+struct perf_evsel *, struct addr_location *,
+struct perf_sample *);
+   int (*add_single_entry)(struct add_entry_iter *, struct addr_location 
*);
+   int (*next_entry)(struct add_entry_iter *, struct addr_location *);
+   int (*add_next_entry)(struct add_entry_iter *, struct addr_location *);
+   int (*finish_entry)(struct add_entry_iter *, struct addr_location *);
+};
 
-   if ((sort__has_parent || symbol_conf.use_callchain) &&
-   sample->callchain) {
-   err = machine__resolve_callchain(machine, evsel, al->thread,
-sample, , al,
-rep->max_stack);
-   if (err)
-   return err;
-   }
+static int
+iter_next_nop_entry(struct add_entry_iter *iter __maybe_unused,
+   struct addr_location *al __maybe_unused)
+{
+   return 0;
+}
+
+static int
+iter_add_next_nop_entry(struct add_entry_iter *iter __maybe_unused,
+   struct addr_location *al __maybe_unused)
+{
+   return 0;
+}
+
+static int
+iter_prepare_mem_entry(struct add_entry_iter *iter, struct machine *machine,
+  struct perf_evsel *evsel, struct addr_location *al,
+  struct perf_sample *sample)
+{
+   union perf_event *event = iter->priv;
+   struct mem_info *mi;
+   u8 cpumode;
+
+   BUG_ON(event == NULL);
+
+   cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
mi = machine__resolve_mem(machine, al->thread, sample, cpumode);
-   if (!mi)
+   if (mi == NULL)
return -ENOMEM;
 
-   if (rep->hide_unresolved && !al->sym)
+   iter->evsel = evsel;
+   iter->sample = sample;
+   iter->priv = mi;
+   return 0;
+}
+
+static int
+iter_add_single_mem_entry(struct add_entry_iter *iter, struct addr_location 
*al)
+{
+   u64 cost;
+   struct mem_info *mi = iter->priv;
+   struct hist_entry *he;
+
+   if (iter->rep->hide_unresolved && !al->sym)
return 0;
 
-   cost = sample->weight;
+   cost = iter->sample->weight;
if (!cost)
cost = 1;
 
@@ -117,17 +153,33 @@ static int perf_report__add_mem_hist_entry(struct 
perf_tool *tool,
 * and this is indirectly achieved by passing period=weight here
 * and the he_stat__add_period() function.
 */
-   he = __hists__add_entry(>hists, al, parent, NULL, mi,
+   he = __hists__add_entry(>evsel->hists, al, iter->parent, NULL, mi,
cost, cost, 0);
if (!he)
return -ENOMEM;
 
+   iter->he = he;
+   return 0;
+}
+
+static int
+iter_finish_mem_entry(struct add_entry_iter *iter, struct addr_location *al)
+{
+   struct perf_evsel 

[PATCH 09/18] perf tools: Update cpumode for each cumulative entry

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

The cpumode and level in struct addr_location were set for a sample
but not updated as cumulative callchains were added.  This led to
non-matching symbol and cpumode in the output.

Update them based on whether the map is a part of the kernel or not.
This is the reverse of what thread__find_addr_map() does.
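
The decision described above boils down to a small table.  The sketch
below restates it in isolation; the enum values are hypothetical
stand-ins for the PERF_RECORD_MISC_* constants, and the three booleans
stand in for the map, machine and perf_guest checks done in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PERF_RECORD_MISC_* cpumode values. */
enum cpumode_sketch {
	MODE_KERNEL,
	MODE_USER,
	MODE_GUEST_KERNEL,
	MODE_GUEST_USER,
	MODE_HYPERVISOR,
};

struct mode_level {
	enum cpumode_sketch cpumode;
	char level;
};

/* Derive cpumode and level from where the map lives. */
static struct mode_level classify(bool map_is_kernel, bool machine_is_host,
				  bool perf_guest)
{
	struct mode_level ml = { MODE_USER, '\0' };

	if (map_is_kernel) {
		if (machine_is_host) {
			ml.cpumode = MODE_KERNEL;
			ml.level = 'k';
		} else {
			ml.cpumode = MODE_GUEST_KERNEL;
			ml.level = 'g';
		}
	} else {
		if (machine_is_host) {
			ml.cpumode = MODE_USER;
			ml.level = '.';
		} else if (perf_guest) {
			ml.cpumode = MODE_GUEST_USER;
			ml.level = 'u';
		} else {
			ml.cpumode = MODE_HYPERVISOR;
			ml.level = 'H';
		}
	}

	/* Every branch above must have picked a level. */
	assert(ml.level != '\0');
	return ml;
}
```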

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c | 34 +++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3bc48e410d06..80c774615287 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -82,6 +82,7 @@ struct add_entry_iter {
struct perf_report *rep;
struct perf_evsel *evsel;
struct perf_sample *sample;
+   struct machine *machine;
struct hist_entry *he;
struct symbol *parent;
void *priv;
@@ -400,7 +401,7 @@ iter_finish_normal_entry(struct add_entry_iter *iter, 
struct addr_location *al)
 
 static int
 iter_prepare_cumulative_entry(struct add_entry_iter *iter,
- struct machine *machine __maybe_unused,
+ struct machine *machine,
  struct perf_evsel *evsel,
  struct addr_location *al __maybe_unused,
  struct perf_sample *sample)
@@ -409,6 +410,7 @@ iter_prepare_cumulative_entry(struct add_entry_iter *iter,
 
iter->evsel = evsel;
iter->sample = sample;
+   iter->machine = machine;
return 0;
 }
 
@@ -469,9 +471,35 @@ iter_next_cumulative_entry(struct add_entry_iter *iter,
else
al->addr = node->ip;
 
-   if (iter->rep->hide_unresolved && al->sym == NULL)
-   return 0;
+   if (al->sym == NULL) {
+   if (iter->rep->hide_unresolved)
+   return 0;
+   if (al->map == NULL)
+   goto out;
+   }
 
+   if (al->map->groups == >machine->kmaps) {
+   if (machine__is_host(iter->machine)) {
+   al->cpumode = PERF_RECORD_MISC_KERNEL;
+   al->level = 'k';
+   } else {
+   al->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+   al->level = 'g';
+   }
+   } else {
+   if (machine__is_host(iter->machine)) {
+   al->cpumode = PERF_RECORD_MISC_USER;
+   al->level = '.';
+   } else if (perf_guest) {
+   al->cpumode = PERF_RECORD_MISC_GUEST_USER;
+   al->level = 'u';
+   } else {
+   al->cpumode = PERF_RECORD_MISC_HYPERVISOR;
+   al->level = 'H';
+   }
+   }
+
+out:
callchain_cursor_advance(_cursor);
return 1;
 }
-- 
1.7.11.7



[PATCH 08/18] perf hists: Accumulate hist entry stat based on the callchain

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Call __hists__add_entry() for each callchain node to get an
accumulated stat for an entry.  Introduce new cumulative_iter ops to
process them properly.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c | 136 +++-
 tools/perf/ui/stdio/hist.c  |   2 +-
 2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4e4572b47e04..3bc48e410d06 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -398,6 +398,130 @@ iter_finish_normal_entry(struct add_entry_iter *iter, 
struct addr_location *al)
return err;
 }
 
+static int
+iter_prepare_cumulative_entry(struct add_entry_iter *iter,
+ struct machine *machine __maybe_unused,
+ struct perf_evsel *evsel,
+ struct addr_location *al __maybe_unused,
+ struct perf_sample *sample)
+{
+   callchain_cursor_commit(_cursor);
+
+   iter->evsel = evsel;
+   iter->sample = sample;
+   return 0;
+}
+
+static int
+iter_add_single_cumulative_entry(struct add_entry_iter *iter,
+struct addr_location *al)
+{
+   struct perf_evsel *evsel = iter->evsel;
+   struct perf_sample *sample = iter->sample;
+   struct hist_entry *he;
+   int err = 0;
+
+   he = __hists__add_entry(>hists, al, iter->parent, NULL, NULL,
+   sample->period, sample->weight,
+   sample->transaction, true);
+   if (he == NULL)
+   return -ENOMEM;
+
+   /*
+* This is for putting parents upward during output resort iff
+* only a child gets sampled.  See hist_entry__sort_on_period().
+*/
+   he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
+
+   /*
+* Only in the TUI browser we are doing integrated annotation,
+* so we don't allocated the extra space needed because the stdio
+* code will not use it.
+*/
+   if (sort__has_sym && he->ms.sym && use_browser == 1) {
+   struct annotation *notes = symbol__annotation(he->ms.sym);
+
+   assert(evsel != NULL);
+
+   if (notes->src == NULL && symbol__alloc_hist(he->ms.sym) < 0)
+   return -ENOMEM;
+
+   err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+   }
+
+   return err;
+}
+
+static int
+iter_next_cumulative_entry(struct add_entry_iter *iter,
+  struct addr_location *al)
+{
+   struct callchain_cursor_node *node;
+
+   node = callchain_cursor_current(_cursor);
+   if (node == NULL)
+   return 0;
+
+   al->map = node->map;
+   al->sym = node->sym;
+   if (node->map)
+   al->addr = node->map->map_ip(node->map, node->ip);
+   else
+   al->addr = node->ip;
+
+   if (iter->rep->hide_unresolved && al->sym == NULL)
+   return 0;
+
+   callchain_cursor_advance(_cursor);
+   return 1;
+}
+
+static int
+iter_add_next_cumulative_entry(struct add_entry_iter *iter,
+  struct addr_location *al)
+{
+   struct perf_evsel *evsel = iter->evsel;
+   struct perf_sample *sample = iter->sample;
+   struct hist_entry *he;
+   int err = 0;
+
+   he = __hists__add_entry(>hists, al, iter->parent, NULL, NULL,
+   sample->period, sample->weight,
+   sample->transaction, false);
+   if (he == NULL)
+   return -ENOMEM;
+
+   /*
+* Only in the TUI browser we are doing integrated annotation,
+* so we don't allocated the extra space needed because the stdio
+* code will not use it.
+*/
+   if (sort__has_sym && he->ms.sym && use_browser == 1) {
+   struct annotation *notes = symbol__annotation(he->ms.sym);
+
+   assert(evsel != NULL);
+
+   if (notes->src == NULL && symbol__alloc_hist(he->ms.sym) < 0)
+   return -ENOMEM;
+
+   err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+   }
+   return err;
+}
+
+static int
+iter_finish_cumulative_entry(struct add_entry_iter *iter,
+struct addr_location *al __maybe_unused)
+{
+   struct perf_evsel *evsel = iter->evsel;
+   struct perf_sample *sample = iter->sample;
+
+   evsel->hists.stats.total_period += sample->period;
+   hists__inc_nr_events(>hists, PERF_RECORD_SAMPLE);
+
+   return 0;
+}
+
 static struct add_entry_iter mem_iter = {
.prepare_entry  = iter_prepare_mem_entry,
.add_single_entry   = iter_add_single_mem_entry,
@@ -422,6 +546,14 @@ static struct add_entry_iter normal_iter = {
.finish_entry

[PATCH 15/18] perf tools: Apply percent-limit to cumulative percentage

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

If the -g cumulative option is given, entries which don't have self
overhead need to be shown too.  So apply percent-limit to the
accumulated overhead percentage in this case.
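
The check repeated in each UI below can be condensed into one helper.
This is only a sketch with made-up names (the patch open-codes the
computation at every call site), and the guard against a zero total is
an addition that is not in the diff:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Percentage used for the percent-limit check: based on the accumulated
 * period when accumulation is on, on the self period otherwise.
 */
static double entry_percent(uint64_t self_period, uint64_t acc_period,
			    uint64_t total_period, bool cumulate)
{
	uint64_t period = cumulate ? acc_period : self_period;

	if (total_period == 0)
		return 0.0;

	return period * 100.0 / total_period;
}

/* An entry is shown only if it reaches the minimum percentage. */
static bool entry_passes_limit(uint64_t self_period, uint64_t acc_period,
			       uint64_t total_period, bool cumulate,
			       double min_pcnt)
{
	assert(min_pcnt >= 0.0);

	return entry_percent(self_period, acc_period, total_period,
			     cumulate) >= min_pcnt;
}
```

An entry with no self overhead but an accumulated period of 50 out of
200 would be dropped by a 0.5% limit without accumulation and kept
with it.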

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/browsers/hists.c | 34 ++
 tools/perf/ui/gtk/hists.c  | 11 +--
 tools/perf/ui/stdio/hist.c | 11 +--
 3 files changed, 44 insertions(+), 12 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index efa78894f70d..b02e71ecc5fe 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -828,12 +828,19 @@ static unsigned int hist_browser__refresh(struct 
ui_browser *browser)
 
for (nd = browser->top; nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-   float percent = h->stat.period * 100.0 /
-   hb->hists->stats.total_period;
+   float percent;
 
if (h->filtered)
continue;
 
+   if (symbol_conf.cumulate_callchain) {
+   percent = h->stat_acc->period * 100.0 /
+   hb->hists->stats.total_period;
+   } else {
+   percent = h->stat.period * 100.0 /
+   hb->hists->stats.total_period;
+   }
+
if (percent < hb->min_pcnt)
continue;
 
@@ -851,13 +858,17 @@ static struct rb_node *hists__filter_entries(struct 
rb_node *nd,
 {
while (nd != NULL) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-   float percent = h->stat.period * 100.0 /
-   hists->stats.total_period;
+   float percent;
 
-   if (percent < min_pcnt)
-   return NULL;
+   if (symbol_conf.cumulate_callchain) {
+   percent = h->stat_acc->period * 100.0 /
+   hists->stats.total_period;
+   } else {
+   percent = h->stat.period * 100.0 /
+   hists->stats.total_period;
+   }
 
-   if (!h->filtered)
+   if (!h->filtered && percent >= min_pcnt)
return nd;
 
nd = rb_next(nd);
@@ -872,8 +883,15 @@ static struct rb_node *hists__filter_prev_entries(struct 
rb_node *nd,
 {
while (nd != NULL) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-   float percent = h->stat.period * 100.0 /
+   float percent;
+
+   if (symbol_conf.cumulate_callchain) {
+   percent = h->stat_acc->period * 100.0 /
+   hists->stats.total_period;
+   } else {
+   percent = h->stat.period * 100.0 /
hists->stats.total_period;
+   }
 
if (!h->filtered && percent >= min_pcnt)
return nd;
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 70ed0d5e1b94..06ae3342e14f 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -296,12 +296,19 @@ static void perf_gtk__show_hists(GtkWidget *window, 
struct hists *hists,
for (nd = rb_first(>entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
GtkTreeIter iter;
-   float percent = h->stat.period * 100.0 /
-   hists->stats.total_period;
+   float percent;
 
if (h->filtered)
continue;
 
+   if (symbol_conf.cumulate_callchain) {
+   percent = h->stat_acc->period * 100.0 /
+   hists->stats.total_period;
+   } else {
+   percent = h->stat.period * 100.0 /
+   hists->stats.total_period;
+   }
+
if (percent < min_pcnt)
continue;
 
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 4c4986e809d8..7ea8502192b0 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -487,12 +487,19 @@ print_entries:
 
for (nd = rb_first(>entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-   float percent = h->stat.period * 100.0 /
-   hists->stats.total_period;
+   float percent;
 
if (h->filtered)
continue;
 
+   if 

[PATCH 16/18] perf tools: Add more hpp helper functions

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Sometimes some columns need to be disabled at runtime.  Add helper
functions to support that.

Cc: Jiri Olsa 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/hist.c   | 17 +
 tools/perf/util/hist.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index b365260645d3..2a076dd86518 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -275,12 +275,29 @@ void perf_hpp__column_register(struct perf_hpp_fmt 
*format)
list_add_tail(>list, _hpp__list);
 }
 
+void perf_hpp__column_unregister(struct perf_hpp_fmt *format)
+{
+   list_del(>list);
+}
+
 void perf_hpp__column_enable(unsigned col)
 {
BUG_ON(col >= PERF_HPP__MAX_INDEX);
perf_hpp__column_register(_hpp__format[col]);
 }
 
+void perf_hpp__column_disable(unsigned col)
+{
+   BUG_ON(col >= PERF_HPP__MAX_INDEX);
+   perf_hpp__column_unregister(_hpp__format[col]);
+}
+
+void perf_hpp__cancel_cumulate(void)
+{
+   perf_hpp__column_disable(PERF_HPP__OVERHEAD_ACC);
+   perf_hpp__format[PERF_HPP__OVERHEAD].header = hpp__header_overhead;
+}
+
 int hist_entry__sort_snprintf(struct hist_entry *he, char *s, size_t size,
  struct hists *hists)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 60134e79103d..5a3bccb5413e 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -169,7 +169,10 @@ enum {
 
 void perf_hpp__init(void);
 void perf_hpp__column_register(struct perf_hpp_fmt *format);
+void perf_hpp__column_unregister(struct perf_hpp_fmt *format);
 void perf_hpp__column_enable(unsigned col);
+void perf_hpp__column_disable(unsigned col);
+void perf_hpp__cancel_cumulate(void);
 
 static inline size_t perf_hpp__use_color(void)
 {
-- 
1.7.11.7



[PATCH 17/18] perf report: Add --cumulate option

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

The --cumulate option is for showing accumulated overhead (period)
value as well as self overhead.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-report.txt |  5 +
 tools/perf/builtin-report.c  | 12 +++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 8eab8a4bdeb8..44e53ea45098 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -141,6 +141,11 @@ OPTIONS
 
Default: fractal,0.5,callee,function.
 
+--cumulate::
+   Accumulate callchain to parent entry so that they can show up in the
+   output.  The output will have a new "Overhead/acc." column and will
+   be sorted on the data.  It requires callchains to be recorded.
+
 --max-stack::
Set the stack depth limit when parsing the callchain, anything
beyond the specified depth will be ignored. This is a trade-off
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index b2bcb98a7300..206947a52fa8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -770,6 +770,14 @@ static int perf_report__setup_sample_type(struct 
perf_report *rep)
}
}
 
+   if (symbol_conf.cumulate_callchain) {
+   /* Silently ignore if callchain is not used */
+   if (!symbol_conf.use_callchain) {
+   symbol_conf.cumulate_callchain = false;
+   perf_hpp__cancel_cumulate();
+   }
+   }
+
if (sort__mode == SORT_MODE__BRANCH) {
if (!is_pipe &&
!(sample_type & PERF_SAMPLE_BRANCH_STACK)) {
@@ -1197,8 +1205,10 @@ int cmd_report(int argc, const char **argv, const char 
*prefix __maybe_unused)
OPT_BOOLEAN('x', "exclude-other", _conf.exclude_other,
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('g', "call-graph", , 
"output_type,min_percent[,print_limit],call_order",
-"Display callchains using output_type (graph, flat, 
fractal, or none) , min percent threshold, optional print limit, callchain 
order, key (function or address). "
+"Display callchains using output_type (graph, flat, 
fractal or none) , min percent threshold, optional print limit, callchain 
order, key (function or address). "
 "Default: fractal,0.5,callee,function", 
_callchain_opt, callchain_default_opt),
+   OPT_BOOLEAN(0, "cumulate", _conf.cumulate_callchain,
+   "Accumulate callchain and show cumulative overhead as 
well"),
OPT_INTEGER(0, "max-stack", _stack,
"Set the maximum stack depth when parsing the callchain, "
"anything beyond the specified depth will be ignored. "
-- 
1.7.11.7



Re: Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-17 Thread Minchan Kim
Hello,

Please don't break the thread.
You should reply to my mail instead of your original post.

On Wed, Dec 18, 2013 at 01:29:37PM +0900, Chanho Min wrote:
> 
> > I did test it on x86 with a USB stick and on ARM with eMMC on my Nexus 4.
> > In my experiment, I couldn't see as much gain as you did on either system,
> > and it even regressed in the bs=32k test, maybe due to workqueue
> > allocation/scheduling of work per I/O.
> > Is your test rather special, or what am I missing?
> Can you specify your test result on ARM with eMMC?

Sure.
bs      before   after
32K     3.6M     3.4M
64K     6.3M     8.2M
128K    11.4M    11.7M
160K    13.6M    13.8M
256K    19.8M    19M
288K    21.3M    20.8M

> 
> > Before that, I'd like to know the fundamental reason why your implementation
> > of asynchronous read improves performance. At first glance, I thought it was
> > caused by readahead from the MM layer, but when I read the code, I found I
> > was wrong. The MM's readahead logic works based on the PageReadahead marker,
> > but squashfs invalidates it via grab_cache_page_nowait, so it wouldn't work
> > as we expected.
> >
> > Another possibility is block I/O merging in the block layer by the plugging
> > logic, which is what I tried a few months ago, although the implementation
> > was really bad. But it wouldn't work with your patch because
> > do_generic_file_read will unplug the block layer via lock_page without
> > merging enough I/O.
> >
> > So, what do you think is the real cause of the improvement in your experiment?
> > Then I could investigate why I can't get a benefit.
> Currently, squashfs adds requests to the block device queue synchronously and
> waits for completion. mmc takes these requests one by one and pushes them to
> the host driver, but this lets mmc go idle frequently. This patch allows block
> requests to be added asynchronously without waiting for completion, so mmcqd
> can fetch a lot of requests from the block layer at a time. As a result, mmcqd
> stays busy and uses more of the mmc bandwidth.
> For testing, I added two count variables to mmc_queue_thread as below
> and ran the same dd transfer.
> 
> static int mmc_queue_thread(void *d)
> {
> ..
>   do {
>   if (req || mq->mqrq_prev->req) {
>   fetch++;
>   } else {
>   idle++;
>   }
>   } while (1);
> ..
> }
> 
> without patch:
>  fetch: 920, idle: 460
> 
> with patch
>  fetch: 918, idle: 40

That result isn't what I want to know.
What I want to know is why the upper layer issues more I/O per second.

For example, you read 32K, so the MM layer will prepare 8 pages to read in, but
when issuing the first page, squashfs makes 32 pages and fills the page cache
(assuming you use 128K compression), so the 7 pages the MM layer already
prepared would be freed without further I/O, and do_generic_file_read will wait
for completion in lock_page without queueing further I/O. That's not surprising.
One of the freed pages is the READA-marked page, so readahead couldn't work.
If readahead works, it would be just by luck. Actually, by simulating a
64K dd, I found the readahead logic would be triggered, but that's just by luck
and not intended, I think.

Once the first issued I/O completes, squashfs decompresses it into 128K of
pages, so all 4 iterations (128K/32K) would hit in the page cache.
Once all 128K has hit in the page cache, the MM layer starts to issue the next
I/O and repeats the above logic until you end up reading the whole file.
So my opinion is that the upper layer wouldn't issue more I/O, logically.
If it did, it's not what we expect but a side effect.
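
To make the arithmetic above concrete, here is a small sketch. It assumes
a 4K page size (which the "32K = 8 pages" figure implies), and the macro
and helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size; not stated in the mail, implied by 32K == 8 pages. */
#define PAGE_SIZE_SKETCH (4u * 1024u)

/* Number of pages covering a page-aligned byte count. */
static size_t pages_for(size_t bytes)
{
	assert(bytes % PAGE_SIZE_SKETCH == 0);
	return bytes / PAGE_SIZE_SKETCH;
}

/* How many reads of read_bytes one decompressed block can satisfy. */
static size_t reads_per_block(size_t block_bytes, size_t read_bytes)
{
	assert(read_bytes != 0 && block_bytes % read_bytes == 0);
	return block_bytes / read_bytes;
}
```

With a 32K read and a 128K squashfs block, pages_for() gives the 8 pages
prepared by MM and the 32 pages filled by squashfs, and reads_per_block()
gives the 4 iterations that are served from the page cache.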

That's why I'd like to know your thoughts on why IOPS increased.
Please, could you explain why you think IOPS increased, rather than give a
result from the low-level driver?

Anyway, in my opinion, we should take care of the MM layer's readahead to
improve sequential I/O. For that, we should use the buffer pages passed by MM
instead of freeing them and allocating new pages in squashfs.
IMHO, it would be better to implement squashfs_readpages, but my insight
is very weak, so I guess Phillip will have better ideas/insight about
the issue.

Thanks!


> 
> Thanks
> Chanho.
> 

-- 
Kind regards,
Minchan Kim


[PATCH 18/18] perf report: Add report.cumulate config option

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Add report.cumulate config option for setting default value of
callchain accumulation.  It affects the report output only if
perf.data contains callchain info.

A user can write .perfconfig file like below to enable accumulation
by default:

  $ cat ~/.perfconfig
  [report]
  cumulate = true

And it can be disabled through command line:

  $ perf report --no-cumulate

Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 206947a52fa8..923ed2752209 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -71,6 +71,10 @@ static int perf_report_config(const char *var, const char *value, void *cb)
rep->min_percent = strtof(value, NULL);
return 0;
}
+   if (!strcmp(var, "report.cumulate")) {
+   symbol_conf.cumulate_callchain = perf_config_bool(var, value);
+   return 0;
+   }
 
return perf_default_config(var, value, cb);
 }
-- 
1.7.11.7
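
The interaction between the config file default and the command-line override described in the commit message can be sketched as follows. This is a simplified, self-contained illustration and not perf's implementation: `struct report_opts`, `parse_bool()`, `report_config()` and `apply_cmdline()` are hypothetical names, and the boolean parsing is a reduced stand-in for whatever the real config code accepts.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical option block; stands in for the tool's real settings. */
struct report_opts {
	bool cumulate;
};

/* Simplified boolean parser (assumption: only these spellings mean true). */
static bool parse_bool(const char *value)
{
	return value && (!strcmp(value, "true") || !strcmp(value, "yes") ||
			 !strcmp(value, "on") || !strcmp(value, "1"));
}

/* Config callback: returns 0 if the key was handled, -1 if it is unknown. */
static int report_config(struct report_opts *opts, const char *var,
			 const char *value)
{
	if (!strcmp(var, "report.cumulate")) {
		opts->cumulate = parse_bool(value);
		return 0;
	}
	return -1;
}

/* The command line is applied after the config file, so it wins. */
static void apply_cmdline(struct report_opts *opts, const char *arg)
{
	if (!strcmp(arg, "--no-cumulate"))
		opts->cumulate = false;
}
```

The point of the ordering is that the config file only sets a default: parsing `report.cumulate = true` first and then seeing `--no-cumulate` leaves accumulation disabled.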



Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
On Wed, Dec 18, 2013 at 12:58 PM, Tom Herbert  wrote:
>>> Yes , in it's current state it's broken. But maybe we can try to fix
>>> it instead of arbitrarily removing it. Please see my patches on
>>> plumbing RFS into tuntap which may start to make it useful.
>> Do you mean you patch [5/5] tun: Added support for RFS on tun flows?
>> Sorry, can you say with more details?
>
> Correct. It was RFC since I didn't have a good way to test, if you do
> please try it and see if there's any effect. We should also be able to

Interesting, I will try to dig into it. Sorry, I don't understand why you
can't test it. Does it require some special hardware support, or other
facilities?

> do something similar for KVM guests, either doing the flow lookup on
> each packet from the guest, or use aRFS interface from the guest
> driver for end to end RFS (more exciting prospect). We are finding

Which two ends do you mean?

> that guest to driver accelerations like this (and tso, lro) are quite

Sorry, I am a bit confused: does the driver here mean "virtio_net" or the
tuntap driver?

> important in getting virtual networking performance up.
>
>>
>>>
>>> Tom
>>>
 Signed-off-by: Zhi Yong Wu 
 ---
  drivers/net/tun.c |  208 
 +++--
  1 files changed, 10 insertions(+), 198 deletions(-)

 diff --git a/drivers/net/tun.c b/drivers/net/tun.c
 index 7c8343a..7c27fdc 100644
 --- a/drivers/net/tun.c
 +++ b/drivers/net/tun.c
 @@ -32,12 +32,15 @@
   *
   *  Daniel Podlejski 
   *Modifications for 2.3.99-pre5 kernel.
 + *
 + *  Zhi Yong Wu 
 + *Remove the flow cache.
   */

  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

  #define DRV_NAME   "tun"
 -#define DRV_VERSION"1.6"
 +#define DRV_VERSION"1.7"
  #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "

 @@ -146,18 +149,6 @@ struct tun_file {
 struct tun_struct *detached;
  };

 -struct tun_flow_entry {
 -   struct hlist_node hash_link;
 -   struct rcu_head rcu;
 -   struct tun_struct *tun;
 -
 -   u32 rxhash;
 -   int queue_index;
 -   unsigned long updated;
 -};
 -
 -#define TUN_NUM_FLOW_ENTRIES 1024
 -
  /* Since the socket were moved to tun_file, to preserve the behavior of 
 persist
   * device, socket filter, sndbuf and vnet header size were restore when 
 the
   * file were attached to a persist device.
 @@ -184,163 +175,11 @@ struct tun_struct {
 int debug;
  #endif
 spinlock_t lock;
 -   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
 -   struct timer_list flow_gc_timer;
 -   unsigned long ageing_time;
 unsigned int numdisabled;
 struct list_head disabled;
 void *security;
 -   u32 flow_count;
  };

 -static inline u32 tun_hashfn(u32 rxhash)
 -{
 -   return rxhash & 0x3ff;
 -}
 -
 -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 
 rxhash)
 -{
 -   struct tun_flow_entry *e;
 -
 -   hlist_for_each_entry_rcu(e, head, hash_link) {
 -   if (e->rxhash == rxhash)
 -   return e;
 -   }
 -   return NULL;
 -}
 -
 -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
 - struct hlist_head *head,
 - u32 rxhash, u16 queue_index)
 -{
 -   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);
 -
 -   if (e) {
 -   tun_debug(KERN_INFO, tun, "create flow: hash %u index 
 %u\n",
 - rxhash, queue_index);
 -   e->updated = jiffies;
 -   e->rxhash = rxhash;
 -   e->queue_index = queue_index;
 -   e->tun = tun;
 -   hlist_add_head_rcu(&e->hash_link, head);
 -   ++tun->flow_count;
 -   }
 -   return e;
 -}
 -
 -static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry 
 *e)
 -{
 -   tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n",
 - e->rxhash, e->queue_index);
 -   hlist_del_rcu(&e->hash_link);
 -   kfree_rcu(e, rcu);
 -   --tun->flow_count;
 -}
 -
 -static void tun_flow_flush(struct tun_struct *tun)
 -{
 -   int i;
 -
 -   spin_lock_bh(&tun->lock);
 -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
 -   struct tun_flow_entry *e;
 -   struct hlist_node *n;
 -
 -   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link)
 -   

[PATCH 11/18] perf hists: Sort hist entries by accumulated period

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

When callchain accumulation is requested, we need to sort the entries
by accumulated period value.  When accumulated periods of two entries
are same (i.e. single path callchain) put the caller above since
accumulation tends to put callers on higher position for obvious
reason.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c |  6 ++
 tools/perf/util/hist.c  | 12 
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4ec1a090d1a3..b2bcb98a7300 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -566,6 +566,12 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
he_cache[iter->curr++] = he;
 
/*
+* This is for putting parents upward during output resort iff
+* only a child gets sampled.  See hist_entry__sort_on_period().
+*/
+   he->callchain->max_depth = callchain_cursor.nr - callchain_cursor.pos;
+
+   /*
 * Only in the TUI browser we are doing integrated annotation,
 * so we don't allocated the extra space needed because the stdio
 * code will not use it.
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 22b80b509c85..84fd1e6e9a37 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -626,6 +626,18 @@ static int hist_entry__sort_on_period(struct hist_entry *a,
struct hist_entry *pair;
u64 *periods_a, *periods_b;
 
+	if (symbol_conf.cumulate_callchain) {
+		/*
+		 * Put caller above callee when they have equal period.
+		 */
+		if (a->stat_acc->period != b->stat_acc->period)
+			return a->stat_acc->period > b->stat_acc->period ? 1 : -1;
+
+		if (a->callchain->max_depth != b->callchain->max_depth)
+			return a->callchain->max_depth < b->callchain->max_depth ?
+				1 : -1;
+	}
+
ret = period_cmp(a->stat.period, b->stat.period);
if (ret || !symbol_conf.event_group)
return ret;
-- 
1.7.11.7
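
The ordering rule from the commit message (larger accumulated period first, and on a tie the entry with the smaller `max_depth`, which the patch treats as the caller, first) can be tried out in isolation with a small sketch. The `struct entry` type and function names below are hypothetical, and the comparator follows the `qsort()` convention (negative means "a sorts first"), which is the opposite sign of the `1` returned by the patch for the entry that should be placed higher.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for a hist entry. */
struct entry {
	const char *name;
	uint64_t acc_period;	/* accumulated (self + children) period */
	unsigned int max_depth;	/* smaller value sorts first on a tie */
};

/* qsort() comparator: descending acc_period, then ascending max_depth. */
static int cmp_entries(const void *pa, const void *pb)
{
	const struct entry *a = pa;
	const struct entry *b = pb;

	if (a->acc_period != b->acc_period)
		return a->acc_period > b->acc_period ? -1 : 1;

	if (a->max_depth != b->max_depth)
		return a->max_depth < b->max_depth ? -1 : 1;

	return 0;
}

static void sort_entries(struct entry *v, size_t n)
{
	qsort(v, n, sizeof(*v), cmp_entries);
}
```

With a single-path callchain, caller and callee end up with the same accumulated period, so the tie-break on depth is what keeps the caller above the callee in the output.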



[PATCH 13/18] perf ui/browser: Add support to accumulated hist stat

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Print accumulated stat of a hist entry if requested.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/browsers/hists.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index a440e03cd8c2..efa78894f70d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -693,11 +693,26 @@ hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
return __hpp__color_fmt(hpp, he, __hpp_get_##_field, _cb);  \
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field, _cb) \
+static u64 __hpp_get_acc_##_field(struct hist_entry *he)   \
+{  \
+   return he->stat_acc->_field;\
+}  \
+   \
+static int \
+hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
+   struct perf_hpp *hpp,   \
+   struct hist_entry *he)  \
+{  \
+   return __hpp__color_fmt(hpp, he, __hpp_get_acc_##_field, _cb);  \
+}
+
 __HPP_COLOR_PERCENT_FN(overhead, period, __hpp__color_callchain)
 __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys, NULL)
 __HPP_COLOR_PERCENT_FN(overhead_us, period_us, NULL)
 __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys, NULL)
 __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us, NULL)
+__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period, NULL)
 
 #undef __HPP_COLOR_PERCENT_FN
 
@@ -715,6 +730,8 @@ void hist_browser__init_hpp(void)
hist_browser__hpp_color_overhead_guest_sys;
perf_hpp__format[PERF_HPP__OVERHEAD_GUEST_US].color =
hist_browser__hpp_color_overhead_guest_us;
+   perf_hpp__format[PERF_HPP__OVERHEAD_ACC].color =
+   hist_browser__hpp_color_overhead_acc;
 }
 
 static int hist_browser__show_entry(struct hist_browser *browser,
-- 
1.7.11.7
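
The patch relies on token pasting to stamp out one accessor per field, reading either the entry's own stat or the accumulated one. A stripped-down sketch of that pattern, outside of perf, might look like the following. The struct layouts, macro names and `percent_of()` helper are assumptions made for illustration; only the `stat` / `stat_acc` split mirrors the patch.

```c
#include <stdint.h>

/* Hypothetical, reduced versions of the stat and entry structures. */
struct stat_sketch {
	uint64_t period;
	uint64_t period_sys;
};

struct entry_sketch {
	struct stat_sketch stat;	/* self values */
	struct stat_sketch *stat_acc;	/* accumulated values */
};

/* Generate get_<field>(): reads the entry's own value. */
#define DEFINE_SELF_GETTER(_field)					\
static uint64_t get_##_field(const struct entry_sketch *he)		\
{									\
	return he->stat._field;						\
}

/* Generate get_acc_<field>(): reads the accumulated value. */
#define DEFINE_ACC_GETTER(_field)					\
static uint64_t get_acc_##_field(const struct entry_sketch *he)	\
{									\
	return he->stat_acc->_field;					\
}

DEFINE_SELF_GETTER(period)
DEFINE_ACC_GETTER(period)

/* Percentage of `total` represented by `value`; 0 when total is 0. */
static double percent_of(uint64_t value, uint64_t total)
{
	return total ? 100.0 * (double)value / (double)total : 0.0;
}
```

Because both getters share one signature, a formatting routine can take either as a function pointer, which is how the same column code can print a self or an accumulated percentage.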



[PATCH 12/18] perf ui/hist: Add support to accumulated hist stat

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Print accumulated stat of a hist entry if requested.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/hist.c   | 45 +
 tools/perf/util/hist.h |  1 +
 2 files changed, 46 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 78f4c92e9b73..b365260645d3 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -129,6 +129,28 @@ static int hpp__entry_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,\
 			  scnprintf, true);	\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)	\
+static u64 he_get_acc_##_field(struct hist_entry *he)	\
+{	\
+	return he->stat_acc->_field;	\
+}	\
+	\
+static int hpp__color_acc_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
+			      struct perf_hpp *hpp, struct hist_entry *he)	\
+{	\
+	return __hpp__fmt(hpp, he, he_get_acc_##_field, " %6.2f%%",	\
+			  (hpp_snprint_fn)percent_color_snprintf, true);	\
+}
+
+#define __HPP_ENTRY_ACC_PERCENT_FN(_type, _field)	\
+static int hpp__entry_acc_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,	\
+			      struct perf_hpp *hpp, struct hist_entry *he)	\
+{	\
+	const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%";	\
+	return __hpp__fmt(hpp, he, he_get_acc_##_field, fmt,	\
+			  scnprintf, true);	\
+}
+
 #define __HPP_ENTRY_RAW_FN(_type, _field)	\
 static u64 he_get_raw_##_field(struct hist_entry *he)	\
 {	\
@@ -148,17 +170,25 @@ __HPP_WIDTH_FN(_type, _min_width, _unit_width)	\
 __HPP_COLOR_PERCENT_FN(_type, _field)  \
 __HPP_ENTRY_PERCENT_FN(_type, _field)
 
+#define HPP_PERCENT_ACC_FNS(_type, _str, _field, _min_width, _unit_width)\
+__HPP_HEADER_FN(_type, _str, _min_width, _unit_width)  \
+__HPP_WIDTH_FN(_type, _min_width, _unit_width) \
+__HPP_COLOR_ACC_PERCENT_FN(_type, _field)  \
+__HPP_ENTRY_ACC_PERCENT_FN(_type, _field)
+
 #define HPP_RAW_FNS(_type, _str, _field, _min_width, _unit_width)  \
 __HPP_HEADER_FN(_type, _str, _min_width, _unit_width)  \
 __HPP_WIDTH_FN(_type, _min_width, _unit_width) \
 __HPP_ENTRY_RAW_FN(_type, _field)
 
+__HPP_HEADER_FN(overhead_self, "Self", 8, 8)
 
 HPP_PERCENT_FNS(overhead, "Overhead", period, 8, 8)
 HPP_PERCENT_FNS(overhead_sys, "sys", period_sys, 8, 8)
 HPP_PERCENT_FNS(overhead_us, "usr", period_us, 8, 8)
 HPP_PERCENT_FNS(overhead_guest_sys, "guest sys", period_guest_sys, 9, 8)
 HPP_PERCENT_FNS(overhead_guest_us, "guest usr", period_guest_us, 9, 8)
+HPP_PERCENT_ACC_FNS(overhead_acc, "Total", period, 8, 8)
 
 HPP_RAW_FNS(samples, "Samples", nr_events, 12, 12)
 HPP_RAW_FNS(period, "Period", period, 12, 12)
@@ -171,6 +201,14 @@ HPP_RAW_FNS(period, "Period", period, 12, 12)
.entry  = hpp__entry_ ## _name  \
}
 
+#define HPP__COLOR_ACC_PRINT_FNS(_name)\
+   {   \
+   .header = hpp__header_ ## _name,\
+   .width  = hpp__width_ ## _name, \
+   .color  = hpp__color_acc_ ## _name, \
+   .entry  = hpp__entry_acc_ ## _name  \
+   }
+
 #define HPP__PRINT_FNS(_name)  \
{   \
.header = hpp__header_ ## _name,\
@@ -184,6 +222,7 @@ struct perf_hpp_fmt perf_hpp__format[] = {
HPP__COLOR_PRINT_FNS(overhead_us),
HPP__COLOR_PRINT_FNS(overhead_guest_sys),
HPP__COLOR_PRINT_FNS(overhead_guest_us),
+   HPP__COLOR_ACC_PRINT_FNS(overhead_acc),
HPP__PRINT_FNS(samples),
HPP__PRINT_FNS(period)
 };
@@ -208,6 +247,12 @@ void perf_hpp__init(void)
 {
perf_hpp__column_enable(PERF_HPP__OVERHEAD);
 
+   if (symbol_conf.cumulate_callchain) {
+   perf_hpp__format[PERF_HPP__OVERHEAD].header =
+   

[PATCH v3 2/3] regulator: act8865: add device tree binding doc

2013-12-17 Thread Wenyou Yang
Signed-off-by: Wenyou Yang 
---
 .../bindings/regulator/act8865-regulator.txt   |   58 
 .../devicetree/bindings/vendor-prefixes.txt|1 +
 2 files changed, 59 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/regulator/act8865-regulator.txt

diff --git a/Documentation/devicetree/bindings/regulator/act8865-regulator.txt b/Documentation/devicetree/bindings/regulator/act8865-regulator.txt
new file mode 100644
index 0000000..bcea8ab
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/act8865-regulator.txt
@@ -0,0 +1,58 @@
+ACT8865 regulator
+---
+
+Required properties:
+- compatible: "active-semi,act8865"
+- reg: I2C slave address
+
+The valid names for regulators are:
+   DCDC_REG1, DCDC_REG2, DCDC_REG3, LDO_REG1, LDO_REG2, LDO_REG3, LDO_REG4.
+
+Example:
+
+
+	i2c1: i2c@f0018000 {
+		pmic: act8865@5b {
+			compatible = "active-semi,act8865";
+			reg = <0x5b>;
+			status = "disabled";
+
+			regulators {
+				vcc_1v8_reg: DCDC_REG1 {
+					regulator-name = "VCC_1V8";
+					regulator-min-microvolt = <1800000>;
+					regulator-max-microvolt = <1800000>;
+					regulator-always-on;
+				};
+
+				vcc_1v2_reg: DCDC_REG2 {
+					regulator-name = "VCC_1V2";
+					regulator-min-microvolt = <1100000>;
+					regulator-max-microvolt = <1300000>;
+					regulator-suspend-mem-microvolt = <1150000>;
+					regulator-suspend-standby-microvolt = <1150000>;
+					regulator-always-on;
+				};
+
+				vcc_3v3_reg: DCDC_REG3 {
+					regulator-name = "VCC_3V3";
+					regulator-min-microvolt = <3300000>;
+					regulator-max-microvolt = <3300000>;
+					regulator-always-on;
+				};
+
+				vddfuse_reg: LDO_REG1 {
+					regulator-name = "VDDANA";
+					regulator-min-microvolt = <3300000>;
+					regulator-max-microvolt = <3300000>;
+					regulator-always-on;
+				};
+
+				vddana_reg: LDO_REG2 {
+					regulator-name = "FUSE_2V5";
+					regulator-min-microvolt = <2500000>;
+					regulator-max-microvolt = <2500000>;
+				};
+			};
+		};
+	};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index edbb8d8..519421f 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -3,6 +3,7 @@ Device tree binding vendor prefix registry.  Keep list in alphabetical order.
 This isn't an exhaustive list, but you should add new prefixes to it before
 using them to avoid name-space collisions.
 
+active-semi	Active-Semi International Inc
 ad	Avionic Design GmbH
 adi	Analog Devices, Inc.
 aeroflexgaisler	Aeroflex Gaisler AB
-- 
1.7.9.5



[PATCH 14/18] perf ui/gtk: Add support to accumulated hist stat

2013-12-17 Thread Namhyung Kim
From: Namhyung Kim 

Print accumulated stat of a hist entry if requested.

Cc: Arun Sharma 
Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/gtk/hists.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 2ca66cc1160f..70ed0d5e1b94 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -98,11 +98,25 @@ static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,
 	return __hpp__color_fmt(hpp, he, he_get_##_field);	\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)	\
+static u64 he_get_acc_##_field(struct hist_entry *he)	\
+{	\
+	return he->stat_acc->_field;	\
+}	\
+	\
+static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
+				       struct perf_hpp *hpp,	\
+				       struct hist_entry *he)	\
+{	\
+	return __hpp__color_fmt(hpp, he, he_get_acc_##_field);	\
+}
+
 __HPP_COLOR_PERCENT_FN(overhead, period)
 __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys)
 __HPP_COLOR_PERCENT_FN(overhead_us, period_us)
 __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys)
 __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us)
+__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period)
 
 #undef __HPP_COLOR_PERCENT_FN
 
@@ -121,6 +135,8 @@ void perf_gtk__init_hpp(void)
perf_gtk__hpp_color_overhead_guest_sys;
perf_hpp__format[PERF_HPP__OVERHEAD_GUEST_US].color =
perf_gtk__hpp_color_overhead_guest_us;
+   perf_hpp__format[PERF_HPP__OVERHEAD_ACC].color =
+   perf_gtk__hpp_color_overhead_acc;
 }
 
 static void callchain_list__sym_name(struct callchain_list *cl,
-- 
1.7.11.7



[PATCH v3 0/3] regulator: act8865: add PMIC driver

2013-12-17 Thread Wenyou Yang
Hi Mark,

Thanks a lot for your direction.

According to your advice, I prepared this version.

The patch set is to add act8865 PMIC driver.

The Active-Semi ACT8865 is designed as a PMIC for the Atmel sama5d3x and
at91sam9 series.
Its datasheet is available at:
http://www.active-semi.com/sheets/ACT8865_Datasheet.pdf.

The patches are based on the for-next branch of the git repository
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git
and on "[PATCH] regulator: read low power states configuration from device tree"
from Vincent Palatin:
https://patchwork.kernel.org/patch/2833667/

Thanks.

Best Regards,
Wenyou Yang

v3 changelog:
 1./ Add the map_voltage() operation, which was missing.
 2./ Remove the regulator_unregister statement, which is not needed.
 3./ Remove the memset statement.
 4./ Change the device tree regulator-name to the supply name in the schematic.
 5./ List all the regulator names in the binding doc.

v2 changelog:
 1./ Using regmap for register I/O instead of i2c function directly.
 2./ Using the helpers provided by the core.
 3./ Remove noisy logging.
 4./ Using the latest regulator register API.
 5./ Using the module_i2c_driver helper macro to replace module_init and module_exit.
 6./ Remove the vsel-state-low dt property which is not used now.


Wenyou Yang (3):
  regulator: act8865: add PMIC act8865 driver
  regulator: act8865: add device tree binding doc
  ARM: dts: sama5d3xcm: add the regulator device node

 .../bindings/regulator/act8865-regulator.txt   |   58 +++
 .../devicetree/bindings/vendor-prefixes.txt|1 +
 arch/arm/boot/dts/sama5d3xcm.dtsi  |   46 +++
 drivers/regulator/Kconfig  |8 +
 drivers/regulator/Makefile |1 +
 drivers/regulator/act8865-regulator.c  |  381 
 include/linux/regulator/act8865.h  |   53 +++
 7 files changed, 548 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/regulator/act8865-regulator.txt
 create mode 100644 drivers/regulator/act8865-regulator.c
 create mode 100644 include/linux/regulator/act8865.h

-- 
1.7.9.5



Re: [PATCH v3 4/8] pciehp: Don't disable the link permanently, during removal

2013-12-17 Thread Yinghai Lu
On Tue, Dec 17, 2013 at 7:20 PM, Rajat Jain  wrote:
>
> Actually I did not understand the original problem and the solution in the 
> first
> place (so I also do not understand how might disabling of presence detect 
> notification
> help). If you can give more details on the original problem that shall be 
> great. Here
> is what I understood from the commit log:
>
> The believe the HW looks like this:
>
> PCIe port <> Repeater <> Device.
>
> An in addition there is the presence detect pin that is connected directly 
> from
> The port to the device. Now, when the device is plugged out, the pin indicates
> No presence. But are you saying the PCIe link from port to repeater is still 
> up?

After the card is removed from the slot, the PCIe port tries to retrain
the link to the repeater, so the link keeps going up and down.

So the presence bit keeps reporting the card as present, then not present.
That presence bit should be the OR of the in-band input and the out-of-band
input. We checked the out-of-band input and it always reports correctly.

According to the HW guys and Intel, that should be a bug in the repeater.

Disabling the link from the PCIe port to the repeater is likely to reset
the repeater.
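
As a rough illustration of why the reported presence flaps even though the out-of-band pin is stable, here is a small C sketch. It only models the OR described above; the struct and function names are hypothetical and nothing here touches real PCIe registers.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two presence inputs at one point in time. */
struct slot_inputs {
	bool inband;	/* derived from link state (port <-> repeater) */
	bool outband;	/* presence detect pin wired to the slot */
};

/* The reported presence bit: OR of the in-band and out-of-band inputs. */
static bool presence_detect_state(struct slot_inputs s)
{
	return s.inband || s.outband;
}

/* Count how often the reported bit changes across consecutive samples. */
static unsigned int presence_toggles(const struct slot_inputs *samples,
				     unsigned int n)
{
	unsigned int i, toggles = 0;

	for (i = 1; i < n; i++) {
		if (presence_detect_state(samples[i]) !=
		    presence_detect_state(samples[i - 1]))
			toggles++;
	}
	return toggles;
}
```

With the card removed the out-of-band input stays false, so every retrain attempt that brings the in-band input up and down shows through the OR as a presence change. Keeping the link disabled holds the in-band input low and the reported bit stops toggling.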

Thanks

Yinghai


Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Tom Herbert
>> Yes , in it's current state it's broken. But maybe we can try to fix
>> it instead of arbitrarily removing it. Please see my patches on
>> plumbing RFS into tuntap which may start to make it useful.
> Do you mean you patch [5/5] tun: Added support for RFS on tun flows?
> Sorry, can you say with more details?

Correct. It was RFC since I didn't have a good way to test, if you do
please try it and see if there's any effect. We should also be able to
do something similar for KVM guests, either doing the flow lookup on
each packet from the guest, or use aRFS interface from the guest
driver for end to end RFS (more exciting prospect). We are finding
that guest to driver accelerations like this (and tso, lro) are quite
important in getting virtual networking performance up.

>
>>
>> Tom
>>
>>> Signed-off-by: Zhi Yong Wu 
>>> ---
>>>  drivers/net/tun.c |  208 
>>> +++--
>>>  1 files changed, 10 insertions(+), 198 deletions(-)
>>>
>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>> index 7c8343a..7c27fdc 100644
>>> --- a/drivers/net/tun.c
>>> +++ b/drivers/net/tun.c
>>> @@ -32,12 +32,15 @@
>>>   *
>>>   *  Daniel Podlejski 
>>>   *Modifications for 2.3.99-pre5 kernel.
>>> + *
>>> + *  Zhi Yong Wu 
>>> + *Remove the flow cache.
>>>   */
>>>
>>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>>
>>>  #define DRV_NAME   "tun"
>>> -#define DRV_VERSION"1.6"
>>> +#define DRV_VERSION"1.7"
>>>  #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
>>>  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
>>>
>>> @@ -146,18 +149,6 @@ struct tun_file {
>>> struct tun_struct *detached;
>>>  };
>>>
>>> -struct tun_flow_entry {
>>> -   struct hlist_node hash_link;
>>> -   struct rcu_head rcu;
>>> -   struct tun_struct *tun;
>>> -
>>> -   u32 rxhash;
>>> -   int queue_index;
>>> -   unsigned long updated;
>>> -};
>>> -
>>> -#define TUN_NUM_FLOW_ENTRIES 1024
>>> -
>>>  /* Since the socket were moved to tun_file, to preserve the behavior of 
>>> persist
>>>   * device, socket filter, sndbuf and vnet header size were restore when the
>>>   * file were attached to a persist device.
>>> @@ -184,163 +175,11 @@ struct tun_struct {
>>> int debug;
>>>  #endif
>>> spinlock_t lock;
>>> -   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
>>> -   struct timer_list flow_gc_timer;
>>> -   unsigned long ageing_time;
>>> unsigned int numdisabled;
>>> struct list_head disabled;
>>> void *security;
>>> -   u32 flow_count;
>>>  };
>>>
>>> -static inline u32 tun_hashfn(u32 rxhash)
>>> -{
>>> -   return rxhash & 0x3ff;
>>> -}
>>> -
>>> -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 
>>> rxhash)
>>> -{
>>> -   struct tun_flow_entry *e;
>>> -
>>> -   hlist_for_each_entry_rcu(e, head, hash_link) {
>>> -   if (e->rxhash == rxhash)
>>> -   return e;
>>> -   }
>>> -   return NULL;
>>> -}
>>> -
>>> -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
>>> - struct hlist_head *head,
>>> - u32 rxhash, u16 queue_index)
>>> -{
>>> -   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);
>>> -
>>> -   if (e) {
>>> -   tun_debug(KERN_INFO, tun, "create flow: hash %u index %u\n",
>>> - rxhash, queue_index);
>>> -   e->updated = jiffies;
>>> -   e->rxhash = rxhash;
>>> -   e->queue_index = queue_index;
>>> -   e->tun = tun;
>>> -   hlist_add_head_rcu(&e->hash_link, head);
>>> -   ++tun->flow_count;
>>> -   }
>>> -   return e;
>>> -}
>>> -
>>> -static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry 
>>> *e)
>>> -{
>>> -   tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n",
>>> - e->rxhash, e->queue_index);
>>> -   hlist_del_rcu(&e->hash_link);
>>> -   kfree_rcu(e, rcu);
>>> -   --tun->flow_count;
>>> -}
>>> -
>>> -static void tun_flow_flush(struct tun_struct *tun)
>>> -{
>>> -   int i;
>>> -
>>> -   spin_lock_bh(&tun->lock);
>>> -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
>>> -   struct tun_flow_entry *e;
>>> -   struct hlist_node *n;
>>> -
>>> -   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link)
>>> -   tun_flow_delete(tun, e);
>>> -   }
>>> -   spin_unlock_bh(&tun->lock);
>>> -}
>>> -
>>> -static void tun_flow_delete_by_queue(struct tun_struct *tun, u16 
>>> queue_index)
>>> -{
>>> -   int i;
>>> -
>>> -   spin_lock_bh(&tun->lock);
>>> -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
>>> -   struct tun_flow_entry *e;
>>> -   struct hlist_node *n;
>>> -
>>> -   hlist_for_each_entry_safe(e, n, &tun->flows[i], 

Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
Hi Tom,

On Wed, Dec 18, 2013 at 12:06 PM, Tom Herbert  wrote:
> On Mon, Dec 16, 2013 at 11:26 PM, Zhi Yong Wu  wrote:
>> From: Zhi Yong Wu 
>>
>> The flow cache is an extremely broken concept, and it usually brings up
>> growth issues and DoS attacks, so this patch is trying to remove it from
>> the tuntap driver, and insteadly use a simpler way for its flow control.
>>
> Yes , in it's current state it's broken. But maybe we can try to fix
> it instead of arbitrarily removing it. Please see my patches on
> plumbing RFS into tuntap which may start to make it useful.
Do you mean your patch [5/5] tun: Added support for RFS on tun flows?
Sorry, can you give more details?

>
> Tom
>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  drivers/net/tun.c |  208 
>> +++--
>>  1 files changed, 10 insertions(+), 198 deletions(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 7c8343a..7c27fdc 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -32,12 +32,15 @@
>>   *
>>   *  Daniel Podlejski 
>>   *Modifications for 2.3.99-pre5 kernel.
>> + *
>> + *  Zhi Yong Wu 
>> + *Remove the flow cache.
>>   */
>>
>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>>  #define DRV_NAME   "tun"
>> -#define DRV_VERSION"1.6"
>> +#define DRV_VERSION"1.7"
>>  #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
>>  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
>>
>> @@ -146,18 +149,6 @@ struct tun_file {
>> struct tun_struct *detached;
>>  };
>>
>> -struct tun_flow_entry {
>> -   struct hlist_node hash_link;
>> -   struct rcu_head rcu;
>> -   struct tun_struct *tun;
>> -
>> -   u32 rxhash;
>> -   int queue_index;
>> -   unsigned long updated;
>> -};
>> -
>> -#define TUN_NUM_FLOW_ENTRIES 1024
>> -
>>  /* Since the socket were moved to tun_file, to preserve the behavior of 
>> persist
>>   * device, socket filter, sndbuf and vnet header size were restore when the
>>   * file were attached to a persist device.
>> @@ -184,163 +175,11 @@ struct tun_struct {
>> int debug;
>>  #endif
>> spinlock_t lock;
>> -   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
>> -   struct timer_list flow_gc_timer;
>> -   unsigned long ageing_time;
>> unsigned int numdisabled;
>> struct list_head disabled;
>> void *security;
>> -   u32 flow_count;
>>  };
>>
>> -static inline u32 tun_hashfn(u32 rxhash)
>> -{
>> -   return rxhash & 0x3ff;
>> -}
>> -
>> -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 
>> rxhash)
>> -{
>> -   struct tun_flow_entry *e;
>> -
>> -   hlist_for_each_entry_rcu(e, head, hash_link) {
>> -   if (e->rxhash == rxhash)
>> -   return e;
>> -   }
>> -   return NULL;
>> -}
>> -
>> -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
>> - struct hlist_head *head,
>> - u32 rxhash, u16 queue_index)
>> -{
>> -   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);
>> -
>> -   if (e) {
>> -   tun_debug(KERN_INFO, tun, "create flow: hash %u index %u\n",
>> - rxhash, queue_index);
>> -   e->updated = jiffies;
>> -   e->rxhash = rxhash;
>> -   e->queue_index = queue_index;
>> -   e->tun = tun;
>> -   hlist_add_head_rcu(&e->hash_link, head);
>> -   ++tun->flow_count;
>> -   }
>> -   return e;
>> -}
>> -
>> -static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry 
>> *e)
>> -{
>> -   tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n",
>> - e->rxhash, e->queue_index);
>> -   hlist_del_rcu(&e->hash_link);
>> -   kfree_rcu(e, rcu);
>> -   --tun->flow_count;
>> -}
>> -
>> -static void tun_flow_flush(struct tun_struct *tun)
>> -{
>> -   int i;
>> -
>> -   spin_lock_bh(&tun->lock);
>> -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
>> -   struct tun_flow_entry *e;
>> -   struct hlist_node *n;
>> -
>> -   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link)
>> -   tun_flow_delete(tun, e);
>> -   }
>> -   spin_unlock_bh(&tun->lock);
>> -}
>> -
>> -static void tun_flow_delete_by_queue(struct tun_struct *tun, u16 
>> queue_index)
>> -{
>> -   int i;
>> -
>> -   spin_lock_bh(&tun->lock);
>> -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
>> -   struct tun_flow_entry *e;
>> -   struct hlist_node *n;
>> -
>> -   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link) {
>> -   if (e->queue_index == queue_index)
>> -   tun_flow_delete(tun, e);
>> -   }
>> -   }
>> -   spin_unlock_bh(&tun->lock);
>> -}
>> -
>> -static void 

Re: [Query] Ticks happen in pair for NO_HZ_FULL cores ?

2013-12-17 Thread Viresh Kumar
On 17 December 2013 22:05, Kevin Hilman  wrote:
> For future reference, for generating email friendly trace output for
> discussion like this, you can use something like:
>
>trace-cmd report --cpu=1 trace.dat

Okay..

>> And after that the next event comes after 5 Seconds.
>>
>> And so I was talking for the Event 41.
>
> That first event (Event 41) is an interrupt, and comes from the
> scheduler tick.  The tick is happening because the writeback workqueue
> just ran and we're not in NO_HZ mode.

This is what I was trying to ask. Why can't we enter NO_HZ_FULL mode
as soon as the writeback workqueue has run? That way we could go into NO_HZ
mode earlier.

> However, as soon as that IRQ (and resulting softirqs) are finished, we
> enter NO_HZ mode again.  But as you mention, it only lasts for ~5 sec
> when the timer fires again.  Once again, it fires because of the
> writeback workqueue, and soon thereafter it switches back to NO_HZ mode
> again.

That's fine. It wasn't part of my query :) But yes, your trick
would be useful for my use case :)


Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-17 Thread Chanho Min

> I did test it on x86 with a USB stick and on ARM with eMMC on my Nexus 4.
> In my experiments, I couldn't see as much gain as you did on either system,
> and it even regressed in the bs=32k test, maybe due to workqueue
> allocation/scheduling of work per I/O.
> Is your test rather special, or what am I missing?
Can you share your test results on ARM with eMMC?

> Before that, I'd like to know the fundamental reason why your implementation
> of asynchronous read improves things. At first glance, I thought it was caused
> by readahead from the MM layer, but when I read the code, I found I was wrong.
> MM's readahead logic works based on the PageReadahead marker, but squashfs
> invalidates it with grab_cache_page_nowait, so it wouldn't work as we expected.
>
> Another possibility is block I/O merging in the block layer by plugging logic,
> which is what I tried a few months ago, although the implementation was really
> bad. But it wouldn't work with your patch because do_generic_file_read
> will unplug the block layer by lock_page without merging enough I/O.
>
> So, what do you think is the real cause of the improvement in your experiment?
> Then I could investigate why I can't get a benefit.
Currently, squashfs adds requests to the block device queue synchronously and
waits for completion. mmc takes these requests one by one and pushes them to the
host driver, but this leaves mmc idle frequently. This patch allows block requests
to be added asynchronously without waiting for completion, so mmcqd can fetch many
requests from the block layer at a time. As a result, mmcqd stays busy and uses
more of the mmc bandwidth.
For testing, I added two counters to mmc_queue_thread as below and ran the same
dd transfer.

static int mmc_queue_thread(void *d)
{
..
	do {
		if (req || mq->mqrq_prev->req) {
			fetch++;	/* iteration with a request to handle */
		} else {
			idle++;		/* iteration with nothing to do */
		}
	} while (1);
..
}

without patch:
 fetch: 920, idle: 460

with patch:
 fetch: 918, idle: 40
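
To put the counters in perspective, the share of loop iterations that found
nothing to do can be computed from them. A minimal sketch, where idle_percent()
is a hypothetical helper and not part of the patch:

```c
#include <assert.h>

/*
 * Percentage of counted loop iterations that were idle.
 * "fetch" and "idle" are the two counters from the instrumented loop;
 * idle_percent() itself is a hypothetical helper for illustration.
 */
static double idle_percent(unsigned long fetch, unsigned long idle)
{
	unsigned long total = fetch + idle;

	if (total == 0)
		return 0.0;	/* nothing counted, nothing to report */

	return 100.0 * (double)idle / (double)total;
}
```

With the numbers above, this gives about 33% idle iterations without the patch
and about 4% with it.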

Thanks
Chanho.



Re: [Regression] sched: division by zero in find_busiest_group()

2013-12-17 Thread Hedi Berriche
On Mon, Dec 09, 2013 at 18:10 Hedi Berriche wrote:
| Folks,
| 
| The following panic occurs *early* at boot time on high *enough* CPU count
| machines:
| 
| divide error:  [#1] SMP 
| Modules linked in:
| CPU: 22 PID: 1146 Comm: kworker/22:0 Not tainted 3.13.0-rc2-00122-gdea4f48 #8
| Hardware name: Intel Corp. Stoutland Platform, BIOS 2.20 UEFI2.10 PI1.0 X64 
2013-09-20
| task: 8827d49f31c0 ti: 8827d4a18000 task.ti: 8827d4a18000
| RIP: 0010:[]  [] 
find_busiest_group+0x26b/0x890
| RSP: :8827d4a19b68  EFLAGS: 00010006
| RAX: 7fff RBX: 8000 RCX: 0200
| RDX:  RSI: 8000 RDI: 0020
| RBP: 8827d4a19cc0 R08:  R09: 
| R10:  R11:  R12: 
| R13: 8827d4a19d28 R14: 8827d4a19b98 R15: 
| FS:  () GS:8827dfd8() knlGS:
| CS:  0010 DS:  ES:  CR0: 8005003b
| CR2: 00b8 CR3: 018da000 CR4: 07e0
| Stack:
| 8827d4b35800  00014600 00014600
|  8827d4b35818  
|   8000 
| Call Trace:
| [] load_balance+0x166/0x7f0
| [] idle_balance+0x10e/0x1b0
| [] __schedule+0x723/0x780
| [] schedule+0x29/0x70
| [] worker_thread+0x1c9/0x400
| [] ? rescuer_thread+0x3e0/0x3e0
| [] kthread+0xd2/0xf0
| [] ? kthread_create_on_node+0x180/0x180
| [] ret_from_fork+0x7c/0xb0
| [] ? kthread_create_on_node+0x180/0x180

Hmm... I had time to dig into this a bit deeper, looking at
build_overlap_sched_groups(), specifically this bit of code:

kernel/sched/core.c:

static int
build_overlap_sched_groups(struct sched_domain *sd, int cpu)
{
...
		/*
		 * Initialize sgp->power such that even if we mess up the
		 * domains and no possible iteration will get us here, we won't
		 * die on a /0 trap.
		 */
		sg->sgp->power = SCHED_POWER_SCALE * cpumask_weight(sg_span);

I'm wondering whether the same precaution should be used when it comes to 
sg->sgp->power_orig.
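
To make the question concrete, here is a minimal user-space sketch of the idea.
It is not the kernel code: struct group_power_sketch, init_group_power() and
cpus_per_power_unit() are hypothetical names, and the value used for
SCHED_POWER_SCALE is an assumption made for illustration.

```c
#include <assert.h>

#define SCHED_POWER_SCALE 1024UL	/* assumed value, for illustration only */

/* Hypothetical stand-in for the two fields discussed above. */
struct group_power_sketch {
	unsigned long power;
	unsigned long power_orig;
};

/* Give both fields a non-zero default derived from the span weight. */
static void init_group_power(struct group_power_sketch *gp,
			     unsigned int span_weight)
{
	gp->power = SCHED_POWER_SCALE * span_weight;
	gp->power_orig = gp->power;	/* the extra precaution being asked about */
}

/*
 * Hypothetical consumer that divides by power_orig (rounding up).
 * Without the default above, a zero power_orig would be a divide error here.
 */
static unsigned long cpus_per_power_unit(const struct group_power_sketch *gp,
					 unsigned int cpus)
{
	assert(gp->power_orig != 0);
	return (SCHED_POWER_SCALE * cpus + gp->power_orig - 1) / gp->power_orig;
}
```

Whether build_overlap_sched_groups() should do this is exactly the open
question; the sketch only shows why a zero power_orig is dangerous for any code
that divides by it.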

Cheers,
Hedi.


[PATCH v12] PPC: POWERNV: move iommu_add_device earlier

2013-12-17 Thread Alexey Kardashevskiy
The current implementation of IOMMU on sPAPR does not use iommu_ops
and therefore does not call IOMMU API's bus_set_iommu() which
1) sets iommu_ops for a bus
2) registers a bus notifier
Instead, PCI devices are added to IOMMU groups from
subsys_initcall_sync(tce_iommu_init) which does basically the same
thing without using iommu_ops callbacks.

However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
implements iommu_ops and when tce_iommu_init is called, every PCI device
is already added to some group so there is a conflict.

This patch does 2 things:
1. removes the loop in which PCI devices were added to groups and
adds explicit iommu_add_device() calls to add devices as soon as they get
the iommu_table pointer assigned to them.
2. moves a bus notifier to powernv code in order to avoid conflict with
the notifier from Freescale driver.

iommu_add_device() and iommu_del_device() are public now.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v12:
* removed redundant bus notifier from common POWERPC code

v11:
* rebased on upstream

v10:
* fixed linker error when IOMMU_API is not enabled

v9:
* removed "KVM" from the subject as it is not really a KVM patch so
PPC maintainer (hi Ben!) can review/include it into his tree

v8:
* added the check for iommu_group!=NULL before removing device from a group
as suggested by Wei Yang 

v2:
* added a helper - set_iommu_table_base_and_group - which does
set_iommu_table_base() and iommu_add_device()
---
 arch/powerpc/include/asm/iommu.h| 26 ++
 arch/powerpc/kernel/iommu.c | 41 +++--
 arch/powerpc/platforms/powernv/pci-ioda.c   |  8 +++---
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |  2 +-
 arch/powerpc/platforms/powernv/pci.c| 31 +-
 arch/powerpc/platforms/pseries/iommu.c  |  8 +++---
 6 files changed, 70 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index c34656a..774fa27 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -101,8 +101,34 @@ extern void iommu_free_table(struct iommu_table *tbl, 
const char *node_name);
  */
 extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
int nid);
+#ifdef CONFIG_IOMMU_API
 extern void iommu_register_group(struct iommu_table *tbl,
 int pci_domain_number, unsigned long pe_num);
+extern int iommu_add_device(struct device *dev);
+extern void iommu_del_device(struct device *dev);
+#else
+static inline void iommu_register_group(struct iommu_table *tbl,
+   int pci_domain_number,
+   unsigned long pe_num)
+{
+}
+
+static inline int iommu_add_device(struct device *dev)
+{
+   return 0;
+}
+
+static inline void iommu_del_device(struct device *dev)
+{
+}
+#endif /* !CONFIG_IOMMU_API */
+
+static inline void set_iommu_table_base_and_group(struct device *dev,
+ void *base)
+{
+   set_iommu_table_base(dev, base);
+   iommu_add_device(dev);
+}
 
 extern int iommu_map_sg(struct device *dev, struct iommu_table *tbl,
struct scatterlist *sglist, int nelems,
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 572bb5b..ecbf468 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1105,7 +1105,7 @@ void iommu_release_ownership(struct iommu_table *tbl)
 }
 EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
-static int iommu_add_device(struct device *dev)
+int iommu_add_device(struct device *dev)
 {
struct iommu_table *tbl;
int ret = 0;
@@ -1134,46 +1134,13 @@ static int iommu_add_device(struct device *dev)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(iommu_add_device);
 
-static void iommu_del_device(struct device *dev)
+void iommu_del_device(struct device *dev)
 {
iommu_group_remove_device(dev);
 }
-
-static int iommu_bus_notifier(struct notifier_block *nb,
- unsigned long action, void *data)
-{
-   struct device *dev = data;
-
-   switch (action) {
-   case BUS_NOTIFY_ADD_DEVICE:
-   return iommu_add_device(dev);
-   case BUS_NOTIFY_DEL_DEVICE:
-   iommu_del_device(dev);
-   return 0;
-   default:
-   return 0;
-   }
-}
-
-static struct notifier_block tce_iommu_bus_nb = {
-   .notifier_call = iommu_bus_notifier,
-};
-
-static int __init tce_iommu_init(void)
-{
-   struct pci_dev *pdev = NULL;
-
-   BUILD_BUG_ON(PAGE_SIZE < IOMMU_PAGE_SIZE);
-
-   for_each_pci_dev(pdev)
-   iommu_add_device(&pdev->dev);
-
-   bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
-   return 0;
-}
-
-subsys_initcall_sync(tce_iommu_init);
+EXPORT_SYMBOL_GPL(iommu_del_device);
 
 #else
 
diff --git 

Re: [PATCH 13/14] tools lib traceevent: Get rid of die() in some string conversion functions

2013-12-17 Thread Namhyung Kim
Hi Arnaldo,

On Tue, 17 Dec 2013 17:02:39 -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Dec 17, 2013 at 09:02:36AM +0900, Namhyung Kim escreveu:
>> On Mon, 16 Dec 2013 09:40:51 -0300, Arnaldo Carvalho de Melo wrote:
>> > Em Mon, Dec 16, 2013 at 01:49:11PM +0900, Namhyung Kim escreveu:
>> >> On Fri, 13 Dec 2013 11:52:04 -0300, Arnaldo Carvalho de Melo wrote:
>> >> > All the rest is ok, so its just the malloc + strcpy that remains to be
>> >> > converted, do you want me to do it?
>
>> >> Hmm.. did you mean like this?
>
>> >>   str = NULL;
>> >>   if (val)
>> >>           asprintf(&str, "TRUE");
>> >>   else
>> >>           asprintf(&str, "FALSE");
>> >>   return str;
>
>> > More compact:
>
>> >   if (asprintf(&str, "%s", val ? "TRUE" : "FALSE") < 0)
>> >           // error handling path
>
>> > At that point str already is set to NULL.
>
>> Okay, this is a new one:
>
> Thanks, it all seems OK now, but just prior to applying this I noticed:
>
>> Those functions stringify filter arguments.  As the callers of
>> those functions handle a NULL string properly, it seems that it's
>> enough to return NULL rather than calling die().
>
> It handles NULL in what way? This comment:
>
>> @@ -2369,7 +2340,7 @@ static char *arg_to_str(struct event_filter *filter, 
>> struct filter_arg *arg)
>>   * Returns a string that displays the filter contents.
>>   *  This string must be freed with free(str).
>> - *  NULL is returned if no filter is found.
>> + *  NULL is returned if no filter is found or allocation failed.
>>   */
>>  char *
>>  pevent_filter_make_string(struct event_filter *filter, int event_id)
>
> Made me a bit uncomfortable, so if it handles NULL as a filter not
> found, how will it figure out what happened?
>
> /me looks at the callers...
>
> From just a quick look I couldn't see cases where NULL could cause
> segfaults, but saw some cases where allocation errors would not be
> notified in any way to the user :-\

Right.  I just wanted to keep the existing interface as long as possible.
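
A minimal sketch of the asprintf() pattern discussed above, assuming glibc
(hence _GNU_SOURCE). bool_to_str() is a hypothetical name used only for
illustration, not the function touched by the patch. It returns NULL on
allocation failure, so a caller that only checks for NULL cannot tell that case
apart from any other reason for a NULL result:

```c
#define _GNU_SOURCE		/* asprintf() is a GNU/BSD extension */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc'ed "TRUE" or "FALSE" string,
 * or NULL if the allocation failed. The caller must free() the result.
 */
static char *bool_to_str(int val)
{
	char *str = NULL;

	/*
	 * asprintf() returns a negative value on failure. The pointer is
	 * reset by hand here rather than trusting its contents after a
	 * failed call.
	 */
	if (asprintf(&str, "%s", val ? "TRUE" : "FALSE") < 0)
		str = NULL;

	return str;
}
```

The asprintf(3) man page describes the pointer contents as undefined after a
failed call, which is why the sketch resets str explicitly instead of relying
on the earlier NULL initialisation.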

>
> Anyway, applying this patch, those are other kinds of problems, i.e. further
> fallout from converting from the previous panic()-at-alloc-failure approach.

Thanks!  But there's one more patch (14/14) left from the series.
Please also consider merging it too. :)

Thanks,
Namhyung


Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Tom Herbert
On Mon, Dec 16, 2013 at 11:26 PM, Zhi Yong Wu  wrote:
> From: Zhi Yong Wu 
>
> The flow cache is an extremely broken concept, and it usually brings up
> growth issues and DoS attacks, so this patch is trying to remove it from
> the tuntap driver, and instead use a simpler way for its flow control.
>
Yes, in its current state it's broken. But maybe we can try to fix
it instead of arbitrarily removing it. Please see my patches on
plumbing RFS into tuntap which may start to make it useful.

Tom

> Signed-off-by: Zhi Yong Wu 
> ---
>  drivers/net/tun.c |  208 +++--
>  1 files changed, 10 insertions(+), 198 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 7c8343a..7c27fdc 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -32,12 +32,15 @@
>   *
>   *  Daniel Podlejski 
>   *Modifications for 2.3.99-pre5 kernel.
> + *
> + *  Zhi Yong Wu 
> + *Remove the flow cache.
>   */
>
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
>  #define DRV_NAME   "tun"
> -#define DRV_VERSION"1.6"
> +#define DRV_VERSION"1.7"
>  #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
>  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
>
> @@ -146,18 +149,6 @@ struct tun_file {
> struct tun_struct *detached;
>  };
>
> -struct tun_flow_entry {
> -   struct hlist_node hash_link;
> -   struct rcu_head rcu;
> -   struct tun_struct *tun;
> -
> -   u32 rxhash;
> -   int queue_index;
> -   unsigned long updated;
> -};
> -
> -#define TUN_NUM_FLOW_ENTRIES 1024
> -
>  /* Since the socket were moved to tun_file, to preserve the behavior of persist
>   * device, socket filter, sndbuf and vnet header size were restore when the
>   * file were attached to a persist device.
> @@ -184,163 +175,11 @@ struct tun_struct {
> int debug;
>  #endif
> spinlock_t lock;
> -   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
> -   struct timer_list flow_gc_timer;
> -   unsigned long ageing_time;
> unsigned int numdisabled;
> struct list_head disabled;
> void *security;
> -   u32 flow_count;
>  };
>
> -static inline u32 tun_hashfn(u32 rxhash)
> -{
> -   return rxhash & 0x3ff;
> -}
> -
> -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 rxhash)
> -{
> -   struct tun_flow_entry *e;
> -
> -   hlist_for_each_entry_rcu(e, head, hash_link) {
> -   if (e->rxhash == rxhash)
> -   return e;
> -   }
> -   return NULL;
> -}
> -
> -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
> - struct hlist_head *head,
> - u32 rxhash, u16 queue_index)
> -{
> -   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);
> -
> -   if (e) {
> -   tun_debug(KERN_INFO, tun, "create flow: hash %u index %u\n",
> - rxhash, queue_index);
> -   e->updated = jiffies;
> -   e->rxhash = rxhash;
> -   e->queue_index = queue_index;
> -   e->tun = tun;
> -   hlist_add_head_rcu(&e->hash_link, head);
> -   ++tun->flow_count;
> -   }
> -   return e;
> -}
> -
> -static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e)
> -{
> -   tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n",
> - e->rxhash, e->queue_index);
> -   hlist_del_rcu(&e->hash_link);
> -   kfree_rcu(e, rcu);
> -   --tun->flow_count;
> -}
> -
> -static void tun_flow_flush(struct tun_struct *tun)
> -{
> -   int i;
> -
> -   spin_lock_bh(&tun->lock);
> -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
> -   struct tun_flow_entry *e;
> -   struct hlist_node *n;
> -
> -   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link)
> -   tun_flow_delete(tun, e);
> -   }
> -   spin_unlock_bh(&tun->lock);
> -}
> -
> -static void tun_flow_delete_by_queue(struct tun_struct *tun, u16 queue_index)
> -{
> -   int i;
> -
> -   spin_lock_bh(&tun->lock);
> -   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
> -   struct tun_flow_entry *e;
> -   struct hlist_node *n;
> -
> -   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link) {
> -   if (e->queue_index == queue_index)
> -   tun_flow_delete(tun, e);
> -   }
> -   }
> -   spin_unlock_bh(&tun->lock);
> -}
> -
> -static void tun_flow_cleanup(unsigned long data)
> -{
> -   struct tun_struct *tun = (struct tun_struct *)data;
> -   unsigned long delay = tun->ageing_time;
> -   unsigned long next_timer = jiffies + delay;
> -   unsigned long count = 0;
> -   int i;
> -
> -   tun_debug(KERN_INFO, tun, "tun_flow_cleanup\n");
> -
> -   
