date:20130818

Re: [PATCH] dmaengine: make dmatest less noisy

2013-08-18 Thread Vinod Koul

On Sat, Aug 17, 2013 at 12:42:40PM +0200, Linus Walleij wrote:
> Commit 95019c8c5 "dmatest: gather test results in the linked list"
> started to warn whenever we add results to a test thread.
> A warning for something completely normal? This is just cluttering
> my terminal. Get rid of this.
> 
> Cc: Andy Shevchenko 
> Signed-off-by: Linus Walleij 
> ---
>  drivers/dma/dmatest.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
> index e88ded2..6bb51e2 100644
> --- a/drivers/dma/dmatest.c
> +++ b/drivers/dma/dmatest.c
> @@ -406,7 +406,6 @@ static int thread_result_add(struct dmatest_info *info,
>   list_add_tail(&tr->node, &r->results);
>   mutex_unlock(&info->results_lock);
>  
> - pr_warn("%s\n", thread_result_get(r->name, tr));
perhaps move to debug?

>   return 0;
>  }
>  
> -- 
> 1.8.1.4
> 
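
The "move to debug" suggestion can be sketched in userspace as follows. This is a minimal model, assuming the usual kernel convention that pr_debug() produces no output unless DEBUG (or dynamic debug) is enabled; log_line(), report_result() and emitted_count() are hypothetical helpers, not dmatest code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel macros; not the kernel implementation. */
static int emitted; /* counts messages that actually reach the log */

static void log_line(const char *level, const char *fmt, ...)
{
	va_list ap;

	emitted++;
	fprintf(stderr, "%s: ", level);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

#define pr_warn(...) log_line("warn", __VA_ARGS__)

#ifdef DEBUG
#define pr_debug(...) log_line("debug", __VA_ARGS__)
#else
/* Compiled out by default, but the arguments are still type-checked. */
#define pr_debug(...) do { if (0) log_line("debug", __VA_ARGS__); } while (0)
#endif

/* Hypothetical helper mirroring the tail of thread_result_add(). */
static int report_result(const char *summary)
{
	pr_debug("%s\n", summary); /* was pr_warn() in the code under discussion */
	return 0;
}

static int emitted_count(void)
{
	return emitted;
}
```

Built without DEBUG, report_result() stays silent; building with -DDEBUG brings the message back for whoever needs it.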

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [BUG REPORT] ZSWAP: theoretical race condition issues

2013-08-18 Thread Minchan Kim

On Mon, Aug 19, 2013 at 10:17:38AM +0800, Bob Liu wrote:
> Hi Weijie,
> 
> On 08/19/2013 12:14 AM, Weijie Yang wrote:
> > I found a few bugs in zswap when I review Linux-3.11-rc5, and I have
> > also some questions about it, described as following:
> > 
> > BUG:
> > 1. A race condition when reclaim a page
> > when a handle alloced from zbud, zbud considers this handle is used
> > validly by upper(zswap) and can be a candidate for reclaim.
> > But zswap has to initialize it such as setting swapentry and adding
> > it to rbtree. so there is a race condition, such as:
> > thread 0: obtain handle x from zbud_alloc
> > thread 1: zbud_reclaim_page is called
> > thread 1: callback zswap_writeback_entry to reclaim handle x
> > thread 1: get swpentry from handle x (it is random value now)
> > thread 1: bad thing may happen
> > thread 0: initialize handle x with swapentry

Nice catch!
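
The interleaving in the report can be modelled in a few lines. This is only a single-threaded sketch, assuming a hypothetical per-entry "ready" flag; none of the names below are zswap or zbud API, and a real fix would also need proper locking or memory ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names are zswap/zbud API. */
struct entry {
	unsigned long swpentry; /* meaningful only once ready is true */
	bool ready;             /* set by the owner after initialization */
};

enum reclaim_status { RECLAIM_DONE, RECLAIM_SKIPPED };

/* "thread 0: obtain handle x": the entry exists but is not initialized. */
static void model_alloc(struct entry *e)
{
	e->swpentry = 0xdeadbeefUL; /* stands in for the "random value" */
	e->ready = false;
}

/* "thread 0: initialize handle x with swapentry". */
static void model_init(struct entry *e, unsigned long swpentry)
{
	e->swpentry = swpentry;
	e->ready = true;
}

/* "thread 1: callback ... to reclaim handle x". */
static enum reclaim_status model_reclaim(const struct entry *e,
					 unsigned long *out)
{
	if (!e->ready)
		return RECLAIM_SKIPPED; /* refuse to read a half-built entry */
	*out = e->swpentry;
	return RECLAIM_DONE;
}
```

In this model a reclaim attempt that lands between model_alloc() and model_init() is skipped instead of reading garbage.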

> 
> Yes, this may happen potentially but in rare case.
> Because we have a LRU list for page frames, after Thread 0 called
> zbud_alloc the corresponding page will be added to the head of LRU
> list, while zbud_reclaim_page (called by Thread 1) starts from the tail
> of LRU list.
> 
> > Of course, this situation almost never happen, it is a "theoretical
> > race condition" issue.

But it can happen, and we should prevent it even if you feel it's rare,
because the system could hang. Looking at the code, why does zbud have
the LRU logic instead of zswap? If I missed some history, sorry about that.
But at least to me, zbud is just an allocator, so its role should be to
handle alloc/free of objects; how a client of the allocator uses those
objects is up to the upper layer, so zswap should handle the LRU. If it
did, we wouldn't encounter this problem either.

> > 
> > 2. Pollute swapcache data by add a pre-invalided swap page
> > when a swap_entry is invalidated, it will be reused by other anon
> > page. At the same time, zswap is reclaiming old page, pollute
> > swapcache of new page as a result, because old page and new page use
> > the same swap_entry, such as:
> > thread 1: zswap reclaim entry x
> > thread 0: zswap_frontswap_invalidate_page entry x
> > thread 0: entry x reused by other anon page
> > thread 1: add old data to swapcache of entry x
> 
> I didn't get your idea here, why thread1 will add old data to entry x?
> 
> > thread 0: swapcache of entry x is polluted
> > Of course, this situation almost never happen, it is another
> > "theoretical race condition" issue.

Doesn't swapcache_prepare() close the race?

> > 
> > 3. Frontswap uses frontswap_map bitmap to track page in "backend"
> > implementation, when zswap reclaim a
> > page, the corresponding bitmap record is not cleared.
> >
> 
> That's true, but I don't think it's a big problem.
> Only waste little time to search rbtree during zswap_frontswap_load().
> 
> > 4. zswap_tree is not freed when swapoff, and it got re-kzalloc in
> > swapon, memory leak occurs.
> 
> Nice catch! I think it should be freed in zswap_frontswap_invalidate_area().
> 
> > 
> > questions:
> > 1. How about SetPageReclaim before __swap_writepage, so that move it to
> > the tail of the inactive list?

It's a good idea to avoid unnecessary page scanning.

> 
> It will be added to inactive now.
> 
> > 2. zswap uses GFP_KERNEL flag to alloc things in store and reclaim
> > function, does this lead to these function called recursively?
> 
> Yes, that's a potential problem.

It should use GFP_NOIO.
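
Why the flag matters can be shown with a toy model. The bit values below are illustrative only (the real definitions live in <linux/gfp.h>), and the composition of the two masks is an assumption about kernels of this era: GFP_KERNEL allows sleeping, I/O and filesystem calls, while GFP_NOIO allows sleeping only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; not the kernel's definitions. */
#define MODEL_GFP_WAIT 0x1u /* caller may sleep and trigger reclaim */
#define MODEL_GFP_IO   0x2u /* reclaim may start I/O, e.g. swap-out */
#define MODEL_GFP_FS   0x4u /* reclaim may call into filesystem code */

#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOIO   (MODEL_GFP_WAIT)

/* True if an allocation with these flags could recurse into swap-out. */
static bool may_reenter_swapout(unsigned int gfp)
{
	return (gfp & MODEL_GFP_WAIT) && (gfp & MODEL_GFP_IO);
}
```

Under this model an allocation made from the store or writeback path with the NOIO mask cannot loop back into the same path.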

> 
> > 3. for reclaiming one zbud page which contains two buddies, zswap
> > needs to alloc two pages. Does this reclaim cost-efficient?

It would be better to evict the zpage, i.e. the compressed form of a
PAGE_SIZE page, rather than a decompressed PAGE_SIZE page when we are
about to reclaim, but that is hard to do with the frontswap API.

> > 
> 
> Yes, that's a problem too. And that's why we use zbud as the default
> allocator instead of zsmalloc.
> I think improving the write back path of zswap is the next important
> step for zswap.
> 
> -- 
> Regards,
> -Bob
> 

-- 
Kind regards,
Minchan Kim

Re: [PATCH v2 2/4] dmaengine: add driver for Samsung s3c24xx SoCs

2013-08-18 Thread Vinod Koul

On Wed, Aug 14, 2013 at 02:00:25PM +0200, Heiko Stübner wrote:
> This adds a new driver to support the s3c24xx dma using the dmaengine
> and makes the old one in mach-s3c24xx obsolete in the long run.
> 
> Conceptually the s3c24xx-dma feels like a distant relative of the pl08x
> with numerous virtual channels being mapped to a lot less physical ones.
> The driver therefore borrows a lot from the amba-pl08x driver in this
> regard. Functionality-wise the driver gains a memcpy ability in addition
> to the slave_sg one.
If that is the case, why can't we have this driver supported from the pl08x
driver? If the delta is only the mapping, can that be separated, or both
mappings handled? Maybe you and Linus have already thought about that?

> The driver supports both the method for requesting the peripheral used
> by SoCs before the S3C2443 and the different method for S3C2443 and later.
> 
> On earlier SoCs the hardware channels usable for specific peripherals is
> constrainted while on later SoCs all channels can be used for any peripheral.
> 
> Tested on a s3c2416-based board, memcpy using the dmatest module and
> slave_sg partially using the spi-s3c64xx driver.
> 
> Signed-off-by: Heiko Stuebner 

> +#define DISRC                 (0x00)
> +#define DISRCC                (0x04)
> +#define DISRCC_INC_INCREMENT  (0 << 0)
> +#define DISRCC_INC_FIXED      (1 << 0)
> +#define DISRCC_LOC_AHB        (0 << 1)
> +#define DISRCC_LOC_APB        (1 << 1)
> +
> +#define DIDST                 (0x08)
> +#define DIDSTC                (0x0C)
> +#define DIDSTC_INC_INCREMENT  (0 << 0)
> +#define DIDSTC_INC_FIXED      (1 << 0)
> +#define DIDSTC_LOC_AHB        (0 << 1)
> +#define DIDSTC_LOC_APB        (1 << 1)
> +#define DIDSTC_INT_TC0        (0 << 2)
> +#define DIDSTC_INT_RELOAD     (1 << 2)
> +
> +#define DCON                  (0x10)
> +
> +#define DCON_TC_MASK          0xf
> +#define DCON_DSZ_BYTE         (0 << 20)
> +#define DCON_DSZ_HALFWORD     (1 << 20)
> +#define DCON_DSZ_WORD         (2 << 20)
> +#define DCON_DSZ_MASK         (3 << 20)
> +#define DCON_DSZ_SHIFT        20
> +#define DCON_AUTORELOAD       (0 << 22)
> +#define DCON_NORELOAD         (1 << 22)
> +#define DCON_HWTRIG           (1 << 23)
> +#define DCON_HWSRC_SHIFT      24
> +#define DCON_SERV_SINGLE      (0 << 27)
> +#define DCON_SERV_WHOLE       (1 << 27)
> +#define DCON_TSZ_UNIT         (0 << 28)
> +#define DCON_TSZ_BURST4       (1 << 28)
> +#define DCON_INT              (1 << 29)
> +#define DCON_SYNC_PCLK        (0 << 30)
> +#define DCON_SYNC_HCLK        (1 << 30)
> +#define DCON_DEMAND           (0 << 31)
> +#define DCON_HANDSHAKE        (1 << 31)
> +
> +#define DSTAT                 (0x14)
> +#define DSTAT_STAT_BUSY       (1 << 20)
> +#define DSTAT_CURRTC_MASK     0xf
> +
> +#define DMASKTRIG             (0x20)
> +#define DMASKTRIG_STOP        (1 << 2)
> +#define DMASKTRIG_ON          (1 << 1)
> +#define DMASKTRIG_SWTRIG      (1 << 0)
> +
> +#define DMAREQSEL             (0x24)
> +#define DMAREQSEL_HW          (1 << 0)
This is proper namespacing...

> +static int s3c24xx_dma_set_runtime_config(struct s3c24xx_dma_chan *s3cchan,
> +   struct dma_slave_config *config)
> +{
> + if (!s3cchan->slave)
> + return -EINVAL;
> +
> + /* Reject definitely invalid configurations */
> + if (config->src_addr_width == DMA_SLAVE_BUSWIDTH_8_BYTES ||
> + config->dst_addr_width == DMA_SLAVE_BUSWIDTH_8_BYTES)
> + return -EINVAL;
> +
> + s3cchan->cfg = *config;
You are taking a ref to the client pointer without a clue about when it
would be freed. I don't think it's a good idea!
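
For readers weighing the lifetime concern, here is what a C struct assignment does compared with storing the pointer. This is a minimal sketch, assuming a configuration struct without pointer members; struct model_slave_config and struct model_chan are hypothetical stand-ins, not the dmaengine types.

```c
#include <assert.h>

/* Hypothetical stand-in for a slave config; fields are illustrative. */
struct model_slave_config {
	unsigned long src_addr;
	unsigned long dst_addr;
	unsigned int src_maxburst;
};

struct model_chan {
	struct model_slave_config cfg;        /* owned copy */
	const struct model_slave_config *ref; /* borrowed pointer */
};

static void model_set_config(struct model_chan *c,
			     const struct model_slave_config *config)
{
	c->cfg = *config; /* struct assignment copies every member by value */
	c->ref = config;  /* by contrast, this depends on the caller's lifetime */
}
```

After the call, later changes by the caller show up through ref but not in cfg. Whether a by-value copy is enough still depends on whether the struct carries pointers of its own.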

> +static irqreturn_t s3c24xx_dma_irq(int irq, void *data)
> +{
> + struct s3c24xx_dma_phy *phy = data;
> + struct s3c24xx_dma_chan *s3cchan = phy->serving;
> + struct s3c24xx_txd *txd;
> +
> + dev_dbg(&phy->host->pdev->dev, "interrupt on channel %d\n", phy->id);
> +
> + if (!s3cchan) {
> + dev_err(&phy->host->pdev->dev, "interrupt on unused channel %d\n",
> + phy->id);
> + return IRQ_NONE;
Hmmm, these channels belong to you. So if one of them is behaving badly,
then not handling the interrupt will make things worse...

~Vinod

Re: [PATCH v6 0/5] zram/zsmalloc promotion

2013-08-18 Thread Luigi Semenzato

On Sun, Aug 18, 2013 at 9:37 PM, Minchan Kim  wrote:
> Hello Bob,
>
> Sorry for the late response. I was on holiday.
>
> On Mon, Aug 19, 2013 at 11:57:41AM +0800, Bob Liu wrote:
>> Hi Minchan,
>>
>> On 08/19/2013 11:18 AM, Minchan Kim wrote:
>> > Hello Mel,
>> >
>> > On Fri, Aug 16, 2013 at 09:33:47AM +0100, Mel Gorman wrote:
>> >> On Fri, Aug 16, 2013 at 01:26:41PM +0900, Minchan Kim wrote:
>> >> 
>> >> If it's used for something like tmpfs then it becomes much worse. Normal
>> >> tmpfs without swap can lockup if tmpfs is allowed to fill memory. In a
>> >> sane configuration, lockups will be avoided and deleting a tmpfs file is
>> >> guaranteed to free memory. When zram is used to back tmpfs, there is no
>> >> guarantee that any memory is freed due to fragmentation of the compressed
>> >> pages. The only way to recover the memory may be to kill applications
>> >> holding tmpfs files open and then delete them which is fairly drastic
>> >> action in a normal server environment.
>> >
>> > Indeed.
>> > Actually, I had a plan to support zsmalloc compaction. The zsmalloc exposes
>> > handle instead of pure pointer so it could migrate some zpages to somewhere
>> > to pack in. Then, it could help above problem and OOM storm problem.
>> > Anyway, it's a totally new feature and requires many changes and experiments.
>> > Although we don't have such feature, zram is still good for many people.
>> >
>> 
>>  And if zsmalloc was pluggable for zswap then it would also benefit.
>> >>>
>> >>> But zswap isn't pseudo block device so it couldn't be used for block device.
>> >>
>> >> It would not be impossible to write one. Taking a quick look it might even
>> >> be doable by just providing a zbud_ops that does not have an evict handler
>> >> and make sure the errors are handled correctly. i.e. does the following
>> >> patch mean that zswap never writes back and instead just compresses pages
>> >> in memory?
>> >>
>> >> diff --git a/mm/zswap.c b/mm/zswap.c
>> >> index deda2b6..99e41c8 100644
>> >> --- a/mm/zswap.c
>> >> +++ b/mm/zswap.c
>> >> @@ -819,7 +819,6 @@ static void zswap_frontswap_invalidate_area(unsigned type)
>> >>  }
>> >>
>> >>  static struct zbud_ops zswap_zbud_ops = {
>> >> -  .evict = zswap_writeback_entry
>> >>  };
>> >>
>> >>  static void zswap_frontswap_init(unsigned type)
>> >>
>> >> If so, it should be doable to link that up in a sane way so it can be
>> >> configured at runtime.
>> >>
>> >> Did you ever even try something like this?
>> >
>> > Never. Because I didn't have such requirement for zram.
>> >
>> >>
>> >>> Let say one usecase for using zram-blk.
>> >>>
>> >>> 1) Many embedded system don't have swap so although tmpfs can support swapout
>> >>> it's pointless still so such systems should have sane configuration to limit
>> >>> memory space so it's not only zram problem.
>> >>>
>> >>
>> >> If zswap was backed by a pseudo device that failed all writes or an
>> >> ops with no evict handler then it would be functionally similar.
>> >>
>> >>> 2) Many embedded system don't have enough memory. Let's assume short-lived
>> >>> file growing up until half of system memory once in a while. We don't want
>> >>> to write it on flash by wear-leveling issue and very slowness so we want to use
>> >>> in-memory but if we uses tmpfs, it should evict half of working set to cover
>> >>> them when the size reach peak. zram would be better choice.
>> >>>
>> >>
>> >> Then back it by a pseudo device that fails all writes so it does not have
>> >> to write to disk.
>> >
>> > You mean "make pseudo block device and register make_request_fn
>> > and prevent writeback". Bah, yes, it's doable but what is it different with below?
>> >
>> > 1) move zbud into zram
>> > 2) implement frontswap API in zram
>> > 3) implement writeback in zram
>> >
>> > The zram has been for a long time in staging to be promoted and have been
>> > maintained/deployed. Of course, I have asked the promotion several times
>> > for above a year.
>> >
>> > Why can't zram include zswap functions if you really want to merge them?
>> > Is there any problem?
>>
>> I think merging zram into zswap or merging zswap into zram are the same
>> thing. It's no difference.
>
> True but i'd like to merge zswap code into zram.
> Because as you know, zram has already lots of users while zswap is almost
> new young so I'd like to keep backward compatibility for zram so moving zswap code
> into zram is more handy and could keep the git log as well.
>
>> Both way will result in a solution finally with zram block device,
>> frontswap API etc.
>
> Right but z* family people should discuss that zswap-writeback is really
> good solution for compressed swap. Firstly, I thought zswap is different with
> zram so there is no issue to promote zram so I

Re: [PATCH 01/15] drivers: phy: add generic PHY framework

2013-08-18 Thread Kishon Vijay Abraham I

Felipe,

ping..

On Wednesday 14 August 2013 08:35 PM, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On Wednesday 14 August 2013 04:34 AM, Tomasz Figa wrote:
>> On Wednesday 14 of August 2013 00:19:28 Sylwester Nawrocki wrote:
>>> W dniu 2013-08-13 14:05, Kishon Vijay Abraham I pisze:
 On Tuesday 13 August 2013 05:07 PM, Tomasz Figa wrote:
> On Tuesday 13 of August 2013 16:14:44 Kishon Vijay Abraham I wrote:
>> On Wednesday 31 July 2013 11:45 AM, Felipe Balbi wrote:
>>> On Wed, Jul 31, 2013 at 11:14:32AM +0530, Kishon Vijay Abraham I wrote:
> IMHO we need a lookup method for PHYs, just like for clocks,
> regulators, PWMs or even i2c busses because there are complex cases
> when passing just a name using platform data will not work. I would
> second what Stephen said [1] and define a structure doing things in a
> DT-like way.
>
> Example;
>
> [platform code]
>
> static const struct phy_lookup my_phy_lookup[] = {
>
>   PHY_LOOKUP("s3c-hsotg.0", "otg", "samsung-usbphy.1", "phy.2"),

 The only problem here is that if *PLATFORM_DEVID_AUTO* is used while
 creating the device, the ids in the device name would change and
 PHY_LOOKUP won't be useful.
>>>
>>> I don't think this is a problem. All the existing lookup methods already
>>> use ID to identify devices (see regulators, clkdev, PWMs, i2c, ...). You
>>> can simply add a requirement that the ID must be assigned manually,
>>> without using PLATFORM_DEVID_AUTO to use PHY lookup.
>>
>> And I'm saying that this idea, of using a specific name and id, is
>> fraught with fragility and will break in the future in various ways when
>> devices get added to systems, making these strings constantly have to be
>> kept up to date with different board configurations.
>>
>> People, NEVER, hardcode something like an id.  The fact that this
>> happens today with the clock code, doesn't make it right, it makes the
>> clock code wrong.  Others have already said that this is wrong there as
>> well, as systems change and dynamic ids get used more and more.
>>
>> Let's not repeat the same mistakes of the past just because we refuse to
>> learn from them...
>>
>> So again, the "find a phy by a string" functions should be removed, the
>> device id should be automatically created by the phy core just to make
>> things unique in sysfs, and no driver code should _ever_ be reliant on
>> the number that is being created, and the pointer to the phy structure
>> should be used everywhere instead.
>>
>> With those types of changes, I will consider merging this subsystem, but
>> without them, sorry, I will not.
>
> I'll agree with Greg here, the very fact that we see people trying to
> add a requirement of *NOT* using PLATFORM_DEVID_AUTO already points
> to a big problem in the framework.
>
> The fact is that if we don't allow PLATFORM_DEVID_AUTO we will end up
> adding similar infrastructure to the driver themselves to make sure we
> don't end up with duplicate names in sysfs in case we have multiple
> instances of the same IP in the SoC (or several of the same PCIe card).
> I really don't want to go back to that.

 If we are using PLATFORM_DEVID_AUTO, then I don't see any way we can
 give the correct binding information to the PHY framework. I think we
 can drop having this non-dt support in PHY framework? I see only one
 platform (OMAP3) going to be needing this non-dt support and we can
 use the USB PHY library for it.
>>>
>>> you shouldn't drop support for non-DT platform, in any case we lived
>>> without DT (and still do) for years. Gotta find a better way ;-)
>>
>> hmm..
>>
>> how about passing the device names of PHY in platform data of the
>> controller? It should be deterministic as the PHY framework assigns its
>> own id and we *don't* want to add any requirement that the ID must be
>> assigned manually without using PLATFORM_DEVID_AUTO. We can get rid of
>> *phy_init_data* in the v10 patch

[PATCH] rtc: rtc-nuc900: use NULL instead of 0

2013-08-18 Thread Jingoo Han

check_rtc_access_enable() returns pointer, thus NULL should be
used instead of 0 in order to fix the following sparse warning:

drivers/rtc/rtc-nuc900.c:102:16: warning: Using plain integer as NULL pointer

Signed-off-by: Jingoo Han 
---
 drivers/rtc/rtc-nuc900.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-nuc900.c b/drivers/rtc/rtc-nuc900.c
index 22861c5..248653c 100644
--- a/drivers/rtc/rtc-nuc900.c
+++ b/drivers/rtc/rtc-nuc900.c
@@ -99,7 +99,7 @@ static int *check_rtc_access_enable(struct nuc900_rtc *nuc900_rtc)
if (!timeout)
return ERR_PTR(-EPERM);
 
-   return 0;
+   return NULL;
 }
 
 static int nuc900_rtc_bcd2bin(unsigned int timereg,
-- 
1.7.10.4



Re: [PATCH v2] extcon: palmas: Modified the compatible type to ti,palmas-usb-vid

2013-08-18 Thread Kishon Vijay Abraham I

Hi,

On Saturday 17 August 2013 03:51 AM, Stephen Warren wrote:
> On 08/16/2013 04:20 AM, Kishon Vijay Abraham I wrote:
>> The Palmas device contains only a USB VID detector, so modified the
>> compatible type to *ti,palmas-usb-vid*.
> 
>> diff --git a/Documentation/devicetree/bindings/extcon/extcon-palmas.txt b/Documentation/devicetree/bindings/extcon/extcon-palmas.txt
> 
>>  PALMAS USB COMPARATOR
>>  Required Properties:
>> - - compatible : Should be "ti,palmas-usb" or "ti,twl6035-usb"
>> + - compatible : Should be "ti,palmas-usb-vid".
> 
> Has the old value been published in a release kernel? If so, it makes

No. This was merged only in 3.11-rc1. So I think we should take this version?
Chanwoo can you take this patch?

Thanks
Kishon

3.11-rc6 genetlink locking fix offends lockdep

2013-08-18 Thread Hugh Dickins

3.11-rc6's commit 58ad436fcf49 ("genetlink: fix family dump race")
gives me the lockdep trace below at startup.

I think it needs to be reverted until you can refine it.  And it has
already gone into today's stable review series, as 04/12 for 3.0.92,
26/34 for 3.4.59, 18/45 for 3.10.8: I raise an objection to those.

Hugh

[4.004286] e1000e :00:19.0: irq 43 for MSI/MSI-X
[4.105671] e1000e :00:19.0: irq 43 for MSI/MSI-X
[4.106123] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[4.110096] 
[4.110113] ==
[4.110146] [ INFO: possible circular locking dependency detected ]
[4.110180] 3.11.0-rc6 #1 Not tainted
[4.110201] ---
[4.110234] NetworkManager/358 is trying to acquire lock:
[4.110262]  (genl_mutex){+.+.+.}, at: [] genl_lock+0x12/0x14
[4.110315] 
[4.110315] but task is already holding lock:
[4.110346]  (nlk->cb_mutex){+.+.+.}, at: [] netlink_dump+0x1c/0x1d7
[4.110400] 
[4.110400] which lock already depends on the new lock.
[4.110400] 
[4.110442] 
[4.110442] the existing dependency chain (in reverse order) is:
[4.110482] 
[4.110482] -> #1 (nlk->cb_mutex){+.+.+.}:
[4.110517][] __lock_acquire+0x865/0x956
[4.110555][] lock_acquire+0x57/0x6d
[4.110589][] mutex_lock_nested+0x5e/0x345
[4.110627][] __netlink_dump_start+0xae/0x14e
[4.110665][] genl_rcv_msg+0xf4/0x252
[4.110699][] netlink_rcv_skb+0x3e/0x8c
[4.110734][] genl_rcv+0x24/0x34
[4.110766][] netlink_unicast+0xed/0x17a
[4.110801][] netlink_sendmsg+0x2fb/0x345
[4.110838][] sock_sendmsg+0x79/0x8e
[4.110871][] ___sys_sendmsg+0x231/0x2be
[4.110907][] __sys_sendmsg+0x3d/0x5e
[4.110942][] SyS_sendmsg+0xd/0x19
[4.110975][] system_call_fastpath+0x16/0x1b
[4.111012] 
[4.111012] -> #0 (genl_mutex){+.+.+.}:
[4.111047][] validate_chain.isra.21+0x836/0xe8e
[4.111086][] __lock_acquire+0x865/0x956
[4.22][] lock_acquire+0x57/0x6d
[4.57][] mutex_lock_nested+0x5e/0x345
[4.93][] genl_lock+0x12/0x14
[4.111226][] ctrl_dumpfamily+0x31/0xfa
[4.111260][] netlink_dump+0x88/0x1d7
[4.111295][] netlink_recvmsg+0x1b1/0x2d1
[4.111331][] sock_recvmsg+0x83/0x98
[4.111365][] ___sys_recvmsg+0x15d/0x207
[4.111400][] __sys_recvmsg+0x3d/0x5e
[4.111434][] SyS_recvmsg+0xd/0x19
[4.111467][] system_call_fastpath+0x16/0x1b
[4.111504] 
[4.111504] other info that might help us debug this:
[4.111504] 
[4.111545]  Possible unsafe locking scenario:
[4.111545] 
[4.111577]        CPU0                    CPU1
[4.111601]        ----                    ----
[4.111625]   lock(nlk->cb_mutex);
[4.112865]                                lock(genl_mutex);
[4.114216]                                lock(nlk->cb_mutex);
[4.115315]   lock(genl_mutex);
[4.116500] 
[4.116500]  *** DEADLOCK ***
[4.116500] 
[4.119670] 1 lock held by NetworkManager/358:
[4.120906]  #0:  (nlk->cb_mutex){+.+.+.}, at: [] netlink_dump+0x1c/0x1d7
[4.122196] 
[4.122196] stack backtrace:
[4.124533] CPU: 0 PID: 358 Comm: NetworkManager Not tainted 3.11.0-rc6 #1
[4.125779] Hardware name: LENOVO 4174EH1/4174EH1, BIOS 8CET51WW (1.31 ) 11/29/2011
[4.126979]  81d0a0f0 88022b91d8c8 8157cf80 
0006
[4.128274]  81cc8750 88022b91d918 8157a898 
88022d798080
[4.129472]  88022d798080 88022d798080 88022d798750 
88022d798080
[4.130645] Call Trace:
[4.131801]  [] dump_stack+0x4f/0x84
[4.132817]  [] print_circular_bug+0x2ad/0x2be
[4.133839]  [] validate_chain.isra.21+0x836/0xe8e
[4.134821]  [] ? sock_def_write_space+0x1b5/0x1b5
[4.135800]  [] __lock_acquire+0x865/0x956
[4.136842]  [] ? mark_held_locks+0xce/0xfa
[4.137828]  [] ? genl_lock+0x12/0x14
[4.138876]  [] lock_acquire+0x57/0x6d
[4.139856]  [] ? genl_lock+0x12/0x14
[4.141027]  [] mutex_lock_nested+0x5e/0x345
[4.142194]  [] ? genl_lock+0x12/0x14
[4.143219]  [] ? __kmalloc_node_track_caller+0x26/0x2d
[4.144340]  [] genl_lock+0x12/0x14
[4.145387]  [] ctrl_dumpfamily+0x31/0xfa
[4.146387]  [] ? __alloc_skb+0x97/0x1a0
[4.147454]  [] netlink_dump+0x88/0x1d7
[4.148448]  [] netlink_recvmsg+0x1b1/0x2d1
[4.149475]  [] sock_recvmsg+0x83/0x98
[4.150494]  [] ? might_fault+0x52/0xa2
[4.151471]  [] ___sys_recvmsg+0x15d/0x207
[4.152516]  [] ? __lock_acquire+0x865/0x956
[4.153501]  [] ? fget_light+0x35c/0x377
[4.154550]  [] ? fget_light+0x164/0x377
[4.155521]  [] __sys_recvmsg+0x3d/0x5e
[4.156568]  [] ?
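
The inversion lockdep complains about can be reproduced with a tiny dependency graph. This is a rough userspace sketch of the idea, not how lockdep is implemented; dep[], reaches() and record_order() are hypothetical names, and the two lock numbers in the usage note stand for genl_mutex and nlk->cb_mutex.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* dep[a][b] is true once lock b has been taken while a was held. */
static bool dep[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: is there a recorded path from `from` to `to`? */
static bool reaches(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reaches(next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquire `want` while holding `held`".
 * Returns false if that would close a cycle, i.e. `want` already leads
 * back to `held` through previously recorded orderings.
 */
static bool record_order(int held, int want)
{
	bool seen[MAX_LOCKS] = { false };

	if (reaches(want, held, seen))
		return false;
	dep[held][want] = true;
	return true;
}
```

With genl_mutex as lock 0 and nlk->cb_mutex as lock 1, recording 0 then 1 (the sendmsg path in chain #1) succeeds, and recording 1 then 0 (the recvmsg path through ctrl_dumpfamily) is rejected as a cycle.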

[PATCH] documentations: treewide: Fix typo in Documentations/filesystems

2013-08-18 Thread Masanari Iida

Correct spelling typos in Documentation/filesystems.

Signed-off-by: Masanari Iida 

---
 Documentation/filesystems/btrfs.txt | 2 +-
 Documentation/filesystems/f2fs.txt  | 2 +-
 Documentation/filesystems/nfs/Exporting | 2 +-
 Documentation/filesystems/qnx6.txt  | 2 +-
 Documentation/filesystems/xfs.txt   | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt
index b349d57..9dae594 100644
--- a/Documentation/filesystems/btrfs.txt
+++ b/Documentation/filesystems/btrfs.txt
@@ -87,7 +87,7 @@ Unless otherwise specified, all options default to off.
 
   device=
Specify a device during mount so that ioctls on the control device
-   can be avoided.  Especialy useful when trying to mount a multi-device
+   can be avoided.  Especially useful when trying to mount a multi-device
setup as root.  May be specified multiple times for multiple devices.
 
   discard
diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt
index 3cd27be..3b57456 100644
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -216,7 +216,7 @@ The dump.f2fs shows the information of specific inode and dumps SSA and SIT to
 file. Each file is dump_ssa and dump_sit.
 
 The dump.f2fs is used to debug on-disk data structures of the f2fs filesystem.
-It shows on-disk inode information reconized by a given inode number, and is
+It shows on-disk inode information recognized by a given inode number, and is
 able to dump all the SSA and SIT entries into predefined files, ./dump_ssa and
 ./dump_sit respectively.
 
diff --git a/Documentation/filesystems/nfs/Exporting b/Documentation/filesystems/nfs/Exporting
index 09994c2..e543b1a 100644
--- a/Documentation/filesystems/nfs/Exporting
+++ b/Documentation/filesystems/nfs/Exporting
@@ -93,7 +93,7 @@ For a filesystem to be exportable it must:
2/ make sure that d_splice_alias is used rather than d_add
   when ->lookup finds an inode for a given parent and name.
 
-  If inode is NULL, d_splice_alias(inode, dentry) is eqivalent to
+  If inode is NULL, d_splice_alias(inode, dentry) is equivalent to
 
d_add(dentry, inode), NULL
 
diff --git a/Documentation/filesystems/qnx6.txt b/Documentation/filesystems/qnx6.txt
index 99e9018..4086797 100644
--- a/Documentation/filesystems/qnx6.txt
+++ b/Documentation/filesystems/qnx6.txt
@@ -149,7 +149,7 @@ Bitmap system area
 --
 
 The bitmap itself is divided into three parts.
-First the system area, that is split into two halfs.
+First the system area, that is split into two halves.
 Then userspace.
 
 The requirement for a static, fixed preallocated system area comes from how
diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 12525b1..5be51fd 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -135,7 +135,7 @@ default behaviour.
If the memory cost of 8 log buffers is too high on small
systems, then it may be reduced at some cost to performance
on metadata intensive workloads. The logbsize option below
-   controls the size of each buffer and so is also relevent to
+   controls the size of each buffer and so is also relevant to
this case.
 
   logbsize=value
-- 
1.8.4.rc3.2.g2c2b664


Re: [patch] PM / devfreq: create_freezable_workqueue() doesn't return an ERR_PTR

2013-08-18 Thread MyungJoo Ham

On Thu, Aug 15, 2013 at 4:55 PM, Dan Carpenter  wrote:
>
> The create_freezable_workqueue() function returns a NULL on error and
> not an ERR_PTR.
>
> Signed-off-by: Dan Carpenter 

Thanks. I'll apply this.

Signed-off-by: MyungJoo Ham 
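
Why the old check could never fire is easy to see with a simplified userspace model of the helpers in <linux/err.h>. The model_* names and MODEL_MAX_ERRNO are stand-ins written for this sketch; they follow the usual convention that only the top 4095 addresses encode an errno.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace model of the kernel's error-pointer helpers. */
#define MODEL_MAX_ERRNO 4095

static inline void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool model_is_err(const void *ptr)
{
	/* Only the top MODEL_MAX_ERRNO addresses encode an error. */
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}
```

NULL is not in the error range, so a function that reports failure with NULL has to be checked with a plain !ptr test, as the patch does.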

>
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index e94e619..5088523 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -983,10 +983,10 @@ static int __init devfreq_init(void)
> }
>
> devfreq_wq = create_freezable_workqueue("devfreq_wq");
> -   if (IS_ERR(devfreq_wq)) {
> +   if (!devfreq_wq) {
> class_destroy(devfreq_class);
> pr_err("%s: couldn't create workqueue\n", __FILE__);
> -   return PTR_ERR(devfreq_wq);
> +   return -ENOMEM;
> }
> devfreq_class->dev_attrs = devfreq_attrs;
>




--
MyungJoo Ham, Ph.D.
System S/W Lab, S/W Center, Samsung Electronics

Re: [PATCH V2 5/6] cpuidle/powerpc: Backend-powerpc idle driver for powernv and pseries.

2013-08-18 Thread Deepthi Dharwar

On 08/07/2013 05:11 AM, Scott Wood wrote:
> On Wed, 2013-08-07 at 09:30 +1000, Benjamin Herrenschmidt wrote:
>> On Tue, 2013-08-06 at 18:08 -0500, Scott Wood wrote:
>>> Here's another example.  get_lppaca() will only build on book3s -- and
>>> yet we get requests for e500 code to use this file.
>>
>> Indeed, Besides there is already accessors afaik for lppaca that compile
>> to nothing on E (and if not they would be trivial to add).
> 
> I don't see such an accessor, but if there were, what would happen when
> the caller goes on to dereference that nothing?
> 
> There is an accessor for shared_proc specifically (in the spinlock code)
> -- not that it would be much help on booke to just compile away that
> check and always select one of the pseries state tables over the other.
> 
> -Scott

Thanks a lot Scott and Ben for the review.
I have addressed the issues in V3 of this patch series which I have just
posted out.

Regards,
Deepthi



Re: BUG: scheduling while atomic 3.10.7 in ZRAM Swap

2013-08-18 Thread Minchan Kim

Hello,

On Mon, Aug 19, 2013 at 12:13:02PM +0800, Michael wang wrote:
> Hi, Mitch
> 
> On 08/17/2013 10:01 PM, Mitch Harder wrote:
> > I'm encountering a BUG while using a ZRAM Swap device.
> > 
> > The call trace seems to involve the changes recently added to 3.10.6
> > by the patch:
> > zram: use zram->lock to protect zram_free_page() in swap free notify path
> > 
> > The hardware is a x86 single CPU AMD Athlon XP system with 1GB RAM.
> > 
> > I'm implementing a 352MB ZRAM swap device, and also have 1GB swap
> > space on the hard disk.
> 
> IMHO, it was caused by that swap_entry_free() was invoked with page
> spin-locked, thus zram_slot_free_notify() should not use rw-lock which
> may goto sleep.
> 
> CC folks related.

Thanks for Ccing me, Michael,

Mitch, it's a known problem and it should be fixed by [1] in recent linux-next.

[1] a0c516cbfc, zram: don't grab mutex in zram_slot_free_notify

Thanks for the report!

> 
> Regards,
> Michael Wang
-- 
Kind regards,
Minchan Kim
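
One common way to avoid sleeping in a notify path that runs in atomic context is to record the request under a non-sleeping lock and do the real work later. The userspace sketch below illustrates that general pattern only. It is an assumption for illustration, not a description of what commit [1] does, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_MAX_PENDING 16

struct sketch_dev {
	atomic_flag lock;			/* spin lock: never sleeps */
	unsigned long pending[SKETCH_MAX_PENDING];
	size_t npending;			/* requests queued, not yet handled */
	size_t nfreed;				/* requests handled so far */
};

static inline void sketch_dev_init(struct sketch_dev *d)
{
	atomic_flag_clear(&d->lock);
	d->npending = 0;
	d->nfreed = 0;
}

static inline void sketch_lock(struct sketch_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* busy-wait instead of sleeping */
}

static inline void sketch_unlock(struct sketch_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* "Atomic context" side: only records the slot index, never blocks. */
static inline int sketch_slot_free_notify(struct sketch_dev *d, unsigned long index)
{
	int ret = -1;

	assert(d != NULL);
	sketch_lock(d);
	if (d->npending < SKETCH_MAX_PENDING) {
		d->pending[d->npending++] = index;
		ret = 0;
	}
	sketch_unlock(d);
	return ret;
}

/* "Process context" side: takes the queued requests and handles them. */
static inline size_t sketch_drain(struct sketch_dev *d)
{
	unsigned long batch[SKETCH_MAX_PENDING];
	size_t n, i;

	sketch_lock(d);
	n = d->npending;
	for (i = 0; i < n; i++)
		batch[i] = d->pending[i];
	d->npending = 0;
	sketch_unlock(d);

	/* Work that may take sleeping locks would happen here, per entry. */
	for (i = 0; i < n; i++) {
		(void)batch[i];
		d->nfreed++;
	}
	return n;
}
```

The notify side holds the spin lock only long enough to store one index, so it is safe to call with another spinlock held. Anything that might sleep is confined to sketch_drain().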

Re: [PATCH 13/34] cpufreq: exynos5440: set CPUFREQ_NO_NOTIFICATION flag

2013-08-18 Thread Viresh Kumar

On 19 August 2013 09:12, amit daniel kachhap  wrote:
>>> +   .flags  = CPUFREQ_STICKY | CPUFREQ_NO_NOTIFICATION,
> How about naming the flag as CPUFREQ_ASYNC_NOTIFICATION? For platforms
> not defining this flag, the notifiers can be called synchronously from
> the core driver.

Nice.. +1

My repo will be updated with this change..

Re: [PATCH v6 0/5] zram/zsmalloc promotion

2013-08-18 Thread Minchan Kim

Hello Bob,

Sorry for the late response. I was on holiday.

On Mon, Aug 19, 2013 at 11:57:41AM +0800, Bob Liu wrote:
> Hi Minchan,
> 
> On 08/19/2013 11:18 AM, Minchan Kim wrote:
> > Hello Mel,
> > 
> > On Fri, Aug 16, 2013 at 09:33:47AM +0100, Mel Gorman wrote:
> >> On Fri, Aug 16, 2013 at 01:26:41PM +0900, Minchan Kim wrote:
> >> 
> >> If it's used for something like tmpfs then it becomes much worse. 
> >> Normal
> >> tmpfs without swap can lockup if tmpfs is allowed to fill memory. In a
> >> sane configuration, lockups will be avoided and deleting a tmpfs file 
> >> is
> >> guaranteed to free memory. When zram is used to back tmpfs, there is no
> >> guarantee that any memory is freed due to fragmentation of the 
> >> compressed
> >> pages. The only way to recover the memory may be to kill applications
> >> holding tmpfs files open and then delete them which is fairly drastic
> >> action in a normal server environment.
> >
> > Indeed.
> > Actually, I had a plan to support zsmalloc compaction. The zsmalloc 
> > exposes
> > handle instead of pure pointer so it could migrate some zpages to 
> > somewhere
> > to pack in. Then, it could help above problem and OOM storm problem.
> > Anyway, it's a totally new feature and requires many changes and 
> > experiement.
> > Although we don't have such feature, zram is still good for many people.
> >
> 
>  And is zsmalloc was pluggable for zswap then it would also benefit.
> >>>
> >>> But zswap isn't pseudo block device so it couldn't be used for block 
> >>> device.
> >>
> >> It would not be impossible to write one. Taking a quick look it might even
> >> be doable by just providing a zbud_ops that does not have an evict handler
> >> and make sure the errors are handled correctly. i.e. does the following
> >> patch mean that zswap never writes back and instead just compresses pages
> >> in memory?
> >>
> >> diff --git a/mm/zswap.c b/mm/zswap.c
> >> index deda2b6..99e41c8 100644
> >> --- a/mm/zswap.c
> >> +++ b/mm/zswap.c
> >> @@ -819,7 +819,6 @@ static void zswap_frontswap_invalidate_area(unsigned 
> >> type)
> >>  }
> >>  
> >>  static struct zbud_ops zswap_zbud_ops = {
> >> -  .evict = zswap_writeback_entry
> >>  };
> >>  
> >>  static void zswap_frontswap_init(unsigned type)
> >>
> >> If so, it should be doable to link that up in a sane way so it can be
> >> configured at runtime.
> >>
> >> Did you ever even try something like this?
> > 
> > Never. Because I didn't have such requirement for zram.
> > 
> >>
> >>> Let say one usecase for using zram-blk.
> >>>
> >>> 1) Many embedded system don't have swap so although tmpfs can support 
> >>> swapout
> >>> it's pointless still so such systems should have sane configuration to 
> >>> limit
> >>> memory space so it's not only zram problem.
> >>>
> >>
> >> If zswap was backed by a pseudo device that failed all writes or an an
> >> ops with no evict handler then it would be functionally similar.
> >>
> >>> 2) Many embedded system don't have enough memory. Let's assume short-lived
> >>> file growing up until half of system memory once in a while. We don't want
> >>> to write it on flash by wear-leveing issue and very slowness so we want 
> >>> to use
> >>> in-memory but if we uses tmpfs, it should evict half of working set to 
> >>> cover
> >>> them when the size reach peak. zram would be better choice.
> >>>
> >>
> >> Then back it by a pseudo device that fails all writes so it does not have
> >> to write to disk.
> > 
> > You mean "make pseudo block device and register make_request_fn
> > and prevent writeback". Bah, yes, it's doable but what is it different with 
> > below?
> > 
> > 1) move zbud into zram
> > 2) implement frontswap API in zram
> > 3) implement writebazk in zram
> > 
> > The zram has been for a long time in staging to be promoted and have been
> > maintained/deployed. Of course, I have asked the promotion several times
> > for above a year.
> > 
> > Why can't zram include zswap functions if you really want to merge them?
> > Is there any problem?
> 
> I think merging zram into zswap or merging zswap into zram are the same
> thing. It's no difference.

True, but I'd like to merge the zswap code into zram.
As you know, zram already has lots of users while zswap is still quite new,
so I'd like to keep backward compatibility for zram. Moving the zswap code
into zram is handier and would keep the git log as well.

> Both way will result in a solution finally with zram block device,
> frontswap API etc.

Right, but the z* family people should discuss whether zswap-writeback is
really a good solution for compressed swap. At first I thought zswap was
different from zram, so there was no issue with promoting zram. Nitin and I
helped the zsmalloc promotion for Seth and reviewed zswap in its initial
phases, but the situation is changing. Let's discuss further points about
the compressed swap solution.
I raised issues as reply of

Re: [PATCH 0/4] mm: merge zram into zswap

2013-08-18 Thread Bob Liu

Hi Minchan,

On 08/19/2013 12:10 PM, Minchan Kim wrote:
> On Sun, Aug 18, 2013 at 04:40:45PM +0800, Bob Liu wrote:
>> Both zswap and zram are used to compress anon pages in memory so as to reduce
>> swap io operation. The main different is that zswap uses zbud as its 
>> allocator
>> while zram uses zsmalloc. The other different is zram will create a block
>> device, the user need to mkswp and swapon it.
>>
>> Minchan has areadly try to promote zram/zsmalloc into drivers/block/, but it 
>> may
>> cause increase maintenance headaches. Since the purpose of zswap and zram are
>> the same, this patch series try to merge them together as Mel suggested.
>> Dropped zram from staging and extended zswap with the same feature as zram.
>>
>> zswap todo:
>> Improve the writeback of zswap pool pages!
>>
>> Bob Liu (4):
>>   drivers: staging: drop zram and zsmalloc
> 
> Bob, I feel you're very rude and I'm really upset.
> 
> You're just dropping the subsystem you didn't do anything without any 
> consensus
> from who are contriubting lots of patches to make it works well for a long 
> time.

I apologize for that; at the very least I should have added [RFC] to the patch title!

-- 
Regards,
-Bob

linux-next: build warning after merge of the tip tree

2013-08-18 Thread Stephen Rothwell

Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
produced this warning:

arch/x86/kernel/paravirt.c:66:0: warning: "DEF_NATIVE" redefined [enabled by default]
 #define DEF_NATIVE(ops, name, code) \
 ^
In file included from arch/x86/include/asm/ptrace.h:65:0,
 from arch/x86/include/asm/alternative.h:8,
 from arch/x86/include/asm/bitops.h:16,
 from include/linux/bitops.h:22,
 from include/linux/kernel.h:10,
 from include/linux/cache.h:4,
 from include/linux/time.h:4,
 from include/linux/stat.h:18,
 from include/linux/module.h:10,
 from arch/x86/kernel/paravirt.c:22:
arch/x86/include/asm/paravirt_types.h:391:0: note: this is the location of the previous definition
 #define DEF_NATIVE(ops, name, code)  \
 ^

Introduced by commit 9a55fdbe941e ("x86, asmlinkage, paravirt: Add
__visible/asmlinkage to xen paravirt ops").  The 2 definitions used to be
identical ... maybe there should be only one.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



[git pull] drm fixes

2013-08-18 Thread Dave Airlie


Hi Linus,

bit late with these, was under the weather for a few days, nothing
too crazy, some radeon regression fixes, one intel regression fix, and
one fix to avoid a warn with i915 when used with dma-buf.

Dave.

The following changes since commit d4e4ab86bcba5a72779c43dc1459f71fea3d89c8:

  Linux 3.11-rc5 (2013-08-11 18:04:20 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 3387ed83943daf6cb1bb4195ae369067b9cd80ce:

  Merge tag 'drm-intel-fixes-2013-08-15' of git://people.freedesktop.org/~danvet/drm-intel (2013-08-19 13:49:20 +1000)



Alex Deucher (1):
  drm/radeon/r7xx: fix copy paste typo in golden register setup

Chris Wilson (1):
  drm/i915: Don't deref pipe->cpu_transcoder in the hangcheck code

Christian König (1):
  drm/radeon: fix UVD message buffer validation

Daniel Vetter (1):
  drm/i915: unpin backing storage in dmabuf_unmap

Dave Airlie (2):
  Merge branch 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux
  Merge tag 'drm-intel-fixes-2013-08-15' of git://people.freedesktop.org/~danvet/drm-intel

Rafał Miłecki (1):
  drm/radeon: fix WREG32_OR macro setting bits in a register

 drivers/gpu/drm/i915/i915_gem_dmabuf.c |  8 
 drivers/gpu/drm/i915/intel_display.c   | 86 ++
 drivers/gpu/drm/radeon/radeon.h|  2 +-
 drivers/gpu/drm/radeon/radeon_uvd.c|  8 
 drivers/gpu/drm/radeon/rv770.c | 12 ++---
 5 files changed, 80 insertions(+), 36 deletions(-)

Re: [PATCH 5/6] sched, fair: Make group power more consitent

2013-08-18 Thread Preeti U Murthy

Hi Peter,

On 08/16/2013 03:42 PM, Peter Zijlstra wrote:

I have a few comments and would like some clarification.

1. How are you ensuring from this patch that sgs->group_power does not
change over the course of load balancing?

The only path to update_group_power() where sg->sgp->power gets
updated, is from update_sg_lb_stats(). You are updating sgs->group_power
in update_sg_lb_stats(). Any change to group->sgp->power will get
reflected in sgs->group_power as well right?

2. This point is aside from your patch. In the current implementation,
each time the cpu power gets updated in update_cpu_power(), shouldn't
the power of the sched_groups comprising that cpu also get updated?
Why wait until load balancing is done at the sched_domain level of
that group to update its group power?

Regards
Preeti U Murthy


Re: [PATCH tip/core/rcu 1/5] rcu: Add duplicate-callback tests to rcutorture

2013-08-18 Thread Josh Triplett

On Sun, Aug 18, 2013 at 08:55:28PM -0700, Paul E. McKenney wrote:
> On Sat, Aug 17, 2013 at 07:54:20PM -0700, Josh Triplett wrote:
> > On Sat, Aug 17, 2013 at 07:25:13PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" 
> > > 
> > > This commit adds a object_debug option to rcutorture to allow the
> > > debug-object-based checks for duplicate call_rcu() invocations to
> > > be deterministically tested.
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > > Cc: Mathieu Desnoyers 
> > > Cc: Sedat Dilek 
> > > Cc: Davidlohr Bueso 
> > > Cc: Rik van Riel 
> > > Cc: Thomas Gleixner 
> > > Cc: Linus Torvalds 
> > > Tested-by: Sedat Dilek 
> > 
> > Two comments below; with those fixed,
> > Reviewed-by: Josh Triplett 
> > 
> > > ---
> > > @@ -100,6 +101,8 @@ module_param(fqs_stutter, int, 0444);
> > >  MODULE_PARM_DESC(fqs_stutter, "Wait time between fqs bursts (s)");
> > >  module_param(n_barrier_cbs, int, 0444);
> > >  MODULE_PARM_DESC(n_barrier_cbs, "# of callbacks/kthreads for barrier 
> > > testing");
> > > +module_param(object_debug, int, 0444);
> > > +MODULE_PARM_DESC(object_debug, "Enable debug-object double call_rcu() 
> > > testing");
> > 
> > modules-next has a change to ignore and warn about
> > unknown module parameters.  Thus, I'd suggest wrapping the ifdef around
> > this module parameter, so it doesn't exist at all without
> > CONFIG_DEBUG_OBJECTS_RCU_HEAD.
> > 
> > Alternatively, consider providing the test unconditionally, and just
> > printing a big warning message saying that it's going to cause
> > corruption in the !CONFIG_DEBUG_OBJECTS_RCU_HEAD case.
> 
> I currently do something like the above.  The module parameter
> is defined unconditionally, but the actual tests are under #ifdef
> CONFIG_DEBUG_OBJECTS_RCU_HEAD.  If you specify object_debug for a
> !CONFIG_DEBUG_OBJECTS_RCU_HEAD kernel, the pr_alert() below happens,
> and the test is omitted, thus avoiding the list corruption.
> 
> Seem reasonable?

That's exactly the bit I was commenting on.  I'm saying that you should
either make the test unconditional (perhaps with a warning saying it's
about to cause list corruption), or you should compile out the module
parameter as well and then you don't need the pr_alert (since current
kernels will emit a warning when you pass a non-existent module
parameter).

Personally, I'd go with the latter.

- Josh Triplett

Re: [PATCH tip/core/rcu 11/11] jiffies: Avoid undefined behavior from signed overflow

2013-08-18 Thread Paul E. McKenney

On Sun, Aug 18, 2013 at 06:20:32PM -0700, Josh Triplett wrote:
> On Sun, Aug 18, 2013 at 05:41:20PM -0700, Paul E. McKenney wrote:
> > On Sat, Aug 17, 2013 at 08:23:51PM -0700, Josh Triplett wrote:
> > > On Sat, Aug 17, 2013 at 06:37:56PM -0700, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" 
> > > > 
> > > > According to the C standard 3.4.3p3, overflow of a signed integer 
> > > > results
> > > > in undefined behavior.  This commit therefore changes the definitions
> > > > of time_after(), time_after_eq(), time_after64(), and time_after_eq64()
> > > > to avoid this undefined behavior.  The trick is that the subtraction
> > > > is done using unsigned arithmetic, which according to 6.2.5p9 cannot
> > > > overflow because it is defined as modulo arithmetic.  This has the added
> > > > (though admittedly quite small) benefit of shortening two lines of code
> > > > by four characters each.
> > > > 
> > > > Note that the C standard considers the cast from unsigned to
> > > > signed to be implementation-defined, see 6.3.1.3p3.  However, on a
> > > > two-complement system, an implementation that defines anything other
> > > > than a reinterpretation of the bits is free come to me, and I will be
> > > 
> > > s/free come/free to come/
> > 
> > Good catch, fixed!
> 
> Just realized when looking at this again that there's another typo:
> "two-complement" should be "two's-complement".

OK, fixed that as well.  ;-)

Thank you for all the reviews and comments!

Thanx, Paul

> > > > happy to act as a witness for its being committed to an insane asylum.
> > > 
> > > With the typo above fixed:
> > > Reviewed-by: Josh Triplett 
> > > 
> > 
> 
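
The commit message quoted in this thread describes doing the subtraction in unsigned arithmetic and only then converting to signed. A minimal sketch of that idea follows. The sketch_* functions are hypothetical stand-ins written as plain functions, not the kernel's time_after() macros, and they rely on the implementation-defined unsigned-to-signed conversion behaving as a bit reinterpretation, as the message assumes for two's-complement systems.

```c
#include <assert.h>
#include <limits.h>

/*
 * The subtraction is done on unsigned long, so it wraps modulo 2^N
 * instead of overflowing; only the result is converted to signed.
 */
static inline int sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

static inline int sketch_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Exercise the comparison across a counter wrap. */
static inline void sketch_time_selfcheck(void)
{
	unsigned long before_wrap = ULONG_MAX - 5;
	unsigned long after_wrap = before_wrap + 10;	/* wraps around to 4 */

	assert(after_wrap == 4);
	assert(sketch_time_after(after_wrap, before_wrap));
	assert(!sketch_time_after(before_wrap, after_wrap));
	assert(sketch_time_after_eq(before_wrap, before_wrap));
}
```

The comparison stays correct across the wrap as long as the two timestamps are less than half the counter range apart.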


Re: BUG: scheduling while atomic 3.10.7 in ZRAM Swap

2013-08-18 Thread Michael wang

Hi, Mitch

On 08/17/2013 10:01 PM, Mitch Harder wrote:
> I'm encountering a BUG while using a ZRAM Swap device.
> 
> The call trace seems to involve the changes recently added to 3.10.6
> by the patch:
> zram: use zram->lock to protect zram_free_page() in swap free notify path
> 
> The hardware is a x86 single CPU AMD Athlon XP system with 1GB RAM.
> 
> I'm implementing a 352MB ZRAM swap device, and also have 1GB swap
> space on the hard disk.

IMHO, this happens because swap_entry_free() is invoked while a spinlock
is held, so zram_slot_free_notify() must not take an rw-lock, which may
sleep.

CCing the related folks.

Regards,
Michael Wang

> 
> The log include multiple messages similar to the following:
> 
> [ 3019.011511] BUG: scheduling while atomic: cc1/23223/0x0001
> [ 3019.011517] Modules linked in: zram(C) nvidia(PO) nvidia_agp
> i2c_nforce2 xts gf128mul sha256_generic
> [ 3019.011528] CPU: 0 PID: 23223 Comm: cc1 Tainted: P C O 3.10.7-std 
> #1
> [ 3019.011531] Hardware name:/MS-6570, BIOS 6.00 PG 03/29/2004
> [ 3019.011534]  f18d0c88 f18d0c88 e8673d30 c1859479 e8673d48 c1853a6d
> c1a11f18 f4f1b79c
> [ 3019.011539]  5ab7 0001 e8673dc8 c185e9dd e8673d60 c11130f0
> f6298e00 
> [ 3019.011543]  c1b61b40 c10d8c40 f4f1b4f0 1000 f4f1b4f0 0001
> e8673d8c c10250ac
> [ 3019.011548] Call Trace:
> [ 3019.011561]  [] dump_stack+0x16/0x18
> [ 3019.011566]  [] __schedule_bug+0x4e/0x5c
> [ 3019.011573]  [] __schedule+0x4fd/0x5a0
> [ 3019.011580]  [] ? bio_put+0x40/0x70
> [ 3019.011586]  [] ? end_swap_bio_read+0x30/0x80
> [ 3019.011593]  [] ? kmap_atomic_prot+0x4c/0xd0
> [ 3019.011597]  [] ? kmap_atomic+0x13/0x20
> [ 3019.011604]  [] ? get_page_from_freelist+0x278/0x500
> [ 3019.011609]  [] schedule+0x22/0x60
> [ 3019.011613]  [] rwsem_down_write_failed+0x95/0x110
> [ 3019.011618]  [] call_rwsem_down_write_failed+0x6/0x8
> [ 3019.011623]  [] ? zram_free_page+0xb0/0xb0 [zram]
> [ 3019.011627]  [] ? down_write+0x24/0x30
> [ 3019.011630]  [] zram_slot_free_notify+0x29/0x50 [zram]
> [ 3019.011635]  [] swap_entry_free+0xe4/0x140
> [ 3019.011639]  [] swapcache_free+0x28/0x40
> [ 3019.011643]  [] delete_from_swap_cache+0x26/0x40
> [ 3019.011646]  [] reuse_swap_page+0x6e/0x80
> [ 3019.011652]  [] do_wp_page.isra.84+0x225/0x5c0
> [ 3019.011656]  [] ? lru_cache_add_lru+0x22/0x40
> [ 3019.011662]  [] ? page_add_new_anon_rmap+0x5c/0xa0
> [ 3019.011666]  [] handle_pte_fault+0x2db/0x5e0
> [ 3019.011669]  [] handle_mm_fault+0x87/0xd0
> [ 3019.011674]  [] ? __do_page_fault+0x480/0x480
> [ 3019.011677]  [] __do_page_fault+0x178/0x480
> [ 3019.011683]  [] ? __do_softirq+0x10f/0x1e0
> [ 3019.011691]  [] ? handle_level_irq+0x58/0x90
> [ 3019.011695]  [] ? irq_exit+0x54/0x90
> [ 3019.011700]  [] ? do_IRQ+0x48/0x94
> [ 3019.011706]  [] ? SyS_write+0x57/0xa0
> [ 3019.011710]  [] ? __do_page_fault+0x480/0x480
> [ 3019.011713]  [] do_page_fault+0xd/0x10
> [ 3019.011717]  [] error_code+0x65/0x6c

Re: [PATCH 0/4] mm: merge zram into zswap

2013-08-18 Thread Minchan Kim

On Sun, Aug 18, 2013 at 04:40:45PM +0800, Bob Liu wrote:
> Both zswap and zram are used to compress anon pages in memory so as to reduce
> swap io operation. The main different is that zswap uses zbud as its allocator
> while zram uses zsmalloc. The other different is zram will create a block
> device, the user need to mkswp and swapon it.
> 
> Minchan has areadly try to promote zram/zsmalloc into drivers/block/, but it 
> may
> cause increase maintenance headaches. Since the purpose of zswap and zram are
> the same, this patch series try to merge them together as Mel suggested.
> Dropped zram from staging and extended zswap with the same feature as zram.
> 
> zswap todo:
> Improve the writeback of zswap pool pages!
> 
> Bob Liu (4):
>   drivers: staging: drop zram and zsmalloc

Bob, I feel you're being very rude and I'm really upset.

You're just dropping a subsystem you haven't contributed to, without any
consensus from the people who have been contributing lots of patches to make
it work well for a long time. I understand you want to merge zram/zswap to
address the concern Mel raised, and your intention might help the community.
But the approach was totally wrong.
You only raised this a few days ago in my thread, and I was on holiday, so I
didn't have time to reply to all of the mail sent to me. Should I break my
holiday just to reply to you? Would you be okay with someone else removing or
moving your work without any consensus with you while you're spending good
time with your family?

Please be careful, Bob.

-- 
Kind regards,
Minchan Kim

[PATCH] [trivial]treewide: Fix typo in printk

2013-08-18 Thread Masanari Iida

Correct spelling typo in printk.

Signed-off-by: Masanari Iida 
---
 drivers/gpu/drm/exynos/exynos_drm_fimc.c | 2 +-
 drivers/gpu/drm/exynos/exynos_drm_gsc.c  | 4 ++--
 drivers/infiniband/ulp/isert/Kconfig | 2 +-
 drivers/media/i2c/Kconfig| 2 +-
 drivers/media/i2c/adv7183.c  | 2 +-
 drivers/media/i2c/s5c73m3/s5c73m3-core.c | 4 ++--
 drivers/media/v4l2-core/v4l2-ctrls.c | 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimc.c b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
index 6e047bd..8926d68 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
@@ -343,7 +343,7 @@ static bool fimc_check_ovf(struct fimc_context *ctx)
 
fimc_write(cfg, EXYNOS_CIWDOFST);
 
-   dev_err(ippdrv->dev, "occured overflow at %d, status 0x%x.\n",
+   dev_err(ippdrv->dev, "occurred overflow at %d, status 0x%x.\n",
ctx->id, status);
return true;
}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
index 90b8a1a..7751a43 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
@@ -1300,13 +1300,13 @@ static irqreturn_t gsc_irq_handler(int irq, void *dev_id)
 
status = gsc_read(GSC_IRQ);
if (status & GSC_IRQ_STATUS_OR_IRQ) {
-   dev_err(ippdrv->dev, "occured overflow at %d, status 0x%x.\n",
+   dev_err(ippdrv->dev, "occurred overflow at %d, status 0x%x.\n",
ctx->id, status);
return IRQ_NONE;
}
 
if (status & GSC_IRQ_STATUS_OR_FRM_DONE) {
-   dev_dbg(ippdrv->dev, "occured frame done at %d, status 0x%x.\n",
+   dev_dbg(ippdrv->dev, "occurred frame done at %d, status 0x%x.\n",
ctx->id, status);
 
buf_id[EXYNOS_DRM_OPS_SRC] = gsc_get_src_buf_index(ctx);
diff --git a/drivers/infiniband/ulp/isert/Kconfig b/drivers/infiniband/ulp/isert/Kconfig
index ce3fd32..5afcfa6 100644
--- a/drivers/infiniband/ulp/isert/Kconfig
+++ b/drivers/infiniband/ulp/isert/Kconfig
@@ -1,5 +1,5 @@
 config INFINIBAND_ISERT
-   tristate "iSCSI Extentions for RDMA (iSER) target support"
+   tristate "iSCSI Extensions for RDMA (iSER) target support"
depends on INET && INFINIBAND_ADDR_TRANS && TARGET_CORE && ISCSI_TARGET
---help---
Support for iSCSI Extentions for RDMA (iSER) Target on Infiniband fabrics.
diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
index b2cd8ca..70c4671 100644
--- a/drivers/media/i2c/Kconfig
+++ b/drivers/media/i2c/Kconfig
@@ -623,7 +623,7 @@ config VIDEO_UPD64083
  To compile this driver as a module, choose M here: the
  module will be called upd64083.
 
-comment "Miscelaneous helper chips"
+comment "Miscellaneous helper chips"
 
 config VIDEO_THS7303
tristate "THS7303/53 Video Amplifier"
diff --git a/drivers/media/i2c/adv7183.c b/drivers/media/i2c/adv7183.c
index 6f738d8..d45e0e3 100644
--- a/drivers/media/i2c/adv7183.c
+++ b/drivers/media/i2c/adv7183.c
@@ -178,7 +178,7 @@ static int adv7183_log_status(struct v4l2_subdev *sd)
adv7183_read(sd, ADV7183_VS_FIELD_CTRL_1),
adv7183_read(sd, ADV7183_VS_FIELD_CTRL_2),
adv7183_read(sd, ADV7183_VS_FIELD_CTRL_3));
-   v4l2_info(sd, "adv7183: Hsync positon control 1 2 and 3 = 0x%02x 0x%02x 0x%02x\n",
+   v4l2_info(sd, "adv7183: Hsync position control 1 2 and 3 = 0x%02x 0x%02x 0x%02x\n",
adv7183_read(sd, ADV7183_HS_POS_CTRL_1),
adv7183_read(sd, ADV7183_HS_POS_CTRL_2),
adv7183_read(sd, ADV7183_HS_POS_CTRL_3));
diff --git a/drivers/media/i2c/s5c73m3/s5c73m3-core.c b/drivers/media/i2c/s5c73m3/s5c73m3-core.c
index 825ea86..d153440 100644
--- a/drivers/media/i2c/s5c73m3/s5c73m3-core.c
+++ b/drivers/media/i2c/s5c73m3/s5c73m3-core.c
@@ -1455,7 +1455,7 @@ static int s5c73m3_oif_registered(struct v4l2_subdev *sd)
mutex_unlock(>lock);
 
v4l2_dbg(1, s5c73m3_dbg, sd, "%s: Booting %s (%d)\n",
-__func__, ret ? "failed" : "succeded", ret);
+__func__, ret ? "failed" : "succeeded", ret);
 
return ret;
 }
@@ -1646,7 +1646,7 @@ static int s5c73m3_probe(struct i2c_client *client,
if (ret < 0)
goto out_err;
 
-   v4l2_info(sd, "%s: completed succesfully\n", __func__);
+   v4l2_info(sd, "%s: completed successfully\n", __func__);
return 0;
 
 out_err:
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index fccd08b..1765182 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -420,7 +420,7 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
"Advanced

Re: [PATCH tip/core/rcu 2/3] rcu: Update RTFP documentation

2013-08-18 Thread Paul E. McKenney

On Sun, Aug 18, 2013 at 05:38:51PM -0700, Josh Triplett wrote:
> On Sun, Aug 18, 2013 at 05:20:02PM -0700, Paul E. McKenney wrote:
> > On Sat, Aug 17, 2013 at 07:46:30PM -0700, Josh Triplett wrote:
> > > On Sat, Aug 17, 2013 at 06:25:52PM -0700, Paul E. McKenney wrote:
> > > > +In 2012, Josh Triplett received his Ph.D. with his dissertation
> > > > +covering RCU-protected resizable hash tables and the relationship
> > > > +between memory barriers and read-side traversal order:  If the updater
> > > > +is making changes in the opposite direction from the read-side traveral
> > > > +order, the updater need only execute a memory-barrier instruction,
> > > > +but if in the same direction, the updater needs to wait for a grace
> > > > +period between the individual updates [JoshTriplettPhD].  Also in 2012,
> > > 
> > > :)
> > > 
> > > > +after seventeen years of attempts, an RCU paper made it into a 
> > > > top-flight
> > > > +academic journal, IEEE Transactions on Parallel and Distributed Systems
> > > > +[MathieuDesnoyers2012URCU].  A group of researchers in Spain applied
> > > 
> > > What about the 2010 paper in Operating Systems Review?
> > 
> > It is already there, but not visible in this patch:
> > 
> > 2010 produced a simpler preemptible-RCU implementation
> > based on TREE_RCU [PaulEMcKenney2010SimpleOptRCU], lockdep-RCU
> > [PaulEMcKenney2010LockdepRCU], another resizeable RCU-protected hash
> > table [HerbertXu2010RCUResizeHash] (this one consuming more memory,
> > but allowing arbitrary changes in hash function, as required for DoS
> > avoidance in the networking code), realization of the 2009 RCU-protected
> > hash table with atomic node move [JoshTriplett2010RPHash], an update on
> > the RCU API [PaulEMcKenney2010RCUAPI].
> > 
> > And:
> > 
> > @article{JoshTriplett2010RPHash
> > ,author="Josh Triplett and Paul E. McKenney and Jonathan Walpole"
> > ,title="Scalable Concurrent Hash Tables via Relativistic Programming"
> > ,journal="ACM Operating Systems Review"
> > ,year=2010
> > ,volume=44
> > ,number=3
> > ,month="July"
> > ,annotation={
> > RP fun with hash tables.
> > http://portal.acm.org/citation.cfm?id=1842733.1842750
> > }
> 
> Right, I saw it in the file when I checked; I meant, that journal paper
> seems to contradict "after seventeen years of attempts, an RCU paper
> made it into a top-flight academic journal". :)

Ah, from what I can see, OSR is on its way up, but still mid-ranks.
(Some years back, it was low-end -- unreviewed.)

> > > > +,day = {25}
> > > > +,doi = {10.1007/s11227-012-0766-x}
> > > > +,issn = {0920-8542}
> > > > +,journal = {The Journal of Supercomputing}
> > > > +,keywords = {linux, simulation}
> > > > +,month = apr
> > > > +,posted-at = {2012-05-03 09:12:04}
> > > > +,priority = {2}
> > > > +,title = {{A Read-Copy Update based parallel server for distributed 
> > > > crowd simulations}}
> > > > +,url = {http://dx.doi.org/10.1007/s11227-012-0766-x}
> > > > +,year = {2012}
> > > > +}
> > > > +
> > > > +
> > > > +@unpublished{JonCorbet2012ACCESS:ONCE
> > > 
> > > LWN is not "unpublished"; it's at least "misc", and I'd suggest
> > > "article".  Ditto for every other LWN cite in this bibliography.
> > 
> > There does seem to be a diverse set of advice out there, with some
> > agreeing with you on "misc", others advocating for "electronic", and
> > still others suggesting use of LaBibTex with its "online" tag, and with
> > the Tex Frequently Asked Questions page saying:
> > 
> > There is no citation type for URLs, per se, in the standard
> > BibTeX styles, though Oren Patashnik (the author of BibTeX)
> > is believed to be considering developing one such for use with
> > the long-awaited BibTeX version 1.0.
> > 
> > I couldn't find any online .bib files with entries for Linux Weekly News
> > articles.  Other than my own, of course!  (I know people have cited
> > them in papers, but Google doesn't see the corresponding .bib files.)
> > 
> > Given all that, I am going to stick with "unpublished" for the moment,
> > and wait at least one year to see if BibTex version 1.0 comes out.
> 
> Several different tags make sense, but "unpublished" isn't one of them.
> "unpublished" exists for entirely un-reviewed works such as self-hosted
> PDFs.  LWN has editorial standards.  Thus, of the standard tags that
> work with all BibTeX styles, I think either "article" or "misc" would
> make more sense than "unpublished".
> 
> An example from one of my own .bib files:
> 
> @article{tiny-rcu-lwn,
> author = "Paul E. McKenney",
> title = {{RCU: The Bloatwatch Edition}},
> journal = "Linux Weekly News",
> month = "March",
> year = "2009",
> day = "17",
> url = {https://lwn.net/Articles/323929/}
> }
> 
> (With the obvious change that since you don't use "url" in your .bib
> files, that should go in "howpublished" or "note" instead.)

I might do this at some point, but don't want to do

linux-next: manual merge of the tip tree with the pm tree

2013-08-18 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the tip tree got a conflict in
arch/x86/include/asm/processor.h between commit 61c63e5ed3b9 ("cpufreq:
Remove unused APERF/MPERF support") from the pm tree and commit
96e39ac0e9d1 ("x86: Introduce hypervisor_cpuid_base()") from the tip tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

diff --cc arch/x86/include/asm/processor.h
index 4f4a3d9,61a5533..000
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@@ -942,6 -942,50 +942,21 @@@ extern int set_tsc_mode(unsigned int va
  
  extern u16 amd_get_nb_id(int cpu);
  
 -struct aperfmperf {
 -  u64 aperf, mperf;
 -};
 -
 -static inline void get_aperfmperf(struct aperfmperf *am)
 -{
 -  WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_APERFMPERF));
 -
 -  rdmsrl(MSR_IA32_APERF, am->aperf);
 -  rdmsrl(MSR_IA32_MPERF, am->mperf);
 -}
 -
 -#define APERFMPERF_SHIFT 10
 -
 -static inline
 -unsigned long calc_aperfmperf_ratio(struct aperfmperf *old,
 -  struct aperfmperf *new)
 -{
 -  u64 aperf = new->aperf - old->aperf;
 -  u64 mperf = new->mperf - old->mperf;
 -  unsigned long ratio = aperf;
 -
 -  mperf >>= APERFMPERF_SHIFT;
 -  if (mperf)
 -  ratio = div64_u64(aperf, mperf);
 -
 -  return ratio;
 -}
 -
+ static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
+ {
+   uint32_t base, eax, signature[3];
+ 
+   for (base = 0x40000000; base < 0x40010000; base += 0x100) {
+   cpuid(base, &eax, &signature[0], &signature[1], &signature[2]);
+ 
+   if (!memcmp(sig, signature, 12) &&
+   (leaves == 0 || ((eax - base) >= leaves)))
+   return base;
+   }
+ 
+   return 0;
+ }
+ 
  extern unsigned long arch_align_stack(unsigned long sp);
  extern void free_init_pages(char *what, unsigned long begin, unsigned long 
end);
  

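For readers who want to see what the helper kept in this resolution does, here
is a minimal userspace sketch of the same scan. It assumes a caller-supplied
callback in place of the kernel's cpuid(), so it can run on any machine. The
names cpuid_fn, find_hypervisor_base and fake_cpuid are hypothetical and are
not kernel API; the "KVMKVMKVM" signature is only used as sample data.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callback standing in for the kernel's cpuid() helper. */
typedef void (*cpuid_fn)(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                         uint32_t *ecx, uint32_t *edx);

/*
 * Scan the hypervisor CPUID range in steps of 0x100 and return the first
 * base leaf whose 12-byte signature matches sig and which advertises at
 * least `leaves` leaves above the base (0 means "don't care").
 * Returns 0 if nothing matches.
 */
static uint32_t find_hypervisor_base(cpuid_fn cpuid, const char *sig,
                                     uint32_t leaves)
{
    uint32_t base, eax, signature[3];

    for (base = 0x40000000; base < 0x40010000; base += 0x100) {
        cpuid(base, &eax, &signature[0], &signature[1], &signature[2]);

        if (!memcmp(sig, signature, 12) &&
            (leaves == 0 || (eax - base) >= leaves))
            return base;
    }
    return 0;
}

/* Fake CPU for illustration: one signature located at leaf 0x40000100. */
static void fake_cpuid(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                       uint32_t *ecx, uint32_t *edx)
{
    uint32_t regs[3] = { 0, 0, 0 };

    *eax = 0;
    if (leaf == 0x40000100) {
        memcpy(regs, "KVMKVMKVM\0\0\0", 12);
        *eax = 0x40000101;   /* highest leaf: base + 1 */
    }
    *ebx = regs[0];
    *ecx = regs[1];
    *edx = regs[2];
}
```

Calling find_hypervisor_base(fake_cpuid, "KVMKVMKVM\0\0\0", 0) finds the base
at 0x40000100 in this model, while asking for two or more leaves above the base
finds nothing, because the fake CPU only advertises one.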


Re: [PATCH tip/core/rcu 5/5] rcu: Make rcutorture emit online failures if verbose

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 07:59:05PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 07:25:17PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > Although rcutorture counts CPU-hotplug online failures, it does
> > not explicitly record which CPUs were having trouble coming online.
> > This commit therefore emits a console message when online failure occurs.
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> One completely optional note below; with or without that change,
> Reviewed-by: Josh Triplett 
> 
> >  kernel/rcutorture.c | 8 +++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
> > index 7d42d13..15bec39 100644
> > --- a/kernel/rcutorture.c
> > +++ b/kernel/rcutorture.c
> > @@ -1437,7 +1437,13 @@ rcu_torture_onoff(void *arg)
> >  torture_type, cpu);
> > starttime = jiffies;
> > n_online_attempts++;
> > -   if (cpu_up(cpu) == 0) {
> > +   ret = cpu_up(cpu);
> > +   if (ret != 0) {
> 
> Or just "if (ret) {"

Makes sense, fixed!

Thanx, Paul


Re: [PATCH tip/core/rcu 3/5] rcu: Sort rcutorture module parameters

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 07:57:40PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 07:25:15PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > There are getting to be too many module parameters to permit the current
> > semi-random order, so this patch orders them.
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> As long as you're reordering them anyway, how about grouping each of the
> variables together with the corresponding module parameter macros, and
> then dropping the comments that duplicate the module parameter
> documentation?
> 
> With that change:
> Reviewed-by: Josh Triplett 

Good point, done!

Thanx, Paul


Re: [PATCH v6 0/5] zram/zsmalloc promotion

2013-08-18 Thread Bob Liu

Hi Minchan,

On 08/19/2013 11:18 AM, Minchan Kim wrote:
> Hello Mel,
> 
> On Fri, Aug 16, 2013 at 09:33:47AM +0100, Mel Gorman wrote:
>> On Fri, Aug 16, 2013 at 01:26:41PM +0900, Minchan Kim wrote:
>> 
>> If it's used for something like tmpfs then it becomes much worse. Normal
>> tmpfs without swap can lockup if tmpfs is allowed to fill memory. In a
>> sane configuration, lockups will be avoided and deleting a tmpfs file is
>> guaranteed to free memory. When zram is used to back tmpfs, there is no
>> guarantee that any memory is freed due to fragmentation of the compressed
>> pages. The only way to recover the memory may be to kill applications
>> holding tmpfs files open and then delete them which is fairly drastic
>> action in a normal server environment.
>
> Indeed.
> Actually, I had a plan to support zsmalloc compaction. The zsmalloc 
> exposes
> handle instead of pure pointer so it could migrate some zpages to 
> somewhere
> to pack in. Then, it could help above problem and OOM storm problem.
> Anyway, it's a totally new feature and requires many changes and 
> experiments.
> Although we don't have such feature, zram is still good for many people.
>

>>>> And if zsmalloc was pluggable for zswap then it would also benefit.
>>>
>>> But zswap isn't pseudo block device so it couldn't be used for block device.
>>
>> It would not be impossible to write one. Taking a quick look it might even
>> be doable by just providing a zbud_ops that does not have an evict handler
>> and make sure the errors are handled correctly. i.e. does the following
>> patch mean that zswap never writes back and instead just compresses pages
>> in memory?
>>
>> diff --git a/mm/zswap.c b/mm/zswap.c
>> index deda2b6..99e41c8 100644
>> --- a/mm/zswap.c
>> +++ b/mm/zswap.c
>> @@ -819,7 +819,6 @@ static void zswap_frontswap_invalidate_area(unsigned 
>> type)
>>  }
>>  
>>  static struct zbud_ops zswap_zbud_ops = {
>> -.evict = zswap_writeback_entry
>>  };
>>  
>>  static void zswap_frontswap_init(unsigned type)
>>
>> If so, it should be doable to link that up in a sane way so it can be
>> configured at runtime.
>>
>> Did you ever even try something like this?
> 
> Never. Because I didn't have such requirement for zram.
> 
>>
>>> Let say one usecase for using zram-blk.
>>>
>>> 1) Many embedded system don't have swap so although tmpfs can support 
>>> swapout
>>> it's pointless still so such systems should have sane configuration to limit
>>> memory space so it's not only zram problem.
>>>
>>
>> If zswap was backed by a pseudo device that failed all writes or an
>> ops with no evict handler then it would be functionally similar.
>>
>>> 2) Many embedded system don't have enough memory. Let's assume short-lived
>>> file growing up until half of system memory once in a while. We don't want
>>> to write it on flash by wear-leveling issue and very slowness so we want to 
>>> use
>>> in-memory but if we uses tmpfs, it should evict half of working set to cover
>>> them when the size reach peak. zram would be better choice.
>>>
>>
>> Then back it by a pseudo device that fails all writes so it does not have
>> to write to disk.
> 
> You mean "make pseudo block device and register make_request_fn
> and prevent writeback". Bah, yes, it's doable but what is it different with 
> below?
> 
> 1) move zbud into zram
> 2) implement frontswap API in zram
> 3) implement writeback in zram
> 
> The zram has been for a long time in staging to be promoted and have been
> maintained/deployed. Of course, I have asked the promotion several times
> for above a year.
> 
> Why can't zram include zswap functions if you really want to merge them?
> Is there any problem?

I think merging zram into zswap and merging zswap into zram are the same
thing; there is no real difference.
Either way, the end result is one solution that provides both the zram block
device and the frontswap API.

The only difference is the name and the title of the merging patch, which I
think is unimportant.

I've implemented a series, "[PATCH 0/4] mm: merge zram into zswap". I can
change the title to "merge zswap into zram" if you want, and rename zswap
to something like zhybrid.

-- 
Regards,
-Bob
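
Mel's suggestion quoted above, an ops table with no evict handler, can be
modelled in userspace roughly as follows. This is a minimal sketch and not the
zbud code: pool_ops, pool, pool_reclaim and writeback_entry are hypothetical
names, and returning -EINVAL when no handler is registered is an assumption of
the sketch.

```c
#include <errno.h>
#include <stddef.h>

struct pool;

/* Hypothetical ops table; evict is optional, as in the quoted patch. */
struct pool_ops {
    int (*evict)(struct pool *pool, unsigned long handle);
};

struct pool {
    const struct pool_ops *ops;
    unsigned long lru_handle;   /* oldest stored handle, 0 if empty */
    unsigned int evicted;       /* number of successful writebacks */
};

/*
 * Try to reclaim the oldest entry by writing it back through ops->evict.
 * With no evict handler the pool never writes back: entries stay
 * compressed in memory and the caller gets -EINVAL.
 */
static int pool_reclaim(struct pool *pool)
{
    int ret;

    if (!pool->ops || !pool->ops->evict || pool->lru_handle == 0)
        return -EINVAL;

    ret = pool->ops->evict(pool, pool->lru_handle);
    if (ret == 0) {
        pool->lru_handle = 0;
        pool->evicted++;
    }
    return ret;
}

/* Example handler that pretends the writeback always succeeds. */
static int writeback_entry(struct pool *pool, unsigned long handle)
{
    (void)pool;
    (void)handle;
    return 0;
}

static const struct pool_ops writeback_ops = { .evict = writeback_entry };
static const struct pool_ops no_writeback_ops = { .evict = NULL };
```

A pool set up with no_writeback_ops behaves like the "compress in memory only"
configuration discussed in this thread: every reclaim attempt fails and the
caller has to handle that error, which is the part Mel says must be checked.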

Re: [PATCH tip/core/rcu 1/5] rcu: Add duplicate-callback tests to rcutorture

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 07:54:20PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 07:25:13PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > This commit adds an object_debug option to rcutorture to allow the
> > debug-object-based checks for duplicate call_rcu() invocations to
> > be deterministically tested.
> > 
> > Signed-off-by: Paul E. McKenney 
> > Cc: Mathieu Desnoyers 
> > Cc: Sedat Dilek 
> > Cc: Davidlohr Bueso 
> > Cc: Rik van Riel 
> > Cc: Thomas Gleixner 
> > Cc: Linus Torvalds 
> > Tested-by: Sedat Dilek 
> 
> Two comments below; with those fixed,
> Reviewed-by: Josh Triplett 
> 
> > ---
> > @@ -100,6 +101,8 @@ module_param(fqs_stutter, int, 0444);
> >  MODULE_PARM_DESC(fqs_stutter, "Wait time between fqs bursts (s)");
> >  module_param(n_barrier_cbs, int, 0444);
> >  MODULE_PARM_DESC(n_barrier_cbs, "# of callbacks/kthreads for barrier 
> > testing");
> > +module_param(object_debug, int, 0444);
> > +MODULE_PARM_DESC(object_debug, "Enable debug-object double call_rcu() 
> > testing");
> 
> modules-next has a change to ignore and warn about
> unknown module parameters.  Thus, I'd suggest wrapping the ifdef around
> this module parameter, so it doesn't exist at all without
> CONFIG_DEBUG_OBJECTS_RCU_HEAD.
> 
> Alternatively, consider providing the test unconditionally, and just
> printing a big warning message saying that it's going to cause
> corruption in the !CONFIG_DEBUG_OBJECTS_RCU_HEAD case.

I currently do something like the above.  The module parameter
is defined unconditionally, but the actual tests are under #ifdef
CONFIG_DEBUG_OBJECTS_RCU_HEAD.  If you specify object_debug for a
!CONFIG_DEBUG_OBJECTS_RCU_HEAD kernel, the pr_alert() below happens,
and the test is omitted, thus avoiding the list corruption.

Seem reasonable?

> > @@ -2163,6 +2178,28 @@ rcu_torture_init(void)
> > firsterr = retval;
> > goto unwind;
> > }
> > +   if (object_debug) {
> > +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
> > +   struct rcu_head rh1;
> > +   struct rcu_head rh2;
> > +
> > > +   init_rcu_head_on_stack(&rh1);
> > > +   init_rcu_head_on_stack(&rh2);
> > > +   pr_alert("rcutorture: WARN: Duplicate call_rcu() test 
> > > starting.\n");
> > > +   local_irq_disable(); /* Make it hard to finish grace period. */
> > > +   call_rcu(&rh1, rcu_torture_leak_cb); /* start grace period. */
> > > +   call_rcu(&rh2, rcu_torture_err_cb);
> > > +   call_rcu(&rh2, rcu_torture_err_cb); /* duplicate callback. */
> > > +   local_irq_enable();
> > > +   rcu_barrier();
> > > +   pr_alert("rcutorture: WARN: Duplicate call_rcu() test 
> > > complete.\n");
> > > +   destroy_rcu_head_on_stack(&rh1);
> > > +   destroy_rcu_head_on_stack(&rh2);
> > +#else /* #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD */
> > +   pr_alert("rcutorture: !%s, not testing duplicate call_rcu()\n",
> > +"CONFIG_DEBUG_OBJECTS_RCU_HEAD");
> 
> Why put this parameter in a separate string?  That makes it harder to
> grep for the full error message.  (That's assuming you keep the error
> message, given the comment above.)

Force of habit, fixed.  ;-)

Thanx, Paul


Re: cgroup/next tree: reference to uninitialized percpu ref

2013-08-18 Thread Li Zefan

On 2013/8/19 11:32, Ming Lei wrote:
> Hi,
> 
> The kernel oops[1] is triggered during kernel boot with the latest next
> tree(3.11.0-rc5-next-20130816), looks it is caused by reference to 
> uninitialized
> percpu ref of root cgroup, and below patch can fix the problem:
> 

Thanks for the report. Li Zhong has submitted a patch to fix it:

www.spinics.net/lists/linux-next/msg26414.html

and it should show up in linux-next tree when next is updated.

> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 723194f..0e8954b 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4485,7 +4485,8 @@ static long cgroup_create(struct cgroup *parent,
> struct dentry *dentry,
>   struct cgroup_subsys_state *css = css_ar[ss->subsys_id];
> 
>   dget(dentry);
> - percpu_ref_get(&css->parent->refcnt);
> + if (!(css->parent->flags & CSS_ROOT))
> + percpu_ref_get(&css->parent->refcnt);
>   }
> 



Re: [PATCH 13/34] cpufreq: exynos5440: set CPUFREQ_NO_NOTIFICATION flag

2013-08-18 Thread amit daniel kachhap

On Sun, Aug 18, 2013 at 4:24 PM, amit daniel kachhap
 wrote:
> On Fri, Aug 16, 2013 at 7:55 AM, Viresh Kumar  wrote:
>> Most of the drivers do following in their ->target_index() routines:
>>
>> struct cpufreq_freqs freqs;
>> freqs.old = old freq...
>> freqs.new = new freq...
>>
>> cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
>>
>> /* Change rate here */
>>
>> cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
>>
>> This is replicated over all cpufreq drivers today and there doesn't exist a
>> good enough reason why this shouldn't be moved to cpufreq core instead.
>>
>> Earlier patches have added support in cpufreq core to do cpufreq notification on
>> frequency change, but this driver needs to do this notification itself and so
>> it sets its CPUFREQ_NO_NOTIFICATION flag.
>>
>> Cc: Kukjin Kim 
>> Signed-off-by: Viresh Kumar 
> The code change looks fine,
> Acked-By: Amit Daniel Kachhap 
>
> Thanks
> Amit Daniel
>> ---
>>  drivers/cpufreq/exynos5440-cpufreq.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/exynos5440-cpufreq.c 
>> b/drivers/cpufreq/exynos5440-cpufreq.c
>> index 91a64d6..8fb6183 100644
>> --- a/drivers/cpufreq/exynos5440-cpufreq.c
>> +++ b/drivers/cpufreq/exynos5440-cpufreq.c
>> @@ -323,7 +323,7 @@ static int exynos_cpufreq_cpu_init(struct cpufreq_policy 
>> *policy)
>>  }
>>
>>  static struct cpufreq_driver exynos_driver = {
>> -   .flags  = CPUFREQ_STICKY,
>> +   .flags  = CPUFREQ_STICKY | CPUFREQ_NO_NOTIFICATION,
How about naming the flag as CPUFREQ_ASYNC_NOTIFICATION? For platforms
not defining this flag, the notifiers can be called synchronously from
the core driver.

Thanks,
Amit Daniel
>> .verify = cpufreq_generic_frequency_table_verify,
>> .target_index   = exynos_target,
>> .get= exynos_getspeed,
>> --
>> 1.7.12.rc2.18.g61b472e
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
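
The split discussed in this patch (the core issues the notifications unless the
driver opts out) can be sketched in userspace as below. This is only a model
under stated assumptions: DRV_NO_NOTIFICATION, struct driver, core_set_freq and
the counters are hypothetical names, not the cpufreq API, and the handling of a
failed change is a choice made for the sketch.

```c
#include <stdbool.h>

#define DRV_NO_NOTIFICATION 0x1   /* hypothetical: driver notifies itself */

enum notify_state { PRECHANGE, POSTCHANGE };

struct freqs {
    unsigned int old_khz;
    unsigned int new_khz;
};

struct driver {
    unsigned int flags;
    unsigned int cur_khz;
    int (*target)(struct driver *drv, unsigned int khz);
};

/* Counts notifications so the behaviour can be observed. */
static unsigned int notify_count[2];

static void notify_transition(const struct freqs *f, enum notify_state s)
{
    (void)f;
    notify_count[s]++;
}

/*
 * Core-side frequency change: the PRECHANGE/POSTCHANGE pair is issued here
 * once, instead of in every driver, unless the driver opted out.
 */
static int core_set_freq(struct driver *drv, unsigned int khz)
{
    struct freqs f = { .old_khz = drv->cur_khz, .new_khz = khz };
    bool notify = !(drv->flags & DRV_NO_NOTIFICATION);
    int ret;

    if (notify)
        notify_transition(&f, PRECHANGE);

    ret = drv->target(drv, khz);   /* change rate here */

    if (notify) {
        if (ret)
            f.new_khz = f.old_khz; /* failed: report that nothing changed */
        notify_transition(&f, POSTCHANGE);
    }
    return ret;
}

/* Trivial driver callback that always succeeds. */
static int simple_target(struct driver *drv, unsigned int khz)
{
    drv->cur_khz = khz;
    return 0;
}
```

A driver that sets the opt-out flag still has its target callback invoked, but
the core stays silent and the driver is expected to send both notifications
itself, which is the case described for exynos5440 above.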

Re: [Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue after unsetting BSP flag

2013-08-18 Thread Eric W. Biederman




>
>Sorry Eric, I'm not clear to what you mean by ``short one core''...
>Which are you suggesting? Disabling BSP if crash happens on AP is
>reasonable?
>Or restricting cpus to a single one only just as the current kdump
>configuration is reasonable?

I am suggesting we start every cpu except the BSP from the AP we started on.

N-1 cpus seems like a good tradeoff between performance and reliability for 
those who need it.

Eric


Re: [PATCH tip/core/rcu 6/9] nohz_full: Add full-system idle states and variables

2013-08-18 Thread Paul E. McKenney

On Sun, Aug 18, 2013 at 07:49:14PM -0700, Josh Triplett wrote:
> On Sun, Aug 18, 2013 at 06:39:25PM -0700, Paul E. McKenney wrote:
> > On Sat, Aug 17, 2013 at 08:09:21PM -0700, Josh Triplett wrote:
> > > On Sat, Aug 17, 2013 at 06:49:41PM -0700, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" 
> > > > 
> > > > This commit adds control variables and states for full-system idle.
> > > > The system will progress through the states in numerical order when
> > > > the system is fully idle (other than the timekeeping CPU), and reset
> > > > down to the initial state if any non-timekeeping CPU goes non-idle.
> > > > The current state is kept in full_sysidle_state.
> > > > 
> > > > A RCU_SYSIDLE_SMALL macro is defined, and systems with this number
> > > > of CPUs or fewer move through the states more aggressively.  The idea
> > > > is that the resulting memory contention is less of a problem on small
> > > > systems.  Architectures can adjust this value (which defaults to 8)
> > > > using CONFIG_ARCH_RCU_SYSIDLE_SMALL.
> > > > 
> > > > One flavor of RCU will be in charge of driving the state machine,
> > > > defined by rcu_sysidle_state.  This should be the busiest flavor of RCU.
> > > > 
> > > > Signed-off-by: Paul E. McKenney 
> > > > Cc: Frederic Weisbecker 
> > > > Cc: Steven Rostedt 
> > > 
> > > One issue (and one question) below; with the issue addressed,
> > > Reviewed-by: Josh Triplett 
> > > 
> > > >  kernel/rcutree_plugin.h | 28 
> > > >  1 file changed, 28 insertions(+)
> > > > 
> > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > > > index eab81da..64a05b9f 100644
> > > > --- a/kernel/rcutree_plugin.h
> > > > +++ b/kernel/rcutree_plugin.h
> > > > @@ -2378,6 +2378,34 @@ static void rcu_kick_nohz_cpu(int cpu)
> > > >  #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> > > >  
> > > >  /*
> > > > + * Handle small systems specially, accelerating their transition into
> > > > + * full idle state.  Allow arches to override this code's idea of
> > > > + * what constitutes a "small" system.
> > > > + */
> > > > +#ifdef CONFIG_ARCH_RCU_SYSIDLE_SMALL
> > > 
> > > I don't see any Kconfig creating this new config option.
> > > 
> > > Also, why not simply define this config option unconditionally, with a
> > > default of 8, and then use its value directly?
> > 
> > Good point, removing this and adding a Kconfig option in the
> > "nohz_full: Add full-system-idle state machine" commit, with a
> > default value of 8.  Architecture maintainers who want something
> > different can then set that up in their defconfig files.
> 
> Sounds good.
> 
> > > > +static int __maybe_unused full_sysidle_state; /* Current system-idle state. */
> > > > +#define RCU_SYSIDLE_NOT        0   /* Some CPU is not idle. */
> > > > +#define RCU_SYSIDLE_SHORT      1   /* All CPUs idle for brief period. */
> > > > +#define RCU_SYSIDLE_LONG       2   /* All CPUs idle for long enough. */
> > > > +#define RCU_SYSIDLE_FULL       3   /* All CPUs idle, ready for sysidle. */
> > > > +#define RCU_SYSIDLE_FULL_NOTED 4   /* Actually entered sysidle state. */
> > > 
> > > Perhaps there's a kernel style rule I'm not thinking of that makes it
> > > verboten, but: why not use an enum for a state variable like this?
> > 
> > I didn't trust enum interactions with xchg and cmpxchg, so opted for "int"
> > instead.  That said, enum is much more portable than when I last looked
> > at it.  Admittedly, the last time I looked at it was in the early 1980s...
> 
> That would make sense if this was an atomic_t, but it's an int; unless I
> missed something, you don't currently use xchg or cmpxchg on it.

The xchg and cmpxchg show up in the "Add full-system-idle state machine"
commit.  Of course, now I am trying to remember why I used int instead
of atomic_t in this case...  :-/

Thanx, Paul
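
The enum-versus-int question above can be illustrated with a small userspace
sketch in C11, where the state names come from the quoted patch but everything
else is an assumption: the value is stored in an atomic_int and the enum is
only used for the names, so the compare-and-swap operates on a plain integer.
The helpers sysidle_advance and sysidle_reset are hypothetical and are not the
state machine from the later commit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* State names follow the quoted patch; the helpers below are hypothetical. */
enum sysidle_state {
    RCU_SYSIDLE_NOT = 0,     /* Some CPU is not idle. */
    RCU_SYSIDLE_SHORT,       /* All CPUs idle for brief period. */
    RCU_SYSIDLE_LONG,        /* All CPUs idle for long enough. */
    RCU_SYSIDLE_FULL,        /* All CPUs idle, ready for sysidle. */
    RCU_SYSIDLE_FULL_NOTED,  /* Actually entered sysidle state. */
};

/* Zero-initialized, so the machine starts in RCU_SYSIDLE_NOT. */
static atomic_int full_sysidle_state;

/*
 * Advance one step, but only if the state is still `from`; a concurrent
 * reset to RCU_SYSIDLE_NOT makes the compare-and-swap fail.
 */
static bool sysidle_advance(enum sysidle_state from)
{
    int expected = from;

    if (from >= RCU_SYSIDLE_FULL_NOTED)
        return false;
    return atomic_compare_exchange_strong(&full_sysidle_state, &expected,
                                          from + 1);
}

/* Any CPU going non-idle resets the machine; returns the previous state. */
static enum sysidle_state sysidle_reset(void)
{
    return (enum sysidle_state)atomic_exchange(&full_sysidle_state,
                                               RCU_SYSIDLE_NOT);
}
```

Keeping the storage as an integer type and converting at the edges sidesteps
any doubt about how an enum object interacts with the atomic primitives, at the
cost of losing the compiler's type checking on the state variable.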


cgroup/next tree: reference to uninitialized percpu ref

2013-08-18 Thread Ming Lei

Hi,

The kernel oops[1] is triggered during kernel boot with the latest next
tree (3.11.0-rc5-next-20130816). It looks like it is caused by a reference to the
uninitialized percpu ref of the root cgroup, and the patch below fixes the problem:

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 723194f..0e8954b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4485,7 +4485,8 @@ static long cgroup_create(struct cgroup *parent,
struct dentry *dentry,
  struct cgroup_subsys_state *css = css_ar[ss->subsys_id];

  dget(dentry);
- percpu_ref_get(&css->parent->refcnt);
+ if (!(css->parent->flags & CSS_ROOT))
+ percpu_ref_get(&css->parent->refcnt);
  }

  /* hold a ref to the parent's dentry */



[1], oops log:
[3.155985] Unable to handle kernel paging request at virtual
address 011bb000
[3.163083] pgd = ee864000
[3.165715] [011bb000] *pgd=
[3.169219] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[3.174428] Modules linked in: ipv6
[3.177844] CPU: 1 PID: 1 Comm: systemd Not tainted
3.11.0-rc5-next-20130816+ #237
[3.185280] task: ef00e400 ti: ef09c000 task.ti: ef09c000
[3.190573] PC is at cgroup_mkdir+0x324/0x5a0
[3.194841] LR is at cgroup_mkdir+0x314/0x5a0
[3.199114] pc : []lr : []psr: 40010013
[3.199114] sp : ef09def8  ip :   fp : ee8e
[3.210393] r10: c064b4e4  r9 : ef09c000  r8 : c064b528
[3.215511] r7 : c064b4dc  r6 : ee8e0018  r5 : eed1f880  r4 : eebf2c00
[3.221918] r3 : 011bb000  r2 : ef09de78  r1 : 60010013  r0 : 0030
[3.228326] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[3.235329] Control: 10c5387d  Table: 6e86406a  DAC: 0015
[3.240964] Process systemd (pid: 1, stack limit = 0xef09c238)
[3.246687] Stack: (0xef09def8 to 0xef09e000)
[3.250956] dee0:
00d0 eef837f8
[3.258993] df00: 180fb270  eeb75023 ee8e000c eed1f8d4
ee8e0130  
[3.267020] df20: eea1fa80 ef0de880   
 0003 
[3.275049] df40: eefe2d70 eed1f880 01ed ff9c ef09c000
0011be80 be9814ac c00ec0bc
[3.283077] df60: eefe2d70 eed1f880 01ed eed1f880 0002
01ed 0027 c00efb74
[3.291106] df80: ef2ebc10 eef837f8 01ed 0011be80 01ed
000cc730 0027 c000e048
[3.299135] dfa0:  c000dea0 0011be80 01ed 0011be80
01ed 0001 
[3.307164] dfc0: 0011be80 01ed 000cc730 0027 000d7202
ffef 0011be80 be9814ac
[3.315192] dfe0: 000cc03c be9813bc 00055280 b6d6769c 60010010
0011be80 fffd 
[3.323225] [] (cgroup_mkdir+0x324/0x5a0) from
[] (vfs_mkdir+0x88/0xc8)
[3.331424] [] (vfs_mkdir+0x88/0xc8) from []
(SyS_mkdirat+0x6c/0xa0)
[3.339369] [] (SyS_mkdirat+0x6c/0xa0) from []
(ret_fast_syscall+0x0/0x30)
[3.347821] Code: e59dc00c e31c0003 1a04 ee1d3f90 (e793200c)
[3.353817] ---[ end trace ea0a2516971df41f ]---

Thanks,
--
Ming Lei
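
The guard proposed in the patch above can be modelled in userspace as follows.
This is a simplified sketch, not cgroup code: struct css here is a stand-in
with illustrative fields, ref_ready models whether the percpu ref was ever
initialized, and css_ref_get and css_pin_parent are hypothetical helpers. Only
the CSS_ROOT flag name is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define CSS_ROOT 0x1   /* flag name taken from the quoted patch */

/* Simplified stand-in for cgroup_subsys_state; fields are illustrative. */
struct css {
    unsigned int flags;
    struct css *parent;
    bool ref_ready;   /* models whether the percpu ref was initialized */
    long refcnt;
};

/* Returns false instead of oopsing when the ref was never initialized. */
static bool css_ref_get(struct css *css)
{
    if (!css->ref_ready)
        return false;
    css->refcnt++;
    return true;
}

/*
 * Pin the parent of a newly created css. In this model the root's ref is
 * never initialized, so it is skipped, mirroring the CSS_ROOT check in
 * the proposed patch.
 */
static bool css_pin_parent(struct css *css)
{
    if (css->parent->flags & CSS_ROOT)
        return true;   /* nothing to pin in this model */
    return css_ref_get(css->parent);
}
```

With the check in place, creating a child directly under the root never touches
the root's counter, while deeper levels still pin their parent as before.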

Re: [ 00/45] 3.10.8-stable review

2013-08-18 Thread Greg Kroah-Hartman

On Sun, Aug 18, 2013 at 06:43:18PM -0700, Guenter Roeck wrote:
> On 08/18/2013 01:35 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.10.8 release.
> > There are 45 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Tue Aug 20 20:36:09 UTC 2013.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.10.8-rc1.gz
> > and the diffstat can be found below.
> >
> 
> Cross build results:
>   Total builds: 76 Total build errors: 0
> Previous release:
>   Total builds: 69 Total build errors: 0
> 
> qemu:
>   mips, ppc, x86, x86_64 pass (boot to login prompt)
>   arm: fail (known problem, see https://lkml.org/lkml/2013/8/11/41)
> 
> More builds, still no failures.

Thanks for testing all 3 of these releases and letting me know all is
well.

greg k-h

Re: [PATCH] MAINTAINERS: add the generic sched_clock under timekeeping

2013-08-18 Thread Baruch Siach

Hi John,

On Wed, Jul 17, 2013 at 12:21:12PM -0700, John Stultz wrote:
> On 07/17/2013 12:12 PM, Baruch Siach wrote:
> >On Wed, Jul 17, 2013 at 11:57:32AM -0700, John Stultz wrote:
> >>On 07/17/2013 03:05 AM, Baruch Siach wrote:
> >>>Signed-off-by: Baruch Siach 
> >>>---
> >>>  MAINTAINERS | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>>diff --git a/MAINTAINERS b/MAINTAINERS
> >>>index bf61e04..bd9616a 100644
> >>>--- a/MAINTAINERS
> >>>+++ b/MAINTAINERS
> >>>@@ -7129,6 +7129,7 @@ M:   Thomas Gleixner 
> >>>  T:   git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> >>> timers/core
> >>>  S:   Supported
> >>>  F:   include/linux/clocksource.h
> >>>+F:include/linux/sched_clock.h
> >>>  F:   include/linux/time.h
> >>>  F:   include/linux/timex.h
> >>>  F:   include/uapi/linux/time.h
> >>>@@ -7136,6 +7137,7 @@ F:   include/uapi/linux/timex.h
> >>>  F:   kernel/time/clocksource.c
> >>>  F:   kernel/time/time*.c
> >>>  F:   kernel/time/ntp.c
> >>>+F:kernel/time/sched_clock.c
> >>It seems like we could probably simplify this bit to kernel/time/*, no?
> >This would add jiffies.c and all the tick-* files that are currently covered
> >by the "HIGH-RESOLUTION TIMERS, CLOCKEVENTS, DYNTICKS" section.
> 
> Ok, fair enough. My thought was it all goes through Thomas anyway,
> but for those sections there are different combinations of folks who
> co-maintain.

So would you take it for v3.12?

baruch

-- 
 http://baruch.siach.name/blog/  ~. .~   Tk Open Systems
=}ooO--U--Ooo{=
   - bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -

Re: [PATCH 4/4] usb: chipidea: USB_CHIPIDEA should depend on HAS_DMA

2013-08-18 Thread Peter Chen

On Sun, Aug 18, 2013 at 10:20:44PM +0200, Geert Uytterhoeven wrote:
> If NO_DMA=y:
> 
> drivers/built-in.o: In function `dma_set_coherent_mask':
> include/linux/dma-mapping.h:93: undefined reference to `dma_supported'
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/usb/chipidea/Kconfig |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/chipidea/Kconfig b/drivers/usb/chipidea/Kconfig
> index d1bd8ef..dbd5232 100644
> --- a/drivers/usb/chipidea/Kconfig
> +++ b/drivers/usb/chipidea/Kconfig
> @@ -1,6 +1,6 @@
>  config USB_CHIPIDEA
>   tristate "ChipIdea Highspeed Dual Role Controller"
> - depends on USB || USB_GADGET
> + depends on (USB || USB_GADGET) && HAS_DMA
>   help
> Say Y here if your system has a dual role high speed USB
> controller based on ChipIdea silicon IP. Currently, only the

I don't understand why this can't be fixed in the DMA code instead
of changing every driver.

-- 

Best Regards,
Peter Chen


Re: Re: [PATCH] Staging: rtl8192e: rtllib_rx: checking NULL value afterdoing dev_alloc_skb

2013-08-18 Thread Greg KH

On Mon, Aug 19, 2013 at 09:15:15AM +0800, rucsoftsec wrote:
> I have read that file. But the trouble is that I was not sure whether it is a
> bug or not. So I report it to BugZilla, and wait for further confirmation.

Don't worry about bugzilla, please send us a patch through email so we
can accept it.  We can't take anything from bugzilla.

thanks,

greg k-h

Re: linux-next: build failure after merge of the libata tree

2013-08-18 Thread Terry Suereth

Per my original patch, "0x3x26" in code should be "0x3726" (as in the
unpatched version).  I refer to "x26" in comments as a forward-looking
assumption, but AFAIK only 3726 and 3826 exist at this time.

terry.suer...@gmail.com


On Sun, Aug 18, 2013 at 6:54 PM, Stephen Rothwell  wrote:
> Hi Tejun,
>
> After merging the libata tree, today's linux-next build (powerpc
> ppc64_defconfig) failed like this:
>
> drivers/ata/libata-pmp.c: In function 'sata_pmp_quirks':
> drivers/ata/libata-pmp.c:386:36: error: invalid suffix "x26" on integer 
> constant
>   if (vendor == 0x1095 && (devid == 0x3x26 || devid == 0x3826)) {
> ^
>
> Caused by commit f1a313ad86b9 ("libata: apply behavioral quirks to sil3826 
> PMP").
>
> I have used the libata tree from next-20130816 for today.
> --
> Cheers,
> Stephen Rothwells...@canb.auug.org.au
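
For reference, the corrected condition described at the top of this message can
be written as a small standalone check. The helper name is_sil3x26_pmp is
hypothetical; the vendor and device IDs are the ones named in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true for the two port multipliers named above. */
static bool is_sil3x26_pmp(uint16_t vendor, uint16_t devid)
{
    return vendor == 0x1095 && (devid == 0x3726 || devid == 0x3826);
}
```

The build failure came from the literal 0x3x26, which is not a valid integer
constant; spelling out both device IDs avoids it.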

Re: [PATCH v6 0/5] zram/zsmalloc promotion

2013-08-18 Thread Minchan Kim

Hello Mel,

On Fri, Aug 16, 2013 at 09:33:47AM +0100, Mel Gorman wrote:
> On Fri, Aug 16, 2013 at 01:26:41PM +0900, Minchan Kim wrote:
> > > > > 
> > > > > If it's used for something like tmpfs then it becomes much worse. 
> > > > > Normal
> > > > > tmpfs without swap can lockup if tmpfs is allowed to fill memory. In a
> > > > > sane configuration, lockups will be avoided and deleting a tmpfs file 
> > > > > is
> > > > > guaranteed to free memory. When zram is used to back tmpfs, there is 
> > > > > no
> > > > > guarantee that any memory is freed due to fragmentation of the 
> > > > > compressed
> > > > > pages. The only way to recover the memory may be to kill applications
> > > > > holding tmpfs files open and then delete them which is fairly drastic
> > > > > action in a normal server environment.
> > > > 
> > > > Indeed.
> > > > Actually, I had a plan to support zsmalloc compaction. The zsmalloc 
> > > > exposes
> > > > handle instead of pure pointer so it could migrate some zpages to 
> > > > somewhere
> > > > to pack in. Then, it could help above problem and OOM storm problem.
> > > > Anyway, it's a totally new feature and requires many changes and 
> > > > > experiments.
> > > > Although we don't have such feature, zram is still good for many people.
> > > > 
> > > 
> > > And if zsmalloc was pluggable for zswap then it would also benefit.
> > 
> > But zswap isn't a pseudo block device, so it couldn't be used as a block device.
> 
> It would not be impossible to write one. Taking a quick look it might even
> be doable by just providing a zbud_ops that does not have an evict handler
> and make sure the errors are handled correctly. i.e. does the following
> patch mean that zswap never writes back and instead just compresses pages
> in memory?
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index deda2b6..99e41c8 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -819,7 +819,6 @@ static void zswap_frontswap_invalidate_area(unsigned type)
>  }
>  
>  static struct zbud_ops zswap_zbud_ops = {
> - .evict = zswap_writeback_entry
>  };
>  
>  static void zswap_frontswap_init(unsigned type)
> 
> If so, it should be doable to link that up in a sane way so it can be
> configured at runtime.
> 
> Did you ever even try something like this?

Never. Because I didn't have such requirement for zram.
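
For reference, the "ops with no evict handler" idea can be pictured with a
small userspace sketch. Everything here (struct pool, struct pool_ops,
pool_store(), pool_reclaim(), always_evict()) is a hypothetical stand-in and
not the zbud or zswap code: storing never needs the hook, while reclaim
reports an error when the hook is missing, which is the error path that
would have to be handled correctly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for an allocator pool and its ops. */
struct pool;

struct pool_ops {
	/* Optional hook: write one entry back to the backing store. */
	int (*evict)(struct pool *pool, unsigned long handle);
};

struct pool {
	const struct pool_ops *ops;
	unsigned long stored;	/* number of compressed entries held */
};

/* Storing never needs the evict hook; it only consumes memory. */
int pool_store(struct pool *pool)
{
	pool->stored++;
	return 0;
}

/*
 * Reclaim needs the evict hook. Without one, return an error so that the
 * caller can react, for example by rejecting further stores.
 */
int pool_reclaim(struct pool *pool, unsigned long handle)
{
	int ret;

	if (!pool->ops || !pool->ops->evict)
		return -EINVAL;

	ret = pool->ops->evict(pool, handle);
	if (ret == 0)
		pool->stored--;
	return ret;
}

/* Example evict hook that always succeeds. */
int always_evict(struct pool *pool, unsigned long handle)
{
	(void)pool;
	(void)handle;
	return 0;
}
```

With an ops structure whose evict member is NULL, pool_reclaim() fails with
-EINVAL and the entry stays compressed in memory; with always_evict installed
the same call succeeds and the entry count drops.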

> 
> > Let me give one use case for zram-blk.
> > 
> > 1) Many embedded systems don't have swap, so although tmpfs can support
> > swapout it's still pointless. Such systems should have a sane configuration
> > to limit memory space, so it's not only a zram problem.
> > 
> 
> If zswap was backed by a pseudo device that failed all writes or an
> ops with no evict handler then it would be functionally similar.
> 
> > 2) Many embedded systems don't have enough memory. Let's assume a short-lived
> > file that grows to half of system memory once in a while. We don't want
> > to write it to flash because of wear-leveling issues and slowness, so we want
> > to keep it in memory. But if we use tmpfs, it has to evict half of the working
> > set to cover it when the size reaches its peak. zram would be a better choice.
> > 
> 
> Then back it by a pseudo device that fails all writes so it does not have
> to write to disk.

You mean "make a pseudo block device, register make_request_fn,
and prevent writeback". Bah, yes, it's doable, but how is it different from
the below?

1) move zbud into zram
2) implement frontswap API in zram
3) implement writeback in zram

zram has been in staging for a long time waiting to be promoted, and it has
been maintained and deployed. Of course, I have asked for the promotion several
times for over a year.

Why can't zram include the zswap functions if you really want to merge them?
Is there any problem?

> 
> > > 
> > > > > These are the sort of reason why I feel that zram has limited cases where
> > > > > it is safe to use and zswap has a wider range of applications. At least
> > > > > I would be very unhappy to try supporting zram in the field for normal
> > > > > servers. zswap should be able to replace the functionality of zram+swap
> > > > > by backing zswap with a pseudo block device that rejects all writes. I
> > > > 
> > > > One difference between zswap and zram is asynchronous I/O support.
> > > 
> > > As zram is not writing to disk, how compelling is asynchronous IO? If
> > > zswap was backed by the pseudo device is there a measurable bottleneck?
> > 
> > Compression. It was really the bottleneck. I had an internal patch which
> > can make zram use various compressors, not only LZO.
> > The better the compressor was, the more of a bottleneck compression became.
> > 
> 
> There are two issues there. One that different compression algorithms
> should be optional with tradeoffs on speed vs compression ratio. There is
> no reason why that couldn't be hacked into zswap.
> 
> The second is that only one page can be compressed at a time. That would
> require further work to allow the frontswap API to

[RFC PATCH] pwm: atmel-pwm: add pwm controller driver

2013-08-18 Thread Bo Shen

add atmel pwm controller driver based on PWM framework

this is basic function implementation of pwm controller
it can work with pwm based led and backlight

Signed-off-by: Bo Shen 

---
This patch is based on Linux v3.11 rc6
Tested on sama5d31ek and at91sam9m10g45ek board
---
 .../devicetree/bindings/pwm/atmel-pwm.txt  |   19 ++
 drivers/pwm/Kconfig                        |    9 +
 drivers/pwm/Makefile                       |    1 +
 drivers/pwm/pwm-atmel.c                    |  327 
 4 files changed, 356 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/atmel-pwm.txt
 create mode 100644 drivers/pwm/pwm-atmel.c

diff --git a/Documentation/devicetree/bindings/pwm/atmel-pwm.txt b/Documentation/devicetree/bindings/pwm/atmel-pwm.txt
new file mode 100644
index 000..127fcdb
--- /dev/null
+++ b/Documentation/devicetree/bindings/pwm/atmel-pwm.txt
@@ -0,0 +1,19 @@
+Atmel PWM controller
+
+Required properties:
+  - compatible: should be one of:
+- "atmel,at91sam9rl-pwm"
+- "atmel,sama5-pwm"
+  - reg: physical base address and length of the controller's registers
+  - #pwm-cells: Should be 3.
+- The first cell specifies the per-chip index of the PWM to use
+- The second cell is the period in nanoseconds
+- The third cell is used to encode the polarity of PWM output
+
+Example:
+
+   pwm0: pwm@f8034000 {
+   compatible = "atmel,at91sam9rl-pwm";
+   reg = <0xf8034000 0x400>;
+   #pwm-cells = <3>;
+   };
diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index 75840b5..54237b9 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -41,6 +41,15 @@ config PWM_AB8500
  To compile this driver as a module, choose M here: the module
  will be called pwm-ab8500.
 
+config PWM_ATMEL
+   tristate "Atmel PWM support"
+   depends on ARCH_AT91
+   help
+ Generic PWM framework driver for Atmel SoC.
+
+ To compile this driver as a module, choose M here: the module
+ will be called pwm-atmel.
+
 config PWM_ATMEL_TCB
tristate "Atmel TC Block PWM support"
depends on ATMEL_TCLIB && OF
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index 77a8c18..5b193f8 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -1,6 +1,7 @@
 obj-$(CONFIG_PWM)              += core.o
 obj-$(CONFIG_PWM_SYSFS)        += sysfs.o
 obj-$(CONFIG_PWM_AB8500)       += pwm-ab8500.o
+obj-$(CONFIG_PWM_ATMEL)        += pwm-atmel.o
 obj-$(CONFIG_PWM_ATMEL_TCB)    += pwm-atmel-tcb.o
 obj-$(CONFIG_PWM_BFIN)         += pwm-bfin.o
 obj-$(CONFIG_PWM_IMX)          += pwm-imx.o
diff --git a/drivers/pwm/pwm-atmel.c b/drivers/pwm/pwm-atmel.c
new file mode 100644
index 000..b83d68e
--- /dev/null
+++ b/drivers/pwm/pwm-atmel.c
@@ -0,0 +1,327 @@
+/*
+ * Driver for Atmel Pulse Width Modulation Controller
+ *
+ * Copyright (C) 2013 Atmel Semiconductor Technology Ltd.
+ *  Bo Shen 
+ *
+ * GPL v2 or later
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PWM_MR         0x00
+#define PWM_ENA        0x04
+#define PWM_DIS        0x08
+#define PWM_SR         0x0C
+
+#define PWM_CMR        0x00
+
+/* The following registers are for PWM v1 */
+#define PWMv1_CDTY     0x04
+#define PWMv1_CPRD     0x08
+#define PWMv1_CUPD     0x10
+
+/* The following registers are for PWM v2 */
+#define PWMv2_CDTY     0x04
+#define PWMv2_CDTYUPD  0x08
+#define PWMv2_CPRD     0x0C
+#define PWMv2_CPRDUPD  0x10
+
+#define PWM_NUM        4
+
+struct atmel_pwm_chip {
+   struct pwm_chip chip;
+   struct clk *clk;
+   void __iomem *base;
+
+   void (*config)(struct atmel_pwm_chip *chip, struct pwm_device *pwm,
+   unsigned int dty, unsigned int prd);
+};
+
+#define to_atmel_pwm_chip(chip) container_of(chip, struct atmel_pwm_chip, chip)
+
+static inline u32 atmel_pwm_readl(struct atmel_pwm_chip *chip, int offset)
+{
+   return readl(chip->base + offset);
+}
+
+static inline void atmel_pwm_writel(struct atmel_pwm_chip *chip, int offset,
+   u32 val)
+{
+   writel(val, chip->base + offset);
+}
+
+static inline u32 atmel_pwm_ch_readl(struct atmel_pwm_chip *chip, int ch,
+   int offset)
+{
+   return readl(chip->base + 0x200 + ch * 0x20 + offset);
+}
+
+static inline void atmel_pwm_ch_writel(struct atmel_pwm_chip *chip, int ch,
+   int offset, u32 val)
+{
+   writel(val, chip->base + 0x200 + ch * 0x20 + offset);
+}
+
+static int atmel_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+   int duty_ns, int period_ns)
+{
+   struct atmel_pwm_chip *atmel_pwm = to_atmel_pwm_chip(chip);
+   unsigned long long val, prd, dty;
+   unsigned long long div, clk_rate;
+   int ret, pres = 0;
+
+   clk_rate = clk_get_rate(atmel_pwm->clk);
+
+

Re: [GIT PULL] Generic sched_clock fix for 3.11

2013-08-18 Thread Baruch Siach

Hi Ingo,

On Mon, Aug 12, 2013 at 06:13:11PM +0200, Ingo Molnar wrote:
> 
> * John Stultz  wrote:
> 
> > Hey Thomas,
> > Just one small fix against tip/timers/urgent for 3.11.
> > 
> > thanks
> > -john
> > 
> > 
> > The following changes since commit b0ec636c93ddd77235bf0f023a8a95d78cb6cafe:
> > 
> >   Merge branch 'timers/clockevents' of
> > git://git.linaro.org/people/dlezcano/clockevents into timers/urgent
> > (2013-07-12 17:10:30 +0200)
> > 
> > are available in the git repository at:
> > 
> > 
> >   git://git.linaro.org/people/jstultz/linux.git fortglx/3.11/time
> > 
> > for you to fetch changes up to 53c035204253efe373d9ff166fae6147e8c693b6:
> > 
> >   sched_clock: Fix integer overflow (2013-07-22 16:24:22 -0700)
> > 
> > 
> > Baruch Siach (1):
> >   sched_clock: Fix integer overflow
> > 
> >  kernel/time/sched_clock.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Pulled, thanks John!

Do you intend to push it for -rc7?

baruch

-- 
 http://baruch.siach.name/blog/  ~. .~   Tk Open Systems
=}ooO--U--Ooo{=
   - bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -

[PATCH resend] kernel: fix new kernel-doc warning in wait.c

2013-08-18 Thread Randy Dunlap

From: Randy Dunlap 

Fix new kernel-doc warnings in kernel/wait.c:

Warning(kernel/wait.c:374): No description found for parameter 'p'
Warning(kernel/wait.c:374): Excess function parameter 'word' description in 'wake_up_atomic_t'
Warning(kernel/wait.c:374): Excess function parameter 'bit' description in 'wake_up_atomic_t'

Signed-off-by: Randy Dunlap 
Cc: David Howells 
---
 kernel/wait.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- lnx-311-rc6.orig/kernel/wait.c
+++ lnx-311-rc6/kernel/wait.c
@@ -363,8 +363,7 @@ EXPORT_SYMBOL(out_of_line_wait_on_atomic
 
 /**
  * wake_up_atomic_t - Wake up a waiter on a atomic_t
- * @word: The word being waited on, a kernel virtual address
- * @bit: The bit of the word being waited on
+ * @p: The atomic_t being waited on, a kernel virtual address
  *
  * Wake up anyone waiting for the atomic_t to go to zero.
  *

Re: [PATCH 2/2] drivers: block :swim3: fixed the errors on coding style

2013-08-18 Thread Joe Perches

On Mon, 2013-08-19 at 01:09 +0530, Thiagarajan Thangavel wrote:
> Fixed the coding style errors
[]
> diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
[]
> @@ -783,7 +783,7 @@ static irqreturn_t swim3_interrupt(int irq, void *dev_id)
>   act(fs);
>   } else {
>   swim3_err("Error %sing block %ld (err=%x)\n",
> -			rq_data_dir(req) == WRITE ? "writ" : "read",
> + rq_data_dir(req) == WRITE ? "writ" : "read",

This looks worse to me.

My preference would be to align the arguments to the open
parenthesis and to use full words instead:

		swim3_err("Error %s block %ld (err=%x)\n",
			  rq_data_dir(req) == WRITE ? "writing" : "reading",
			  (long)blk_rq_pos(req), err);

> @@ -894,7 +894,18 @@ static int fd_eject(struct floppy_state *fs)
[]
> -static struct floppy_struct floppy_type = { 2880, 18, 2, 80, 0, 0x1B, 0x00, 0xCF, 0x6C, NULL };  /*  7 1.44MB 3.5"   */
> +static struct floppy_struct floppy_type = {
> + 2880,
> + 18,
> + 2,
> + 80,
> + 0,
> + 0x1B,
> + 0x00,
> + 0xCF,
> + 0x6C,
> + NULL
> +};   /*  7 1.44MB 3.5"   */

These changes are unattractive to me.
I don't find much wrong with the original though
I would probably have written it as:

static struct floppy_struct floppy_type = {	/* 7 1.44MB 3.5" */
	2880, 18, 2, 80, 0, 0x1B, 0x00, 0xCF, 0x6C, NULL
};



Re: [PATCH tip/core/rcu 6/9] nohz_full: Add full-system idle states and variables

2013-08-18 Thread Josh Triplett

On Sun, Aug 18, 2013 at 06:39:25PM -0700, Paul E. McKenney wrote:
> On Sat, Aug 17, 2013 at 08:09:21PM -0700, Josh Triplett wrote:
> > On Sat, Aug 17, 2013 at 06:49:41PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" 
> > > 
> > > This commit adds control variables and states for full-system idle.
> > > The system will progress through the states in numerical order when
> > > the system is fully idle (other than the timekeeping CPU), and reset
> > > down to the initial state if any non-timekeeping CPU goes non-idle.
> > > The current state is kept in full_sysidle_state.
> > > 
> > > A RCU_SYSIDLE_SMALL macro is defined, and systems with this number
> > > of CPUs or fewer move through the states more aggressively.  The idea
> > > is that the resulting memory contention is less of a problem on small
> > > systems.  Architectures can adjust this value (which defaults to 8)
> > > using CONFIG_ARCH_RCU_SYSIDLE_SMALL.
> > > 
> > > One flavor of RCU will be in charge of driving the state machine,
> > > defined by rcu_sysidle_state.  This should be the busiest flavor of RCU.
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > > Cc: Frederic Weisbecker 
> > > Cc: Steven Rostedt 
> > 
> > One issue (and one question) below; with the issue addressed,
> > Reviewed-by: Josh Triplett 
> > 
> > >  kernel/rcutree_plugin.h | 28 
> > >  1 file changed, 28 insertions(+)
> > > 
> > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > > index eab81da..64a05b9f 100644
> > > --- a/kernel/rcutree_plugin.h
> > > +++ b/kernel/rcutree_plugin.h
> > > @@ -2378,6 +2378,34 @@ static void rcu_kick_nohz_cpu(int cpu)
> > >  #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> > >  
> > >  /*
> > > + * Handle small systems specially, accelerating their transition into
> > > + * full idle state.  Allow arches to override this code's idea of
> > > + * what constitutes a "small" system.
> > > + */
> > > +#ifdef CONFIG_ARCH_RCU_SYSIDLE_SMALL
> > 
> > I don't see any Kconfig creating this new config option.
> > 
> > Also, why not simply define this config option unconditionally, with a
> > default of 8, and then use its value directly?
> 
> Good point, removing this and adding a Kconfig option in the
> "nohz_full: Add full-system-idle state machine" commit, with a
> default value of 8.  Architecture maintainers who want something
> different can then set that up in their defconfig files.

Sounds good.

> > > +static int __maybe_unused full_sysidle_state; /* Current system-idle state. */
> > > +#define RCU_SYSIDLE_NOT         0   /* Some CPU is not idle. */
> > > +#define RCU_SYSIDLE_SHORT       1   /* All CPUs idle for brief period. */
> > > +#define RCU_SYSIDLE_LONG        2   /* All CPUs idle for long enough. */
> > > +#define RCU_SYSIDLE_FULL        3   /* All CPUs idle, ready for sysidle. */
> > > +#define RCU_SYSIDLE_FULL_NOTED  4   /* Actually entered sysidle state. */
> > 
> > Perhaps there's a kernel style rule I'm not thinking of that makes it
> > verboten, but: why not use an enum for a state variable like this?
> 
> I didn't trust enum interactions with xchg and cmpxchg, so opted for "int"
> instead.  That said, enum is much more portable than when I last looked
> at it.  Admittedly, the last time I looked at it was in the early 1980s...

That would make sense if this was an atomic_t, but it's an int; unless I
missed something, you don't currently use xchg or cmpxchg on it.

- Josh Triplett
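
As an illustration of the enum suggestion, here is a minimal userspace
sketch. The names (enum sysidle_state, the SYSIDLE_* values, sysidle_step())
are hypothetical and only mirror the RCU_SYSIDLE_* constants quoted above;
the real patch may drive the transitions differently. It shows the behaviour
described in the commit message: advance through the states in numerical
order while the system stays idle, and reset to the initial state otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum version of the RCU_SYSIDLE_* constants quoted above. */
enum sysidle_state {
	SYSIDLE_NOT,		/* Some CPU is not idle. */
	SYSIDLE_SHORT,		/* All CPUs idle for brief period. */
	SYSIDLE_LONG,		/* All CPUs idle for long enough. */
	SYSIDLE_FULL,		/* All CPUs idle, ready for sysidle. */
	SYSIDLE_FULL_NOTED,	/* Actually entered sysidle state. */
};

/*
 * Advance one step while every non-timekeeping CPU is idle; drop back to
 * the initial state as soon as one of them is not.
 */
enum sysidle_state sysidle_step(enum sysidle_state cur, bool all_idle)
{
	if (!all_idle)
		return SYSIDLE_NOT;
	if (cur < SYSIDLE_FULL_NOTED)
		return (enum sysidle_state)(cur + 1);
	return cur;	/* already in the final state */
}
```

Because the enumerators are assigned in declaration order starting from 0,
they carry the same numeric values as the #define constants, so ordered
comparisons keep working.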

linux-next: manual merge of the wireless-next tree with the wireless tree

2013-08-18 Thread Stephen Rothwell

Hi John,

Today's linux-next merge of the wireless-next tree got a conflict in 
drivers/net/wireless/iwlwifi/pcie/trans.c between commit eabc4ac5d760 
("iwlwifi: pcie: disable L1 Active after pci_enable_device") from the wireless 
tree and commit 6965a3540a4b ("iwlwifi: pcie: don't swallow error codes in 
iwl_trans_pcie_alloc()") from the wireless-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/net/wireless/iwlwifi/pcie/trans.c
index 8c6c405,bad95d2..000
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@@ -1400,11 -1401,6 +1401,10 @@@ struct iwl_trans *iwl_trans_pcie_alloc(
  	spin_lock_init(&trans_pcie->reg_lock);
  	init_waitqueue_head(&trans_pcie->ucode_write_waitq);
  
-   if (pci_enable_device(pdev)) {
-   err = -ENODEV;
++  err = pci_enable_device(pdev);
++  if (err)
 +  goto out_no_pci;
-   }
 +
if (!cfg->base_params->pcie_l1_allowed) {
/*
 * W/A - seems to solve weird behavior. We need to remove this



Re: [PATCH] usb: phy: Cleanup error code in _usb_get_phy_() APIs

2013-08-18 Thread Vivek Gautam

Hi,


On Thu, Aug 8, 2013 at 12:05 AM, Julius Werner  wrote:
>> @@ -94,11 +94,11 @@ static int devm_usb_phy_match(struct device *dev, void *res, void *match_data)
>>   */
>>  struct usb_phy *devm_usb_get_phy(struct device *dev, enum usb_phy_type type)
>>  {
>> -   struct usb_phy **ptr, *phy;
>> +   struct usb_phy  *phy = ERR_PTR(-ENOMEM), **ptr;
>
> This looks a little roundabout, don't you think? Why don't you just
> directly have 'return ERR_PTR(-ENOMEM)' down there where you put 'goto
> err0'?

Ok, will change this.

>
>>
>> ptr = devres_alloc(devm_usb_phy_release, sizeof(*ptr), GFP_KERNEL);
>> if (!ptr)
>> -   return NULL;
>> +   goto err0;
>>
>> phy = usb_get_phy(type);
>> if (!IS_ERR(phy)) {
>> @@ -107,6 +107,7 @@ struct usb_phy *devm_usb_get_phy(struct device *dev, enum usb_phy_type type)
>> } else
>> devres_free(ptr);
>>
>> +err0:
>> return phy;
>>  }
>>  EXPORT_SYMBOL_GPL(devm_usb_get_phy);
>
>>  struct usb_phy *devm_usb_get_phy_dev(struct device *dev, u8 index)
>>  {
>> -   struct usb_phy **ptr, *phy;
>> +   struct usb_phy  *phy = ERR_PTR(-ENOMEM), **ptr;
>
> Same here

will change this too.

>
>>
>> ptr = devres_alloc(devm_usb_phy_release, sizeof(*ptr), GFP_KERNEL);
>> if (!ptr)
>> -   return NULL;
>> +   goto err0;
>>
>> phy = usb_get_phy_dev(dev, index);
>> if (!IS_ERR(phy)) {
>> @@ -267,6 +268,7 @@ struct usb_phy *devm_usb_get_phy_dev(struct device *dev, u8 index)
>> } else
>> devres_free(ptr);
>>
>> +err0:
>> return phy;
>>  }
>>  EXPORT_SYMBOL_GPL(devm_usb_get_phy_dev);
>
>> @@ -142,7 +142,7 @@ extern void usb_remove_phy(struct usb_phy *);
>>  /* helpers for direct access thru low-level io interface */
>>  static inline int usb_phy_io_read(struct usb_phy *x, u32 reg)
>>  {
>> -   if (x->io_ops && x->io_ops->read)
>> +   if (!IS_ERR(x) && x->io_ops && x->io_ops->read)
>
> I liked the ones where we had IS_ERR_OR_NULL() here (and in all the
> ones below)... you sometimes have to handle PHYs in
> platform-independent code where you don't want to worry about if this
> platform actually has a PHY driver there or not. Any reason you
> changed that?

The **get_phy_*() APIs never return a NULL pointer now, so do we still
need to handle NULL in that case?
Or are we assuming that code will use these phy operations without
getting a phy in the first place?
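
To make the difference between the two guards concrete, here is a simplified
userspace re-implementation of the error-pointer helpers. It mirrors the
idea behind the kernel's ERR_PTR()/IS_ERR()/IS_ERR_OR_NULL() but is only a
sketch for illustration; struct phy and phy_init() are hypothetical stand-ins
for struct usb_phy and usb_phy_init().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the top MAX_ERRNO addresses; NULL is not one. */
bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical stand-in for struct usb_phy. */
struct phy {
	int (*init)(struct phy *phy);
};

/*
 * Safe for a valid phy, an error pointer and NULL. With only !IS_ERR(phy)
 * as the guard, a NULL phy would be dereferenced.
 */
int phy_init(struct phy *phy)
{
	if (!IS_ERR_OR_NULL(phy) && phy->init)
		return phy->init(phy);
	return 0;
}
```

The point of the sketch is that IS_ERR(NULL) is false, so a caller that never
obtained a phy and passes NULL is only covered by the IS_ERR_OR_NULL() form.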

>
>> return x->io_ops->read(x, reg);
>>
>> return -EINVAL;
>> @@ -150,7 +150,7 @@ static inline int usb_phy_io_read(struct usb_phy *x, u32 reg)
>>
>>  static inline int usb_phy_io_write(struct usb_phy *x, u32 val, u32 reg)
>>  {
>> -   if (x->io_ops && x->io_ops->write)
>> +   if (!IS_ERR(x) && x->io_ops && x->io_ops->write)
>> return x->io_ops->write(x, val, reg);
>>
>> return -EINVAL;
>> @@ -159,7 +159,7 @@ static inline int usb_phy_io_write(struct usb_phy *x, u32 val, u32 reg)
>>  static inline int
>>  usb_phy_init(struct usb_phy *x)
>>  {
>> -   if (x->init)
>> +   if (!IS_ERR(x) && x->init)
>> return x->init(x);
>>
>> return 0;
>> @@ -168,26 +168,27 @@ usb_phy_init(struct usb_phy *x)
>>  static inline void
>>  usb_phy_shutdown(struct usb_phy *x)
>>  {
>> -   if (x->shutdown)
>> +   if (!IS_ERR(x) && x->shutdown)
>> x->shutdown(x);
>>  }
>>
>>  static inline int
>>  usb_phy_vbus_on(struct usb_phy *x)
>>  {
>> -   if (!x->set_vbus)
>> -   return 0;
>> +   if (!IS_ERR(x) && x->set_vbus)
>> +   return x->set_vbus(x, true);
>>
>> -   return x->set_vbus(x, true);
>> +   return 0;
>>  }
>>
>>  static inline int
>>  usb_phy_vbus_off(struct usb_phy *x)
>>  {
>> -   if (!x->set_vbus)
>> -   return 0;
>> +   if (!IS_ERR(x) && x->set_vbus)
>> +   return x->set_vbus(x, false);
>> +
>> +   return 0;
>>
>> -   return x->set_vbus(x, false);
>>  }
>>
>>  /* for usb host and peripheral controller drivers */
>> @@ -249,8 +250,9 @@ static inline int usb_bind_phy(const char *dev_name, u8 index,
>>  static inline int
>>  usb_phy_set_power(struct usb_phy *x, unsigned mA)
>>  {
>> -   if (x && x->set_power)
>> +   if (!IS_ERR(x) && x->set_power)
>> return x->set_power(x, mA);
>> +
>> return 0;
>>  }
>>
>> @@ -258,28 +260,28 @@ usb_phy_set_power(struct usb_phy *x, unsigned mA)
>>  static inline int
>>  usb_phy_set_suspend(struct usb_phy *x, int suspend)
>>  {
>> -   if (x->set_suspend != NULL)
>> +   if (!IS_ERR(x) && x->set_suspend != NULL)
>> return x->set_suspend(x, suspend);
>> -   else
>> -   return 0;
>> +
>> +   return 0;
>>  }
>>
>>  static inline int
>>  usb_phy_notify_connect(struct usb_phy *x, enum usb_device_speed speed)
>>  {
>> -   if (x->notify_connect)
>> +   if

Re: [PATCH] arm64: wire in generic parport.h

2013-08-18 Thread Mark Salter

On Sun, 2013-08-18 at 22:25 +0200, Geert Uytterhoeven wrote:
> On Sun, Aug 18, 2013 at 6:01 PM, Mark Salter  wrote:
> > The arm64 port doesn't provide a parport.h which causes a build failure
> > with some configurations:
> >
> >   drivers/parport/parport_pc.c:67:25: fatal error: asm/parport.h: No such file or directory
> >    #include <asm/parport.h>
> >
> > This patch wires in the generic parport.h for arm64.
> 
> Can arm64 have a PC-style parport?

Good question. I'm not sure, but really doubt it.

> 
> If not, you're better off disabling it in drivers/parport/Kconfig.
> 
> You will receive bonus points for introducing ARCH_MAY_HAVE_PC_PARPORT,
> cfr. ARCH_MAY_HAVE_PC_FDC.
> 

Yes, good point. I'll work up a new patch. I can use some bonus points.

--Mark



Re: [PATCH] xen: initialize xen panic handler for PVHVM

2013-08-18 Thread vaughan

On 08/16/2013 08:43 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 16, 2013 at 04:10:56PM +0800, Vaughan Cao wrote:
>> kernel use callback linked in panic_notifier_list to notice others when panic
>> happens.
>> NORET_TYPE void panic(const char * fmt, ...){
>> ...
>> atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>> }
>> When xen aware this, it will call xen_reboot(SHUTDOWN_crash) to send out an
>   ^^^-> "When Xen becomes aware of this"
>
>> event with reason code - SHUTDOWN_crash.
>> xen_panic_handler_init() is defined to register on panic_notifier_list but
>> we only call it in xen_arch_setup which only be called by pvm, this patch is
>   ^^^-> "is only"
>>  necessary for pvhvm.
> Could you tell me what has been happening without this patch?
Setting 'on_crash=coredump-restart' in the PVHVM guest config file does not
cause a vmcore to be generated when the guest panics. It can be reproduced
with 'echo c > /proc/sysrq-trigger'.
From the xend.log, we find the reason code obtained by dominfo is not as
expected:
  [2013-07-29 08:55:37 19378] INFO (XendDomainInfo:2148) Domain has shutdown: name=oakDom1 id=15 reason=reboot.
The log from a guest that can capture the crash is as below:
  [2013-07-29 08:13:42 19378] WARNING (XendDomainInfo:2131) Domain has crashed: name=oakDom1 id=14.

Thanks,
Vaughan
>
> Thank you.
>
>> Signed-off-by: Vaughan Cao 
>> ---
>>  arch/x86/xen/enlighten.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index 4aec5ed..53e5726 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -1713,6 +1713,8 @@ static void __init xen_hvm_guest_init(void)
>>  
>>  xen_hvm_init_shared_info();
>>  
>> +xen_panic_handler_init();
>> +
>>  if (xen_feature(XENFEAT_hvm_callback_vector))
>>  xen_have_vector_callback = 1;
>>  xen_hvm_smp_init();
>> -- 
>> 1.7.11.7
>>
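
The mechanism described above can be pictured with a small userspace sketch
of a notifier chain. All names here (struct notifier_chain, chain_register(),
chain_call(), report_crash(), reported_reason) are hypothetical and do not
correspond to the kernel's notifier API; the default of REASON_REBOOT is an
assumption chosen to match the xend.log excerpt. If nobody registers a
handler, the panic path has nothing to call and the crash reason is never
reported.

```c
#include <assert.h>
#include <stddef.h>

#define CHAIN_MAX 4

typedef int (*notifier_fn)(unsigned long event, void *data);

/* Hypothetical fixed-size callback chain. */
struct notifier_chain {
	notifier_fn fn[CHAIN_MAX];
	size_t count;
};

/* Append a callback; returns -1 when the chain is full. */
int chain_register(struct notifier_chain *chain, notifier_fn fn)
{
	if (chain->count == CHAIN_MAX)
		return -1;
	chain->fn[chain->count++] = fn;
	return 0;
}

/* Invoke every registered callback; returns how many were called. */
size_t chain_call(struct notifier_chain *chain, unsigned long event,
		  void *data)
{
	size_t i;

	for (i = 0; i < chain->count; i++)
		chain->fn[i](event, data);
	return chain->count;
}

enum shutdown_reason { REASON_REBOOT, REASON_CRASH };

/* What the toolstack would see; REASON_REBOOT is the assumed default. */
enum shutdown_reason reported_reason = REASON_REBOOT;

/* Stand-in for a panic handler that reports a crash. */
int report_crash(unsigned long event, void *data)
{
	(void)event;
	(void)data;
	reported_reason = REASON_CRASH;
	return 0;
}
```

Calling chain_call() on an empty chain leaves reported_reason at
REASON_REBOOT; after chain_register(&chain, report_crash) the same call sets
it to REASON_CRASH.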


Re: [Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue after unsetting BSP flag

2013-08-18 Thread HATAYAMA Daisuke


(2013/08/15 4:45), Eric W. Biederman wrote:
> Jingbai Ma  writes:
>
>> I found a side effect of unsetting BSP flag.
>> It affected system rebooting, once the BSP flags been removed, and issue
>> reboot command, system will hang after message:
>> Restarting system.
>> And have to do a hardware reset to recover it.
>>
>> I have reproduced this problem on the following systems:
>> HP EliteBook 6930p
>> HP Compaq DC7700
>> HP ProLiant DL980 (4 sockets, 40 cores)
>>
>> I have an idea: To avoid such kind of issue, we can unset BSP flag in
>> the first kernel during crash processing, and restore it in the second
>> kernel in the APs initializing.
>
> The premise was clearing BSP would not be an issue.  If we could
> reliably count on unsetting the BSP during crash processing we could
> just switch to the BSP and be done totally avoid this problem.
>
> Given that there are real world issues with clearing the BSP flag,
> I believe the alternate suggestion was to simply never attempt to start
> the bootstrap processor during processor bring up.
>
> If as normal we are running on the bootstrap processor everything will
> work the same, but if we are in the kdump scenario we will be short one
> core.  Being short one core seems like a reasonable tradeoff between
> reliability and performance.
>
> Eric


Sorry Eric, I'm not clear on what you mean by ``short one core''...
Which are you suggesting? That disabling the BSP if the crash happens on an AP
is reasonable? Or that restricting cpus to a single one, just as in the current
kdump configuration, is reasonable?

--
Thanks.
HATAYAMA, Daisuke


Re: [BUG REPORT] ZSWAP: theoretical race condition issues

2013-08-18 Thread Bob Liu

Hi Weijie,

On 08/19/2013 12:14 AM, Weijie Yang wrote:
> I found a few bugs in zswap when I review Linux-3.11-rc5, and I have
> also some questions about it, described as following:
> 
> BUG:
> 1. A race condition when reclaim a page
> when a handle alloced from zbud, zbud considers this handle is used
> validly by upper(zswap) and can be a candidate for reclaim.
> But zswap has to initialize it such as setting swapentry and addding
> it to rbtree. so there is a race condition, such as:
> thread 0: obtain handle x from zbud_alloc
> thread 1: zbud_reclaim_page is called
> thread 1: callback zswap_writeback_entry to reclaim handle x
> thread 1: get swpentry from handle x (it is random value now)
> thread 1: bad thing may happen
> thread 0: initialize handle x with swapentry

Yes, this may potentially happen, but only in rare cases.
We have an LRU list for page frames: after thread 0 calls zbud_alloc, the
corresponding page is added to the head of the LRU list, while
zbud_reclaim_page (called by thread 1) starts from the tail of the LRU list.
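
A toy model of that ordering, using a hypothetical array-backed LRU rather
than the zbud implementation (struct lru, lru_add_head() and
lru_pick_victim() are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

#define LRU_MAX 8

/* Hypothetical array-backed LRU: index 0 is the head (newest entry). */
struct lru {
	unsigned long handle[LRU_MAX];
	size_t count;
};

/* Allocation path: a freshly allocated handle goes to the head. */
int lru_add_head(struct lru *lru, unsigned long handle)
{
	size_t i;

	if (lru->count == LRU_MAX)
		return -1;
	for (i = lru->count; i > 0; i--)
		lru->handle[i] = lru->handle[i - 1];
	lru->handle[0] = handle;
	lru->count++;
	return 0;
}

/* Reclaim path: the victim is taken from the tail (oldest entry). */
int lru_pick_victim(const struct lru *lru, unsigned long *handle)
{
	if (lru->count == 0)
		return -1;
	*handle = lru->handle[lru->count - 1];
	return 0;
}
```

With several entries on the list the newest handle is the last one reclaim
would pick, which is why the window is hard to hit. With a single entry the
newest handle is also the tail, so the race remains possible in principle.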

> Of course, this situation almost never happen, it is a "theoretical
> race condition" issue.
> 
> 2. Pollute swapcache data by add a pre-invalided swap page
> when a swap_entry is invalidated, it will be reused by other anon
> page. At the same time, zswap is reclaiming old page, pollute
> swapcache of new page as a result, because old page and new page use
> the same swap_entry, such as:
> thread 1: zswap reclaim entry x
> thread 0: zswap_frontswap_invalidate_page entry x
> thread 0: entry x reused by other anon page
> thread 1: add old data to swapcache of entry x

I didn't get your idea here; why would thread 1 add old data to entry x?

> thread 0: swapcache of entry x is polluted
> Of course, this situation almost never happen, it is another
> "theoretical race condition" issue.
> 
> 3. Frontswap uses frontswap_map bitmap to track page in "backend"
> implementation, when zswap reclaim a
> page, the corresponding bitmap record is not cleared.
>

That's true, but I don't think it's a big problem.
It only wastes a little time searching the rbtree during zswap_frontswap_load().

> 4. zswap_tree is not freed when swapoff, and it got re-kzalloc in
> swapon, memory leak occurs.

Nice catch! I think it should be freed in zswap_frontswap_invalidate_area().

> 
> questions:
> 1. How about SetPageReclaim befor __swap_writepage, so that move it to
> the tail of the inactive list?

It will be added to inactive now.

> 2. zswap uses GFP_KERNEL flag to alloc things in store and reclaim
> function, does this lead to these function called recursively?

Yes, that's a potential problem.

> 3. for reclaiming one zbud page which contains two buddies, zswap
> needs to alloc two pages. Does this reclaim cost-efficient?
> 

Yes, that's a problem too. And that's why we use zbud as the default
allocator instead of zsmalloc.
I think improving the write back path of zswap is the next important
step for zswap.

-- 
Regards,
-Bob

[PATCH 2/2] pinctrl: core: Add proper mutex lock in pinctrl_request_gpio

2013-08-18 Thread Axel Lin

This one was missed in commit 42fed7ba "pinctrl: move subsystem mutex to
pinctrl_dev struct".

Signed-off-by: Axel Lin 
---
 drivers/pinctrl/core.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/pinctrl/core.c b/drivers/pinctrl/core.c
index 53c40d9..92f86ab 100644
--- a/drivers/pinctrl/core.c
+++ b/drivers/pinctrl/core.c
@@ -562,11 +562,15 @@ int pinctrl_request_gpio(unsigned gpio)
return ret;
}
 
+   mutex_lock(&pctldev->mutex);
+
/* Convert to the pin controllers number space */
pin = gpio_to_pin(range, gpio);
 
ret = pinmux_request_gpio(pctldev, range, pin, gpio);
 
+   mutex_unlock(&pctldev->mutex);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(pinctrl_request_gpio);
-- 
1.8.1.2




[PATCH 1/2] pinctrl: core: Remove unnecessary test for desc->name

2013-08-18 Thread Axel Lin

The implementation in pinctrl_register_one_pin() ensures pindesc->name is never
NULL before the pindesc is inserted into the radix tree.
If the desc returned from pin_desc_get() is not NULL, desc->name is never NULL.

Signed-off-by: Axel Lin 
---
 drivers/pinctrl/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/pinctrl/core.c b/drivers/pinctrl/core.c
index 4adef2f..53c40d9 100644
--- a/drivers/pinctrl/core.c
+++ b/drivers/pinctrl/core.c
@@ -153,9 +153,7 @@ int pin_get_from_name(struct pinctrl_dev *pctldev, const char *name)
pin = pctldev->desc->pins[i].number;
desc = pin_desc_get(pctldev, pin);
/* Pin space may be sparse */
-   if (desc == NULL)
-   continue;
-   if (desc->name && !strcmp(name, desc->name))
+   if (desc && !strcmp(name, desc->name))
return pin;
}
 
-- 
1.8.1.2
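
A small userspace sketch of the invariant the patch relies on, assuming hypothetical demo_* names and a plain array in place of the radix tree. In this simplified model registration rejects a NULL name (the kernel's registration path differs in detail), so a lookup only has to test the descriptor pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PINS 8

/* Hypothetical stand-in for struct pin_desc. */
struct demo_pin_desc {
	const char *name;
};

static struct demo_pin_desc demo_descs[DEMO_MAX_PINS];
static struct demo_pin_desc *demo_table[DEMO_MAX_PINS];	/* NULL = no pin */

/* Registration refuses a NULL name, so every stored desc has a name. */
static int demo_register_pin(unsigned pin, const char *name)
{
	if (pin >= DEMO_MAX_PINS || name == NULL)
		return -1;
	demo_descs[pin].name = name;
	demo_table[pin] = &demo_descs[pin];
	return 0;
}

/* Lookup only needs to test desc; desc->name cannot be NULL here. */
static int demo_pin_from_name(const char *name)
{
	unsigned pin;

	assert(name != NULL);

	for (pin = 0; pin < DEMO_MAX_PINS; pin++) {
		struct demo_pin_desc *desc = demo_table[pin];

		/* Pin space may be sparse */
		if (desc && !strcmp(name, desc->name))
			return (int)pin;
	}
	return -1;
}
```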




Re: readahead: make context readahead more conservative

2013-08-18 Thread Fengguang Wu

On Mon, Aug 19, 2013 at 09:59:09AM +0800, Miao Xie wrote:
> Hi, everyone
> 
> On Thu, 8 Aug 2013 16:54:18 +0800, Fengguang Wu wrote:
> > This helps performance on moderately dense random reads on SSD.
> > 
> > Transaction-Per-Second numbers provided by Taobao:
> > 
> >             QPS     case
> >             ---
> >             7536    disable context readahead totally
> > w/ patch:   7129    slower size rampup and start RA on the 3rd read
> >             6717    slower size rampup
> > w/o patch:  5581    unmodified context readahead
> > 
> > Before, readahead would be started whenever reading page N+1 when it
> > happened to have read N recently. After the patch, we'll only start readahead
> > when *three* random reads happen to access pages N, N+1, N+2. The
> > probability of this happening is extremely low for pure random reads,
> > unless they are very dense, which actually deserves some readahead.
> > 
> > Also start with a smaller readahead window. The impact on interleaved
> > sequential reads should be small, because for a long-run stream, the
> > small readahead window rampup phase is negligible.
> > 
> > The context readahead actually benefits clustered random reads on HDD
> > whose seek cost is pretty high. However as SSD is increasingly used
> > for random read workloads it's better for the context readahead to
> > concentrate on interleaved sequential reads.
> > 
> > Another SSD rand read test from Miao
> > 
> > # file size:2GB
> > # read IO amount: 625MB
> > sysbench --test=fileio  \
> > --max-requests=1\
> > --num-threads=1 \
> > --file-num=1\
> > --file-block-size=64K   \
> > --file-test-mode=rndrd  \
> > --file-fsync-freq=0 \
> > --file-fsync-end=off run
> > 
> > shows the performance of btrfs grows up from 69MB/s to 121MB/s,
> > ext4 from 104MB/s to 121MB/s.
> 
> I ran the same test on a hard disk recently:
> for btrfs, there is a ~5% regression (10.65MB/s -> 10.09MB/s);
> for ext4, the performance improves slightly (9.98MB/s -> 10.04MB/s).
> (I ran the test 4 times; the results above are the averages.)
> 
> Any comment?

Thanks for the tests! Minor regressions on the HDD cases are expected.

Since random read workloads are migrating to SSD as it becomes cheaper
and larger, it seems a good tradeoff to optimize for random read
performance on SSD.

Thanks,
Fengguang
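
For scale, the relative changes behind the figures quoted in this thread can be worked out directly. The helper below is an illustrative sketch (the name percent_change is not from the thread): the HDD btrfs result is roughly a 5.3% drop and the HDD ext4 result roughly a 0.6% gain, against gains of roughly 75% (btrfs) and 16% (ext4) in the SSD test.

```c
#include <assert.h>

/*
 * Relative change from 'before' to 'after', in percent.
 * A negative result means throughput went down.
 */
static double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

For example, percent_change(10.65, 10.09) covers the btrfs HDD case and percent_change(69.0, 121.0) the btrfs SSD case.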

[PATCH v2] cgroup: change cgroup_from_id() to css_from_id()

2013-08-18 Thread Li Zefan

Now we want cgroup core to always provide the css to use to the
subsystems, so change this API to css_from_id().

Uninline css_from_id(), because it's getting bigger and cgroup_css()
has been unexported.

While at it, remove the #ifdef, and shuffle the order of the args.

Signed-off-by: Li Zefan 
---

v2: change the order of the args.

---
 include/linux/cgroup.h | 20 ++--
 kernel/cgroup.c| 22 ++
 2 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 11a6419..3aac34d 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -742,27 +742,11 @@ static inline struct cgroup *task_cgroup(struct task_struct *task,
return task_css(task, subsys_id)->cgroup;
 }
 
-/**
- * cgroup_from_id - lookup cgroup by id
- * @ss: cgroup subsys to be looked into
- * @id: the cgroup id
- *
- * Returns the cgroup if there's valid one with @id, otherwise returns NULL.
- * Should be called under rcu_read_lock().
- */
-static inline struct cgroup *cgroup_from_id(struct cgroup_subsys *ss, int id)
-{
-#ifdef CONFIG_PROVE_RCU
-   rcu_lockdep_assert(rcu_read_lock_held() ||
-  lockdep_is_held(&cgroup_mutex),
-  "cgroup_from_id() needs proper protection");
-#endif
-   return idr_find(&ss->root->cgroup_idr, id);
-}
-
 struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
   struct cgroup_subsys_state *parent);
 
+struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss);
+
 /**
  * css_for_each_child - iterate through children of a css
  * @pos: the css * to use as the loop cursor
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 99785d6..07740d3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5710,6 +5710,28 @@ struct cgroup_subsys_state *cgroup_css_from_dir(struct file *f, int id)
return css ? css : ERR_PTR(-ENOENT);
 }
 
+/**
+ * css_from_id - lookup css by id
+ * @id: the cgroup id
+ * @ss: cgroup subsys to be looked into
+ *
+ * Returns the css if there's valid one with @id, otherwise returns NULL.
+ * Should be called under rcu_read_lock().
+ */
+struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss)
+{
+   struct cgroup *cgrp;
+
+   rcu_lockdep_assert(rcu_read_lock_held() ||
+  lockdep_is_held(&cgroup_mutex),
+  "css_from_id() needs proper protection");
+
+   cgrp = idr_find(&ss->root->cgroup_idr, id);
+   if (cgrp)
+   return cgroup_css(cgrp, ss->subsys_id);
+   return NULL;
+}
+
 #ifdef CONFIG_CGROUP_DEBUG
 static struct cgroup_subsys_state *
 debug_css_alloc(struct cgroup_subsys_state *parent_css)
-- 
1.8.0.2
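
A userspace sketch of the lookup shape introduced above, assuming hypothetical demo_* types and a fixed array in place of the idr. RCU protection and the lockdep assertion are left out; only the "find cgroup by id, then return the css for one subsystem, else NULL" flow and the v2 argument order are modelled.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ID	4
#define DEMO_NR_SUBSYS	2

/* Hypothetical stand-ins for cgroup_subsys_state and cgroup. */
struct demo_css {
	int subsys_id;
	int cgroup_id;
};

struct demo_cgroup {
	struct demo_css subsys[DEMO_NR_SUBSYS];
};

/* Stand-in for the idr: a slot is NULL when no cgroup has that id. */
static struct demo_cgroup *demo_idr[DEMO_MAX_ID];

/* Same argument order as the v2 patch: id first, then the subsystem. */
static struct demo_css *demo_css_from_id(int id, int subsys_id)
{
	struct demo_cgroup *cgrp;

	assert(subsys_id >= 0 && subsys_id < DEMO_NR_SUBSYS);

	if (id < 0 || id >= DEMO_MAX_ID)
		return NULL;

	cgrp = demo_idr[id];
	if (cgrp)
		return &cgrp->subsys[subsys_id];
	return NULL;
}
```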

Re: [Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue after unsetting BSP flag

2013-08-18 Thread HATAYAMA Daisuke


(2013/08/14 18:13), Jingbai Ma wrote:

On 08/13/2013 06:55 PM, Jingbai Ma wrote:

On 08/06/2013 05:19 PM, HATAYAMA Daisuke wrote:

Hello,

I've been addressing the kdump restriction that only one CPU is available
in the kdump 2nd kernel. Now I need to check whether the following CPU0 SMI
corruption issue, fixed in the following commit, can be reproduced again
by unsetting the BSP flag of the boot CPU:

commit 74b5820808215f65b70b05a099d6d3c969b82689
Author: Bjorn Helgaas
Date:   Wed Jul 29 15:54:25 2009 -0600

   ACPI: bind workqueues to CPU 0 to avoid SMI corruption

   On some machines, a software-initiated SMI causes corruption unless the
   SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
   done in GPE-related methods that are run via workqueues, so we can avoid
   the known corruption cases by binding the workqueues to CPU 0.

   References:
   http://bugzilla.kernel.org/show_bug.cgi?id=13751
   https://bugs.launchpad.net/bugs/157171
   https://bugs.launchpad.net/bugs/157691

   Signed-off-by: Bjorn Helgaas
   Signed-off-by: Len Brown

The reason is that, in the current situation, I have two ideas to deal
with the above kdump restriction:

 1) Disable BSP at the 2nd kernel, posted at:
   [PATCH v1 0/2] x86, apic: Disable BSP if boot cpu is AP
   https://lkml.org/lkml/2012/10/16/15

 2) Unset BSP flag at the 1st kernel, suggested by Eric Biederman
during the discussion of the idea 1).

With idea 1), the BSP is disabled in the kdump 2nd kernel. My conclusion
is that we have no method to reset the BSP, i.e. recover the BSP's healthy
state, while we can recover an AP by means of INIT as described in the MP
specification.

Idea 2) is simpler. We unset the BSP flag of the boot CPU in the 1st
kernel. The behaviour when receiving INIT depends on whether the
BSP flag is set in the CPU's MSR; we can set and unset the BSP flag of the
MSR freely at runtime. (I don't mean we should).
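
As a rough illustration of what "setting and unsetting the BSP flag" means at the bit level: the BSP flag is bit 8 of the IA32_APIC_BASE MSR. The sketch below only manipulates a 64-bit value in userspace; it does not read or write any MSR, and the demo_* helper names are hypothetical, not taken from the attached modules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of the IA32_APIC_BASE MSR value is the BSP flag. */
#define DEMO_APIC_BASE_BSP	(UINT64_C(1) << 8)

static_assert(DEMO_APIC_BASE_BSP == 0x100, "BSP flag is bit 8");

/* True if the given MSR value marks the CPU as bootstrap processor. */
static bool demo_is_bsp(uint64_t apic_base)
{
	return (apic_base & DEMO_APIC_BASE_BSP) != 0;
}

/* Return the MSR value with the BSP flag cleared; other bits unchanged. */
static uint64_t demo_clear_bsp(uint64_t apic_base)
{
	return apic_base & ~DEMO_APIC_BASE_BSP;
}

/* Return the MSR value with the BSP flag set; other bits unchanged. */
static uint64_t demo_set_bsp(uint64_t apic_base)
{
	return apic_base | DEMO_APIC_BASE_BSP;
}
```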

So, the next thing I should do is to evaluate the risk of idea 2). In fact,
during the discussion of idea 1), HPA pointed out that some kinds
of firmware are affected if the BSP flag is unset. Also, maybe for the same
reason, the recently introduced cpu0 hot-plugging feature by Fenghua Yu
doesn't appear to unset the BSP flag.

The biggest problem now is that I don't have any of the machines reported in
the bugzilla articles; this issue inherently depends on firmware.

So, could anyone help test idea 2) above if you have any of
the following machines? (or other ones that can lead to the same bug)

- HP Compaq 6910p
- HP Compaq 6710b
- HP Compaq 6710s
- HP Compaq 6510b
- HP Compaq 2510p

I prepared some small programs for this test. See the attached file.
The steps to try to reproduce the bug are as follows:

 1. $ tar xf bsp_flag_modules.tar.gz; cd bsp_flag_modules
 2. $ make # to build these programs
 3. $ insmod unsetbspflag.ko # to unset BSP flag of the boot cpu
 4. $ insmod getcpuinfo.ko # to confirm if BSP flag of the boot cpu has
   # been unset.
$ dmesg | tail
 5. Close the lid of the machine.
 6. Wait some minutes if necessary.
 7. Open the lid and you can see oops on the screen if bug has
   successfully been reproduced.



I couldn't find any of the models listed above, but found an HP EliteBook 6930p.
I tested this machine with kernel 2.6.30 first. After resuming from
suspend, the system hung.

Then I tested with kernel 3.11.0-rc5; it worked well and could resume from
suspend without any problem.

Next, I tested your program to clear the BSP flag. I found that
unsetbspflag.ko didn't work every time; sometimes I had to execute
insmod/rmmod several times to clear the BSP flag. (I used your
getcpuinfo.ko to check the BSP flag)

cpu: 0 bios_apic: 0 apic: 0 AP
cpu: 1 bios_apic: 1 apic: 1 AP

I suspended it, and then resumed it. This machine resumed from suspend
successfully, but the BSP flag had been set back:

cpu: 0 bios_apic: 0 apic: 0 BSP
cpu: 1 bios_apic: 1 apic: 1 AP

That's all my observation. Hope it's helpful.



I found a side effect of unsetting the BSP flag.
It affects system rebooting: once the BSP flag has been removed and a
reboot command is issued, the system hangs after the message:
Restarting system.
A hardware reset is then needed to recover it.

I have reproduced this problem on the following systems:
HP EliteBook 6930p
HP Compaq DC7700
HP ProLiant DL980 (4 sockets, 40 cores)



# Sorry for the delayed response. I was on vacation last week.

Thanks for your help, Ma. This result is enough to indicate the risk of unsetting
the BSP flag in the 1st kernel.

BTW, I have a question: does normal kdump work well if the crash happens on some
AP? I wonder whether the same issue could happen in the 2nd kernel.


I have an idea: to avoid this kind of issue, we can unset the BSP flag in
the first kernel during crash processing, and restore it in the second
kernel during AP initialization.



As Eric has already suggested, we cannot rely on kdump crash path.

Re: readahead: make context readahead more conservative

2013-08-18 Thread Miao Xie

Hi, everyone

On Thu, 8 Aug 2013 16:54:18 +0800, Fengguang Wu wrote:
> This helps performance on moderately dense random reads on SSD.
> 
> Transaction-Per-Second numbers provided by Taobao:
> 
>             QPS     case
>             ---
>             7536    disable context readahead totally
> w/ patch:   7129    slower size rampup and start RA on the 3rd read
>             6717    slower size rampup
> w/o patch:  5581    unmodified context readahead
> 
> Before, readahead would be started whenever reading page N+1 when it
> happened to have read N recently. After the patch, we'll only start readahead
> when *three* random reads happen to access pages N, N+1, N+2. The
> probability of this happening is extremely low for pure random reads,
> unless they are very dense, which actually deserves some readahead.
> 
> Also start with a smaller readahead window. The impact on interleaved
> sequential reads should be small, because for a long-run stream, the
> small readahead window rampup phase is negligible.
> 
> The context readahead actually benefits clustered random reads on HDD
> whose seek cost is pretty high. However as SSD is increasingly used
> for random read workloads it's better for the context readahead to
> concentrate on interleaved sequential reads.
> 
> Another SSD rand read test from Miao
> 
> # file size:2GB
> # read IO amount: 625MB
> sysbench --test=fileio  \
> --max-requests=1\
> --num-threads=1 \
> --file-num=1\
> --file-block-size=64K   \
> --file-test-mode=rndrd  \
> --file-fsync-freq=0 \
> --file-fsync-end=off run
> 
> shows the performance of btrfs grows up from 69MB/s to 121MB/s,
> ext4 from 104MB/s to 121MB/s.

I ran the same test on a hard disk recently:
for btrfs, there is a ~5% regression (10.65MB/s -> 10.09MB/s);
for ext4, the performance improves slightly (9.98MB/s -> 10.04MB/s).
(I ran the test 4 times; the results above are the averages.)

Any comment?

Thanks
Miao

> 
> Tested-by: Tao Ma 
> Tested-by: Miao Xie 
> Signed-off-by: Wu Fengguang 
> ---
>  mm/readahead.c |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> --- linux-next.orig/mm/readahead.c2013-08-08 16:21:29.675286154 +0800
> +++ linux-next/mm/readahead.c 2013-08-08 16:21:33.851286019 +0800
> @@ -371,10 +371,10 @@ static int try_context_readahead(struct
>   size = count_history_pages(mapping, ra, offset, max);
>  
>   /*
> -  * no history pages:
> +  * not enough history pages:
>* it could be a random read
>*/
> - if (!size)
> + if (size <= req_size)
>   return 0;
>  
>   /*
> @@ -385,8 +385,8 @@ static int try_context_readahead(struct
>   size *= 2;
>  
>   ra->start = offset;
> - ra->size = get_init_ra_size(size + req_size, max);
> - ra->async_size = ra->size;
> + ra->size = min(size + req_size, max);
> + ra->async_size = 1;
>  
>   return 1;
>  }
> 
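
The decision changed by the quoted diff can be modelled in userspace as follows. This is a sketch under stated assumptions: the page cache is a plain bool array, and demo_count_history() is a simplified stand-in for count_history_pages() that counts consecutive cached pages just before the requested offset. It is meant only to show why a single cached neighbour no longer triggers readahead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the readahead state. */
struct demo_ra {
	size_t start, size, async_size;
};

/* Count consecutive cached pages immediately before 'offset', at most 'max'. */
static size_t demo_count_history(const bool *cached, size_t offset, size_t max)
{
	size_t n = 0;

	while (n < max && n < offset && cached[offset - 1 - n])
		n++;
	return n;
}

/* Returns 1 and fills *ra if context readahead should start, else 0. */
static int demo_try_context_readahead(const bool *cached, struct demo_ra *ra,
				      size_t offset, size_t req_size, size_t max)
{
	size_t size;

	assert(cached != NULL && ra != NULL);

	size = demo_count_history(cached, offset, max);

	/* not enough history pages: it could be a random read */
	if (size <= req_size)
		return 0;

	/* history reaches the start of the file: likely a long-run stream */
	if (size >= offset)
		size *= 2;

	ra->start = offset;
	ra->size = size + req_size < max ? size + req_size : max;
	ra->async_size = 1;
	return 1;
}
```

With only page 10 cached, a one-page read at page 11 starts no readahead in this model; once pages 10 and 11 are both cached, a read at page 12 does.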


[PATCH] f2fs: fix a compound statement label error

2013-08-18 Thread Gu Zheng

From 685b72b66cb8ce019429b1958c91f346b260bc65 Mon Sep 17 00:00:00 2001
From: Gu Zheng 
Date: Mon, 19 Aug 2013 09:41:15 +0800
Subject: [PATCH] f2fs: fix a compound statement label error
An error "label at end of compound statement" occurs if CONFIG_F2FS_STAT_FS
is disabled:
fs/f2fs/segment.c:556:1: error: label at end of compound statement
So remove the 'out' label to fix it.

Reported-by: Fengguang Wu 
Signed-off-by: Gu Zheng 
---
 fs/f2fs/segment.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 9c45b8e..09af9c7 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -540,12 +540,9 @@ static void allocate_segment_by_default(struct f2fs_sb_info *sbi,
 {
struct curseg_info *curseg = CURSEG_I(sbi, type);
 
-   if (force) {
+   if (force)
new_curseg(sbi, type, true);
-   goto out;
-   }
-
-   if (type == CURSEG_WARM_NODE)
+   else if (type == CURSEG_WARM_NODE)
new_curseg(sbi, type, false);
else if (curseg->alloc_type == LFS && is_next_segment_free(sbi, type))
new_curseg(sbi, type, false);
@@ -553,7 +550,6 @@ static void allocate_segment_by_default(struct f2fs_sb_info *sbi,
change_curseg(sbi, type, true);
else
new_curseg(sbi, type, false);
-out:
 #ifdef CONFIG_F2FS_STAT_FS
sbi->segment_count[curseg->alloc_type]++;
 #endif
-- 
1.7.7
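
A userspace sketch of the control flow after this patch, using hypothetical demo_* names and a DEMO_STAT_FS macro in place of CONFIG_F2FS_STAT_FS. Before the patch the function ended with an "out:" label followed only by the #ifdef block; with the option disabled the label was immediately followed by the closing brace, which the compiler rejected. An if/else-if chain needs no label at all.

```c
#include <assert.h>

/* Define DEMO_STAT_FS to model CONFIG_F2FS_STAT_FS=y. */
/* #define DEMO_STAT_FS */

enum demo_action { DEMO_NEW_FORCED, DEMO_NEW, DEMO_CHANGE };

/* Only incremented when DEMO_STAT_FS is defined. */
int demo_segment_count;

/*
 * Pick an allocation action. Every branch falls through to the optional
 * statistics update, so no "goto out" and no trailing label are needed.
 */
static enum demo_action demo_allocate(int force, int warm_node, int can_reuse)
{
	enum demo_action act;

	assert(force == 0 || force == 1);

	if (force)
		act = DEMO_NEW_FORCED;
	else if (warm_node)
		act = DEMO_NEW;
	else if (can_reuse)
		act = DEMO_CHANGE;
	else
		act = DEMO_NEW;
#ifdef DEMO_STAT_FS
	demo_segment_count++;
#endif
	return act;
}
```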


Re: [PATCH 2/2] powerpc/iommu: check dev->iommu_group before remove a device from iommu_group

2013-08-18 Thread Wei Yang

On Mon, Aug 19, 2013 at 11:39:49AM +1000, Alexey Kardashevskiy wrote:
>On 08/19/2013 11:29 AM, Wei Yang wrote:
>> On Fri, Aug 16, 2013 at 08:15:36PM +1000, Alexey Kardashevskiy wrote:
>>> On 08/16/2013 08:08 PM, Wei Yang wrote:
>>>> ---
>>>>  arch/powerpc/kernel/iommu.c |3 ++-
>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>>>> index b20ff17..5abf7c3 100644
>>>> --- a/arch/powerpc/kernel/iommu.c
>>>> +++ b/arch/powerpc/kernel/iommu.c
>>>> @@ -1149,7 +1149,8 @@ static int iommu_bus_notifier(struct notifier_block *nb,
>>>> case BUS_NOTIFY_ADD_DEVICE:
>>>> return iommu_add_device(dev);
>>>> case BUS_NOTIFY_DEL_DEVICE:
>>>> -   iommu_del_device(dev);
>>>> +   if (dev->iommu_group)
>>>> +   iommu_del_device(dev);
>>>> return 0;
>>>> default:
>>>> return 0;

>>>
>>> This one seems redundant, no?
>> 
>> Sorry for the late reply.
>> 
>> Yes, these two patches have the same purpose, guarding the system, but in
>> two different places.  One is in the powernv platform, the other is in the
>> generic iommu driver.
>> 
>> The one in the powernv platform is used to correct the original logic.
>> 
>> The one in the generic iommu driver is to keep the system safe in case another
>> platform calls iommu_group_remove_device() without the check.
>
>
>But I am moving bus notifier to powernv code (posted a patch last week,
>otherwise Freescale's IOMMU conflicted) so this won't be the case.

Yes, I see the patch.

This means other platforms, besides powernv, will check dev->iommu_group
before removing the device? Would this be a convention?

If this is the case, the second patch is enough. We don't need to check it in
generic iommu driver.
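
The guard under discussion can be sketched in userspace like this, with hypothetical demo_* types standing in for struct device and the iommu group. It shows only the shape of the check: the delete path is skipped when the device was never added to a group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an iommu group and a device. */
struct demo_group {
	int refs;
};

struct demo_device {
	struct demo_group *iommu_group;
};

enum { DEMO_ADD_DEVICE, DEMO_DEL_DEVICE };

/* Drops the device from its group; must not be called without a group. */
static void demo_del_device(struct demo_device *dev)
{
	assert(dev->iommu_group != NULL);
	dev->iommu_group->refs--;
	dev->iommu_group = NULL;
}

/* Notifier-style dispatcher with the guard from the quoted diff. */
static int demo_bus_notifier(int action, struct demo_device *dev,
			     struct demo_group *grp)
{
	switch (action) {
	case DEMO_ADD_DEVICE:
		dev->iommu_group = grp;
		grp->refs++;
		return 0;
	case DEMO_DEL_DEVICE:
		if (dev->iommu_group)
			demo_del_device(dev);
		return 0;
	default:
		return 0;
	}
}
```

Whether the check belongs in the caller, as here, or inside the delete helper itself is exactly the question the two patches raise.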

Since I am not very familiar with the code conventions, I posted these two
patches together. This doesn't mean I need to push both of them. Your comments
are welcome; let me know which one is more suitable in this case.

>
>
>
>-- 
>Alexey

-- 
Richard Yang
Help you, Help me


linux-next: build failure after merge of the libata tree

2013-08-18 Thread Stephen Rothwell

Hi Tejun,

After merging the libata tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

drivers/ata/libata-pmp.c: In function 'sata_pmp_quirks':
drivers/ata/libata-pmp.c:386:36: error: invalid suffix "x26" on integer constant
  if (vendor == 0x1095 && (devid == 0x3x26 || devid == 0x3826)) {
^

Caused by commit f1a313ad86b9 ("libata: apply behavioral quirks to sil3826 PMP").

I have used the libata tree from next-20130816 for today.
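
For reference, the compiler reads 0x3x26 as the hex constant 0x3 followed by an invalid suffix "x26". A sketch of what the comparison presumably intended, assuming the first device ID was meant to be 0x3726 (an assumption; the report does not say so) and using a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed IDs: vendor 0x1095 with device 0x3726 or 0x3826.
 * The 0x3726 value is a guess at what the "0x3x26" typo stood for.
 */
static bool demo_is_sil_pmp(uint16_t vendor, uint16_t devid)
{
	return vendor == 0x1095 && (devid == 0x3726 || devid == 0x3826);
}
```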
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



Re: [PATCH tip/core/rcu 7/9] nohz_full: Add full-system-idle arguments to API

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 08:11:20PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:49:42PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > This commit adds an isidle and jiffies argument to force_qs_rnp(),
> > dyntick_save_progress_counter(), and rcu_implicit_dynticks_qs() to enable
> > RCU's force-quiescent-state process to check for full-system idle.
> > 
> > Signed-off-by: Paul E. McKenney 
> > Cc: Frederic Weisbecker 
> > Cc: Steven Rostedt 
> > Cc: Lai Jiangshan 
> > [ paulmck: Use true and false for boolean constants per Lai Jiangshan. ]
> 
> One optional comment below; with or without that,
> Reviewed-by: Josh Triplett 
> 
> >  kernel/rcutree.c | 23 ---
> >  1 file changed, 16 insertions(+), 7 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index b0d2cc3..f1a0b05 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -246,7 +246,9 @@ module_param(jiffies_till_next_fqs, ulong, 0644);
> >  
> >  static void rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp,
> >   struct rcu_data *rdp);
> > -static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *));
> > +static void force_qs_rnp(struct rcu_state *rsp,
> > +int (*f)(struct rcu_data *, bool *, unsigned long *),
> > +bool *isidle, unsigned long *maxj);
> 
> You might consider giving the parameters of the function pointer names
> (both here and in the definition), to make it more self-documenting.

Forces a line break, but given that I couldn't immediately recall all
of the parameter names myself, I made the change.  ;-)

Thanx, Paul
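
A userspace sketch of the suggestion, with the function-pointer parameters named in the prototype so that it documents itself. The demo_* types and the callback's behaviour are illustrative assumptions, not the RCU implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU record. */
struct demo_cpu {
	bool idle;
	unsigned long last_busy;	/* time stamp of last non-idle period */
};

/*
 * Apply f to every CPU. The parameters of f are named here, so a reader
 * can tell what the bool * and unsigned long * are for without looking
 * up a callback definition.
 */
static int demo_force_qs(struct demo_cpu *cpus, size_t ncpus,
			 int (*f)(struct demo_cpu *cpu, bool *isidle,
				  unsigned long *maxj),
			 bool *isidle, unsigned long *maxj)
{
	size_t i;
	int hits = 0;

	assert(f != NULL);

	for (i = 0; i < ncpus; i++)
		hits += f(&cpus[i], isidle, maxj);
	return hits;
}

/* Example callback: clears *isidle on a busy CPU, tracks latest busy time. */
static int demo_check_idle(struct demo_cpu *cpu, bool *isidle,
			   unsigned long *maxj)
{
	if (cpu->idle)
		return 1;
	*isidle = false;
	if (cpu->last_busy > *maxj)
		*maxj = cpu->last_busy;
	return 0;
}
```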


Re: [ 00/34] 3.4.59-stable review

2013-08-18 Thread Guenter Roeck


On 08/18/2013 01:34 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.4.59 release.
There are 34 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Note, there are a number of "build fixes" in this round, in the quest to
get all arches building properly to be able to track future regressions
easier.  Many thanks to Guenter Roeck and Geert Uytterhoeven for their
work in doing this.

Responses should be made by Tue Aug 20 20:32:48 UTC 2013.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.4.59-rc1.gz
and the diffstat can be found below.



Build results:
Total builds: 69 Total build errors: 2
Previous release:
Total builds: 62 Total build errors: 10

qemu:
mips, mips64, ppc, x86, x86_64: pass (boot to login prompt)
arm: skipped

Details:
http://server.roeck-us.net:8010/builders

More builds, significantly fewer failures, so results are excellent.

Still failing builds are arm:allmodconfig and sparc64:allmodconfig.
Both may be difficult to fix.

Thanks,
Guenter


Re: [PATCH] generic-ipi: Fix misleading smp_call_function_any() description

2013-08-18 Thread Xie XiuQi

Cc: Ingo Molnar 

On 2013/7/29 11:52, Xie XiuQi wrote:
> After commit:8969a5ede0f9e17da4b943712429aef2c9bcd82b
> "generic-ipi: remove kmalloc()", wait = 0 can be guaranteed.
> 
> Signed-off-by: Xie XiuQi 
> Cc: Sheng Yang 
> Cc: Peter Zijlstra 
> Cc: Jens Axboe 
> Cc: Rusty Russell 
> ---
>  kernel/smp.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index fe9f773..b1c9034 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -278,8 +278,6 @@ EXPORT_SYMBOL(smp_call_function_single);
>   * @wait: If true, wait until function has completed.
>   *
>   * Returns 0 on success, else a negative status code (if no cpus were online).
> - * Note that @wait will be implicitly turned on in case of allocation failures,
> - * since we fall back to on-stack allocation.
>   *
>   * Selection preference:
>   *   1) current cpu if in @mask
> 



Re: [ 00/45] 3.10.8-stable review

2013-08-18 Thread Guenter Roeck


On 08/18/2013 01:35 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.10.8 release.
There are 45 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Aug 20 20:36:09 UTC 2013.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.10.8-rc1.gz
and the diffstat can be found below.



Cross build results:
Total builds: 76 Total build errors: 0
Previous release:
Total builds: 69 Total build errors: 0

qemu:
mips, ppc, x86, x86_64 pass (boot to login prompt)
arm: fail (known problem, see https://lkml.org/lkml/2013/8/11/41)

More builds, still no failures.

Thanks,
Guenter


Re: [PATCH 2/2] powerpc/iommu: check dev->iommu_group before remove a device from iommu_group

2013-08-18 Thread Alexey Kardashevskiy

On 08/19/2013 11:29 AM, Wei Yang wrote:
> On Fri, Aug 16, 2013 at 08:15:36PM +1000, Alexey Kardashevskiy wrote:
>> On 08/16/2013 08:08 PM, Wei Yang wrote:
>>> ---
>>>  arch/powerpc/kernel/iommu.c |3 ++-
>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>>> index b20ff17..5abf7c3 100644
>>> --- a/arch/powerpc/kernel/iommu.c
>>> +++ b/arch/powerpc/kernel/iommu.c
>>> @@ -1149,7 +1149,8 @@ static int iommu_bus_notifier(struct notifier_block *nb,
>>> case BUS_NOTIFY_ADD_DEVICE:
>>> return iommu_add_device(dev);
>>> case BUS_NOTIFY_DEL_DEVICE:
>>> -   iommu_del_device(dev);
>>> +   if (dev->iommu_group)
>>> +   iommu_del_device(dev);
>>> return 0;
>>> default:
>>> return 0;
>>>
>>
>> This one seems redundant, no?
> 
> Sorry for the late reply.
> 
> Yes, these two patches have the same purpose, guarding the system, but in
> two different places.  One is in the powernv platform, the other is in the
> generic iommu driver.
> 
> The one in the powernv platform is used to correct the original logic.
> 
> The one in the generic iommu driver is to keep the system safe in case another
> platform calls iommu_group_remove_device() without the check.


But I am moving bus notifier to powernv code (posted a patch last week,
otherwise Freescale's IOMMU conflicted) so this won't be the case.



-- 
Alexey

Re: [ 00/12] 3.0.92-stable review

2013-08-18 Thread Guenter Roeck


On 08/18/2013 01:30 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.0.92 release.
There are 12 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Aug 20 20:29:24 UTC 2013.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.92-rc1.gz
and the diffstat can be found below.



Cross build results (including commit ea077b1b96):
Total builds: 65 Total build errors: 8
Previous release:
Total builds: 58 Total build errors: 14

qemu tests:
ppc, x86, x86_64 tested ok
arm, mips, mips64 skipped

More builds, fewer failures. Excellent results.

Build details are at http://server.roeck-us.net:8010/builders.

Thanks,
Guenter


Re: [PATCH tip/core/rcu 6/9] nohz_full: Add full-system idle states and variables

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 08:09:21PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:49:41PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > This commit adds control variables and states for full-system idle.
> > The system will progress through the states in numerical order when
> > the system is fully idle (other than the timekeeping CPU), and reset
> > down to the initial state if any non-timekeeping CPU goes non-idle.
> > The current state is kept in full_sysidle_state.
> > 
> > A RCU_SYSIDLE_SMALL macro is defined, and systems with this number
> > of CPUs or fewer move through the states more aggressively.  The idea
> > is that the resulting memory contention is less of a problem on small
> > systems.  Architectures can adjust this value (which defaults to 8)
> > using CONFIG_ARCH_RCU_SYSIDLE_SMALL.
> > 
> > One flavor of RCU will be in charge of driving the state machine,
> > defined by rcu_sysidle_state.  This should be the busiest flavor of RCU.
> > 
> > Signed-off-by: Paul E. McKenney 
> > Cc: Frederic Weisbecker 
> > Cc: Steven Rostedt 
> 
> One issue (and one question) below; with the issue addressed,
> Reviewed-by: Josh Triplett 
> 
> >  kernel/rcutree_plugin.h | 28 
> >  1 file changed, 28 insertions(+)
> > 
> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > index eab81da..64a05b9f 100644
> > --- a/kernel/rcutree_plugin.h
> > +++ b/kernel/rcutree_plugin.h
> > @@ -2378,6 +2378,34 @@ static void rcu_kick_nohz_cpu(int cpu)
> >  #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> >  
> >  /*
> > + * Handle small systems specially, accelerating their transition into
> > + * full idle state.  Allow arches to override this code's idea of
> > + * what constitutes a "small" system.
> > + */
> > +#ifdef CONFIG_ARCH_RCU_SYSIDLE_SMALL
> 
> I don't see any Kconfig creating this new config option.
> 
> Also, why not simply define this config option unconditionally, with a
> default of 8, and then use its value directly?

Good point, removing this and adding a Kconfig option in the
"nohz_full: Add full-system-idle state machine" commit, with a
default value of 8.  Architecture maintainers who want something
different can then set that up in their defconfig files.

> > +static int __maybe_unused full_sysidle_state; /* Current system-idle state. */
> > +#define RCU_SYSIDLE_NOT         0   /* Some CPU is not idle. */
> > +#define RCU_SYSIDLE_SHORT       1   /* All CPUs idle for brief period. */
> > +#define RCU_SYSIDLE_LONG        2   /* All CPUs idle for long enough. */
> > +#define RCU_SYSIDLE_FULL        3   /* All CPUs idle, ready for sysidle. */
> > +#define RCU_SYSIDLE_FULL_NOTED  4   /* Actually entered sysidle state. */
> 
> Perhaps there's a kernel style rule I'm not thinking of that makes it
> verboten, but: why not use an enum for a state variable like this?

I didn't trust enum interactions with xchg and cmpxchg, so opted for "int"
instead.  That said, enum is much more portable than when I last looked
at it.  Admittedly, the last time I looked at it was in the early 1980s...

Thanx, Paul
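
To make the int-versus-enum point concrete, here is a userspace sketch that keeps the state variable a plain atomic int, uses enum constants only as names, and advances it with a C11 compare-and-exchange. The demo_* names and the one-step-at-a-time advance rule are assumptions for illustration, not the RCU state machine.

```c
#include <assert.h>
#include <stdatomic.h>

/* State names only; the variable itself stays an int for the atomics. */
enum {
	DEMO_SYSIDLE_NOT = 0,		/* Some CPU is not idle. */
	DEMO_SYSIDLE_SHORT,		/* All CPUs idle for brief period. */
	DEMO_SYSIDLE_LONG,		/* All CPUs idle for long enough. */
	DEMO_SYSIDLE_FULL,		/* All CPUs idle, ready for sysidle. */
	DEMO_SYSIDLE_FULL_NOTED,	/* Actually entered sysidle state. */
};

static atomic_int demo_sysidle_state = DEMO_SYSIDLE_NOT;

/*
 * Advance one step in numerical order unless already at the last state.
 * Returns the new state on success, or the state observed if another
 * thread changed it first.
 */
static int demo_sysidle_advance(void)
{
	int old = atomic_load(&demo_sysidle_state);

	assert(old >= DEMO_SYSIDLE_NOT);

	if (old >= DEMO_SYSIDLE_FULL_NOTED)
		return old;
	if (atomic_compare_exchange_strong(&demo_sysidle_state, &old, old + 1))
		return old + 1;
	return old;	/* lost a race: 'old' now holds the current value */
}

/* Any non-idle CPU resets the machine to the initial state. */
static void demo_sysidle_reset(void)
{
	atomic_store(&demo_sysidle_state, DEMO_SYSIDLE_NOT);
}
```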


Re: [PATCH tip/core/rcu 4/9] nohz_full: Add rcu_dyntick data for scalable detection of all-idle state

2013-08-18 Thread Josh Triplett

On Sun, Aug 18, 2013 at 06:22:29PM -0700, Paul E. McKenney wrote:
> On Sat, Aug 17, 2013 at 08:02:34PM -0700, Josh Triplett wrote:
> > On Sat, Aug 17, 2013 at 06:49:39PM -0700, Paul E. McKenney wrote:
> > > +#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> > > +
> > > +/*
> > > + * Initialize dynticks sysidle state for CPUs coming online.
> > > + */
> > > +static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> > > +{
> > > + rdtp->dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE;
> > > +}
> > > +
> > > +#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> > > +
> > > +static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> > > +{
> > > +}
> > > +
> > > +#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> > 
> > Just move the ifdef around the function body:
> > 
> > static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> > {
> > #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> > rdtp->dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE;
> > #endif /* CONFIG_NO_HZ_FULL_SYSIDLE */
> > }
> 
> This makes sense for this isolated function, and it would also
> make sense if the end result had only functions that were exported.
> But if I try to apply this to the result, I will end up with something
> like the following.  Is that really what you want?
> 
> I suppose I could individually enclose whole functions whose definitions
> are unneeded for CONFIG_NO_HZ_FULL_SYSIDLE=n, but that doesn't seem
> helpful either.
> 
> Thoughts?

I see what you mean.  Short of sorting the functions to put all the
unexported ones together, which seems suboptimal, I don't see an obvious
fix.  The result you posted isn't *terrible*, but it's not great either.

I had mostly hoped to avoid having two duplicate function headers that
would then both need changing whenever changing the function signature,
and which could then potentially get out of sync without causing a
compilation error.

I'd say that if you have a single-function block like the one above, you
should use the ifdef-body approach, but if you've got a group of
functions that don't all use one or the other approach, go ahead and
wrap the whole thing in one big ifdef rather than one per function.
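
To make the single-function case concrete, here is a minimal standalone
sketch.  The struct, the DEMO_NEST_VALUE constant, and the
CONFIG_DEMO_SYSIDLE symbol are stand-ins made up for illustration, not the
kernel's definitions.

```c
/* Hypothetical stand-ins for the kernel config symbol and nesting value. */
#define CONFIG_DEMO_SYSIDLE 1
#define DEMO_NEST_VALUE 64

struct rcu_dynticks_demo {
	long long dynticks_idle_nesting;
};

/*
 * One function header for both configurations: only the body depends on
 * the config option, so a signature change is made in exactly one place.
 */
static void demo_sysidle_init_percpu_data(struct rcu_dynticks_demo *rdtp)
{
#ifdef CONFIG_DEMO_SYSIDLE
	rdtp->dynticks_idle_nesting = DEMO_NEST_VALUE;
#else
	(void)rdtp;	/* nothing to initialize when the option is off */
#endif /* CONFIG_DEMO_SYSIDLE */
}
```

With the two-definition style, the empty variant could keep an outdated
signature and still compile whenever the option is enabled, which is the
drift this form avoids.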

Use whichever approach seems most sensible on a case-by-case basis; your
call.  Feel free to add a
Reviewed-by: Josh Triplett 
to whichever approach you go with.

- Josh Triplett

[PATCH 0/2] mfd: 88pm8xx: platform data bug fix

2013-08-18 Thread Chao Xie

These patches fix a bug where pdata may be NULL when the driver dereferences it.

Chao Xie (2):
  mfd: 88pm800: Fix the bug that pdata may be NULL
  mfd: 88pm805: Fix the bug that pdata may be NULL

 drivers/mfd/88pm800.c |   10 ++
 drivers/mfd/88pm805.c |2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

-- 
1.7.4.1

Re: [PATCH 2/2] powerpc/iommu: check dev->iommu_group before remove a device from iommu_group

2013-08-18 Thread Wei Yang

On Fri, Aug 16, 2013 at 08:15:36PM +1000, Alexey Kardashevskiy wrote:
>On 08/16/2013 08:08 PM, Wei Yang wrote:
>> ---
>>  arch/powerpc/kernel/iommu.c |3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>> 
>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>> index b20ff17..5abf7c3 100644
>> --- a/arch/powerpc/kernel/iommu.c
>> +++ b/arch/powerpc/kernel/iommu.c
>> @@ -1149,7 +1149,8 @@ static int iommu_bus_notifier(struct notifier_block *nb,
>>  case BUS_NOTIFY_ADD_DEVICE:
>>  return iommu_add_device(dev);
>>  case BUS_NOTIFY_DEL_DEVICE:
>> -iommu_del_device(dev);
>> +if (dev->iommu_group)
>> +iommu_del_device(dev);
>>  return 0;
>>  default:
>>  return 0;
>> 
>
>This one seems redundant, no?

Sorry for the late reply.

Yes, both patches serve the same purpose, guarding the system, but in two
different places: one is in the powernv platform code, the other is in the
generic iommu driver.

The one in the powernv platform code corrects the original logic.

The one in the generic iommu driver keeps the system safe in case another
platform calls iommu_group_remove_device() without the check.

>
>
>-- 
>Alexey

-- 
Richard Yang
Help you, Help me

Re: [PATCH 05/35] i2c: use dev_get_platdata()

2013-08-18 Thread Jingoo Han

On Friday, August 16, 2013 3:03 AM, Wolfram Sang wrote:
> On Tue, Jul 30, 2013 at 04:59:33PM +0900, Jingoo Han wrote:
> > Use the wrapper function for retrieving the platform data instead of
> > accessing dev->platform_data directly.
> >
> > Signed-off-by: Jingoo Han 
> 
> Not convincing. I couldn't find a cover letter explaining the motivation
> and if this should go via the separate trees or via one cleanup pull
> request. (and if there is one, the i2c list should be on cc) Also, all

CC'ed Mark Brown (author of dev_get_platdata function)

1. Motivation
This is a cosmetic change intended to enhance readability and make
the code simpler.

If you want, I will modify the commit message as below:
"Use the wrapper function for retrieving the platform data instead of
 accessing dev->platform_data directly. This is a cosmetic change
 in order to enhance readability and make the code simpler."

2. via the separate trees
It should go via the separate trees.

> 35 patches seem to be separate mails and not threaded, so I can't easily
> check other opinions. Will skip for now unless somebody points out a
> strong reason.

3. Other opinions
So far, there have been no objections.
Also, 21 of the 35 patches have already been applied by their respective
maintainers.

Mark Brown,
Sorry for CC'ing you.
If I am wrong, please let me know kindly. :)

Best regards,
Jingoo Han

[PATCH 1/2] mfd: 88pm800: Fix the bug that pdata may be NULL

2013-08-18 Thread Chao Xie

Users pass platform data to the device, and that platform data may be
NULL. Add a check for pdata before dereferencing it.

Signed-off-by: Chao Xie 
---
 drivers/mfd/88pm800.c |   10 ++
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/mfd/88pm800.c b/drivers/mfd/88pm800.c
index 6c95483..d4d272f 100644
--- a/drivers/mfd/88pm800.c
+++ b/drivers/mfd/88pm800.c
@@ -333,9 +333,11 @@ static int device_rtc_init(struct pm80x_chip *chip,
 {
int ret;
 
-   rtc_devs[0].platform_data = pdata->rtc;
-   rtc_devs[0].pdata_size =
-   pdata->rtc ? sizeof(struct pm80x_rtc_pdata) : 0;
+   if (pdata) {
+   rtc_devs[0].platform_data = pdata->rtc;
+   rtc_devs[0].pdata_size =
+   pdata->rtc ? sizeof(struct pm80x_rtc_pdata) : 0;
+   }
ret = mfd_add_devices(chip->dev, 0, &rtc_devs[0],
  ARRAY_SIZE(rtc_devs), NULL, 0, NULL);
if (ret) {
@@ -578,7 +580,7 @@ static int pm800_probe(struct i2c_client *client,
goto err_device_init;
}
 
-   if (pdata->plat_config)
+   if (pdata && pdata->plat_config)
pdata->plat_config(chip, pdata);
 
return 0;
-- 
1.7.4.1

[PATCH 2/2] mfd: 88pm805: Fix the bug that pdata may be NULL

2013-08-18 Thread Chao Xie

Users pass platform data to the device, and that platform data may be
NULL. Add a check for pdata before dereferencing it.

Signed-off-by: Chao Xie 
---
 drivers/mfd/88pm805.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/mfd/88pm805.c b/drivers/mfd/88pm805.c
index 5216022..57135bb 100644
--- a/drivers/mfd/88pm805.c
+++ b/drivers/mfd/88pm805.c
@@ -243,7 +243,7 @@ static int pm805_probe(struct i2c_client *client,
goto err_805_init;
}
 
-   if (pdata->plat_config)
+   if (pdata && pdata->plat_config)
pdata->plat_config(chip, pdata);
 
 err_805_init:
-- 
1.7.4.1

Re: [PATCH 2/2] perf, x86: Add Silvermont (22nm Atom) support

2013-08-18 Thread Yan, Zheng

ping

On 07/19/2013 10:46 AM, Yan, Zheng wrote:
> From: "Yan, Zheng" 
> 
> Compared to the old Atom, Silvermont has offcore response support and
> more events that support PEBS.
> 
> Signed-off-by: Yan, Zheng 
> ---
> Changes since v1:
>  - test shows that "event 0x013c != fixed counter2", fix the code
>  - remove _PS suffixes in PEBS events' comments
>  - add model number 77 for Avoton "Silvermont"
> 
>  arch/x86/kernel/cpu/perf_event.h          |   2 +
>  arch/x86/kernel/cpu/perf_event_intel.c    | 150 ++
>  arch/x86/kernel/cpu/perf_event_intel_ds.c |  26 ++
>  3 files changed, 178 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index 97e557b..cc16faa 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -641,6 +641,8 @@ extern struct event_constraint intel_core2_pebs_event_constraints[];
>  
>  extern struct event_constraint intel_atom_pebs_event_constraints[];
>  
> +extern struct event_constraint intel_slm_pebs_event_constraints[];
> +
>  extern struct event_constraint intel_nehalem_pebs_event_constraints[];
>  
>  extern struct event_constraint intel_westmere_pebs_event_constraints[];
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index d312edf..1e3896c 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -886,6 +886,140 @@ static __initconst const u64 atom_hw_cache_event_ids
>   },
>  };
>  
> +static struct extra_reg intel_slm_extra_regs[] __read_mostly =
> +{
> + /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
> + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x768005, RSP_0),
> + INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x768005, RSP_1),
> + EVENT_EXTRA_END
> +};
> +
> +#define SLM_DMND_READ        SNB_DMND_DATA_RD
> +#define SLM_DMND_WRITE       SNB_DMND_RFO
> +#define SLM_DMND_PREFETCH    (SNB_PF_DATA_RD|SNB_PF_RFO)
> +
> +#define SLM_LLC_ACCESS   SNB_RESP_ANY
> +#define SLM_LLC_MISS (SNB_SNP_NONE|SNB_SNP_MISS| \
> +  SNB_NO_FWD|SNB_HITM|SNB_NON_DRAM)
> +
> +static __initconst const u64 slm_hw_cache_extra_regs
> + [PERF_COUNT_HW_CACHE_MAX]
> + [PERF_COUNT_HW_CACHE_OP_MAX]
> + [PERF_COUNT_HW_CACHE_RESULT_MAX] =
> +{
> + [ C(LL  ) ] = {
> + [ C(OP_READ) ] = {
> + [ C(RESULT_ACCESS) ] = SLM_DMND_READ|SLM_LLC_ACCESS,
> + [ C(RESULT_MISS)   ] = SLM_DMND_READ|SLM_LLC_MISS,
> + },
> + [ C(OP_WRITE) ] = {
> + [ C(RESULT_ACCESS) ] = SLM_DMND_WRITE|SLM_LLC_ACCESS,
> + [ C(RESULT_MISS)   ] = SLM_DMND_WRITE|SLM_LLC_MISS,
> + },
> + [ C(OP_PREFETCH) ] = {
> + [ C(RESULT_ACCESS) ] = SLM_DMND_PREFETCH|SLM_LLC_ACCESS,
> + [ C(RESULT_MISS)   ] = SLM_DMND_PREFETCH|SLM_LLC_MISS,
> + },
> + },
> +};
> +
> +static __initconst const u64 slm_hw_cache_event_ids
> + [PERF_COUNT_HW_CACHE_MAX]
> + [PERF_COUNT_HW_CACHE_OP_MAX]
> + [PERF_COUNT_HW_CACHE_RESULT_MAX] =
> +{
> + [ C(L1D) ] = {
> + [ C(OP_READ) ] = {
> + [ C(RESULT_ACCESS) ] = 0,
> + [ C(RESULT_MISS)   ] = 0x0104, /* LD_DCU_MISS */
> + },
> + [ C(OP_WRITE) ] = {
> + [ C(RESULT_ACCESS) ] = 0,
> + [ C(RESULT_MISS)   ] = 0,
> + },
> + [ C(OP_PREFETCH) ] = {
> + [ C(RESULT_ACCESS) ] = 0,
> + [ C(RESULT_MISS)   ] = 0,
> + },
> + },
> + [ C(L1I ) ] = {
> + [ C(OP_READ) ] = {
> + [ C(RESULT_ACCESS) ] = 0x0380, /* ICACHE.ACCESSES */
> + [ C(RESULT_MISS)   ] = 0x0280, /* ICACHE.MISSES */
> + },
> + [ C(OP_WRITE) ] = {
> + [ C(RESULT_ACCESS) ] = -1,
> + [ C(RESULT_MISS)   ] = -1,
> + },
> + [ C(OP_PREFETCH) ] = {
> + [ C(RESULT_ACCESS) ] = 0,
> + [ C(RESULT_MISS)   ] = 0,
> + },
> + },
> + [ C(LL  ) ] = {
> + [ C(OP_READ) ] = {
> + /* OFFCORE_RESPONSE.ANY_DATA.LOCAL_CACHE */
> + [ C(RESULT_ACCESS) ] = 0x01b7,
> + /* OFFCORE_RESPONSE.ANY_DATA.ANY_LLC_MISS */
> + [ C(RESULT_MISS)   ] = 0x01b7,
> + },
> + [ C(OP_WRITE) ] = {
> + /* OFFCORE_RESPONSE.ANY_RFO.LOCAL_CACHE */
> + [ C(RESULT_ACCESS) ] = 0x01b7,
> + /* OFFCORE_RESPONSE.ANY_RFO.ANY_LLC_MISS */
> + [ C(RESULT_MISS)   ] = 0x01b7,
> + },
> + [ C(OP_PREFETCH) ] = {
> + /* OFFCORE_RESPONSE.PREFETCH.LOCAL_CACHE */
> + [ C(RESULT_ACCESS) ] = 0x01b7,
> + /* OFFCORE_RESPONSE.PREFETCH.ANY_LLC_MISS */
> + [ C(RESULT_MISS)   ] = 0x01b7,
> + },
> + },
> + [ C(DTLB) ] = {
> + [

Re: [PATCH 1/2] perf, x86: use INTEL_UEVENT_EXTRA_REG to define MSR_OFFCORE_RSP_X

2013-08-18 Thread Yan, Zheng

ping

On 07/18/2013 05:02 PM, Yan, Zheng wrote:
> From: "Yan, Zheng" 
> 
> Silvermont (22nm Atom) has two offcore response configuration MSRs;
> unlike other Intel CPUs, its event code for MSR_OFFCORE_RSP_1 is 0x02b7.
> To avoid complicating intel_fixup_er(), use INTEL_UEVENT_EXTRA_REG to
> define MSR_OFFCORE_RSP_X, so that intel_fixup_er() can find the event code
> for OFFCORE_RSP_N by x86_pmu.extra_regs[N].event.
> 
> Signed-off-by: Yan, Zheng 
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c | 22 +-
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index fbc9210..d312edf 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -81,7 +81,8 @@ static struct event_constraint intel_nehalem_event_constraints[] __read_mostly =
>  
>  static struct extra_reg intel_nehalem_extra_regs[] __read_mostly =
>  {
> - INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0x, RSP_0),
> + /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
> + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x, RSP_0),
>   INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x100b),
>   EVENT_EXTRA_END
>  };
> @@ -143,8 +144,9 @@ static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
>  
>  static struct extra_reg intel_westmere_extra_regs[] __read_mostly =
>  {
> - INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0x, RSP_0),
> - INTEL_EVENT_EXTRA_REG(0xbb, MSR_OFFCORE_RSP_1, 0x, RSP_1),
> + /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
> + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x, RSP_0),
> + INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x, RSP_1),
>   INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x100b),
>   EVENT_EXTRA_END
>  };
> @@ -163,15 +165,17 @@ static struct event_constraint intel_gen_event_constraints[] __read_mostly =
>  };
>  
>  static struct extra_reg intel_snb_extra_regs[] __read_mostly = {
> - INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0x3f807f8fffull, RSP_0),
> - INTEL_EVENT_EXTRA_REG(0xbb, MSR_OFFCORE_RSP_1, 0x3f807f8fffull, RSP_1),
> + /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
> + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x3f807f8fffull, RSP_0),
> + INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x3f807f8fffull, RSP_1),
>   INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
>   EVENT_EXTRA_END
>  };
>  
>  static struct extra_reg intel_snbep_extra_regs[] __read_mostly = {
> - INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0x3f8fffull, RSP_0),
> - INTEL_EVENT_EXTRA_REG(0xbb, MSR_OFFCORE_RSP_1, 0x3f8fffull, RSP_1),
> + /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
> + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x3f8fffull, RSP_0),
> + INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x3f8fffull, RSP_1),
>   INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
>   EVENT_EXTRA_END
>  };
> @@ -1301,11 +1305,11 @@ static void intel_fixup_er(struct perf_event *event, int idx)
>  
>   if (idx == EXTRA_REG_RSP_0) {
>   event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
> - event->hw.config |= 0x01b7;
> + event->hw.config |= x86_pmu.extra_regs[EXTRA_REG_RSP_0].event;
>   event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
>   } else if (idx == EXTRA_REG_RSP_1) {
>   event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
> - event->hw.config |= 0x01bb;
> + event->hw.config |= x86_pmu.extra_regs[EXTRA_REG_RSP_1].event;
>   event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
>   }
>  }
> 
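
The table-lookup idea in the quoted patch can be sketched in standalone C
roughly as follows.  Everything prefixed demo_ or DEMO_ is a simplified,
hypothetical stand-in (including the 0xffff event mask and the placeholder
msr ids), not the kernel's actual definitions.

```c
/* Hypothetical index type: RSP_0 and RSP_1 double as table positions. */
enum demo_extra_reg_idx { DEMO_RSP_0 = 0, DEMO_RSP_1 = 1 };

struct demo_extra_reg {
	unsigned int event;	/* full event code, umask included */
	unsigned int msr;	/* placeholder id, not a real MSR address */
};

/* The OFFCORE_RSP entries must come first so extra_regs[N] is RSP_N. */
static const struct demo_extra_reg demo_snb_regs[] = {
	{ 0x01b7, 0 },
	{ 0x01bb, 1 },
};

/* Silvermont-style table: RSP_1 uses 0x02b7 rather than 0x01bb. */
static const struct demo_extra_reg demo_slm_regs[] = {
	{ 0x01b7, 0 },
	{ 0x02b7, 1 },
};

#define DEMO_EVENT_MASK 0xffffu	/* assumed event+umask field */

/* Replace the event code in 'config' with the one for the chosen register. */
static unsigned int demo_fixup_config(unsigned int config,
				      const struct demo_extra_reg *regs,
				      enum demo_extra_reg_idx idx)
{
	config &= ~DEMO_EVENT_MASK;
	config |= regs[idx].event;
	return config;
}
```

Because the event code comes from the per-CPU-model table, the fixup
function needs no model-specific branches; only the table differs.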

Re: [PATCH tip/core/rcu 4/9] nohz_full: Add rcu_dyntick data for scalable detection of all-idle state

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 08:02:34PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:49:39PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > This commit adds fields to the rcu_dyntick structure that are used to
> > detect idle CPUs.  These new fields differ from the existing ones in
> > that the existing ones consider a CPU executing in user mode to be idle,
> > where the new ones consider CPUs executing in user mode to be busy.
> > The handling of these new fields is otherwise quite similar to that for
> > > the existing fields.  This commit also adds the initialization required
> > for these fields.
> > 
> > So, why is usermode execution treated differently, with RCU considering
> > it a quiescent state equivalent to idle, while in contrast the new
> > full-system idle state detection considers usermode execution to be
> > non-idle?
> > 
> > It turns out that although one of RCU's quiescent states is usermode
> > execution, it is not a full-system idle state.  This is because the
> > purpose of the full-system idle state is not RCU, but rather determining
> > when accurate timekeeping can safely be disabled.  Whenever accurate
> > timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one
> > CPU must keep the scheduling-clock tick going.  If even one CPU is
> > > executing in user mode, accurate timekeeping is required, particularly for
> > architectures where gettimeofday() and friends do not enter the kernel.
> > Only when all CPUs are really and truly idle can accurate timekeeping be
> > disabled, allowing all CPUs to turn off the scheduling clock interrupt,
> > thus greatly improving energy efficiency.
> > 
> > This naturally raises the question "Why is this code in RCU rather than in
> > timekeeping?", and the answer is that RCU has the data and infrastructure
> > to efficiently make this determination.
> > 
> > Signed-off-by: Paul E. McKenney 
> > Acked-by: Frederic Weisbecker 
> > Cc: Steven Rostedt 
> 
> One comment below.  With that change:
> Reviewed-by: Josh Triplett 
> 
> > +#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> > +
> > +/*
> > + * Initialize dynticks sysidle state for CPUs coming online.
> > + */
> > +static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> > +{
> > +   rdtp->dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE;
> > +}
> > +
> > +#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> > +
> > +static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> > +{
> > +}
> > +
> > +#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> 
> Just move the ifdef around the function body:
> 
> static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> {
> #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
>   rdtp->dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE;
> #endif /* CONFIG_NO_HZ_FULL_SYSIDLE */
> }

This makes sense for this isolated function, and it would also
make sense if the end result had only functions that were exported.
But if I try to apply this to the result, I will end up with something
like the following.  Is that really what you want?

I suppose I could individually enclose whole functions whose definitions
are unneeded for CONFIG_NO_HZ_FULL_SYSIDLE=n, but that doesn't seem
helpful either.

Thoughts?

Thanx, Paul


/*
 * Invoked to note exit from irq or task transition to idle.  Note that
 * usermode execution does -not- count as idle here!  After all, we want
 * to detect full-system idle states, not RCU quiescent states and grace
 * periods.  The caller must have disabled interrupts.
 */
static void rcu_sysidle_enter(struct rcu_dynticks *rdtp, int irq)
{
#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
	unsigned long j;

	/* Adjust nesting, check for fully idle. */
	if (irq) {
		rdtp->dynticks_idle_nesting--;
		WARN_ON_ONCE(rdtp->dynticks_idle_nesting < 0);
		if (rdtp->dynticks_idle_nesting != 0)
			return;  /* Still not fully idle. */
	} else {
		if ((rdtp->dynticks_idle_nesting & DYNTICK_TASK_NEST_MASK) ==
		    DYNTICK_TASK_NEST_VALUE) {
			rdtp->dynticks_idle_nesting = 0;
		} else {
			rdtp->dynticks_idle_nesting -= DYNTICK_TASK_NEST_VALUE;
			WARN_ON_ONCE(rdtp->dynticks_idle_nesting < 0);
			return;  /* Still not fully idle. */
		}
	}

	/* Record start of fully idle period. */
	j = jiffies;
	ACCESS_ONCE(rdtp->dynticks_idle_jiffies) = j;
	smp_mb__before_atomic_inc();
	atomic_inc(&rdtp->dynticks_idle);
	smp_mb__after_atomic_inc();
	WARN_ON_ONCE(atomic_read(&rdtp->dynticks_idle) & 0x1);
#endif /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
}

#ifdef CONFIG_NO_HZ_FULL_SYSIDLE

/*
 * Unconditionally force exit from full system-idle state.  This is
 * invoked when a normal CPU exits idle, but must be called separately
 * for

Re: [PATCH tip/core/rcu 11/11] jiffies: Avoid undefined behavior from signed overflow

2013-08-18 Thread Josh Triplett

On Sun, Aug 18, 2013 at 05:41:20PM -0700, Paul E. McKenney wrote:
> On Sat, Aug 17, 2013 at 08:23:51PM -0700, Josh Triplett wrote:
> > On Sat, Aug 17, 2013 at 06:37:56PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" 
> > > 
> > > According to the C standard 3.4.3p3, overflow of a signed integer results
> > > in undefined behavior.  This commit therefore changes the definitions
> > > of time_after(), time_after_eq(), time_after64(), and time_after_eq64()
> > > to avoid this undefined behavior.  The trick is that the subtraction
> > > is done using unsigned arithmetic, which according to 6.2.5p9 cannot
> > > overflow because it is defined as modulo arithmetic.  This has the added
> > > (though admittedly quite small) benefit of shortening two lines of code
> > > by four characters each.
> > > 
> > > Note that the C standard considers the cast from unsigned to
> > > signed to be implementation-defined, see 6.3.1.3p3.  However, on a
> > > two-complement system, an implementation that defines anything other
> > > than a reinterpretation of the bits is free come to me, and I will be
> > 
> > s/free come/free to come/
> 
> Good catch, fixed!

Just realized when looking at this again that there's another typo:
"two-complement" should be "two's-complement".
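
For reference, the unsigned-subtraction trick the commit message describes
can be sketched in userspace C like this.  demo_time_after() is a
hypothetical stand-in written for illustration, not the kernel macro, and
it relies on the implementation-defined unsigned-to-signed conversion
behaving as a bit reinterpretation, as discussed in the commit message.

```c
/*
 * Hypothetical stand-in for time_after(a, b): nonzero if time a is after
 * time b, even when the counter has wrapped in between.
 */
static int demo_time_after(unsigned long a, unsigned long b)
{
	/*
	 * The subtraction happens in unsigned arithmetic, which is defined
	 * modulo 2^N and so cannot overflow.  Only the final difference is
	 * converted to a signed type for the sign test.
	 */
	return (long)(b - a) < 0;
}
```

For example, with b just below the wrap point (ULONG_MAX - 5) and a just
past it (4), b - a wraps to a value whose signed interpretation is -10, so
a is correctly reported as being after b.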

>   Thanx, Paul
> 
> > > happy to act as a witness for its being committed to an insane asylum.
> > 
> > With the typo above fixed:
> > Reviewed-by: Josh Triplett 
> > 
> 

Re: [patch v2 3/3] mm: page_alloc: fair zone allocator policy

2013-08-18 Thread Stephen Rothwell

Hi all,

On Fri, 16 Aug 2013 14:52:11 -0700 Kevin Hilman  wrote:
>
> Johannes Weiner  writes:
> 
> > On Fri, Aug 16, 2013 at 10:17:01AM -0700, Kevin Hilman wrote:
> >> Johannes Weiner  writes:
> >> > On Wed, Aug 07, 2013 at 11:37:43AM -0400, Johannes Weiner wrote:
> >> > Subject: [patch] mm: page_alloc: use vmstats for fair zone allocation batching
> >> >
> >> > Avoid dirtying the same cache line with every single page allocation
> >> > by making the fair per-zone allocation batch a vmstat item, which will
> >> > turn it into batched percpu counters on SMP.
> >> >
> >> > Signed-off-by: Johannes Weiner 
> >> 
> >> I bisected several boot failures on various ARM platform in
> >> next-20130816 down to this patch (commit 67131f9837 in linux-next.)
> >> 
> >> Simply reverting it got things booting again on top of -next.  Example
> >> boot crash below.
> >
> > Thanks for the bisect and report!
> 
> You're welcome.  Thanks for the quick fix!
> 
> > I deref the percpu pointers before initializing them properly.  It
> > didn't trigger on x86 because the percpu offset added to the pointer
> > is big enough so that it does not fall into PFN 0, but it probably
> > ended up corrupting something...
> >
> > Could you try this patch on top of linux-next instead of the revert?
> 
> Yup, that change fixes it.
> 
> Tested-by: Kevin Hilman 

> Tested-by: Stephen Warren 

I will add that into the akpm-current tree in linux-next today (unless
Andrew releases a new mmotm in the mean time).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



Re: [PATCH tip/core/rcu 11/11] jiffies: Avoid undefined behavior from signed overflow

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 08:23:51PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:37:56PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > According to the C standard 3.4.3p3, overflow of a signed integer results
> > in undefined behavior.  This commit therefore changes the definitions
> > of time_after(), time_after_eq(), time_after64(), and time_after_eq64()
> > to avoid this undefined behavior.  The trick is that the subtraction
> > is done using unsigned arithmetic, which according to 6.2.5p9 cannot
> > overflow because it is defined as modulo arithmetic.  This has the added
> > (though admittedly quite small) benefit of shortening two lines of code
> > by four characters each.
> > 
> > Note that the C standard considers the cast from unsigned to
> > signed to be implementation-defined, see 6.3.1.3p3.  However, on a
> > two-complement system, an implementation that defines anything other
> > than a reinterpretation of the bits is free come to me, and I will be
> 
> s/free come/free to come/

Good catch, fixed!

Thanx, Paul

> > happy to act as a witness for its being committed to an insane asylum.
> 
> With the typo above fixed:
> Reviewed-by: Josh Triplett 
> 

Re: [PATCH tip/core/rcu 2/3] rcu: Update RTFP documentation

2013-08-18 Thread Josh Triplett

On Sun, Aug 18, 2013 at 05:20:02PM -0700, Paul E. McKenney wrote:
> On Sat, Aug 17, 2013 at 07:46:30PM -0700, Josh Triplett wrote:
> > On Sat, Aug 17, 2013 at 06:25:52PM -0700, Paul E. McKenney wrote:
> > > +In 2012, Josh Triplett received his Ph.D. with his dissertation
> > > +covering RCU-protected resizable hash tables and the relationship
> > > +between memory barriers and read-side traversal order:  If the updater
> > > +is making changes in the opposite direction from the read-side traversal
> > > +order, the updater need only execute a memory-barrier instruction,
> > > +but if in the same direction, the updater needs to wait for a grace
> > > +period between the individual updates [JoshTriplettPhD].  Also in 2012,
> > 
> > :)
> > 
> > > +after seventeen years of attempts, an RCU paper made it into a top-flight
> > > +academic journal, IEEE Transactions on Parallel and Distributed Systems
> > > +[MathieuDesnoyers2012URCU].  A group of researchers in Spain applied
> > 
> > What about the 2010 paper in Operating Systems Review?
> 
> It is already there, but not visible in this patch:
> 
>   2010 produced a simpler preemptible-RCU implementation
>   based on TREE_RCU [PaulEMcKenney2010SimpleOptRCU], lockdep-RCU
>   [PaulEMcKenney2010LockdepRCU], another resizeable RCU-protected hash
>   table [HerbertXu2010RCUResizeHash] (this one consuming more memory,
>   but allowing arbitrary changes in hash function, as required for DoS
>   avoidance in the networking code), realization of the 2009 RCU-protected
>   hash table with atomic node move [JoshTriplett2010RPHash], an update on
>   the RCU API [PaulEMcKenney2010RCUAPI].
> 
> And:
> 
>   @article{JoshTriplett2010RPHash
>   ,author="Josh Triplett and Paul E. McKenney and Jonathan Walpole"
>   ,title="Scalable Concurrent Hash Tables via Relativistic Programming"
>   ,journal="ACM Operating Systems Review"
>   ,year=2010
>   ,volume=44
>   ,number=3
>   ,month="July"
>   ,annotation={
>   RP fun with hash tables.
>   http://portal.acm.org/citation.cfm?id=1842733.1842750
>   }

Right, I saw it in the file when I checked; I meant, that journal paper
seems to contradict "after seventeen years of attempts, an RCU paper
made it into a top-flight academic journal". :)

> > > +,day = {25}
> > > +,doi = {10.1007/s11227-012-0766-x}
> > > +,issn = {0920-8542}
> > > +,journal = {The Journal of Supercomputing}
> > > +,keywords = {linux, simulation}
> > > +,month = apr
> > > +,posted-at = {2012-05-03 09:12:04}
> > > +,priority = {2}
> > > +,title = {{A Read-Copy Update based parallel server for distributed crowd simulations}}
> > > +,url = {http://dx.doi.org/10.1007/s11227-012-0766-x}
> > > +,year = {2012}
> > > +}
> > > +
> > > +
> > > +@unpublished{JonCorbet2012ACCESS:ONCE
> > 
> > LWN is not "unpublished"; it's at least "misc", and I'd suggest
> > "article".  Ditto for every other LWN cite in this bibliography.
> 
> There does seem to be a diverse set of advice out there, with some
> agreeing with you on "misc", others advocating for "electronic", and
> still others suggesting use of LaBibTex with its "online" tag, and with
> the Tex Frequently Asked Questions page saying:
> 
>   There is no citation type for URLs, per se, in the standard
>   BibTeX styles, though Oren Patashnik (the author of BibTeX)
>   is believed to be considering developing one such for use with
>   the long-awaited BibTeX version 1.0.
> 
> I couldn't find any online .bib files with entries for Linux Weekly News
> articles.  Other than my own, of course!  (I know people have cited
> them in papers, but Google doesn't see the corresponding .bib files.)
> 
> Given all that, I am going to stick with "unpublished" for the moment,
> and wait at least one year to see if BibTex version 1.0 comes out.

Several different tags make sense, but "unpublished" isn't one of them.
"unpublished" exists for entirely un-reviewed works such as self-hosted
PDFs.  LWN has editorial standards.  Thus, of the standard tags that
work with all BibTeX styles, I think either "article" or "misc" would
make more sense than "unpublished".

An example from one of my own .bib files:

@article{tiny-rcu-lwn,
author = "Paul E. McKenney",
title = {{RCU: The Bloatwatch Edition}},
journal = "Linux Weekly News",
month = "March",
year = "2009",
day = "17",
url = {https://lwn.net/Articles/323929/}
}

(With the obvious change that since you don't use "url" in your .bib
files, that should go in "howpublished" or "note" instead.)

- Josh Triplett

Re: [PATCH tip/core/rcu 06/11] rcu: Switch to expedited grace periods for suspend as well as hibernation

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 08:20:09PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:37:51PM -0700, Paul E. McKenney wrote:
> > From: Bjørn Mork 
> > 
> > Commit 587ff2cf ("rcu: Expedite grace periods during suspend/resume")
> > enabled expedited grace periods for hibernation, but not for suspend.
> > The same issue applies to both cases, so this commit simply applies the
> > same logic by adding additional cases to the switch statement.
> > 
> > Note that this commit also switches from PM_POST_RESTORE to the
> > combination of PM_POST_HIBERNATION and PM_POST_SUSPEND.  A separate
> > patch from Borislav Petkov corrects the documentation to indicate that
> > this is necessary.
> > 
> > Signed-off-by: Bjørn Mork 
> > Signed-off-by: Paul E. McKenney 
> 
> Please squash this together with the other two relevant patches in this
> series.

Done!  (Of course, retaining both Borislav's and Bjørn's Signed-off-bys.)

Thanx, Paul


Re: [PATCH tip/core/rcu 02/11] rcu: Expedite during suspend and resume only on smallish systems

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 08:18:32PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:37:47PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > Expedited grace periods are of dubious benefit on very large systems,
> > so this commit restricts their automated use during suspend and resume
> > to systems of 256 or fewer CPUs.
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> This seems odd.  If expedited grace periods don't help on large systems,
> shouldn't you just compile them out entirely and ignore rcu_expedited,
> rather than just in this one special case?

Longer term, a bunch of stuff will need optimization for larger systems.
Expedited RCU grace periods, rcu_barrier(), possibly even grace-period
initialization and cleanup.  Also smp_call_function(), along with other
primitives that have broadcast semantics.  These changes would allow
this hack to be removed.

The most straightforward approach would be to provision helper kthreads,
increasing the number of helper kthreads with the number of CPUs.  But
I would like to see actual problems due to the lack of scalability before
adding more complexity to RCU.  ;-)

> In any case, if this patch still makes sense, please squash it into the
> previous one.

Fair enough, done!

Thanx, Paul
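
[Editor's sketch] The behaviour discussed in this thread and the two related ones (expedite on suspend and hibernation, return to normal afterwards, and only on systems of 256 or fewer CPUs) can be modelled in user space roughly as follows. This is a simplified model, not the kernel code: the enum values are arbitrary stand-ins for the PM notifier events named in the commit logs, and struct rcu_model, EXPEDITE_MAX_CPUS and rcu_model_pm_notify() are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/*
 * Stand-ins for the PM notifier events named in the thread. The names
 * mirror the kernel constants, but the values here are arbitrary.
 */
enum pm_event {
	PM_HIBERNATION_PREPARE,
	PM_POST_HIBERNATION,
	PM_SUSPEND_PREPARE,
	PM_POST_SUSPEND,
	PM_OTHER,
};

/* Hypothetical name for the "256 or fewer CPUs" cutoff. */
#define EXPEDITE_MAX_CPUS 256

struct rcu_model {
	unsigned int nr_cpus;
	bool expedited;		/* models the rcu_expedited setting */
};

/* Model of a PM notifier callback that toggles expedited grace periods. */
static void rcu_model_pm_notify(struct rcu_model *rm, enum pm_event ev)
{
	switch (ev) {
	case PM_HIBERNATION_PREPARE:
	case PM_SUSPEND_PREPARE:
		/* Expedite only on smallish systems. */
		if (rm->nr_cpus <= EXPEDITE_MAX_CPUS)
			rm->expedited = true;
		break;
	case PM_POST_HIBERNATION:
	case PM_POST_SUSPEND:
		/* Back to normal grace periods once the transition is over. */
		rm->expedited = false;
		break;
	default:
		break;
	}
}
```

Note that this model clears the flag unconditionally on the POST events, so it would also override a value set by other means before the transition. Whether that matters is outside what this thread settles.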


Re: [PATCH tip/core/rcu 01/11] rcu: Expedite grace periods during suspend/resume

2013-08-18 Thread Paul E. McKenney

On Sun, Aug 18, 2013 at 11:34:44AM +0200, Borislav Petkov wrote:
> On Sat, Aug 17, 2013 at 08:17:17PM -0700, Josh Triplett wrote:
> > On Sat, Aug 17, 2013 at 06:37:46PM -0700, Paul E. McKenney wrote:
> > > From: Borislav Petkov 
> > > 
> > > CONFIG_RCU_FAST_NO_HZ can increase grace-period durations by up to
> > > a factor of four, which can result in long suspend and resume times.
> > > Thus, this commit temporarily switches to expedited grace periods when
> > > > suspending the box and returns to normal settings when resuming.
> > > 
> > > [ paulmck: This also papers over an audio/irq bug, but hopefully that will
> > >   be fixed soon. ]
> > > 
> > > Signed-off-by: Borislav Petkov 
> > > Signed-off-by: Paul E. McKenney 
> > 
> > This patch still seems like a hack, and there *ought* to be a better
> > general solution to avoid excessive grace-period latency.  Nonetheless,
> > in the absence of such a solution,
> > Reviewed-by: Josh Triplett 
> 
> Yeah, I'm not happy about it either but from quickly skimming over what
> context we're using rcu_expedited in, the basic requirement for a fix is
> for the pm core to be able to tell rcu not to stretch grace periods.
> 
> So simply setting a variable is much simpler than switching to calling
> all those *_expedited() rcu flavors from the pm notifier when going
> down.
> 
> Unless Paul has a better idea, of course.
> 
> The basic problem here is, however, that you need to temporarily
> reconfigure the inner workings of a subsystem because hardware is
> performing a power state transition. Unless someone teaches rcu about
> power state transitions... :-)

I would guess that once we have a few more subsystems that want RCU to
temporarily expedite its grace periods, we will have enough information
and experience to do something a bit more general.  In the meantime,
I am opting for something simple.  ;-)

Thanx, Paul


Re: [PATCH tip/core/rcu 2/3] rcu: Update RTFP documentation

2013-08-18 Thread Paul E. McKenney

On Sat, Aug 17, 2013 at 07:46:30PM -0700, Josh Triplett wrote:
> On Sat, Aug 17, 2013 at 06:25:52PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> 
> Could you mention the BibTeX formatting changes (and the rationale for
> them) in the commit message, please?

Good point, done.  Short version: For compatibility with my .bib files,
some of which predate bibtex's ability to handle trailing commas.

> > Signed-off-by: Paul E. McKenney 
> 
> A couple of comments below (and above); with those fixed,
> Reviewed-by: Josh Triplett 

I took all but the conversion of the LWN article entry tags, please
see below.

Thanx, Paul

> > ---
> >  Documentation/RCU/RTFP.txt | 855 
> > +
> >  1 file changed, 550 insertions(+), 305 deletions(-)
> > 
> > diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt
> > index 7f40c72..350be9a 100644
> > --- a/Documentation/RCU/RTFP.txt
> > +++ b/Documentation/RCU/RTFP.txt
> > @@ -39,7 +39,7 @@ in read-mostly situations.  This algorithm does take pains to avoid
> >  write-side contention and parallelize the other write-side overheads by
> >  providing a fine-grained locking design, however, it would be interesting
> >  to see how much of the performance advantage reported in 1990 remains
> > -in 2004.
> > +in 2004 (to say nothing of 2013).
> 
> In lieu of updating this again in 9 years, s/2004.*/today./ please?

Fair point, fixed.

> > +In 2012, Josh Triplett received his Ph.D. with his dissertation
> > +covering RCU-protected resizable hash tables and the relationship
> > +between memory barriers and read-side traversal order:  If the updater
> > +is making changes in the opposite direction from the read-side traversal
> > +order, the updater need only execute a memory-barrier instruction,
> > +but if in the same direction, the updater needs to wait for a grace
> > +period between the individual updates [JoshTriplettPhD].  Also in 2012,
> 
> :)
> 
> > +after seventeen years of attempts, an RCU paper made it into a top-flight
> > +academic journal, IEEE Transactions on Parallel and Distributed Systems
> > +[MathieuDesnoyers2012URCU].  A group of researchers in Spain applied
> 
> What about the 2010 paper in Operating Systems Review?

It is already there, but not visible in this patch:

2010 produced a simpler preemptible-RCU implementation
based on TREE_RCU [PaulEMcKenney2010SimpleOptRCU], lockdep-RCU
[PaulEMcKenney2010LockdepRCU], another resizeable RCU-protected hash
table [HerbertXu2010RCUResizeHash] (this one consuming more memory,
but allowing arbitrary changes in hash function, as required for DoS
avoidance in the networking code), realization of the 2009 RCU-protected
hash table with atomic node move [JoshTriplett2010RPHash], an update on
the RCU API [PaulEMcKenney2010RCUAPI].

And:

@article{JoshTriplett2010RPHash
,author="Josh Triplett and Paul E. McKenney and Jonathan Walpole"
,title="Scalable Concurrent Hash Tables via Relativistic Programming"
,journal="ACM Operating Systems Review"
,year=2010
,volume=44
,number=3
,month="July"
,annotation={
RP fun with hash tables.
http://portal.acm.org/citation.cfm?id=1842733.1842750
}
}

> > +user-level RCU to crowd simulation [GuillermoVigueras2012RCUCrowd], and
> > +another group of researchers in Europe produced a formal description of
> > +RCU based on separation logic [AlexeyGotsman2012VerifyGraceExtended],
> 
> Oh, interesting, I hadn't seen that one yet.

I am hoping that it will lead to better tools to analyze both RCU
implementations and RCU uses.  ;-)

> > +@phdthesis{JoshTriplettPhD
> > +,author="Josh Triplett"
> > +,title="Relativistic Causal Ordering A Memory Model for Scalable Concurrent Data Structures"
> 
> "A Memory Model ..." is a subtitle, so there should be either a : or a
> --- after "Ordering".

Got it, fixed both in RTFP.txt and in my bibtex database.

> > +,school="Portland State University"
> > +,year="2012"
> > +,annotation={
> > +   RCU-protected hash tables, barriers vs. read-side traversal order.
> > +}
> 
> Consider duplicating the summary above about traversal order versus
> update order, and memory barriers versus grace periods, into the
> annotation.

Done!

> > +@article{GuillermoVigueras2012RCUCrowd
> > +,author = {Vigueras, Guillermo and Ordu\~{n}a, Juan M. and Lozano, Miguel}
> > +,citeulike-article-id = {10632151}
> > +,citeulike-linkout-0 = {http://dx.doi.org/10.1007/s11227-012-0766-x}
> > +,citeulike-linkout-1 = {http://www.springerlink.com/content/25762r0874163570}
> 
> Consider dropping these non-standard BibTeX tags; nothing will ever
> format them, and you already have the doi below.

Done.

> > +,day = {25}
> > +,doi = {10.1007/s11227-012-0766-x}
> > +,issn = {0920-8542}
> >

Re: please fetch new git-based DM 'for-next' branch [was: Re: [git pull] device-mapper changes for 3.11]

2013-08-18 Thread Stephen Rothwell

Hi Mike, Alasdair,

On Fri, 16 Aug 2013 17:18:10 -0400 Mike Snitzer  wrote:
>
> To access all future DM changes for linux-next, please switch from
> Alasdair's quilt tree to using the 'for-next' branch of the DM git repo:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git

Done.  You are both listed as contacts for problems with that tree.

Thanks.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



[PATCH-v2 3/4] iscsi-target: Add login negotiation multi-plexing support

2013-08-18 Thread Nicholas A. Bellinger

From: Nicholas Bellinger 

This patch adds support for login negotiation multi-plexing in
iscsi-target code.

This involves handling the first login request PDU + payload and
login response PDU + payload within __iscsi_target_login_thread()
process context, and then changing struct sock->sk_data_ready()
so that all subsequent exchanges are handled by workqueue process
context, to allow other incoming login requests to be received
in parallel by __iscsi_target_login_thread().

Upon login negotiation completion (or failure), ->sk_data_ready()
is replaced with the original kernel sockets handler saved in
iscsi_conn->orig_data_ready.

v2 changes:
  - Add login_timer in iscsi_target_do_login_rx() to avoid
possible endless sleep with MSG_WAITALL for traditional
iscsi-target in certain network configurations.
  - Convert lprintk() -> pr_debug()
  - Remove forward declarations of iscsi_target_set_sock_callbacks(),
iscsi_target_restore_sock_callbacks() and iscsi_target_sk_data_ready()
  - Make iscsi_target_set_sock_callbacks + iscsi_target_restore_sock_callbacks()
static (Fengguang)
  - Make iscsi_target_do_login_rx() safe for iser-target w/o conn->sock

Signed-off-by: Nicholas Bellinger 
---
 drivers/target/iscsi/iscsi_target_core.h |1 +
 drivers/target/iscsi/iscsi_target_nego.c |  209 -
 drivers/target/iscsi/iscsi_target_tpg.c  |7 +-
 drivers/target/iscsi/iscsi_target_tpg.h  |2 +-
 4 files changed, 209 insertions(+), 10 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h
index 711a028..8a4c32d 100644
--- a/drivers/target/iscsi/iscsi_target_core.h
+++ b/drivers/target/iscsi/iscsi_target_core.h
@@ -562,6 +562,7 @@ struct iscsi_conn {
struct timer_list   nopin_timer;
struct timer_list   nopin_response_timer;
struct timer_list   transport_timer;
+   struct task_struct  *login_kworker;
/* Spinlock used for add/deleting cmd's from conn_cmd_list */
spinlock_t  cmd_lock;
spinlock_t  conn_usage_lock;
diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
index c4675b4..daebe32 100644
--- a/drivers/target/iscsi/iscsi_target_nego.c
+++ b/drivers/target/iscsi/iscsi_target_nego.c
@@ -377,14 +377,191 @@ static int iscsi_target_do_tx_login_io(struct iscsi_conn *conn, struct iscsi_log
return 0;
 }
 
+static void iscsi_target_sk_data_ready(struct sock *sk, int count)
+{
+	struct iscsi_conn *conn = sk->sk_user_data;
+	bool rc;
+
+	pr_debug("Entering iscsi_target_sk_data_ready: conn: %p\n", conn);
+
+	read_lock_bh(&sk->sk_callback_lock);
+	if (!sk->sk_user_data) {
+		read_unlock_bh(&sk->sk_callback_lock);
+		return;
+	}
+
+	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
+		read_unlock_bh(&sk->sk_callback_lock);
+		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p \n", conn);
+		return;
+	}
+
+	rc = schedule_delayed_work(&conn->login_work, 0);
+	if (rc == false) {
+		pr_debug("iscsi_target_sk_data_ready, schedule_delayed_work"
+			 " got false\n");
+	}
+	read_unlock_bh(&sk->sk_callback_lock);
+}
+
+static void iscsi_target_set_sock_callbacks(struct iscsi_conn *conn)
+{
+	struct sock *sk;
+
+	if (!conn->sock)
+		return;
+
+	sk = conn->sock->sk;
+	pr_debug("Entering iscsi_target_set_sock_callbacks: conn: %p\n", conn);
+
+	write_lock_bh(&sk->sk_callback_lock);
+	sk->sk_user_data = conn;
+	conn->orig_data_ready = sk->sk_data_ready;
+	sk->sk_data_ready = iscsi_target_sk_data_ready;
+	write_unlock_bh(&sk->sk_callback_lock);
+}
+
+static void iscsi_target_restore_sock_callbacks(struct iscsi_conn *conn)
+{
+	struct sock *sk;
+
+	if (!conn->sock)
+		return;
+
+	sk = conn->sock->sk;
+	pr_debug("Entering iscsi_target_restore_sock_callbacks: conn: %p\n", conn);
+
+	write_lock_bh(&sk->sk_callback_lock);
+	if (!sk->sk_user_data) {
+		write_unlock_bh(&sk->sk_callback_lock);
+		return;
+	}
+	sk->sk_user_data = NULL;
+	sk->sk_data_ready = conn->orig_data_ready;
+	write_unlock_bh(&sk->sk_callback_lock);
+}
+
+static int iscsi_target_do_login(struct iscsi_conn *, struct iscsi_login *);
+
+static bool iscsi_target_sk_state_check(struct sock *sk)
+{
+	if (sk->sk_state == TCP_CLOSE_WAIT || sk->sk_state == TCP_CLOSE) {
+		pr_debug("iscsi_target_sk_state_check: TCP_CLOSE_WAIT|TCP_CLOSE,"
+			"returning FALSE\n");
+		return false;
+	}
+	return true;
+}
+
+static void iscsi_target_login_drop(struct iscsi_conn *conn, struct iscsi_login *login)
+{
+	struct iscsi_np *np = login->np;
+	bool zero_tsih = login->zero_tsih;

[PATCH-v2 0/4] iscsi-target: Add support for login multi-plexing support

2013-08-18 Thread Nicholas A. Bellinger

From: Nicholas Bellinger 

Hi folks,

This -v2 series for v3.12-rc1 adds support for login multi-plexing,
that allows subsequent login request/response PDUs beyond the initial
exchange to be pushed off to workqueue process context, so that other
incoming login requests can be serviced in parallel.

This addresses a long-standing issue with login latency with many (100's)
of parallel login requests to the same network portal being shared
across many (100's) of TargetName+TargetPortalGroup endpoints.

Note that login negotiation to the same TargetName+TargetPortalGroup
endpoint is still synchronized in order to enforce session reinstatement
state machines.

The changes for v2 include:

  - Fix iscsit_transport reference leak during NP thread reset
  - Remove duplicate call to iscsi_post_login_handler() in
__iscsi_target_login_thread()
  - Drop unused iscsi_np->np_login_tpg
  - Add login_timer in iscsi_target_do_login_rx() to avoid
possible endless sleep with MSG_WAITALL for traditional
iscsi-target in certain network configurations.
  - Convert lprintk() -> pr_debug()
  - Remove forward declarations of iscsi_target_set_sock_callbacks(),
iscsi_target_restore_sock_callbacks() and iscsi_target_sk_data_ready()
  - Make iscsi_target_set_sock_callbacks + iscsi_target_restore_sock_callbacks()
static (Fengguang)
  - Make iscsi_target_do_login_rx() safe for iser-target w/o conn->sock
  - Updates to iser-target for login negotiation multi-plexing support

The main remaining FIXME is to keep track of connections that are pushed
out to workqueue process context for explicit network portal shutdown
purposes.

Thanks!

--nab

Nicholas Bellinger (4):
  iscsi-target: Fix iscsit_transport reference leak during NP thread
reset
  iscsi-target: Prepare login code for multi-plexing support
  iscsi-target: Add login negotiation multi-plexing support
  iser-target: Updates for login negotiation multi-plexing support

 drivers/infiniband/ulp/isert/ib_isert.c   |   17 +++-
 drivers/target/iscsi/iscsi_target.c   |   53 
 drivers/target/iscsi/iscsi_target.h   |6 +-
 drivers/target/iscsi/iscsi_target_core.h  |   12 ++-
 drivers/target/iscsi/iscsi_target_login.c |  171 +--
 drivers/target/iscsi/iscsi_target_login.h |3 +
 drivers/target/iscsi/iscsi_target_nego.c  |  209 +++-
 drivers/target/iscsi/iscsi_target_tpg.c   |   23 +++-
 drivers/target/iscsi/iscsi_target_tpg.h   |4 +-
 9 files changed, 378 insertions(+), 120 deletions(-)

-- 
1.7.2.5
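
[Editor's sketch] The 3/4 patch in this series describes saving the socket's original ->sk_data_ready() handler, installing a login-specific one that hands work to a workqueue, and restoring the original once negotiation ends. A minimal single-threaded model of that save/replace/restore pattern might look like the following. It assumes no concurrency, so the sk_callback_lock and atomic test_and_set_bit() of the real patch are reduced to plain field accesses, and every name in it (struct conn_model, login_data_ready() and so on) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct conn_model;
typedef void (*data_ready_fn)(struct conn_model *);

/* Toy connection; each field models one piece of state from the patch. */
struct conn_model {
	data_ready_fn data_ready;	/* models sk->sk_data_ready */
	data_ready_fn orig_data_ready;	/* models iscsi_conn->orig_data_ready */
	bool read_active;		/* models LOGIN_FLAGS_READ_ACTIVE */
	int work_queued;		/* hand-offs to the "workqueue" */
	int default_calls;		/* calls reaching the original handler */
};

/* Stand-in for the kernel's default socket handler. */
static void default_data_ready(struct conn_model *c)
{
	c->default_calls++;
}

/* Login-time handler: queue work once, ignore data while a read is pending. */
static void login_data_ready(struct conn_model *c)
{
	if (c->read_active)
		return;
	c->read_active = true;
	c->work_queued++;
}

/* Save the current handler and install the login-time one. */
static void set_callbacks(struct conn_model *c)
{
	c->orig_data_ready = c->data_ready;
	c->data_ready = login_data_ready;
}

/* Put the saved handler back; a no-op if nothing was saved. */
static void restore_callbacks(struct conn_model *c)
{
	if (!c->orig_data_ready)
		return;
	c->data_ready = c->orig_data_ready;
	c->orig_data_ready = NULL;
}

/* Called when the queued work has consumed the pending login PDU. */
static void login_work_done(struct conn_model *c)
{
	c->read_active = false;
}
```

In this model, two back-to-back data-ready events queue the work only once. The flag has to be cleared by the worker before another hand-off can happen, which is the role the test-and-set on LOGIN_FLAGS_READ_ACTIVE appears to play in the patch.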


[PATCH 4/4] iser-target: Updates for login negotiation multi-plexing support

2013-08-18 Thread Nicholas A. Bellinger

From: Nicholas Bellinger 

This patch updates iser-target code to support login negotiation
multi-plexing.  This includes only using isert_conn->conn_login_comp
for the first login request PDU, pushing the subsequent processing
to iscsi_conn->login_work -> iscsi_target_do_login_rx(), and turning
isert_get_login_rx() into a NOP.

Signed-off-by: Nicholas Bellinger 
---
 drivers/infiniband/ulp/isert/ib_isert.c |   17 -
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 3f62041..d17ff13 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -869,7 +869,14 @@ isert_rx_login_req(struct iser_rx_desc *rx_desc, int rx_buflen,
 		 size, rx_buflen, MAX_KEY_VALUE_PAIRS);
 	memcpy(login->req_buf, &rx_desc->data[0], size);
 
-	complete(&isert_conn->conn_login_comp);
+	if (login->first_request) {
+		complete(&isert_conn->conn_login_comp);
+		return;
+	}
+	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags))
+		return;
+
+	schedule_delayed_work(&conn->login_work, 0);
 }
 
 static void
@@ -2224,6 +2231,14 @@ isert_get_login_rx(struct iscsi_conn *conn, struct iscsi_login *login)
 	int ret;
 
 	pr_debug("isert_get_login_rx before conn_login_comp conn: %p\n", conn);
+	/*
+	 * For login requests after the first PDU, isert_rx_login_req() will
+	 * kick schedule_delayed_work(&conn->login_work) as the packet is
+	 * received, which turns this callback from iscsi_target_do_login_rx()
+	 * into a NOP.
+	 */
+	if (!login->first_request)
+		return 0;
 
 	ret = wait_for_completion_interruptible(&isert_conn->conn_login_comp);
 	if (ret)
-- 
1.7.2.5


[PATCH-v2 1/4] iscsi-target: Fix iscsit_transport reference leak during NP thread reset

2013-08-18 Thread Nicholas A. Bellinger

From: Nicholas Bellinger 

This patch fixes a bug in __iscsi_target_login_thread() where an explicit
network portal thread reset ends up leaking the iscsit_transport module
reference, along with the associated iscsi_conn allocation.

This manifests itself with iser-target where a NP reset causes the extra
iscsit_transport reference to be taken in iscsit_conn_set_transport()
during the reset, which prevents the ib_isert module from being unloaded
after the NP thread shutdown has finished.

Signed-off-by: Nicholas Bellinger 
---
 drivers/target/iscsi/iscsi_target_login.c |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c
index 0e85238..4c17f83 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -1171,12 +1171,11 @@ static int __iscsi_target_login_thread(struct iscsi_np *np)
 	if (np->np_thread_state == ISCSI_NP_THREAD_RESET) {
 		spin_unlock_bh(&np->np_thread_lock);
 		complete(&np->np_restart_comp);
-		if (ret == -ENODEV) {
-			iscsit_put_transport(conn->conn_transport);
-			kfree(conn);
-			conn = NULL;
+		iscsit_put_transport(conn->conn_transport);
+		kfree(conn);
+		conn = NULL;
+		if (ret == -ENODEV)
 			goto out;
-		}
 		/* Get another socket */
 		return 1;
 	}
-- 
1.7.2.5
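
[Editor's sketch] The fix above moves the put/free out of the "ret == -ENODEV" branch so that every reset path drops the transport reference and frees the connection. A user-space model of the corrected control flow, with a plain integer standing in for the module reference count, could look like this. struct transport_model, conn_alloc() and handle_reset() are hypothetical names, and the return values 1 ("get another socket") and 0 ("stop") only mirror the shape of the patched function.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Plain counter standing in for the iscsit_transport module reference. */
struct transport_model {
	int refcount;
};

struct conn_model {
	struct transport_model *transport;
};

/* Allocating a connection takes one transport reference. */
static struct conn_model *conn_alloc(struct transport_model *t)
{
	struct conn_model *conn = malloc(sizeof(*conn));

	if (!conn)
		return NULL;
	t->refcount++;
	conn->transport = t;
	return conn;
}

/*
 * Reset path after the fix: always drop the reference and free the
 * connection, then decide whether to stop (0) or fetch another socket (1).
 */
static int handle_reset(struct conn_model **connp, bool no_device)
{
	struct conn_model *conn = *connp;

	conn->transport->refcount--;
	free(conn);
	*connp = NULL;

	return no_device ? 0 : 1;
}
```

With the pre-fix ordering, the "fetch another socket" path would return without the decrement, leaving the count above zero. That is the leak which, per the commit message, keeps ib_isert from being unloaded.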


[PATCH-v2 2/4] iscsi-target: Prepare login code for multi-plexing support

2013-08-18 Thread Nicholas A. Bellinger

From: Nicholas Bellinger 

This patch prepares the iscsi-target login code for multi-plexing
support.  This includes:

 - Adding iscsi_tpg_np->tpg_np_kref + iscsit_login_kref_put() for
   handling callback of iscsi_tpg_np->tpg_np_comp
 - Adding kref_put() in iscsit_deaccess_np()
 - Adding kref_put() and wait_for_completion() in
   iscsit_reset_np_thread()
 - Refactor login failure path release logic into
   iscsi_target_login_sess_out()
 - Update __iscsi_target_login_thread() to handle
   iscsi_post_login_handler() asynchronous completion
 - Add shutdown parameter for iscsit_clear_tpg_np_login_thread*()

v2 changes:
 - Remove duplicate call to iscsi_post_login_handler() in
   __iscsi_target_login_thread()
 - Drop unused iscsi_np->np_login_tpg

Signed-off-by: Nicholas Bellinger 
---
 drivers/target/iscsi/iscsi_target.c   |   53 +-
 drivers/target/iscsi/iscsi_target.h   |6 +-
 drivers/target/iscsi/iscsi_target_core.h  |   11 ++-
 drivers/target/iscsi/iscsi_target_login.c |  162 +
 drivers/target/iscsi/iscsi_target_login.h |3 +
 drivers/target/iscsi/iscsi_target_tpg.c   |   16 ++-
 drivers/target/iscsi/iscsi_target_tpg.h   |2 +-
 7 files changed, 149 insertions(+), 104 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index c4aeac3..7228cc7 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -220,11 +220,6 @@ int iscsit_access_np(struct iscsi_np *np, struct iscsi_portal_group *tpg)
 		spin_unlock_bh(&np->np_thread_lock);
 		return -1;
 	}
-	if (np->np_login_tpg) {
-		pr_err("np->np_login_tpg() is not NULL!\n");
-		spin_unlock_bh(&np->np_thread_lock);
-		return -1;
-	}
 	spin_unlock_bh(&np->np_thread_lock);
 	/*
 	 * Determine if the portal group is accepting storage traffic.
@@ -243,23 +238,35 @@ int iscsit_access_np(struct iscsi_np *np, struct iscsi_portal_group *tpg)
 	if ((ret != 0) || signal_pending(current))
 		return -1;
 
-	spin_lock_bh(&np->np_thread_lock);
-	np->np_login_tpg = tpg;
-	spin_unlock_bh(&np->np_thread_lock);
+	spin_lock_bh(&tpg->tpg_state_lock);
+	if (tpg->tpg_state != TPG_STATE_ACTIVE) {
+		spin_unlock_bh(&tpg->tpg_state_lock);
+		mutex_unlock(&tpg->np_login_lock);
+		return -1;
+	}
+	spin_unlock_bh(&tpg->tpg_state_lock);
 
 	return 0;
 }
 
-int iscsit_deaccess_np(struct iscsi_np *np, struct iscsi_portal_group *tpg)
+void iscsit_login_kref_put(struct kref *kref)
 {
-	struct iscsi_tiqn *tiqn = tpg->tpg_tiqn;
+	struct iscsi_tpg_np *tpg_np = container_of(kref,
+				struct iscsi_tpg_np, tpg_np_kref);
 
-	spin_lock_bh(&np->np_thread_lock);
-	np->np_login_tpg = NULL;
-	spin_unlock_bh(&np->np_thread_lock);
+	complete(&tpg_np->tpg_np_comp);
+}
+
+int iscsit_deaccess_np(struct iscsi_np *np, struct iscsi_portal_group *tpg,
+		       struct iscsi_tpg_np *tpg_np)
+{
+	struct iscsi_tiqn *tiqn = tpg->tpg_tiqn;
 
 	mutex_unlock(&tpg->np_login_lock);
 
+	if (tpg_np)
+		kref_put(&tpg_np->tpg_np_kref, iscsit_login_kref_put);
+
 	if (tiqn)
 		iscsit_put_tiqn_for_login(tiqn);
 
@@ -410,20 +417,10 @@ struct iscsi_np *iscsit_add_np(
 int iscsit_reset_np_thread(
 	struct iscsi_np *np,
 	struct iscsi_tpg_np *tpg_np,
-	struct iscsi_portal_group *tpg)
+	struct iscsi_portal_group *tpg,
+	bool shutdown)
 {
 	spin_lock_bh(&np->np_thread_lock);
-	if (tpg && tpg_np) {
-		/*
-		 * The reset operation need only be performed when the
-		 * passed struct iscsi_portal_group has a login in progress
-		 * to one of the network portals.
-		 */
-		if (tpg_np->tpg_np->np_login_tpg != tpg) {
-			spin_unlock_bh(&np->np_thread_lock);
-			return 0;
-		}
-	}
 	if (np->np_thread_state == ISCSI_NP_THREAD_INACTIVE) {
 		spin_unlock_bh(&np->np_thread_lock);
 		return 0;
@@ -438,6 +435,12 @@ int iscsit_reset_np_thread(
 	}
 	spin_unlock_bh(&np->np_thread_lock);
 
+	if (tpg_np && shutdown) {
+		kref_put(&tpg_np->tpg_np_kref, iscsit_login_kref_put);
+
+		wait_for_completion(&tpg_np->tpg_np_comp);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/target/iscsi/iscsi_target.h b/drivers/target/iscsi/iscsi_target.h
index 2c437cb..f82f627 100644
--- a/drivers/target/iscsi/iscsi_target.h
+++ b/drivers/target/iscsi/iscsi_target.h
@@ -7,13 +7,15 @@ extern void iscsit_put_tiqn_for_login(struct iscsi_tiqn *);
 extern struct iscsi_tiqn *iscsit_add_tiqn(unsigned char *);
 extern void iscsit_del_tiqn(struct iscsi_tiqn *);
 extern int iscsit_access_np(struct iscsi_np *, struct iscsi_portal_group *);

Re: [PATCH v2 5/8] drm/i2c: tda998x: add video and audio input configuration

2013-08-18 Thread Russell King - ARM Linux

On Mon, Aug 19, 2013 at 09:23:17AM +1000, Dave Airlie wrote:
> On Thu, Aug 15, 2013 at 5:43 AM, Sebastian Hesselbarth
>  wrote:
> > From: Russell King 
> >
> > This patch adds tda998x specific parameters to allow it to be configured
> > for different boards using it. Also, this implements rudimentary audio
> > support for S/PDIF attached controllers.
> >
> > Signed-off-by: Russell King 
> > Signed-off-by: Sebastian Hesselbarth 
> > Tested-by: Darren Etheridge 
> > ---
> 
> I've merged the series,

Thanks.

> this one generates a warning though:
>   CC [M]  drivers/gpu/drm/i2c/tda998x_drv.o
> /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:
> In function ‘tda998x_encoder_mode_set’:
> /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:637:11:
> warning: ‘clksel_fs’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
> /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:573:30:
> note: ‘clksel_fs’ was declared here
> /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:637:11:
> warning: ‘clksel_aip’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
> /home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:573:18:
> note: ‘clksel_aip’ was declared here
> 
> It doesn't seem like a real problem, since the function is unlikely to
> be called any way to make that case happen.

Ok, I'll squash those warnings by a slight rearrangement of the code.
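
[Editor's sketch] A -Wmaybe-uninitialized warning of this kind typically appears when locals are assigned only in some branches of a switch and read afterwards. One possible rearrangement, shown below, is to return early for unhandled cases so that every path reaching the use has assigned the values. This is not the actual tda998x fix, which is not shown in the thread. The enum, the struct and the numeric values are placeholders, not real register settings.

```c
#include <stdbool.h>

/* Hypothetical audio input formats. */
enum audio_format {
	AFMT_SPDIF,
	AFMT_I2S,
};

/* Hypothetical pair of clock selections, named after the warned-about locals. */
struct clk_sel {
	int aip;
	int fs;
};

/*
 * Fill *out for a known format and return true. For an unknown format,
 * return false and leave *out untouched, so the caller never consumes
 * values that no branch assigned.
 */
static bool pick_clocks(enum audio_format fmt, struct clk_sel *out)
{
	switch (fmt) {
	case AFMT_SPDIF:
		out->aip = 1;	/* placeholder value */
		out->fs = 2;	/* placeholder value */
		return true;
	case AFMT_I2S:
		out->aip = 3;	/* placeholder value */
		out->fs = 4;	/* placeholder value */
		return true;
	default:
		return false;
	}
}
```

A caller would check the return value before programming anything, which also makes the "unlikely to be called that way" case explicit rather than silent.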

Re: [PATCH v2 5/8] drm/i2c: tda998x: add video and audio input configuration

2013-08-18 Thread Dave Airlie

On Thu, Aug 15, 2013 at 5:43 AM, Sebastian Hesselbarth
 wrote:
> From: Russell King 
>
> This patch adds tda998x specific parameters to allow it to be configured
> for different boards using it. Also, this implements rudimentary audio
> support for S/PDIF attached controllers.
>
> Signed-off-by: Russell King 
> Signed-off-by: Sebastian Hesselbarth 
> Tested-by: Darren Etheridge 
> ---

I've merged the series,

this one generates a warning though:
  CC [M]  drivers/gpu/drm/i2c/tda998x_drv.o
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:
In function ‘tda998x_encoder_mode_set’:
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:637:11:
warning: ‘clksel_fs’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:573:30:
note: ‘clksel_fs’ was declared here
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:637:11:
warning: ‘clksel_aip’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i2c/tda998x_drv.c:573:18:
note: ‘clksel_aip’ was declared here

It doesn't seem like a real problem, since the function is unlikely to
be called any way to make that case happen.

Dave.

RE: [PATCH 01/10] leds: lp55xx: add common data structure for program

2013-08-18 Thread Kim, Milo

Hi Bryan,

> -Original Message-
> From: Bryan Wu [mailto:coolo...@gmail.com]
> Sent: Wednesday, August 14, 2013 3:56 AM
> To: Milo Kim
> Cc: Pali Rohár; Linux LED Subsystem; lkml; Kim, Milo
> Subject: Re: [PATCH 01/10] leds: lp55xx: add common data structure for
> program
> 
> On Thu, Aug 8, 2013 at 12:59 AM, Milo Kim  wrote:
> > LP55xx family devices have internal three program engines which are
> > used for loading LED patterns.
> > To maintain legacy device attributes, specific data structure is used,
> 'mode'
> > and 'led_mux'.
> > The mode is used for showing/storing current engine mode such like
> > disabled, load and run.
> > Then led_mux is used for showing/storing current output LED selection.
> > This is only for LP5523/55231.
> >
> 
> This patch looks good to me, but the commit message format is little bit odd
> to me. I will fix that and merge into my tree.

Thanks for your help.
Can I get more detailed information about this format problem?
I need to check my configurations.

Thanks,
Milo-

[no subject]

2013-08-18 Thread Subhash Deshpande




--
Please i need your help in executing an urgent business.

Linux 3.11-rc6

2013-08-18 Thread Linus Torvalds

It's been a fairly quiet week, and the rc's are definitely shrinking.
Which makes me happy.

Sure, we had an interesting range handling bug in the TLB invalidation
code, but that was an older bug and apparently really hard to hit in
practice. That said, it could explain a couple of random SIGSEGV's
etc, so if you've seen odd behavior maybe you've been hit by it, and
3.11-rc6 will fix it. Knock wood.

Other than that, it's more random changes: network drivers, usb,
sound, and a few filesystem fixes. With x86, ARM and a few small m68k
updates. But it's really been pretty quiet. The appended shortlog
gives the details for those who care.

.. and since the statistics on this rc were pretty boring, I started
looking at bigger numbers. We've now used git for over eight years,
and we have almost 400k commits in that time. That's interesting (to
me), because back in the BK days we were approaching the (back then)
limit of 65k commits in BK in the three years we used it. So we've
long since blown through that limit.

And those almost 400 thousand commits? They all fit in a 575MB
pack-file (plus a 85MB index file). Now, that's with more aggressive
packing than most people probably do, but I think it's interesting how
the last eight years of very active history ends up having almost
exactly the same size as the whole unpacked source tree. In fact, I
think that you need more free space for the object files of doing a
build than you need for remembering all that history.

I'll try to remember to do some more interesting/relevant statistics
for the rc7 release, because that should coincide with the 22nd
anniversary of the original Linux announcement on comp.os.minix.

How time flies when you're having fun..

  Linus

---

Alan Stern (1):
  USB: EHCI: accept very late isochronous URBs

Alexey Brodkin (1):
  ethernet/arc/arc_emac - fix NAPI "work > weight" warning

Alexey Kardashevskiy (1):
  Revert "cxgb3: Check and handle the dma mapping errors"

Andi Kleen (1):
  perf/x86: Add Haswell ULT model number used in Macbook Air and other systems

Andi Shyti (1):
  cifs: file: initialize oparms.reconnect before using it

Andreas Schwab (1):
  m68k: Truncate base in do_div()

Andrey Vagin (1):
  memcg: don't initialize kmem-cache destroying work for root caches

Ariel Elior (1):
  bnx2x: fix memory leak in VF

Asbjoern Sloth Toennesen (1):
  rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header

Barak Witkowsky (1):
  bnx2x: fix PTE write access error

Bartlomiej Zolnierkiewicz (1):
  stmmac: fix init_dma_desc_rings() to handle errors

Brian Austin (2):
  ASoC: cs42l52: Reorder Min/Max and update to SX_TLV for Beep Volume
  ASoC: cs42l52: Add new TLV for Beep Volume

Byungho An (1):
  net: stmmac: Fixed the condition of extend_desc for jumbo frame

Chen Gang (2):
  cifs: extend the buffer length enought for sprintf() using
  arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"

Chris Wright (1):
  mac80211: fix infinite loop in ieee80211_determine_chantype

Clemens Ladisch (1):
  ALSA: usb-audio: fix automatic Roland/Yamaha MIDI detection

Cong Wang (2):
  vxlan: fix a regression of igmp join
  vxlan: fix a soft lockup in vxlan module removal

Cyrill Gorcunov (2):
  mm: save soft-dirty bits on swapped pages
  mm: save soft-dirty bits on file pages

Dan Carpenter (2):
  netfilter: nfnetlink_{log,queue}: fix information leaks in netlink message
  tun: signedness bug in tun_get_user()

Daniel Borkmann (4):
  net: esp{4,6}: fix potential MTU calculation overflows
  net: sctp: sctp_assoc_control_transport: fix MTU size in SCTP_PF state
  net: sctp: sctp_transport_destroy{, _rcu}: fix potential pointer corruption
  net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset

Dave Jones (1):
  8139cp: Fix skb leak in rx_status_loop failure path.

Dmitry Kravkov (2):
  bnx2x: protect different statistics flows
  bnx2x: update fairness parameters following DCB negotiation

Ed L. Cashin (1):
  aoe: adjust ref of head for compound page tails

Eli Cohen (1):
  mlx5: remove health handler plugin

Eliezer Tamir (2):
  busy_poll: cleanup do-nothing placeholders
  net: rename busy poll MIB counter

Eric Dumazet (5):
  fib_trie: remove potential out of bound access
  tcp: cubic: fix overflow error in bictcp_update()
  tcp: cubic: fix bug in bictcp_acked()
  net: flow_dissector: add 802.1ad support
  macvtap: fix two races

Geert Uytterhoeven (1):
  m68k/atari: ARAnyM - Fix NatFeat module support

Guenter Roeck (1):
  s390: Fix broken build

Hannes Frederic Sowa (1):
  ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not match

Himanshu Madhani (1):
  qlcnic: Fix set driver version command

Hyong-Youb Kim (1):
  myri10ge: Update MAINTAINERS

Jan Kara (2):
  jbd2: Fix use

Re: bcache: Fix a writeback performance regression

2013-08-18 Thread Stefan Priebe



Vanilla 3.10.7 + bcache: Fix a writeback performance regression

http://pastebin.com/raw.php?i=LXZk4cMH

Stefan

On 16.08.2013 12:11, Stefan Priebe - Profihost AG wrote:

Hi,

bcache: Fix a writeback performance regression

On 3.10, this one results in hung tasks in bcache_writeback read_dirty.

Stefan
On 15.08.2013 08:43, Stefan Priebe - Profihost AG wrote:

On 15.08.2013 00:59, Kent Overstreet wrote:

Jens, here's the latest bcache fixes. Some urgent stuff in here:


The following changes since commit 79826c35eb99cd3c0873b8396f45fa26c87fb0b0:

   bcache: Allocation kthread fixes (2013-07-12 00:22:49 -0700)

are available in the git repository at:

   git://evilpiepirate.org/~kent/linux-bcache.git bcache-for-3.11

for you to fetch changes up to 0434a516e99ee51ac2d0dfa71b2f56c89ac5db05:

   bcache: Fix a flush/fua performance bug (2013-08-14 15:44:51 -0700)


Gabriel de Perthuis (1):
   bcache: Strip endline when writing the label through sysfs

Geert Uytterhoeven (1):
   bcache: Correct printf()-style format length modifier

Kent Overstreet (4):
   bcache: Fix a dumb journal discard bug
   bcache: Fix for when no journal entries are found
   bcache: Fix a writeback performance regression
   bcache: Fix a flush/fua performance bug

  drivers/md/bcache/bcache.h    |  7 +++
  drivers/md/bcache/btree.c     |  2 +-
  drivers/md/bcache/journal.c   | 33 -
  drivers/md/bcache/sysfs.c     |  9 +++--
  drivers/md/bcache/util.c      | 11 ++-
  drivers/md/bcache/util.h      | 12 +---
  drivers/md/bcache/writeback.c | 37 ++---
  7 files changed, 68 insertions(+), 43 deletions(-)
--


As 3.10 is a long-term stable release, you might need to CC the stable
list for patches which have to go to 3.10 as well.

At least this one should go to 3.10:

Gabriel de Perthuis (1):
   bcache: Strip endline when writing the label through sysfs


Thanks!

Greets,
Stefan



Re: [PATCH] USB: musb: Avoid null pointer dereference in debug logging

2013-08-18 Thread Sergei Shtylyov


Hello.

On 18-08-2013 20:21, Maarten ter Huurne wrote:


> Since commit 511f3c53 usb_gadget_remove_driver will pass NULL for the

   Please also specify that commit's summary line in parens.

> driver argument.
>
> Signed-off-by: Maarten ter Huurne
> ---
>   drivers/usb/musb/musb_gadget.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)


WBR, Sergei


