date:20130110

Re: [RFC] Reproducible OOM with partial workaround

2013-01-10 Thread Andrew Morton

On Fri, 11 Jan 2013 12:46:15 +1100 paul.sz...@sydney.edu.au wrote:

> > ... I don't believe 64GB of RAM has _ever_ been booted on a 32-bit
> > kernel without either violating the ABI (3GB/1GB split) or doing
> > something that never got merged upstream ...
> 
> Sorry to be so contradictory:
> 
> psz@como:~$ uname -a
> Linux como.maths.usyd.edu.au 3.2.32-pk06.10-t01-i386 #1 SMP Sat Jan 5 18:34:25 EST 2013 i686 GNU/Linux
> psz@como:~$ free -l
>              total       used       free     shared    buffers     cached
> Mem:      64446900    4729292   59717608          0      15972     480520
> Low:        375836     304400      71436
> High:     64071064    4424892   59646172
> -/+ buffers/cache:    4232800   60214100
> Swap:    134217724          0  134217724
> psz@como:~$ 
> 
> (though I would not know about violations).
> 
> But OK, I take your point that I should move with the times.

Check /proc/slabinfo, see if all your lowmem got eaten up by buffer_heads.

If so, you *may* be able to work around this by setting
/proc/sys/vm/dirty_ratio really low, so the system keeps a minimum
amount of dirty pagecache around.  Then, with luck, if we haven't
broken the buffer_heads_over_limit logic in the past decade (we
probably have), the VM should be able to reclaim those buffer_heads.

Alternatively, use a filesystem which doesn't attach buffer_heads to
dirty pages.  xfs or btrfs, perhaps.
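
As a rough way to carry out the /proc/slabinfo check above, the following
C sketch estimates the memory held by one slab cache. It assumes the
slabinfo 2.x line layout (name, active_objs, num_objs, objsize, ...). The
function names slab_line_bytes() and slab_cache_bytes() are made up for
illustration, and the estimate ignores per-slab overhead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one slabinfo 2.x line of the form
 *   name <active_objs> <num_objs> <objsize> ...
 * If it describes the cache called `cache`, store num_objs * objsize
 * in *bytes and return 1. Return 0 for headers and other caches.
 */
int slab_line_bytes(const char *line, const char *cache,
                    unsigned long long *bytes)
{
    char name[64];
    unsigned long long active, total, objsize;

    assert(line && cache && bytes);
    if (sscanf(line, "%63s %llu %llu %llu",
               name, &active, &total, &objsize) != 4)
        return 0;
    if (strcmp(name, cache) != 0)
        return 0;
    *bytes = total * objsize;
    return 1;
}

/*
 * Scan a slabinfo-style file. Returns the estimated bytes used by
 * `cache`, or 0 if the cache is absent or the file cannot be opened
 * (reading /proc/slabinfo usually needs root).
 */
unsigned long long slab_cache_bytes(const char *path, const char *cache)
{
    char line[512];
    unsigned long long bytes = 0;
    FILE *f = fopen(path, "r");

    if (!f)
        return 0;
    while (fgets(line, sizeof(line), f))
        if (slab_line_bytes(line, cache, &bytes))
            break;
    fclose(f);
    return bytes;
}
```

Calling slab_cache_bytes("/proc/slabinfo", "buffer_head") and comparing
the result with the Low: total reported by free -l gives a first idea of
whether buffer_heads are what is eating lowmem.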

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: [PATCH v4 05/14] dmaengine: edma: Add TI EDMA device tree binding

2013-01-10 Thread Hebbar, Gururaja

On Fri, Jan 11, 2013 at 11:18:41, Porter, Matt wrote:
> The binding definition is based on the generic DMA controller
> binding.
> 
> Signed-off-by: Matt Porter 
> ---
>  Documentation/devicetree/bindings/dma/ti-edma.txt |   51 
> +
>  1 file changed, 51 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/dma/ti-edma.txt
> 
> diff --git a/Documentation/devicetree/bindings/dma/ti-edma.txt 
> b/Documentation/devicetree/bindings/dma/ti-edma.txt
> new file mode 100644
> index 000..3344345
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/ti-edma.txt
> @@ -0,0 +1,51 @@
> +TI EDMA
> +
> +Required properties:
> +- compatible : "ti,edma3"
> +- ti,hwmods: Name of the hwmods associated to the EDMA
> +- ti,edma-regions: Number of regions
> +- ti,edma-slots: Number of slots
> +- ti,edma-queue-tc-map: List of transfer control to queue mappings
> +- ti,edma-queue-priority-map: List of queue priority mappings
> +- ti,edma-default-queue: Default queue value
> +
> +Optional properties:
> +- ti,edma-reserved-channels: List of reserved channel regions
> +- ti,edma-reserved-slots: List of reserved slot regions
> +- ti,edma-xbar-event-map: Crossbar event to channel map
> +
> +Example:
> +
> +edma: edma@4900 {
> + #address-cells = <1>;
> + #size-cells = <0>;

#address-cells and #size-cells are only required when the current node is a
parent node with child nodes, and then only if a child node uses the "reg"
property.

> + reg = <0x4900 0x1>;
> + interrupt-parent = <>;
> + interrupts = <12 13 14>;
> + compatible = "ti,edma3";
> + ti,hwmods = "tpcc", "tptc0", "tptc1", "tptc2";
> + #dma-cells = <1>;
> + dma-channels = <64>;
> + ti,edma-regions = <4>;
> + ti,edma-slots = <256>;
> + ti,edma-reserved-channels = <0  2
> +  14 2
> +  26 6
> +  48 4
> +  56 8>;
> + ti,edma-reserved-slots = <0  2
> +   14 2
> +   26 6
> +   48 4
> +   56 8
> +   64 127>;
> + ti,edma-queue-tc-map = <0 0
> + 1 1
> + 2 2>;
> + ti,edma-queue-priority-map = <0 0
> +   1 1
> +   2 2>;
> + ti,edma-default-queue = <0>;
> + ti,edma-xbar-event-map = <1 12
> +   2 13>;
> +};
> -- 
> 1.7.9.5
> 
> ___
> Davinci-linux-open-source mailing list
> davinci-linux-open-sou...@linux.davincidsp.com
> http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source
> 


Regards, 
Gururaja

Re: [PATCH] slub: assign refcount for kmalloc_caches

2013-01-10 Thread Joonsoo Kim

On Thu, Jan 10, 2013 at 08:47:39PM -0800, Paul Hargrove wrote:
> I just had a look at patch-3.7.2-rc1, and this change doesn't appear to
> have made it in yet.
> Am I missing something?
> 
> -Paul

I'll check it.
Cc'ing Greg.

Hello, Pekka and Greg.

v3.8-rcX has already fixed this through another change, but that change is
not simple, so I made a new patch and sent it.

How does this kind of patch (only for stable v3.7) go into the stable tree?
Through Pekka's slab tree, or should I send it to Greg directly?

I don't know exactly how to submit this kind of patch to the stable tree.
Could anyone help me?

Thanks.

> On Tue, Dec 25, 2012 at 7:30 AM, JoonSoo Kim  wrote:
> 
> > 2012/12/26 Joonsoo Kim :
> > > commit cce89f4f6911286500cf7be0363f46c9b0a12ce0 ('Move kmem_cache
> > > refcounting to common code') moves some refcount manipulation code to
> > > common code. Unfortunately, it also removed the refcount assignment for
> > > kmalloc_caches, so the refcount of each kmalloc cache is initially 0.
> > > This leads to an erroneous situation.
> > >
> > > Paul Hargrove reported that when he creates an 8-byte kmem_cache and
> > > destroys it, he encounters the message below:
> > > 'Objects remaining in kmalloc-8 on kmem_cache_close()'
> > >
> > > The 8-byte kmem_cache is merged with the 8-byte kmalloc cache and the
> > > refcount is increased by one, so the resulting refcount is 1. When the
> > > cache is destroyed, the refcount hits 0, kmem_cache_close() is executed,
> > > and the error message is printed.
> > >
> > > This patch assigns an initial refcount of 1 to kmalloc_caches, which
> > > fixes this erroneous situation.
> > >
> > > Cc:  # v3.7
> > > Cc: Christoph Lameter 
> > > Reported-by: Paul Hargrove 
> > > Signed-off-by: Joonsoo Kim 
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index a0d6984..321afab 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3279,6 +3279,7 @@ static struct kmem_cache *__init create_kmalloc_cache(const char *name,
> > > if (kmem_cache_open(s, flags))
> > > goto panic;
> > >
> > > +   s->refcount = 1;
> > > list_add(&s->list, &slab_caches);
> > > return s;
> > >
> > > --
> > > 1.7.9.5
> > >
> >
> > I missed some explanation.
> > In v3.8-rc1, this problem is already solved.
> > See create_kmalloc_cache() in mm/slab_common.c.
> > So this patch is just for v3.7 stable.
> >
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
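
The refcount sequence described in the quoted commit message can be traced
with a small userspace model. This is only a sketch: struct toy_cache and
the toy_* functions are hypothetical stand-ins, not the slub code, and they
model nothing beyond the counter and the "closed" outcome.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct kmem_cache: only the fields needed here. */
struct toy_cache {
    int refcount;   /* number of users of this cache */
    bool closed;    /* set once the toy "kmem_cache_close()" has run */
};

/* A new cache of a compatible size is merged into an existing one. */
void toy_merge(struct toy_cache *s)
{
    assert(s);
    s->refcount++;
}

/* Destroying a merged cache drops one reference; the last one closes it. */
void toy_destroy(struct toy_cache *s)
{
    assert(s);
    if (--s->refcount == 0)
        s->closed = true;
}

/* Returns true if the shared cache survives one merge + destroy cycle. */
bool toy_survives_cycle(int initial_refcount)
{
    struct toy_cache kmalloc8 = {
        .refcount = initial_refcount,
        .closed = false,
    };

    toy_merge(&kmalloc8);
    toy_destroy(&kmalloc8);
    return !kmalloc8.closed;
}
```

With an initial refcount of 0 the cycle goes 0, 1, 0 and the shared cache
is closed while it still holds objects. With an initial refcount of 1 it
goes 1, 2, 1 and the cache stays open, which is the effect the one-line
patch aims for.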

Re: oops in copy_page_rep()

2013-01-10 Thread Simon Jeons

On Tue, 2013-01-08 at 18:49 +0100, Andrea Arcangeli wrote:
> Hi Kirill,
> 
> On Tue, Jan 08, 2013 at 07:30:58PM +0200, Kirill A. Shutemov wrote:
> > Merged patch is obviously broken: huge_pmd_set_accessed() can be called
> > only if the pmd is under splitting.
> 
> Of course I assume you meant "only if the pmd is not under splitting".
> 
> But no, setting a bitflag like the young bit or clearing or setting
> the numa bit won't screw with split_huge_page and it's safe even if
> the pmd is under splitting.
> 
> Those bits are only checked here at the last stage of
> split_huge_page_map after taking the PT lock:
> 
>   spin_lock(&mm->page_table_lock);
>   pmd = page_check_address_pmd(page, mm, address,
>PAGE_CHECK_ADDRESS_PMD_SPLITTING_FLAG);
>   if (pmd) {
>   pgtable = pgtable_trans_huge_withdraw(mm);
>   pmd_populate(mm, &_pmd, pgtable);
> 
>   haddr = address;
>   for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
>   pte_t *pte, entry;
>   BUG_ON(PageCompound(page+i));
>   entry = mk_pte(page + i, vma->vm_page_prot);
>   entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>   if (!pmd_write(*pmd))
>   entry = pte_wrprotect(entry);
>   else
>   BUG_ON(page_mapcount(page) != 1);
>   if (!pmd_young(*pmd))
>   entry = pte_mkold(entry);
>   if (pmd_numa(*pmd))
>   entry = pte_mknuma(entry);
>   pte = pte_offset_map(&_pmd, haddr);
>   BUG_ON(!pte_none(*pte));
>   set_pte_at(mm, haddr, pte, entry);
>   pte_unmap(pte);
>   }
> 
> If "young" or "numa" bitflags changed on the original *pmd for the
> previous part of split_huge_page, nothing will go wrong by the time we
> get to split_huge_page_map (the same is not true if the pfn changes!).
> 

But in this case the BUG_ON(mapcount != mapcount2) in
__split_huge_page() will be triggered.

> If you think this is too tricky, we could also decide to forbid
> huge_pmd_set_accessed if the pmd is in splitting state, but I don't
> think that flipping young/numa bits while in splitting state, can
> cause any problem (if done correctly with PT lock + pmd_same).
> 
> Thanks!
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org



Re: [PATCH RESEND v1 05/16] vfs: add hooks to enable hot tracking

2013-01-10 Thread Zhi Yong Wu

On Thu, Jan 10, 2013 at 8:52 AM, David Sterba  wrote:
> On Thu, Dec 20, 2012 at 10:43:24PM +0800, zwu.ker...@gmail.com wrote:
>> --- a/fs/direct-io.c
>> +++ b/fs/direct-io.c
>> @@ -37,6 +37,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include "hot_tracking.h"
>>
>>  /*
>>   * How many user pages to map in one call to get_user_pages().  This 
>> determines
>> @@ -1299,6 +1300,11 @@ __blockdev_direct_IO(int rw, struct kiocb *iocb, 
>> struct inode *inode,
>>   prefetch(bdev->bd_queue);
>>   prefetch((char *)bdev->bd_queue + SMP_CACHE_BYTES);
>>
>> + /* Hot data tracking */
>> + hot_update_freqs(inode, offset,
>> + iov_length(iov, nr_segs),
>> + rw & WRITE);
>
> hot_update_freqs takes an 'int rw' directly, so you should pass plain
> 'rw' here and do the 'rw & WRITE' check in hot_freq_data_update itself.
OK, done.
>
>> +
>>   return do_blockdev_direct_IO(rw, iocb, inode, bdev, iov, offset,
>>nr_segs, get_block, end_io,
>>submit_io, flags);
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -35,6 +35,7 @@
>>  #include  /* __set_page_dirty_buffers */
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>
>>  /*
>> @@ -1902,13 +1903,24 @@ EXPORT_SYMBOL(generic_writepages);
>>  int do_writepages(struct address_space *mapping, struct writeback_control 
>> *wbc)
>>  {
>>   int ret;
>> + loff_t start = 0;
>> + size_t count = 0;
>>
>>   if (wbc->nr_to_write <= 0)
>>   return 0;
>> +
>> + start = mapping->writeback_index << PAGE_CACHE_SHIFT;
>> + count = wbc->nr_to_write;
>> +
>>   if (mapping->a_ops->writepages)
>>   ret = mapping->a_ops->writepages(mapping, wbc);
>>   else
>>   ret = generic_writepages(mapping, wbc);
>> +
>> + /* Hot data tracking */
>> + hot_update_freqs(mapping->host, start,
>> + (count - wbc->nr_to_write) * PAGE_CACHE_SIZE, 1);
>
> I think the frequencies should not be updated in case of error returned
> from writepages.
OK, Done.
>
>> +
>>   return ret;
>>  }
>>
>> --- a/mm/readahead.c
>> +++ b/mm/readahead.c
>> @@ -138,6 +139,12 @@ static int read_pages(struct address_space *mapping, 
>> struct file *filp,
>>  out:
>>   blk_finish_plug();
>>
>> + /* Hot data tracking */
>> + hot_update_freqs(mapping->host,
>> + (loff_t)(list_entry(pages->prev, struct page, lru)->index)
>> + << PAGE_CACHE_SHIFT,
>> + (size_t)nr_pages * PAGE_CACHE_SIZE, 0);
>
> same comment here
Ditto. thanks.
>
>> +
>>   return ret;
>>  }
>
>
> david



-- 
Regards,

Zhi Yong Wu
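
The review comments above ask for two things: pass rw through unchanged
and test `rw & WRITE` in the callee, and skip the update when the I/O
returned an error. Both can be sketched in plain C. Everything here is a
toy model: TOY_WRITE, struct toy_freqs and both functions are hypothetical
stand-ins, not the hot tracking API from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WRITE 1  /* stand-in for the kernel's WRITE flag */

/* Toy per-inode accounting: bytes read and bytes written. */
struct toy_freqs {
    size_t read_bytes;
    size_t write_bytes;
};

/* The callee takes the raw rw value and does the write test itself. */
void toy_update_freqs(struct toy_freqs *f, size_t len, int rw)
{
    assert(f);
    if (rw & TOY_WRITE)
        f->write_bytes += len;
    else
        f->read_bytes += len;
}

/* Caller-side pattern: only account the I/O when it succeeded. */
int toy_finish_io(struct toy_freqs *f, int ret, size_t len, int rw)
{
    if (ret == 0)
        toy_update_freqs(f, len, rw);
    return ret;
}
```

Keeping the flag test in one place means callers cannot disagree about
what counts as a write, and a failed request leaves the counters
untouched.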

Re: Oops in sound/usb/pcm.c:match_endpoint_audioformats() in current -git

2013-01-10 Thread Jens Axboe

On 2013-01-10 20:45, Eldad Zack wrote:
> Jens, could you please send me the device's descriptors (lsusb -v)?
> I'd like to take a closer look at this.

Below.

Bus 006 Device 010: ID 22e8:dac1
Device Descriptor:
  bLength             18
  bDescriptorType     1
  bcdUSB              1.00
  bDeviceClass        0 (Defined at Interface level)
  bDeviceSubClass     0
  bDeviceProtocol     0
  bMaxPacketSize0     64
  idVendor            0x22e8
  idProduct           0xdac1
  bcdDevice           3.21
  iManufacturer       1 Cambridge Audio
  iProduct            2 Cambridge Audio USB Audio 1.0
  iSerial             3
  bNumConfigurations  1
  Configuration Descriptor:
    bLength             9
    bDescriptorType     2
    wTotalLength        128
    bNumInterfaces      2
    bConfigurationValue 1
    iConfiguration      0
    bmAttributes        0x80
      (Bus Powered)
    MaxPower            500mA
    Interface Descriptor:
      bLength             9
      bDescriptorType     4
      bInterfaceNumber    0
      bAlternateSetting   0
      bNumEndpoints       0
      bInterfaceClass     1 Audio
      bInterfaceSubClass  1 Control Device
      bInterfaceProtocol  0
      iInterface          2 Cambridge Audio USB Audio 1.0
      AudioControl Interface Descriptor:
        bLength             9
        bDescriptorType     36
        bDescriptorSubtype  1 (HEADER)
        bcdADC              1.00
        wTotalLength        40
        bInCollection       1
        baInterfaceNr( 0)   1
      AudioControl Interface Descriptor:
        bLength             12
        bDescriptorType     36
        bDescriptorSubtype  2 (INPUT_TERMINAL)
        bTerminalID         1
        wTerminalType       0x0101 USB Streaming
        bAssocTerminal      0
        bNrChannels         2
        wChannelConfig      0x0003
          Left Front (L)
          Right Front (R)
        iChannelNames       0
        iTerminal           6 Cambridge Audio Audio 1.0 Output
      AudioControl Interface Descriptor:
        bLength             10
        bDescriptorType     36
        bDescriptorSubtype  6 (FEATURE_UNIT)
        bUnitID             10
        bSourceID           1
        bControlSize        1
        bmaControls( 0)     0x01
          Mute Control
        bmaControls( 1)     0x01
          Mute Control
        bmaControls( 2)     0x01
          Mute Control
        iFeature            5 Cambridge Audio USB 1.0 Audio In
      AudioControl Interface Descriptor:
        bLength             9
        bDescriptorType     36
        bDescriptorSubtype  3 (OUTPUT_TERMINAL)
        bTerminalID         6
        wTerminalType       0x0301 Speaker
        bAssocTerminal      0
        bSourceID           10
        iTerminal           0
    Interface Descriptor:
      bLength             9
      bDescriptorType     4
      bInterfaceNumber    1
      bAlternateSetting   0
      bNumEndpoints       0
      bInterfaceClass     1 Audio
      bInterfaceSubClass  2 Streaming
      bInterfaceProtocol  0
      iInterface          4 Cambridge Audio USB 1.0 Audio Out
    Interface Descriptor:
      bLength             9
      bDescriptorType     4
      bInterfaceNumber    1
      bAlternateSetting   1
      bNumEndpoints       2
      bInterfaceClass     1 Audio
      bInterfaceSubClass  2 Streaming
      bInterfaceProtocol  0
      iInterface          4 Cambridge Audio USB 1.0 Audio Out
      AudioStreaming Interface Descriptor:
        bLength             7
        bDescriptorType     36
        bDescriptorSubtype  1 (AS_GENERAL)
        bTerminalLink       1
        bDelay              1 frames
        wFormatTag          1 PCM
      AudioStreaming Interface Descriptor:
        bLength             20
        bDescriptorType     36
        bDescriptorSubtype  2 (FORMAT_TYPE)
        bFormatType         1 (FORMAT_TYPE_I)
        bNrChannels         2
        bSubframeSize       3
        bBitResolution      24
        bSamFreqType        4 Discrete
        tSamFreq[ 0]        44100
        tSamFreq[ 1]        48000
        tSamFreq[ 2]        88200
        tSamFreq[ 3]        96000
      Endpoint Descriptor:
        bLength             9
        bDescriptorType     5
        bEndpointAddress    0x01  EP 1 OUT
        bmAttributes        5
          Transfer Type       Isochronous
          Synch Type          Asynchronous
          Usage Type          Data
        wMaxPacketSize      0x0246  1x 582 bytes
        bInterval           1
        bRefresh            0
        bSynchAddress       129
        AudioControl Endpoint Descriptor:
          bLength             7

[PATCH v4 4/9] ARM: tegra: Define Tegra20 CAR binding

2013-01-10 Thread Prashant Gaikwad

From: Stephen Warren 

The Tegra20 CAR (Clock And Reset) Controller controls most aspects of
most clocks within Tegra20. The device tree binding models this as a
single monolithic clock provider, which exports many clocks. This reduces
the number of nodes needed in device tree to represent these clocks.

This binding is only useful for Tegra20; the set of clocks that exists on
Tegra30 is sufficiently different to merit its own binding.

Signed-off-by: Stephen Warren 
Acked-by: Simon Glass 
[pgaikwad: Added mux clk ids and sorted CAR node]
Signed-off-by: Prashant Gaikwad 
---
 .../bindings/clock/nvidia,tegra20-car.txt  |  205 
 arch/arm/boot/dts/tegra20.dtsi |6 +
 2 files changed, 211 insertions(+), 0 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/clock/nvidia,tegra20-car.txt

diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.txt 
b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.txt
new file mode 100644
index 000..0921fac
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.txt
@@ -0,0 +1,205 @@
+NVIDIA Tegra20 Clock And Reset Controller
+
+This binding uses the common clock binding:
+Documentation/devicetree/bindings/clock/clock-bindings.txt
+
+The CAR (Clock And Reset) Controller on Tegra is the HW module responsible
+for muxing and gating Tegra's clocks, and setting their rates.
+
+Required properties :
+- compatible : Should be "nvidia,tegra20-car"
+- reg : Should contain CAR registers location and length
+- clocks : Should contain phandle and clock specifiers for two clocks:
+  the 32 KHz "32k_in", and the board-specific oscillator "osc".
+- #clock-cells : Should be 1.
+  In clock consumers, this cell represents the clock ID exposed by the CAR.
+
+  The first 96 clocks are numbered to match the bits in the CAR's CLK_OUT_ENB
+  registers. These IDs often match those in the CAR's RST_DEVICES registers,
+  but not in all cases. Some bits in CLK_OUT_ENB affect multiple clocks. In
+  this case, those clocks are assigned IDs above 95 in order to highlight
+  this issue. Implementations that interpret these clock IDs as bit values
+  within the CLK_OUT_ENB or RST_DEVICES registers should be careful to
+  explicitly handle these special cases.
+
+  The balance of the clocks controlled by the CAR are assigned IDs of 96 and
+  above.
+
+  0    cpu
+  1    unassigned
+  2    unassigned
+  3    ac97
+  4    rtc
+  5    tmr
+  6    uart1
+  7    unassigned  (register bit affects uart2 and vfir)
+  8    gpio
+  9    sdmmc2
+  10   unassigned  (register bit affects spdif_in and spdif_out)
+  11   i2s1
+  12   i2c1
+  13   ndflash
+  14   sdmmc1
+  15   sdmmc4
+  16   twc
+  17   pwm
+  18   i2s2
+  19   epp
+  20   unassigned  (register bit affects vi and vi_sensor)
+  21   2d
+  22   usbd
+  23   isp
+  24   3d
+  25   ide
+  26   disp2
+  27   disp1
+  28   host1x
+  29   vcp
+  30   unassigned
+  31   cache2
+
+  32   mem
+  33   ahbdma
+  34   apbdma
+  35   unassigned
+  36   kbc
+  37   stat_mon
+  38   pmc
+  39   fuse
+  40   kfuse
+  41   sbc1
+  42   snor
+  43   spi1
+  44   sbc2
+  45   xio
+  46   sbc3
+  47   dvc
+  48   dsi
+  49   unassigned  (register bit affects tvo and cve)
+  50   mipi
+  51   hdmi
+  52   csi
+  53   tvdac
+  54   i2c2
+  55   uart3
+  56   unassigned
+  57   emc
+  58   usb2
+  59   usb3
+  60   mpe
+  61   vde
+  62   bsea
+  63   bsev
+
+  64   speedo
+  65   uart4
+  66   uart5
+  67   i2c3
+  68   sbc4
+  69   sdmmc3
+  70   pcie
+  71   owr
+  72   afi
+  73   csite
+  74   unassigned
+  75   avpucq
+  76   la
+  77   unassigned
+  78   unassigned
+  79   unassigned
+  80   unassigned
+  81   unassigned
+  82   unassigned
+  83   unassigned
+  84   irama
+  85   iramb
+  86   iramc
+  87   iramd
+  88   cram2
+  89   audio_2x a/k/a audio_2x_sync_clk
+  90   clk_d
+  91   unassigned
+  92   sus
+  93   cdev1
+  94   cdev2
+  95   unassigned
+
+  96   uart2
+  97   vfir
+  98   spdif_in
+  99   spdif_out
+  100  vi
+  101  vi_sensor
+  102  tvo
+  103  cve
+  104  osc
+  105  clk_32k a/k/a clk_s
+  106  clk_m
+  107  sclk
+  108  cclk
+  109  hclk
+  110  pclk
+  111  blink
+  112  pll_a
+  113  pll_a_out0
+  114  pll_c
+  115  pll_c_out1
+  116  pll_d
+  117  pll_d_out0
+  118  pll_e
+  119  pll_m
+  120  pll_m_out1
+  121  pll_p
+  122  pll_p_out1
+  123  pll_p_out2
+  124  pll_p_out3
+  125  pll_p_out4
+  126  pll_s
+  127  pll_u
+  128  pll_x
+  129  cop a/k/a avp
+  130  audio   a/k/a audio_sync_clk
+  131  pll_ref
+  132  twd
+
+Example SoC include file:
+
+/ {
+   tegra_car: clock {
+   compatible = "nvidia,tegra20-car";
+   reg = <0x60006000 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   usb@c5004000 {
+   clocks = <&tegra_car 58>; /* usb2 */
+   };
+};
+
+Example board file:
+
+/ {
+   clocks {
+   compatible

[PATCH v4 6/9] clk: tegra: add clock support for tegra20

2013-01-10 Thread Prashant Gaikwad

Add tegra20 clock support based on common clock framework.

Signed-off-by: Prashant Gaikwad 
---
 drivers/clk/tegra/Makefile  |2 +
 drivers/clk/tegra/clk-tegra20.c | 1255 +++
 drivers/clk/tegra/clk.h |6 +
 3 files changed, 1263 insertions(+), 0 deletions(-)
 create mode 100644 drivers/clk/tegra/clk-tegra20.c

diff --git a/drivers/clk/tegra/Makefile b/drivers/clk/tegra/Makefile
index 68bd353..00484fd 100644
--- a/drivers/clk/tegra/Makefile
+++ b/drivers/clk/tegra/Makefile
@@ -6,3 +6,5 @@ obj-y   += clk-periph-gate.o
 obj-y  += clk-pll.o
 obj-y  += clk-pll-out.o
 obj-y  += clk-super.o
+
+obj-$(CONFIG_ARCH_TEGRA_2x_SOC) += clk-tegra20.o
diff --git a/drivers/clk/tegra/clk-tegra20.c b/drivers/clk/tegra/clk-tegra20.c
new file mode 100644
index 000..4875261
--- /dev/null
+++ b/drivers/clk/tegra/clk-tegra20.c
@@ -0,0 +1,1255 @@
+/*
+ * Copyright (c) 2012, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "clk.h"
+
+#define RST_DEVICES_L 0x004
+#define RST_DEVICES_H 0x008
+#define RST_DEVICES_U 0x00c
+#define RST_DEVICES_SET_L 0x300
+#define RST_DEVICES_CLR_L 0x304
+#define RST_DEVICES_SET_H 0x308
+#define RST_DEVICES_CLR_H 0x30c
+#define RST_DEVICES_SET_U 0x310
+#define RST_DEVICES_CLR_U 0x314
+#define RST_DEVICES_NUM 3
+
+#define CLK_OUT_ENB_L 0x010
+#define CLK_OUT_ENB_H 0x014
+#define CLK_OUT_ENB_U 0x018
+#define CLK_OUT_ENB_SET_L 0x320
+#define CLK_OUT_ENB_CLR_L 0x324
+#define CLK_OUT_ENB_SET_H 0x328
+#define CLK_OUT_ENB_CLR_H 0x32c
+#define CLK_OUT_ENB_SET_U 0x330
+#define CLK_OUT_ENB_CLR_U 0x334
+#define CLK_OUT_ENB_NUM 3
+
+#define OSC_CTRL 0x50
+#define OSC_CTRL_OSC_FREQ_MASK (3<<30)
+#define OSC_CTRL_OSC_FREQ_13MHZ (0<<30)
+#define OSC_CTRL_OSC_FREQ_19_2MHZ (1<<30)
+#define OSC_CTRL_OSC_FREQ_12MHZ (2<<30)
+#define OSC_CTRL_OSC_FREQ_26MHZ (3<<30)
+#define OSC_CTRL_MASK (0x3f2 | OSC_CTRL_OSC_FREQ_MASK)
+
+#define OSC_CTRL_PLL_REF_DIV_MASK (3<<28)
+#define OSC_CTRL_PLL_REF_DIV_1 (0<<28)
+#define OSC_CTRL_PLL_REF_DIV_2 (1<<28)
+#define OSC_CTRL_PLL_REF_DIV_4 (2<<28)
+
+#define OSC_FREQ_DET 0x58
+#define OSC_FREQ_DET_TRIG (1<<31)
+
+#define OSC_FREQ_DET_STATUS 0x5c
+#define OSC_FREQ_DET_BUSY (1<<31)
+#define OSC_FREQ_DET_CNT_MASK 0x
+
+#define PLLS_BASE 0xf0
+#define PLLS_MISC 0xf4
+#define PLLC_BASE 0x80
+#define PLLC_MISC 0x8c
+#define PLLM_BASE 0x90
+#define PLLM_MISC 0x9c
+#define PLLP_BASE 0xa0
+#define PLLP_MISC 0xac
+#define PLLA_BASE 0xb0
+#define PLLA_MISC 0xbc
+#define PLLU_BASE 0xc0
+#define PLLU_MISC 0xcc
+#define PLLD_BASE 0xd0
+#define PLLD_MISC 0xdc
+#define PLLX_BASE 0xe0
+#define PLLX_MISC 0xe4
+#define PLLE_BASE 0xe8
+#define PLLE_MISC 0xec
+
+#define PLL_BASE_LOCK 27
+#define PLLE_MISC_LOCK 11
+
+#define PLL_MISC_LOCK_ENABLE 18
+#define PLLDU_MISC_LOCK_ENABLE 22
+#define PLLE_MISC_LOCK_ENABLE 9
+
+#define PLLC_OUT 0x84
+#define PLLM_OUT 0x94
+#define PLLP_OUTA 0xa4
+#define PLLP_OUTB 0xa8
+#define PLLA_OUT 0xb4
+
+#define CCLK_BURST_POLICY 0x20
+#define SUPER_CCLK_DIVIDER 0x24
+#define SCLK_BURST_POLICY 0x28
+#define SUPER_SCLK_DIVIDER 0x2c
+#define CLK_SYSTEM_RATE 0x30
+
+#define CLK_SOURCE_I2S1 0x100
+#define CLK_SOURCE_I2S2 0x104
+#define CLK_SOURCE_SPDIF_OUT 0x108
+#define CLK_SOURCE_SPDIF_IN 0x10c
+#define CLK_SOURCE_PWM 0x110
+#define CLK_SOURCE_SPI 0x114
+#define CLK_SOURCE_SBC1 0x134
+#define CLK_SOURCE_SBC2 0x118
+#define CLK_SOURCE_SBC3 0x11c
+#define CLK_SOURCE_SBC4 0x1b4
+#define CLK_SOURCE_XIO 0x120
+#define CLK_SOURCE_TWC 0x12c
+#define CLK_SOURCE_IDE 0x144
+#define CLK_SOURCE_NDFLASH 0x160
+#define CLK_SOURCE_VFIR 0x168
+#define CLK_SOURCE_SDMMC1 0x150
+#define CLK_SOURCE_SDMMC2 0x154
+#define CLK_SOURCE_SDMMC3 0x1bc
+#define CLK_SOURCE_SDMMC4 0x164
+#define CLK_SOURCE_CVE 0x140
+#define CLK_SOURCE_TVO 0x188
+#define CLK_SOURCE_TVDAC 0x194
+#define CLK_SOURCE_HDMI 0x18c
+#define CLK_SOURCE_DISP1 0x138
+#define CLK_SOURCE_DISP2 0x13c
+#define CLK_SOURCE_CSITE 0x1d4
+#define CLK_SOURCE_LA 0x1f8
+#define CLK_SOURCE_OWR 0x1cc
+#define CLK_SOURCE_NOR 0x1d0
+#define CLK_SOURCE_MIPI 0x174
+#define CLK_SOURCE_I2C1 0x124
+#define CLK_SOURCE_I2C2 0x198
+#define CLK_SOURCE_I2C3 0x1b8
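
The CAR binding posted as patch 4/9 says the first 96 clock IDs are
numbered to match bits in the CLK_OUT_ENB registers, and the defines above
give the L/H/U set and clear offsets. Under those two statements, the
mapping from a clock ID to a gate register and bit can be sketched as
below. This is an illustration only: struct toy_gate and
toy_clk_id_to_gate() are hypothetical and are not taken from
clk-tegra20.c, and the sketch does not handle the unassigned or shared
bits that the binding warns about.

```c
#include <assert.h>
#include <stdbool.h>

/* Offsets copied from the defines in the patch. */
#define CLK_OUT_ENB_SET_L 0x320
#define CLK_OUT_ENB_CLR_L 0x324
#define CLK_OUT_ENB_NUM   3

struct toy_gate {
    unsigned int set_reg;  /* CLK_OUT_ENB_SET_{L,H,U} offset */
    unsigned int clr_reg;  /* CLK_OUT_ENB_CLR_{L,H,U} offset */
    unsigned int bit;      /* bit position within that register */
};

/*
 * Map a clock ID below 96 to its enable set/clear registers and bit.
 * The L, H and U set registers sit 8 bytes apart (0x320, 0x328, 0x330),
 * and so do the clear registers (0x324, 0x32c, 0x334).
 * Returns false for IDs of 96 and above, which are not plain bits.
 */
bool toy_clk_id_to_gate(unsigned int id, struct toy_gate *g)
{
    unsigned int bank = id / 32;

    assert(g);
    if (bank >= CLK_OUT_ENB_NUM)
        return false;
    g->set_reg = CLK_OUT_ENB_SET_L + bank * 8;
    g->clr_reg = CLK_OUT_ENB_CLR_L + bank * 8;
    g->bit = id % 32;
    return true;
}
```

For example, ID 58 (usb2 in the binding table) lands in the H bank at bit
26, so the sketch yields the 0x328 set register and the 0x32c clear
register.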

[PATCH v4 3/9] arm: tegra: Move tegra_cpu_car.h to linux/clk/tegra.h

2013-01-10 Thread Prashant Gaikwad

tegra_cpu_car_ops struct is going to be accessed from drivers/clk/tegra.
Move the tegra_cpu_car_ops to include/linux/clk/tegra.h.

Signed-off-by: Prashant Gaikwad 
---
 arch/arm/mach-tegra/clock.c|2 +-
 arch/arm/mach-tegra/cpuidle-tegra30.c  |2 +-
 arch/arm/mach-tegra/hotplug.c  |2 +-
 arch/arm/mach-tegra/platsmp.c  |2 +-
 arch/arm/mach-tegra/pm.c   |2 +-
 arch/arm/mach-tegra/tegra20_clocks.c   |2 +-
 arch/arm/mach-tegra/tegra20_clocks_data.c  |2 +-
 arch/arm/mach-tegra/tegra30_clocks.c   |2 +-
 arch/arm/mach-tegra/tegra30_clocks_data.c  |2 +-
 .../tegra_cpu_car.h => include/linux/clk/tegra.h   |6 +++---
 10 files changed, 12 insertions(+), 12 deletions(-)
 rename arch/arm/mach-tegra/tegra_cpu_car.h => include/linux/clk/tegra.h (96%)

diff --git a/arch/arm/mach-tegra/clock.c b/arch/arm/mach-tegra/clock.c
index 867bf8b..8c0ff06 100644
--- a/arch/arm/mach-tegra/clock.c
+++ b/arch/arm/mach-tegra/clock.c
@@ -26,10 +26,10 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "board.h"
 #include "clock.h"
-#include "tegra_cpu_car.h"
 
 /* Global data of Tegra CPU CAR ops */
 struct tegra_cpu_car_ops *tegra_cpu_car_ops;
diff --git a/arch/arm/mach-tegra/cpuidle-tegra30.c 
b/arch/arm/mach-tegra/cpuidle-tegra30.c
index 82530bd..8b50cf4 100644
--- a/arch/arm/mach-tegra/cpuidle-tegra30.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra30.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -32,7 +33,6 @@
 
 #include "pm.h"
 #include "sleep.h"
-#include "tegra_cpu_car.h"
 
 #ifdef CONFIG_PM_SLEEP
 static int tegra30_idle_lp2(struct cpuidle_device *dev,
diff --git a/arch/arm/mach-tegra/hotplug.c b/arch/arm/mach-tegra/hotplug.c
index 6a27de4..a599f6e 100644
--- a/arch/arm/mach-tegra/hotplug.c
+++ b/arch/arm/mach-tegra/hotplug.c
@@ -10,12 +10,12 @@
  */
 #include 
 #include 
+#include 
 
 #include 
 #include 
 
 #include "sleep.h"
-#include "tegra_cpu_car.h"
 
 static void (*tegra_hotplug_shutdown)(void);
 
diff --git a/arch/arm/mach-tegra/platsmp.c b/arch/arm/mach-tegra/platsmp.c
index 6867030..3ec7fc4 100644
--- a/arch/arm/mach-tegra/platsmp.c
+++ b/arch/arm/mach-tegra/platsmp.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -30,7 +31,6 @@
 #include "fuse.h"
 #include "flowctrl.h"
 #include "reset.h"
-#include "tegra_cpu_car.h"
 
 #include "common.h"
 #include "iomap.h"
diff --git a/arch/arm/mach-tegra/pm.c b/arch/arm/mach-tegra/pm.c
index 498d70b..abfe9b9 100644
--- a/arch/arm/mach-tegra/pm.c
+++ b/arch/arm/mach-tegra/pm.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -36,7 +37,6 @@
 #include "reset.h"
 #include "flowctrl.h"
 #include "sleep.h"
-#include "tegra_cpu_car.h"
 
 #define TEGRA_POWER_CPU_PWRREQ_OE  (1 << 16)  /* CPU pwr req enable */
 
diff --git a/arch/arm/mach-tegra/tegra20_clocks.c 
b/arch/arm/mach-tegra/tegra20_clocks.c
index 4eb6bc8..1a80ff6 100644
--- a/arch/arm/mach-tegra/tegra20_clocks.c
+++ b/arch/arm/mach-tegra/tegra20_clocks.c
@@ -26,12 +26,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "clock.h"
 #include "fuse.h"
 #include "iomap.h"
 #include "tegra2_emc.h"
-#include "tegra_cpu_car.h"
 
 #define RST_DEVICES0x004
 #define RST_DEVICES_SET0x300
diff --git a/arch/arm/mach-tegra/tegra20_clocks_data.c 
b/arch/arm/mach-tegra/tegra20_clocks_data.c
index a23a073..022cdae 100644
--- a/arch/arm/mach-tegra/tegra20_clocks_data.c
+++ b/arch/arm/mach-tegra/tegra20_clocks_data.c
@@ -26,12 +26,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "clock.h"
 #include "fuse.h"
 #include "tegra2_emc.h"
 #include "tegra20_clocks.h"
-#include "tegra_cpu_car.h"
 
 /* Clock definitions */
 
diff --git a/arch/arm/mach-tegra/tegra30_clocks.c 
b/arch/arm/mach-tegra/tegra30_clocks.c
index d714777..4330787 100644
--- a/arch/arm/mach-tegra/tegra30_clocks.c
+++ b/arch/arm/mach-tegra/tegra30_clocks.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -36,7 +37,6 @@
 #include "clock.h"
 #include "fuse.h"
 #include "iomap.h"
-#include "tegra_cpu_car.h"
 
 #define USE_PLL_LOCK_BITS 0
 
diff --git a/arch/arm/mach-tegra/tegra30_clocks_data.c 
b/arch/arm/mach-tegra/tegra30_clocks_data.c
index 741d264..9bfaa49 100644
--- a/arch/arm/mach-tegra/tegra30_clocks_data.c
+++ b/arch/arm/mach-tegra/tegra30_clocks_data.c
@@ -28,11 +28,11 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "clock.h"
 #include "fuse.h"
 #include "tegra30_clocks.h"
-#include "tegra_cpu_car.h"
 
 #define DEFINE_CLK_TEGRA(_name, _rate, _ops, _flags,   \
   _parent_names, _parents, _parent)\
diff --git a/arch/arm/mach-tegra/tegra_cpu_car.h b/include/linux/clk/tegra.h
similarity index 96%
rename from

Re: [PATCH 1/1] fs/xfs remove obsolete simple_strto

2013-01-10 Thread Abhijit Pawar

On 01/11/2013 12:06 PM, Jeff Liu wrote:
> On 01/09/2013 10:04 PM, Abhijit Pawar wrote:
>> This patch replaces usages of obsolete simple_strtoul with kstrtoint in 
>> xfs_args and suffix_strtoul.
>>
>> Signed-off-by: Abhijit Pawar 
>> ---
>>  fs/xfs/xfs_super.c |   29 +++--
>>  1 files changed, 19 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
>> index ab8839b..c407121 100644
>> --- a/fs/xfs/xfs_super.c
>> +++ b/fs/xfs/xfs_super.c
>> @@ -139,9 +139,9 @@ static const match_table_t tokens = {
>>  
>>  
>>  STATIC unsigned long
>> -suffix_strtoul(char *s, char **endp, unsigned int base)
>> +suffix_kstrtoint(char *s, unsigned int base, int *res)
>>  {
>> -int last, shift_left_factor = 0;
>> +int last, shift_left_factor = 0, _res;
>>  char*value = s;
>>  
>>  last = strlen(value) - 1;
>> @@ -158,7 +158,10 @@ suffix_strtoul(char *s, char **endp, unsigned int base)
>>  value[last] = '\0';
>>  }
>>  
>> -return simple_strtoul((const char *)s, endp, base) << shift_left_factor;
>> +if (kstrtoint(s, base, &_res))
>> +return -EINVAL;
>> +*res = _res << shift_left_factor;
>> +return 0;
>>  }
>>  
>>  /*
>> @@ -174,7 +177,7 @@ xfs_parseargs(
>>  char*options)
>>  {
>>  struct super_block  *sb = mp->m_super;
>> -char*this_char, *value, *eov;
>> +char*this_char, *value;
>>  int dsunit = 0;
>>  int dswidth = 0;
>>  int iosize = 0;
>> @@ -230,14 +233,16 @@ xfs_parseargs(
>>  this_char);
>>  return EINVAL;
>>  }
>> -mp->m_logbufs = simple_strtoul(value, &eov, 10);
>> +if (kstrtoint(value, 10, &mp->m_logbufs))
>> +return EINVAL;
>>  } else if (!strcmp(this_char, MNTOPT_LOGBSIZE)) {
>>  if (!value || !*value) {
>>  xfs_warn(mp, "%s option requires an argument",
>>  this_char);
>>  return EINVAL;
>>  }
>> -mp->m_logbsize = suffix_strtoul(value, &eov, 10);
>> +if (suffix_kstrtoint(value, 10, &mp->m_logbsize))
>> +return EINVAL;
>>  } else if (!strcmp(this_char, MNTOPT_LOGDEV)) {
>>  if (!value || !*value) {
>>  xfs_warn(mp, "%s option requires an argument",
>> @@ -266,7 +271,8 @@ xfs_parseargs(
>>  this_char);
>>  return EINVAL;
>>  }
>> -iosize = simple_strtoul(value, &eov, 10);
>> +if (kstrtoint(value, 10, &iosize))
>> +return EINVAL;
>>  iosizelog = ffs(iosize) - 1;
>>  } else if (!strcmp(this_char, MNTOPT_ALLOCSIZE)) {
>>  if (!value || !*value) {
>> @@ -274,7 +280,8 @@ xfs_parseargs(
>>  this_char);
>>  return EINVAL;
>>  }
>> -iosize = suffix_strtoul(value, &eov, 10);
>> +if (suffix_kstrtoint(value, 10, &iosize))
>> +return EINVAL;
>>  iosizelog = ffs(iosize) - 1;
>>  } else if (!strcmp(this_char, MNTOPT_GRPID) ||
>> !strcmp(this_char, MNTOPT_BSDGROUPS)) {
>> @@ -296,14 +303,16 @@ xfs_parseargs(
>>  this_char);
>>  return EINVAL;
>>  }
>> -dsunit = simple_strtoul(value, &eov, 10);
>> +if (kstrtoint(value, 10, &dsunit))
>> +return EINVAL;
>>  } else if (!strcmp(this_char, MNTOPT_SWIDTH)) {
>>  if (!value || !*value) {
>>  xfs_warn(mp, "%s option requires an argument",
>>  this_char);
>>  return EINVAL;
>>  }
>> -dswidth = simple_strtoul(value, &eov, 10);
>> +if (kstrtoint(value, 10, &dswidth))
>> +return EINVAL;
>>  } else if (!strcmp(this_char, MNTOPT_32BITINODE)) {
>>  mp->m_flags |= XFS_MOUNT_SMALL_INUMS;
>>  } else if (!strcmp(this_char, MNTOPT_64BITINODE)) {
>>
> checkpatch.pl shows a warning if we return EINVAL as below:
> WARNING: return of an errno should typically be -ve (return -EINVAL)
> 
> Can we just ignore this code style issue?
I think we can.
Ben?
> 
> Thanks,
> -Jeff
> 


-- 
-
Abhijit
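
For readers following the conversion above, here is a minimal userspace sketch of a suffix-aware parser that reports errors through its return value instead of an end pointer. It is only an illustration under stated assumptions: parse_suffixed_int() is a hypothetical name, strtol() stands in for the kernel's kstrtoint(), and the sketch returns a negative errno, whereas the patch under discussion keeps the positive EINVAL used in xfs_parseargs().

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace analogue of a suffix-aware integer parser.
 * parse_suffixed_int() is a hypothetical name; strtol() is used here
 * because kstrtoint() is not available outside the kernel.
 * Returns 0 and stores the value in *res, or -EINVAL on bad input.
 */
int parse_suffixed_int(const char *s, unsigned int base, int *res)
{
	char buf[32];
	char *end;
	long val;
	int shift = 0;
	size_t len = strlen(s);

	if (len == 0 || len >= sizeof(buf))
		return -EINVAL;
	memcpy(buf, s, len + 1);

	/* Strip an optional k/m/g suffix and remember the shift. */
	switch (tolower((unsigned char)buf[len - 1])) {
	case 'k': shift = 10; break;
	case 'm': shift = 20; break;
	case 'g': shift = 30; break;
	}
	if (shift)
		buf[--len] = '\0';
	if (len == 0)
		return -EINVAL;

	errno = 0;
	val = strtol(buf, &end, (int)base);
	if (errno || end == buf || *end != '\0')
		return -EINVAL;

	/* Reject negatives and anything that would overflow int after the shift. */
	if (val < 0 || val > (INT_MAX >> shift))
		return -EINVAL;

	*res = (int)(val << shift);
	return 0;
}
```

Unlike the simple_strtoul()-based helper, this shape rejects trailing garbage and overflow rather than silently returning a partial value.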

[PATCH v4 8/9] arm: tegra: Migrate to new clock code

2013-01-10 Thread Prashant Gaikwad

Migrate Tegra clock support to drivers/clk/tegra. This involves:
1. Moving the definition of tegra_cpu_car_ops to clk.c.
2. Moving the definition of the reset functions to clk-peripheral.c.
3. Changing the parent of the cpu clock.
4. Removing legacy clock initialization.
5. Initializing clocks using DT.
6. Removing all instances of mach/clk.h.

Signed-off-by: Prashant Gaikwad 
---
 arch/arm/mach-tegra/board-dt-tegra20.c |   30 -
 arch/arm/mach-tegra/board-dt-tegra30.c |   31 --
 arch/arm/mach-tegra/clock.c|   19 -
 arch/arm/mach-tegra/common.c   |   44 +--
 arch/arm/mach-tegra/cpu-tegra.c|2 +-
 arch/arm/mach-tegra/include/mach/clk.h |3 --
 arch/arm/mach-tegra/pcie.c |2 +-
 arch/arm/mach-tegra/powergate.c|2 +-
 drivers/clk/tegra/clk-periph.c |   38 +++
 drivers/clk/tegra/clk.c|   16 +++
 drivers/dma/tegra20-apb-dma.c  |2 +-
 drivers/gpu/drm/tegra/dc.c |3 +-
 drivers/gpu/drm/tegra/drm.c|1 -
 drivers/gpu/drm/tegra/hdmi.c   |3 +-
 drivers/i2c/busses/i2c-tegra.c |3 +-
 drivers/input/keyboard/tegra-kbc.c |2 +-
 drivers/spi/spi-tegra20-sflash.c   |2 +-
 drivers/spi/spi-tegra20-slink.c|2 +-
 drivers/staging/nvec/nvec.c|3 +-
 include/linux/clk/tegra.h  |5 +++
 sound/soc/tegra/tegra30_ahub.c |2 +-
 21 files changed, 73 insertions(+), 142 deletions(-)

diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
b/arch/arm/mach-tegra/board-dt-tegra20.c
index e1f87dd..0c11b8a 100644
--- a/arch/arm/mach-tegra/board-dt-tegra20.c
+++ b/arch/arm/mach-tegra/board-dt-tegra20.c
@@ -42,7 +42,6 @@
 #include 
 
 #include "board.h"
-#include "clock.h"
 #include "common.h"
 #include "iomap.h"
 
@@ -104,37 +103,8 @@ static struct of_dev_auxdata tegra20_auxdata_lookup[] 
__initdata = {
{}
 };
 
-static __initdata struct tegra_clk_init_table tegra_dt_clk_init_table[] = {
-   /* name parent  rateenabled */
-   { "uarta",  "pll_p",21600,  true },
-   { "uartd",  "pll_p",21600,  true },
-   { "usbd",   "clk_m",1200,   false },
-   { "usb2",   "clk_m",1200,   false },
-   { "usb3",   "clk_m",1200,   false },
-   { "pll_a",  "pll_p_out1",   56448000,   true },
-   { "pll_a_out0", "pll_a",11289600,   true },
-   { "cdev1",  NULL,   0,  true },
-   { "blink",  "clk_32k",  32768,  true },
-   { "i2s1",   "pll_a_out0",   11289600,   false},
-   { "i2s2",   "pll_a_out0",   11289600,   false},
-   { "sdmmc1", "pll_p",4800,   false},
-   { "sdmmc3", "pll_p",4800,   false},
-   { "sdmmc4", "pll_p",4800,   false},
-   { "spi","pll_p",2000,   false },
-   { "sbc1",   "pll_p",1,  false },
-   { "sbc2",   "pll_p",1,  false },
-   { "sbc3",   "pll_p",1,  false },
-   { "sbc4",   "pll_p",1,  false },
-   { "host1x", "pll_c",15000,  false },
-   { "disp1",  "pll_p",6,  false },
-   { "disp2",  "pll_p",6,  false },
-   { NULL, NULL,   0,  0},
-};
-
 static void __init tegra_dt_init(void)
 {
-   tegra_clk_init_from_table(tegra_dt_clk_init_table);
-
/*
 * Finished with the static registrations now; fill in the missing
 * devices
diff --git a/arch/arm/mach-tegra/board-dt-tegra30.c 
b/arch/arm/mach-tegra/board-dt-tegra30.c
index cfe5fc0..92f6014 100644
--- a/arch/arm/mach-tegra/board-dt-tegra30.c
+++ b/arch/arm/mach-tegra/board-dt-tegra30.c
@@ -35,7 +35,6 @@
 #include 
 
 #include "board.h"
-#include "clock.h"
 #include "common.h"
 #include "iomap.h"
 
@@ -67,38 +66,8 @@ static struct of_dev_auxdata tegra30_auxdata_lookup[] 
__initdata = {
{}
 };
 
-static __initdata struct tegra_clk_init_table tegra_dt_clk_init_table[] = {
-   /* name parent  rateenabled */
-   { "uarta",  "pll_p",40800,  true },
-   { "pll_a",  "pll_p_out1",   56448,  true },
-   { "pll_a_out0", "pll_a",11289600,   true },
-   { "extern1","pll_a_out0",   0,  true },
-   { "clk_out_1",  "extern1",  0,  true },
-   { "blink",  "clk_32k",  32768,  true },
-   { "i2s0",   "pll_a_out0",   11289600,   false},
-   { "i2s1",   "pll_a_out0",   11289600,   false},
-   { "i2s2",   "pll_a_out0",   11289600,   false},
-

[PATCH v4 5/9] ARM: Tegra: Define Tegra30 CAR binding

2013-01-10 Thread Prashant Gaikwad

The device tree binding models Tegra30 CAR (Clock And Reset)
as a single monolithic clock provider.

Signed-off-by: Prashant Gaikwad 
---
 .../bindings/clock/nvidia,tegra30-car.txt  |  262 
 arch/arm/boot/dts/tegra30.dtsi |6 +
 2 files changed, 268 insertions(+), 0 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/clock/nvidia,tegra30-car.txt

diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra30-car.txt 
b/Documentation/devicetree/bindings/clock/nvidia,tegra30-car.txt
new file mode 100644
index 000..121d203
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/nvidia,tegra30-car.txt
@@ -0,0 +1,262 @@
+NVIDIA Tegra30 Clock And Reset Controller
+
+This binding uses the common clock binding:
+Documentation/devicetree/bindings/clock/clock-bindings.txt
+
+The CAR (Clock And Reset) Controller on Tegra is the HW module responsible
+for muxing and gating Tegra's clocks, and setting their rates.
+
+Required properties :
+- compatible : Should be "nvidia,tegra30-car"
+- reg : Should contain CAR registers location and length
+- clocks : Should contain phandle and clock specifiers for two clocks:
+  the 32 KHz "32k_in", and the board-specific oscillator "osc".
+- #clock-cells : Should be 1.
+  In clock consumers, this cell represents the clock ID exposed by the CAR.
+
+  The first 130 clocks are numbered to match the bits in the CAR's CLK_OUT_ENB
+  registers. These IDs often match those in the CAR's RST_DEVICES registers,
+  but not in all cases. Some bits in CLK_OUT_ENB affect multiple clocks. In
+  this case, those clocks are assigned IDs above 160 in order to highlight
+  this issue. Implementations that interpret these clock IDs as bit values
+  within the CLK_OUT_ENB or RST_DEVICES registers should be careful to
+  explicitly handle these special cases.
+
+  The balance of the clocks controlled by the CAR are assigned IDs of 160 and
+  above.
+
+  0    cpu
+  1    unassigned
+  2    unassigned
+  3    unassigned
+  4    rtc
+  5    timer
+  6    uarta
+  7    unassigned  (register bit affects uartb and vfir)
+  8    gpio
+  9    sdmmc2
+  10   unassigned  (register bit affects spdif_in and spdif_out)
+  11   i2s1
+  12   i2c1
+  13   ndflash
+  14   sdmmc1
+  15   sdmmc4
+  16   unassigned
+  17   pwm
+  18   i2s2
+  19   epp
+  20   unassigned  (register bit affects vi and vi_sensor)
+  21   2d
+  22   usbd
+  23   isp
+  24   3d
+  25   unassigned
+  26   disp2
+  27   disp1
+  28   host1x
+  29   vcp
+  30   i2s0
+  31   cop_cache
+
+  32   mc
+  33   ahbdma
+  34   apbdma
+  35   unassigned
+  36   kbc
+  37   statmon
+  38   pmc
+  39   unassigned  (register bit affects fuse and fuse_burn)
+  40   kfuse
+  41   sbc1
+  42   nor
+  43   unassigned
+  44   sbc2
+  45   unassigned
+  46   sbc3
+  47   i2c5
+  48   dsia
+  49   unassigned  (register bit affects cve and tvo)
+  50   mipi
+  51   hdmi
+  52   csi
+  53   tvdac
+  54   i2c2
+  55   uartc
+  56   unassigned
+  57   emc
+  58   usb2
+  59   usb3
+  60   mpe
+  61   vde
+  62   bsea
+  63   bsev
+
+  64   speedo
+  65   uartd
+  66   uarte
+  67   i2c3
+  68   sbc4
+  69   sdmmc3
+  70   pcie
+  71   owr
+  72   afi
+  73   csite
+  74   pciex
+  75   avpucq
+  76   la
+  77   unassigned
+  78   unassigned
+  79   dtv
+  80   ndspeed
+  81   i2cslow
+  82   dsib
+  83   unassigned
+  84   irama
+  85   iramb
+  86   iramc
+  87   iramd
+  88   cram2
+  89   unassigned
+  90   audio_2x a/k/a audio_2x_sync_clk
+  91   unassigned
+  92   csus
+  93   cdev2
+  94   cdev1
+  95   unassigned
+
+  96   cpu_g
+  97   cpu_lp
+  98   3d2
+  99   mselect
+  100  tsensor
+  101  i2s3
+  102  i2s4
+  103  i2c4
+  104  sbc5
+  105  sbc6
+  106  d_audio
+  107  apbif
+  108  dam0
+  109  dam1
+  110  dam2
+  111  hda2codec_2x
+  112  atomics
+  113  audio0_2x
+  114  audio1_2x
+  115  audio2_2x
+  116  audio3_2x
+  117  audio4_2x
+  118  audio5_2x
+  119  actmon
+  120  extern1
+  121  extern2
+  122  extern3
+  123  sata_oob
+  124  sata
+  125  hda
+  127  se
+  128  hda2hdmi
+  129  sata_cold
+
+  160  uartb
+  161  vfir
+  162  spdif_in
+  163  spdif_out
+  164  vi
+  165  vi_sensor
+  166  fuse
+  167  fuse_burn
+  168  cve
+  169  tvo
+
+  170  clk_32k
+  171  clk_m
+  172  clk_m_div2
+  173  clk_m_div4
+  174  pll_ref
+  175  pll_c
+  176  pll_c_out1
+  177  pll_m
+  178  pll_m_out1
+  179  pll_p
+  180  pll_p_out1
+  181  pll_p_out2
+  182  pll_p_out3
+  183  pll_p_out4
+  184  pll_a
+  185  pll_a_out0
+  186  pll_d
+  187  pll_d_out0
+  188  pll_d2
+  189  pll_d2_out0
+  190  pll_u
+  191  pll_x
+  192  pll_x_out0
+  193  pll_e
+  194  spdif_in_sync
+  195  i2s0_sync
+  196  i2s1_sync
+  197  i2s2_sync
+  198  i2s3_sync
+  199  i2s4_sync
+  200  vimclk
+  201  audio0
+  202  audio1
+  203  audio2
+  204  audio3
+  205  audio4
+  206  audio5
+  207  clk_out_1 (extern1)
+  208  clk_out_2 (extern2)
+  209  clk_out_3 (extern3)
+  210  sclk
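
The numbering rule in the binding text above can be sketched as follows. This is an illustration only: struct car_bit and car_clk_id_to_bit() are hypothetical names, and the 32-bits-per-register layout is an assumption inferred from the way the ID list is grouped, not something the binding states. IDs listed as unassigned and IDs of 160 and above still need the explicit special-case handling the binding asks for.

```c
/* Hypothetical helper type: which CLK_OUT_ENB register and which bit. */
struct car_bit {
	unsigned int reg_index;
	unsigned int bit;
};

/*
 * Map a clock ID to a CLK_OUT_ENB register index and bit, assuming
 * 32 enable bits per register. Returns 0 on success, or -1 for IDs
 * that do not correspond directly to an enable bit (130 and above).
 */
int car_clk_id_to_bit(unsigned int id, struct car_bit *out)
{
	if (id >= 130)
		return -1;

	out->reg_index = id / 32;
	out->bit = id % 32;
	return 0;
}
```

Under these assumptions, ID 6 (uarta) lands on bit 6 of the first register and ID 65 (uartd) on bit 1 of the third.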

Re: Oops in sound/usb/pcm.c:match_endpoint_audioformats() in current -git

2013-01-10 Thread Jens Axboe

On 2013-01-10 21:19, Takashi Iwai wrote:
> From: Takashi Iwai 
> Subject: [PATCH v2] ALSA: usb-audio: Fix NULL dereference by access to
>  non-existing substream
> 
> The commit [0d9741c0: ALSA: usb-audio: sync ep init fix for
> audioformat mismatch] introduced the correction of parameters to be
> set for sync EP.  But since the new code assumes that the sync EP is
> always paired with the data EP of another direction, it triggers Oops
> when a device only with a single direction is used.
> 
> This patch adds a proper check of sync EP type and the presence of the
> paired substream for avoiding the crash.
> 
> Reported-by: Jens Axboe 
> Signed-off-by: Takashi Iwai 

Confirmed, it works. You can add my tested-by too. Thanks Takashi!

-- 
Jens Axboe


[PATCH v4 0/9] Migrate Tegra to common clock framework

2013-01-10 Thread Prashant Gaikwad

This patchset does the following:
1. Decompose single tegra clock structure into multiple clocks.
2. Try to use standard clock types supported by common clock framework.
3. Use dynamic initialization.
4. Move all clock code to drivers/clk/tegra from mach-tegra.
5. Add device tree support for Tegra20 and Tegra30 clocks.
6. Remove all legacy clock code from mach-tegra.

Tested on Tegra30 (Cardhu) and Tegra20 (Ventana).

This patch series is rebased on Tegra's for-3.9/soc and for-3.9/cleanup branch.

Changes from V3:
Fixed issues reported by Stephen Warren.

Changes from v2:
Removed APB MISC node.
Fixed some issues reported by Joseph Lo.
Added function to read chip id revision register.

Changes from v1:
Rebased on linux-next for 20121224.

Prashant Gaikwad (8):
  ARM: tegra: Add function to read chipid
  clk: tegra: Add tegra specific clocks
  arm: tegra: Move tegra_cpu_car.h to linux/clk/tegra.h
  ARM: Tegra: Define Tegra30 CAR binding
  clk: tegra: add clock support for tegra20
  clk: tegra: add clock support for tegra30
  arm: tegra: Migrate to new clock code
  arm: tegra: Remove legacy clock code

Stephen Warren (1):
  ARM: tegra: Define Tegra20 CAR binding

 .../bindings/clock/nvidia,tegra20-car.txt  |  205 ++
 .../bindings/clock/nvidia,tegra30-car.txt  |  262 ++
 arch/arm/boot/dts/tegra20.dtsi |6 +
 arch/arm/boot/dts/tegra30.dtsi |6 +
 arch/arm/mach-tegra/Makefile   |5 -
 arch/arm/mach-tegra/board-dt-tegra20.c |   30 -
 arch/arm/mach-tegra/board-dt-tegra30.c |   31 -
 arch/arm/mach-tegra/clock.c|  166 --
 arch/arm/mach-tegra/clock.h|  153 --
 arch/arm/mach-tegra/common.c   |   44 +-
 arch/arm/mach-tegra/cpu-tegra.c|2 +-
 arch/arm/mach-tegra/cpuidle-tegra30.c  |2 +-
 arch/arm/mach-tegra/fuse.c |8 +-
 arch/arm/mach-tegra/hotplug.c  |2 +-
 arch/arm/mach-tegra/include/mach/clk.h |   44 -
 arch/arm/mach-tegra/pcie.c |2 +-
 arch/arm/mach-tegra/platsmp.c  |2 +-
 arch/arm/mach-tegra/pm.c   |2 +-
 arch/arm/mach-tegra/powergate.c|2 +-
 arch/arm/mach-tegra/tegra20_clocks.c   | 1623 -
 arch/arm/mach-tegra/tegra20_clocks.h   |   42 -
 arch/arm/mach-tegra/tegra20_clocks_data.c  | 1143 -
 arch/arm/mach-tegra/tegra30_clocks.c   | 2506 
 arch/arm/mach-tegra/tegra30_clocks.h   |   54 -
 arch/arm/mach-tegra/tegra30_clocks_data.c  | 1425 ---
 drivers/clk/Makefile   |1 +
 drivers/clk/tegra/Makefile |   11 +
 drivers/clk/tegra/clk-audio-sync.c |   89 +
 drivers/clk/tegra/clk-divider.c|  188 ++
 drivers/clk/tegra/clk-periph-gate.c|  182 ++
 drivers/clk/tegra/clk-periph.c |  228 ++
 drivers/clk/tegra/clk-pll-out.c|  124 +
 drivers/clk/tegra/clk-pll.c|  676 ++
 drivers/clk/tegra/clk-super.c  |  154 ++
 drivers/clk/tegra/clk-tegra20.c| 1255 ++
 drivers/clk/tegra/clk-tegra30.c| 2041 
 drivers/clk/tegra/clk.c|   85 +
 drivers/clk/tegra/clk.h|  488 
 drivers/dma/tegra20-apb-dma.c  |2 +-
 drivers/gpu/drm/tegra/dc.c |3 +-
 drivers/gpu/drm/tegra/drm.c|1 -
 drivers/gpu/drm/tegra/hdmi.c   |3 +-
 drivers/i2c/busses/i2c-tegra.c |3 +-
 drivers/input/keyboard/tegra-kbc.c |2 +-
 drivers/spi/spi-tegra20-sflash.c   |2 +-
 drivers/spi/spi-tegra20-slink.c|2 +-
 drivers/staging/nvec/nvec.c|3 +-
 .../tegra_cpu_car.h => include/linux/clk/tegra.h   |   13 +-
 include/linux/tegra-soc.h  |   22 +
 sound/soc/tegra/tegra30_ahub.c |2 +-
 50 files changed, 6056 insertions(+), 7291 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/clock/nvidia,tegra20-car.txt
 create mode 100644 
Documentation/devicetree/bindings/clock/nvidia,tegra30-car.txt
 delete mode 100644 arch/arm/mach-tegra/clock.c
 delete mode 100644 arch/arm/mach-tegra/clock.h
 delete mode 100644 arch/arm/mach-tegra/include/mach/clk.h
 delete mode 100644 arch/arm/mach-tegra/tegra20_clocks.c
 delete mode 100644 arch/arm/mach-tegra/tegra20_clocks.h
 delete mode 100644 arch/arm/mach-tegra/tegra20_clocks_data.c
 delete mode 100644 arch/arm/mach-tegra/tegra30_clocks.c
 delete mode 100644

[PATCH v4 1/9] ARM: tegra: Add function to read chipid

2013-01-10 Thread Prashant Gaikwad

Add a function to read the chip ID from the APB MISC registers. This
function will also be called from the clock driver to flush write
operations on the APB bus.

Signed-off-by: Prashant Gaikwad 
---
 arch/arm/mach-tegra/fuse.c |8 +++-
 include/linux/tegra-soc.h  |   22 ++
 2 files changed, 29 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/tegra-soc.h

diff --git a/arch/arm/mach-tegra/fuse.c b/arch/arm/mach-tegra/fuse.c
index 8121742..f7db078 100644
--- a/arch/arm/mach-tegra/fuse.c
+++ b/arch/arm/mach-tegra/fuse.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "fuse.h"
 #include "iomap.h"
@@ -105,6 +106,11 @@ static void tegra_get_process_id(void)
tegra_core_process_id = (reg >> 12) & 3;
 }
 
+u32 tegra_read_chipid(void)
+{
+   return readl_relaxed(IO_ADDRESS(TEGRA_APB_MISC_BASE) + 0x804);
+}
+
 void tegra_init_fuse(void)
 {
u32 id;
@@ -119,7 +125,7 @@ void tegra_init_fuse(void)
reg = tegra_apb_readl(TEGRA_APB_MISC_BASE + STRAP_OPT);
tegra_bct_strapping = (reg & RAM_ID_MASK) >> RAM_CODE_SHIFT;
 
-   id = readl_relaxed(IO_ADDRESS(TEGRA_APB_MISC_BASE) + 0x804);
+   id = tegra_read_chipid();
tegra_chip_id = (id >> 8) & 0xff;
 
switch (tegra_chip_id) {
diff --git a/include/linux/tegra-soc.h b/include/linux/tegra-soc.h
new file mode 100644
index 000..95f611d
--- /dev/null
+++ b/include/linux/tegra-soc.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (c) 2012, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __LINUX_TEGRA_SOC_H_
+#define __LINUX_TEGRA_SOC_H_
+
+u32 tegra_read_chipid(void);
+
+#endif /* __LINUX_TEGRA_SOC_H_ */
-- 
1.7.4.1
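
As a small illustration of how the value returned by tegra_read_chipid() is consumed in tegra_init_fuse() above, the field extraction can be sketched in plain C. The helper name chipid_from_reg() and the two CHIPID_* constants are assumptions made for this example and should be checked against the SoC documentation.

```c
#include <stdint.h>

/* Assumed chip ID values, for illustration only. */
#define CHIPID_TEGRA20 0x20
#define CHIPID_TEGRA30 0x30

/* Extract bits 15:8 of the register value, as tegra_init_fuse() does. */
uint32_t chipid_from_reg(uint32_t reg)
{
	return (reg >> 8) & 0xff;
}
```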


Re: [PATCH RESEND v1 03/16] vfs: add I/O frequency update function

2013-01-10 Thread Zhi Yong Wu

On Thu, Jan 10, 2013 at 8:51 AM, David Sterba  wrote:
> On Thu, Dec 20, 2012 at 10:43:22PM +0800, zwu.ker...@gmail.com wrote:
>> --- a/fs/hot_tracking.c
>> +++ b/fs/hot_tracking.c
>> @@ -164,6 +164,135 @@ static void hot_inode_tree_exit(struct hot_info *root)
>>   spin_unlock(&root->lock);
>>  }
>>
>> +struct hot_inode_item
>> +*hot_inode_item_lookup(struct hot_info *root, u64 ino)
>> +{
>> + struct rb_node **p = &root->hot_inode_tree.map.rb_node;
>> + struct rb_node *parent = NULL;
>> + struct hot_comm_item *ci;
>> + struct hot_inode_item *entry;
>> +
>> + /* walk tree to find insertion point */
>> + spin_lock(&root->lock);
>> + while (*p) {
>> + parent = *p;
>> + ci = rb_entry(parent, struct hot_comm_item, rb_node);
>> + entry = container_of(ci, struct hot_inode_item, hot_inode);
>> + if (ino < entry->i_ino)
>> + p = &(*p)->rb_left;
>> + else if (ino > entry->i_ino)
>> + p = &(*p)->rb_right;
>
> style comment: put { } around the all if/else blocks,
no, it will violate checkpatch.pl. If the if/else block only contains
one line of code, we should not put {} around them.
>
>> + else {
>> + spin_unlock(&root->lock);
>> + kref_get(&entry->hot_inode.refs);
>
> jumping forwards in the series, the spin_unlock and kref_get get swapped
> later, and I think that's the right order. Otherwise there's a small
> window where the entry does not get the reference and could be
> potentially freed by racing kref_put, no?
yes, good catch, thanks, done
>
> 
> spin_unlock(tree)
>  spin_lock(tree)
>  
>  kref_put(E) or via hot_inode_item_put(E) (1)
> kref_get(E)   (2)
>
>
> if the reference count at (1) was 1, it's freed and (2) hits a free
> memory. hot_inode_item_put can be called from filesystem or via seq
> print of the respective /proc files, so I think there are chances to hit
> the problem.
Great.
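
A minimal userspace sketch of the ordering described above, taking the reference while the lock is still held. It assumes a pthread mutex in place of the spinlock, a plain counter in place of kref, and a linked list in place of the rb-tree; struct item, struct table, item_lookup_get() and item_put() are hypothetical names, not part of the patch.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for hot_inode_item and its tree. */
struct item {
	unsigned long ino;
	int refs;		/* protected by table->lock in this sketch */
	struct item *next;
};

struct table {
	pthread_mutex_t lock;
	struct item *head;
};

/*
 * Look up an item and take a reference before dropping the lock, so a
 * concurrent item_put() cannot release the last reference in between.
 */
struct item *item_lookup_get(struct table *t, unsigned long ino)
{
	struct item *it;

	pthread_mutex_lock(&t->lock);
	for (it = t->head; it; it = it->next) {
		if (it->ino == ino) {
			it->refs++;	/* reference taken while still locked */
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
	return it;
}

/* Drop a reference; returns 1 when the last reference went away. */
int item_put(struct table *t, struct item *it)
{
	int last;

	pthread_mutex_lock(&t->lock);
	last = (--it->refs == 0);
	pthread_mutex_unlock(&t->lock);
	return last;
}
```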
>
>> + return entry;
>> + }
>> + }
>> + spin_unlock(&root->lock);
>> +
>> + entry = kmem_cache_zalloc(hot_inode_item_cachep, GFP_NOFS);
>> + if (!entry)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + spin_lock(&root->lock);
>> + hot_inode_item_init(entry, ino, &root->hot_inode_tree);
>> + rb_link_node(&entry->hot_inode.rb_node, parent, p);
>> + rb_insert_color(&entry->hot_inode.rb_node,
>> + &root->hot_inode_tree.map);
>> + spin_unlock(&root->lock);
>> +
>> + kref_get(&entry->hot_inode.refs);
>
> Similar here, the entry is inserted into the tree but there's no
> refcount yet. And the order of spin_unlock/kref_get remains unchanged.
ditto
>
>> + return entry;
>> +}
>> +EXPORT_SYMBOL_GPL(hot_inode_item_lookup);
>> +
>> +static struct hot_range_item
>> +*hot_range_item_lookup(struct hot_inode_item *he,
>> + loff_t start)
>> +{
>> + struct rb_node **p = &he->hot_range_tree.map.rb_node;
>> + struct rb_node *parent = NULL;
>> + struct hot_comm_item *ci;
>> + struct hot_range_item *entry;
>> +
>> + /* walk tree to find insertion point */
>> + spin_lock(&he->lock);
>> + while (*p) {
>> + parent = *p;
>> + ci = rb_entry(parent, struct hot_comm_item, rb_node);
>> + entry = container_of(ci, struct hot_range_item, hot_range);
>> + if (start < entry->start)
>> + p = &(*p)->rb_left;
>> + else if (start > hot_range_end(entry))
>> + p = &(*p)->rb_right;
>
> if { ...}
> else if { ... }
We should not put {} around them, as I explained above.
>
>> + else {
>> + spin_unlock(&he->lock);
>> + kref_get(&entry->hot_range.refs);
>
> same here
Done
>
>> + return entry;
>> + }
>> + }
>> + spin_unlock(&he->lock);
>> +
>> + entry = kmem_cache_zalloc(hot_range_item_cachep, GFP_NOFS);
>> + if (!entry)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + spin_lock(&he->lock);
>> + hot_range_item_init(entry, start, he);
>> + rb_link_node(&entry->hot_range.rb_node, parent, p);
>> + rb_insert_color(&entry->hot_range.rb_node,
>> + &he->hot_range_tree.map);
>> + spin_unlock(&he->lock);
>> +
>> + kref_get(&entry->hot_range.refs);
>
> and here
Done
>
>> + return entry;
>> +}
>> +
>> +/*
>> + * This function does the actual work of updating
>> + * the frequency numbers, whatever they turn out to be.
>
> Can this function be described a bit better? This comment did not help.
OK, i will
>
>> + */
>> +static void hot_rw_freq_calc(struct timespec old_atime,
>> + struct timespec cur_time, u64 *avg)
>> +{
>> + struct timespec delta_ts;
>> + u64 new_delta;
>> +
>> + delta_ts = timespec_sub(cur_time, old_atime);
>> + new_delta = timespec_to_ns(&delta_ts) >> FREQ_POWER;
>> +
>> + *avg = (*avg <<

Re: linux-next: build failure after merge of the scsi tree

2013-01-10 Thread James Bottomley

On Fri, 2013-01-11 at 12:03 +1100, Stephen Rothwell wrote:
> Hi James,
> 
> After merging the scsi tree, today's linux-next build (powerpc
> ppc64_defconfig) failed like this:
> 
> drivers/scsi/ipr.c:9138:22: error: expected '=', ',', ';', 'asm' or 
> '__attribute__' before 'ipr_enable_msix'
> drivers/scsi/ipr.c:9165:22: error: expected '=', ',', ';', 'asm' or 
> '__attribute__' before 'ipr_enable_msi'
> drivers/scsi/ipr.c:9188:23: error: expected '=', ',', ';', 'asm' or 
> '__attribute__' before 'name_msi_vectors'
> drivers/scsi/ipr.c:9200:22: error: expected '=', ',', ';', 'asm' or 
> '__attribute__' before 'ipr_request_other_msi_irqs'
> drivers/scsi/ipr.c: In function 'ipr_probe_ioa':
> drivers/scsi/ipr.c:9422:4: error: implicit declaration of function 
> 'ipr_enable_msix' [-Werror=implicit-function-declaration]
> drivers/scsi/ipr.c:9425:4: error: implicit declaration of function 
> 'ipr_enable_msi' [-Werror=implicit-function-declaration]
> drivers/scsi/ipr.c:9517:3: error: implicit declaration of function 
> 'name_msi_vectors' [-Werror=implicit-function-declaration]
> drivers/scsi/ipr.c:9523:4: error: implicit declaration of function 
> 'ipr_request_other_msi_irqs' [-Werror=implicit-function-declaration]

OK, fine, I'll drop all the ipr patches.  I've been waiting for a month
for them to fix the smatch and sparse warnings.  Please resend the
series with all the fixes.

Thanks,

James



Re: [PATCH RESEND v1 02/16] vfs: add init and cleanup functions

2013-01-10 Thread Zhi Yong Wu

On Thu, Jan 10, 2013 at 8:48 AM, David Sterba  wrote:
> On Thu, Dec 20, 2012 at 10:43:21PM +0800, zwu.ker...@gmail.com wrote:
>> From: Zhi Yong Wu 
>> --- a/fs/hot_tracking.c
>> +++ b/fs/hot_tracking.c
>> @@ -107,3 +189,38 @@ err:
>>   kmem_cache_destroy(hot_inode_item_cachep);
>>  }
>>  EXPORT_SYMBOL_GPL(hot_cache_init);
>> +
>> +/*
>> + * Initialize the data structures for hot data tracking.
>> + */
>> +int hot_track_init(struct super_block *sb)
>> +{
>> + struct hot_info *root;
>> + int ret = -ENOMEM;
>> +
>> + root = kzalloc(sizeof(struct hot_info), GFP_NOFS);
>> + if (!root) {
>> + printk(KERN_ERR "%s: Failed to malloc memory for "
>> + "hot_info\n", __func__);
>> + return ret;
>> + }
>> +
>> + hot_inode_tree_init(root);
>
> This function is supposed to be called from the filesystem init, please
> add a sanity check that would catch multiple initialization attempts.
Good catch, thanks. Done.

>
>> +
>> + sb->s_hot_root = root;
>> +
>> + printk(KERN_INFO "VFS: Turning on hot data tracking\n");
>> +
>> + return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(hot_track_init);
>> +
>> +void hot_track_exit(struct super_block *sb)
>> +{
>> + struct hot_info *root = sb->s_hot_root;
>
> another sanity check to catch the opposite.
ditto.
>
> Why? The option is parsed and enabled from the filesystems, due to
> unexpected bugs eg with remounting or incorrectly handled error paths,
> vfs layer should IMHO rather warn than crash.
thanks for your explanation.
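
A rough sketch of what the two sanity checks could look like, written as plain userspace C so it can be compiled on its own. The struct and function names with the _sketch suffix are hypothetical stand-ins, calloc()/free() replace kzalloc()/kfree(), and the choice of -EBUSY and -EINVAL as return values is an assumption rather than what the next revision of the patch will use.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct hot_info and struct super_block. */
struct hot_info_sketch { int placeholder; };
struct sb_sketch { struct hot_info_sketch *s_hot_root; };

/* Refuse a second initialization instead of leaking the first root. */
int hot_track_init_sketch(struct sb_sketch *sb)
{
	struct hot_info_sketch *root;

	if (sb->s_hot_root)
		return -EBUSY;	/* already initialized (assumed error code) */

	root = calloc(1, sizeof(*root));
	if (!root)
		return -ENOMEM;

	sb->s_hot_root = root;
	return 0;
}

/* Tolerate exit without a matching init: report it rather than crash. */
int hot_track_exit_sketch(struct sb_sketch *sb)
{
	if (!sb->s_hot_root)
		return -EINVAL;	/* nothing to tear down (assumed error code) */

	free(sb->s_hot_root);
	sb->s_hot_root = NULL;
	return 0;
}
```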
>
>> +
>> + hot_inode_tree_exit(root);
>> + sb->s_hot_root = NULL;
>> + kfree(root);
>> +}
>> +EXPORT_SYMBOL_GPL(hot_track_exit);
>
>
> david



-- 
Regards,

Zhi Yong Wu

[PATCH] arm: fix returning wrong CALLER_ADDRx

2013-01-10 Thread Keun-O Park

From: sahara 

This makes return_address() return the correct value for the ftrace
feature. During a backtrace, unwind_frame() updates frame->pc, not
frame->lr. Also, the initialization of data.addr was missing, so a wrong
value was returned when unwind_frame() failed.

Signed-off-by: sahara 
---
 arch/arm/kernel/return_address.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/return_address.c b/arch/arm/kernel/return_address.c
index 8085417..fafedd8 100644
--- a/arch/arm/kernel/return_address.c
+++ b/arch/arm/kernel/return_address.c
@@ -26,7 +26,7 @@ static int save_return_addr(struct stackframe *frame, void *d)
struct return_address_data *data = d;
 
if (!data->level) {
-   data->addr = (void *)frame->lr;
+   data->addr = (void *)frame->pc;
 
return 1;
} else {
@@ -41,7 +41,8 @@ void *return_address(unsigned int level)
struct stackframe frame;
register unsigned long current_sp asm ("sp");
 
-   data.level = level + 1;
+   data.level = level + 2;
+   data.addr = NULL;
 
frame.fp = (unsigned long)__builtin_frame_address(0);
frame.sp = current_sp;
-- 
1.7.1


3.8-rc3: yet another MIPS build failure

2013-01-10 Thread Aaro Koskinen

Hi,

Commit d3ce88431892b703b04769566338a89eda6b0477 (MIPS: Fix modpost
error in modules attepting to use virt_addr_valid()) broke the 64-bit
MIPS build:

  LD  init/built-in.o
kernel/built-in.o: In function `memory_bm_free':
snapshot.c:(.text+0x3c76c): undefined reference to `__virt_addr_valid'
snapshot.c:(.text+0x3c800): undefined reference to `__virt_addr_valid'
kernel/built-in.o: In function `snapshot_write_next':
(.text+0x3e094): undefined reference to `__virt_addr_valid'
kernel/built-in.o: In function `snapshot_write_next':
(.text+0x3e468): undefined reference to `__virt_addr_valid'
make[4]: *** [vmlinux] Error 1

A quick workaround is to compile ioremap.c always, but it adds ~2KB
unused code for 64-bit-only kernels...

A.

Re: [PATCH v3 16/22] sched: add power aware scheduling in fork/exec/wake

2013-01-10 Thread Alex Shi

On 01/10/2013 11:01 PM, Morten Rasmussen wrote:
> On Sat, Jan 05, 2013 at 08:37:45AM +, Alex Shi wrote:
>> This patch adds power aware scheduling in fork/exec/wake. It tries to
>> select a cpu from the busiest group that still has spare utilization.
>> That will save power for other groups.
>>
>> The trade off is adding a power aware statistics collection in group
>> seeking. But since the collection only happens when power scheduling
>> is eligible, the worst case of hackbench testing just drops
>> about 2% with powersaving/balance policy. No clear change for
>> performance policy.
>>
>> I had tried to use rq load avg utilisation in this balancing, but since
>> the utilisation needs much time to accumulate, it's unfit for any
>> burst balancing. So I use nr_running as instant rq utilisation.
> 
> So you effectively use a mix of nr_running (counting tasks) and PJT's
> tracked load for balancing?

no, just task number here.
> 
> The problem of slow reaction time of the tracked load of a cpu/rq is an
> interesting one. Would it be possible to use it if you maintained a
> sched group runnable_load_avg similar to cfs_rq->runnable_load_avg where
> the load contribution of a task is added when the task is enqueued and
> removed again if it migrates to another cpu?
> This way you would know the new load of the sched group/domain instantly
> when you migrate a task there. It might not be precise, as the load
> contribution of the task to some extent depends on the load of the cpu
> where it is running. But it would probably be a fair estimate, which is
> quite likely to be better than just counting tasks (nr_running).

For the power consideration scenario, it only asks that the task number
be less than the logical CPU (LCPU) number. It doesn't care about the
load weight, since whatever the load weight, a task can only burn one LCPU.

>> +
>> +if (sched_policy == SCHED_POLICY_POWERSAVING)
>> +threshold = sgs.group_weight;
>> +else
>> +threshold = sgs.group_capacity;
> 
> Is group_capacity larger or smaller than group_weight on your platform?

I guess most of your confusion comes from capacity != weight here.

On most Intel CPUs, a cpu core's power (with 2 HT siblings) is usually
1178, which is just a bit bigger than a normal cpu power of 1024. But the
capacity is still 1, while the group weight is 2.
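
The arithmetic behind this can be sketched in a few lines, assuming the capacity is the group power divided by SCHED_POWER_SCALE (1024) and rounded to the nearest integer. The helper names div_round_closest() and group_capacity_sketch() are made up for this example and are not the scheduler's own functions.

```c
#define SCHED_POWER_SCALE 1024UL

/* Round-to-nearest division for unsigned operands. */
static unsigned long div_round_closest(unsigned long x, unsigned long d)
{
	return (x + d / 2) / d;
}

/* Capacity of a group, assuming it is power / 1024 rounded to nearest. */
unsigned long group_capacity_sketch(unsigned long group_power)
{
	return div_round_closest(group_power, SCHED_POWER_SCALE);
}
```

With a power of 1178 this gives a capacity of 1 for a core whose group weight is 2, so the powersaving threshold (weight) is larger than the balance threshold (capacity) on such a core.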


Re: BUG: kernel panic on Dell Vostro 3560 when plugging in AC adapter

2013-01-10 Thread Sujith Manoharan

Adrian Byszuk wrote:
> Aaaand failure again!
> I haven't finished bisecting yet, but as of now I have ~20 commits left:
> all related to 'mtd' - probably not a source of troubles too.
> Any other way to diagnose this bug?

I think this has been fixed by:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=3935e89505a1c3ab3f3b0c7ef0eae54124f48905

For Arch Linux, it was tracked here:
https://bugs.archlinux.org/task/33095

Sujith

Re: [git pull] drm intel fixes

2013-01-10 Thread Heinz Diehl

On 11.01.2013, Dave Airlie wrote: 

> Just intel fixes, including getting the Ironlake systems back to the state 
> they were in for 3.6.

>   drm/i915: Revert shrinker changes from "Track unbound pages"

I guess it's this one which fixes the ILK hang. Would it be enough for
3.7 to just apply this patch to get the problem fixed?


[patch] dac960: return success instead of -ENOTTY

2013-01-10 Thread Dan Carpenter

There is a missing break statement here.  This used to return directly
but we re-worked it in 2008 to add locking as part of the BKL push down.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/block/DAC960.c b/drivers/block/DAC960.c
index 9a13e88..0d3ffc5 100644
--- a/drivers/block/DAC960.c
+++ b/drivers/block/DAC960.c
@@ -7054,6 +7054,7 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
else
ErrorCode =  0;
   }
+  break;
   default:
ErrorCode = -ENOTTY;
 }
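
For readers who want to see the effect in isolation, here is a minimal
standalone sketch of the same pattern. The request code and function names
are hypothetical, not taken from the driver; it only shows how a handled case
that falls through into default ends up returning -ENOTTY.

```c
#include <errno.h>

/* Hypothetical request code, for illustration only. */
enum { REQ_KNOWN = 1 };

/* Buggy shape: the handled case runs on into default. */
static long ioctl_without_break(unsigned int request)
{
    long error_code;

    switch (request) {
    case REQ_KNOWN:
        error_code = 0;
        /* falls through - this is the bug being fixed */
    default:
        error_code = -ENOTTY;
    }
    return error_code;
}

/* Fixed shape: break keeps the success value. */
static long ioctl_with_break(unsigned int request)
{
    long error_code;

    switch (request) {
    case REQ_KNOWN:
        error_code = 0;
        break;
    default:
        error_code = -ENOTTY;
    }
    return error_code;
}
```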

Re: 3.8-rc2/rc3 write() blocked on CLOSE_WAIT TCP socket

2013-01-10 Thread David Miller

From: Eric Dumazet 
Date: Thu, 10 Jan 2013 18:18:47 -0800

> [PATCH] tcp: accept RST without ACK flag
> 
> commit c3ae62af8e755 (tcp: should drop incoming frames without ACK flag
> set) added a regression on the handling of RST messages.
> 
> RST should be allowed to come even without ACK bit set. We validate
> the RST by checking the exact sequence, as requested by RFC 793 and 
> 5961 3.2, in tcp_validate_incoming()
> 
> Reported-by: Eric Wong 
> Signed-off-by: Eric Dumazet 

Applied, thanks Eric.
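
As a rough illustration of the rule described in the changelog, here is a
simplified userspace sketch. It is not the kernel's tcp_validate_incoming();
the enum, the function names and the flat argument list are assumptions made
for the example. It follows the RFC 5961 section 3.2 handling of RST: exact
sequence match resets, in-window but inexact gets a challenge ACK, anything
else is dropped.

```c
#include <stdbool.h>
#include <stdint.h>

enum verdict { SEG_DROP, SEG_RESET, SEG_CHALLENGE_ACK, SEG_PROCESS };

/* Is seq inside [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2^32? */
static bool in_window(uint32_t seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    return (uint32_t)(seq - rcv_nxt) < rcv_wnd;
}

static enum verdict classify(bool ack, bool rst, uint32_t seq,
                             uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    /* Only frames with neither ACK nor RST are dropped outright. */
    if (!ack && !rst)
        return SEG_DROP;

    if (rst) {
        if (seq == rcv_nxt)
            return SEG_RESET;          /* exact match: accept the RST */
        if (in_window(seq, rcv_nxt, rcv_wnd))
            return SEG_CHALLENGE_ACK;  /* in window but not exact */
        return SEG_DROP;               /* out of window */
    }
    return SEG_PROCESS;
}
```

The point of the fix is the first test: a RST with no ACK bit must still
reach the sequence check instead of being discarded early.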

[patch] ibft: add a missing break statement

2013-01-10 Thread Dan Carpenter

The code works the same with or without the break.  It just looks a bit
cleaner to not fall through.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/firmware/iscsi_ibft.c b/drivers/firmware/iscsi_ibft.c
index 3ee852c..c4b187c 100644
--- a/drivers/firmware/iscsi_ibft.c
+++ b/drivers/firmware/iscsi_ibft.c
@@ -503,6 +503,7 @@ static umode_t __init ibft_check_tgt_for(void *data, int type)
case ISCSI_BOOT_TGT_NIC_ASSOC:
case ISCSI_BOOT_TGT_CHAP_TYPE:
rc = S_IRUGO;
+   break;
case ISCSI_BOOT_TGT_NAME:
if (tgt->tgt_name_len)
rc = S_IRUGO;

Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.

2013-01-10 Thread Rusty Russell

Untested, but I wanted to post before the weekend.

I think the implementation is a bit nicer, and though we have a callback
to get the guest-to-userspace offset, it might be faster since I think
most cases will re-use the same mapping.

Feedback on API welcome!
Rusty.

virtio_host: host-side implementation of virtio rings (untested!)

Getting use of virtio rings correct is tricky, and a recent patch saw
an implementation of in-kernel rings (as separate from userspace).

This patch attempts to abstract the business of dealing with the
virtio ring layout from the access (userspace or direct); to do this,
we use function pointers, which gcc inlines correctly.

FIXME: strong barriers a-la virtio weak_barrier flag.
FIXME: separate notify call with flag if we wrapped.
FIXME: move to vhost/vringh.c.
FIXME: test :)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 202bba6..38ec470 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -1,6 +1,7 @@
 config VHOST_NET
tristate "Host kernel accelerator for virtio net (EXPERIMENTAL)"
depends on NET && EVENTFD && (TUN || !TUN) && (MACVTAP || !MACVTAP) && EXPERIMENTAL
+   select VHOST
---help---
  This kernel module can be loaded in host kernel to accelerate
  guest networking with virtio_net. Not to be confused with virtio_net
diff --git a/drivers/vhost/Kconfig.tcm b/drivers/vhost/Kconfig.tcm
index a9c6f76..f4c3704 100644
--- a/drivers/vhost/Kconfig.tcm
+++ b/drivers/vhost/Kconfig.tcm
@@ -1,6 +1,7 @@
 config TCM_VHOST
tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+   select VHOST
default n
---help---
Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 8d5bddb..fd95d3e 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -5,6 +5,12 @@ config VIRTIO
  bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_LGUEST,
  CONFIG_RPMSG or CONFIG_S390_GUEST.
 
+config VHOST
+   tristate
+   ---help---
+ This option is selected by any driver which needs to access
+ the host side of a virtio ring.
+
 menu "Virtio drivers"
 
 config VIRTIO_PCI
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 9076635..9833cd5 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -2,3 +2,4 @@ obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
 obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
 obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
+obj-$(CONFIG_VHOST) += virtio_host.o
diff --git a/drivers/virtio/virtio_host.c b/drivers/virtio/virtio_host.c
new file mode 100644
index 0000000..7416741
--- /dev/null
+++ b/drivers/virtio/virtio_host.c
@@ -0,0 +1,618 @@
+/*
+ * Helpers for the host side of a virtio ring.
+ *
+ * Since these may be in userspace, we use (inline) accessors.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
+{
+   static DEFINE_RATELIMIT_STATE(vringh_rs,
+ DEFAULT_RATELIMIT_INTERVAL,
+ DEFAULT_RATELIMIT_BURST);
+   if (__ratelimit(&vringh_rs)) {
+   va_list ap;
+   va_start(ap, fmt);
+   printk(KERN_NOTICE "vringh:");
+   vprintk(fmt, ap);
+   va_end(ap);
+   }
+}
+
+/* Returns vring->num if empty, -ve on error. */
+static inline int __vringh_get_head(const struct vringh *vrh,
+   int (*getu16)(u16 *val, const u16 *p),
+   u16 *last_avail_idx)
+{
+   u16 avail_idx, i, head;
+   int err;
+
+   err = getu16(&avail_idx, &vrh->vring.avail->idx);
+   if (err) {
+   vringh_bad("Failed to access avail idx at %p",
+  &vrh->vring.avail->idx);
+   return err;
+   }
+
+   err = getu16(last_avail_idx, &vring_avail_event(&vrh->vring));
+   if (err) {
+   vringh_bad("Failed to access last avail idx at %p",
+  &vring_avail_event(&vrh->vring));
+   return err;
+   }
+
+   if (*last_avail_idx == avail_idx)
+   return vrh->vring.num;
+
+   /* Only get avail ring entries after they have been exposed by guest. */
+   smp_rmb();
+
+   i = *last_avail_idx & (vrh->vring.num - 1);
+
+   err = getu16(&head, &vrh->vring.avail->ring[i]);
+   if (err) {
+   vringh_bad("Failed to read head: idx %d address %p",
+  *last_avail_idx, &vrh->vring.avail->ring[i]);
+   return err;
+   }
+
+   if (head >= vrh->vring.num) {
+   vringh_bad("Guest says index %u > %u is available",
+  head, vrh->vring.num);
+   return -EINVAL;
+   }
+
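
The accessor idea can be tried outside the kernel. Below is a cut-down
userspace sketch of the head lookup with the access method passed in as a
function pointer. The struct, RING_NUM and the two accessors are hypothetical
stand-ins rather than the vringh API, and the read barrier from the patch is
left out because nothing here runs concurrently.

```c
#include <errno.h>
#include <stdint.h>

#define RING_NUM 8u   /* ring size, must be a power of two */

/* Hypothetical, cut-down avail ring. */
struct avail_ring {
    uint16_t idx;
    uint16_t ring[RING_NUM];
};

/* Accessor: copy one u16 out of ring memory, return 0 or -errno. */
typedef int (*getu16_fn)(uint16_t *val, const uint16_t *p);

/* Direct access, as an in-kernel ring would use. */
static int getu16_direct(uint16_t *val, const uint16_t *p)
{
    *val = *p;
    return 0;
}

/* Accessor that always fails, standing in for a bad user pointer. */
static int getu16_fault(uint16_t *val, const uint16_t *p)
{
    (void)val;
    (void)p;
    return -EFAULT;
}

/* Returns RING_NUM if empty, the head index otherwise, -errno on error. */
static inline int get_head(const struct avail_ring *avail, getu16_fn getu16,
                           uint16_t last_avail_idx)
{
    uint16_t avail_idx, head;
    int err;

    err = getu16(&avail_idx, &avail->idx);
    if (err)
        return err;
    if (last_avail_idx == avail_idx)
        return RING_NUM;

    /* The free-running 16-bit index is wrapped into the ring by masking. */
    err = getu16(&head, &avail->ring[last_avail_idx & (RING_NUM - 1)]);
    if (err)
        return err;
    if (head >= RING_NUM)
        return -EINVAL;
    return head;
}
```

Because get_head() is static inline and each caller passes a constant
accessor, the compiler has the chance to inline the indirect call away, which
is the property the changelog relies on.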

Re: [PATCH 1/1] fs/xfs remove obsolete simple_strto

2013-01-10 Thread Jeff Liu

On 01/09/2013 10:04 PM, Abhijit Pawar wrote:
> This patch replaces usages of obsolete simple_strtoul with kstrtoint in 
> xfs_args and suffix_strtoul.
> 
> Signed-off-by: Abhijit Pawar 
> ---
>  fs/xfs/xfs_super.c |   29 +++--
>  1 files changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index ab8839b..c407121 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -139,9 +139,9 @@ static const match_table_t tokens = {
>  
>  
>  STATIC unsigned long
> -suffix_strtoul(char *s, char **endp, unsigned int base)
> +suffix_kstrtoint(char *s, unsigned int base, int *res)
>  {
> - int last, shift_left_factor = 0;
> + int last, shift_left_factor = 0, _res;
>   char*value = s;
>  
>   last = strlen(value) - 1;
> @@ -158,7 +158,10 @@ suffix_strtoul(char *s, char **endp, unsigned int base)
>   value[last] = '\0';
>   }
>  
> - return simple_strtoul((const char *)s, endp, base) << shift_left_factor;
> + if (kstrtoint(s, base, &_res))
> + return -EINVAL;
> + *res = _res << shift_left_factor;
> + return 0;
>  }
>  
>  /*
> @@ -174,7 +177,7 @@ xfs_parseargs(
>   char*options)
>  {
>   struct super_block  *sb = mp->m_super;
> - char*this_char, *value, *eov;
> + char*this_char, *value;
>   int dsunit = 0;
>   int dswidth = 0;
>   int iosize = 0;
> @@ -230,14 +233,16 @@ xfs_parseargs(
>   this_char);
>   return EINVAL;
>   }
> - mp->m_logbufs = simple_strtoul(value, &eov, 10);
> + if (kstrtoint(value, 10, &mp->m_logbufs))
> + return EINVAL;
>   } else if (!strcmp(this_char, MNTOPT_LOGBSIZE)) {
>   if (!value || !*value) {
>   xfs_warn(mp, "%s option requires an argument",
>   this_char);
>   return EINVAL;
>   }
> - mp->m_logbsize = suffix_strtoul(value, &eov, 10);
> + if (suffix_kstrtoint(value, 10, &mp->m_logbsize))
> + return EINVAL;
>   } else if (!strcmp(this_char, MNTOPT_LOGDEV)) {
>   if (!value || !*value) {
>   xfs_warn(mp, "%s option requires an argument",
> @@ -266,7 +271,8 @@ xfs_parseargs(
>   this_char);
>   return EINVAL;
>   }
> - iosize = simple_strtoul(value, &eov, 10);
> + if (kstrtoint(value, 10, &iosize))
> + return EINVAL;
>   iosizelog = ffs(iosize) - 1;
>   } else if (!strcmp(this_char, MNTOPT_ALLOCSIZE)) {
>   if (!value || !*value) {
> @@ -274,7 +280,8 @@ xfs_parseargs(
>   this_char);
>   return EINVAL;
>   }
> - iosize = suffix_strtoul(value, &eov, 10);
> + if (suffix_kstrtoint(value, 10, &iosize))
> + return EINVAL;
>   iosizelog = ffs(iosize) - 1;
>   } else if (!strcmp(this_char, MNTOPT_GRPID) ||
>  !strcmp(this_char, MNTOPT_BSDGROUPS)) {
> @@ -296,14 +303,16 @@ xfs_parseargs(
>   this_char);
>   return EINVAL;
>   }
> - dsunit = simple_strtoul(value, &eov, 10);
> + if (kstrtoint(value, 10, &dsunit))
> + return EINVAL;
>   } else if (!strcmp(this_char, MNTOPT_SWIDTH)) {
>   if (!value || !*value) {
>   xfs_warn(mp, "%s option requires an argument",
>   this_char);
>   return EINVAL;
>   }
> - dswidth = simple_strtoul(value, &eov, 10);
> + if (kstrtoint(value, 10, &dswidth))
> + return EINVAL;
>   } else if (!strcmp(this_char, MNTOPT_32BITINODE)) {
>   mp->m_flags |= XFS_MOUNT_SMALL_INUMS;
>   } else if (!strcmp(this_char, MNTOPT_64BITINODE)) {
> 
checkpatch.pl shows a warning if we return EINVAL, as below:
WARNING: return of an errno should typically be -ve (return -EINVAL)

Can we just ignore such a code style issue?

Thanks,
-Jeff
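
For reference, here is a userspace sketch of the "number plus k/m/g suffix"
parsing that the patch converts, written so that it returns 0 or a negative
errno as checkpatch expects. It uses strtol() as a stand-in for kstrtoint(),
so details such as whitespace handling differ from the kernel helper, and the
function name and buffer size are my own choices.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a non-negative decimal number with an optional k/m/g suffix
 * into *res. Returns 0 on success or a negative errno.
 */
static int suffix_parse_int(const char *s, int *res)
{
    char buf[32];
    char *end;
    long val;
    int shift = 0;
    size_t len = strlen(s);

    if (len == 0 || len >= sizeof(buf))
        return -EINVAL;
    memcpy(buf, s, len + 1);

    switch (buf[len - 1]) {
    case 'k': case 'K': shift = 10; break;
    case 'm': case 'M': shift = 20; break;
    case 'g': case 'G': shift = 30; break;
    }
    if (shift)
        buf[len - 1] = '\0';    /* strip the suffix before converting */

    errno = 0;
    val = strtol(buf, &end, 10);
    if (errno || end == buf || *end != '\0')
        return -EINVAL;
    /* Reject values that would not fit in an int after the shift. */
    if (val < 0 || val > (INT_MAX >> shift))
        return -ERANGE;

    *res = (int)(val << shift);
    return 0;
}
```

Unlike the old simple_strtoul() based code, trailing garbage and overflow are
reported to the caller instead of being silently accepted.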

Re: [PATCH v3 09/22] sched: compute runnable load avg in cpu_load and cpu_avg_load_per_task

2013-01-10 Thread Alex Shi

On 01/07/2013 02:31 AM, Linus Torvalds wrote:
> On Sat, Jan 5, 2013 at 11:54 PM, Alex Shi  wrote:
>>
>> I just looked into the aim9 benchmark. In this case it forks 2000 tasks;
>> after all tasks are ready, aim9 gives a signal, then all tasks wake up in
>> a burst and run until all are finished.
>> Since each task finishes very quickly, an imbalanced empty cpu may go to
>> sleep until a regular balancing gives it some new tasks. That causes the
>> performance drop, because of more idle entering.
> 
> Sounds like for AIM (and possibly for other really bursty loads), we
> might want to do some load-balancing at wakeup time by *just* looking
> at the number of running tasks, rather than at the load average. Hmm?
> 
> The load average is fundamentally always going to run behind a bit,
> and while you want to use it for long-term balancing, a short-term you
> might want to do just a "if we have a huge amount of runnable
> processes, do a load balancing *now*". Where "huge amount" should
> probably be relative to the long-term load balancing (ie comparing the
> number of runnable processes on this CPU right *now* with the load
> average over the last second or so would show a clear spike, and a
> reason for quick action).
> 

Sorry for the late response!

I have just written a patch following your suggestion, but there is no clear
improvement for this case.
I also tried changing the burst checking interval, which did not clearly help
either.

If I give up runnable load entirely in periodic balancing, the performance
recovers 60% of the loss.

I will try to optimize wake-up balancing over the weekend.

Have a nice weekend!
Alex

---
From 8f6f7317568a7bd8497e7a6e8d9afcc2b4e93a7e Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Wed, 9 Jan 2013 23:16:57 +0800
Subject: [PATCH] sched: use instant load weight in burst regular load balance

Runnable load tracking needs a long time to accumulate the runnable
load, so when the system wakes up many sleeping tasks in a burst, it
needs more time to balance them well. This patch tries to catch such a
scenario and uses instant load instead of runnable load to do the balance.

Signed-off-by: Alex Shi 
---
 include/linux/sched.h |  1 +
 kernel/sched/debug.c  |  1 +
 kernel/sched/fair.c   | 55 +++
 kernel/sysctl.c   |  7 +++
 4 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b0354a5..f6cf1b5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2032,6 +2032,7 @@ extern unsigned int sysctl_sched_latency;
 extern unsigned int sysctl_sched_min_granularity;
 extern unsigned int sysctl_sched_wakeup_granularity;
 extern unsigned int sysctl_sched_child_runs_first;
+extern unsigned int sysctl_sched_burst_check_ms;
 
 enum sched_tunable_scaling {
SCHED_TUNABLESCALING_NONE,
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index e4035f7..d06fc3c 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -380,6 +380,7 @@ static int sched_debug_show(struct seq_file *m, void *v)
PN(sysctl_sched_latency);
PN(sysctl_sched_min_granularity);
PN(sysctl_sched_wakeup_granularity);
+   PN(sysctl_sched_burst_check_ms);
P(sysctl_sched_child_runs_first);
P(sysctl_sched_features);
 #undef PN
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 604d0ee..875e7af 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4032,6 +4032,7 @@ struct lb_env {
unsigned intloop_max;
int power_lb;  /* if power balance needed */
int perf_lb;   /* if performance balance needed */
+   int has_burst;
 };
 
 /*
@@ -4729,6 +4730,37 @@ fix_small_capacity(struct sched_domain *sd, struct sched_group *group)
return 0;
 }
 
+DEFINE_PER_CPU(unsigned long, next_check);
+DEFINE_PER_CPU(unsigned int, last_running);
+
+/* do burst check no less than this interval */
+unsigned int sysctl_sched_burst_check_ms = 1000UL;
+
+/**
+ * check_burst - check if tasks bursts up on this cpu.
+ * @env: The load balancing environment.
+ */
+static void check_burst(struct lb_env *env)
+{
+   int cpu;
+   unsigned int curr_running, prev_running, interval;
+
+   cpu = env->dst_cpu;
+   curr_running = cpu_rq(cpu)->nr_running;
+   prev_running = per_cpu(last_running, cpu);
+   interval = sysctl_sched_burst_check_ms;
+
+   per_cpu(last_running, cpu) = curr_running;
+
+   if (time_after_eq(jiffies, per_cpu(next_check, cpu))) {
+   per_cpu(next_check, cpu) = jiffies + msecs_to_jiffies(interval);
+   /* find a spike since the last balance on this cpu */
+   if (curr_running  >  2 + (prev_running << 2))
+   env->has_burst = 1;
+   }
+   env->has_burst = 0;
+}
+
 /**
  * update_sg_lb_stats - Update sched_group's statistics for load balancing.
  * @env: The load balancing environment.
@@ -4770,9 +4802,15 @@ static inline void
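
The spike test itself can be exercised in userspace. This is a standalone
sketch and not the code from the patch: the per-cpu variables become a plain
struct, jiffies becomes a `now` argument, and the result is returned instead
of being stored in lb_env. Note that the flag is cleared before the
comparison here, so a detected spike is what the caller sees.

```c
#include <stdbool.h>

/* State kept between checks; field names are illustrative. */
struct burst_state {
    unsigned long next_check;   /* earliest tick of the next check */
    unsigned int last_running;  /* nr_running seen at the previous call */
};

/* Wrap-safe "a is at or after b" for free-running tick counters. */
static bool time_at_or_after(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

static bool check_burst(struct burst_state *st, unsigned long now,
                        unsigned int curr_running, unsigned long interval)
{
    unsigned int prev_running = st->last_running;
    bool has_burst = false;     /* cleared first, so a hit below survives */

    st->last_running = curr_running;

    if (time_at_or_after(now, st->next_check)) {
        st->next_check = now + interval;
        /* Spike: more than 2 + 4x the count seen last time. */
        if (curr_running > 2 + (prev_running << 2))
            has_burst = true;
    }
    return has_burst;
}
```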

[PATCH] powerpc: added DSCR support to ptrace

2013-01-10 Thread Alexey Kardashevskiy

The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allows some control over the prefetch
of data streams.

The kernel already supports a per-thread DSCR value, but there is also
a need for the ability to change it from an external process for
a specific pid.

The patch adds new register index PT_DSCR (index=44) which can be
set/get by:
  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

The patch does not increase PT_REGS_COUNT as the pt_regs struct has not
been changed.

Signed-off-by: Alexey Kardashevskiy 
---
 arch/powerpc/include/uapi/asm/ptrace.h |1 +
 arch/powerpc/kernel/ptrace.c   |   29 +
 2 files changed, 30 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
index ee67a2b..5a4863c 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -108,6 +108,7 @@ struct pt_regs {
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
+#define PT_DSCR 44
 #define PT_REGS_COUNT 44
 
 #define PT_FPR0 48  /* each FP reg occupies 2 slots in this space */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index d4afccc..245c1b6 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -179,6 +179,30 @@ static int set_user_msr(struct task_struct *task, unsigned long msr)
return 0;
 }
 
+#ifdef CONFIG_PPC64
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+   return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+   task->thread.dscr = dscr;
+   task->thread.dscr_inherit = 1;
+   return 0;
+}
+#else
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+   return -EIO;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+   return -EIO;
+}
+#endif
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -200,6 +224,9 @@ unsigned long ptrace_get_reg(struct task_struct *task, int regno)
if (regno == PT_MSR)
return get_user_msr(task);
 
+   if (regno == PT_DSCR)
+   return get_user_dscr(task);
+
if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
return ((unsigned long *)task->thread.regs)[regno];
 
@@ -218,6 +245,8 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
return set_user_msr(task, data);
if (regno == PT_TRAP)
return set_user_trap(task, data);
+   if (regno == PT_DSCR)
+   return set_user_dscr(task, data);
 
if (regno <= PT_MAX_PUT_REG) {
((unsigned long *)task->thread.regs)[regno] = data;
-- 
1.7.10.4
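
A note on the `PT_DSCR << 3` in the ptrace() calls from the description:
PEEKUSER/POKEUSER take a byte offset into the register area, so the index is
multiplied by the word size, which is 8 on 64-bit PowerPC. The tiny helper
below is only an illustration of that arithmetic; its name is made up.

```c
#include <stddef.h>

#define PT_DSCR 44  /* register index added by the patch */

/* Byte offset passed to PEEKUSER/POKEUSER for a register index. */
static size_t ptrace_reg_offset(unsigned int regno, size_t word_size)
{
    return (size_t)regno * word_size;
}
```

With an 8-byte word, ptrace_reg_offset(PT_DSCR, 8) is the same value as
PT_DSCR << 3, i.e. 352.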


RE: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus

2013-01-10 Thread Liu, Chuansheng



> -Original Message-
> From: ccr...@google.com [mailto:ccr...@google.com] On Behalf Of Colin
> Cross
> Sent: Friday, January 11, 2013 2:18 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Andrew Morton; Don Zickus; Ingo Molnar;
> Thomas Gleixner; linux-arm-ker...@lists.infradead.org
> Subject: Re: [PATCH] hardlockup: detect hard lockups without NMIs using
> secondary cpus
> 
> On Thu, Jan 10, 2013 at 9:57 PM, Liu, Chuansheng
>  wrote:
> >
> >
> >> -Original Message-
> >> From: ccr...@google.com [mailto:ccr...@google.com] On Behalf Of Colin
> >> Cross
> >> Sent: Friday, January 11, 2013 1:34 PM
> >> To: Liu, Chuansheng
> >> Cc: linux-kernel@vger.kernel.org; Andrew Morton; Don Zickus; Ingo Molnar;
> >> Thomas Gleixner; linux-arm-ker...@lists.infradead.org
> >> Subject: Re: [PATCH] hardlockup: detect hard lockups without NMIs using
> >> secondary cpus
> >>
> >> On Thu, Jan 10, 2013 at 5:39 PM, Liu, Chuansheng
> >>  wrote:
> >> >
> >> >
> >> >> -Original Message-
> >> >> From: Colin Cross [mailto:ccr...@android.com]
> >> >> Sent: Thursday, January 10, 2013 9:58 AM
> >> >> To: linux-kernel@vger.kernel.org
> >> >> Cc: Andrew Morton; Don Zickus; Ingo Molnar; Thomas Gleixner; Liu,
> >> >> Chuansheng; linux-arm-ker...@lists.infradead.org; Colin Cross
> >> >> Subject: [PATCH] hardlockup: detect hard lockups without NMIs using
> >> >> secondary cpus
> >> >>
> >> >> Emulate NMIs on systems where they are not available by using timer
> >> >> interrupts on other cpus.  Each cpu will use its softlockup hrtimer
> >> >> to check that the next cpu is processing hrtimer interrupts by
> >> >> verifying that a counter is increasing.
> >> >>
> >> >> This patch is useful on systems where the hardlockup detector is not
> >> >> available due to a lack of NMIs, for example most ARM SoCs.
> >> >> Without this patch any cpu stuck with interrupts disabled can
> >> >> cause a hardware watchdog reset with no debugging information,
> >> >> but with this patch the kernel can detect the lockup and panic,
> >> >> which can result in useful debugging info.
> >> >>
> >> >> Signed-off-by: Colin Cross 
> >> >> +static void watchdog_check_hardlockup_other_cpu(void)
> >> >> +{
> >> >> + int cpu;
> >> >> + cpumask_t cpus = watchdog_cpus;
> >> >> +
> >> >> + /*
> >> >> +  * Test for hardlockups every 3 samples.  The sample period is
> >> >> +  *  watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over
> >> >> +  *  watchdog_thresh (over by 20%).
> >> >> +  */
> >> >> + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0)
> >> >> + return;
> >> >> +
> > Another concern is about __this_cpu_read(hrtimer_interrupts) % 3 != 0.
> > It will make the actual timeout for hard lockup detection not very fixed,
> > or even very short.
> > Sometimes 3 samples are needed to detect the lockup, but sometimes only
> > 1 sample.
> > Is that the case?
> 
> I'm not sure what you mean.  The mod 3 will cause every 3rd timer (12
> seconds, assuming watchdog_thresh = 10) to check hrtimer_interrupts
> vs. hrtimer_interrupts_saved, and then update it.  The sampling should
> be fixed and very accurate.  It will cause a panic/warning between 12
> and 24 seconds after a cpu stops processing timer interrupts,
> depending on the alignment of the hrtimers between the two cpus.
> 
You are right, thanks.
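
The timing worked out above can be written down as a few lines of C. This is
just the arithmetic from the thread with made-up function names, not code
from the watchdog.

```c
/* Sample period of the softlockup hrtimer, in seconds. */
static unsigned int sample_period(unsigned int watchdog_thresh)
{
    return watchdog_thresh * 2 / 5;
}

/* The other-cpu check runs only on every third timer interrupt. */
static int is_check_tick(unsigned long hrtimer_interrupts)
{
    return hrtimer_interrupts % 3 == 0;
}

/* Shortest delay between a cpu stalling and the stall being reported. */
static unsigned int min_detect_latency(unsigned int watchdog_thresh)
{
    return 3 * sample_period(watchdog_thresh);
}

/* Longest delay: the stall begins just after a check has passed. */
static unsigned int max_detect_latency(unsigned int watchdog_thresh)
{
    return 2 * min_detect_latency(watchdog_thresh);
}
```

With watchdog_thresh = 10 this gives a 4 second sample period and a report
between 12 and 24 seconds after the cpu stops taking timer interrupts.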

[PATCH 1/4] serial/arc-uart: Don't index with -ve platform_device->id

2013-01-10 Thread Vineet Gupta

The probe routine could index into port[] with a -ve index. The check in
arc_uart_init_one() was too late.

This came to light when trying to port the driver to CONFIG_OF, where
by default the of-core code sets a -ve platform dev id and, in the absence
of DT serial aliases, the driver would use the -ve index.

Signed-off-by: Vineet Gupta 
Cc: Alan Cox 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: linux-ser...@vger.kernel.org
---
 drivers/tty/serial/arc_uart.c |   29 +++--
 1 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/tty/serial/arc_uart.c b/drivers/tty/serial/arc_uart.c
index 3e0b3fa..8089dc3 100644
--- a/drivers/tty/serial/arc_uart.c
+++ b/drivers/tty/serial/arc_uart.c
@@ -526,15 +526,11 @@ static struct uart_ops arc_serial_pops = {
 };
 
 static int
-arc_uart_init_one(struct platform_device *pdev, struct arc_uart_port *uart)
+arc_uart_init_one(struct platform_device *pdev, int dev_id)
 {
struct resource *res, *res2;
unsigned long *plat_data;
-
-   if (pdev->id < 0 || pdev->id >= CONFIG_SERIAL_ARC_NR_PORTS) {
-   dev_err(&pdev->dev, "Wrong uart platform device id.\n");
-   return -ENOENT;
-   }
+   struct arc_uart_port *uart = &arc_uart_ports[dev_id];
 
plat_data = ((unsigned long *)(pdev->dev.platform_data));
uart->baud = plat_data[0];
@@ -557,7 +553,7 @@ arc_uart_init_one(struct platform_device *pdev, struct arc_uart_port *uart)
uart->port.dev = &pdev->dev;
uart->port.iotype = UPIO_MEM;
uart->port.flags = UPF_BOOT_AUTOCONF;
-   uart->port.line = pdev->id;
+   uart->port.line = dev_id;
uart->port.ops = &arc_serial_pops;
 
uart->port.uartclk = plat_data[1];
@@ -657,9 +653,14 @@ static struct __initdata console arc_early_serial_console = {
 
 static int arc_serial_probe_earlyprintk(struct platform_device *pdev)
 {
-   arc_early_serial_console.index = pdev->id;
+   int dev_id = pdev->id < 0 ? 0 : pdev->id;
+   int rc;
 
-   arc_uart_init_one(pdev, &arc_uart_ports[pdev->id]);
+   arc_early_serial_console.index = dev_id;
+
+   rc = arc_uart_init_one(pdev, dev_id);
+   if (rc)
+   panic("early console init failed\n");
 
arc_serial_console_setup(&arc_early_serial_console, NULL);
 
@@ -675,18 +676,18 @@ static int arc_serial_probe_earlyprintk(struct platform_device *pdev)
 
 static int arc_serial_probe(struct platform_device *pdev)
 {
-   struct arc_uart_port *uart;
-   int rc;
+   int rc, dev_id;
 
if (is_early_platform_device(pdev))
return arc_serial_probe_earlyprintk(pdev);
 
-   uart = &arc_uart_ports[pdev->id];
-   rc = arc_uart_init_one(pdev, uart);
+   dev_id = pdev->id < 0 ? 0 : pdev->id;
+   rc = arc_uart_init_one(pdev, dev_id);
if (rc)
return rc;
 
-   return uart_add_one_port(&arc_uart_driver, &uart->port);
+   rc = uart_add_one_port(&arc_uart_driver, &arc_uart_ports[dev_id].port);
+   return rc;
 }
 
 static int arc_serial_remove(struct platform_device *pdev)
-- 
1.7.4.1
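
A standalone sketch of the index handling, for illustration only. The
negative-id clamp mirrors the patch; the upper-bound check and the NR_PORTS
value are assumptions added here so the example never indexes past the array,
and port_index() is a made-up name.

```c
#include <errno.h>

#define NR_PORTS 2  /* stand-in for CONFIG_SERIAL_ARC_NR_PORTS */

/* Map a platform device id to a usable port index, or -errno. */
static int port_index(int pdev_id)
{
    /* A negative id means "no id assigned": fall back to port 0. */
    int dev_id = pdev_id < 0 ? 0 : pdev_id;

    if (dev_id >= NR_PORTS)
        return -ENOENT;
    return dev_id;
}
```

The important part is that the mapping happens before anything is indexed,
rather than inside the init routine after the array element was already
taken.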


[PATCH 4/4] serial/arc-uart: switch to devicetree based probing

2013-01-10 Thread Vineet Gupta

* DT binding for arc-uart
* With all the bits in place we can now use DT probing.

Note that there's a bit of a kludge right now because the earlyprintk portion
of the driver can't use the DT infrastructure to get resources/plat_data.
This requires some infrastructure changes to the of_flat_ framework.

Signed-off-by: Vineet Gupta 
Cc: Grant Likely 
Cc: Arnd Bergmann 
Cc: linux-ser...@vger.kernel.org
Cc: Alan Cox 
Cc: Greg Kroah-Hartman 
Cc: devicetree-disc...@lists.ozlabs.org
Cc: Rob Herring 
Cc: Rob Landley 
Cc: linux-ser...@vger.kernel.org
---
 .../devicetree/bindings/tty/serial/arc-uart.txt|   26 
 drivers/tty/serial/arc_uart.c  |   43 ++-
 2 files changed, 66 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/tty/serial/arc-uart.txt

diff --git a/Documentation/devicetree/bindings/tty/serial/arc-uart.txt b/Documentation/devicetree/bindings/tty/serial/arc-uart.txt
new file mode 100644
index 0000000..c3bd8f9
--- /dev/null
+++ b/Documentation/devicetree/bindings/tty/serial/arc-uart.txt
@@ -0,0 +1,26 @@
+* Synopsys ARC UART : Non standard UART used in some of the ARC FPGA boards
+
+Required properties:
+- compatible   : "snps,arc-uart"
+- reg  : offset and length of the register set for the device.
+- interrupts   : device interrupt
+- clock-frequency  : the input clock frequency for the UART
+- baud : baud rate for UART
+
+e.g.
+
+arcuart0: serial@c0fc1000 {
+   compatible = "snps,arc-uart";
+   reg = <0xc0fc1000 0x100>;
+   interrupts = <5>;
+   clock-frequency = <8000>;
+   baud = <115200>;
+   status = "okay";
+};
+
+Note: Each port should have an alias correctly numbered in "aliases" node.
+
+e.g.
+aliases {
+   serial0 = &arcuart0;
+};
diff --git a/drivers/tty/serial/arc_uart.c b/drivers/tty/serial/arc_uart.c
index 2db6410..b468601 100644
--- a/drivers/tty/serial/arc_uart.c
+++ b/drivers/tty/serial/arc_uart.c
@@ -37,6 +37,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 /*
  * ARC UART Hardware Specs
@@ -537,8 +539,26 @@ arc_uart_init_one(struct platform_device *pdev, int dev_id)
return -ENODEV;
 
uart->is_emulated = !!plat_data[0]; /* workaround ISS bug */
-   uart->port.uartclk = plat_data[1];
-   uart->baud = plat_data[2];
+
+   if (is_early_platform_device(pdev)) {
+   uart->port.uartclk = plat_data[1];
+   uart->baud = plat_data[2];
+   } else {
+   struct device_node *np = pdev->dev.of_node;
+   u32 val;
+
+   if (of_property_read_u32(np, "clock-frequency", &val)) {
+   dev_err(&pdev->dev, "clock-frequency property NOTset\n");
+   return -EINVAL;
+   }
+   uart->port.uartclk = val;
+
+   if (of_property_read_u32(np, "baud", &val)) {
+   dev_err(&pdev->dev, "baud property NOT set\n");
+   return -EINVAL;
+   }
+   uart->baud = val;
+   }
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res)
@@ -673,8 +693,18 @@ static int __init arc_serial_probe_earlyprintk(struct platform_device *pdev)
 static int arc_serial_probe(struct platform_device *pdev)
 {
int rc, dev_id;
+   struct device_node *np = pdev->dev.of_node;
+
+   /* no device tree device */
+   if (!np)
+   return -ENODEV;
+
+   dev_id = of_alias_get_id(np, "serial");
+   if (dev_id < 0) {
+   dev_err(&pdev->dev, "failed to get alias id: %d\n", dev_id);
+   return dev_id;
+   }
 
-   dev_id = pdev->id < 0 ? 0 : pdev->id;
rc = arc_uart_init_one(pdev, dev_id);
if (rc)
return rc;
@@ -689,12 +719,19 @@ static int arc_serial_remove(struct platform_device *pdev)
return 0;
 }
 
+static const struct of_device_id arc_uart_dt_ids[] = {
+   { .compatible = "snps,arc-uart" },
+   { /* Sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, arc_uart_dt_ids);
+
 static struct platform_driver arc_platform_driver = {
.probe = arc_serial_probe,
.remove = arc_serial_remove,
.driver = {
.name = DRIVER_NAME,
.owner = THIS_MODULE,
+   .of_match_table  = arc_uart_dt_ids,
 },
 };
 
-- 
1.7.4.1
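
For anyone unfamiliar with the sentinel-terminated match table added above,
here is a simplified userspace sketch of how such a table is walked. Real OF
matching considers more than a single compatible string, so this is only an
illustration; struct match_id and match_compatible() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for a device id table entry. */
struct match_id {
    const char *compatible;
};

static const struct match_id arc_uart_ids[] = {
    { "snps,arc-uart" },
    { NULL }    /* sentinel: marks the end of the table */
};

/* Return the first entry whose string equals compat, or NULL. */
static const struct match_id *
match_compatible(const struct match_id *tbl, const char *compat)
{
    for (; tbl->compatible; tbl++)
        if (strcmp(tbl->compatible, compat) == 0)
            return tbl;
    return NULL;
}
```

The empty last entry is what lets the loop stop without a separate length
argument.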


[PATCH 3/4] serial/arc-uart: platform_data order changed

2013-01-10 Thread Vineet Gupta

* is_emulated is now 1st element, rather than last
* also tucked all platform data refs together

Signed-off-by: Vineet Gupta 
Cc: Alan Cox 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: linux-ser...@vger.kernel.org
---
 drivers/tty/serial/arc_uart.c |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/serial/arc_uart.c b/drivers/tty/serial/arc_uart.c
index 9de26ba..2db6410 100644
--- a/drivers/tty/serial/arc_uart.c
+++ b/drivers/tty/serial/arc_uart.c
@@ -533,7 +533,12 @@ arc_uart_init_one(struct platform_device *pdev, int dev_id)
struct arc_uart_port *uart = &arc_uart_ports[dev_id];
 
plat_data = ((unsigned long *)(pdev->dev.platform_data));
-   uart->baud = plat_data[0];
+   if (!plat_data)
+   return -ENODEV;
+
+   uart->is_emulated = !!plat_data[0]; /* workaround ISS bug */
+   uart->port.uartclk = plat_data[1];
+   uart->baud = plat_data[2];
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res)
@@ -556,7 +561,6 @@ arc_uart_init_one(struct platform_device *pdev, int dev_id)
uart->port.line = dev_id;
uart->port.ops = &arc_serial_pops;
 
-   uart->port.uartclk = plat_data[1];
uart->port.fifosize = ARC_UART_TX_FIFO_SIZE;
 
/*
@@ -565,9 +569,6 @@ arc_uart_init_one(struct platform_device *pdev, int dev_id)
 */
uart->port.ignore_status_mask = 0;
 
-   /* Real Hardware vs. emulated to work around a bug */
-   uart->is_emulated = !!plat_data[2];
-
return 0;
 }
 
-- 
1.7.4.1
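
The new platform_data layout, shown as a standalone sketch. The array order
(is_emulated, uartclk, baud) and the NULL check follow the patch; struct
uart_cfg and parse_plat_data() are hypothetical names used only for this
example.

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative holder for the three values carried in platform_data. */
struct uart_cfg {
    int is_emulated;
    unsigned long uartclk;
    unsigned long baud;
};

/* Unpack plat_data[] in the new order; returns 0 or -ENODEV. */
static int parse_plat_data(const unsigned long *plat_data,
                           struct uart_cfg *cfg)
{
    if (!plat_data)
        return -ENODEV;

    cfg->is_emulated = !!plat_data[0];  /* any non-zero value means emulated */
    cfg->uartclk = plat_data[1];
    cfg->baud = plat_data[2];
    return 0;
}
```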


[PATCH 0/4] switch arc-uart to devicetree based probing

2013-01-10 Thread Vineet Gupta

Hi,

As part of converting ARC Port to devicetree infrastructure, the following
series converts the arc-uart driver to DT.

* The first patch is a bug-fix which showed up in the process as DT based
  platform devices by default have -ve id
* Next two prepare the driver for forthcoming DT changes.
* Last one contains the DT bindings and driver using those.
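
To illustrate the first point: a DT-created platform device has a negative
id by default, so the id cannot be used directly as an array index. A minimal
sketch of the clamp the driver applies follows; the table, its size and the
helper names are hypothetical.

```c
#include <assert.h>

#define SKETCH_NR_PORTS 2	/* hypothetical table size */

static int sketch_ports[SKETCH_NR_PORTS];

/* Map a platform_device id to a table index; a negative id becomes 0. */
static int arc_uart_dev_id(int pdev_id)
{
	return pdev_id < 0 ? 0 : pdev_id;
}

/* Look up the slot for a device without ever indexing with a negative id. */
static int *arc_uart_port_slot(int pdev_id)
{
	int dev_id = arc_uart_dev_id(pdev_id);

	assert(dev_id < SKETCH_NR_PORTS);
	return &sketch_ports[dev_id];
}
```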

Couple of points worth mentioning:
* The earlyprintk portion of the driver still relies on static platform data;
  we would need some earlyprintk handling in of_fdt_* to clean it up properly
* Two of the three platform data instances are now retrieved from DT.
  However one still needs to be dynamically passed by platform (using
  of_dev_auxdata) as we want to run same image in simulator and hardware

Tested on in-works ARC 3.8 port.

P.S. Greg, can this be treated as a bug-fix for 3.8?

Thx,
-Vineet

Vineet Gupta (4):
  serial/arc-uart: Don't index with -ve platform_device->id
  serial/arc-uart: split probe from probe_earlyprintk
  serial/arc-uart: platform_data order changed
  serial/arc-uart: switch to devicetree based probing

 .../devicetree/bindings/tty/serial/arc-uart.txt|   26 ++
 drivers/tty/serial/arc_uart.c  |   95 ++--
 2 files changed, 94 insertions(+), 27 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/tty/serial/arc-uart.txt

-- 
1.7.4.1


[PATCH 2/4] serial/arc-uart: split probe from probe_earlyprintk

2013-01-10 Thread Vineet Gupta

This is in preparation for devicetree based probing, where earlyprintk
won't have access to DT serial aliases which the normal probe would
absolutely rely on.

Signed-off-by: Vineet Gupta 
Cc: Alan Cox 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: linux-ser...@vger.kernel.org
---
 drivers/tty/serial/arc_uart.c |   21 +++--
 1 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/serial/arc_uart.c b/drivers/tty/serial/arc_uart.c
index 8089dc3..9de26ba 100644
--- a/drivers/tty/serial/arc_uart.c
+++ b/drivers/tty/serial/arc_uart.c
@@ -651,7 +651,7 @@ static struct __initdata console arc_early_serial_console = {
.index = -1
 };
 
-static int arc_serial_probe_earlyprintk(struct platform_device *pdev)
+static int __init arc_serial_probe_earlyprintk(struct platform_device *pdev)
 {
int dev_id = pdev->id < 0 ? 0 : pdev->id;
int rc;
@@ -667,20 +667,12 @@ static int arc_serial_probe_earlyprintk(struct platform_device *pdev)
register_console(&arc_early_serial_console);
return 0;
 }
-#else
-static int arc_serial_probe_earlyprintk(struct platform_device *pdev)
-{
-   return -ENODEV;
-}
 #endif /* CONFIG_SERIAL_ARC_CONSOLE */
 
 static int arc_serial_probe(struct platform_device *pdev)
 {
int rc, dev_id;
 
-   if (is_early_platform_device(pdev))
-   return arc_serial_probe_earlyprintk(pdev);
-
dev_id = pdev->id < 0 ? 0 : pdev->id;
rc = arc_uart_init_one(pdev, dev_id);
if (rc)
@@ -706,6 +698,15 @@ static struct platform_driver arc_platform_driver = {
 };
 
 #ifdef CONFIG_SERIAL_ARC_CONSOLE
+
+static struct platform_driver early_arc_platform_driver = {
+   .probe = arc_serial_probe_earlyprintk,
+   .remove = arc_serial_remove,
+   .driver = {
+   .name = DRIVER_NAME,
+   .owner = THIS_MODULE,
+},
+};
 /*
  * Register an early platform driver of "earlyprintk" class.
  * ARCH platform code installs the driver and probes the early devices
@@ -713,7 +714,7 @@ static struct platform_driver arc_platform_driver = {
  * or it could be done independently, for all "earlyprintk" class drivers.
  * [see arch/arc/plat-arcfpga/platform.c]
  */
-early_platform_init("earlyprintk", &arc_platform_driver);
+early_platform_init("earlyprintk", &early_arc_platform_driver);
 
 #endif  /* CONFIG_SERIAL_ARC_CONSOLE */
 
-- 
1.7.4.1


linux-next: Tree for Jan 11

2013-01-10 Thread Stephen Rothwell

Hi all,

Changes since 20130110:

Dropped tree: samsung (many conflicts)

The ia64 tree gained a build failure so I used the version from
next-20130110.

The powerpc tree gained a build failure for which I applied a fix patch
and another which I just left for today.

The v4l-dvb tree gained a build failure for which I applied a merge fix
patch.

The scsi tree gained a build failure for which I applied a merge fix
patch.

The net-next tree gained a conflict against the net tree and a build
failure for which I applied a merge fix patch.

The arm-soc tree gained a conflict against the xilinx tree.

The samsung tree gained so many conflicts against the arm-soc tree that I
dropped it for today.

The signal tree gained conflicts against the arm64 tree.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 214 trees (counting Linus' and 28 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (254adaa seq_file: fix new kernel-doc warnings)
Merging fixes/master (d287b87 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs)
Merging kbuild-current/rc-fixes (bad9955 menuconfig: Replace CIRCLEQ by list_head-style lists.)
Merging arm-current/fixes (d106de3 ARM: 7614/1: mm: fix wrong branch from Cortex-A9 to PJ4b)
Merging m68k-current/for-linus (e7e29b4 m68k: Wire up finit_module)
Merging powerpc-merge/merge (e6449c9 powerpc: Add missing NULL terminator to avoid boot panic on PPC40x)
Merging sparc/master (4e4d78f sparc: Hook up finit_module syscall.)
Merging net/master (cb59c87 net: ethernet: xilinx: Do not use NO_IRQ in axienet)
Merging sound-current/for-linus (c18ab0b Merge tag 'asoc-fix-3.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (56d0da4 PCI/AER: pci_get_domain_bus_and_slot() call missing required pci_dev_put())
Merging wireless/master (5e20a4b b43: Fix firmware loading when driver is built into the kernel)
Merging driver-core.current/driver-core-linus (54b956b Remove __dev* markings from init.h)
Merging tty.current/tty-linus (d1c3ed6 Linux 3.8-rc2)
Merging usb.current/usb-linus (75e1a2a USB: ehci: make debug port in-use detection functional again)
Merging staging.current/staging-linus (e16a922 staging: tidspbridge: use prepare/unprepare on dsp clocks)
Merging char-misc.current/char-misc-linus (e6028db mei: fix mismatch in mutex unlock-lock in mei_amthif_read())
Merging input-current/for-linus (bec7a4b Input: lm8323 - fix checking PWM interrupt status)
Merging md-current/for-linus (a9add5d md/raid5: add blktrace calls)
Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix)
Merging crypto-current/master (a2c0911 crypto: caam - Updated SEC-4.0 device tree binding for ERA information.)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (084a0ec x86: add CONFIG_X86_MOVBE option)
CONFLICT (content): Merge conflict in arch/x86/Kconfig
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions)
Merging irqdomain-current/irqdomain/merge (a0d271c Linux 3.6)
Merging devicetree-curr

Re: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus

2013-01-10 Thread Colin Cross

On Thu, Jan 10, 2013 at 9:57 PM, Liu, Chuansheng
 wrote:
>
>
>> -Original Message-
>> From: ccr...@google.com [mailto:ccr...@google.com] On Behalf Of Colin
>> Cross
>> Sent: Friday, January 11, 2013 1:34 PM
>> To: Liu, Chuansheng
>> Cc: linux-kernel@vger.kernel.org; Andrew Morton; Don Zickus; Ingo Molnar;
>> Thomas Gleixner; linux-arm-ker...@lists.infradead.org
>> Subject: Re: [PATCH] hardlockup: detect hard lockups without NMIs using
>> secondary cpus
>>
>> On Thu, Jan 10, 2013 at 5:39 PM, Liu, Chuansheng
>>  wrote:
>> >
>> >
>> >> -Original Message-
>> >> From: Colin Cross [mailto:ccr...@android.com]
>> >> Sent: Thursday, January 10, 2013 9:58 AM
>> >> To: linux-kernel@vger.kernel.org
>> >> Cc: Andrew Morton; Don Zickus; Ingo Molnar; Thomas Gleixner; Liu,
>> >> Chuansheng; linux-arm-ker...@lists.infradead.org; Colin Cross
>> >> Subject: [PATCH] hardlockup: detect hard lockups without NMIs using
>> >> secondary cpus
>> >>
>> >> Emulate NMIs on systems where they are not available by using timer
>> >> interrupts on other cpus.  Each cpu will use its softlockup hrtimer
>> >> to check that the next cpu is processing hrtimer interrupts by
>> >> verifying that a counter is increasing.
>> >>
>> >> This patch is useful on systems where the hardlockup detector is not
>> >> available due to a lack of NMIs, for example most ARM SoCs.
>> >> Without this patch any cpu stuck with interrupts disabled can
>> >> cause a hardware watchdog reset with no debugging information,
>> >> but with this patch the kernel can detect the lockup and panic,
>> >> which can result in useful debugging info.
>> >>
>> >> Signed-off-by: Colin Cross 
>> >> +static void watchdog_check_hardlockup_other_cpu(void)
>> >> +{
>> >> + int cpu;
>> >> + cpumask_t cpus = watchdog_cpus;
>> >> +
>> >> + /*
>> >> +  * Test for hardlockups every 3 samples.  The sample period is
>> >> +  *  watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over
>> >> +  *  watchdog_thresh (over by 20%).
>> >> +  */
>> >> + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0)
>> >> + return;
>> >> +
> Another concern is the __this_cpu_read(hrtimer_interrupts) % 3 != 0 check.
> It means the actual timeout for hard lockup detection is not fixed, and
> may even be very short.
> Sometimes 3 samples are needed to detect the lockup, but sometimes only 1.
> Is that the case?

I'm not sure what you mean.  The mod 3 will cause every 3rd timer (12
seconds, assuming watchdog_thresh = 10) to check hrtimer_interrupts
vs. hrtimer_interrupts_saved, and then update it.  The sampling should
be fixed and very accurate.  It will cause a panic/warning between 12
and 24 seconds after a cpu stops processing timer interrupts,
depending on the alignment of the hrtimers between the two cpus.

> And in NMI case, the NMI interrupt is coming at least every watchdog_thresh.

NMI interrupt will happen every 10 seconds instead of 12, meaning the
panic/warning will occur between 10 and 20 seconds after a cpu stops
processing timer interrupts, depending on the alignment of the NMI
with the hrtimer, but otherwise my patch should be very similar.
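
To make the arithmetic concrete, here is a small user-space sketch. The
helper names are made up; only the formulas (a sample period of
watchdog_thresh * 2 / 5, a check on every 3rd sample, and a detection window
of one to two check intervals) come from the patch comment and the
explanation above.

```c
#include <assert.h>

/* hrtimer sample period in seconds: watchdog_thresh * 2 / 5 */
static unsigned int sample_period(unsigned int watchdog_thresh)
{
	assert(watchdog_thresh > 0);
	return watchdog_thresh * 2 / 5;
}

/* The other-cpu check only runs on every 3rd sample. */
static unsigned int check_interval(unsigned int watchdog_thresh)
{
	return 3 * sample_period(watchdog_thresh);
}

/*
 * Earliest report: the cpu stops just before a check saves its counter,
 * so the very next check, one interval later, sees no progress.
 */
static unsigned int min_detect_delay(unsigned int interval)
{
	return interval;
}

/*
 * Latest report: the cpu made progress just after a check, so the next
 * check still sees a changed counter and only the one after that fires.
 */
static unsigned int max_detect_delay(unsigned int interval)
{
	return 2 * interval;
}
```

With watchdog_thresh = 10 this gives a 4 second sample period, a 12 second
check interval and a 12 to 24 second window. Passing the 10 second NMI period
to the two delay helpers gives the 10 to 20 second window for the NMI case.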

RE: [PATCH v4 01/14] ARM: davinci: move private EDMA API to arm/common

2013-01-10 Thread Hebbar, Gururaja

On Fri, Jan 11, 2013 at 11:18:37, Porter, Matt wrote:
> Move mach-davinci/dma.c to common/edma.c so it can be used
> by OMAP (specifically AM33xx) as well. This just moves the
> private EDMA API and enables it to build on OMAP.
> 
> Signed-off-by: Matt Porter 
> ---
>  arch/arm/Kconfig   |1 +
>  arch/arm/common/Kconfig|3 +
>  arch/arm/common/Makefile   |1 +
>  arch/arm/{mach-davinci/dma.c => common/edma.c} |2 +-
>  arch/arm/mach-davinci/Makefile |2 +-
>  arch/arm/mach-davinci/board-tnetv107x-evm.c|2 +-
>  arch/arm/mach-davinci/davinci.h|2 +-
>  arch/arm/mach-davinci/devices-tnetv107x.c  |2 +-
>  arch/arm/mach-davinci/devices.c|7 +-
>  arch/arm/mach-davinci/dm355.c  |2 +-
>  arch/arm/mach-davinci/dm365.c  |2 +-
>  arch/arm/mach-davinci/dm644x.c |2 +-
>  arch/arm/mach-davinci/dm646x.c |2 +-
>  arch/arm/mach-davinci/include/mach/da8xx.h |2 +-
>  arch/arm/mach-davinci/include/mach/edma.h  |  267 
> 
>  arch/arm/plat-omap/Kconfig |1 +
>  drivers/dma/edma.c |2 +-
>  drivers/mmc/host/davinci_mmc.c |1 +
>  include/linux/mfd/davinci_voicecodec.h |3 +-
>  include/linux/platform_data/edma.h |  182 

The header file is just moved here, so "git mv file1 file2" followed by
"git format-patch -C" on the commit should generate only a few lines of patch.

>  include/linux/platform_data/spi-davinci.h  |2 +-
>  sound/soc/davinci/davinci-evm.c|1 +
>  sound/soc/davinci/davinci-pcm.c|1 +
>  sound/soc/davinci/davinci-pcm.h|2 +-
>  sound/soc/davinci/davinci-sffsdr.c |6 +-
>  25 files changed, 212 insertions(+), 288 deletions(-)
>  rename arch/arm/{mach-davinci/dma.c => common/edma.c} (99%)
>  delete mode 100644 arch/arm/mach-davinci/include/mach/edma.h
>  create mode 100644 include/linux/platform_data/edma.h
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 67874b8..7637d31 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -932,6 +932,7 @@ config ARCH_DAVINCI
>   select GENERIC_IRQ_CHIP
>   select HAVE_IDE
>   select NEED_MACH_GPIO_H
> + select TI_PRIV_EDMA
>   select USE_OF
>   select ZONE_DMA
>   help
> diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
> index 45ceeb0..9e32d0d 100644
> --- a/arch/arm/common/Kconfig
> +++ b/arch/arm/common/Kconfig
> @@ -40,3 +40,6 @@ config SHARP_PARAM
>  
>  config SHARP_SCOOP
>   bool
> +
> +config TI_PRIV_EDMA
> + bool
> diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
> index e8a4e58..d09a39b 100644
> --- a/arch/arm/common/Makefile
> +++ b/arch/arm/common/Makefile
> @@ -13,3 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)   += sharpsl_param.o
>  obj-$(CONFIG_SHARP_SCOOP)+= scoop.o
>  obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
>  obj-$(CONFIG_ARM_TIMER_SP804)+= timer-sp.o
> +obj-$(CONFIG_TI_PRIV_EDMA)   += edma.o
> diff --git a/arch/arm/mach-davinci/dma.c b/arch/arm/common/edma.c
> similarity index 99%
> rename from arch/arm/mach-davinci/dma.c
> rename to arch/arm/common/edma.c
> index a685e97..4411087 100644
> --- a/arch/arm/mach-davinci/dma.c
> +++ b/arch/arm/common/edma.c
> @@ -25,7 +25,7 @@
>  #include 
>  #include 
>  
> -#include 
> +#include 
>  
>  /* Offsets matching "struct edmacc_param" */
>  #define PARM_OPT 0x00
> diff --git a/arch/arm/mach-davinci/Makefile b/arch/arm/mach-davinci/Makefile
> index fb5c1aa..493a36b 100644
> --- a/arch/arm/mach-davinci/Makefile
> +++ b/arch/arm/mach-davinci/Makefile
> @@ -5,7 +5,7 @@
>  
>  # Common objects
>  obj-y:= time.o clock.o serial.o psc.o \
> -dma.o usb.o common.o sram.o aemif.o
> +usb.o common.o sram.o aemif.o
>  
>  obj-$(CONFIG_DAVINCI_MUX)+= mux.o
>  
> diff --git a/arch/arm/mach-davinci/board-tnetv107x-evm.c 
> b/arch/arm/mach-davinci/board-tnetv107x-evm.c
> index be30997..86f55ba 100644
> --- a/arch/arm/mach-davinci/board-tnetv107x-evm.c
> +++ b/arch/arm/mach-davinci/board-tnetv107x-evm.c
> @@ -26,12 +26,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
>  
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/arch/arm/mach-davinci/davinci.h b/arch/arm/mach-davinci/davinci.h
> index 12d544b..d26a6bc 100644
> --- a/arch/arm/mach-davinci/davinci.h
> +++ b/arch/arm/mach-davinci/davinci.h
> @@ -23,9 +23,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> -#include 
>  
>  #include 
>  #include 
> diff --git a/arch/arm/mach-davinci/devices-tnetv107x.c 
> b/arch/arm/mach-davinci/devices-tnetv107x.c
>

Re: [PATCH 1/2] Add mempressure cgroup

2013-01-10 Thread Anton Vorontsov

On Fri, Jan 11, 2013 at 02:56:15PM +0900, Minchan Kim wrote:
[...]
> > Ahh. You're talking about the shrinker interface. Yes, there is no way to
> > tell if the freed memory will be actually "released" (and if not, then
> > yes, we released it unnecessarily).
> 
> I'm not talking about whether it is actually "released" or not.
> I assume the application actually releases pages, but those pages may be in
> other zones, NOT the zone targeted by the kernel. In that case, the kernel
> could keep asking until the target zone has enough free memory.
[...]
> > isolate task to only some nodes/zones, if we really care about precise
> > accounting?). But I'm surely open for ideas. :)
> 
> My dumb idea is to notify the user only when reclaim is triggered by
> __GFP_HIGHMEM|__GFP_MOVABLE, which is the gfp_t used for most application
> memory. :)

Ah, I see. Sure, that will help a lot. I'll try to incorporate this into
the next iteration. But there are still unresolved accounting issues that
I outlined, and I don't think that they are this easy to solve. :)
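
A rough sketch of such a filter, as plain user-space C. Nothing here is taken
from the mempressure patch: the type, the flag values and the function names
are placeholders, and the real check would live wherever reclaim decides to
signal userland.

```c
#include <assert.h>

typedef unsigned int sketch_gfp_t;

/* Placeholder bit values, for illustration only. */
#define SKETCH_GFP_HIGHMEM	0x02u
#define SKETCH_GFP_MOVABLE	0x08u
#define SKETCH_GFP_APP_MASK	(SKETCH_GFP_HIGHMEM | SKETCH_GFP_MOVABLE)

/* Notify only if the allocation that triggered reclaim had both bits set. */
static int should_notify(sketch_gfp_t reclaim_gfp)
{
	return (reclaim_gfp & SKETCH_GFP_APP_MASK) == SKETCH_GFP_APP_MASK;
}

/* Count how many of n reclaim events would reach userland. */
static int count_notifications(const sketch_gfp_t *events, int n)
{
	int i, hits = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		hits += should_notify(events[i]);
	return hits;
}
```

Under this filter, reclaim triggered by an allocation that can only be
satisfied from a low zone would not ask applications to free memory.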

Thanks!

Anton

[PATCH] MAINTAINER: sync Omar Ramirez Luna's mail to latest.

2013-01-10 Thread Chen Gang


  The original mail address is invalid; use the new one.

Signed-off-by: Chen Gang 
---
 MAINTAINERS |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ae9f8b8..dcede8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7553,7 +7553,7 @@ S:Odd Fixes
 F: drivers/staging/speakup/
 
 STAGING - TI DSP BRIDGE DRIVERS
-M: Omar Ramirez Luna 
+M: Omar Ramirez Luna 
 S: Odd Fixes
 F: drivers/staging/tidspbridge/
 
-- 
1.7.10.4

-- 
Chen Gang

Asianux Corporation

Re: linux-next: build failure after merge of the final tree (powerpc tree related)

2013-01-10 Thread Michael Neuling

Stephen Rothwell  wrote:

> Hi all,
> 
> After merging the final tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
> 
> arch/powerpc/kernel/kgdb.c: In function 'kgdb_arch_exit':
> arch/powerpc/kernel/kgdb.c:492:2: error: '__debugger_breakx_match' undeclared 
> (first use in this function)
> 
> Caused by commit 9422de3e953d ("powerpc: Hardware breakpoints rewrite to
> handle non DABR breakpoint registers").  I applied "powerpc: Fix typo in
> breakpoint kgdb code" from Mikey for this.
> 
> arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
> arch/powerpc/kernel/exceptions-64s.S:1204: Error: attempt to move .org 
> backwards
> 
> Not sure what caused that - probably a combination of patches adding code
> low down.  I have just left this broken for today.

FWIW I posted this earlier

http://patchwork.ozlabs.org/patch/211184/

Mikey

Re: kernel BUG at kernel/sched_rt.c:493!

2013-01-10 Thread Mike Galbraith

On Fri, 2013-01-11 at 06:22 +0100, Mike Galbraith wrote: 
> On Thu, 2013-01-10 at 13:58 -0600, Shawn Bohrer wrote:
> 
> > Here is the output:
> > 
> > [   81.278842] SysRq : Changing Loglevel
> > [   81.279027] Loglevel set to 9
> > [   83.285456] Initial want: 5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 9
> > [   85.286452] Initial want: 5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 9
> > [   85.289625] Initial want: 5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 9
> > [   87.287435] Initial want: 1 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 85000
> > [   87.290718] Initial want: 5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 9
> > [   89.288469] Initial want: -5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 10
> > [   89.291550] Initial want: 15000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 8
> > [   89.292940] Initial want: 1 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 85000
> > [   89.294082] Initial want: 1 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 85000
> > [   89.295194] Initial want: 5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 9
> > [   89.296274] Initial want: 5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 9
> > [   90.959004] [sched_delayed] sched: RT throttling activated
> > [   91.289470] Initial want: 2 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 75000
> > [   91.292767] Initial want: 2 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 75000
> > [   91.294037] Initial want: 2 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 75000
> > [   91.295364] Initial want: 2 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 75000
> > [   91.296355] BUG triggered, want: 2
> > [   91.296355] 
> > [   91.296355] rt_rq[7]:
> > [   91.296355]   .rt_nr_running : 0
> > [   91.296355]   .rt_throttled  : 0
> > [   91.296355]   .rt_time   : 0.00
> > [   91.296355]   .rt_runtime: 750.00
> > [   91.307332] Initial want: -5000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 10
> > [   91.308440] Initial want: -1 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 105000
> > [   91.309586] Initial want: -15000 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 11
> > [   91.310716] Initial want: -2 rt_b->rt_runtime: 95000 
> > rt_rq->rt_runtime: 115000
> > [   91.311707] BUG triggered, want: -2
> > [   91.311707] 
> > [   91.311707] rt_rq[6]:
> > [   91.311707]   .rt_nr_running : 1
> > [   91.311707]   .rt_throttled  : 0
> > [   91.311707]   .rt_time   : 307.209987
> > [   91.311707]   .rt_runtime: 1150.00
> 
> That makes about as much sense as my crash did.  There is no leak, but
> cpu found nada.  So rd/span is changing on us?

So I looked at the locking (yet again), and (for the umpteenth time) see
no way in the world that can happen.  Hrmph.

-Mike


RE: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus

2013-01-10 Thread Liu, Chuansheng



> -Original Message-
> From: ccr...@google.com [mailto:ccr...@google.com] On Behalf Of Colin
> Cross
> Sent: Friday, January 11, 2013 1:34 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Andrew Morton; Don Zickus; Ingo Molnar;
> Thomas Gleixner; linux-arm-ker...@lists.infradead.org
> Subject: Re: [PATCH] hardlockup: detect hard lockups without NMIs using
> secondary cpus
> 
> On Thu, Jan 10, 2013 at 5:39 PM, Liu, Chuansheng
>  wrote:
> >
> >
> >> -Original Message-
> >> From: Colin Cross [mailto:ccr...@android.com]
> >> Sent: Thursday, January 10, 2013 9:58 AM
> >> To: linux-kernel@vger.kernel.org
> >> Cc: Andrew Morton; Don Zickus; Ingo Molnar; Thomas Gleixner; Liu,
> >> Chuansheng; linux-arm-ker...@lists.infradead.org; Colin Cross
> >> Subject: [PATCH] hardlockup: detect hard lockups without NMIs using
> >> secondary cpus
> >>
> >> Emulate NMIs on systems where they are not available by using timer
> >> interrupts on other cpus.  Each cpu will use its softlockup hrtimer
> >> to check that the next cpu is processing hrtimer interrupts by
> >> verifying that a counter is increasing.
> >>
> >> This patch is useful on systems where the hardlockup detector is not
> >> available due to a lack of NMIs, for example most ARM SoCs.
> >> Without this patch any cpu stuck with interrupts disabled can
> >> cause a hardware watchdog reset with no debugging information,
> >> but with this patch the kernel can detect the lockup and panic,
> >> which can result in useful debugging info.
> >>
> >> Signed-off-by: Colin Cross 
> >> +static void watchdog_check_hardlockup_other_cpu(void)
> >> +{
> >> + int cpu;
> >> + cpumask_t cpus = watchdog_cpus;
> >> +
> >> + /*
> >> +  * Test for hardlockups every 3 samples.  The sample period is
> >> +  *  watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over
> >> +  *  watchdog_thresh (over by 20%).
> >> +  */
> >> + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0)
> >> + return;
> >> +
Another concern is the __this_cpu_read(hrtimer_interrupts) % 3 != 0 check.
It means the actual timeout for hard lockup detection is not fixed, and
may even be very short.
Sometimes 3 samples are needed to detect the lockup, but sometimes only 1.
Is that the case?

In the NMI case, the NMI arrives at least once every watchdog_thresh.


[PATCH V4 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

2013-01-10 Thread Wanlong Gao

As Michael mentioned, set affinity and select queue will not work very
well when CPU IDs are not consecutive, which can happen with hot unplug.
Fix this bug by traversing the online CPUs and creating a per-cpu variable
that maps each CPU to its preferred virtqueue.
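
As a rough illustration of the mapping (a user-space sketch, not the driver
code): the online mask is a plain array here instead of
for_each_online_cpu(), vq_index is a plain array instead of a per-cpu
variable, and marking offline CPUs with -1 is an assumption of this sketch.
The helper names are made up.

```c
#include <assert.h>

/* Give each online CPU the next queue index; offline CPUs get -1. */
static void map_cpus_to_queues(const int *online, int nr_cpus, int *vq_index)
{
	int cpu, i = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (online[cpu])
			vq_index[cpu] = i++;
		else
			vq_index[cpu] = -1;
	}
}

/* Pick a tx queue for a CPU: fall back to 0, then wrap into range. */
static int select_txq(const int *vq_index, int cpu, int real_num_tx_queues)
{
	int txq = vq_index[cpu];

	assert(real_num_tx_queues > 0);
	if (txq == -1)
		txq = 0;
	while (txq >= real_num_tx_queues)
		txq -= real_num_tx_queues;
	return txq;
}
```

With CPUs 0, 2 and 3 online, the queue indices come out as 0, 1 and 2 rather
than following the CPU numbers, which is the point of the fix.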

Cc: Rusty Russell 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Eric Dumazet 
Cc: virtualizat...@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Signed-off-by: Wanlong Gao 
---
V3->V4:
move vq_index into virtnet_info (Jason)
change the mapping value when not setting affinity (Jason)
address the comments about select_queue (Rusty)
Not addressed yet:
the race between select_queue and set_channels needs more discussion,
but it is a separate problem from this one (Rusty)

 drivers/net/virtio_net.c | 53 ++--
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a6fcf15..ca17a58 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -123,6 +123,9 @@ struct virtnet_info {
 
/* Does the affinity hint is set for virtqueues? */
bool affinity_hint_set;
+
+   /* Per-cpu variable to show the mapping from CPU to virtqueue */
+   int __percpu *vq_index;
 };
 
 struct skb_vnet_hdr {
@@ -1016,6 +1019,7 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
 static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
 {
int i;
+   int cpu;
 
/* In multiqueue mode, when the number of cpu is equal to the number of
 * queue pairs, we let the queue pairs to be private to one cpu by
@@ -1029,16 +1033,29 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
return;
}
 
-   for (i = 0; i < vi->max_queue_pairs; i++) {
-   int cpu = set ? i : -1;
-   virtqueue_set_affinity(vi->rq[i].vq, cpu);
-   virtqueue_set_affinity(vi->sq[i].vq, cpu);
-   }
+   if (set) {
+   i = 0;
+   for_each_online_cpu(cpu) {
+   virtqueue_set_affinity(vi->rq[i].vq, cpu);
+   virtqueue_set_affinity(vi->sq[i].vq, cpu);
+   *per_cpu_ptr(vi->vq_index, cpu) = i;
+   i++;
+   }
 
-   if (set)
vi->affinity_hint_set = true;
-   else
+   } else {
+   for(i = 0; i < vi->max_queue_pairs; i++) {
+   virtqueue_set_affinity(vi->rq[i].vq, -1);
+   virtqueue_set_affinity(vi->sq[i].vq, -1);
+   }
+
+   i = 0;
+   for_each_online_cpu(cpu)
+   *per_cpu_ptr(vi->vq_index, cpu) =
+   ++i % vi->curr_queue_pairs;
+
vi->affinity_hint_set = false;
+   }
 }
 
 static void virtnet_get_ringparam(struct net_device *dev,
@@ -1127,12 +1144,19 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
 
 /* To avoid contending a lock hold by a vcpu who would exit to host, select the
  * txq based on the processor id.
- * TODO: handle cpu hotplug.
  */
 static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
 {
-   int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
- smp_processor_id();
+   int txq;
+   struct virtnet_info *vi = netdev_priv(dev);
+
+   if (skb_rx_queue_recorded(skb)) {
+   txq = skb_get_rx_queue(skb);
+   } else {
+   txq = *__this_cpu_ptr(vi->vq_index);
+   if (txq == -1)
+   txq = 0;
+   }
 
while (unlikely(txq >= dev->real_num_tx_queues))
txq -= dev->real_num_tx_queues;
@@ -1453,6 +1477,10 @@ static int virtnet_probe(struct virtio_device *vdev)
if (vi->stats == NULL)
goto free;
 
+   vi->vq_index = alloc_percpu(int);
+   if (vi->vq_index == NULL)
+   goto free_stats;
+
mutex_init(&vi->config_lock);
vi->config_enable = true;
INIT_WORK(&vi->config_work, virtnet_config_changed_work);
@@ -1476,7 +1504,7 @@ static int virtnet_probe(struct virtio_device *vdev)
/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
err = init_vqs(vi);
if (err)
-   goto free_stats;
+   goto free_index;
 
netif_set_real_num_tx_queues(dev, 1);
netif_set_real_num_rx_queues(dev, 1);
@@ -1520,6 +1548,8 @@ free_recv_bufs:
 free_vqs:
cancel_delayed_work_sync(&vi->refill);
virtnet_del_vqs(vi);
+free_index:
+   free_percpu(vi->vq_index);
 free_stats:
free_percpu(vi->stats);
 free:
@@ -1554,6 +1584,7 @@ static void virtnet_remove(struct virtio_device *vdev)
 
flush_work(&vi->config_work);
 
+   free_percpu(vi->vq_index);
free_percpu(vi->stats);
free_netdev(vi->dev);

[PATCH V4 2/2] virtio-net: reset virtqueue affinity when doing cpu hotplug

2013-01-10 Thread Wanlong Gao

Add a cpu notifier to virtio-net so that we can reset the
virtqueue affinity when cpu hotplug happens. This improves
performance by enabling or disabling the virtqueue
affinity after cpu hotplug.

Cc: Rusty Russell 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Eric Dumazet 
Cc: virtualizat...@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Signed-off-by: Wanlong Gao 
---
 drivers/net/virtio_net.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ca17a58..e0b1f25 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int napi_weight = 128;
 module_param(napi_weight, int, 0444);
@@ -126,6 +127,9 @@ struct virtnet_info {
 
/* Per-cpu variable to show the mapping from CPU to virtqueue */
int __percpu *vq_index;
+
+   /* CPU hot plug notifier */
+   struct notifier_block nb;
 };
 
 struct skb_vnet_hdr {
@@ -1058,6 +1062,23 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
}
 }
 
+static int virtnet_cpu_callback(struct notifier_block *nfb,
+   unsigned long action, void *hcpu)
+{
+   struct virtnet_info *vi = container_of(nfb, struct virtnet_info, nb);
+   switch(action) {
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
+   case CPU_DEAD:
+   case CPU_DEAD_FROZEN:
+   virtnet_set_affinity(vi, true);
+   break;
+   default:
+   break;
+   }
+   return NOTIFY_OK;
+}
+
 static void virtnet_get_ringparam(struct net_device *dev,
struct ethtool_ringparam *ring)
 {
@@ -1527,6 +1548,13 @@ static int virtnet_probe(struct virtio_device *vdev)
}
}
 
+   vi->nb.notifier_call = &virtnet_cpu_callback;
+   err = register_hotcpu_notifier(&vi->nb);
+   if (err) {
+   pr_debug("virtio_net: registering cpu notifier failed\n");
+   goto free_recv_bufs;
+   }
+
/* Assume link up if device can't report link status,
   otherwise get link status from config. */
if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
@@ -1573,6 +1601,8 @@ static void virtnet_remove(struct virtio_device *vdev)
 {
struct virtnet_info *vi = vdev->priv;
 
+   unregister_hotcpu_notifier(&vi->nb);
+
/* Prevent config work handler from accessing the device. */
mutex_lock(&vi->config_lock);
vi->config_enable = false;
-- 
1.8.1
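
For readers following along, here is a minimal userspace sketch of the pattern the patch relies on: a notifier structure embedded in the driver's private data, recovered in the callback with a container_of-style macro. All names below (sketch_nic, sketch_notifier, the SKETCH_CPU_* values) are made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical action codes; the kernel defines its own values. */
enum { SKETCH_CPU_ONLINE = 1, SKETCH_CPU_UP_PREPARE = 2, SKETCH_CPU_DEAD = 3 };

/* Stand-in for struct notifier_block. */
struct sketch_notifier {
	int (*notifier_call)(struct sketch_notifier *nb, unsigned long action);
};

/* Stand-in for the driver's private data with an embedded notifier. */
struct sketch_nic {
	int affinity_resets;	/* counts how often affinity was reset */
	struct sketch_notifier nb;
};

static int sketch_cpu_callback(struct sketch_notifier *nb, unsigned long action)
{
	struct sketch_nic *nic;

	assert(nb != NULL);
	/* Recover the enclosing structure from the embedded member. */
	nic = sketch_container_of(nb, struct sketch_nic, nb);

	switch (action) {
	case SKETCH_CPU_ONLINE:
	case SKETCH_CPU_DEAD:
		nic->affinity_resets++;	/* stands in for the affinity reset */
		break;
	default:
		break;			/* other events are ignored */
	}
	return 0;
}
```

Calling nic.nb.notifier_call(&nic.nb, SKETCH_CPU_ONLINE) on a struct sketch_nic whose nb.notifier_call points at sketch_cpu_callback bumps the counter, while an unrelated action leaves it alone.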


Re: [PATCH 1/2] Add mempressure cgroup

2013-01-10 Thread Minchan Kim

On Thu, Jan 10, 2013 at 09:38:31PM -0800, Anton Vorontsov wrote:
> On Fri, Jan 11, 2013 at 02:12:10PM +0900, Minchan Kim wrote:
> > On Wed, Jan 09, 2013 at 02:14:49PM -0800, Anton Vorontsov wrote:
> > > On Tue, Jan 08, 2013 at 05:49:49PM +0900, Minchan Kim wrote:
> > > [...]
> > > > Sorry, I still haven't looked at the cgroup part of your implementation,
> > > > but I have had a question for a long time.
> > > > 
> > > > How can we avoid false positives with zones and NUMA?
> > > > I mean, the DMA zone runs short, so the VM notifies the user, and the
> > > > user frees memory from the NORMAL zone because he can't know which
> > > > zone his pages live in. NUMA is ditto.
> > > 
> > > Um, we count scans irrespective of zones or nodes, i.e. we sum all 'number
> > > of scanned' and 'number of reclaimed' stats. So, it should not be a
> > > problem, as I see it.
> > 
> > Why is it not a problem? For example, think of NORMAL zone reclaim.
> > The page allocator tries to allocate pages from the NORMAL zone, falling
> > back to the DMA zone, and your logic could trigger mpc_shrinker. So
> > processes A, B and C start to release their freeable memory, but
> > unfortunately the freed pages are all HIGHMEM pages. Why should processes
> > release memory unnecessarily? Is there any way for a process to detect
> > such a situation at user level before releasing its freeable memory?
> 
> Ahh. You're talking about the shrinker interface. Yes, there is no way to
> tell if the freed memory will be actually "released" (and if not, then
> yes, we released it unnecessarily).

I'm not talking about whether the memory is actually "released" or not.
I assume the application really does release pages, but those pages may sit
in other zones, NOT the zone the kernel is targeting. In that case the kernel
could keep asking until the target zone has enough free memory.

> 
> But that's not only problem with NUMA or zones. Shared pages are in the
> same boat, right? An app might free some memory, but as another process
> might be still using it, we don't know whether our action helps or not.

It's not what I meant.

> 
> The situation is a little bit easier for the in-kernel shrinkers, since we
> have more control over pages, but still, even for the kernel shrinkers, we
> don't provide all the information (only gfpmask, which, I just looked into
> the random user, drivers/gpu/drm/ttm, sometimes is not used).
> 
> So, answering your question: no, I don't know how to solve it for the
> userland. But I also don't think it's a big concern (especially if we make
> it cgroup-aware -- this would be cgroup's worry then, i.e. we might
> isolate task to only some nodes/zones, if we really care about precise
> accounting?). But I'm surely open for ideas. :)

My dumb idea is to notify the user only when reclaim is triggered by
__GFP_HIGHMEM|__GFP_MOVABLE, which is the gfp_t used for most application
memory. :)
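
Purely as an illustration of that idea, the filter could be a simple mask test along the lines of the sketch below. The bit values are placeholders and not the kernel's real gfp bits, the function name is hypothetical, and requiring both bits to be set is one reading of the proposal rather than something confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real gfp bits differ. */
#define SKETCH_GFP_DMA      0x01u
#define SKETCH_GFP_HIGHMEM  0x02u
#define SKETCH_GFP_MOVABLE  0x08u

/*
 * Decide whether a reclaim pass should raise a userspace notification.
 * Assumption: only allocations carrying both HIGHMEM and MOVABLE count
 * as "application memory" for this purpose.
 */
static bool sketch_should_notify_user(unsigned int gfp_mask)
{
	const unsigned int user_bits = SKETCH_GFP_HIGHMEM | SKETCH_GFP_MOVABLE;

	/* The combined mask must be non-zero for the test to mean anything. */
	assert(user_bits != 0);
	return (gfp_mask & user_bits) == user_bits;
}
```

With this reading, reclaim on behalf of a DMA-only allocation would stay silent, so userspace would not be asked to free pages that cannot help the constrained zone.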


> 
> Thanks!
> 
> Anton
> 

-- 
Kind regards,
Minchan Kim

linux-next: build failure after merge of the final tree (powerpc tree related)

2013-01-10 Thread Stephen Rothwell

Hi all,

After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:

arch/powerpc/kernel/kgdb.c: In function 'kgdb_arch_exit':
arch/powerpc/kernel/kgdb.c:492:2: error: '__debugger_breakx_match' undeclared 
(first use in this function)

Caused by commit 9422de3e953d ("powerpc: Hardware breakpoints rewrite to
handle non DABR breakpoint registers").  I applied "powerpc: Fix typo in
breakpoint kgdb code" from Mikey for this.

arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:1204: Error: attempt to move .org backwards

Not sure what caused that - probably a combination of patches adding code
low down.  I have just left this broken for today.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



[PATCH v4 11/14] ARM: dts: add AM33XX MMC support

2013-01-10 Thread Matt Porter

Adds AM33XX MMC support for am335x-bone, am335x-evm, and
am335x-evmsk.

Signed-off-by: Matt Porter 
---
 arch/arm/boot/dts/am335x-bone.dts  |7 +++
 arch/arm/boot/dts/am335x-evm.dts   |7 +++
 arch/arm/boot/dts/am335x-evmsk.dts |7 +++
 arch/arm/boot/dts/am33xx.dtsi  |   28 
 4 files changed, 49 insertions(+)

diff --git a/arch/arm/boot/dts/am335x-bone.dts 
b/arch/arm/boot/dts/am335x-bone.dts
index 11b240c..a154ce0 100644
--- a/arch/arm/boot/dts/am335x-bone.dts
+++ b/arch/arm/boot/dts/am335x-bone.dts
@@ -120,6 +120,8 @@
};
 
ldo3_reg: regulator@5 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <330>;
regulator-always-on;
};
 
@@ -136,3 +138,8 @@
 _emac1 {
phy_id = <_mdio>, <1>;
 };
+
+ {
+   status = "okay";
+   vmmc-supply = <_reg>;
+};
diff --git a/arch/arm/boot/dts/am335x-evm.dts b/arch/arm/boot/dts/am335x-evm.dts
index d649644..2907da6 100644
--- a/arch/arm/boot/dts/am335x-evm.dts
+++ b/arch/arm/boot/dts/am335x-evm.dts
@@ -232,6 +232,8 @@
};
 
vmmc_reg: regulator@12 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <330>;
regulator-always-on;
};
};
@@ -244,3 +246,8 @@
 _emac1 {
phy_id = <_mdio>, <1>;
 };
+
+ {
+   status = "okay";
+   vmmc-supply = <_reg>;
+};
diff --git a/arch/arm/boot/dts/am335x-evmsk.dts 
b/arch/arm/boot/dts/am335x-evmsk.dts
index f5a6162..f050c46 100644
--- a/arch/arm/boot/dts/am335x-evmsk.dts
+++ b/arch/arm/boot/dts/am335x-evmsk.dts
@@ -244,7 +244,14 @@
};
 
vmmc_reg: regulator@12 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <330>;
regulator-always-on;
};
};
 };
+
+ {
+   status = "okay";
+   vmmc-supply = <_reg>;
+};
diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index e711ffb..278b75d 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -235,6 +235,34 @@
status = "disabled";
};
 
+   mmc1: mmc@4806 {
+   compatible = "ti,omap3-hsmmc";
+   ti,hwmods = "mmc1";
+   ti,dual-volt;
+   ti,needs-special-reset;
+   dmas = < 24
+25>;
+   dma-names = "tx", "rx";
+   status = "disabled";
+   };
+
+   mmc2: mmc@481d8000 {
+   compatible = "ti,omap3-hsmmc";
+   ti,hwmods = "mmc2";
+   ti,needs-special-reset;
+   dmas = < 2
+3>;
+   dma-names = "tx", "rx";
+   status = "disabled";
+   };
+
+   mmc3: mmc@4781 {
+   compatible = "ti,omap3-hsmmc";
+   ti,hwmods = "mmc3";
+   ti,needs-special-reset;
+   status = "disabled";
+   };
+
wdt2: wdt@44e35000 {
compatible = "ti,omap3-wdt";
ti,hwmods = "wd_timer2";
-- 
1.7.9.5


[PATCH v4 03/14] ARM: edma: add AM33XX support to the private EDMA API

2013-01-10 Thread Matt Porter

Adds support for parsing the TI EDMA DT data into the required
EDMA private API platform data. Enables runtime PM support to
initialize the EDMA hwmod. Adds AM33XX EDMA crossbar event mux
support.

Signed-off-by: Matt Porter 
---
 arch/arm/common/edma.c |  314 ++--
 include/linux/platform_data/edma.h |1 +
 2 files changed, 306 insertions(+), 9 deletions(-)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index a3d189d..1951d63 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -24,6 +24,13 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 
 #include 
 
@@ -723,6 +730,9 @@ EXPORT_SYMBOL(edma_free_channel);
  */
 int edma_alloc_slot(unsigned ctlr, int slot)
 {
+   if (!edma_cc[ctlr])
+   return -EINVAL;
+
if (slot >= 0)
slot = EDMA_CHAN_SLOT(slot);
 
@@ -1366,31 +1376,291 @@ void edma_clear_event(unsigned channel)
 EXPORT_SYMBOL(edma_clear_event);
 
 /*---*/
+static int edma_of_read_u32_to_s8_array(const struct device_node *np,
+const char *propname, s8 *out_values,
+size_t sz)
+{
+   struct property *prop = of_find_property(np, propname, NULL);
+   const __be32 *val;
+
+   if (!prop)
+   return -EINVAL;
+   if (!prop->value)
+   return -ENODATA;
+   if ((sz * sizeof(u32)) > prop->length)
+   return -EOVERFLOW;
+
+   val = prop->value;
+
+   while (sz--)
+   *out_values++ = (s8)(be32_to_cpup(val++) & 0xff);
+
+   /* Terminate it */
+   *out_values++ = -1;
+   *out_values++ = -1;
+
+   return 0;
+}
+
+static int edma_of_read_u32_to_s16_array(const struct device_node *np,
+const char *propname, s16 *out_values,
+size_t sz)
+{
+   struct property *prop = of_find_property(np, propname, NULL);
+   const __be32 *val;
+
+   if (!prop)
+   return -EINVAL;
+   if (!prop->value)
+   return -ENODATA;
+   if ((sz * sizeof(u32)) > prop->length)
+   return -EOVERFLOW;
+
+   val = prop->value;
+
+   while (sz--)
+   *out_values++ = (s16)(be32_to_cpup(val++) & 0x);
+
+   /* Terminate it */
+   *out_values++ = -1;
+   *out_values++ = -1;
+
+   return 0;
+}
+
+static int edma_xbar_event_map(struct device *dev,
+  struct device_node *node,
+  struct edma_soc_info *pdata, int len)
+{
+   int ret = 0;
+   int i;
+   struct resource res;
+   void *xbar;
+   const s16 (*xbar_chans)[2];
+   u32 shift, offset, mux;
+
+   xbar_chans = devm_kzalloc(dev,
+ len/sizeof(s16) + 2*sizeof(s16),
+ GFP_KERNEL);
+   if (!xbar_chans)
+   return -ENOMEM;
+
+   ret = of_address_to_resource(node, 1, &res);
+   if (IS_ERR_VALUE(ret))
+   return -EIO;
+
+   xbar = devm_ioremap(dev, res.start, resource_size(&res));
+   if (!xbar)
+   return -ENOMEM;
+
+   ret = edma_of_read_u32_to_s16_array(node,
+   "ti,edma-xbar-event-map",
+   (s16 *)xbar_chans,
+   len/sizeof(u32));
+   if (IS_ERR_VALUE(ret))
+   return -EIO;
+
+   for (i = 0; xbar_chans[i][0] != -1; i++) {
+   shift = (xbar_chans[i][1] % 4) * 8;
+   offset = xbar_chans[i][1] >> 2;
+   offset <<= 2;
+   mux = readl((void *)((u32)xbar + offset));
+   mux &= ~(0xff << shift);
+   mux |= xbar_chans[i][0] << shift;
+   writel(mux, (void *)((u32)xbar + offset));
+   }
+
+   pdata->xbar_chans = xbar_chans;
+
+   return 0;
+}
+
+static int edma_of_parse_dt(struct device *dev,
+   struct device_node *node,
+   struct edma_soc_info *pdata)
+{
+   int ret = 0;
+   u32 value;
+   struct property *prop;
+   size_t sz;
+   struct edma_rsv_info *rsv_info;
+   const s16 (*rsv_chans)[2], (*rsv_slots)[2];
+   const s8 (*queue_tc_map)[2], (*queue_priority_map)[2];
+
+   memset(pdata, 0, sizeof(struct edma_soc_info));
+
+   ret = of_property_read_u32(node, "dma-channels", &value);
+   if (ret < 0)
+   return ret;
+   pdata->n_channel = value;
+
+   ret = of_property_read_u32(node, "ti,edma-regions", &value);
+   if (ret < 0)
+   return ret;
+   pdata->n_region = value;
+
+   ret = of_property_read_u32(node, "ti,edma-slots", &value);
+   if (ret < 0)
+   return ret;
+

[PATCH v4 05/14] dmaengine: edma: Add TI EDMA device tree binding

2013-01-10 Thread Matt Porter

The binding definition is based on the generic DMA controller
binding.

Signed-off-by: Matt Porter 
---
 Documentation/devicetree/bindings/dma/ti-edma.txt |   51 +
 1 file changed, 51 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/ti-edma.txt

diff --git a/Documentation/devicetree/bindings/dma/ti-edma.txt 
b/Documentation/devicetree/bindings/dma/ti-edma.txt
new file mode 100644
index 000..3344345
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/ti-edma.txt
@@ -0,0 +1,51 @@
+TI EDMA
+
+Required properties:
+- compatible : "ti,edma3"
+- ti,hwmods: Name of the hwmods associated to the EDMA
+- ti,edma-regions: Number of regions
+- ti,edma-slots: Number of slots
+- ti,edma-queue-tc-map: List of transfer control to queue mappings
+- ti,edma-queue-priority-map: List of queue priority mappings
+- ti,edma-default-queue: Default queue value
+
+Optional properties:
+- ti,edma-reserved-channels: List of reserved channel regions
+- ti,edma-reserved-slots: List of reserved slot regions
+- ti,edma-xbar-event-map: Crossbar event to channel map
+
+Example:
+
+edma: edma@4900 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0x4900 0x1>;
+   interrupt-parent = <>;
+   interrupts = <12 13 14>;
+   compatible = "ti,edma3";
+   ti,hwmods = "tpcc", "tptc0", "tptc1", "tptc2";
+   #dma-cells = <1>;
+   dma-channels = <64>;
+   ti,edma-regions = <4>;
+   ti,edma-slots = <256>;
+   ti,edma-reserved-channels = <0  2
+14 2
+26 6
+48 4
+56 8>;
+   ti,edma-reserved-slots = <0  2
+ 14 2
+ 26 6
+ 48 4
+ 56 8
+ 64 127>;
+   ti,edma-queue-tc-map = <0 0
+   1 1
+   2 2>;
+   ti,edma-queue-priority-map = <0 0
+ 1 1
+ 2 2>;
+   ti,edma-default-queue = <0>;
+   ti,edma-xbar-event-map = <1 12
+ 2 13>;
+};
-- 
1.7.9.5


[PATCH v4 02/14] ARM: edma: remove unused transfer controller handlers

2013-01-10 Thread Matt Porter

Fix the build on OMAP, where these IRQs are undefined on AM33xx.
The transfer controller error interrupt handlers were hardcoded
as disabled, so they are unused code; simply remove them.

Signed-off-by: Matt Porter 
---
 arch/arm/common/edma.c |   37 -
 1 file changed, 37 deletions(-)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 4411087..a3d189d 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -494,26 +494,6 @@ static irqreturn_t dma_ccerr_handler(int irq, void *data)
return IRQ_HANDLED;
 }
 
-/**
- *
- * Transfer controller error interrupt handlers
- *
- */
-
-#define tc_errs_handled false   /* disabled as long as they're NOPs */
-
-static irqreturn_t dma_tc0err_handler(int irq, void *data)
-{
-   dev_dbg(data, "dma_tc0err_handler\n");
-   return IRQ_HANDLED;
-}
-
-static irqreturn_t dma_tc1err_handler(int irq, void *data)
-{
-   dev_dbg(data, "dma_tc1err_handler\n");
-   return IRQ_HANDLED;
-}
-
 static int reserve_contiguous_slots(int ctlr, unsigned int id,
 unsigned int num_slots,
 unsigned int start_slot)
@@ -1538,23 +1518,6 @@ static int __init edma_probe(struct platform_device 
*pdev)
arch_num_cc++;
}
 
-   if (tc_errs_handled) {
-   status = request_irq(IRQ_TCERRINT0, dma_tc0err_handler, 0,
-   "edma_tc0", &pdev->dev);
-   if (status < 0) {
-   dev_dbg(&pdev->dev, "request_irq %d failed --> %d\n",
-   IRQ_TCERRINT0, status);
-   return status;
-   }
-   status = request_irq(IRQ_TCERRINT, dma_tc1err_handler, 0,
-   "edma_tc1", &pdev->dev);
-   if (status < 0) {
-   dev_dbg(&pdev->dev, "request_irq %d --> %d\n",
-   IRQ_TCERRINT, status);
-   return status;
-   }
-   }
-
return 0;
 
 fail:
-- 
1.7.9.5


[PATCH v4 06/14] ARM: dts: add AM33XX EDMA support

2013-01-10 Thread Matt Porter

Adds AM33XX EDMA support to the am33xx.dtsi as documented in
Documentation/devicetree/bindings/dma/ti-edma.txt

Signed-off-by: Matt Porter 
---
 arch/arm/boot/dts/am33xx.dtsi |   20 
 1 file changed, 20 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index c2f14e8..e711ffb 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -87,6 +87,26 @@
reg = <0x4820 0x1000>;
};
 
+   edma: edma@4900 {
+   compatible = "ti,edma3";
+   ti,hwmods = "tpcc", "tptc0", "tptc1", "tptc2";
+   reg =   <0x4900 0x1>,
+   <0x44e10f90 0x10>;
+   interrupt-parent = <>;
+   interrupts = <12 13 14>;
+   #dma-cells = <1>;
+   dma-channels = <64>;
+   ti,edma-regions = <4>;
+   ti,edma-slots = <256>;
+   ti,edma-queue-tc-map = <0 0
+   1 1
+   2 2>;
+   ti,edma-queue-priority-map = <0 0
+ 1 1
+ 2 2>;
+   ti,edma-default-queue = <0>;
+   };
+
gpio1: gpio@44e07000 {
compatible = "ti,omap4-gpio";
ti,hwmods = "gpio1";
-- 
1.7.9.5


[PATCH v4 10/14] mmc: omap_hsmmc: add generic DMA request support to the DT binding

2013-01-10 Thread Matt Porter

The binding definition is based on the generic DMA request binding.

Signed-off-by: Matt Porter 
---
 .../devicetree/bindings/mmc/ti-omap-hsmmc.txt  |   25 +++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt 
b/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt
index ed271fc..826cc51 100644
--- a/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt
+++ b/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt
@@ -20,8 +20,28 @@ ti,dual-volt: boolean, supports dual voltage cards
 ti,non-removable: non-removable slot (like eMMC)
 ti,needs-special-reset: Requires a special softreset sequence
 ti,needs-special-hs-handling: HSMMC IP needs special setting for handling High 
Speed
+dmas: DMA controller phandle and DMA request value ordered pair
+One tx and one rx pair is required.
+dma-names: DMA request names. These strings correspond 1:1 with
+the ordered pairs in dmas. The RX request must be "rx" and the
+TX request must be "tx".
+
+Examples:
+
+[hwmod populated DMA resources]
+
+   mmc1: mmc@0x4809c000 {
+   compatible = "ti,omap4-hsmmc";
+   reg = <0x4809c000 0x400>;
+   ti,hwmods = "mmc1";
+   ti,dual-volt;
+   bus-width = <4>;
+   vmmc-supply = <>; /* phandle to regulator node */
+   ti,non-removable;
+   };
+
+[generic DMA request binding]
 
-Example:
mmc1: mmc@0x4809c000 {
compatible = "ti,omap4-hsmmc";
reg = <0x4809c000 0x400>;
@@ -30,4 +50,7 @@ Example:
bus-width = <4>;
vmmc-supply = <>; /* phandle to regulator node */
ti,non-removable;
+   dmas = < 24
+25>;
+   dma-names = "tx", "rx";
};
-- 
1.7.9.5


[PATCH v4 04/14] dmaengine: edma: enable build for AM33XX

2013-01-10 Thread Matt Porter

Enable TI EDMA option on OMAP.

Signed-off-by: Matt Porter 
---
 drivers/dma/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index d4c1218..20ef955 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -221,7 +221,7 @@ config SIRF_DMA
 
 config TI_EDMA
tristate "TI EDMA support"
-   depends on ARCH_DAVINCI
+   depends on ARCH_DAVINCI || ARCH_OMAP
select DMA_ENGINE
select DMA_VIRTUAL_CHANNELS
default n
-- 
1.7.9.5


[PATCH v4 13/14] spi: omap2-mcspi: add generic DMA request support to the DT binding

2013-01-10 Thread Matt Porter

The binding definition is based on the generic DMA request binding.

Signed-off-by: Matt Porter 
---
 Documentation/devicetree/bindings/spi/omap-spi.txt |   28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/spi/omap-spi.txt 
b/Documentation/devicetree/bindings/spi/omap-spi.txt
index 938809c..3bd8eed 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.txt
+++ b/Documentation/devicetree/bindings/spi/omap-spi.txt
@@ -10,7 +10,18 @@ Required properties:
  input. The default is D0 as input and
  D1 as output.
 
-Example:
+Optional properties:
+- dmas: List of DMA controller phandle and DMA request ordered
+   pairs. One tx and one rx pair is required for each chip
+   select.
+- dma-names: List of DMA request names. These strings correspond
+   1:1 with the ordered pairs in dmas. The string naming is
+   to be "rxN" and "txN" for RX and TX requests,
+   respectively, where N equals the chip select number.
+
+Examples:
+
+[hwmod populated DMA resources]
 
 mcspi1: mcspi@1 {
 #address-cells = <1>;
@@ -20,3 +31,18 @@ mcspi1: mcspi@1 {
 ti,spi-num-cs = <4>;
 };
 
+[generic DMA request binding]
+
+mcspi1: mcspi@1 {
+#address-cells = <1>;
+#size-cells = <0>;
+compatible = "ti,omap4-mcspi";
+ti,hwmods = "mcspi1";
+ti,spi-num-cs = <2>;
+dmas = < 42
+43
+44
+45>;
+dma-names = "tx0", "rx0", "tx1", "rx1";
+};
+
-- 
1.7.9.5


[PATCH v4 14/14] ARM: dts: add AM33XX SPI DMA support

2013-01-10 Thread Matt Porter

Adds DMA resources to the AM33XX SPI nodes.

Signed-off-by: Matt Porter 
---
 arch/arm/boot/dts/am33xx.dtsi |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index 278b75d..8fd3648 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -356,6 +356,11 @@
interrupt = <65>;
ti,spi-num-cs = <2>;
ti,hwmods = "spi0";
+   dmas = < 16
+17
+18
+19>;
+   dma-names = "tx0", "rx0", "tx1", "rx1";
status = "disabled";
};
 
@@ -367,6 +372,11 @@
interrupt = <125>;
ti,spi-num-cs = <2>;
ti,hwmods = "spi1";
+   dmas = < 42
+43
+44
+45>;
+   dma-names = "tx0", "rx0", "tx1", "rx1";
status = "disabled";
};
 
-- 
1.7.9.5


[PATCH v4 12/14] spi: omap2-mcspi: convert to dma_request_slave_channel_compat()

2013-01-10 Thread Matt Porter

Convert dmaengine channel requests to use
dma_request_slave_channel_compat(). This supports the DT case of
platforms requiring channel selection from either the OMAP DMA or
the EDMA engine. AM33xx only boots from DT and is the only user
implementing EDMA so in the !DT case we can default to the OMAP DMA
filter.

Signed-off-by: Matt Porter 
---
 drivers/spi/spi-omap2-mcspi.c |   65 -
 1 file changed, 45 insertions(+), 20 deletions(-)

diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
index b610f52..2c02c02 100644
--- a/drivers/spi/spi-omap2-mcspi.c
+++ b/drivers/spi/spi-omap2-mcspi.c
@@ -102,6 +102,9 @@ struct omap2_mcspi_dma {
 
struct completion dma_tx_completion;
struct completion dma_rx_completion;
+
+   char dma_rx_ch_name[14];
+   char dma_tx_ch_name[14];
 };
 
 /* use PIO for small transfers, avoiding DMA setup/teardown overhead and
@@ -822,14 +825,23 @@ static int omap2_mcspi_request_dma(struct spi_device *spi)
dma_cap_zero(mask);
dma_cap_set(DMA_SLAVE, mask);
sig = mcspi_dma->dma_rx_sync_dev;
-   mcspi_dma->dma_rx = dma_request_channel(mask, omap_dma_filter_fn, &sig);
+
+   mcspi_dma->dma_rx =
+   dma_request_slave_channel_compat(mask, omap_dma_filter_fn,
+&sig, &spi->dev,
+mcspi_dma->dma_rx_ch_name);
+
if (!mcspi_dma->dma_rx) {
dev_err(&spi->dev, "no RX DMA engine channel for McSPI\n");
return -EAGAIN;
}
 
sig = mcspi_dma->dma_tx_sync_dev;
-   mcspi_dma->dma_tx = dma_request_channel(mask, omap_dma_filter_fn, &sig);
+   mcspi_dma->dma_tx =
+   dma_request_slave_channel_compat(mask, omap_dma_filter_fn,
+&sig, &spi->dev,
+mcspi_dma->dma_tx_ch_name);
+
if (!mcspi_dma->dma_tx) {
dev_err(&spi->dev, "no TX DMA engine channel for McSPI\n");
dma_release_channel(mcspi_dma->dma_rx);
@@ -1223,29 +1235,42 @@ static int omap2_mcspi_probe(struct platform_device 
*pdev)
goto free_master;
 
for (i = 0; i < master->num_chipselect; i++) {
-   char dma_ch_name[14];
+   char *dma_rx_ch_name = mcspi->dma_channels[i].dma_rx_ch_name;
+   char *dma_tx_ch_name = mcspi->dma_channels[i].dma_tx_ch_name;
struct resource *dma_res;
 
-   sprintf(dma_ch_name, "rx%d", i);
-   dma_res = platform_get_resource_byname(pdev, IORESOURCE_DMA,
-   dma_ch_name);
-   if (!dma_res) {
-   dev_dbg(&pdev->dev, "cannot get DMA RX channel\n");
-   status = -ENODEV;
-   break;
-   }
+   sprintf(dma_rx_ch_name, "rx%d", i);
+   if (!pdev->dev.of_node) {
+   dma_res =
+   platform_get_resource_byname(pdev,
+IORESOURCE_DMA,
+dma_rx_ch_name);
+   if (!dma_res) {
+   dev_dbg(&pdev->dev,
+   "cannot get DMA RX channel\n");
+   status = -ENODEV;
+   break;
+   }
 
-   mcspi->dma_channels[i].dma_rx_sync_dev = dma_res->start;
-   sprintf(dma_ch_name, "tx%d", i);
-   dma_res = platform_get_resource_byname(pdev, IORESOURCE_DMA,
-   dma_ch_name);
-   if (!dma_res) {
-   dev_dbg(&pdev->dev, "cannot get DMA TX channel\n");
-   status = -ENODEV;
-   break;
+   mcspi->dma_channels[i].dma_rx_sync_dev =
+   dma_res->start;
}
+   sprintf(dma_tx_ch_name, "tx%d", i);
+   if (!pdev->dev.of_node) {
+   dma_res =
+   platform_get_resource_byname(pdev,
+IORESOURCE_DMA,
+dma_tx_ch_name);
+   if (!dma_res) {
+   dev_dbg(&pdev->dev,
+   "cannot get DMA TX channel\n");
+   status = -ENODEV;
+   break;
+   }
 
-   mcspi->dma_channels[i].dma_tx_sync_dev = dma_res->start;
+   mcspi->dma_channels[i].dma_tx_sync_dev =
+   dma_res->start;
+   }
}
 
if (status < 0)
-- 
1.7.9.5
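
As a side note on the per-chip-select channel names, the sketch below shows how "rxN"/"txN" strings fit into the 14-byte arrays added to struct omap2_mcspi_dma. It uses snprintf() with an explicit length check instead of the sprintf() in the patch; that substitution and the helper name are choices made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_DMA_CH_NAME_LEN 14	/* same size as the arrays in the patch */

/*
 * Format a channel name such as "rx0" or "tx1".
 * Returns 0 on success, -1 if the name would not fit in the buffer.
 */
static int sketch_format_dma_ch_name(char *buf, size_t len,
				     const char *dir, int cs)
{
	int n;

	assert(buf != NULL && dir != NULL);
	n = snprintf(buf, len, "%s%d", dir, cs);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting strings are the ones a DT node lists in dma-names, so a node with two chip selects would carry "tx0", "rx0", "tx1", "rx1" as in the binding example.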


[PATCH v4 09/14] mmc: omap_hsmmc: set max_segs based on dma engine limitations

2013-01-10 Thread Matt Porter

The EDMA DMAC has a hardware limitation that prevents supporting
scatter gather lists with any number of segments. The DMA Engine
API reports the maximum number of segments a channel can support
via the optional dma_get_channel_caps() API. If the nr_segs
capability is present, the value is used to configure mmc->max_segs
appropriately.

Signed-off-by: Matt Porter 
---
 drivers/mmc/host/omap_hsmmc.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index e79b12d..f74bd69 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -1769,6 +1769,7 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
const struct of_device_id *match;
dma_cap_mask_t mask;
unsigned tx_req, rx_req;
+   struct dmaengine_chan_caps *dma_chan_caps;
struct pinctrl *pinctrl;
 
match = of_match_device(of_match_ptr(omap_mmc_of_match), &pdev->dev);
@@ -1935,6 +1936,11 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
goto err_irq;
}
 
+   /* Some DMA Engines only handle a limited number of SG segments */
+   dma_chan_caps = dma_get_channel_caps(host->rx_chan, DMA_DEV_TO_MEM);
+   if (dma_chan_caps && dma_chan_caps->seg_nr)
+   mmc->max_segs = dma_chan_caps->seg_nr;
+
/* Request IRQ for MMC operations */
ret = request_irq(host->irq, omap_hsmmc_irq, 0,
mmc_hostname(mmc), host);
-- 
1.7.9.5
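
The max_segs selection described above boils down to "use the reported segment limit if there is one, otherwise keep the default". A small userspace sketch of that decision, with a made-up caps structure standing in for the one returned by dma_get_channel_caps():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the channel capabilities structure. */
struct sketch_chan_caps {
	unsigned int seg_nr;	/* 0 means no segment limit was reported */
};

/*
 * Pick the maximum number of scatter-gather segments: the engine's
 * limit when caps are present and non-zero, the host default otherwise.
 */
static unsigned int sketch_pick_max_segs(const struct sketch_chan_caps *caps,
					 unsigned int default_segs)
{
	assert(default_segs > 0);

	if (caps && caps->seg_nr)
		return caps->seg_nr;
	return default_segs;
}
```

Both a missing caps structure and a zero seg_nr fall back to the default here, which matches the two-part condition in the hunk above.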


[PATCH v4 07/14] dmaengine: add dma_request_slave_channel_compat()

2013-01-10 Thread Matt Porter

Adds a dma_request_slave_channel_compat() wrapper which accepts
both the arguments from dma_request_channel() and
dma_request_slave_channel(). Based on whether the driver is
instantiated via DT, the appropriate channel request call will be
made.

This allows for a much cleaner migration of drivers to the
dmaengine DT API as platforms continue to be mixed between those
that boot using DT and those that do not.

Suggested-by: Tony Lindgren 
Signed-off-by: Matt Porter 
---
 include/linux/dmaengine.h |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 9fd0c5b..64f9f69 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -1047,6 +1047,16 @@ void dma_run_dependencies(struct dma_async_tx_descriptor 
*tx);
 struct dma_chan *dma_find_channel(enum dma_transaction_type tx_type);
 struct dma_chan *net_dma_find_channel(void);
 #define dma_request_channel(mask, x, y) __dma_request_channel(&(mask), x, y)
+static inline struct dma_chan
+*dma_request_slave_channel_compat(dma_cap_mask_t mask, dma_filter_fn fn,
+ void *fn_param, struct device *dev,
+ char *name)
+{
+   if (dev->of_node)
+   return dma_request_slave_channel(dev, name);
+   else
+   return dma_request_channel(mask, fn, fn_param);
+}
 
 /* --- Helper iov-locking functions --- */
 
-- 
1.7.9.5
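
To make the dispatch in the wrapper easier to picture, here is a userspace mock of the same decision. Every type and function below is a stand-in invented for this sketch (mock_device, mock_chan and the mock_request_* helpers); only the "DT node present or not" branch reflects the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock device: of_node is non-NULL when instantiated from DT. */
struct mock_device {
	const void *of_node;
};

/* Mock channel that records which request path produced it. */
struct mock_chan {
	const char *source;
};

static struct mock_chan dt_chan = { "dt" };
static struct mock_chan legacy_chan = { "filter" };

/* Stand-in for the DT path: look the channel up by name. */
static struct mock_chan *mock_request_slave_channel(struct mock_device *dev,
						    const char *name)
{
	(void)dev;
	assert(name != NULL);
	return &dt_chan;
}

/* Stand-in for the legacy path: ask a filter function to accept. */
static struct mock_chan *mock_request_channel(int (*fn)(void *), void *param)
{
	return (fn && fn(param)) ? &legacy_chan : NULL;
}

/* Same decision as the wrapper in the patch. */
static struct mock_chan *mock_request_compat(int (*fn)(void *), void *param,
					     struct mock_device *dev,
					     const char *name)
{
	if (dev->of_node)
		return mock_request_slave_channel(dev, name);
	return mock_request_channel(fn, param);
}

/* Hypothetical filter that accepts any channel. */
static int mock_accept_any(void *param)
{
	(void)param;
	return 1;
}
```

A driver converted this way keeps passing its old filter function and parameter, and they are simply ignored on DT-booted platforms.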


[PATCH v4 01/14] ARM: davinci: move private EDMA API to arm/common

2013-01-10 Thread Matt Porter

Move mach-davinci/dma.c to common/edma.c so it can be used
by OMAP (specifically AM33xx) as well. This just moves the
private EDMA API and enables it to build on OMAP.

Signed-off-by: Matt Porter 
---
 arch/arm/Kconfig   |1 +
 arch/arm/common/Kconfig|3 +
 arch/arm/common/Makefile   |1 +
 arch/arm/{mach-davinci/dma.c => common/edma.c} |2 +-
 arch/arm/mach-davinci/Makefile |2 +-
 arch/arm/mach-davinci/board-tnetv107x-evm.c|2 +-
 arch/arm/mach-davinci/davinci.h|2 +-
 arch/arm/mach-davinci/devices-tnetv107x.c  |2 +-
 arch/arm/mach-davinci/devices.c|7 +-
 arch/arm/mach-davinci/dm355.c  |2 +-
 arch/arm/mach-davinci/dm365.c  |2 +-
 arch/arm/mach-davinci/dm644x.c |2 +-
 arch/arm/mach-davinci/dm646x.c |2 +-
 arch/arm/mach-davinci/include/mach/da8xx.h |2 +-
 arch/arm/mach-davinci/include/mach/edma.h  |  267 
 arch/arm/plat-omap/Kconfig |1 +
 drivers/dma/edma.c |2 +-
 drivers/mmc/host/davinci_mmc.c |1 +
 include/linux/mfd/davinci_voicecodec.h |3 +-
 include/linux/platform_data/edma.h |  182 
 include/linux/platform_data/spi-davinci.h  |2 +-
 sound/soc/davinci/davinci-evm.c|1 +
 sound/soc/davinci/davinci-pcm.c|1 +
 sound/soc/davinci/davinci-pcm.h|2 +-
 sound/soc/davinci/davinci-sffsdr.c |6 +-
 25 files changed, 212 insertions(+), 288 deletions(-)
 rename arch/arm/{mach-davinci/dma.c => common/edma.c} (99%)
 delete mode 100644 arch/arm/mach-davinci/include/mach/edma.h
 create mode 100644 include/linux/platform_data/edma.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 67874b8..7637d31 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -932,6 +932,7 @@ config ARCH_DAVINCI
select GENERIC_IRQ_CHIP
select HAVE_IDE
select NEED_MACH_GPIO_H
+   select TI_PRIV_EDMA
select USE_OF
select ZONE_DMA
help
diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
index 45ceeb0..9e32d0d 100644
--- a/arch/arm/common/Kconfig
+++ b/arch/arm/common/Kconfig
@@ -40,3 +40,6 @@ config SHARP_PARAM
 
 config SHARP_SCOOP
bool
+
+config TI_PRIV_EDMA
+   bool
diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index e8a4e58..d09a39b 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_SHARP_PARAM) += sharpsl_param.o
 obj-$(CONFIG_SHARP_SCOOP)  += scoop.o
 obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
 obj-$(CONFIG_ARM_TIMER_SP804)  += timer-sp.o
+obj-$(CONFIG_TI_PRIV_EDMA) += edma.o
diff --git a/arch/arm/mach-davinci/dma.c b/arch/arm/common/edma.c
similarity index 99%
rename from arch/arm/mach-davinci/dma.c
rename to arch/arm/common/edma.c
index a685e97..4411087 100644
--- a/arch/arm/mach-davinci/dma.c
+++ b/arch/arm/common/edma.c
@@ -25,7 +25,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 /* Offsets matching "struct edmacc_param" */
 #define PARM_OPT   0x00
diff --git a/arch/arm/mach-davinci/Makefile b/arch/arm/mach-davinci/Makefile
index fb5c1aa..493a36b 100644
--- a/arch/arm/mach-davinci/Makefile
+++ b/arch/arm/mach-davinci/Makefile
@@ -5,7 +5,7 @@
 
 # Common objects
 obj-y  := time.o clock.o serial.o psc.o \
-  dma.o usb.o common.o sram.o aemif.o
+  usb.o common.o sram.o aemif.o
 
 obj-$(CONFIG_DAVINCI_MUX)  += mux.o
 
diff --git a/arch/arm/mach-davinci/board-tnetv107x-evm.c b/arch/arm/mach-davinci/board-tnetv107x-evm.c
index be30997..86f55ba 100644
--- a/arch/arm/mach-davinci/board-tnetv107x-evm.c
+++ b/arch/arm/mach-davinci/board-tnetv107x-evm.c
@@ -26,12 +26,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/mach-davinci/davinci.h b/arch/arm/mach-davinci/davinci.h
index 12d544b..d26a6bc 100644
--- a/arch/arm/mach-davinci/davinci.h
+++ b/arch/arm/mach-davinci/davinci.h
@@ -23,9 +23,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/arm/mach-davinci/devices-tnetv107x.c b/arch/arm/mach-davinci/devices-tnetv107x.c
index 773ab07..ba37760 100644
--- a/arch/arm/mach-davinci/devices-tnetv107x.c
+++ b/arch/arm/mach-davinci/devices-tnetv107x.c
@@ -18,10 +18,10 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
-#include 
 #include 
 
 #include "clock.h"
diff --git a/arch/arm/mach-davinci/devices.c b/arch/arm/mach-davinci/devices.c
index 4c48a36..3bdf9f7 100644
--- a/arch/arm/mach-davinci/devices.c
+++ b/arch/arm/mach-davinci/devices.c
@@

[PATCH v4 00/14] DMA Engine support for AM33XX

2013-01-10 Thread Matt Porter

Changes since v3:
- Rebased on 3.8-rc3
- No longer an RFC
- Fixed bugs in DT/pdata parsing reported by Vaibhav Bedia
- Restored all the Davinci pdata to const
- Removed max_segs hack in favor of using dma_get_channel_caps()
- Fixed extra parens, __raw_* accessors, and ioremap error checks
  in xbar handling
- Removed excess license info in platform_data/edma.h
- Removed unneeded reserved channels data for AM33xx
- Removed test-specific pinmuxing from dts files
- Adjusted mmc1 node to be disabled by default in the dtsi

Changes since v2:
- Rebased on 3.7-rc1
- Fixed bug in DT/pdata parsing first found by Gururaja
  that turned out to be masked by some toolchains
- Dropped unused mach-omap2/devices.c hsmmc patch
- Added AM33XX crossbar DMA event mux support
- Added am335x-evm support

Changes since v1:
- Rebased on top of mainline from 12250d8
- Dropped the feature removal schedule patch
- Implemented dma_request_slave_channel_compat() and
  converted the mmc and spi drivers to use it
- Dropped unneeded #address-cells and #size-cells from
  EDMA DT support
- Moved private EDMA header to linux/platform_data/ and
  removed some unneeded definitions
- Fixed parsing of optional properties

This series adds DMA Engine support for AM33xx, which uses
an EDMA DMAC. The EDMA DMAC has been previously supported by only
a private API implementation (much like the situation with OMAP
DMA) found on the DaVinci family of SoCs.

The series applies on top of 3.8-rc3 and the following patches:

- TPS65910 REGMAP_IRQ build fix:
  https://patchwork.kernel.org/patch/1857701/
- dmaengine DT support from Vinod's dmaengine_dt branch in
  git://git.infradead.org/users/vkoul/slave-dma.git since
  027478851791df751176398be02a3b1c5f6aa824
- edma dmaengine driver fix:
  https://patchwork.kernel.org/patch/1961521/
- dmaengine dma_get_channel_caps v2:
  https://patchwork.kernel.org/patch/1961601/
- dmaengine edma driver channel caps support v2:
  https://patchwork.kernel.org/patch/1961591/

The approach taken is similar to how OMAP DMA is being converted to
DMA Engine support. With the functional EDMA private API already
existing in mach-davinci/dma.c, we first move that to an ARM common
area so it can be shared. Adding DT and runtime PM support to the
private EDMA API implementation allows it to run on AM33xx. AM33xx
*only* boots using DT so we leverage Jon's generic DT DMA helpers to
register EDMA DMAC with the of_dma framework and then add support
for calling the dma_request_slave_channel() API to both the mmc
and spi drivers.
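
The "try the DT-described channel first, otherwise fall back to the
legacy request" behaviour that the dma_request_slave_channel_compat()
patch in this series is meant to give the mmc and spi drivers can be
modelled in user space. This is only a sketch under stated assumptions:
struct model_dev, model_request_dt() and model_request_compat() are
hypothetical names, not kernel symbols. The name table stands in for a
device tree "dma-names" lookup and legacy_chan stands in for whatever the
filter-function path would return on a board-file boot.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DMA channel handle. */
struct model_chan {
	int id;
};

/* Hypothetical stand-in for a client device. */
struct model_dev {
	const char *const *dt_names;    /* DT channel names, NULL without DT */
	size_t n_dt_names;
	struct model_chan *dt_chans;    /* channel for each DT name */
	struct model_chan *legacy_chan; /* result of the legacy (filter) path */
};

/* Models the DT-based request: resolves only names listed for the device. */
static struct model_chan *model_request_dt(struct model_dev *dev,
					   const char *name)
{
	for (size_t i = 0; i < dev->n_dt_names; i++)
		if (strcmp(dev->dt_names[i], name) == 0)
			return &dev->dt_chans[i];
	return NULL;
}

/* Models the compat request: DT lookup first, legacy path as fallback. */
static struct model_chan *model_request_compat(struct model_dev *dev,
					       const char *name)
{
	struct model_chan *chan = model_request_dt(dev, name);

	if (chan)
		return chan;
	return dev->legacy_chan;
}
```

With this shape a driver needs a single call site whether it was booted
with DT (AM33xx) or from a legacy board file, which appears to be the
point of converting the drivers to one request function.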

With this series both BeagleBone and the AM335x EVM have working
MMC and SPI support.

This is tested on BeagleBone with a SPI framebuffer driver and MMC
rootfs. A trivial gpio DMA event misc driver was used to test the
crossbar DMA event support. It is also tested on the AM335x EVM
with the onboard SPI flash and MMC rootfs. The branch at
https://github.com/ohporter/linux/tree/edma-dmaengine-am33xx-v4
has the complete series, dependencies, and some test
drivers/defconfigs.

Regression testing was done on AM180x-EVM (which also makes use
of the EDMA dmaengine driver and the EDMA private API) using SD,
SPI flash, and the onboard audio supported by the ASoC Davinci
driver. Regression testing was also done on a BeagleBoard xM
booting from the legacy board file using MMC rootfs.

Matt Porter (14):
  ARM: davinci: move private EDMA API to arm/common
  ARM: edma: remove unused transfer controller handlers
  ARM: edma: add AM33XX support to the private EDMA API
  dmaengine: edma: enable build for AM33XX
  dmaengine: edma: Add TI EDMA device tree binding
  ARM: dts: add AM33XX EDMA support
  dmaengine: add dma_request_slave_channel_compat()
  mmc: omap_hsmmc: convert to dma_request_slave_channel_compat()
  mmc: omap_hsmmc: set max_segs based on dma engine limitations
  mmc: omap_hsmmc: add generic DMA request support to the DT binding
  ARM: dts: add AM33XX MMC support
  spi: omap2-mcspi: convert to dma_request_slave_channel_compat()
  spi: omap2-mcspi: add generic DMA request support to the DT binding
  ARM: dts: add AM33XX SPI DMA support

 Documentation/devicetree/bindings/dma/ti-edma.txt  |   51 +
 .../devicetree/bindings/mmc/ti-omap-hsmmc.txt  |   25 +-
 Documentation/devicetree/bindings/spi/omap-spi.txt |   28 +-
 arch/arm/Kconfig   |1 +
 arch/arm/boot/dts/am335x-bone.dts  |7 +
 arch/arm/boot/dts/am335x-evm.dts   |7 +
 arch/arm/boot/dts/am335x-evmsk.dts |7 +
 arch/arm/boot/dts/am33xx.dtsi  |   58 +
 arch/arm/common/Kconfig|3 +
 arch/arm/common/Makefile

Re: [PATCH v3 07/22] sched: set initial load avg of new forked task

2013-01-10 Thread Alex Shi

On 01/11/2013 01:10 PM, Preeti U Murthy wrote:
>> >update_curr(cfs_rq);
>> > -  enqueue_entity_load_avg(cfs_rq, se, flags & ENQUEUE_WAKEUP);
>> > +  enqueue_entity_load_avg(cfs_rq, se, flags);
>> >account_entity_enqueue(cfs_rq, se);
>> >update_cfs_shares(cfs_rq);
>> > 
> I had seen in my experiments that forked tasks with an initial load of
> 0 adversely affect the runqueue lengths. Since the load of these tasks
> takes some time to pick up, the cpus on which the forked tasks are
> scheduled can be chosen as "dst_cpu" many times, and the runqueue
> lengths increase considerably.
> 
> This patch solves this issue by making the forked tasks contribute
> actively to the runqueue load.
> 
> Reviewed-by: Preeti U Murthy
> 

Thanks for review, Preeti! :)


-- 
Thanks Alex

Re: [PATCH 1/2] Add mempressure cgroup

2013-01-10 Thread Anton Vorontsov

On Fri, Jan 11, 2013 at 02:12:10PM +0900, Minchan Kim wrote:
> On Wed, Jan 09, 2013 at 02:14:49PM -0800, Anton Vorontsov wrote:
> > On Tue, Jan 08, 2013 at 05:49:49PM +0900, Minchan Kim wrote:
> > [...]
> > > Sorry, I still haven't looked at the cgroup part of your implementation,
> > > but I have had a question for a long time.
> > > 
> > > How can we avoid false positives with zones and NUMA?
> > > I mean, when the DMA zone is short in the system, the VM notifies the
> > > user and the user frees all memory of the NORMAL zone because he can't
> > > know which pages live in which zones. NUMA is ditto.
> > 
> > Um, we count scans irrespective of zones or nodes, i.e. we sum all 'number
> > of scanned' and 'number of reclaimed' stats. So, it should not be a
> > problem, as I see it.
> 
> Why is it no problem? For example, let's think of normal zone reclaim.
> The page allocator tries to allocate pages from the NORMAL zone, falling
> back to the DMA zone, and your logic could trigger mpc_shrinker. So
> processes A, B, C start to release their freeable memory, but
> unfortunately the freed pages are all HIGHMEM pages. Why should processes
> release memory unnecessarily? Is there any way for a process to detect
> such a situation at user level before releasing the freeable memory?

Ahh. You're talking about the shrinker interface. Yes, there is no way to
tell if the freed memory will be actually "released" (and if not, then
yes, we released it unnecessarily).

But that's not only a problem with NUMA or zones. Shared pages are in the
same boat, right? An app might free some memory, but as another process
might be still using it, we don't know whether our action helps or not.

The situation is a little bit easier for the in-kernel shrinkers, since we
have more control over pages, but still, even for the kernel shrinkers, we
don't provide all the information (only gfpmask, which, as I just saw in a
random user, drivers/gpu/drm/ttm, is sometimes not used).

So, answering your question: no, I don't know how to solve it for the
userland. But I also don't think it's a big concern (especially if we make
it cgroup-aware -- this would be cgroup's worry then, i.e. we might
isolate task to only some nodes/zones, if we really care about precise
accounting?). But I'm surely open for ideas. :)

Thanks!

Anton

Re: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus

2013-01-10 Thread Colin Cross

On Thu, Jan 10, 2013 at 5:39 PM, Liu, Chuansheng
 wrote:
>
>
>> -Original Message-
>> From: Colin Cross [mailto:ccr...@android.com]
>> Sent: Thursday, January 10, 2013 9:58 AM
>> To: linux-kernel@vger.kernel.org
>> Cc: Andrew Morton; Don Zickus; Ingo Molnar; Thomas Gleixner; Liu,
>> Chuansheng; linux-arm-ker...@lists.infradead.org; Colin Cross
>> Subject: [PATCH] hardlockup: detect hard lockups without NMIs using
>> secondary cpus
>>
>> Emulate NMIs on systems where they are not available by using timer
>> interrupts on other cpus.  Each cpu will use its softlockup hrtimer
>> to check that the next cpu is processing hrtimer interrupts by
>> verifying that a counter is increasing.
>>
>> This patch is useful on systems where the hardlockup detector is not
>> available due to a lack of NMIs, for example most ARM SoCs.
>> Without this patch any cpu stuck with interrupts disabled can
>> cause a hardware watchdog reset with no debugging information,
>> but with this patch the kernel can detect the lockup and panic,
>> which can result in useful debugging info.
>>
>> Signed-off-by: Colin Cross 
>> +static void watchdog_check_hardlockup_other_cpu(void)
>> +{
>> + int cpu;
>> + cpumask_t cpus = watchdog_cpus;
>> +
>> + /*
>> +  * Test for hardlockups every 3 samples.  The sample period is
>> +  *  watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over
>> +  *  watchdog_thresh (over by 20%).
>> +  */
>> + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0)
>> + return;
>> +
>> + /* check for a hardlockup on the next cpu */
>> + cpu = cpumask_next(smp_processor_id(), &cpus);
>> + if (cpu >= nr_cpu_ids)
>> + cpu = cpumask_first(&cpus);
>> + if (cpu == smp_processor_id())
>> + return;
>> +
>> + smp_rmb();
>> +
>> + if (per_cpu(watchdog_nmi_touch, cpu) == true) {
>> + per_cpu(watchdog_nmi_touch, cpu) = false;
>> + return;
>> + }
>> +
>> + if (is_hardlockup_other_cpu(cpu)) {
>> + /* only warn once */
> One possible false hardlockup case with a newly hotplugged CPU:
> 1/ Assume CPU1 and CPU2 are online, and CPU3 is being hotplugged:
> CPU3:                                     CPU2:
> watchdog_nmi_enable()
>   per_cpu(watchdog_nmi_touch, cpu) = true;
>   cpumask_set_cpu(cpu, &watchdog_cpus);
>                                           watchdog_check_hardlockup_other_cpu()
>                                             per_cpu(watchdog_nmi_touch, cpu) = false; ==> here cpu is CPU3
>
> 2/ Before CPU3's first hrtimer interrupt arrives, CPU2 is offlined.
>   Then CPU1's next CPU is CPU3, but CPU3's watchdog_nmi_touch can no longer
>   guard against a false CPU3 hardlockup. When CPU1's hrtimer interrupt
>   arrives, it may report a false hard lockup on CPU3.
>
> Is that the case?

Yes, this is the same race condition I pointed out in reply to Don
Zickus earlier in the thread.  I think the easiest solution is to set
per_cpu(watchdog_nmi_touch, next_cpu) = true during cpu offlining.
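
To make the race and the proposed fix concrete, here is a small
user-space model. It is a sketch only, not the patch itself: the
model_* names and MODEL_NR_CPUS are hypothetical, the online bitmask
stands in for watchdog_cpus, and the touch[] array stands in for the
per-cpu watchdog_nmi_touch flag. model_cpu_offline() applies the idea
above of touching the cpu that follows the one going offline.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8 /* hypothetical cpu count for this model */

struct model_watchdog {
	uint32_t online;           /* models watchdog_cpus */
	bool touch[MODEL_NR_CPUS]; /* models per-cpu watchdog_nmi_touch */
};

/* Next online cpu after `cpu`, wrapping around; -1 if none is online. */
static int model_next_cpu(uint32_t mask, int cpu)
{
	for (int i = 1; i <= MODEL_NR_CPUS; i++) {
		int cand = (cpu + i) % MODEL_NR_CPUS;

		if (mask & (UINT32_C(1) << cand))
			return cand;
	}
	return -1;
}

/* A cpu coming online touches itself before joining the mask. */
static void model_cpu_online(struct model_watchdog *w, int cpu)
{
	w->touch[cpu] = true;
	w->online |= UINT32_C(1) << cpu;
}

/* Proposed fix: touch the following cpu, whose checker is about to change. */
static void model_cpu_offline(struct model_watchdog *w, int cpu)
{
	int next = model_next_cpu(w->online, cpu);

	w->online &= ~(UINT32_C(1) << cpu);
	if (next >= 0 && next != cpu)
		w->touch[next] = true;
}

/*
 * Returns true if `self` would go on to compare its neighbour's interrupt
 * counter, i.e. the point at which a lockup could be reported.
 */
static bool model_would_check(struct model_watchdog *w, int self)
{
	int next = model_next_cpu(w->online, self);

	if (next < 0 || next == self)
		return false;
	if (w->touch[next]) {
		w->touch[next] = false; /* consume the touch, skip this sample */
		return false;
	}
	return true;
}
```

Replaying the sequence from the report (CPU3 comes online, CPU2 consumes
CPU3's touch, CPU2 goes offline, CPU1 inherits CPU3) shows CPU1 skipping
its first sample of CPU3 instead of comparing a counter that has not had
a chance to advance.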

[PATCH] rtlwifi/rtl8188ee: Add firmware for new driver

2013-01-10 Thread Larry Finger

Signed-off-by: Larry Finger 
---
 WHENCE |8 
 rtlwifi/rtl8188efw.bin |  Bin 0 -> 11216 bytes
 2 files changed, 8 insertions(+)
 create mode 100644 rtlwifi/rtl8188efw.bin

diff --git a/WHENCE b/WHENCE
index c445856..f84bec0 100644
--- a/WHENCE
+++ b/WHENCE
@@ -1810,6 +1810,14 @@ File: rtlwifi/rtl8723fw_B.bin
 Licence: Redistributable. See LICENCE.rtlwifi_firmware.txt for details.
 
 --
+Driver: rtl8188ee - Realtek 802.11n WLAN driver for RTL8188EE
+
+Info: Taken from Realtek version rtl_92ce_92se_92de_8723ae_88ee_linux_mac80211_0010.0109.2013
+File: rtlwifi/rtl8188efw.bin
+
+Licence: Redistributable. See LICENCE.rtlwifi_firmware.txt for details.
+
+--
 
 Driver: rtk_btusb - Realtek Bluetooth driver
 
diff --git a/rtlwifi/rtl8188efw.bin b/rtlwifi/rtl8188efw.bin
new file mode 100644
index 
..ac9a430a8350ac533153068a8c2eb24e2e76dc65
GIT binary patch
literal 11216
[base85-encoded firmware payload not reproduced here]

Re: kernel BUG at kernel/sched_rt.c:493!

2013-01-10 Thread Mike Galbraith

On Thu, 2013-01-10 at 13:58 -0600, Shawn Bohrer wrote:

> Here is the output:
> 
> [   81.278842] SysRq : Changing Loglevel
> [   81.279027] Loglevel set to 9
> [   83.285456] Initial want: 5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 9
> [   85.286452] Initial want: 5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 9
> [   85.289625] Initial want: 5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 9
> [   87.287435] Initial want: 1 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 85000
> [   87.290718] Initial want: 5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 9
> [   89.288469] Initial want: -5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 10
> [   89.291550] Initial want: 15000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 8
> [   89.292940] Initial want: 1 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 85000
> [   89.294082] Initial want: 1 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 85000
> [   89.295194] Initial want: 5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 9
> [   89.296274] Initial want: 5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 9
> [   90.959004] [sched_delayed] sched: RT throttling activated
> [   91.289470] Initial want: 2 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 75000
> [   91.292767] Initial want: 2 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 75000
> [   91.294037] Initial want: 2 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 75000
> [   91.295364] Initial want: 2 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 75000
> [   91.296355] BUG triggered, want: 2
> [   91.296355] 
> [   91.296355] rt_rq[7]:
> [   91.296355]   .rt_nr_running : 0
> [   91.296355]   .rt_throttled  : 0
> [   91.296355]   .rt_time   : 0.00
> [   91.296355]   .rt_runtime: 750.00
> [   91.307332] Initial want: -5000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 10
> [   91.308440] Initial want: -1 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 105000
> [   91.309586] Initial want: -15000 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 11
> [   91.310716] Initial want: -2 rt_b->rt_runtime: 95000 
> rt_rq->rt_runtime: 115000
> [   91.311707] BUG triggered, want: -2
> [   91.311707] 
> [   91.311707] rt_rq[6]:
> [   91.311707]   .rt_nr_running : 1
> [   91.311707]   .rt_throttled  : 0
> [   91.311707]   .rt_time   : 307.209987
> [   91.311707]   .rt_runtime: 1150.00

That makes about as much sense as my crash did.  There is no leak, but
cpu found nada.  So rd/span is changing on us?  We saw nada as we
traversed, release locks, poof, it's there.  In my dump, at crash time,
rd/span was fine, but at the top of the stack, I found a mighty
suspicious reference to def_root_domain.span.  At _crash_ time, both it
and rq->rd.span had pilfer-able goodies, but whatever it was that we
traversed had nada just a wee bit earlier.

If the darn thing would trigger for me again, I'd use trace_printk(),
check rd and cpumask_weight(rd->span) before/after traverse, print cpu
and rt_runtime as we traverse, set /proc/sys/kernel/ftrace_dump_on_oops
to 1.. and hope bug doesn't like to play heisenbug games.

-Mike 




Re: [PATCH] ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver

2013-01-10 Thread David Miller

From: Ming Lei 
Date: Fri, 11 Jan 2013 10:45:38 +0800

> Cc netdev and usb lists.

That doesn't work for patches, sorry.  It will have to be submitted
freshly and cleanly to the appropriate lists, not as a quoted reply.

Re: [PATCH 3.8-rc] tuntap: refuse to re-attach to different tun_struct

2013-01-10 Thread David Miller

From: Jason Wang 
Date: Fri, 11 Jan 2013 09:29:20 +0800

> On 01/11/2013 06:39 AM, David Miller wrote:
>> From: Stefan Hajnoczi 
>> Date: Thu, 10 Jan 2013 08:59:48 +0100
>>
>>> Multiqueue tun devices support detaching a tun_file from its tun_struct
>>> and re-attaching at a later point in time.  This allows users to disable
>>> a specific queue temporarily.
>>>
>>> ioctl(TUNSETIFF) allows the user to specify the network interface to
>>> attach by name.  This means the user can attempt to attach to interface
>>> "B" after detaching from interface "A".
>>>
>>> The driver is not designed to support this so check we are re-attaching
>>> to the right tun_struct.  Failure to do so may lead to oops.
>>>
>>> Signed-off-by: Stefan Hajnoczi 
>> Applied.
> Hi David:
> 
> Any chance that I can respin this patch? There's still a bug
> after this patch. Or should I just send a patch on top?

If I've applied it, there is no reverting.

Re: [PATCH 1/2] Add mempressure cgroup

2013-01-10 Thread Minchan Kim

On Wed, Jan 09, 2013 at 02:14:49PM -0800, Anton Vorontsov wrote:
> On Tue, Jan 08, 2013 at 05:49:49PM +0900, Minchan Kim wrote:
> [...]
> > Sorry, I still haven't looked at the cgroup part of your implementation,
> > but I have had a question for a long time.
> > 
> > How can we avoid false positives with zones and NUMA?
> > I mean, when the DMA zone is short in the system, the VM notifies the
> > user and the user frees all memory of the NORMAL zone because he can't
> > know which pages live in which zones. NUMA is ditto.
> 
> Um, we count scans irrespective of zones or nodes, i.e. we sum all 'number
> of scanned' and 'number of reclaimed' stats. So, it should not be a
> problem, as I see it.

Why is it no problem? For example, let's think of normal zone reclaim.
The page allocator tries to allocate pages from the NORMAL zone, falling
back to the DMA zone, and your logic could trigger mpc_shrinker. So
processes A, B, C start to release their freeable memory, but
unfortunately the freed pages are all HIGHMEM pages. Why should processes
release memory unnecessarily? Is there any way for a process to detect
such a situation at user level before releasing the freeable memory?

On Android smartphones, until now, there was only one zone (DMA), so the
low memory killer didn't have a problem, but these days smartphones use
2G of DRAM, so we have started seeing the above problem. Your generic
approach should solve that problem, too.
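
A small user-space model of the concern, for illustration only. It
assumes the pressure figure is derived from the ratio of reclaimed to
scanned pages summed over all zones; the quoted reply only says the
counters are summed, so the exact formula is an assumption, and
struct model_zone_scan and model_summed_pressure_pct() are hypothetical
names.

```c
#include <stddef.h>

/* Hypothetical per-zone reclaim counters for one reclaim pass. */
struct model_zone_scan {
	const char *name;
	unsigned long scanned;
	unsigned long reclaimed;
};

/*
 * Percentage of scanned pages that could not be reclaimed, with the
 * counters summed irrespective of zone. Returns 0 if nothing was scanned.
 */
static unsigned int model_summed_pressure_pct(const struct model_zone_scan *z,
					      size_t n)
{
	unsigned long scanned = 0, reclaimed = 0;

	for (size_t i = 0; i < n; i++) {
		scanned += z[i].scanned;
		reclaimed += z[i].reclaimed;
	}
	if (scanned == 0)
		return 0;
	if (reclaimed > scanned)
		reclaimed = scanned;
	return (unsigned int)(100 - reclaimed * 100 / scanned);
}
```

If only the DMA zone is scanned (say 1000 pages scanned, 100 reclaimed)
while NORMAL and HIGHMEM are untouched, the summed figure is 90, and
nothing in that number tells a listener which zone is short. A process
that reacts by freeing HIGHMEM-backed memory does not relieve the zone
that caused the notification.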

> 
> Thanks,
> Anton
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org

-- 
Kind regards,
Minchan Kim

Re: [PATCH v3 07/22] sched: set initial load avg of new forked task

2013-01-10 Thread Preeti U Murthy

On 01/05/2013 02:07 PM, Alex Shi wrote:
> A new task has no runnable sum when it first becomes runnable, which
> makes burst forking select just a few idle cpus to put tasks on.
> Set the initial load avg of a newly forked task to its load weight to
> resolve this issue.
> 
> Signed-off-by: Alex Shi 
> ---
>  include/linux/sched.h |  1 +
>  kernel/sched/core.c   |  2 +-
>  kernel/sched/fair.c   | 11 +--
>  3 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 206bb08..fb7aab5 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1069,6 +1069,7 @@ struct sched_domain;
>  #else
>  #define ENQUEUE_WAKING   0
>  #endif
> +#define ENQUEUE_NEWTASK  8
> 
>  #define DEQUEUE_SLEEP1
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 66c1718..66ce1f1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1705,7 +1705,7 @@ void wake_up_new_task(struct task_struct *p)
>  #endif
> 
>   rq = __task_rq_lock(p);
> - activate_task(rq, p, 0);
> + activate_task(rq, p, ENQUEUE_NEWTASK);
>   p->on_rq = 1;
>   trace_sched_wakeup_new(p, true);
>   check_preempt_curr(rq, p, WF_FORK);
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 895a3f4..5c545e4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1503,8 +1503,9 @@ static inline void update_rq_runnable_avg(struct rq 
> *rq, int runnable)
>  /* Add the load generated by se into cfs_rq's child load-average */
>  static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
> struct sched_entity *se,
> -   int wakeup)
> +   int flags)
>  {
> + int wakeup = flags & ENQUEUE_WAKEUP;
>   /*
>* We track migrations using entity decay_count <= 0, on a wake-up
>* migration we use a negative decay count to track the remote decays
> @@ -1538,6 +1539,12 @@ static inline void enqueue_entity_load_avg(struct 
> cfs_rq *cfs_rq,
>   update_entity_load_avg(se, 0);
>   }
> 
> + /*
> +  * set the initial load avg of new task same as its load
> +  * in order to avoid brust fork make few cpu too heavier
> +  */
> + if (flags & ENQUEUE_NEWTASK)
> + se->avg.load_avg_contrib = se->load.weight;
>   cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
>   /* we force update consideration on load-balancer moves */
>   update_cfs_rq_blocked_load(cfs_rq, !wakeup);
> @@ -1701,7 +1708,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct 
> sched_entity *se, int flags)
>* Update run-time statistics of the 'current'.
>*/
>   update_curr(cfs_rq);
> - enqueue_entity_load_avg(cfs_rq, se, flags & ENQUEUE_WAKEUP);
> + enqueue_entity_load_avg(cfs_rq, se, flags);
>   account_entity_enqueue(cfs_rq, se);
>   update_cfs_shares(cfs_rq);
> 
I had seen in my experiments that forked tasks with an initial load of
0 adversely affect the runqueue lengths. Since the load of these tasks
takes some time to pick up, the cpus on which the forked tasks are
scheduled can be chosen as "dst_cpu" many times, and the runqueue
lengths increase considerably.

This patch solves this issue by making the forked tasks contribute
actively to the runqueue load.
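
The effect can be shown with a toy user-space model. This is a sketch
under simplifying assumptions, not the scheduler's logic: placement is
reduced to "pick the cpu with the lowest tracked load", the model_*
names are hypothetical, and MODEL_NICE_0_WEIGHT uses 1024 as the usual
nice-0 load weight.

```c
#include <stddef.h>

#define MODEL_NICE_0_WEIGHT 1024UL /* usual load weight of a nice-0 task */

/* Pick the cpu with the lowest tracked load (lowest index wins ties). */
static size_t model_idlest_cpu(const unsigned long *load, size_t ncpus)
{
	size_t best = 0;

	for (size_t i = 1; i < ncpus; i++)
		if (load[i] < load[best])
			best = i;
	return best;
}

/*
 * Fork `ntasks` tasks back to back. Each new task adds `initial_contrib`
 * to the load of the cpu it lands on. Returns the longest runqueue.
 */
static unsigned int model_fork_burst(unsigned long *load,
				     unsigned int *nr_running, size_t ncpus,
				     unsigned int ntasks,
				     unsigned long initial_contrib)
{
	unsigned int longest = 0;

	for (unsigned int t = 0; t < ntasks; t++) {
		size_t cpu = model_idlest_cpu(load, ncpus);

		load[cpu] += initial_contrib;
		nr_running[cpu]++;
		if (nr_running[cpu] > longest)
			longest = nr_running[cpu];
	}
	return longest;
}
```

On 4 idle cpus, a burst of 8 forks with initial_contrib = 0 puts all 8
tasks on cpu 0 in this model, because the tracked load never moves. With
initial_contrib = MODEL_NICE_0_WEIGHT the same burst ends with 2 tasks
per cpu.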

Reviewed-by: Preeti U Murthy


Re: [PATCH v3 04/22] sched: don't need go to smaller sched domain

2013-01-10 Thread Preeti U Murthy

On 01/05/2013 02:07 PM, Alex Shi wrote:
> If the parent sched domain has no cpu allowed for the task, neither
> will its child. So bail out to save useless checking.
> 
> Signed-off-by: Alex Shi 
> ---
>  kernel/sched/fair.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3c7b09a..ecfbf8e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3378,10 +3378,8 @@ select_task_rq_fair(struct task_struct *p, int 
> sd_flag, int wake_flags)
>   load_idx = sd->wake_idx;
> 
>   group = find_idlest_group(sd, p, cpu, load_idx);
> - if (!group) {
> - sd = sd->child;
> - continue;
> - }
> + if (!group)
> + goto unlock;
> 
>   new_cpu = find_idlest_cpu(group, p, cpu);
> 
Reviewed-by: Preeti U Murthy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/7] clk: add common of_clk_init() function

2013-01-10 Thread Prashant Gaikwad


On Friday 11 January 2013 01:23 AM, Josh Cartwright wrote:

> On Fri, Jan 04, 2013 at 12:30:52PM +0530, Prashant Gaikwad wrote:
> > Modify of_clk_init function so that it will determine which
> > driver to initialize based on device tree instead of each driver
> > registering to it.
> > 
> > Based on a similar patch for drivers/irqchip by Thomas Petazzoni and
> > drivers/clocksource by Stephen Warren.
> > 
> > Signed-off-by: Prashant Gaikwad 
> > ---
> 
> Prashant-
> 
> Sorry for the late response, but I finally got a chance to give this
> patchset a spin on Zynq.  For patches 1 and 6:
> 
> Reviewed-by: Josh Cartwright 
> Tested-by: Josh Cartwright 
> 
>    Josh

Thanks Josh!!



Re: [PATCH v3 03/22] sched: fix find_idlest_group mess logical

2013-01-10 Thread Preeti U Murthy

On 01/05/2013 02:07 PM, Alex Shi wrote:
> There are 4 situations in the function:
> 1, no group is allowed for the task;
>   so min_load = ULONG_MAX, this_load = 0, idlest = NULL
> 2, only the local group is allowed for the task;
>   so min_load = ULONG_MAX, this_load assigned, idlest = NULL
> 3, only a non-local group is allowed for the task;
>   so min_load assigned, this_load = 0, idlest != NULL
> 4, the local group + another group are allowed for the task.
>   so min_load assigned, this_load assigned, idlest != NULL
> 
> The current logic returns NULL in the first 3 scenarios.
> And it still returns NULL if the idlest group is heavier than the
> local group in the 4th situation.
> 
> Actually, I think the groups in situations 2 and 3 are also eligible to
> host the task. And in the 4th situation, I agree to bias toward the
> local group. Hence this patch.
> 
> Signed-off-by: Alex Shi 
> ---
>  kernel/sched/fair.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6d3a95d..3c7b09a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3181,6 +3181,7 @@ find_idlest_group(struct sched_domain *sd, struct 
> task_struct *p,
> int this_cpu, int load_idx)
>  {
>   struct sched_group *idlest = NULL, *group = sd->groups;
> + struct sched_group *this_group = NULL;
>   unsigned long min_load = ULONG_MAX, this_load = 0;
>   int imbalance = 100 + (sd->imbalance_pct-100)/2;
> 
> @@ -3215,14 +3216,19 @@ find_idlest_group(struct sched_domain *sd, struct 
> task_struct *p,
> 
>   if (local_group) {
>   this_load = avg_load;
> - } else if (avg_load < min_load) {
> + this_group = group;
> + }
> + if (avg_load < min_load) {
>   min_load = avg_load;
>   idlest = group;
>   }
>   } while (group = group->next, group != sd->groups);
> 
> - if (!idlest || 100*this_load < imbalance*min_load)
> - return NULL;
> + if (this_group && idlest != this_group)
> + /* Bias toward our group again */
> + if (100*this_load < imbalance*min_load)
> + idlest = this_group;
> +
>   return idlest;
>  }
> 
Reviewed-by: Preeti U Murthy
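
For readers following the four situations in the changelog, the decision the
patched function makes can be modelled in standalone C. This is a simplified
sketch, not kernel code: `pick_group`, `struct group_load` and the plain-array
representation of groups are hypothetical names made up for illustration. Only
the comparison `100*this_load < imbalance*min_load` and the bias toward the
local group are taken from the diff above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_group: a load figure and two flags. */
struct group_load {
	unsigned long avg_load;
	int allowed;	/* the task may run on some cpu of this group */
	int local;	/* the group contains the current cpu */
};

/*
 * Returns the index of the chosen group, or -1 if no group is allowed.
 * imbalance_pct plays the role of sd->imbalance_pct (for example 125).
 */
static int pick_group(const struct group_load *g, size_t n, int imbalance_pct)
{
	int idlest = -1, this_group = -1;
	unsigned long min_load = ULONG_MAX, this_load = 0;
	int imbalance = 100 + (imbalance_pct - 100) / 2;
	size_t i;

	assert(g != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!g[i].allowed)
			continue;	/* situation 1 if this happens for all groups */
		if (g[i].local) {
			this_load = g[i].avg_load;
			this_group = (int)i;
		}
		if (g[i].avg_load < min_load) {
			min_load = g[i].avg_load;
			idlest = (int)i;
		}
	}

	/* Situation 4: prefer the local group unless another is clearly idler. */
	if (this_group >= 0 && idlest != this_group &&
	    100 * this_load < (unsigned long)imbalance * min_load)
		idlest = this_group;

	return idlest;
}
```

With imbalance_pct = 125 the threshold is 112, so a local load of 100 against
a remote load of 95 still picks the local group, while 200 against 100 picks
the remote one.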


Re: [PATCH v3 02/22] sched: select_task_rq_fair clean up

2013-01-10 Thread Preeti U Murthy

On 01/05/2013 02:07 PM, Alex Shi wrote:
> It is impossible to miss a task-allowed cpu in an eligible group.
> 
> And since find_idlest_group only returns a different group, which
> excludes the old cpu, it's also impossible to find a new cpu that is
> the same as the old cpu.
> 
> Signed-off-by: Alex Shi 
> ---
>  kernel/sched/fair.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5eea870..6d3a95d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3378,11 +3378,6 @@ select_task_rq_fair(struct task_struct *p, int 
> sd_flag, int wake_flags)
>   }
> 
>   new_cpu = find_idlest_cpu(group, p, cpu);
> - if (new_cpu == -1 || new_cpu == cpu) {
> - /* Now try balancing at a lower domain level of cpu */
> - sd = sd->child;
> - continue;
> - }
> 
>   /* Now try balancing at a lower domain level of new_cpu */
>   cpu = new_cpu;
> 
Reviewed-by: Preeti U Murthy


Re: [PATCH v3 05/22] sched: remove domain iterations in fork/exec/wake

2013-01-10 Thread Preeti U Murthy

Hi Morten, Alex

On 01/09/2013 11:51 PM, Morten Rasmussen wrote:
> On Sat, Jan 05, 2013 at 08:37:34AM +, Alex Shi wrote:
>> I guess the search for a cpu from the bottom up in the domain tree comes
>> from commit 3dbd5342074a1e ("sched: multilevel sbe sbf"); the purpose is
>> balancing tasks over domains at all levels.
>>
>> This balancing costs a lot if there are many domains/groups in a large
>> system. And forcibly spreading tasks among different domains may cause
>> performance issues due to bad locality.
>>
>> If we remove this code, we get quicker fork/exec/wake, plus better
>> balancing across the whole system, which also reduces migrations in
>> future load balancing.
>>
>> This patch improves hackbench performance by 10+% on my 4-socket
>> NHM and SNB machines.
>>
>> Signed-off-by: Alex Shi 
>> ---
>>  kernel/sched/fair.c | 20 +---
>>  1 file changed, 1 insertion(+), 19 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index ecfbf8e..895a3f4 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -3364,15 +3364,9 @@ select_task_rq_fair(struct task_struct *p, int 
>> sd_flag, int wake_flags)
>>  goto unlock;
>>  }
>>  
>> -while (sd) {
>> +if (sd) {
>>  int load_idx = sd->forkexec_idx;
>>  struct sched_group *group;
>> -int weight;
>> -
>> -if (!(sd->flags & sd_flag)) {
>> -sd = sd->child;
>> -continue;
>> -}
>>  
>>  if (sd_flag & SD_BALANCE_WAKE)
>>  load_idx = sd->wake_idx;
>> @@ -3382,18 +3376,6 @@ select_task_rq_fair(struct task_struct *p, int 
>> sd_flag, int wake_flags)
>>  goto unlock;
>>  
>>  new_cpu = find_idlest_cpu(group, p, cpu);
>> -
>> -/* Now try balancing at a lower domain level of new_cpu */
>> -cpu = new_cpu;
>> -weight = sd->span_weight;
>> -sd = NULL;
>> -for_each_domain(cpu, tmp) {
>> -if (weight <= tmp->span_weight)
>> -break;
>> -if (tmp->flags & sd_flag)
>> -sd = tmp;
>> -}
>> -/* while loop will break here if sd == NULL */
> 
> I agree that this should be a major optimization. I just can't figure
> out why the existing recursive search for an idle cpu switches to the
> new cpu near the end and then starts a search for an idle cpu in the new
> cpu's domain. Is this to handle some exotic sched domain configurations?
> If so, they probably wouldn't work with your optimizations.

Let me explain my understanding of why the recursive search is the way
it is.

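 _____________________sd0______________________
|                                              |
|    ___sd1___              ___sd2___          |
|   |         |            |         |         |
|   |   sgx   |            |   sga   |         |
|   |   sgy   |            |   sgb   |         |
|   |_________|            |_________|         |
|______________________________________________|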

Here is what the current recursive search does, assuming we start with
sd0, the top-level sched domain whose flags are set appropriately: we find
that sd1 is the idlest group, and that cpux1 in sgx is the idlest cpu.

We could ideally have stopped the search here. But the problem with this
is that sgx may be more loaded than sgy, meaning the cpus in sgx are
heavily imbalanced. Say there are two cpus, cpux1 and cpux2, in sgx, where
cpux2 is heavily loaded and cpux1 has recently become idle, and load
balancing has not come to its rescue yet. According to the search above,
cpux1 is idle, but it is *not the right candidate for scheduling the
forked task; it is the right candidate for relieving the load from
cpux2*, due to cache locality etc.

Therefore, in the next recursive search we go one step inside sd1, the
chosen idlest group candidate, which also happens to be the *next level
sched domain for cpux1, the chosen idle cpu*. It then perhaps returns sgy
as the idlest group, if the situation there is better than what I have
described for sgx, and an appropriate cpu there is chosen.

So in short, a bird's-eye view of a large sched domain to choose the cpu
would be very short-sighted; we could end up creating imbalances within
lower-level sched domains. To avoid this, the recursive search plays safe
and chooses the best idle group after viewing the large sched domain in
detail.

Therefore even I feel that this patch should be implemented only after
thorough testing.

> Morten

Regards
Preeti U Murthy
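
The scenario described above can be modelled with a toy example. This is only
a sketch under simplifying assumptions, not the kernel's algorithm: cpus are
array indices, a group is a contiguous range of cpus (`struct span`), there is
no bias toward the local group, and all names (`pick_flat`, `pick_descend` and
the helpers) are made up for illustration. `pick_flat` stops after the
top-level choice; `pick_descend` looks one level further down before choosing
a cpu.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUB 8	/* arbitrary limit on sub-groups for this sketch */

/* Hypothetical group: the cpus [first, first + count). */
struct span {
	size_t first, count;
};

static unsigned long span_load(const unsigned long *load, struct span s)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < s.count; i++)
		sum += load[s.first + i];
	return sum;
}

/* Index of the span with the lowest average load (cross-multiplied). */
static size_t idlest_span(const unsigned long *load, const struct span *s,
			  size_t n)
{
	size_t best = 0, i;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (span_load(load, s[i]) * s[best].count <
		    span_load(load, s[best]) * s[i].count)
			best = i;
	return best;
}

static size_t idlest_cpu(const unsigned long *load, struct span s)
{
	size_t best = s.first, i;

	for (i = 1; i < s.count; i++)
		if (load[s.first + i] < load[best])
			best = s.first + i;
	return best;
}

/* One-shot search: choose a top-level group, then its idlest cpu. */
static size_t pick_flat(const unsigned long *load, const struct span *top,
			size_t ntop)
{
	return idlest_cpu(load, top[idlest_span(load, top, ntop)]);
}

/* Descend once: top-level group, then a sub-group inside it, then a cpu. */
static size_t pick_descend(const unsigned long *load, const struct span *top,
			   size_t ntop, size_t nsub)
{
	struct span t = top[idlest_span(load, top, ntop)];
	struct span sub[MAX_SUB];
	size_t per, i;

	if (nsub == 0 || nsub > MAX_SUB || t.count % nsub != 0)
		return idlest_cpu(load, t);	/* cannot split evenly */

	per = t.count / nsub;
	for (i = 0; i < nsub; i++) {
		sub[i].first = t.first + i * per;
		sub[i].count = per;
	}
	return idlest_cpu(load, sub[idlest_span(load, sub, nsub)]);
}
```

With loads {0, 10} for cpux1 and cpux2, {2, 2} for the two cpus of sgy, and 5
on every cpu of sd2, pick_flat lands on cpux1, while pick_descend lands on the
first cpu of sgy.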


Re: [PATCH V4 0/4] input: keyboard: tegra: cleanups and DT supports

2013-01-10 Thread Laxman Dewangan


On Thursday 10 January 2013 11:14 PM, Dmitry Torokhov wrote:
> Hi Laxman,
>
> On Thu, Jan 10, 2013 at 12:01:23PM +0530, Laxman Dewangan wrote:
>> Hi Dmitry,
>>
>> On Monday 07 January 2013 10:22 PM, Stephen Warren wrote:
>>> On 01/06/2013 04:14 AM, Laxman Dewangan wrote:
>>>> This patch series:
>>>>   - fix build warning,
>>>>   - use devm_* for allocation,
>>>>   - make column/rows configuration through DT and
>>>>   - remove the rarely used  key mapping table.
>>>
>>> The series,
>>> Reviewed-by: Stephen Warren
>>
>> If you are fine with this series then can it be apply please? I can
>> handle if there is any comment on this series to close this.
>
> The patches are applied with minor edits to the 3rd and 4th (there was
> no point in having a separate keymap setup function anymore).



Thanks for taking care of this. I looked at the final changes in your git
tree and they are fine.

Thanks,
Laxman

linux-next: manual merge of the signal tree with the arm64 tree

2013-01-10 Thread Stephen Rothwell

Hi Al,

Today's linux-next merge of the signal tree got a conflict in
arch/arm64/kernel/signal32.c between commits 068f1bb36cf1 ("arm64:
compat: include sa_restorer in old action from rt_sigaction") and
efed4d52e39f ("arm64: compat: ensure access_ok checks are performed on
user structures") from the arm64 tree and various commits from the signal
tree.

The latter removes the code modified by the former, so I did that.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

Re: [PATCH 2/2] mm: forcely swapout when we are out of page cache

2013-01-10 Thread Minchan Kim

Hi Andrew,

On Thu, Jan 10, 2013 at 01:58:28PM -0800, Andrew Morton wrote:
> On Thu, 10 Jan 2013 11:23:06 +0900
> Minchan Kim  wrote:
> 
> > > I have a feeling that laptop mode has bitrotted and these patches are
> > > kinda hacking around as-yet-not-understood failures...
> > 
> > Absolutely, this patch is a last guard against unexpected behavior.
> > As I mentioned in the cover letter, Luigi's problem could be solved by
> > either [1/2] or [2/2], but I wanted to add this as a last resort in case
> > of unexpected emergency. But you're right. It's not good to hide the
> > problem like this path, so let's drop [2/2].
> > 
> > Also, I absolutely agree it has bitrotted, so to correct it we need a
> > volunteer who can investigate with long-running power-saving experiments.
> > So [1/2] would be a band-aid until then.
> 
> I'm inclined to hold off on 1/2 as well, really.

Then, what's your plan?

It's a real bug that has existed since f80c067 ("mm: zone_reclaim: make
isolate_lru_page() filter-aware") was introduced. Some portable devices
could use laptop_mode to save battery power. AFAIK, the use case was a
trial on ChromeOS, and Luigi reported this problem, although they decided
to disable laptop_mode for another reason: laptop_mode burns power for a
very long time in some of their workloads.

Another problem is that laptop_mode isn't aware of in-memory swap such as
zram, so it unconditionally prevents swapping out. :( Yes, that's another
story to be fixed.

If you hate this version, how about this one?
It does the following:

1. get_scan_count forces file-backed-only reclaim when may_writepage is
   false. This prevents unnecessary CPU consumption and LRU churning with
   anon pages.
2. If memory reclaim is struggling (i.e., priority is below
   DEF_PRIORITY - 2), may_writepage is set to true, but only in the direct
   reclaim path.

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 73b64a3..695b907 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -550,6 +550,8 @@ static inline int zone_is_oom_locked(const struct zone 
*zone)
  */
 #define DEF_PRIORITY 12
 
+#define HARD_TO_RECLAIM_PRIO (DEF_PRIORITY - 2)
+
 /* Maximum number of zones on a zonelist */
 #define MAX_ZONES_PER_ZONELIST (MAX_NUMNODES * MAX_NR_ZONES)
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ff869d2..4c63bda 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -814,7 +814,7 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
 */
if (page_is_file_cache(page) &&
(!current_is_kswapd() ||
-sc->priority >= DEF_PRIORITY - 2)) {
+sc->priority >= HARD_TO_RECLAIM_PRIO)) 
{
/*
 * Immediately reclaim when written back.
 * Similar in principal to deactivate_page()
@@ -1683,8 +1683,11 @@ static void get_scan_count(struct lruvec *lruvec, struct 
scan_control *sc,
if (!global_reclaim(sc))
force_scan = true;
 
-   /* If we have no swap space, do not bother scanning anon pages. */
-   if (!sc->may_swap || (nr_swap_pages <= 0)) {
+   /*
+* If we have no swap space or may_writepage is false,
+* do not bother scanning anon pages.
+*/
+   if (!sc->may_swap || !sc->may_writepage || (nr_swap_pages <= 0)) {
scan_balance = SCAN_FILE;
goto out;
}
@@ -1879,7 +1882,7 @@ static bool in_reclaim_compaction(struct scan_control *sc)
 {
if (IS_ENABLED(CONFIG_COMPACTION) && sc->order &&
(sc->order > PAGE_ALLOC_COSTLY_ORDER ||
-sc->priority < DEF_PRIORITY - 2))
+sc->priority < HARD_TO_RECLAIM_PRIO))
return true;
 
return false;
@@ -2215,9 +2218,16 @@ static unsigned long do_try_to_free_pages(struct 
zonelist *zonelist,
sc->may_writepage = 1;
}
 
+   /*
+* This is a safety belt to prevent OOM kill through reclaiming
+* pages with sacrificing the power.
+*/
+   if (sc->priority < HARD_TO_RECLAIM_PRIO)
+   sc->may_writepage = 1;
+
/* Take a nap, wait for some writeback to complete */
if (!sc->hibernation_mode && sc->nr_scanned &&
-   sc->priority < DEF_PRIORITY - 2) {
+   sc->priority < HARD_TO_RECLAIM_PRIO) {
struct zone *preferred_zone;
 
first_zones_zonelist(zonelist, gfp_zone(sc->gfp_mask),
@@ -2824,7 +2834,7 @@ loop_again:
 * OK, kswapd is getting into trouble.  Take a nap, then take
 * another pass across the zones.
 */
-   if (total_scanned && (sc.priority < DEF_PRIORITY - 2)) {
+   if (total_scanned &&

Re: [PATCH 12/14] ARM: tegra: tec: Add PCIe support

2013-01-10 Thread Thierry Reding

On Thu, Jan 10, 2013 at 05:22:30PM -0700, Stephen Warren wrote:
> On 01/09/2013 01:43 PM, Thierry Reding wrote:
> > Enable the first PCIe root port which is connected to an FPGA on the
> > Tamonten Evaluation Carrier and add device nodes for each of the PCI
> > endpoints available in the standard configuration.
> 
> > diff --git a/arch/arm/boot/dts/tegra20-tec.dts 
> > b/arch/arm/boot/dts/tegra20-tec.dts
> 
> > +   pcie-controller {
> > +   vdd-supply = <_vdd_reg>;
> > +   pex-clk-supply = <_clk_reg>;
> > +   status = "okay";
> 
> Sorry this is also really picky. I'd prefer properties that exist in
> /include/d files and are overridden here to appear first, followed by new
> properties. In other words, move the status property to be first. I
> believe/hope all the other (Tegra) .dts files follow this convention.

Okay, I'll fix that.

> > +   pci@1,0 {
> > +   bus-range = <0x01 0x0a>;
> > +   status = "okay";
> > +
> > +   pci@0,0 {
> > +   reg = <0x01 0 0 0 0>;
> 
> Hmm. The unit address in that node name doesn't match the address in the
> reg property, although I suppose there's nothing we can do about it
> since those formats are both defined by the standard PCI binding?

That's the standard encoding for unit addresses for PCI devices. The
first cell in the reg property encodes bus/device/function (amongst
other things) and the node name is supposed to be pci@,.

> What do the numbers "0,0" represent here; device/function? Is the same
> true for the "0,0" in the child nodes?

Yes, exactly.

> > +   bus-range = <0x02 0x0a>;
> > +
> > +   compatible = "plda,pcie";
> 
> Are there DT binding documents for all these devices; plda,pcie,
> ad,pcie, ad,pcie-test, etc.?

No. To be honest I don't quite know how to handle this. For the PLDA
things aren't so bad since it has a proper PCI ID, but the other cores
don't since they are custom IP or taken from opencores.org and made
available via PCIe. We're still in the process of obtaining our own PCI
vendor ID so that these can be properly assigned.

Also, as you have guessed, most of these are not required. I just wanted
to include them here for completeness (and maybe reference in case
somebody else, myself included, needs a working example to base other
work on).

> > +   pci@4,0 {
> > +   reg = <0x022000 0 0 0 0>;
> > +   bus-range = <0x07 0x07>;
> > +
> > +   compatible = "ad,pcie";
> > +   device_type = "pci";
> > +
> > +   #address-cells = <3>;
> > +   #size-cells = <2>;
> > +
> > +   pci@0,0 {
> > +   compatible = "opencores,uart";
> > +   reg = <0x07 0 0 0 0>;
> > +   };
> > +   };
> 
> Do you need to include a node for the UART; I can see you need to for
> the SPI/I2C controllers so you can instantiate the appropriate devices
> on non-probe-able buses, but I think you can just let regular PCI device
> probing find the UART, Ethernet MAC, etc., can't you?

Yes, that's correct. Only SPI and I2C actually require these nodes. I'm
not sure if the PCI binding requires all nodes to be present or not.
Other examples I've seen (e.g. arch/x86/platform/ce4100/falconfalls.dts)
contain nodes for everything, most of which don't seem to be necessary
for things to work.

One other thing that I've seen is the usage of the more standard pci*
values for the compatible property. None of them are very descriptive
which is why I used a vendor,device type of value instead.

Going over the PCI binding again, however, it looks like there's no
requirement to make the node name pci@dev,fn and pci can be anything.
Making it uart@0,0 and then adjusting the compatible value to be as the
binding requires could be an option. In that case I suppose even the
bindings documentation shouldn't be necessary.

That doesn't cover the nodes where non-standard properties are needed
(I2C and SPI), which do need binding documents. I wouldn't know how to
name them, though. I'm not sure going with the current convention would
be any good since it'll be hard to find the right document if you have
to look it up by matching vendor and device IDs or PCI class.

Thierry
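
To make the unit-address discussion above concrete, the first cell of a PCI
reg property can be decoded as follows. This is a minimal sketch, assuming
the usual phys.hi layout from the Open Firmware PCI bus binding (bus number
in bits 16-23, device in bits 11-15, function in bits 8-10); `struct pci_bdf`
and `decode_phys_hi` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the decoded bus/device/function triple. */
struct pci_bdf {
	unsigned int bus, dev, fn;
};

/* phys.hi layout: npt000ss bbbbbbbb dddddfff rrrrrrrr */
static struct pci_bdf decode_phys_hi(uint32_t phys_hi)
{
	struct pci_bdf a;

	a.bus = (phys_hi >> 16) & 0xff;
	a.dev = (phys_hi >> 11) & 0x1f;
	a.fn = (phys_hi >> 8) & 0x07;
	assert(a.dev < 32 && a.fn < 8);
	return a;
}
```

Applied to the `reg = <0x022000 0 0 0 0>` entry quoted above, this yields bus
2, device 4, function 0, which matches the `pci@4,0` node name.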


linux-next: manual merge of the samsung tree with the arm-soc tree

2013-01-10 Thread Stephen Rothwell

Hi Kukjin,

Today's linux-next merge of the samsung tree got conflicts in
many files with the arm-soc tree.

I just dropped the samsung tree for today.  Please have a look and try to
fix this mess up, thanks.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

linux-next: manual merge of the samsung tree with the arm-soc tree

2013-01-10 Thread Stephen Rothwell

Hi Kukjin,

Today's linux-next merge of the samsung tree got a conflict in
arch/arm/plat-samsung/time.c between various commits from the arm-soc
tree and commit 0e4a0a6e970e ("ARM: SAMSUNG: Remove unused
plat-samsung/time.c") from the samsung tree.

The latter removes the file, so I did that.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

[PATCH] powerpc: added DSCR support to ptrace

2013-01-10 Thread Alexey Kardashevskiy

The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allows some control over the prefetch
of data streams.

The kernel already supports a per-thread DSCR value, but there is also
a need for the ability to change it from an external process for
a specific pid.

The patch adds new register index PT_DSCR (index=44) which can be
set/get by:
  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

Signed-off-by: Alexey Kardashevskiy 
---
 arch/powerpc/include/asm/ptrace.h |3 +++
 arch/powerpc/kernel/ptrace.c  |   23 ++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/ptrace.h 
b/arch/powerpc/include/asm/ptrace.h
index 48223f9..85eefa8 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -261,6 +261,9 @@ static inline unsigned long 
regs_get_kernel_stack_nth(struct pt_regs *regs,
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
+#ifdef __powerpc64__
+#define PT_DSCR 44
+#endif
 #define PT_REGS_COUNT 44
 
 #define PT_FPR048  /* each FP reg occupies 2 slots in this space */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 05b7dd2..444f22a 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -182,6 +182,20 @@ static int set_user_msr(struct task_struct *task, unsigned 
long msr)
return 0;
 }
 
+#ifdef CONFIG_PPC64
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+   return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+   task->thread.dscr = dscr;
+   task->thread.dscr_inherit = 1;
+   return 0;
+}
+#endif
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -203,6 +217,10 @@ unsigned long ptrace_get_reg(struct task_struct *task, int 
regno)
if (regno == PT_MSR)
return get_user_msr(task);
 
+#ifdef CONFIG_PPC64
+   if (regno == PT_DSCR)
+   return get_user_dscr(task);
+#endif
if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
return ((unsigned long *)task->thread.regs)[regno];
 
@@ -221,7 +239,10 @@ int ptrace_put_reg(struct task_struct *task, int regno, 
unsigned long data)
return set_user_msr(task, data);
if (regno == PT_TRAP)
return set_user_trap(task, data);
-
+#ifdef CONFIG_PPC64
+   if (regno == PT_DSCR)
+   return set_user_dscr(task, data);
+#endif
if (regno <= PT_MAX_PUT_REG) {
((unsigned long *)task->thread.regs)[regno] = data;
return 0;
-- 
1.7.10.4
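
For illustration, the two ptrace() calls from the changelog could be wrapped
in small userspace helpers like the ones below. This is a sketch under
assumptions: it presumes a ppc64 kernel with this patch applied and a tracee
that is already attached and stopped; `dscr_user_offset`, `dscr_peek` and
`dscr_poke` are hypothetical helper names, and the fallback `#define PT_DSCR`
only exists so the sketch builds where the header does not provide it yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Index added by the patch; defined here only if the headers lack it. */
#ifndef PT_DSCR
#define PT_DSCR 44
#endif

/* PEEKUSER/POKEUSER take a byte offset; each slot is 8 bytes on ppc64. */
static long dscr_user_offset(void)
{
	return (long)PT_DSCR << 3;
}

/* Read the tracee's DSCR. Returns 0 on success, -1 with errno set. */
static int dscr_peek(pid_t pid, unsigned long *dscr)
{
	long val;

	assert(dscr != NULL);
	errno = 0;	/* -1 can be a valid value, so errno disambiguates */
	val = ptrace(PTRACE_PEEKUSER, pid, (void *)dscr_user_offset(), NULL);
	if (val == -1 && errno != 0)
		return -1;
	*dscr = (unsigned long)val;
	return 0;
}

/* Write the tracee's DSCR. Returns 0 on success, -1 with errno set. */
static int dscr_poke(pid_t pid, unsigned long dscr)
{
	long ret = ptrace(PTRACE_POKEUSER, pid, (void *)dscr_user_offset(),
			  (void *)dscr);

	return ret == -1 ? -1 : 0;
}
```

The byte offset works out to 44 << 3 = 352.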


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-10 Thread Eric W. Biederman

Konrad Rzeszutek Wilk  writes:

> On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
>> I think that the new kexec hypercall function should mimic the kexec
>> syscall. That means all arguments passed to the hypercall should have the
>> same types where possible; where that is not possible, conversion should
>> be very easy. Additionally, I think that one call of the new hypercall
>> load function should load all needed things in the right place and
>> return a relevant status. Last but not least, the new functionality should
>
> We are not restricted to just _one_ hypercall. And this loading
> thing could be similar to the microcode hypercall - which just points
> to a virtual address along with the length - and says 'load me'.
>
>> be available through /dev/xen/privcmd or directly from kernel without
>> bigger effort.
>
> Perhaps we should have a email thread on xen-devel where we hash out
> some ideas. Eric, would you be OK included on this - it would make
> sense for this mechanism to be as future-proof as possible - and I am not
> sure what your plans for kexec are in the future?

The basic kexec interface is:

load ranges of virtual addresses to physical addresses.
jump to the physical address with identity mapped page tables.

There are a few flags to allow for different usage scenarios like
kexec on panic vs normal kexec.

It is very very simple and very extensible.  All of the weird glue
happens in userspace.

Eric
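
As an illustration of "load ranges of virtual addresses to physical
addresses", a list of segments could be described and sanity-checked as
below. This is a hedged sketch, not the kernel's code: `struct seg` only
mirrors the general shape of `struct kexec_segment` (source buffer plus
destination range), `seg_ok` and `segs_ok` are hypothetical helpers, the 4096
byte page size is an assumption, and the checks shown (page-aligned
destination, source no larger than destination, no overlapping destinations)
are examples of the kind of validation a loader would want.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* One range to load: copy bufsz bytes from buf to physical address mem. */
struct seg {
	const void *buf;	/* source: virtual address in the caller */
	size_t bufsz;
	uintptr_t mem;		/* destination: physical address */
	size_t memsz;		/* destination size, >= bufsz */
};

static int seg_ok(const struct seg *s)
{
	assert(s != NULL);
	if (s->mem % SKETCH_PAGE_SIZE || s->memsz % SKETCH_PAGE_SIZE)
		return 0;	/* destination must be page aligned */
	if (s->bufsz > s->memsz)
		return 0;	/* source must fit into the destination */
	return 1;
}

/* Returns 1 if every segment is valid and no two destinations overlap. */
static int segs_ok(const struct seg *s, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		if (!seg_ok(&s[i]))
			return 0;
		for (j = 0; j < i; j++)
			if (s[i].mem < s[j].mem + s[j].memsz &&
			    s[j].mem < s[i].mem + s[i].memsz)
				return 0;
	}
	return 1;
}
```

The second half of the interface, the jump to an entry point with identity
mapped page tables, is deliberately left out of this sketch.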

Re: [PATCH 09/14] ARM: tegra: Move pmc.h to include/mach

2013-01-10 Thread Thierry Reding

On Thu, Jan 10, 2013 at 05:15:07PM -0700, Stephen Warren wrote:
> On 01/09/2013 01:43 PM, Thierry Reding wrote:
> > In preparation for moving the PCIe driver into the drivers/pci/host
> > directory, this header, which contains prototypes that are required by
> > the PCIe driver, needs to be moved to a globally visible location.
> > 
> > Signed-off-by: Thierry Reding 
> > ---
> > Note that eventually the code in pmc.c and powergate.c should probably
> > be split out into a separate driver. The PMC registers are also directly
> > accessed from tegra20_clocks.c and tegra30_clocks.c, so that it might be
> > required to provide that functionality through the new driver as well.
> > ---
> >  arch/arm/mach-tegra/common.c   |  2 +-
> >  arch/arm/mach-tegra/include/mach/pmc.h | 24 
> >  arch/arm/mach-tegra/pmc.h  | 24 
> 
> On IRC, we'd talked about putting the public functionality in
> include/linux/tegra-pmc.h so that we wouldn't add to include/mach, which
> would make it harder to make Tegra support ARM multi-platform. Perhaps
> that IRC discussion happened after you posted this series?

I'm not sure, it's equally possible that I just forgot. Will fix it up.

Thierry



Re: [PATCH 02/14] of/pci: Add of_pci_get_devfn() function

2013-01-10 Thread Thierry Reding

On Thu, Jan 10, 2013 at 05:09:43PM -0700, Stephen Warren wrote:
> On 01/09/2013 01:43 PM, Thierry Reding wrote:
> > This function can be used to parse the device and function number from a
> > standard 5-cell PCI resource. PCI_SLOT() and PCI_FUNC() can be used on
> > the returned value to obtain the device and function numbers respectively.
> 
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> 
> >  static inline int __of_pci_pci_compare(struct device_node *node,
> >unsigned int devfn)
> >  {
> > +   int err;
> >  
> > +   err = of_pci_get_devfn(node);
> > +   if (err < 0)
> > return 0;
> > +
> > +   return devfn == err;
> 
> I know this is really picky, but calling that "err" when it's hopefully
> not an error but rather a PCI device/function ID seems a little obscure.
> Perhaps node_devfn? I assume that fact that devfn is unsigned and err is
> signed won't be an issue.

Maybe renaming the devfn parameter to data and using devfn for the local
variable would be more obvious.

The signedness shouldn't be problematic. devfn is an 8-bit unsigned
integer and sign mismatch is excluded by the error return already.

Thierry
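
The signedness argument can be checked with a small standalone model. This is
a sketch, not the patch itself: `sketch_get_devfn` and `sketch_compare` are
hypothetical stand-ins for of_pci_get_devfn() and __of_pci_pci_compare(), the
reg property is passed as a plain array of cells, and the SKETCH_PCI_SLOT and
SKETCH_PCI_FUNC macros assume the usual devfn packing of five device bits
above three function bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define SKETCH_PCI_FUNC(devfn)	((devfn) & 0x07)

/* Returns the devfn (0..255) from a 5-cell reg, or a negative errno. */
static int sketch_get_devfn(const uint32_t *reg, size_t cells)
{
	if (reg == NULL || cells < 5)
		return -EINVAL;
	return (int)((reg[0] >> 8) & 0xff);
}

/* Returns 1 if the node's devfn equals the one searched for, else 0. */
static int sketch_compare(const uint32_t *reg, size_t cells,
			  unsigned int devfn)
{
	int node_devfn = sketch_get_devfn(reg, cells);

	if (node_devfn < 0)
		return 0;	/* errors never match, not even devfn 0 */

	/* node_devfn is known to be non-negative here, so the cast is safe. */
	assert(node_devfn <= 0xff);
	return devfn == (unsigned int)node_devfn;
}
```

Because the negative case returns early, the comparison only ever sees values
in the range 0..255, which is the point made above about the sign mismatch.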



Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-10 Thread Minchan Kim

Hi Luigi,

On Thu, Jan 10, 2013 at 03:24:21PM -0800, Luigi Semenzato wrote:
> For what it's worth, I tested this patch on my 3.4 kernel, and it works as
> advertised.  Here's my setup.
> 
> - 2 GB RAM
> - a 3 GB zram disk for swapping
> - start one "hog" process per second (each hog process mallocs and touches
> 200 MB of memory).
> - watch /proc/meminfo
> 
> 1. I verified that the problem still exists on my current 3.4 kernel.  With
> laptop_mode = 2, hog processes are oom-killed when about 1.8-1.9 (out of 3)
> GB of swap space are still left
> 
> 2. I double-checked that the problem does not exist with laptop_mode = 0:
> hog processes are oom-killed when swap space is exhausted (with good
> approximation).
> 
> 3. I added the two-line patch, put back laptop_mode = 2, and verified that
> hog processes are oom-killed when swap space is exhausted, same as case 2.
> 
> Let me know if I can run any more tests for you, and thanks for all the
> support so far!

Thanks very much! But it seems Andrew doesn't like this version.
I will discuss it more with him and come back to you with a confirmed version.

Thanks again!

FYI: after I resolve this issue, I will dive into the min_filelist_kbytes patch. :)
> 
> 
> 
> On Wed, Jan 9, 2013 at 6:03 PM, Minchan Kim  wrote:
> 
> > Hi Andrew,
> >
> > On Wed, Jan 09, 2013 at 04:18:54PM -0800, Andrew Morton wrote:
> > > On Wed,  9 Jan 2013 15:21:13 +0900
> > > Minchan Kim  wrote:
> > >
> > > > Recently, Luigi reported there are lots of free swap space when
> > > > OOM happens. It's easily reproduced on zram-over-swap, where
> > > > many instance of memory hogs are running and laptop_mode is enabled.
> > > >
> > > > Luigi reported there was no problem when he disabled laptop_mode.
> > > > The problem when I investigate problem is following as.
> > > >
> > > > try_to_free_pages disable may_writepage if laptop_mode is enabled.
> > > > shrink_page_list adds lots of anon pages in swap cache by
> > > > add_to_swap, which makes pages Dirty and rotate them to head of
> > > > inactive LRU without pageout. If it is repeated, inactive anon LRU
> > > > is full of Dirty and SwapCache pages.
> > > >
> > > > In case of that, isolate_lru_pages fails because it try to isolate
> > > > clean page due to may_writepage == 0.
> > > >
> > > > The may_writepage could be 1 only if total_scanned is higher than
> > > > writeback_threshold in do_try_to_free_pages but unfortunately,
> > > > VM can't isolate anon pages from inactive anon lru list by
> > > > above reason and we already reclaimed all file-backed pages.
> > > > So it ends up OOM killing.
> > > >
> > > > This patch prevents to add a page to swap cache unnecessary when
> > > > may_writepage is unset so anonymous lru list isn't full of
> > > > Dirty/Swapcache page. So VM can isolate pages from anon lru list,
> > > > which ends up setting may_writepage to 1 and could swap out
> > > > anon lru pages. When OOM triggers, I confirmed swap space was full.
> > > >
> > > > ...
> > > >
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct
> > list_head *page_list,
> > > > if (PageAnon(page) && !PageSwapCache(page)) {
> > > > if (!(sc->gfp_mask & __GFP_IO))
> > > > goto keep_locked;
> > > > +   if (!sc->may_writepage)
> > > > +   goto keep_locked;
> > > > if (!add_to_swap(page))
> > > > goto activate_locked;
> > > > may_enter_fs = 1;
> > >
> > > I'm not really getting it, and the description is rather hard to follow
> > :(
> >
> > It seems I don't have a talent about description. :(
> > I hope it would be better this year. :)
> >
> > >
> > > We should be adding anon pages to swapcache even when laptop_mode is
> > > set.  And we should be writing them to swap as well, then reclaiming
> > > them.  The only thing laptop_mode should do is make the disk spin up
> > > less frequently - that doesn't mean "not at all"!
> >
> > So it seems your rationale is that let's save power in only system has
> > enough memory so let's remove may_writepage in reclaim path?
> >
> > If it is, I love it because I didn't see any number about power saving
> > through reclaiming throttling(But surely there was reason to add it)
> > and not sure it works well during long time because we have tweaked
> > reclaim part too many.
> >
> > >
> > > So something seems screwed up here and the patch looks like a
> > > heavy-handed workaround.  Why aren't these anon pages getting written
> > > out in laptop_mode?
> >
> > Don't know. It was there long time and I don't want to screw it up.
> > If we decide paging out in reclaim path regardless of laptop_mode,
> > it makes the problem easy without ugly workaround.
> >
> > Remove may_writepage? If it's too agressive, we can remove it in only
> > direct reclaim path.
> >
> > >
> > >

Re: [PATCH 01/14] of/pci: Provide support for parsing PCI DT ranges property

2013-01-10 Thread Thierry Reding

On Thu, Jan 10, 2013 at 05:06:48PM -0700, Stephen Warren wrote:
> On 01/09/2013 01:43 PM, Thierry Reding wrote:
> > From: Andrew Murray 
> > 
> > > DT bindings for PCI host bridges often use the ranges property to describe
> > > memory and IO ranges - this binding tends to be the same across architectures
> > > yet several parsing implementations exist, e.g. arch/mips/pci/pci.c,
> > > arch/powerpc/kernel/pci-common.c, arch/sparc/kernel/pci.c and
> > > arch/microblaze/pci/pci-common.c (clone of PPC). Some of these duplicate
> > > functionality provided by drivers/of/address.c.
> > > 
> > > This patch provides a common iterator-based parser for the ranges property, it
> > > is hoped this will reduce DT representation differences between architectures
> > > and that architectures will migrate in part to this new parser.
> ...
> 
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> 
> > +const __be32 *of_pci_process_ranges(struct device_node *node,
> 
> > +   while (from + np <= end) {
> > +   u64 cpu_addr, size;
> > +
> > +   cpu_addr = of_translate_address(node, from + na);
> > +   size = of_read_number(from + na + pna, ns);
> > +   res->flags = bus->get_flags(from);
> > +   from += np;
> > +
> > +   if (cpu_addr == OF_BAD_ADDR || size == 0)
> > +   continue;
> 
> Hmmm. That seems to just ignore bad entries in the ranges property. Is
> that really the right thing to do? At least printing a diagnostic might
> be a good idea, even if the code does just soldier on in the hope
> everything still works.

That's true. However, erroring out here wouldn't be useful either since a
NULL return value is used to mark the end of the iteration and the
caller would have to assume that no more ranges are present. Maybe
that's better than continuing anyway, even if a message is printed.

Alternatively, the of_pci_process_ranges() could be changed to return an
ERR_PTR() encoded errno. This is one case where it makes a lot of sense.
I have a feeling that Grant won't like that very much, though.

Another possibility would be to change away from an iterator-based
approach and return an integer for the number of ranges and negative
error code on failure while returning an allocated array of resources
through an output parameter.
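
To make the ERR_PTR() option concrete, here is a minimal userspace sketch of
how an iterator can report a malformed entry without it being mistaken for the
end of iteration. The helpers err_ptr(), ptr_err() and is_err() imitate the
kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); struct range and next_range() are
hypothetical names used only for illustration and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel helpers (assumption: the top 4095
 * addresses are never valid pointers, as in include/linux/err.h). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical range entry; size == 0 stands in for a malformed entry. */
struct range {
	uint64_t cpu_addr;
	uint64_t size;
};

/*
 * Hypothetical iterator step: returns the entry at 'from' if it is valid,
 * NULL at the end of the table, or an encoded -EINVAL if it is malformed.
 */
static const struct range *next_range(const struct range *from,
				      const struct range *end)
{
	assert(from <= end);
	if (from == end)
		return NULL;			/* normal end of iteration */
	if (from->size == 0)
		return err_ptr(-EINVAL);	/* distinguishable from the end */
	return from;
}
```

A caller would loop while the result is neither NULL nor is_err(), and can
then tell "no more ranges" apart from "bad ranges property".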

Thierry


Re: [PATCH] module, fix percpu reserved memory exhaustion

2013-01-10 Thread Rusty Russell

Prarit Bhargava  writes:
> [   15.478160] kvm: Could not allocate 304 bytes percpu data
> [   15.478174] PERCPU: allocation failed, size=304 align=32, alloc from reserved chunk failed
...
> What is happening is systemd is loading an instance of the kvm module for
> each cpu found (see commit e9bda3b).  When the module load occurs the kernel
> currently allocates the modules percpu data area prior to checking to see
> if the module is already loaded or is in the process of being loaded.  If
> the module is already loaded, or finishes load, the module loading code
> releases the current instance's module's percpu data.

Wow, what a cool bug!  Classic unforeseen side-effect.

I'd prefer not to do relocations with the module_lock held: it can be
relatively slow.  Yet we can't do relocations before the per-cpu
allocation, obviously.  Did you do boot timings before and after?

An alternative would be to put the module into the list even earlier
(say, just after layout_and_allocate) so we could block on concurrent
loads at that point.  But then we have to make sure no one looks in the
module too early before it's completely set up, and that's complicated
and error-prone too.  A separate list is kind of icky.

We currently have PERCPU_MODULE_RESERVE set at 8k: in my 32-bit
allmodconfig build, there are only three modules with per-cpu data,
totalling 328 bytes.  So it's not reasonable to increase that number to
paper over this.
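
A rough back-of-the-envelope check, using the size=304 align=32 figures from
the log above and the 8k PERCPU_MODULE_RESERVE, suggests how few concurrent
loads it takes to hit the failure. The sketch below assumes an otherwise empty
reserve and no allocator overhead, which is a simplification; align_up() and
max_allocs() are hypothetical helper names.

```c
#include <stddef.h>

/* Round 'size' up to a multiple of 'align' (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/*
 * How many allocations of 'size' bytes with the given alignment fit in a
 * reserve of 'reserve' bytes, assuming the reserve is otherwise empty and
 * the allocator adds no overhead (both are simplifications).
 */
static size_t max_allocs(size_t reserve, size_t size, size_t align)
{
	return reserve / align_up(size, align);
}
```

Under those assumptions max_allocs(8192, 304, 32) is 25, so a machine with
more cpus than that could plausibly exhaust the reserve when one kvm load per
cpu is in flight at the same time.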

This is what a new boot state looks like (pains not to break ksplice).
It's two patches, but I'll just post them back to back:

module: add new state MODULE_STATE_UNFORMED.

You should never look at such a module, so it's excised from all paths
which traverse the modules list.

We add the state at the end, to avoid gratuitous ABI break (ksplice).

Signed-off-by: Rusty Russell 

diff --git a/include/linux/module.h b/include/linux/module.h
index 7760c6d..4432373 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -199,11 +199,11 @@ struct module_use {
struct module *source, *target;
 };
 
-enum module_state
-{
-   MODULE_STATE_LIVE,
-   MODULE_STATE_COMING,
-   MODULE_STATE_GOING,
+enum module_state {
+   MODULE_STATE_LIVE,  /* Normal state. */
+   MODULE_STATE_COMING,/* Full formed, running module_init. */
+   MODULE_STATE_GOING, /* Going away. */
+   MODULE_STATE_UNFORMED,  /* Still setting it up. */
 };
 
 /**
diff --git a/kernel/module.c b/kernel/module.c
index 41bc118..c3a2ee8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -188,6 +188,7 @@ struct load_info {
ongoing or failed initialization etc. */
 static inline int strong_try_module_get(struct module *mod)
 {
+   BUG_ON(mod && mod->state == MODULE_STATE_UNFORMED);
if (mod && mod->state == MODULE_STATE_COMING)
return -EBUSY;
if (try_module_get(mod))
@@ -343,6 +344,9 @@ bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
 #endif
};
 
+   if (mod->state == MODULE_STATE_UNFORMED)
+   continue;
+
if (each_symbol_in_section(arr, ARRAY_SIZE(arr), mod, fn, data))
return true;
}
@@ -450,16 +454,24 @@ const struct kernel_symbol *find_symbol(const char *name,
 EXPORT_SYMBOL_GPL(find_symbol);
 
 /* Search for module by name: must hold module_mutex. */
-struct module *find_module(const char *name)
+static struct module *find_module_all(const char *name,
+ bool even_unformed)
 {
struct module *mod;
 
list_for_each_entry(mod, &modules, list) {
+   if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
+   continue;
if (strcmp(mod->name, name) == 0)
return mod;
}
return NULL;
 }
+
+struct module *find_module(const char *name)
+{
+   return find_module_all(name, false);
+}
 EXPORT_SYMBOL_GPL(find_module);
 
 #ifdef CONFIG_SMP
@@ -525,6 +537,8 @@ bool is_module_percpu_address(unsigned long addr)
preempt_disable();
 
list_for_each_entry_rcu(mod, &modules, list) {
+   if (mod->state == MODULE_STATE_UNFORMED)
+   continue;
if (!mod->percpu_size)
continue;
for_each_possible_cpu(cpu) {
@@ -1048,6 +1062,8 @@ static ssize_t show_initstate(struct module_attribute *mattr,
case MODULE_STATE_GOING:
state = "going";
break;
+   default:
+   BUG();
}
return sprintf(buffer, "%s\n", state);
 }
@@ -1786,6 +1802,8 @@ void set_all_modules_text_rw(void)
 
mutex_lock(&module_mutex);
list_for_each_entry_rcu(mod, &modules, list) {
+   if (mod->state == MODULE_STATE_UNFORMED)
+   continue;
if ((mod->module_core) && (mod->core_text_size)) {
set_page_attributes(mod->module_core,

Re: [PATCH 0/3] ixgbe: request_firmware for configuration parameters

2013-01-10 Thread Shannon Nelson

On Thu, Jan 10, 2013 at 6:02 PM, Shannon Nelson
 wrote:
[...]
>
> In these RFC patches for ixgbe,
>

Yeah, these should have the "RFC" in the Subject line.  Sorry about that.
sln

-- 
==
Mr. Shannon Nelson Parents can't afford to be squeamish.

Re: [PATCH 10/14] PCI: tegra: Move PCIe driver to drivers/pci/host

2013-01-10 Thread Thierry Reding

On Thu, Jan 10, 2013 at 05:48:46PM -0700, Stephen Warren wrote:
> On 01/09/2013 01:43 PM, Thierry Reding wrote:
> > Move the PCIe driver from arch/arm/mach-tegra into the drivers/pci/host
> > directory. The motivation is to collect various host controller drivers
> > in the same location in order to facilitate refactoring.
> > 
> > The Tegra PCIe driver has been largely rewritten, both in order to turn
> > it into a proper platform driver and to add MSI (based on code by
> > Krishna Kishore ) as well as device tree support.
> 
> > diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
> > b/arch/arm/mach-tegra/board-dt-tegra20.c
> 
> >  static void __init trimslice_init(void)
> >  {
> >  #ifdef CONFIG_TEGRA_PCI
> > -   int ret;
> > -
> > -   ret = tegra_pcie_init(true, true);
> > -   if (ret)
> > -   pr_err("tegra_pci_init() failed: %d\n", ret);
> > +   platform_device_register(_pcie_device);
> 
> That struct doesn't actually exist anywhere; only an extern definition
> is added (and that extern definition isn't removed by patch 14 either).

Right, this shouldn't be there. In fact TEGRA_PCI is removed by this
patch, so I should go over the code more carefully again.

> > diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
> 
> > +config PCI_TEGRA
> > +   bool "NVIDIA Tegra PCIe controller"
> > +   depends on ARCH_TEGRA_2x_SOC
> 
> Perhaps depend on ARCH_TEGRA; that will save churn once this is ported
> to Tegra30, and shouldn't cause any problems before then.

Okay, I can do that.

> > diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
> 
> > +#define AFI_INTR_CODE  0xb8
> > +#define  AFI_INTR_CODE_MASK0xf
> > +#define  AFI_INTR_MASTER_ABORT 4
> > +#define  AFI_INTR_LEGACY   6
> 
> Adding defines for at least some other codes here, would help further
> below ...
> 
> > +static irqreturn_t tegra_pcie_isr(int irq, void *arg)
> 
> > +   if (code == AFI_INTR_MASTER_ABORT) {
> > +   dev_dbg(pcie->dev, "%s, signature: %08x\n", err_msg[code],
> > +   signature);
> > +   } else
> > +   dev_err(pcie->dev, "%s, signature: %08x\n", err_msg[code],
> > +   signature);
> > +
> > +   if (code == 3 || code == 4 || code == 7) {
> 
> ... i.e. here.

Will do.

> 
> > +   u32 fpci = afi_readl(pcie, AFI_UPPER_FPCI_ADDRESS) & 0xff;
> > +   u64 address = (u64)fpci << 32 | (signature & 0xfffc);
> > +   dev_dbg(pcie->dev, "  FPCI address: %10llx\n", address);
> 
> I'd suggest making that dev_err(), or at least something higher than
> debug, since the message indicating the error happened is dev_err(), so
> the complete details may as well be available since they're small.

I can make it conditional on !AFI_INTR_MASTER_ABORT to match the
previous output. Or rather move it into the branches above.

> > +static int tegra_pcie_enable_controller(struct tegra_pcie *pcie)
> > +{
> > +   unsigned int timeout;
> > +   unsigned long value;
> > +
> > +   /* enable dual controller and both ports */
> > +   value = afi_readl(pcie, AFI_PCIE_CONFIG);
> > +   value &= ~(AFI_PCIE_CONFIG_PCIEC0_DISABLE_DEVICE |
> > +  AFI_PCIE_CONFIG_PCIEC1_DISABLE_DEVICE |
> > +  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK);
> > +   value |= AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL;
> > +   afi_writel(pcie, value, AFI_PCIE_CONFIG);
> 
> Eventually, we should probably derive the port enables from the state of
> the root port DT nodes, so that we can disable some and presumably save
> a little power. Also, I notice that the nvidia,num-lanes property isn't
> implemented yet. Still, we can probably take care of this later.

Yes, the plan was to eventually derive the disable bits from the port
status and setup the XBAR_CONFIG field based on the combination of
nvidia,num-lanes properties.

I assume we should simply fail if the configuration specified by
nvidia,num-lanes is invalid?

> > +static void tegra_pcie_power_off(struct tegra_pcie *pcie)
> 
> > +   if (!IS_ERR_OR_NULL(pcie->pex_clk_supply)) {
> 
> Hmm. I think we should make supplies mandatory; it doesn't make sense
> for regulator support to be disabled on Tegra, and where a specific
> board doesn't actually have a regulator, you're supposed to provide a
> dummy fixed regulator so the driver doesn't have to care.
> 
> The same comment obviously applies to tegra_pcie_power_on() and wherever
> regulator_get() happens.

Okay, I'll fix that.

> > +static int tegra_pcie_parse_dt(struct tegra_pcie *pcie)
> 
> > +   pcie->vdd_supply = devm_regulator_get(pcie->dev, "vdd");
> > +   if (IS_ERR(pcie->vdd_supply))
> > +   return PTR_ERR(pcie->vdd_supply);
> > +
> > +   pcie->pex_clk_supply = devm_regulator_get(pcie->dev, "pex-clk");
> > +   if (IS_ERR(pcie->pex_clk_supply))
> > +   return PTR_ERR(pcie->pex_clk_supply);
> 
> Oh, I guess the regulator_get() calls are already strict.

Yeah, I think they can't return NULL, right? In that case I can just

linux-next: manual merge of the arm-soc tree with the xilinx tree

2013-01-10 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the arm-soc tree got a conflict in
arch/arm/mach-zynq/common.c between commit 453708c6da9b ("arm: zynq:
timer: Replace PSS through PS") from the xilinx tree and commit
6bb27d7349db ("ARM: delete struct sys_timer") from the arm-soc tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/arm/mach-zynq/common.c
index 892f65e,2ae4bce..000
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@@ -90,16 -90,9 +90,9 @@@ static void __init xilinx_zynq_timer_in
  
xilinx_zynq_clocks_init(slcr);
  
 -  xttcpss_timer_init();
 +  xttcps_timer_init();
  }
  
- /*
-  * Instantiate and initialize the system timer structure
-  */
- static struct sys_timer xttcps_sys_timer = {
-   .init   = xilinx_zynq_timer_init,
- };
- 
  /**
   * xilinx_map_io() - Create memory mappings needed for early I/O.
   */



Re: [ 004/123] b43legacy: Fix firmware loading when driver is built into the kernel

2013-01-10 Thread Ben Hutchings

On Wed, 2013-01-09 at 12:34 -0800, Greg Kroah-Hartman wrote:
> 3.7-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Larry Finger 
> 
> commit 576d28a7c73013717311cfcb514dbcae27c82eeb upstream.
> 
> Recent versions of udev cause synchronous firmware loading from the
> probe routine to fail because the request to user space times out.
[...]

This has since been fixed in udev, and the kernel can also load firmware
directly.  So is this still worth doing?

Ben.

-- 
Ben Hutchings
If you seem to know what you are doing, you'll be given more to do.



Re: [PATCH v3 17/22] sched: packing small tasks in wake/exec balancing

2013-01-10 Thread Alex Shi

On 01/11/2013 01:17 AM, Morten Rasmussen wrote:
> On Sat, Jan 05, 2013 at 08:37:46AM +, Alex Shi wrote:
>> If the wake/exec task is small enough, utils < 12.5%, it will
>> has the chance to be packed into a cpu which is busy but still has space to
>> handle it.
>>
>> Signed-off-by: Alex Shi 
>> ---
>>  kernel/sched/fair.c | 51 +--
>>  1 file changed, 45 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 8d0d3af..0596e81 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -3471,19 +3471,57 @@ static inline int get_sd_sched_policy(struct 
>> sched_domain *sd,
>>  }
>>  
>>  /*
>> + * find_leader_cpu - find the busiest but still has enough leisure time cpu
>> + * among the cpus in group.
>> + */
>> +static int
>> +find_leader_cpu(struct sched_group *group, struct task_struct *p, int 
>> this_cpu)
>> +{
>> +unsigned vacancy, min_vacancy = UINT_MAX;
> 
> unsigned int?

yes
> 
>> +int idlest = -1;
>> +int i;
>> +/* percentage the task's util */
>> +unsigned putil = p->se.avg.runnable_avg_sum * 100
>> +/ (p->se.avg.runnable_avg_period + 1);
> 
> Alternatively you could use se.avg.load_avg_contrib which is the same
> ratio scaled by the task priority (se->load.weight). In the above
> expression you don't take priority into account.

Sure, but this seems more direct and meaningful.
> 
>> +
>> +/* Traverse only the allowed CPUs */
>> +for_each_cpu_and(i, sched_group_cpus(group), tsk_cpus_allowed(p)) {
>> +struct rq *rq = cpu_rq(i);
>> +int nr_running = rq->nr_running > 0 ? rq->nr_running : 1;
>> +
>> +/* only pack task which putil < 12.5% */
>> +vacancy = FULL_UTIL - (rq->util * nr_running + putil * 8);
> 
> I can't follow this expression.
> 
> The variables can have the following values:
> FULL_UTIL  = 99
> rq->util   = [0..99]
> nr_running = [1..inf]
> putil  = [0..99]
> 
> Why multiply rq->util by nr_running?
> 
> Let's take an example where rq->util = 50, nr_running = 2, and putil =
> 10. In this case the value of putil doesn't really matter as vacancy
> would be negative anyway since FULL_UTIL - rq->util * nr_running is -1.
> However, with rq->util = 50 there should be plenty of spare cpu time to
> take another task.

For this example, the util may be below full only because the task has just
woken up; it is still possible that it will run full time. So I try to give it
a large guessed load.
> 
> Also, why multiply putil by 8? rq->util must be very close to 0 for
> vacancy to be positive if putil is close to 12 (12.5%).

I just want to pack small-util tasks, since packing can hurt
performance.
> 
> The vacancy variable is declared unsigned, so it will underflow instead
> of becoming negative. Is this intentional?

Oops, my mistake.
> 
> I may be missing something, but could the expression be something like
> the below instead?
> 
> Create a putil < 12.5% check before the loop. There is no reason to
> recheck it every iteration. Then:
> 
> vacancy = FULL_UTIL - (rq->util + putil)
> 
> should be enough?
> 
>> +
>> +/* bias toward local cpu */
>> +if (vacancy > 0 && (i == this_cpu))
>> +return i;
>> +
>> +if (vacancy > 0 && vacancy < min_vacancy) {
>> +min_vacancy = vacancy;
>> +idlest = i;
> 
> "idlest" may be a bit misleading here as you actually select busiest cpu
> that have enough spare capacity to take the task.

Um, change it to leader_cpu?
> 
> Morten
> 
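
The underflow Morten points out can be reproduced in userspace with the values
from his example (rq->util = 50, nr_running = 2, putil = 10). This is only an
illustrative sketch: FULL_UTIL = 99 is taken from the review, and the two
helper functions are hypothetical, not code from the patch.

```c
#define FULL_UTIL 99	/* value quoted in the review */

/* The expression as posted, using the unsigned type from the patch. */
static unsigned int vacancy_unsigned(unsigned int rq_util,
				     unsigned int nr_running,
				     unsigned int putil)
{
	/* Wraps around instead of going negative when the sum exceeds 99. */
	return FULL_UTIL - (rq_util * nr_running + putil * 8);
}

/* Same expression in signed arithmetic: an overcommitted cpu goes negative. */
static int vacancy_signed(int rq_util, int nr_running, int putil)
{
	return FULL_UTIL - (rq_util * nr_running + putil * 8);
}
```

vacancy_signed(50, 2, 10) is -81, while vacancy_unsigned(50, 2, 10) wraps to a
very large positive value, so a "vacancy > 0" test passes for a cpu that has
no room at all.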

Re: [PATCH] memcg: modify swap accounting function to support THP

2013-01-10 Thread Sha Zhengju

On Thu, Jan 10, 2013 at 9:39 PM, Michal Hocko  wrote:
> On Thu 10-01-13 19:43:58, Sha Zhengju wrote:
>> From: Sha Zhengju 
>
> THP are not swapped out because they are split before so this change
> doesn't make much sense to me.

Yes... I have been puzzled by 'nr_pages' in __mem_cgroup_uncharge_common..
Sorry for the noise. : )

>
>> Signed-off-by: Sha Zhengju 
>> ---
>>  mm/memcontrol.c |   13 ++---
>>  1 file changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 3817460..674cf21 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -914,10 +914,9 @@ static long mem_cgroup_read_stat(struct mem_cgroup 
>> *memcg,
>>  }
>>
>>  static void mem_cgroup_swap_statistics(struct mem_cgroup *memcg,
>> -  bool charge)
>> +  int nr_pages)
>>  {
>> - int val = (charge) ? 1 : -1;
>> - this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], val);
>> + this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], nr_pages);
>>  }
>>
>>  static unsigned long mem_cgroup_read_events(struct mem_cgroup *memcg,
>> @@ -4107,7 +4106,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum 
>> charge_type ctype,
>>*/
>>   memcg_check_events(memcg, page);
>>   if (do_swap_account && ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) {
>> - mem_cgroup_swap_statistics(memcg, true);
>> + mem_cgroup_swap_statistics(memcg, nr_pages);
>>   mem_cgroup_get(memcg);
>>   }
>>   /*
>> @@ -4238,7 +4237,7 @@ void mem_cgroup_uncharge_swap(swp_entry_t ent)
>>*/
>>   if (!mem_cgroup_is_root(memcg))
>>   res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
>> - mem_cgroup_swap_statistics(memcg, false);
>> + mem_cgroup_swap_statistics(memcg, -1);
>>   mem_cgroup_put(memcg);
>>   }
>>   rcu_read_unlock();
>> @@ -4267,8 +4266,8 @@ static int mem_cgroup_move_swap_account(swp_entry_t 
>> entry,
>>   new_id = css_id(&to->css);
>>
>>   if (swap_cgroup_cmpxchg(entry, old_id, new_id) == old_id) {
>> - mem_cgroup_swap_statistics(from, false);
>> - mem_cgroup_swap_statistics(to, true);
>> + mem_cgroup_swap_statistics(from, -1);
>> + mem_cgroup_swap_statistics(to, 1);
>>   /*
>>* This function is only called from task migration context 
>> now.
>>* It postpones res_counter and refcount handling till the end
>> --
>> 1.7.9.5
>>
>
> --
> Michal Hocko
> SUSE Labs



-- 
Thanks,
Sha

Re: [PATCH 10/14] PCI: tegra: Move PCIe driver to drivers/pci/host

2013-01-10 Thread Thierry Reding

On Thu, Jan 10, 2013 at 04:54:30PM -0700, Stephen Warren wrote:
> On 01/09/2013 01:43 PM, Thierry Reding wrote:
> > Move the PCIe driver from arch/arm/mach-tegra into the drivers/pci/host
> > directory. The motivation is to collect various host controller drivers
> > in the same location in order to facilitate refactoring.
> > 
> > The Tegra PCIe driver has been largely rewritten, both in order to turn
> > it into a proper platform driver and to add MSI (based on code by
> > Krishna Kishore ) as well as device tree support.
> 
> This driver doesn't compile unless CONFIG_PCI_MSI is also enabled.
> Should it select that, or contain a few ifdefs?
> 
> drivers/pci/host/pci-tegra.c:900: undefined reference to `write_msi_msg'

Right, it'll need #ifdefs around the arch_{setup,teardown}_msi_irq(). Or
select PCI_MSI unconditionally. Once this is merged I was going to post
a patch that enables PCI_MSI in tegra_defconfig anyway. But it might be
better to keep it optional anyway since the remainder of the code copes
with it properly.

Thierry



Re: [PATCH v3 15/22] sched: log the cpu utilization at rq

2013-01-10 Thread Alex Shi

On 01/10/2013 07:40 PM, Morten Rasmussen wrote:
>> >  #undef P64
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index ee015b8..7bfbd69 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -1495,8 +1495,12 @@ static void update_cfs_rq_blocked_load(struct 
>> > cfs_rq *cfs_rq, int force_update)
>> >  
>> >  static inline void update_rq_runnable_avg(struct rq *rq, int runnable)
>> >  {
>> > +  u32 period;
>> >   __update_entity_runnable_avg(rq->clock_task, &rq->avg, runnable);
>> >   __update_tg_runnable_avg(&rq->avg, &rq->cfs);
>> > +
>> > +  period = rq->avg.runnable_avg_period ? rq->avg.runnable_avg_period : 1;
>> > +  rq->util = rq->avg.runnable_avg_sum * 100 / period;
> The existing tg->runnable_avg and cfs_rq->tg_runnable_contrib variables
> both holds
> rq->avg.runnable_avg_sum / rq->avg.runnable_avg_period scaled by
> NICE_0_LOAD (1024). Why not use one of the existing variables instead of
> introducing a new one?

We want an rq variable that reflects the utilization of the cpu, not of
the tg.
-- 
Thanks Alex

Re: [PATCH v3 11/22] sched: consider runnable load average in effective_load

2013-01-10 Thread Alex Shi

On 01/10/2013 07:28 PM, Morten Rasmussen wrote:
> On Sat, Jan 05, 2013 at 08:37:40AM +, Alex Shi wrote:
>> effective_load calculates the load change as seen from the
>> root_task_group. It needs to multiple cfs_rq's tg_runnable_contrib
>> when we turn to runnable load average balance.
>>
>> Signed-off-by: Alex Shi 
>> ---
>>  kernel/sched/fair.c | 11 ---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index cab62aa..247d6a8 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -2982,7 +2982,8 @@ static void task_waking_fair(struct task_struct *p)
>>  
>>  #ifdef CONFIG_FAIR_GROUP_SCHED
>>  /*
>> - * effective_load() calculates the load change as seen from the 
>> root_task_group
>> + * effective_load() calculates the runnable load average change as seen from
>> + * the root_task_group
>>   *
>>   * Adding load to a group doesn't make a group heavier, but can cause 
>> movement
>>   * of group shares between cpus. Assuming the shares were perfectly aligned 
>> one
>> @@ -3030,13 +3031,17 @@ static void task_waking_fair(struct task_struct *p)
>>   * Therefore the effective change in loads on CPU 0 would be 5/56 (3/8 - 
>> 2/7)
>>   * times the weight of the group. The effect on CPU 1 would be -4/56 (4/8 -
>>   * 4/7) times the weight of the group.
>> + *
>> + * After get effective_load of the load moving, will multiple the cpu own
>> + * cfs_rq's runnable contrib of root_task_group.
>>   */
>>  static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
>>  {
>>  struct sched_entity *se = tg->se[cpu];
>>  
>>  if (!tg->parent)/* the trivial, non-cgroup case */
>> -return wl;
>> +return wl * tg->cfs_rq[cpu]->tg_runnable_contrib
>> +>> NICE_0_SHIFT;
> 
> Why do we need to scale the load of the task (wl) by runnable_contrib
> when the task is in the root task group? Wouldn't the load change still
> just be wl?
> 

Here, wl is the load weight; runnable_contrib brings in the runnable time.
>>  
>>  for_each_sched_entity(se) {
>>  long w, W;
>> @@ -3084,7 +3089,7 @@ static long effective_load(struct task_group *tg, int 
>> cpu, long wl, long wg)
>>  wg = 0;
>>  }
>>  
>> -return wl;
>> +return wl * tg->cfs_rq[cpu]->tg_runnable_contrib >> NICE_0_SHIFT;
> 
> I believe that effective_load() is only used in wake_affine() to compare
> load scenarios of the same task group. Since the task group is the same
> the effective load is scaled by the same factor and should not make any
> difference?
> 
> Also, in wake_affine() the result of effective_load() is added with
> target_load() which is load.weight of the cpu and not a tracked load
> based on runnable_avg_*/contrib?
> 
> Finally, you have not scaled the result of effective_load() in the
> function used when FAIR_GROUP_SCHED is disabled. Should that be scaled
> too?

It should be, thanks for the reminder.

The wakeup path does not do well on burst wakeup benchmarks. I am thinking of
rewriting this part.
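
For reference, the scaling used in the patch can be sketched as plain
fixed-point arithmetic. The sketch assumes NICE_0_SHIFT is 10 (so a
contribution of 1024 means fully runnable) and non-negative inputs;
scale_by_runnable() is a hypothetical name.

```c
#define NICE_0_SHIFT 10	/* assumption: NICE_0_LOAD == 1 << 10 == 1024 */

/*
 * Scale a load weight by a runnable contribution expressed in units of
 * 1/1024. '*' binds tighter than '>>', so this computes
 * (wl * tg_runnable_contrib) >> NICE_0_SHIFT.
 */
static long scale_by_runnable(long wl, long tg_runnable_contrib)
{
	return wl * tg_runnable_contrib >> NICE_0_SHIFT;
}
```

With wl = 1024 and a contribution of 512 (runnable half the time) the result
is 512; with a contribution of 1024 the weight is returned unchanged.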


Re: [PATCH] fs: Disable preempt when acquire i_size_seqcount write lock

2013-01-10 Thread Fan Du




On 2013-01-11 06:38, Andrew Morton wrote:
> On Wed, 9 Jan 2013 11:34:19 +0800
> Fan Du  wrote:
>
>> Two rt tasks bind to one CPU core.
>>
>> The higher priority rt task A preempts a lower priority rt task B which
>> has already taken the write seq lock, and then the higher priority
>> rt task A try to acquire read seq lock, it's doomed to lockup.
>>
>> rt task A with lower priority: call write
>> i_size_write
>>   write_seqcount_begin(&inode->i_size_seqcount);
>>   inode->i_size = i_size;
>>
>> rt task B with higher priority: call sync, and preempt task A
>> i_size_read
>>   read_seqcount_begin    <-- lockup here...
>
> Ouch.
>
> And even if the preempting task is preemptible, it will spend an entire
> timeslice pointlessly spinning, which isn't very good.
>
>> So disable preempt when acquiring every i_size_seqcount *write* lock will
>> cure the problem.
>
> ...
>
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -758,9 +758,11 @@ static inline loff_t i_size_read(const struct inode *inode)
>>   static inline void i_size_write(struct inode *inode, loff_t i_size)
>>   {
>>   #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
>> +   preempt_disable();
>>     write_seqcount_begin(&inode->i_size_seqcount);
>>     inode->i_size = i_size;
>>     write_seqcount_end(&inode->i_size_seqcount);
>> +   preempt_enable();
>>   #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
>>     preempt_disable();
>>     inode->i_size = i_size;
>
> afaict all write_seqcount_begin()/read_seqretry() sites are vulnerable
> to this problem.  Would it not be better to do the preempt_disable() in
> write_seqcount_begin()?

IMHO, write_seqcount_begin()/write_seqcount_end() are often wrapped by a mutex.
This gives the higher priority task a chance to sleep, and then the lower
priority task gets the cpu to unlock, which avoids the problematic scenario
this patch describes.

But in the i_size_write case, I could only find disabling preemption to be a
good choice, until someone else has a better idea :)

> Possible problems:
>
> - mm/filemap_xip.c does disk I/O under write_seqcount_begin().
>
> - dev_change_name() does GFP_KERNEL allocations under write_seqcount_begin()
>
> - I didn't review u64_stats_update_begin() callers.
>
> But I think calling schedule() under preempt_disable() is OK anyway?
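
The lockup described above follows from how a sequence counter works: the
writer makes the count odd for the duration of the update, and a reader spins
until it sees an even count. Below is a minimal single-threaded model of that
protocol, not the kernel implementation; all names are hypothetical, and the
spin is bounded only so that the model terminates.

```c
#include <stdbool.h>

/* Minimal model of a sequence counter (not the kernel's seqcount_t). */
struct seqcount {
	unsigned int sequence;
};

static void write_begin(struct seqcount *s) { s->sequence++; }	/* now odd */
static void write_end(struct seqcount *s)   { s->sequence++; }	/* even again */

/* A reader may only proceed while no write is in progress (even count). */
static bool reader_may_proceed(const struct seqcount *s)
{
	return (s->sequence & 1) == 0;
}

/*
 * Model of the reader's entry loop: spin until the count is even.
 * 'max_spins' bounds the loop; returning false marks the point where a
 * real reader would keep spinning.
 */
static bool read_begin_bounded(const struct seqcount *s, unsigned int max_spins)
{
	while (!reader_may_proceed(s)) {
		if (max_spins-- == 0)
			return false;
	}
	return true;
}
```

If the writer is preempted between write_begin() and write_end() by a higher
priority reader on the same cpu, the count stays odd for as long as the reader
runs, which is the case the bounded loop reports as false.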



--
Drifting with the waves, remembering only today's laughter

--fan

Re: Ftrace test failure on MIPS - Looking for insight..

2013-01-10 Thread Steven Rostedt

On Thu, 2013-01-10 at 17:58 -0800, David Daney wrote:
> Hi Steven,
> 
> I am trying to track down the cause of:
> 
> .
> .
> .
> Brought up 32 CPUs
> Testing tracer function: PASSED
> Testing dynamic ftrace: .. filter failed count=0 ..FAILED!
> [ cut here ]
> WARNING: at kernel/trace/trace.c:878 register_tracer+0x23c/0x300()
> Modules linked in:
> Call Trace:
> [] dump_stack+0x14/0x40
> [] warn_slowpath_common+0x84/0xb0
> [] register_tracer+0x23c/0x300
> [] do_one_initcall+0x110/0x178
> [] kernel_init+0x174/0x318
> [] ret_from_kernel_thread+0x14/0x1c
> 
> ---[ end trace 204112383c2d190e ]---
> .
> .
> .
> 
> 
> This is a MIPS64 kernel build from Linus' tree of today (commit 
> 254adaa465c40151df11fc1f88f93e6e86eb61d4)
> 
> I think the failure is long standing (since 3.4.x at least).
> 
> If you have any ideas off the top of your head as to what the cause 
> might be, I would love to hear them.
> 
> In any event, I will try to track down the root cause and fix it.  But 
> if something jumped out at you, that could speed up my search for the cause.

The failure is that it set the tracing filter to be DYN_FTRACE_TEST_NAME
(which is defined as trace_selftest_dynamic_test_func) and then it
called the function and then checked how many events were in the trace.
But there wasn't any (count=0). For some reason dynamic function tracing
didn't trace the function when it was called.

Some reasons for this to happen:

1) tracing was somehow disabled (tracing_on set to zero). But as the
function tracer passed, I don't think this would be the case.

2) the function wasn't properly set in the filter. That is, could mips
have another name for that function? Where it wouldn't add it?

3) well, just about anything :-)

I could suggest adding printks in the code, and that might help you.
Look into ftrace_set_global_filter (kernel/trace/ftrace.c) and follow
that code. Follow it all the way to __ftrace_hash_rec_update(), and make
sure the rec gets updated. You may add a printk right after the inc
(although, you may also want to set a variable to not do that printk
until the dynamic test runs).

Something like this:

	rec->flags++;
	if (ok_to_printk)
		printk("setting rec %p %pS\n", (void*)rec->ip, (void*)rec->ip);

and at the start of the dynamic test have:

	ok_to_printk = 1;
	pr_info("Testing dynamic ftrace: ");

You should see the record being set. If not, you know why it broke.

-- Steve
