Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

2015-11-24 Thread Yunlong Song
On 2015/11/24 23:06, David Ahern wrote:
> 
> So you are basically ignoring all samples until SIGUSR2 is received. That 
> means the resulting data file will have limited history of task events for 
> example. And for other events the quantity is random as to when the mmaps 
> were last scanned.
> 
> Your cover letter mentioned my code "just makes some count when the signal 
> triggers perf sched, with no sample recording and has nothing to do with 
> perf.data". That is not correct. If you look at the perf-daemon code I 
> pointed you to it processes task events as they are received and saves the 
> last N-events after time sorting (limited by memory or time). When a signal 
> is received it processes the saved events and dumps them to stdout versus 
> writing a perf.data file.
> 
> David
> 

Hi, David,

Yes, I know that your sched daemon can store and print info when the signal 
triggers,
however, what I mean 'makes some count' is: sched daemon parses and processes 
the events
to extract the tracing info related with sched, rather than a general use of 
perf.data
like "perf script", "perf report", "perf data convert --to-ctf", etc. And what 
I mean
'no sample recording and has nothing to do with perf.data' is: when perf 
receives a signal,
sched daemon uses timehist_print_summary and timehist_pstree to record those 
tracing info
related with sched to a new file rather than the raw perf event records in the 
perf.data.
We can not use those files generated by sched daemon to enjoy the strong 
functions like how
perf.data can be used in "perf script", "perf report", "perf data convert 
--to-ctf", etc.

Sched daemon is good, but it is carefully designed for specific use of perf 
sched. In general
case of perf record, with snapshot mode, we still want a perf.data as before. 
Your sched daemon
concurrently does the work of storing and sched-parsing action for each signal 
trigger. To get
a general style of perf.data, the sched-parsing semantic action may have to be 
removed.

-- 
Thanks,
Yunlong Song

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

2015-11-24 Thread Wangnan (F)



On 2015/11/25 15:22, Adrian Hunter wrote:

On 25/11/15 05:50, Wangnan (F) wrote:


On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:

Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:

On 11/24/15 7:00 AM, Yunlong Song wrote:

+static int record__write(struct record *rec, void *bf, size_t size)
+{
+if (rec->memory.size && memory_enabled) {
+if (perf_memory__write(&rec->memory, bf, size) < 0) {
+pr_err("failed to write memory data, error: %m\n");
+return -1;
+}
+} else {
+if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+pr_err("failed to write perf data, error: %m\n");
+return -1;
+}
+rec->bytes_written += size;
   }

-rec->bytes_written += size;
   return 0;
   }

@@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int
idx)
   if (old == head)
   return 0;

+memory_enabled = 1;
+
   rec->samples++;

   size = head - old;
@@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int
idx)
   md->prev = old;
   perf_evlist__mmap_consume(rec->evlist, idx);
   out:
+memory_enabled = 0;
   return rc;
   }


So you are basically ignoring all samples until SIGUSR2 is received. That

No, he is not, its just that his code is difficult to follow, has to be
rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
will..


means the resulting data file will have limited history of task events for

... have a complete history of task events, since PERF_RECORD_FORK, etc
are not being ignored.

No?

Actually we are discussing about this problem.

For such tracking events (PERF_RECORD_FORK...), we have dummy event so
it is possible for us to receive tracking events from a separated
channel, therefore we don't have to parse every events to pick those
events out. Instead, we can process tracking events differently, then
more interesting things can be done. For example, squashing those tracking
events if it takes too much memory...

Furthermore, there's another problem being discussed: if userspace ringbuffer
is bytes based, parsing event is unavoidable. Without parsing event we are
unable to find the new 'head' pointer when overwriting.

Have you considered trying to find the head by trial-and-error at the time
you make the snapshot i.e. look at the first 8 bytes (event records are 8
byte aligned) and see if it is a valid record header, if not try the next 8
bytes.  When you find a real event record it should parse without error and
the subsequent events should all parse without error too, all the way to the
tail.  Then you can use timestamps and compare the events byte-by-byte to
avoid overlaps between 2 snapshots.


It seems not work. Now we have BPF output event, it is possible that a
BPF program output anything through that event. Even if we have a magic
in head of each event, we can't prevent BPF output event output that
magic, except we introduce some 'escape' method to prevent BPF output
event output some data pattern. So although might work in reallife,
this solution is logically incorrect. Or am I miss someting?

Thank you.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9] PCI: Xilinx-NWL-PCIe: Added support for Xilinx NWL PCIe Host Controller

2015-11-24 Thread Marc Zyngier
On Wed, 25 Nov 2015 05:40:49 +
Bharat Kumar Gogada  wrote:

> > On Thu, 19 Nov 2015 11:05:23 +0530
> > Bharat Kumar Gogada  wrote:
> > 
> > > Adding PCIe Root Port driver for Xilinx PCIe NWL bridge IP.
> > >
> > > Signed-off-by: Bharat Kumar Gogada 
> > > Signed-off-by: Ravi Kiran Gummaluri 
> > > Acked-by: Rob Herring 
> > > ---
> > > +
> > > +#define MSI_ADDRESS  0xDEED
> > 
> > How did you pick this value? What if it intersect with some actual RAM?
> > What if a device actually does DMA to that location?
> > 
> > Wouldn't it make sense to actually pick a real *device* address (hint:
> > your MSI controller itself) for this purpose, as the device will never DMA
> > there?
> >
> > 
> We have already mentioned in previous patch discussion, we don't have
> any device address on our SOC for MSI, that's the reason we are
> allocating a page for MSI in RAM. Since our memory write is consumed
> by bridge and doesn't write to memory, you suggested to use some
> random address,  so using some random address.

This is becoming painful.

- "write is consumed by bridge and doesn't write to memory": So why are
  you using something that has a chance of actually being memory??? Are
  you in the business of corrupting unsuspecting data?

- "we don't have any device address on our SOC for MSI": You have
  plenty, and that's the whole of your device space. *All of it*. So
  just take the base address of your PCIe controller, and be done with
  it. Or your UART. Anything that cannot be DMA'ed to from a PCIe
  device, and that is downstream of your PCIe bridge.

M.
-- 
Jazz is not dead. It just smells funny.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] scripts: fix the sys path for gdb scripts

2015-11-24 Thread Jan Kiszka
On 2015-11-19 11:54, yalin wang wrote:
> The sys.path should be scripts/gdb,
> so that we can import linux lib correctly.
> 
> Signed-off-by: yalin wang 
> ---
>  scripts/gdb/vmlinux-gdb.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py
> index ce82bf5..5a45d1a 100644
> --- a/scripts/gdb/vmlinux-gdb.py
> +++ b/scripts/gdb/vmlinux-gdb.py
> @@ -13,7 +13,7 @@
>  
>  import os
>  
> -sys.path.insert(0, os.path.dirname(__file__) + "/scripts/gdb")
> +sys.path.insert(0, os.path.dirname(__file__))
>  
>  try:
>  gdb.parse_and_eval("0")
> 

NACK. This patch is assuming that vmlinux-gdb.py is (only) started from
the scripts/gdb folder. But CONFIG_GDB_SCRIPTS places a link to
vmlinux-gdb.py aside the vmlinux binary in the top-level folder. That
way, the script is auto-loaded by gdb.

If you have a compelling use case for loading the script manually from
its original folder, we can discuss augmenting the path. But removing
the existing one is wrong.

Andrew, please drop the patch from your queue.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

2015-11-24 Thread Adrian Hunter
On 25/11/15 05:50, Wangnan (F) wrote:
> 
> 
> On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:
>> Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:
>>> On 11/24/15 7:00 AM, Yunlong Song wrote:
 +static int record__write(struct record *rec, void *bf, size_t size)
 +{
 +if (rec->memory.size && memory_enabled) {
 +if (perf_memory__write(&rec->memory, bf, size) < 0) {
 +pr_err("failed to write memory data, error: %m\n");
 +return -1;
 +}
 +} else {
 +if (perf_data_file__write(rec->session->file, bf, size) < 0) {
 +pr_err("failed to write perf data, error: %m\n");
 +return -1;
 +}
 +rec->bytes_written += size;
   }

 -rec->bytes_written += size;
   return 0;
   }

 @@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int
 idx)
   if (old == head)
   return 0;

 +memory_enabled = 1;
 +
   rec->samples++;

   size = head - old;
 @@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int
 idx)
   md->prev = old;
   perf_evlist__mmap_consume(rec->evlist, idx);
   out:
 +memory_enabled = 0;
   return rc;
   }

>>> So you are basically ignoring all samples until SIGUSR2 is received. That
>> No, he is not, its just that his code is difficult to follow, has to be
>> rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
>> will..
>>
>>> means the resulting data file will have limited history of task events for
>> ... have a complete history of task events, since PERF_RECORD_FORK, etc
>> are not being ignored.
>>
>> No?
> 
> Actually we are discussing about this problem.
> 
> For such tracking events (PERF_RECORD_FORK...), we have dummy event so
> it is possible for us to receive tracking events from a separated
> channel, therefore we don't have to parse every events to pick those
> events out. Instead, we can process tracking events differently, then
> more interesting things can be done. For example, squashing those tracking
> events if it takes too much memory...
> 
> Furthermore, there's another problem being discussed: if userspace ringbuffer
> is bytes based, parsing event is unavoidable. Without parsing event we are
> unable to find the new 'head' pointer when overwriting.

Have you considered trying to find the head by trial-and-error at the time
you make the snapshot i.e. look at the first 8 bytes (event records are 8
byte aligned) and see if it is a valid record header, if not try the next 8
bytes.  When you find a real event record it should parse without error and
the subsequent events should all parse without error too, all the way to the
tail.  Then you can use timestamps and compare the events byte-by-byte to
avoid overlaps between 2 snapshots.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions

2015-11-24 Thread Arnd Bergmann
On Tuesday 24 November 2015 17:51:37 Stephen Boyd wrote:
> On 11/24, Arnd Bergmann wrote:
> > On Monday 23 November 2015 15:13:52 Stephen Boyd wrote:
> > IOW, anything with CPU implementer 0x56 part 0x581 should use those,
> > while part 0x584 can use the sdiv/udiv that it reports correctly.
> > 
> 
> It looks like we have some sort of function that mostly does
> this, except it doesn't differentiate on that lower bit for 1 vs
> 4. I guess I'll write another one for that.
> 
> static inline int cpu_is_pj4(void)
> {
> unsigned int id;
> 
> id = read_cpuid_id();
> if ((id & 0xff0fff00) == 0x560f5800)
> return 1;
> 
> return 0;
> }

Correct, thanks.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel oops on mmotm-2015-10-15-15-20

2015-11-24 Thread Minchan Kim
On Thu, Nov 19, 2015 at 08:58:27AM +0200, Kirill A. Shutemov wrote:
> On Thu, Nov 19, 2015 at 11:12:21AM +0900, Minchan Kim wrote:
> > On Tue, Nov 17, 2015 at 11:32:13AM +0200, Kirill A. Shutemov wrote:
> > > On Tue, Nov 17, 2015 at 04:35:39PM +0900, Minchan Kim wrote:
> > > > On Mon, Nov 16, 2015 at 12:54:53PM +0200, Kirill A. Shutemov wrote:
> > > > > On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote:
> > > > > > On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote:
> > > > > > > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> > > > > > > > During the test with MADV_FREE on kernel I applied your patches,
> > > > > > > > I couldn't see any problem.
> > > > > > > > 
> > > > > > > > However, in this round, I did another test which is same one
> > > > > > > > I attached but a liitle bit different because it doesn't do
> > > > > > > > (memcg things/kill/swapoff) for testing program long-live test.
> > > > > > > 
> > > > > > > Could you share updated test?
> > > > > > 
> > > > > > It's part of my testing suite so I should factor it out.
> > > > > > I will send it when I go to office tomorrow.
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > > > And could you try to reproduce it on clean mmotm-2015-11-10-15-53?
> > > > > > 
> > > > > > Befor leaving office, I queued it up and result is below.
> > > > > > It seems you fixed already but didn't apply it to mmotm yet. Right?
> > > > > > Anyway, please confirm and say to me what I should add more patches
> > > > > > into mmotm-2015-11-10-15-53 for follow up your recent many bug
> > > > > > fix patches.
> > > > > 
> > > > > The two my patches which are not in the mmotm-2015-11-10-15-53 
> > > > > release:
> > > > > 
> > > > > http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com
> > > > > http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com
> > > > 
> > > > 1. mm: fix __page_mapcount()
> > > > 2. thp: fix leak due split_huge_page() vs. exit race
> > > > 
> > > > If I missed some patches, let me know it.
> > > > 
> > > > I applied above two patches based on mmotm-2015-11-10-15-53 and tested 
> > > > again.
> > > > But unfortunately, the result was below.
> > > > 
> > > > Now, I am making test program I can send to you but it seems to be not 
> > > > easy
> > > > because small changes for factoring it out from testing suite seems to 
> > > > change
> > > > something(ex, timing) and makes hard to reproduce. I will try it again.
> > > 
> > > Your test suite seems generate quite a few bug reports. Don't mind make 
> > > whole
> > > suite public?
> > 
> > It's tough due to including company internal stuffs.
> > That's why I try to factor the part I can share out but unfortunatel,
> > I couldn't grab a time for retrying until now. :(
> > 
> > >  
> > > > page:ea240080 count:2 mapcount:1 mapping:88007eff3321 
> > > > index:0x60e02
> > > > flags: 0x40040018(uptodate|dirty|swapbacked)
> > > > page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> > > > page->mem_cgroup:880077cf0c00
> > > > [ cut here ]
> > > > kernel BUG at mm/huge_memory.c:3272!
> > > > invalid opcode:  [#1] SMP 
> > > > Dumping ftrace buffer:
> > > >(ftrace buffer empty)
> > > > Modules linked in:
> > > > CPU: 8 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #8
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 
> > > > 01/01/2011
> > > > task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000
> > > > RIP: 0010:[]  [] 
> > > > split_huge_page_to_list+0x8fb/0x910
> > > > RSP: 0018:88007344f968  EFLAGS: 00010286
> > > > RAX: 0021 RBX: ea240080 RCX: 
> > > > RDX: 0001 RSI: 0246 RDI: 821df4d8
> > > > RBP: 88007344f9e8 R08:  R09: 880bc600
> > > > R10: 8163e2c0 R11: 4b47 R12: ea240080
> > > > R13: ea240088 R14: ea240080 R15: 
> > > > FS:  () GS:88007830() 
> > > > knlGS:
> > > > CS:  0010 DS:  ES:  CR0: 8005003b
> > > > CR2: 7ffd59edcd68 CR3: 01808000 CR4: 06a0
> > > > Stack:
> > > >  cccd ea240080 88007344fa00 ea240088
> > > >  88007344fa00  88007344f9e8 810f0200
> > > >  ea24   ea240080
> > > > Call Trace:
> > > >  [] ? __lock_page+0xa0/0xb0
> > > >  [] deferred_split_scan+0x115/0x240
> > > >  [] ? list_lru_count_one+0x1c/0x30
> > > >  [] shrink_slab.part.42+0x1e3/0x350
> > > >  [] shrink_zone+0x26a/0x280
> > > >  [] do_try_to_free_pages+0x12d/0x3b0
> > > >  [] try_to_free_pages+0xb4/0x140
> > > >  [] __alloc_pages_nodemask+0x459/0x920
> > > >  [] ? trace_event_raw_event_tick_stop+0xd0/0xd0
> > > >  [] khugepaged+0x155/0x1b10
> > > >  []

Re: [PATCH v2 3/3] arcmsr: changes driver version number

2015-11-24 Thread Hannes Reinecke
On 11/25/2015 04:40 AM, Ching Huang wrote:
> From: Ching Huang 
> 
> Changes driver version number.
> 
> Signed-of-by: Ching Huang 
> 
> ---
> 
> diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
> --- a/drivers/scsi/arcmsr/arcmsr.h2015-11-25 10:52:13.33447 +0800
> +++ b/drivers/scsi/arcmsr/arcmsr.h2015-10-19 15:57:08.0 +0800
> @@ -52,7 +52,7 @@ struct device_attribute;
>   #define ARCMSR_MAX_FREECCB_NUM  320
>  #define ARCMSR_MAX_OUTSTANDING_CMD   255
>  #endif
> -#define ARCMSR_DRIVER_VERSION"v1.30.00.04-20140919"
> +#define ARCMSR_DRIVER_VERSION"v1.30.00.21-20151019"
>  #define ARCMSR_SCSI_INITIATOR_ID 
> 255
>  #define ARCMSR_MAX_XFER_SECTORS  
> 512
>  #define ARCMSR_MAX_XFER_SECTORS_B
> 4096
> 
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/3] arcmsr: adds code for support areca new adapter ARC1203

2015-11-24 Thread Hannes Reinecke
On 11/25/2015 04:25 AM, Ching Huang wrote:
> From: Ching Huang 
> 
> Support areca new PCIe to SATA RAID adapter ARC1203
> 
> Signed-of-by: Ching Huang
> 
> ---
> 
> diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
> --- a/drivers/scsi/arcmsr/arcmsr.h2015-11-25 10:52:16.28647 +0800
> +++ b/drivers/scsi/arcmsr/arcmsr.h2015-11-25 10:52:13.33447 +0800
> @@ -74,6 +74,9 @@ struct device_attribute;
>  #ifndef PCI_DEVICE_ID_ARECA_1214
>   #define PCI_DEVICE_ID_ARECA_12140x1214
>  #endif
> +#ifndef PCI_DEVICE_ID_ARECA_1203
> + #define PCI_DEVICE_ID_ARECA_12030x1203
> +#endif
>  /*
>  
> **
>  **
> @@ -245,6 +248,12 @@ struct FIRMWARE_INFO
>  /* window of "instruction flags" from iop to driver */
>  #define ARCMSR_IOP2DRV_DOORBELL   0x00020408
>  #define ARCMSR_IOP2DRV_DOORBELL_MASK  0x0002040C
> +/* window of "instruction flags" from iop to driver */
> +#define ARCMSR_IOP2DRV_DOORBELL_1203  0x00021870
> +#define ARCMSR_IOP2DRV_DOORBELL_MASK_1203 0x00021874
> +/* window of "instruction flags" from driver to iop */
> +#define ARCMSR_DRV2IOP_DOORBELL_1203  0x00021878
> +#define ARCMSR_DRV2IOP_DOORBELL_MASK_1203 0x0002187C
>  /* ARECA FLAG LANGUAGE */
>  /* ioctl transfer */
>  #define ARCMSR_IOP2DRV_DATA_WRITE_OK  0x0001
> diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c 
> b/drivers/scsi/arcmsr/arcmsr_hba.c
> --- a/drivers/scsi/arcmsr/arcmsr_hba.c2015-11-24 11:35:26.0 
> +0800
> +++ b/drivers/scsi/arcmsr/arcmsr_hba.c2015-11-24 18:58:40.640226000 
> +0800
> @@ -114,6 +114,7 @@ static void arcmsr_hardware_reset(struct
>  static const char *arcmsr_info(struct Scsi_Host *);
>  static irqreturn_t arcmsr_interrupt(struct AdapterControlBlock *acb);
>  static void arcmsr_free_irq(struct pci_dev *, struct AdapterControlBlock *);
> +static void arcmsr_wait_firmware_ready(struct AdapterControlBlock *acb);
>  static int arcmsr_adjust_disk_queue_depth(struct scsi_device *sdev, int 
> queue_depth)
>  {
>   if (queue_depth > ARCMSR_MAX_CMD_PERLUN)
> @@ -157,6 +158,8 @@ static struct pci_device_id arcmsr_devic
>   .driver_data = ACB_ADAPTER_TYPE_B},
>   {PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1202),
>   .driver_data = ACB_ADAPTER_TYPE_B},
> + {PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1203),
> + .driver_data = ACB_ADAPTER_TYPE_B},
>   {PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1210),
>   .driver_data = ACB_ADAPTER_TYPE_A},
>   {PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1214),
> @@ -2621,7 +2624,7 @@ static bool arcmsr_hbaA_get_config(struc
>  }
>  static bool arcmsr_hbaB_get_config(struct AdapterControlBlock *acb)
>  {
> - struct MessageUnit_B *reg = acb->pmuB;
> + struct MessageUnit_B *reg;
>   struct pci_dev *pdev = acb->pdev;
>   void *dma_coherent;
>   dma_addr_t dma_coherent_handle;
> @@ -2649,10 +2652,17 @@ static bool arcmsr_hbaB_get_config(struc
>   acb->dma_coherent2 = dma_coherent;
>   reg = (struct MessageUnit_B *)dma_coherent;
>   acb->pmuB = reg;
> - reg->drv2iop_doorbell= (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL);
> - reg->drv2iop_doorbell_mask = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_MASK);
> - reg->iop2drv_doorbell = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL);
> - reg->iop2drv_doorbell_mask = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_MASK);
> + if (acb->pdev->device == PCI_DEVICE_ID_ARECA_1203) {
> + reg->drv2iop_doorbell = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_1203);
> + reg->drv2iop_doorbell_mask = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_MASK_1203);
> + reg->iop2drv_doorbell = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_1203);
> + reg->iop2drv_doorbell_mask = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_MASK_1203);
> + } else {
> + reg->drv2iop_doorbell= (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL);
> + reg->drv2iop_doorbell_mask = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_MASK);
> + reg->iop2drv_doorbell = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL);
> + reg->iop2drv_doorbell_mask = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_MASK);
> + }
>   reg->message_wbuffer = (uint32_t __iomem *)((unsigned 
> long)acb->mem_base1 + ARCMSR_M

Re: [PATCH 1/1] thermal: setup monitor only once after handling trips

2015-11-24 Thread Chen, Yu C
Hi,
On Tue, 2015-11-24 at 20:07 -0800, Eduardo Valentin wrote:
> Instead of changing the monitoring setup every time after
> handling each trip, this patch simplifies the monitoring
> setup by moving the setup call to a place where all
> trips have been treated already.
> 
> Cc: Zhang Rui 
> Cc: linux...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Eduardo Valentin 
> ---
>  drivers/thermal/thermal_core.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index d9e525c..6debb54 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -457,11 +457,6 @@ static void handle_thermal_trip(struct 
> thermal_zone_device *tz, int trip)
>   handle_critical_trips(tz, trip, type);
>   else
>   handle_non_critical_trips(tz, trip, type);
> - /*
> -  * Alright, we handled this trip successfully.
> -  * So, start monitoring again.
> -  */
> - monitor_thermal_zone(tz);
>  }
>  
>  /**
> @@ -547,6 +542,12 @@ void thermal_zone_device_update(struct 
> thermal_zone_device *tz)
>  
>   for (count = 0; count < tz->trips; count++)
>   handle_thermal_trip(tz, count);
> +
> + /*
> +  * Alright, we handled this trip successfully.
s/these trips ?
> +  * So, start monitoring again.
> +  */
> + monitor_thermal_zone(tz);
>  }
>  EXPORT_SYMBOL_GPL(thermal_zone_device_update);
>  
BTW, thermal_notify_framework might need be affected? 

thanks,
Yu
N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�&j:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a���
0��h���i

Re: [PATCH v2 1/3] arcmsr: fixed getting wrong configuration data

2015-11-24 Thread Hannes Reinecke
On 11/25/2015 04:21 AM, Ching Huang wrote:
> From: Ching Huang 
> 
> Fixed getting wrong configuration data of adapter type B and type D.
> 
> Signed-of-by: Ching Huang 
> 
> ---
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH net-next 3/3] vhost_net: basic polling support

2015-11-24 Thread Jason Wang
This patch tries to poll for new added tx buffer or socket receive
queue for a while at the end of tx/rx processing. The maximum time
spent on polling were specified through a new kind of vring ioctl.

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c| 72 ++
 drivers/vhost/vhost.c  | 15 ++
 drivers/vhost/vhost.h  |  1 +
 include/uapi/linux/vhost.h | 11 +++
 4 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 9eda69e..ce6da77 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -287,6 +287,41 @@ static void vhost_zerocopy_callback(struct ubuf_info 
*ubuf, bool success)
rcu_read_unlock_bh();
 }
 
+static inline unsigned long busy_clock(void)
+{
+   return local_clock() >> 10;
+}
+
+static bool vhost_can_busy_poll(struct vhost_dev *dev,
+   unsigned long endtime)
+{
+   return likely(!need_resched()) &&
+  likely(!time_after(busy_clock(), endtime)) &&
+  likely(!signal_pending(current)) &&
+  !vhost_has_work(dev) &&
+  single_task_running();
+}
+
+static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
+   struct vhost_virtqueue *vq,
+   struct iovec iov[], unsigned int iov_size,
+   unsigned int *out_num, unsigned int *in_num)
+{
+   unsigned long uninitialized_var(endtime);
+
+   if (vq->busyloop_timeout) {
+   preempt_disable();
+   endtime = busy_clock() + vq->busyloop_timeout;
+   while (vhost_can_busy_poll(vq->dev, endtime) &&
+  !vhost_vq_more_avail(vq->dev, vq))
+   cpu_relax();
+   preempt_enable();
+   }
+
+   return vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+out_num, in_num, NULL, NULL);
+}
+
 /* Expects to be always run from workqueue - which acts as
  * read-size critical section for our kind of RCU. */
 static void handle_tx(struct vhost_net *net)
@@ -331,10 +366,9 @@ static void handle_tx(struct vhost_net *net)
  % UIO_MAXIOV == nvq->done_idx))
break;
 
-   head = vhost_get_vq_desc(vq, vq->iov,
-ARRAY_SIZE(vq->iov),
-&out, &in,
-NULL, NULL);
+   head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
+   ARRAY_SIZE(vq->iov),
+   &out, &in);
/* On error, stop handling until the next kick. */
if (unlikely(head < 0))
break;
@@ -435,6 +469,34 @@ static int peek_head_len(struct sock *sk)
return len;
 }
 
+static int vhost_net_peek_head_len(struct vhost_net *net, struct sock *sk)
+{
+   struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
+   struct vhost_virtqueue *vq = &nvq->vq;
+   unsigned long uninitialized_var(endtime);
+
+   if (vq->busyloop_timeout) {
+   mutex_lock(&vq->mutex);
+   vhost_disable_notify(&net->dev, vq);
+
+   preempt_disable();
+   endtime = busy_clock() + vq->busyloop_timeout;
+
+   while (vhost_can_busy_poll(&net->dev, endtime) &&
+  skb_queue_empty(&sk->sk_receive_queue) &&
+  !vhost_vq_more_avail(&net->dev, vq))
+   cpu_relax();
+
+   preempt_enable();
+
+   if (vhost_enable_notify(&net->dev, vq))
+   vhost_poll_queue(&vq->poll);
+   mutex_unlock(&vq->mutex);
+   }
+
+   return peek_head_len(sk);
+}
+
 /* This is a multi-buffer version of vhost_get_desc, that works if
  * vq has read descriptors only.
  * @vq - the relevant virtqueue
@@ -553,7 +615,7 @@ static void handle_rx(struct vhost_net *net)
vq->log : NULL;
mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
 
-   while ((sock_len = peek_head_len(sock->sk))) {
+   while ((sock_len = vhost_net_peek_head_len(net, sock->sk))) {
sock_len += sock_hlen;
vhost_len = sock_len + vhost_hlen;
headcount = get_rx_bufs(vq, vq->heads, vhost_len,
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b86c5aa..857af6c 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -285,6 +285,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
vq->memory = NULL;
vq->is_le = virtio_legacy_is_little_endian();
vhost_vq_reset_user_be(vq);
+   vq->busyloop_timeout = 0;
 }
 
 static int vhost_worker(void *data)
@@ -747,6 +748,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void

[PATCH net-next 1/3] vhost: introduce vhost_has_work()

2015-11-24 Thread Jason Wang
This path introduces a helper which can give a hint for whether or not
there's a work queued in the work list.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 7 +++
 drivers/vhost/vhost.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index eec2f11..163b365 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -245,6 +245,13 @@ void vhost_work_queue(struct vhost_dev *dev, struct 
vhost_work *work)
 }
 EXPORT_SYMBOL_GPL(vhost_work_queue);
 
+/* A lockless hint for busy polling code to exit the loop */
+bool vhost_has_work(struct vhost_dev *dev)
+{
+   return !list_empty(&dev->work_list);
+}
+EXPORT_SYMBOL_GPL(vhost_has_work);
+
 void vhost_poll_queue(struct vhost_poll *poll)
 {
vhost_work_queue(poll->dev, &poll->work);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index d3f7674..43284ad 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -37,6 +37,7 @@ struct vhost_poll {
 
 void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn);
 void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work);
+bool vhost_has_work(struct vhost_dev *dev);
 
 void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 unsigned long mask, struct vhost_dev *dev);
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH net-next 2/3] vhost: introduce vhost_vq_more_avail()

2015-11-24 Thread Jason Wang
Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 26 +-
 drivers/vhost/vhost.h |  1 +
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 163b365..b86c5aa 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1633,10 +1633,25 @@ void vhost_add_used_and_signal_n(struct vhost_dev *dev,
 }
 EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n);
 
+bool vhost_vq_more_avail(struct vhost_dev *dev, struct vhost_virtqueue *vq)
+{
+   __virtio16 avail_idx;
+   int r;
+
+   r = __get_user(avail_idx, &vq->avail->idx);
+   if (r) {
+   vq_err(vq, "Failed to check avail idx at %p: %d\n",
+  &vq->avail->idx, r);
+   return false;
+   }
+
+   return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
+}
+EXPORT_SYMBOL_GPL(vhost_vq_more_avail);
+
 /* OK, now we need to know about added descriptors. */
 bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
-   __virtio16 avail_idx;
int r;
 
if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
@@ -1660,14 +1675,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct 
vhost_virtqueue *vq)
/* They could have slipped one in as we were doing that: make
 * sure it's written, then check again. */
smp_mb();
-   r = __get_user(avail_idx, &vq->avail->idx);
-   if (r) {
-   vq_err(vq, "Failed to check avail idx at %p: %d\n",
-  &vq->avail->idx, r);
-   return false;
-   }
-
-   return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
+   return vhost_vq_more_avail(dev, vq);
 }
 EXPORT_SYMBOL_GPL(vhost_enable_notify);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 43284ad..2f3c57c 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -159,6 +159,7 @@ void vhost_add_used_and_signal_n(struct vhost_dev *, struct 
vhost_virtqueue *,
   struct vring_used_elem *heads, unsigned count);
 void vhost_signal(struct vhost_dev *, struct vhost_virtqueue *);
 void vhost_disable_notify(struct vhost_dev *, struct vhost_virtqueue *);
+bool vhost_vq_more_avail(struct vhost_dev *, struct vhost_virtqueue *);
 bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
 
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH net-next 0/3] basic busy polling support for vhost_net

2015-11-24 Thread Jason Wang
Hi all:

This series tries to add basic busy polling for vhost net. The idea is
simple: at the end of tx/rx processing, busy polling for new tx added
descriptor and rx receive socket for a while. The maximum number of
time (in us) could be spent on busy polling was specified ioctl.

Test A were done through:

- 50 us as busy loop timeout
- Netperf 2.6
- Two machines with back to back connected ixgbe
- Guest with 1 vcpu and 1 queue

Results:
- For stream workload, ioexits were reduced dramatically in medium
  size (1024-2048) of tx (at most -43%) and almost all rx (at most
  -84%) as a result of polling. This compensate for the possible
  wasted cpu cycles more or less. That porbably why we can still see
  some increasing in the normalized throughput in some cases.
- Throughput of tx were increased (at most 50%) expect for the huge
  write (16384). And we can send more packets in the case (+tpkts were
  increased).
- Very minor rx regression in some cases.
- Improvemnt on TCP_RR (at most 17%).

Guest TX:
size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
   64/ 1/  +18%/  -10%/   +7%/  +11%/0%
   64/ 2/  +14%/  -13%/   +7%/  +10%/0%
   64/ 4/   +8%/  -17%/   +7%/   +9%/0%
   64/ 8/  +11%/  -15%/   +7%/  +10%/0%
  256/ 1/  +35%/   +9%/  +21%/  +12%/  -11%
  256/ 2/  +26%/   +2%/  +20%/   +9%/  -10%
  256/ 4/  +23%/0%/  +21%/  +10%/   -9%
  256/ 8/  +23%/0%/  +21%/   +9%/   -9%
  512/ 1/  +31%/   +9%/  +23%/  +18%/  -12%
  512/ 2/  +30%/   +8%/  +24%/  +15%/  -10%
  512/ 4/  +26%/   +5%/  +24%/  +14%/  -11%
  512/ 8/  +32%/   +9%/  +23%/  +15%/  -11%
 1024/ 1/  +39%/  +16%/  +29%/  +22%/  -26%
 1024/ 2/  +35%/  +14%/  +30%/  +21%/  -22%
 1024/ 4/  +34%/  +13%/  +32%/  +21%/  -25%
 1024/ 8/  +36%/  +14%/  +32%/  +19%/  -26%
 2048/ 1/  +50%/  +27%/  +34%/  +26%/  -42%
 2048/ 2/  +43%/  +21%/  +36%/  +25%/  -43%
 2048/ 4/  +41%/  +20%/  +37%/  +27%/  -43%
 2048/ 8/  +40%/  +18%/  +35%/  +25%/  -42%
16384/ 1/0%/  -12%/   -1%/   +8%/  +15%
16384/ 2/0%/  -10%/   +1%/   +4%/   +5%
16384/ 4/0%/  -11%/   -3%/0%/   +3%
16384/ 8/0%/  -10%/   -4%/0%/   +1%

Guest RX:
size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
   64/ 1/   -2%/  -21%/   +1%/   +2%/  -75%
   64/ 2/   +1%/   -9%/  +12%/0%/  -55%
   64/ 4/0%/   -6%/   +5%/   -1%/  -44%
   64/ 8/   -5%/   -5%/   +7%/  -23%/  -50%
  256/ 1/   -8%/  -18%/  +16%/  +15%/  -63%
  256/ 2/0%/   -8%/   +9%/   -2%/  -26%
  256/ 4/0%/   -7%/   -8%/  +20%/  -41%
  256/ 8/   -8%/  -11%/   -9%/  -24%/  -78%
  512/ 1/   -6%/  -19%/  +20%/  +18%/  -29%
  512/ 2/0%/  -10%/  -14%/   -8%/  -31%
  512/ 4/   -1%/   -5%/  -11%/   -9%/  -38%
  512/ 8/   -7%/   -9%/  -17%/  -22%/  -81%
 1024/ 1/0%/  -16%/  +12%/   +9%/  -11%
 1024/ 2/0%/  -11%/0%/   +3%/  -30%
 1024/ 4/0%/   -4%/   +2%/   +6%/  -15%
 1024/ 8/   -3%/   -4%/   -8%/   -8%/  -70%
 2048/ 1/   -8%/  -23%/  +36%/  +22%/  -11%
 2048/ 2/0%/  -12%/   +1%/   +3%/  -29%
 2048/ 4/0%/   -3%/  -17%/  -15%/  -84%
 2048/ 8/0%/   -3%/   +1%/   -3%/  +10%
16384/ 1/0%/  -11%/   +4%/   +7%/  -22%
16384/ 2/0%/   -7%/   +4%/   +4%/  -33%
16384/ 4/0%/   -2%/   -2%/   -4%/  -23%
16384/ 8/   -1%/   -2%/   +1%/  -22%/  -40%

TCP_RR:
size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
1/ 1/  +11%/  -26%/  +11%/  +11%/  +10%
1/25/  +11%/  -15%/  +11%/  +11%/0%
1/50/   +9%/  -16%/  +10%/  +10%/0%
1/   100/   +9%/  -15%/   +9%/   +9%/0%
   64/ 1/  +11%/  -31%/  +11%/  +11%/  +11%
   64/25/  +12%/  -14%/  +12%/  +12%/0%
   64/50/  +11%/  -14%/  +12%/  +12%/0%
   64/   100/  +11%/  -15%/  +11%/  +11%/0%
  256/ 1/  +11%/  -27%/  +11%/  +11%/  +10%
  256/25/  +17%/  -11%/  +16%/  +16%/   -1%
  256/50/  +16%/  -11%/  +17%/  +17%/   +1%
  256/   100/  +17%/  -11%/  +18%/  +18%/   +1%

Test B were done through:

- 50us as busy loop timeout
- Netperf 2.6
- Two machines with back to back connected ixgbe
- Two guests each wich 1 vcpu and 1 queue
- pin two vhost threads to the same cpu on host to simulate the cpu
  contending

Results:
- In this radical case, we can still get at most 14% improvement on
  TCP_RR.
- For guest tx stream, minor improvemnt with at most 5% regression in
  one byte case. For guest rx stream, at most 5% regression were seen.

Guest TX:
size /-+%   /
1/-5.55%/
64   /+1.11%/
256  /+2.33%/
512  /-0.03%/
1024 /+1.14%/
4096 /+0.00%/
16384/+0.00%/

Guest RX:
size /-+%   /
1/-5.11%/
64   /-0.55%/
256  /-2.35%/
512  /-3.39%/
1024 /+6.8% /
4096 /-0.01%/
16384/+0.00%/

TCP_RR:
size /-+%/
1/+9.79% /
64   /+4.51% /
256  /+6.47% /
512  /-3.37% /
1024 /+6.15% /
4096 /+14.88%/
16384/-2.23% /

Changes from RFC V3:
- small tweak on the code to avoid mul

Re: LTO build errors (Re: linux-next: clean up the kbuild tree?)

2015-11-24 Thread Takashi Iwai
On Wed, 25 Nov 2015 05:33:44 +0100,
Andi Kleen wrote:
> 
> 
> Hi Takashi,
> 
> On Tue, Nov 24, 2015 at 05:33:36PM +0100, Takashi Iwai wrote:
> >   LD  vmlinux
> > arch/x86/kernel/cpu/perf_event_intel_rapl.c:66:20: error: rapl_domain_names 
> > causes a section type conflict with __setup_str_set_reset_devices
> >  static const char *rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
> > ^
> > init/main.c:159:19: note: ‘__setup_str_set_reset_devices’ was declared here
> >  __setup("reset_devices", set_reset_devices);
> > 
> > Hmm...  I see no direct relation, but OK, let's try to get rid of
> > __initconst.  Now it hits lots of other errors like:
> 
> I hit the same issue, will send a patch. The other symbol is typically some
> random correct symbol because gcc detects the conflict on a pair of symbols.
> 
> The problem is that placing const correctly is too difficult, the correct line
> would be 
> 
> static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {

Ah, so I should have ignored the relation with __setup_* but
concentrate on the line.  LTO goes much deeper than a human being can
look through :)

Yes, such an error is often overseen.  A quick grep shows the
following:

arch/arc/plat-axs10x/axs10x.c:463:static const char *axs101_compat[] 
__initconst = {
arch/arc/plat-axs10x/axs10x.c:477:static const char *axs103_compat[] 
__initconst = {
arch/arc/plat-sim/platform.c:22:static const char *simulation_compat[] 
__initconst = {
arch/arm/mach-imx/mach-imx6ul.c:87:static const char *imx6ul_dt_compat[] 
__initconst = {
arch/arm/mach-shmobile/setup-r8a7793.c:22:static const char 
*r8a7793_boards_compat_dt[] __initconst = {
arch/x86/kernel/cpu/perf_event_intel_rapl.c:66:static const char 
*rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
drivers/clk/pistachio/clk.h:40:#define PNAME(x) static const char *x[] 
__initconst

> > `__sw_hweight32' referenced in section `.text' of 
> > /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> > lib/built-in.o (symbol from plugin)
> > `__sw_hweight32' referenced in section `.text' of 
> > /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> > lib/built-in.o (symbol from plugin)
> > `__sw_hweight32' referenced in section `.text' of 
> > /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> > lib/built-in.o (symbol from plugin)
> > `__sw_hweight32' referenced in section `.text' of 
> > /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> > lib/built-in.o (symbol from plugin)
> 
> This needs
> 
> https://git.kernel.org/cgit/linux/kernel/git/ak/linux-misc.git/commit/?h=lto-4.0&id=d826425f7a9d935d521989bd0a871b76fb4c59e2

OK, noted.


> > /tmp/ccUCMU7n.ltrans21.ltrans.o: In function `do_exit':
> > :(.text+0xfc0): undefined reference to `sys_futex'
> > /tmp/ccUCMU7n.ltrans22.ltrans.o: In function `_do_fork':
> > :(.text+0x39f7): undefined reference to `ret_from_fork'
> > :(.text+0x4428): undefined reference to `ret_from_kernel_thread'
> 
> 
> That's new, but can be fixed by adding __visible or asmlinkage to these 
> symbols
> I guess it's from the recent entry* restructuring.
> 
> I'll do an updated tree later.
> 
> Everything that's called from assembler in C needs to be marked like this. 
> It's
> fairly mechanic.

OK, thanks for the information!


Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] ARM: dts: ls1021a: Add a TFT LCD panel.

2015-11-24 Thread Meng Yi
Signed-off-by: Alison Wang 
Signed-off-by: Xiubo Li 
Signed-off-by: Jianwei Wang 
---
 arch/arm/boot/dts/ls1021a-twr.dts | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm/boot/dts/ls1021a-twr.dts 
b/arch/arm/boot/dts/ls1021a-twr.dts
index fbb89d1..fad2e3b 100644
--- a/arch/arm/boot/dts/ls1021a-twr.dts
+++ b/arch/arm/boot/dts/ls1021a-twr.dts
@@ -105,6 +105,17 @@
bitclock-master;
};
};
+
+   panel: panel {
+   compatible = "nec,nl4827hc19-05b";
+   };
+
+};
+
+&dcu {
+   fsl,panel = <&panel>;
+   status = "okay";
+
 };
 
 &dspi1 {
-- 
2.1.0.27.g96db324

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] ARM: dts: ls1021a: Add DCU dts node.

2015-11-24 Thread Meng Yi
Signed-off-by: Alison Wang 
Signed-off-by: Xiubo Li 
Signed-off-by: Jianwei Wang 
---
 arch/arm/boot/dts/ls1021a.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 9430a99..f01c98b 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -428,6 +428,16 @@
 <&platform_clk 1>;
};
 
+   dcu: dcu@2ce {
+   compatible = "fsl,ls1021a-dcu";
+   reg = <0x0 0x2ce 0x0 0x1>;
+   interrupts = ;
+   clocks = <&platform_clk 0>;
+   clock-names = "dcu";
+   big-endian;
+   status = "disabled";
+   };
+
mdio0: mdio@2d24000 {
compatible = "gianfar";
device_type = "mdio";
-- 
2.1.0.27.g96db324

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TTY: n_gsm, fix false positive WARN_ON

2015-11-24 Thread xinhui

hi, Jiri
This warning should blame on commit 5a640967 ("tty/n_gsm.c: fix a memory leak 
in gsmld_open()"). When gsm driver failed to activate one mux,there is memory leak. So 
I call this ->cleanup() to do the cleanup work. Seems I did not consider all cases.

I have one confusion. As there is field gsm->num to store the index of 
gsm_mux[]. so in gsm_cleanup_mux(), why we still use for-loop to find this mux?

In error handle path, for example, the call trace in this patch, as we failed 
to activate it and the
gsm->num is invalid(and the value is 0). we can just modify the codes like 
below:

if(gsm_mux[gsm->num] == gsm)
other work
else
return;

I think it would work, and the logic is correct. Or I just miss something 
important?

thanks
xinhui

On 2015/11/25 00:54, Jiri Slaby wrote:

Dmitry reported, that the current cleanup code in n_gsm can trigger a
warning:
WARNING: CPU: 2 PID: 24238 at drivers/tty/n_gsm.c:2048 
gsm_cleanup_mux+0x166/0x6b0()
...
Call Trace:
...
  [] warn_slowpath_null+0x29/0x30 kernel/panic.c:490
  [] gsm_cleanup_mux+0x166/0x6b0 drivers/tty/n_gsm.c:2048
  [] gsmld_open+0x5b7/0x7a0 drivers/tty/n_gsm.c:2386
  [] tty_ldisc_open.isra.2+0x78/0xd0 
drivers/tty/tty_ldisc.c:447
  [] tty_set_ldisc+0x1ca/0xa70 drivers/tty/tty_ldisc.c:567
  [< inline >] tiocsetd drivers/tty/tty_io.c:2650
  [] tty_ioctl+0xb2a/0x2140 drivers/tty/tty_io.c:2883
...

But this is a legal path when open fails to find a space in the
gsm_mux array and tries to clean up. So make it a standard test
instead of a warning.

Reported-by: "Dmitry Vyukov" 
Cc: Alan Cox 
Link: 
http://lkml.kernel.org/r/cact4y+bhqbab68vfi7romcs-z9zw3kqrvcq+bvhh1oa5nca...@mail.gmail.com
Fixes: e1eaea46bb40 ("tty: n_gsm line discipline")
Cc: stable 
Signed-off-by: Jiri Slaby 
---
  drivers/tty/n_gsm.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index c3fe026d3168..9aff37186246 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -2045,7 +2045,9 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm)
}
}
spin_unlock(&gsm_mux_lock);
-   WARN_ON(i == MAX_MUX);
+   /* open failed before registering => nothing to do */
+   if (i == MAX_MUX)
+   return;

/* In theory disconnecting DLCI 0 is sufficient but for some
   modems this is apparently not the case. */



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [V5 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-11-24 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Fri, Nov 20, 2015 at 06:36:48PM +0900, Hidehiro Kawai wrote:
> > Currently, panic() and crash_kexec() can be called at the same time.
> > For example (x86 case):
> >
> > CPU 0:
> >   oops_end()
> > crash_kexec()
> >   mutex_trylock() // acquired
> > nmi_shootdown_cpus() // stop other cpus
> >
> > CPU 1:
> >   panic()
> > crash_kexec()
> >   mutex_trylock() // failed to acquire
> > smp_send_stop() // stop other cpus
> > infinite loop
> >
> > If CPU 1 calls smp_send_stop() before nmi_shootdown_cpus(), kdump
> > fails.
> 
> So the smp_send_stop() stops CPU 0 from calling nmi_shootdown_cpus(), right?

Yes, but the important thing is that CPU 1 stops CPU 0 which is
only CPU processing crash_ kexec routines.

> >
> > In another case:
> >
> > CPU 0:
> >   oops_end()
> > crash_kexec()
> >   mutex_trylock() // acquired
> > 
> > io_check_error()
> >   panic()
> > crash_kexec()
> >   mutex_trylock() // failed to acquire
> > infinite loop
> >
> > Clearly, this is an undesirable result.
> 
> I'm trying to see how this patch fixes this case.
> 
> >
> > To fix this problem, this patch changes crash_kexec() to exclude
> > others by using atomic_t panic_cpu.
> >
> > V5:
> > - Add missing dummy __crash_kexec() for !CONFIG_KEXEC_CORE case
> > - Replace atomic_xchg() with atomic_set() in crash_kexec() because
> >   it is used as a release operation and there is no need of memory
> >   barrier effect.  This change also removes an unused value warning
> >
> > V4:
> > - Use new __crash_kexec(), no exclusion check version of crash_kexec(),
> >   instead of checking if panic_cpu is the current cpu or not
> >
> > V2:
> > - Use atomic_cmpxchg() instead of spin_trylock() on panic_lock
> >   to exclude concurrent accesses
> > - Don't introduce no-lock version of crash_kexec()
> >
> > Signed-off-by: Hidehiro Kawai 
> > Cc: Eric Biederman 
> > Cc: Vivek Goyal 
> > Cc: Andrew Morton 
> > Cc: Michal Hocko 
> > ---
> >  include/linux/kexec.h |2 ++
> >  kernel/kexec_core.c   |   26 +-
> >  kernel/panic.c|4 ++--
> >  3 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > index d140b1e..7b68d27 100644
> > --- a/include/linux/kexec.h
> > +++ b/include/linux/kexec.h
> > @@ -237,6 +237,7 @@ extern int kexec_purgatory_get_set_symbol(struct kimage 
> > *image,
> >   unsigned int size, bool get_value);
> >  extern void *kexec_purgatory_get_symbol_addr(struct kimage *image,
> >  const char *name);
> > +extern void __crash_kexec(struct pt_regs *);
> >  extern void crash_kexec(struct pt_regs *);
> >  int kexec_should_crash(struct task_struct *);
> >  void crash_save_cpu(struct pt_regs *regs, int cpu);
> > @@ -332,6 +333,7 @@ int __weak arch_kexec_apply_relocations(const Elf_Ehdr 
> > *ehdr, Elf_Shdr *sechdrs,
> >  #else /* !CONFIG_KEXEC_CORE */
> >  struct pt_regs;
> >  struct task_struct;
> > +static inline void __crash_kexec(struct pt_regs *regs) { }
> >  static inline void crash_kexec(struct pt_regs *regs) { }
> >  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
> >  #define kexec_in_progress false
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 11b64a6..9d097f5 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -853,7 +853,8 @@ struct kimage *kexec_image;
> >  struct kimage *kexec_crash_image;
> >  int kexec_load_disabled;
> >
> > -void crash_kexec(struct pt_regs *regs)
> > +/* No panic_cpu check version of crash_kexec */
> > +void __crash_kexec(struct pt_regs *regs)
> >  {
> > /* Take the kexec_mutex here to prevent sys_kexec_load
> >  * running on one cpu from replacing the crash kernel
> > @@ -876,6 +877,29 @@ void crash_kexec(struct pt_regs *regs)
> > }
> >  }
> >
> > +void crash_kexec(struct pt_regs *regs)
> > +{
> > +   int old_cpu, this_cpu;
> > +
> > +   /*
> > +* Only one CPU is allowed to execute the crash_kexec() code as with
> > +* panic().  Otherwise parallel calls of panic() and crash_kexec()
> > +* may stop each other.  To exclude them, we use panic_cpu here too.
> > +*/
> > +   this_cpu = raw_smp_processor_id();
> > +   old_cpu = atomic_cmpxchg(&panic_cpu, -1, this_cpu);
> > +   if (old_cpu == -1) {
> > +   /* This is the 1st CPU which comes here, so go ahead. */
> > +   __crash_kexec(regs);
> > +
> > +   /*
> > +* Reset panic_cpu to allow another panic()/crash_kexec()
> > +* call.
> > +*/
> > +   atomic_set(&panic_cpu, -1);
> > +   }
> > +}
> > +
> >  size_t crash_get_memory_size(void)
> >  {
> > size_t size = 0;
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index 4fce2be..5d0b807 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -138,7 +138,7 @@ void panic(cons

Re: [PATCH] ASoC: rcar: remove unused variable

2015-11-24 Thread Kuninori Morimoto

Hi Arnd, Mark

> After a recent cleanup, the soc_card variable became unused
> and now produces a warning:
> 
> soc/sh/rcar/core.c: In function '__rsnd_kctrl_new':
> soc/sh/rcar/core.c:801:23: warning: unused variable 'soc_card' 
> [-Wunused-variable]
> 
> This removes the variable.
> 
> Fixes: 1a497983a5ae ("ASoC: Change the PCM runtime array to a list")
> Signed-off-by: Arnd Bergmann 
> 
> diff --git a/sound/soc/sh/rcar/core.c b/sound/soc/sh/rcar/core.c
> index c6685f14b9cb..90b244c1f526 100644
> --- a/sound/soc/sh/rcar/core.c
> +++ b/sound/soc/sh/rcar/core.c
> @@ -798,7 +798,6 @@ static int __rsnd_kctrl_new(struct rsnd_mod *mod,
>   void (*update)(struct rsnd_dai_stream *io,
>  struct rsnd_mod *mod))
>  {
> - struct snd_soc_card *soc_card = rtd->card;
>   struct snd_card *card = rtd->card->snd_card;
>   struct snd_kcontrol *kctrl;
>   struct snd_kcontrol_new knew = {

It seems this patch was accepted to topic/rcar branch,
but I got compile error

/opt/home/morimoto/WORK/linux/sound/soc/sh/rcar/core.c: In function 
'__rsnd_kctrl_new':
/opt/home/morimoto/WORK/linux/sound/soc/sh/rcar/core.c:807:18: error: 
'soc_card' undeclared (first use in this function)
   .index  = rtd - soc_card->rtd,
  ^
/opt/home/morimoto/WORK/linux/sound/soc/sh/rcar/core.c:807:18: note: each 
undeclared identifier is reported only once for each function it appears in
make[6]: *** [sound/soc/sh/rcar/core.o] エラー 1
make[6]: *** 未完了のジョブを待っています
make[5]: *** [sound/soc/sh/rcar] エラー 2
make[4]: *** [sound/soc/sh] エラー 2

This __rsnd_kctrl_new() is using "soc_card"...


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-24 Thread Jason Gunthorpe
On Tue, Nov 24, 2015 at 06:47:14PM -0800, Caitlin Bestler wrote:
> Acknowledge packet for the last packet of the request has been
> committed to the wire (including the appropriate fields for RDMA
> READ response).

> The target side cannot generate a response before it receives the request.
> The target side verbs cannot generate the completion before the acknowledge
> packet has been
> committed to the wire.

Sure, but, I keep saying this, the responder behavior is largely
irrelevant to what the target is able/required to do.

> Therefore the initiating side cannot receive a response before the write
> operation has completed.

Wrong. The ladder diagram would be

Requestor  Responder   Responder Verbs

SEND1 --->   Process
 X-  ACK (lost)
 > recv1 CQ
 <---   send2 WQE
recv2 CQE <  SEND2 Packet
[..]
send1 CQE <  ACK (resent)

The Ack may be lost, causing the send CQE to arrive after the recv
CQE, even though the responder did everything in a specific order.

The fundamental issue is that the responder cannot detect the lost
ACK. The PSN of the ACK packet is part of the Requestor's PSN space,
not part of the Responders:

 9.7.5.1.1 GENERATING PSNs FOR ACKNOWLEDGE MESSAGES

 C9-95: For responses to SEND requests or RDMA WRITE requests the
  responder shall insert in the PSN field of the response the PSN of the
  most recent request packet being acknowledged.

Or stated another way, the value of the AckReq bit in SEND1 has no
impact on the contents of the SEND2 packet - thus there is no way for
the requestor to detect the loss of the ACK and hold off delivering
recv2 CQE.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


4.4-rc2 crash: block related

2015-11-24 Thread Mika Penttilä

Hi,

With recent block layer pull i see a 100% repeatable crash on boot while
mounting roots (ext4 partition on eMMC, with cfq io scheduler).

---

5.674294] Unable to handle kernel NULL pointer dereference at virtual
address 0004
[5.682399] pgd = a8ca4000
[5.685113] [0004] *pgd=38a5e831, *pte=, *ppte=
[5.691428] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[5.696830] Modules linked in: st_drv
[5.700533] CPU: 1 PID: 221 Comm: mount Not tainted 4.4.0-rc2 #49
[5.706631] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[5.713163] task: a88e2ac0 ti: a88d4000 task.ti: a88d4000
[5.718578] PC is at cfq_init_prio_data+0x8/0xec
[5.723206] LR is at cfq_insert_request+0x28/0x4f0
[5.723211] pc : [<8024bf9c>]lr : [<8024e768>]psr: 600d0093
[5.723211] sp : a88d5bc0  ip :   fp : a8ab5400
[5.723219] r10: 0001  r9 : a617f4c0  r8 : 80b6359c
[5.723223] r7 : 80b62100  r6 : a873e200  r5 : a885ac30  r4 : 
[5.723226] r3 : a88d5bc0  r2 : a89106c0  r1 :   r0 : 
[5.723232] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
Segment user
[5.723235] Control: 10c5387d  Table: 38ca404a  DAC: 0055
[5.723239] Process mount (pid: 221, stack limit = 0xa88d4210)
[5.723242] Stack: (0xa88d5bc0 to 0xa88d6000)
[5.723251] 5bc0:  a885ac30 a873e200 8024e768 a87c
a885ac30 0005 a88d4000
[5.723257] 5be0: 80b6359c a617f4c0 0001 8023817c 
a89106c0 a885ac30 
[5.723263] 5c00: a89106c0  a87c 8023654c 
 a8ab5400 a89106c0
[5.723269] 5c20: 0008 1411 f000 80236680 a88d5c44
a87c0168 a617f4c0 a81a45c0
[5.723276] 5c40: 0001 0240 80b6359c a617f4c0 0001
80231b04 a00d0013 000f
[5.723282] 5c60: a617f4c0 a89106c0 1411 f000 80b6359c
a617f4c0 0001 80110950
[5.723288] 5c80: a617f4c0 0001 1411 80b6370c 80b6359c
80112490 a8b35c00 
[5.723295] 5ca0: 80b63658 801602e4 0205a9d9  a62b4738
a8ab5400 a8b35c00 a8b36000
[5.723301] 5cc0:   a8b36000 a8ab5400 a88d5e8c
80162644 a62621e8 800f7004
[5.723307] 5ce0: a88d5e8c 806dd610 a62621e8 a617f4c0 a8b35c00
a8b36000 0001 80165480
[5.723313] 5d00:   a88d5d58 a88d5d50 a87f2a90
a88d5d54 01897158 800ec9dc
[5.723319] 5d20:  0002  a88d5dc8 0001
a88d5dc0 0001 a6023000
[5.723325] 5d40: a88d5d90 a88d5d88 a8887f10 a88d5d8c 01897158
800ec8d0 a8887f10 0004
[5.723332] 5d60:  a88d5dc0 a88d5dc0 a6029110 0001
a80fd000 a88d5d8c a8744800
[5.723338] 5d80:   0001 0980 b67c
 0001 800bf478
[5.723343] 5da0: a615e490 0001 006c a8102db0 
0001 000a 0001
[5.723349] 5dc0:     002b
a82ec200 80b6e735 0004
[5.723355] 5de0:   a8ab5400  a8b36264
 001013d0 
[5.723361] 5e00: 0001  a8b36000  1000
a8b35e88  
[5.723366] 5e20:   a8ab5594  80be3e54
  
[5.723372] 5e40:  4003  80b70288 01897158
8025e5bc a6298e00 a88d5e6c
[5.723378] 5e60: 3b9aca00 0009 a6298e00 a6298e74 a8b35c00
a6298e00 0083 
[5.723384] 5e80:  80b70288 01897158 800e6324 a6298e00
800c0050 62636d6d 70306b6c
[5.723391] 5ea0: a835 800d0013 0004 80be3e2c a8dca80e
 0001 8015f030
[5.723397] 5ec0: a8dca800  80b70288 80b70288 80b6aeb0
8015f048 801636d8 a8ab1a48
[5.723403] 5ee0: 01897158 800e6f14  a8dca800 a8ab19c0
a8dca800  80b70288
[5.723409] 5f00:  800febbc  0020 
a8dca800 a8dca840 80101a14
[5.723416] 5f20:  80b60be0 a8001f00 024000c0 88c5
800df23c 007f a8dca800
[5.723421] 5f40: a87f2a90 a6138cc0 c0ed a8dca800 000f
 000f a8dca840
[5.723428] 5f60: a8dca800  018971a0 c0ed a88d4000
 01897158 801027e4
[5.723434] 5f80:  28936a1b 563c86d0  
76f35688 c0ed 0015
[5.723440] 5fa0: 8000f6a4 8000f500  76f35688 01897188
018971a0 01897158 c0ed
[5.723447] 5fc0:  76f35688 c0ed 0015 018971a0
01897188 76f36dac 01897158
[5.723453] 5fe0: 76e56dc0 7eedcc30 76f09e70 76e56dd0 600d0010
01897188  
[5.723473] [<8024bf9c>] (cfq_init_prio_data) from [<8024e768>]
(cfq_insert_request+0x28/0x4f0)
[5.723484] [<8024e768>] (cfq_insert_request) from [<8023817c>]
(blk_queue_bio+0x254/0x260)
[5.723500] [<8023817c>] (blk_queue_bio) from [<8023654c>]
(generic_make_request+0xcc/0x17c)
[5.723510] [<8023654c>] (generic_make_request) from [<80236680>]
[5.723527] [<80236680>] (submit_bio) from [<80110950>]
(submit_bh_wbc+0x10c/0x144)
[5.723537] [<80110950>] (submit_bh_wbc) 

Re: [PATCH v5 5/5] ARM: dts: TS-4800: add basic device tree

2015-11-24 Thread Shawn Guo
On Tue, Nov 24, 2015 at 01:00:53PM -0500, Damien Riegel wrote:
> This device tree adds support for TS-4800 by Technologic Systems. This
> board is based on MX51-babbage, but there are some subtle differences in
> the pins used, and there is an additional FPGA that is memory-mapped.
> 
> More details here:
>   http://wiki.embeddedarm.com/wiki/TS-4800
> 
> Signed-off-by: Damien Riegel 
> ---
>  .../devicetree/bindings/arm/technologic.txt|   6 +

Please put binding doc into a separate patch, and have device tree
maintainers and list on copy of that patch.

>  arch/arm/boot/dts/Makefile |   3 +-
>  arch/arm/boot/dts/imx51-ts4800.dts | 190 
> +
>  3 files changed, 198 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/devicetree/bindings/arm/technologic.txt
>  create mode 100644 arch/arm/boot/dts/imx51-ts4800.dts
> 
> diff --git a/Documentation/devicetree/bindings/arm/technologic.txt 
> b/Documentation/devicetree/bindings/arm/technologic.txt
> new file mode 100644
> index 000..8422988
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/technologic.txt
> @@ -0,0 +1,6 @@
> +Technologic Systems Platforms Device Tree Bindings
> +--
> +
> +TS-4800 board
> +Required root node properties:
> + - compatible = "technologic,imx51-ts4800", "fsl,imx51";
> diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
> index bb8fa02..41b9985 100644
> --- a/arch/arm/boot/dts/Makefile
> +++ b/arch/arm/boot/dts/Makefile
> @@ -258,7 +258,8 @@ dtb-$(CONFIG_SOC_IMX51) += \
>   imx51-apf51dev.dtb \
>   imx51-babbage.dtb \
>   imx51-digi-connectcore-jsk.dtb \
> - imx51-eukrea-mbimxsd51-baseboard.dtb
> + imx51-eukrea-mbimxsd51-baseboard.dtb \
> + imx51-ts4800.dtb
>  dtb-$(CONFIG_SOC_IMX53) += \
>   imx53-ard.dtb \
>   imx53-m53evk.dtb \
> diff --git a/arch/arm/boot/dts/imx51-ts4800.dts 
> b/arch/arm/boot/dts/imx51-ts4800.dts
> new file mode 100644
> index 000..fac2058
> --- /dev/null
> +++ b/arch/arm/boot/dts/imx51-ts4800.dts
> @@ -0,0 +1,190 @@
> +/*
> + * Copyright 2015 Savoir-faire Linux
> + *
> + * This device tree is based on imx51-babbage.dts
> + *
> + * The code contained herein is licensed under the GNU General Public
> + * License. You may obtain a copy of the GNU General Public License
> + * Version 2 at the following locations:
> + *
> + * http://www.opensource.org/licenses/gpl-license.html
> + * http://www.gnu.org/copyleft/gpl.html
> + */
> +
> +/dts-v1/;
> +#include "imx51.dtsi"
> +
> +/ {
> + model = "Technologic Systems TS-4800";
> + compatible = "technologic,imx51-ts4800", "fsl,imx51";
> +
> + chosen {
> + stdout-path = &uart1;
> + };
> +
> + memory {
> + reg = <0x9000 0x1000>;
> + };
> +
> + soc {
> + fpga {

For node with 'reg' property, it should be named in form of
name@unit-adderss.

> + compatible = "simple-bus";
> + reg = <0xb000 0x1d000>;
> + #address-cells = <1>;
> + #size-cells = <1>;
> + ranges;
> +
> + syscon: syscon@b001 {
> + compatible = "syscon", "simple-mfd";
> + reg = <0xb001 0x3d>;
> + bus-width = <16>;
> +
> + wdt@e {
> + compatible = "technologic,ts4800-wdt";
> + syscon = <&syscon 0xe>;
> + };
> + };
> + };
> + };
> +
> + clocks {
> + ckih1 {
> + clock-frequency = <22579200>;
> + };
> +
> + ckih2 {
> + clock-frequency = <24576000>;
> + };
> + };
> +};
> +
> +&esdhc1 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_esdhc1>;
> + cd-gpios = <&gpio1 0 GPIO_ACTIVE_LOW>;
> + wp-gpios = <&gpio1 1 GPIO_ACTIVE_HIGH>;
> + status = "okay";
> +};
> +
> +&fec {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_fec>;
> + phy-mode = "mii";
> + phy-reset-gpios = <&gpio2 14 GPIO_ACTIVE_LOW>;
> + phy-reset-duration = <1>;
> + status = "okay";
> +};
> +
> +&uart1 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_uart1>;
> + status = "okay";
> +};
> +
> +&uart2 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_uart2>;
> + status = "okay";
> +};
> +
> +&uart3 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_uart3>;
> + status = "okay";
> +};
> +
> +&i2c2 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_i2c2>;
> + status = "okay";
> +
> + rtc: m41t00@68 {
> + compatible = "stm,m41t00";
> + reg = <0x68>;
> + };
> +};

W

ENHORABUENA

2015-11-24 Thread Bulk Less
Felicitaciones usted ha ganado £ 750.000 Libras británicas {GBP} La 
suma de £ 750.000 Libras británicas {GBP} ha sido otorgado a usted en 
la Organización en curso de la lotería de Países Exportadores de 
Petróleo en Nigeria con el número ganador: 46689 \ 67 \ 2015k.In esta 
en línea ganadores de ejercicio de la lotería electrónica emergieron de 
los Países Exportadores de Petróleo que se enumeran desde el mundo 
.Actualmente, la organización cuenta con doce miembros, a saber: 
Argelia, Angola, Ecuador, Irán, Iraq, Kuwait, Libia, Nigeria, Qatar, 
Arabia Saudita, Emiratos Árabes Unidos y Venezuela. Contacto sobre 
zenith@aol.com


La información a continuación: Apellidos: . Otro nombre: 
 Sexo: .. Fecha de Nacimiento: .. Ocupación : 
. Número móvil: . Nacionalidad: .. Domicilio: 
. País: .. . Una copia escaneada de su pasaporte o de 
seguridad de la licencia o el control social de la identidad de 
conducir internacional.


Envíe la información anterior se requiere para zenith@aol.com

Organización de la Lotería Anunciador Países Exportadores de Petróleo 
2015




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] cgroup: mount cgroupns-root when inside non-init cgroupns

2015-11-24 Thread Serge E. Hallyn
On Tue, Nov 24, 2015 at 12:16:10PM -0500, Tejun Heo wrote:
...
> > +   if (ns != &init_cgroup_ns) {
> > +   struct dentry *nsdentry;
> > +   struct cgroup *cgrp;
> > +
> > +   cgrp = cset_cgroup_from_root(ns->root_cgrps, root);
> > +   nsdentry = kernfs_obtain_root(dentry->d_sb,
> > +   cgrp->kn);
> > +   dput(dentry);
> > +   dentry = nsdentry;
> > +   }
> > +   }
> 
> So, this would effectively allow namespace mounts to claim controllers
> which aren't configured otherwise which doesn't seem like a good idea.
> I think the right thing to do for namespace mounts is to always
> require an existing superblock.

that was my goal with 
https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/commit/?h=cgroupns.v4&id=8eb75d2bb24df59e262f050dce567d2332adc5f3
(which was sent inline earlier in this thread in response to Eric)  Does
that look sufficient?

thanks,
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [V5 PATCH 2/4] panic/x86: Allow cpus to save registers even if they are looping in NMI context

2015-11-24 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Tue, Nov 24, 2015 at 11:48:53AM +0100, Borislav Petkov wrote:
> >
> > > +  */
> > > + while (!raw_spin_trylock(&nmi_reason_lock))
> > > + poll_crash_ipi_and_callback(regs);
> >
> > Waaait a minute: so if we're getting NMIs broadcasted on every core but
> > we're *not* crash dumping, we will run into here too. This can't be
> > right. :-\
> 
> This only does something if crash_ipi_done is set, which means you are killing
> the box. But perhaps a comment that states that here would be useful, or maybe
> just put in the check here. 

OK, I'll add more comments around this.

> There's no need to make it depend on SMP, as
> raw_spin_trylock() will turn to just ({1}) for UP, and that code wont even be
> hit.

I'll integrate these SMP and UP versions with a comment about
that.

Regards,
--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group






Re: [RFC PATCH V2 3/3] Ixgbevf: Add migration support for ixgbevf driver

2015-11-24 Thread Lan Tianyu
On 2015年11月25日 05:20, Michael S. Tsirkin wrote:
> I have to say, I was much more interested in the idea
> of tracking dirty memory. I have some thoughts about
> that one - did you give up on it then?

No, our finial target is to keep VF active before doing
migration and tracking dirty memory is essential. But this
seems not easy to do that in short term for upstream. As
starters, stop VF before migration.

After deep thinking, the way of stopping VF still needs tracking
DMA-accessed dirty memory to make sure the received data buffer
before stopping VF migrated. It's easier to do that via dummy writing
data buffer when receive packet.


-- 
Best regards
Tianyu Lan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3] mfd: cros ec: Lock the SPI bus while holding chipselect

2015-11-24 Thread Nicolas Boichat
cros_ec_cmd_xfer_spi and cros_ec_pkt_xfer_spi generally work like
this:
 - Pull CS down (active), wait a bit, then send a command
 - Wait for response (multiple requests)
 - Wait a while, pull CS up (inactive)

These operations, individually, lock the SPI bus, but there is
nothing preventing the SPI framework from interleaving messages
intended for other devices as the bus is unlocked in between.

This is a problem as the EC expects CS to be held low for the
whole duration.

Solution: Lock the SPI bus during the whole transaction, to make
sure that no other messages can be interleaved.

Signed-off-by: Nicolas Boichat 
---

v2: Move bus_unlock earlier in the functions.
v3: Remove comments.

Applies on top on linux-next/master (20151124)

 drivers/mfd/cros_ec_spi.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 6a0f6ec..d6af52d 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -113,7 +113,7 @@ static int terminate_request(struct cros_ec_device *ec_dev)
trans.delay_usecs = ec_spi->end_of_msg_delay;
spi_message_add_tail(&trans, &msg);
 
-   ret = spi_sync(ec_spi->spi, &msg);
+   ret = spi_sync_locked(ec_spi->spi, &msg);
 
/* Reset end-of-response timer */
ec_spi->last_transfer_ns = ktime_get_ns();
@@ -147,7 +147,7 @@ static int receive_n_bytes(struct cros_ec_device *ec_dev, 
u8 *buf, int n)
 
spi_message_init(&msg);
spi_message_add_tail(&trans, &msg);
-   ret = spi_sync(ec_spi->spi, &msg);
+   ret = spi_sync_locked(ec_spi->spi, &msg);
if (ret < 0)
dev_err(ec_dev->dev, "spi transfer failed: %d\n", ret);
 
@@ -391,10 +391,10 @@ static int cros_ec_pkt_xfer_spi(struct cros_ec_device 
*ec_dev,
}
 
rx_buf = kzalloc(len, GFP_KERNEL);
-   if (!rx_buf) {
-   ret = -ENOMEM;
-   goto exit;
-   }
+   if (!rx_buf)
+   return -ENOMEM;
+
+   spi_bus_lock(ec_spi->spi->master);
 
/*
 * Leave a gap between CS assertion and clocking of data to allow the
@@ -414,7 +414,7 @@ static int cros_ec_pkt_xfer_spi(struct cros_ec_device 
*ec_dev,
trans.len = len;
trans.cs_change = 1;
spi_message_add_tail(&trans, &msg);
-   ret = spi_sync(ec_spi->spi, &msg);
+   ret = spi_sync_locked(ec_spi->spi, &msg);
 
/* Get the response */
if (!ret) {
@@ -440,6 +440,9 @@ static int cros_ec_pkt_xfer_spi(struct cros_ec_device 
*ec_dev,
}
 
final_ret = terminate_request(ec_dev);
+
+   spi_bus_unlock(ec_spi->spi->master);
+
if (!ret)
ret = final_ret;
if (ret < 0)
@@ -520,10 +523,10 @@ static int cros_ec_cmd_xfer_spi(struct cros_ec_device 
*ec_dev,
}
 
rx_buf = kzalloc(len, GFP_KERNEL);
-   if (!rx_buf) {
-   ret = -ENOMEM;
-   goto exit;
-   }
+   if (!rx_buf)
+   return -ENOMEM;
+
+   spi_bus_lock(ec_spi->spi->master);
 
/* Transmit phase - send our message */
debug_packet(ec_dev->dev, "out", ec_dev->dout, len);
@@ -534,7 +537,7 @@ static int cros_ec_cmd_xfer_spi(struct cros_ec_device 
*ec_dev,
trans.cs_change = 1;
spi_message_init(&msg);
spi_message_add_tail(&trans, &msg);
-   ret = spi_sync(ec_spi->spi, &msg);
+   ret = spi_sync_locked(ec_spi->spi, &msg);
 
/* Get the response */
if (!ret) {
@@ -560,6 +563,9 @@ static int cros_ec_cmd_xfer_spi(struct cros_ec_device 
*ec_dev,
}
 
final_ret = terminate_request(ec_dev);
+
+   spi_bus_unlock(ec_spi->spi->master);
+
if (!ret)
ret = final_ret;
if (ret < 0)
-- 
2.6.0.rc2.230.g3dd15c0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/3] zram: pass gfp from zcomp frontend to backend

2015-11-24 Thread Minchan Kim
Each zcomp backend uses own gfp flag but it's pointless
because the context they could be called is driven by upper
layer(ie, zcomp frontend). As well, zcomp frondend could
call them in different context. One context(ie, zram init part)
is it should be better to make sure successful allocation
other context(ie, further stream allocation part for accelarating
I/O speed) is just optional so let's pass gfp down from driver
(ie, zcomp frontend) like normal MM convention.

Signed-off-by: Minchan Kim 
---
 drivers/block/zram/zcomp.c | 24 
 drivers/block/zram/zcomp.h |  2 +-
 drivers/block/zram/zcomp_lz4.c | 18 +++---
 drivers/block/zram/zcomp_lzo.c | 19 ---
 4 files changed, 24 insertions(+), 39 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index c53617752b93..e7c197ede919 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -74,18 +74,18 @@ static void zcomp_strm_free(struct zcomp *comp, struct 
zcomp_strm *zstrm)
  * allocate new zcomp_strm structure with ->private initialized by
  * backend, return NULL on error
  */
-static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
+static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp, gfp_t flags)
 {
-   struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_NOIO);
+   struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), flags);
if (!zstrm)
return NULL;
 
-   zstrm->private = comp->backend->create();
+   zstrm->private = comp->backend->create(flags);
/*
 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
 * case when compressed size is larger than the original one
 */
-   zstrm->buffer = (void *)__get_free_pages(GFP_NOIO | __GFP_ZERO, 1);
+   zstrm->buffer = (void *)__get_free_pages(flags | __GFP_ZERO, 1);
if (!zstrm->private || !zstrm->buffer) {
zcomp_strm_free(comp, zstrm);
zstrm = NULL;
@@ -120,8 +120,16 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct 
zcomp *comp)
/* allocate new zstrm stream */
zs->avail_strm++;
spin_unlock(&zs->strm_lock);
-
-   zstrm = zcomp_strm_alloc(comp);
+   /*
+* This function could be called in swapout/fs write path
+* so we couldn't use GFP_FS|IO. And it assumes we already
+* have at least one stream in zram initialization so we
+* don't do best effort to allocate more stream in here.
+* A default stream will work well without further multiple
+* stream. That's why we use __GFP_NORETRY|NOWARN|NOMEMALLOC.
+*/
+   zstrm = zcomp_strm_alloc(comp, GFP_NOIO|__GFP_NORETRY|
+   __GFP_NOWARN|__GFP_NOMEMALLOC);
if (!zstrm) {
spin_lock(&zs->strm_lock);
zs->avail_strm--;
@@ -209,7 +217,7 @@ static int zcomp_strm_multi_create(struct zcomp *comp, int 
max_strm)
zs->max_strm = max_strm;
zs->avail_strm = 1;
 
-   zstrm = zcomp_strm_alloc(comp);
+   zstrm = zcomp_strm_alloc(comp, GFP_KERNEL);
if (!zstrm) {
kfree(zs);
return -ENOMEM;
@@ -259,7 +267,7 @@ static int zcomp_strm_single_create(struct zcomp *comp)
 
comp->stream = zs;
mutex_init(&zs->strm_lock);
-   zs->zstrm = zcomp_strm_alloc(comp);
+   zs->zstrm = zcomp_strm_alloc(comp, GFP_KERNEL);
if (!zs->zstrm) {
kfree(zs);
return -ENOMEM;
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index 46e2b9f8f1f0..b7d2a4bcae54 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -33,7 +33,7 @@ struct zcomp_backend {
int (*decompress)(const unsigned char *src, size_t src_len,
unsigned char *dst);
 
-   void *(*create)(void);
+   void *(*create)(gfp_t flags);
void (*destroy)(void *private);
 
const char *name;
diff --git a/drivers/block/zram/zcomp_lz4.c b/drivers/block/zram/zcomp_lz4.c
index 715df0e48c13..4e0cb4a4acf7 100644
--- a/drivers/block/zram/zcomp_lz4.c
+++ b/drivers/block/zram/zcomp_lz4.c
@@ -15,25 +15,13 @@
 
 #include "zcomp_lz4.h"
 
-static void *zcomp_lz4_create(void)
+static void *zcomp_lz4_create(gfp_t flags)
 {
void *ret;
 
-   /*
-* This function could be called in swapout/fs write path
-* so we couldn't use GFP_FS|IO. And it assumes we already
-* have at least one stream in zram initialization so we
-* don't do best effort to allocate more stream in here.
-* A default stream will work well without further multiple
-* stream. That's why we use  __GFP_NORETRY|NOWARN|NOMEMALLOC.
-*/
-   ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO|__GFP_NORETRY|

RE: [V5 PATCH 2/4] panic/x86: Allow cpus to save registers even if they are looping in NMI context

2015-11-24 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Fri, Nov 20, 2015 at 06:36:46PM +0900, Hidehiro Kawai wrote:
> > nmi_shootdown_cpus(), a subroutine of crash_kexec(), sends NMI IPI
> > to non-panic cpus to stop them while saving their register
> 
>...to stop them and save their register...

Thanks for the correction.

> > information and doing some cleanups for crash dumping.  So if a
> > non-panic cpus is infinitely looping in NMI context, we fail to
> 
> That should be CPU. Please use "CPU" instead of "cpu" in all your text
> in your next submission.

OK, I'll fix that.

> > save its register information and lose the information from the
> > crash dump.
> >
> > `Infinite loop in NMI context' can happen:
> >
> >   a. when a cpu panics on NMI while another cpu is processing panic
> >   b. when a cpu received an external or unknown NMI while another
> >  cpu is processing panic on NMI
> >
> > In the case of a, it loops in panic_smp_self_stop().  In the case
> > of b, it loops in raw_spin_lock() of nmi_reason_lock.
> 
> Please describe those two cases more verbosely - it takes slow people
> like me a while to figure out what exactly can happen.

  a. when a cpu panics on NMI while another cpu is processing panic
 Ex.
 CPU 0 CPU 1
 = =
 panic()
   panic_cpu <-- 0
   check panic_cpu
   crash_kexec()
   receive an unknown NMI
   unknown_nmi_error()
 nmi_panic()
   panic()
 check panic_cpu
 panic_smp_self_stop()
   infinite loop in NMI context

  b. when a cpu received an external or unknown NMI while another
 cpu is processing panic on NMI
 Ex.
 CPU 0 CPU 1
 ====
 receive an unknown NMI
 unknown_nmi_error()
   nmi_panic() receive an unknown NMI
 panic_cpu <-- 0   unknown_nmi_error()
 panic() nmi_panic()
   check panic_cpu panic
   crash_kexec() check panic_cpu
 panic_smp_self_stop()
   infinite loop in NMI context
 
> > This can
> > happen on some servers which broadcasts NMIs to all CPUs when a dump
> > button is pushed.
> >
> > To save registers in these case too, this patch does following things:
> >
> > 1. Move the timing of `infinite loop in NMI context' (actually
> >done by panic_smp_self_stop()) outside of panic() to enable us to
> >refer pt_regs
> 
> I can't parse that sentence. And I really tried :-\
> panic_smp_self_stop() is still in panic().

panic_smp_self_stop() is still used when a CPU in normal context
should go into infinite loop.  Only when a CPU is in NMI context,
nmi_panic_self_stop() is called and the CPU loops infinitely
without entering panic().

I'll try to revise this sentense.

> > 2. call a callback of nmi_shootdown_cpus() directly to save
> >registers and do some cleanups after setting waiting_for_crash_ipi
> >which is used for counting down the number of cpus which handled
> >the callback
> >
> > V5:
> > - Use WRITE_ONCE() when setting crash_ipi_done to 1 so that the
> >   compiler doesn't change the instruction order
> > - Support the case of b in the above description
> > - Add poll_crash_ipi_and_callback()
> >
> > V4:
> > - Rewrite the patch description
> >
> > V3:
> > - Newly introduced
> >
> > Signed-off-by: Hidehiro Kawai 
> > Cc: Andrew Morton 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: Peter Zijlstra 
> > Cc: Eric Biederman 
> > Cc: Vivek Goyal 
> > Cc: Michal Hocko 
> > ---
> >  arch/x86/include/asm/reboot.h |1 +
> >  arch/x86/kernel/nmi.c |   17 +
> >  arch/x86/kernel/reboot.c  |   28 
> >  include/linux/kernel.h|   12 ++--
> >  kernel/panic.c|   10 ++
> >  kernel/watchdog.c |2 +-
> >  6 files changed, 63 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
> > index a82c4f1..964e82f 100644
> > --- a/arch/x86/include/asm/reboot.h
> > +++ b/arch/x86/include/asm/reboot.h
> > @@ -25,5 +25,6 @@ void __noreturn machine_real_restart(unsigned int type);
> >
> >  typedef void (*nmi_shootdown_cb)(int, struct pt_regs*);
> >  void nmi_shootdown_cpus(nmi_shootdown_cb callback);
> > +void poll_crash_ipi_and_callback(struct pt_regs *regs);
> >
> >  #endif /* _ASM_X86_REBOOT_H */
> > diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> > index 5131714..74a1434 100644
> > --- a/arch/x86/kernel/nmi.c
> > +++ b/arch/x86/kernel/nmi.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #defin

[PATCH v2 1/3] zram/zcomp: use GFP_NOIO to allocate streams

2015-11-24 Thread Minchan Kim
From: Sergey Senozhatsky 

We can end up allocating a new compression stream with GFP_KERNEL
from within the IO path, which may result is nested (recursive) IO
operations. That can introduce problems if the IO path in question
is a reclaimer, holding some locks that will deadlock nested IOs.

Allocate streams and working memory using GFP_NOIO flag, forbidding
recursive IO and FS operations.

An example:

[  747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
[  747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  747.233725]  (jbd2_handle){+.+.?.}, at: [] 
start_this_handle+0x4ca/0x555
[  747.233733] {IN-RECLAIM_FS-W} state was registered at:
[  747.233735]   [] __lock_acquire+0x8da/0x117b
[  747.233738]   [] lock_acquire+0x10c/0x1a7
[  747.233740]   [] start_this_handle+0x52d/0x555
[  747.233742]   [] jbd2__journal_start+0xb4/0x237
[  747.233744]   [] __ext4_journal_start_sb+0x108/0x17e
[  747.233748]   [] ext4_dirty_inode+0x32/0x61
[  747.233750]   [] __mark_inode_dirty+0x16b/0x60c
[  747.233754]   [] iput+0x11e/0x274
[  747.233757]   [] __dentry_kill+0x148/0x1b8
[  747.233759]   [] shrink_dentry_list+0x274/0x44a
[  747.233761]   [] prune_dcache_sb+0x4a/0x55
[  747.233763]   [] super_cache_scan+0xfc/0x176
[  747.233767]   [] 
shrink_slab.part.14.constprop.25+0x2a2/0x4d3
[  747.233770]   [] shrink_zone+0x74/0x140
[  747.233772]   [] kswapd+0x6b7/0x930
[  747.233774]   [] kthread+0x107/0x10f
[  747.233778]   [] ret_from_fork+0x3f/0x70
[  747.233783] irq event stamp: 138297
[  747.233784] hardirqs last  enabled at (138297): [] 
debug_check_no_locks_freed+0x113/0x12f
[  747.233786] hardirqs last disabled at (138296): [] 
debug_check_no_locks_freed+0x33/0x12f
[  747.233788] softirqs last  enabled at (137818): [] 
__do_softirq+0x2d3/0x3e9
[  747.233792] softirqs last disabled at (137813): [] 
irq_exit+0x41/0x95
[  747.233794]
   other info that might help us debug this:
[  747.233796]  Possible unsafe locking scenario:
[  747.233797]CPU0
[  747.233798]
[  747.233799]   lock(jbd2_handle);
[  747.233801]   
[  747.233801] lock(jbd2_handle);
[  747.233803]
*** DEADLOCK ***
[  747.233805] 5 locks held by git/20158:
[  747.233806]  #0:  (sb_writers#7){.+.+.+}, at: [] 
mnt_want_write+0x24/0x4b
[  747.233811]  #1:  (&type->i_mutex_dir_key#2/1){+.+.+.}, at: 
[] lock_rename+0xd9/0xe3
[  747.233817]  #2:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: 
[] lock_two_nondirectories+0x3f/0x6b
[  747.233822]  #3:  (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: 
[] lock_two_nondirectories+0x66/0x6b
[  747.233827]  #4:  (jbd2_handle){+.+.?.}, at: [] 
start_this_handle+0x4ca/0x555
[  747.233831]
   stack backtrace:
[  747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 
4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211
[  747.233837]  8800a56cea40 88010d0a75f8 814f446d 
81077036
[  747.233840]  823a84b0 88010d0a7638 814f3849 
0001
[  747.233843]  000a 8800a56cf6f8 8800a56cea40 
810795dd
[  747.233846] Call Trace:
[  747.233849]  [] dump_stack+0x4c/0x6e
[  747.233852]  [] ? up+0x39/0x3e
[  747.233854]  [] print_usage_bug.part.23+0x25b/0x26a
[  747.233857]  [] ? 
print_shortest_lock_dependencies+0x182/0x182
[  747.233859]  [] mark_lock+0x384/0x56d
[  747.233862]  [] mark_held_locks+0x5f/0x76
[  747.233865]  [] ? zcomp_strm_alloc+0x25/0x73 [zram]
[  747.233867]  [] lockdep_trace_alloc+0xb2/0xb5
[  747.233870]  [] kmem_cache_alloc_trace+0x32/0x1e2
[  747.233873]  [] zcomp_strm_alloc+0x25/0x73 [zram]
[  747.233876]  [] zcomp_strm_multi_find+0xe7/0x173 [zram]
[  747.233879]  [] zcomp_strm_find+0xc/0xe [zram]
[  747.233881]  [] zram_bvec_rw+0x2ca/0x7e0 [zram]
[  747.233885]  [] zram_make_request+0x1fa/0x301 [zram]
[  747.233889]  [] generic_make_request+0x9c/0xdb
[  747.233891]  [] submit_bio+0xf7/0x120
[  747.233895]  [] ? __test_set_page_writeback+0x1a0/0x1b8
[  747.233897]  [] ext4_io_submit+0x2e/0x43
[  747.233899]  [] ext4_bio_write_page+0x1b7/0x300
[  747.233902]  [] mpage_submit_page+0x60/0x77
[  747.233905]  [] mpage_map_and_submit_buffers+0x10f/0x21d
[  747.233907]  [] ext4_writepages+0xc8c/0xe1b
[  747.233910]  [] do_writepages+0x23/0x2c
[  747.233913]  [] __filemap_fdatawrite_range+0x84/0x8b
[  747.233915]  [] filemap_flush+0x1c/0x1e
[  747.233917]  [] ext4_alloc_da_blocks+0xb8/0x117
[  747.233919]  [] ext4_rename+0x132/0x6dc
[  747.233921]  [] ? mark_held_locks+0x5f/0x76
[  747.233924]  [] ext4_rename2+0x29/0x2b
[  747.233926]  [] vfs_rename+0x540/0x636
[  747.233928]  [] SyS_renameat2+0x359/0x44d
[  747.233931]  [] SyS_rename+0x1e/0x20
[  747.233933]  [] entry_SYSCALL_64_fastpath+0x12/0x6f

[minchan: add stable mark]
Cc: sta...@vger.kernel.org
Signed-off-by: Sergey Senozhatsky 
---
 drivers/block/zram/zcomp.c | 4 ++--
 drivers/block/zram/zcomp_lz4.c | 2 +-
 drivers/block/zram/zcomp_lzo.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --g

[PATCH v2 2/3] zram: try vmalloc() after kmalloc()

2015-11-24 Thread Minchan Kim
From: Kyeongdon Kim 

When we're using LZ4 multi compression streams for zram swap,
we found out page allocation failure message in system running test.
That was not only once, but a few(2 - 5 times per test).
Also, some failure cases were continually occurring to try allocation
order 3.

In order to make parallel compression private data, we should call
kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
 2/3 size memory to allocate in that time, page allocation fails.
This patch makes to use vmalloc() as fallback of kmalloc(), this
prevents page alloc failure warning.

After using this, we never found warning message in running test, also
It could reduce process startup latency about 60-120ms in each case.

For reference a call trace :

Binder_1: page allocation failure: order:3, mode:0x10c0d0
CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
Call trace:
[] dump_backtrace+0x0/0x270
[] show_stack+0x10/0x1c
[] dump_stack+0x1c/0x28
[] warn_alloc_failed+0xfc/0x11c
[] __alloc_pages_nodemask+0x724/0x7f0
[] __get_free_pages+0x14/0x5c
[] kmalloc_order_trace+0x38/0xd8
[] zcomp_lz4_create+0x2c/0x38
[] zcomp_strm_alloc+0x34/0x78
[] zcomp_strm_multi_find+0x124/0x1ec
[] zcomp_strm_find+0xc/0x18
[] zram_bvec_rw+0x2fc/0x780
[] zram_make_request+0x25c/0x2d4
[] generic_make_request+0x80/0xbc
[] submit_bio+0xa4/0x15c
[] __swap_writepage+0x218/0x230
[] swap_writepage+0x3c/0x4c
[] shrink_page_list+0x51c/0x8d0
[] shrink_inactive_list+0x3f8/0x60c
[] shrink_lruvec+0x33c/0x4cc
[] shrink_zone+0x3c/0x100
[] try_to_free_pages+0x2b8/0x54c
[] __alloc_pages_nodemask+0x514/0x7f0
[] __get_free_pages+0x14/0x5c
[] proc_info_read+0x50/0xe4
[] vfs_read+0xa0/0x12c
[] SyS_read+0x44/0x74
DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB

[minchan: change vmalloc gfp and adding comment about gfp]
Signed-off-by: Kyeongdon Kim 
Signed-off-by: Minchan Kim 
---
 drivers/block/zram/zcomp_lz4.c | 23 +--
 drivers/block/zram/zcomp_lzo.c | 22 --
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/drivers/block/zram/zcomp_lz4.c b/drivers/block/zram/zcomp_lz4.c
index ee44b51130a4..715df0e48c13 100644
--- a/drivers/block/zram/zcomp_lz4.c
+++ b/drivers/block/zram/zcomp_lz4.c
@@ -10,17 +10,36 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "zcomp_lz4.h"
 
 static void *zcomp_lz4_create(void)
 {
-   return kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO);
+   void *ret;
+
+   /*
+* This function could be called in swapout/fs write path
+* so we couldn't use GFP_FS|IO. And it assumes we already
+* have at least one stream in zram initialization so we
+* don't do best effort to allocate more stream in here.
+* A default stream will work well without further multiple
+* stream. That's why we use  __GFP_NORETRY|NOWARN|NOMEMALLOC.
+*/
+   ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO|__GFP_NORETRY|
+   __GFP_NOWARN|__GFP_NOMEMALLOC);
+   if (!ret)
+   ret = __vmalloc(LZ4_MEM_COMPRESS, GFP_NOIO|__GFP_NORETRY|
+   __GFP_NOWARN|__GFP_NOMEMALLOC|
+   __GFP_ZERO,
+   PAGE_KERNEL);
+   return ret;
 }
 
 static void zcomp_lz4_destroy(void *private)
 {
-   kfree(private);
+   kvfree(private);
 }
 
 static int zcomp_lz4_compress(const unsigned char *src, unsigned char *dst,
diff --git a/drivers/block/zram/zcomp_lzo.c b/drivers/block/zram/zcomp_lzo.c
index 683ce049e070..639b94affbfd 100644
--- a/drivers/block/zram/zcomp_lzo.c
+++ b/drivers/block/zram/zcomp_lzo.c
@@ -10,17 +10,35 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "zcomp_lzo.h"
 
 static void *lzo_create(void)
 {
-   return kzalloc(LZO1X_MEM_COMPRESS, GFP_NOIO);
+   void *ret;
+   /*
+* This function could be called in swapout/fs write path
+* so we couldn't use GFP_FS|IO. And it assumes we already
+* have at least one stream in zram initialization so we
+* don't do best effort to allocate more stream in here.
+* A default stream will work well without further multiple
+* stream. That's why we use  __GFP_NORETRY|NOWARN|NOMEMALLOC.
+*/
+   ret = kzalloc(LZO1X_MEM_COMPRESS, GFP_NOIO|__GFP_NORETRY|
+   __GFP_NOWARN|__GFP_NOMEMALLOC);
+   if (!ret)
+   ret = __vmalloc(LZO1X_MEM_COMPRESS, GFP_NOIO|__GFP_NORETRY|
+   __GFP_NOWARN|__GFP_NOMEMALLOC|
+   __GFP_ZERO,
+   PAGE_KERNEL);
+   return ret;
 }
 
 static void lzo_destroy(void *private)
 {
-   kfree(private);
+   kvfree(private);
 }
 
 static int lzo

[PATCH v5 3/4] Crypto: rockchip/crypto - add crypto driver for rk3288

2015-11-24 Thread Zain Wang
Crypto driver support:
 ecb(aes) cbc(aes) ecb(des) cbc(des) ecb(des3_ede) cbc(des3_ede)
You can alloc tags above in your case.

And other algorithms and platforms will be added later on.

Signed-off-by: Zain Wang 
Tested-by: Heiko Stuebner 
---
Changed in v5:
- copy IV back after operation
- use cra_block_size to tell AES from DES instaed flag AES/TDES

Changed in v4:
- modify irq function
- add devm_add_action in probe
- fix some minor mistakes

Changed in v3:
- add OF depended in Kconfig
- rename some variate
- add reset property
- remove crypto_p variate

Changed in v2:
- remove some part about hash
- add weak key detection
- changed some variate's type

Changed in v1:
- modify some variate's name
- modify some variate's type
- modify some return value
- remove or modify some print info
- use more dev_xxx in probe
- modify the prio of cipher
- add Kconfig

 drivers/crypto/Kconfig |  11 +
 drivers/crypto/Makefile|   1 +
 drivers/crypto/rockchip/Makefile   |   3 +
 drivers/crypto/rockchip/rk3288_crypto.c| 393 
 drivers/crypto/rockchip/rk3288_crypto.h| 216 +
 drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c | 503 +
 6 files changed, 1127 insertions(+)
 create mode 100644 drivers/crypto/rockchip/Makefile
 create mode 100644 drivers/crypto/rockchip/rk3288_crypto.c
 create mode 100644 drivers/crypto/rockchip/rk3288_crypto.h
 create mode 100644 drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 5357bc1..95dccde 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -497,4 +497,15 @@ config CRYPTO_DEV_SUN4I_SS
  To compile this driver as a module, choose M here: the module
  will be called sun4i-ss.
 
+config CRYPTO_DEV_ROCKCHIP
+   tristate "Rockchip's Cryptographic Engine driver"
+   depends on OF && ARCH_ROCKCHIP
+   select CRYPTO_AES
+   select CRYPTO_DES
+   select CRYPTO_BLKCIPHER
+
+   help
+ This driver interfaces with the hardware crypto accelerator.
+ Supporting cbc/ecb chainmode, and aes/des/des3_ede cipher mode.
+
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index c3ced6f..713de9d 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_CRYPTO_DEV_QAT) += qat/
 obj-$(CONFIG_CRYPTO_DEV_QCE) += qce/
 obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
 obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
+obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
diff --git a/drivers/crypto/rockchip/Makefile b/drivers/crypto/rockchip/Makefile
new file mode 100644
index 000..7051c6c
--- /dev/null
+++ b/drivers/crypto/rockchip/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rk_crypto.o
+rk_crypto-objs := rk3288_crypto.o \
+ rk3288_crypto_ablkcipher.o \
diff --git a/drivers/crypto/rockchip/rk3288_crypto.c 
b/drivers/crypto/rockchip/rk3288_crypto.c
new file mode 100644
index 000..6b72f8d
--- /dev/null
+++ b/drivers/crypto/rockchip/rk3288_crypto.c
@@ -0,0 +1,393 @@
+/*
+ * Crypto acceleration support for Rockchip RK3288
+ *
+ * Copyright (c) 2015, Fuzhou Rockchip Electronics Co., Ltd
+ *
+ * Author: Zain Wang 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * Some ideas are from marvell-cesa.c and s5p-sss.c driver.
+ */
+
+#include "rk3288_crypto.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int rk_crypto_enable_clk(struct rk_crypto_info *dev)
+{
+   int err;
+
+   err = clk_prepare_enable(dev->sclk);
+   if (err) {
+   dev_err(dev->dev, "[%s:%d], Couldn't enable clock sclk\n",
+   __func__, __LINE__);
+   goto err_return;
+   }
+   err = clk_prepare_enable(dev->aclk);
+   if (err) {
+   dev_err(dev->dev, "[%s:%d], Couldn't enable clock aclk\n",
+   __func__, __LINE__);
+   goto err_aclk;
+   }
+   err = clk_prepare_enable(dev->hclk);
+   if (err) {
+   dev_err(dev->dev, "[%s:%d], Couldn't enable clock hclk\n",
+   __func__, __LINE__);
+   goto err_hclk;
+   }
+   err = clk_prepare_enable(dev->dmaclk);
+   if (err) {
+   dev_err(dev->dev, "[%s:%d], Couldn't enable clock dmaclk\n",
+   __func__, __LINE__);
+   goto err_dmaclk;
+   }
+   return err;
+err_dmaclk:
+   clk_disable_unprepare(dev->hclk);
+err_hclk:
+   clk_disable_unprepare(dev->aclk);
+err_aclk:
+   clk_disable_unprepare(dev->sclk);
+err_return:
+   return err;
+}
+
+static void rk_crypto_disable_clk(struct rk_crypto_info *dev)
+{
+   clk_d

[PATCH v5 2/4] clk: rockchip: set an ID for crypto clk

2015-11-24 Thread Zain Wang
Set an ID for crypto clk, so that it can be called in other part.

Signed-off-by: Zain Wang 
Acked-by: Michael Turquette 
Tested-by: Heiko Stuebner 
---
Changed in v5:
- None
Changed in v4:
- None
Changed in v3:
- None
Changed in v2: 
- None
Changed in v1:
- define SCLK_CRYPTO in rk3288-cru.h
- use SCLK_CRYPTO instead of SRST_CRYPTO

 drivers/clk/rockchip/clk-rk3288.c  | 2 +-
 include/dt-bindings/clock/rk3288-cru.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/rockchip/clk-rk3288.c 
b/drivers/clk/rockchip/clk-rk3288.c
index 9040878..3fceda1 100644
--- a/drivers/clk/rockchip/clk-rk3288.c
+++ b/drivers/clk/rockchip/clk-rk3288.c
@@ -295,7 +295,7 @@ static struct rockchip_clk_branch rk3288_clk_branches[] 
__initdata = {
RK3288_CLKGATE_CON(0), 4, GFLAGS),
GATE(0, "c2c_host", "aclk_cpu_src", 0,
RK3288_CLKGATE_CON(13), 8, GFLAGS),
-   COMPOSITE_NOMUX(0, "crypto", "aclk_cpu_pre", 0,
+   COMPOSITE_NOMUX(SCLK_CRYPTO, "crypto", "aclk_cpu_pre", 0,
RK3288_CLKSEL_CON(26), 6, 2, DFLAGS,
RK3288_CLKGATE_CON(5), 4, GFLAGS),
GATE(0, "aclk_bus_2pmu", "aclk_cpu_pre", CLK_IGNORE_UNUSED,
diff --git a/include/dt-bindings/clock/rk3288-cru.h 
b/include/dt-bindings/clock/rk3288-cru.h
index c719aac..30dcd60 100644
--- a/include/dt-bindings/clock/rk3288-cru.h
+++ b/include/dt-bindings/clock/rk3288-cru.h
@@ -86,6 +86,7 @@
 #define SCLK_USBPHY480M_SRC122
 #define SCLK_PVTM_CORE 123
 #define SCLK_PVTM_GPU  124
+#define SCLK_CRYPTO125
 
 #define SCLK_MAC   151
 #define SCLK_MACREF_OUT152
-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v5 1/4] crypto: rockchip/crypto - add DT bindings documentation

2015-11-24 Thread Zain Wang
Add DT bindings documentation for the rk3288 crypto drivers.

Signed-off-by: Zain Wang 
Acked-by: Rob Herring 
Tested-by: Heiko Stuebner 
---
Changed in v5:
- None

Changed in v4:
- None

Changed in v3:
- add reset property

Changed in v2:
- None

Changed in v1:
- remove the _crypto suffix
- use "rockchip,rk3288-crypto" instead of "rockchip,rk3288"
- remove the description of status

 .../devicetree/bindings/crypto/rockchip-crypto.txt | 29 ++
 1 file changed, 29 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/rockchip-crypto.txt

diff --git a/Documentation/devicetree/bindings/crypto/rockchip-crypto.txt 
b/Documentation/devicetree/bindings/crypto/rockchip-crypto.txt
new file mode 100644
index 000..096df34
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/rockchip-crypto.txt
@@ -0,0 +1,29 @@
+Rockchip Electronics And Security Accelerator
+
+Required properties:
+- compatible: Should be "rockchip,rk3288-crypto"
+- reg: Base physical address of the engine and length of memory mapped
+   region
+- interrupts: Interrupt number
+- clocks: Reference to the clocks about crypto
+- clock-names: "aclk" used to clock data
+  "hclk" used to clock data
+  "sclk" used to clock crypto accelerator
+  "apb_pclk" used to clock dma
+- resets: Must contain an entry for each entry in reset-names.
+ See ../reset/reset.txt for details.
+- reset-names: Must include the name "crypto-rst".
+
+Examples:
+
+   crypto: cypto-controller@ff8a {
+   compatible = "rockchip,rk3288-crypto";
+   reg = <0xff8a 0x4000>;
+   interrupts = ;
+   clocks = <&cru ACLK_CRYPTO>, <&cru HCLK_CRYPTO>,
+<&cru SCLK_CRYPTO>, <&cru ACLK_DMAC1>;
+   clock-names = "aclk", "hclk", "sclk", "apb_pclk";
+   resets = <&cru SRST_CRYPTO>;
+   reset-names = "crypto-rst";
+   status = "okay";
+   };
-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v5 0/4] crypto: add crypto accelerator support for rk3288

2015-11-24 Thread Zain Wang
Changed in v5:
- copy IV back after operation
- use cra_block_size to tell AES from DES instaed flag AES/TDES

Changed in v4:
- modify irq function
- add devm_add_action in probe
- fix some minor mistakes

Changed in v3:
- add OF depended in Kconfig
- rename some variate
- add reset property
- remove crypto_p variate

Changed in v2:
- remove some part about hash
- add weak key detection
- changed some variate's type

Changed in v1:
- modify some variate's name
- modify some variate's type
- modify some return value
- remove or modify some print info
- use more dev_xxx in probe
- modify the prio of cipher
- add Kconfig

Zain Wang (4):
  crypto: rockchip/crypto - add DT bindings documentation
  clk: rockchip: set an ID for crypto clk
  Crypto: rockchip/crypto - add crypto driver for rk3288
  ARM: dts: rockchip: Add Crypto node for rk3288

 .../devicetree/bindings/crypto/rockchip-crypto.txt |  29 ++
 arch/arm/boot/dts/rk3288.dtsi  |  12 +
 drivers/clk/rockchip/clk-rk3288.c  |   2 +-
 drivers/crypto/Kconfig |  11 +
 drivers/crypto/Makefile|   1 +
 drivers/crypto/rockchip/Makefile   |   3 +
 drivers/crypto/rockchip/rk3288_crypto.c| 393 
 drivers/crypto/rockchip/rk3288_crypto.h| 216 +
 drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c | 503 +
 include/dt-bindings/clock/rk3288-cru.h |   1 +
 10 files changed, 1170 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/crypto/rockchip-crypto.txt
 create mode 100644 drivers/crypto/rockchip/Makefile
 create mode 100644 drivers/crypto/rockchip/rk3288_crypto.c
 create mode 100644 drivers/crypto/rockchip/rk3288_crypto.h
 create mode 100644 drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c

-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v5 4/4] ARM: dts: rockchip: Add Crypto node for rk3288

2015-11-24 Thread Zain Wang
Add Crypto node for rk3288 including crypto controller and dma clk.

Signed-off-by: Zain Wang 
Tested-by: Heiko Stuebner 
---
Changed in v5:
- None

Changed in v4:
- None

Changed in v3:
- add reset property

Changed in v2:
- None

Changed in v1:
- remove the _crypto suffix
- use "rockchip,rk3288-crypto" instead of "rockchip,rk3288"

 arch/arm/boot/dts/rk3288.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index ad44d80..c6b1aa4 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -781,6 +781,18 @@
status = "disabled";
};
 
+   crypto: cypto-controller@ff8a {
+   compatible = "rockchip,rk3288-crypto";
+   reg = <0xff8a 0x4000>;
+   interrupts = ;
+   clocks = <&cru ACLK_CRYPTO>, <&cru HCLK_CRYPTO>,
+<&cru SCLK_CRYPTO>, <&cru ACLK_DMAC1>;
+   clock-names = "aclk", "hclk", "sclk", "apb_pclk";
+   resets = <&cru SRST_CRYPTO>;
+   reset-names = "crypto-rst";
+   status = "okay";
+   };
+
vopb: vop@ff93 {
compatible = "rockchip,rk3288-vop";
reg = <0xff93 0x19c>;
-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] arm64: restore bogomips information in /proc/cpuinfo

2015-11-24 Thread Jon Masters

On 11/18/15, 1:15 PM, Yang Shi wrote:


As what Pavel Machek reported [1], some userspace applications depend on
bogomips showed by /proc/cpuinfo.

Although there is much less legacy impact on aarch64 than arm, but it does
break libvirt.

Basically, this patch reverts commit 326b16db9f69fd0d279be873c6c00f88c0a4aad5
("arm64: delay: don't bother reporting bogomips in /proc/cpuinfo"), but with
some tweak due to context change.


On a total tangent, it would be ideal to (eventually) have something 
reported in /proc/cpuinfo or dmesg during boot that does "accurately" 
map back to the underlying core frequency (as opposed to the generic 
timer frequency). I have seen almost countless silly situations in the 
industry (external to my own organization) in which someone has taken a 
$VENDOR_X reference system that they're not supposed to run benchmarks 
on, and they've done it anyway. But usually on some silicon that's 
clocked multiples under what production would be. Then silly rumors 
about performance get around because nobody can do simple arithmetic and 
notice that they ought to have at least divided by some factor.


Whether we do this through one of the platform tables or otherwise 
(multiple vendor EFI firmwares are being modified to make this much more 
glaringly obvious in the GUI view of system configuration so that when 
they do things they shouldn't, it's at least in the output) we should 
ultimately make sure that idiots at least have a fighting chance of 
noticing that they're actually running at 1GHz, and not 2 or 3GHz.


Jon.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

2015-11-24 Thread Junxiao Bi
On 11/25/2015 01:04 PM, Gang He wrote:
> Hi Mark and Junxiao,
> 
> 

>> Hi Mark,
>>
>> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>>> Hi Junxiao,
>>>
>>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
 Hi Gang,

 This is not like a right patch.
 First, online file check only checks inode's block number, valid flag,
 fs generation value, and meta ecc. I never see a real corruption
 happened only on this field, if these fields are corrupted, that means
 something bad may happen on other place. So fix this field may not help
 and even cause corruption more hard.
>>>
>>> I agree that these are rather uncommon, we might even consider removing the
>>> VALID_FL fixup. I definitely don't think we're ready for anything more
>>> complicated than this though either. We kind of have to start somewhere too.
>>>
>> Yes, the fix is too simple, and just a start, I think we'd better wait
>> more useful parts done before merging it.
> I agree, just remark VALID_FL flag to fix this field is too simple, we should 
> delay this field fix before 
> I have a flawless solution, I will remove these lines code in the first 
> version patches. In the future submits,
> I also hope your guys to help review the code carefully, shout out your 
> comments when you doubt somewhere.
Sure.

> 
> 
> 
>>>
 Second, the repair way is wrong. In
 ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
 match the ones in memory, the ones in memory are used to update the disk
 fields. The question is how do you know these field in memory are
 right(they may be the real corrupted ones)?
>>>
>>> Your second point (and the last part of your 1st point) makes a good
>>> argument for why this shouldn't happen automatically. Some of these
>>> corruptions might require a human to look at the log and decide what to do.
>>> Especially as you point out, where we might not know where the source of the
>>> corruption is. And if the human can't figure it out, then it's probably time
>>> to unmount and fsck.
>> The point is that the fix way is wrong, just flush memory info to disk
>> is not right. I agree online fsck is good feature, but need carefully
>> design, it should not involve more corruptions. A rough idea from mine
>> is that maybe we need some "frezee" mechanism in fs, which can hung all
>> fs op and let fs stop at a safe area. After freeze fs, we can do some
>> fsck work on it and these works should not cost lots time. What's your idea?
> If we need to touch some global data structures, freezing fs can be 
> considered when we can't
> get any way in case using the locks.
> If we only handle some independent problem, we just need to lock the related 
> data structures. 
Hmm, I am not sure whether it's hard to decide an independent issue.

Thanks,
Junxiao.
> 
>>
>> Thanks,
>> Junxiao.
>>
>>>
>>> Thanks,
>>> --Mark
>>>
>>> --
>>> Mark Fasheh
>>>
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: use-after-free in sock_wake_async

2015-11-24 Thread Eric Dumazet
On Tue, 2015-11-24 at 18:28 -0800, Eric Dumazet wrote:
> Dmitry, could you test following patch with your setup ?
> 
> ( I tried to reproduce the error you reported but could not )
> 
> Inode can be freed (without RCU grace period), but not the socket or
> sk_wq
> 
> By using sk_wq in the critical paths, we do not dereference the inode,
> 
> 

I finally was able to reproduce the warning (with more instances running
in parallel), and apparently this patch solves the problem.

> 
> Thanks !
> 
>  include/linux/net.h |2 +-
>  include/net/sock.h  |8 ++--
>  net/core/stream.c   |2 +-
>  net/sctp/socket.c   |6 +-
>  net/socket.c|   16 +---
>  5 files changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/net.h b/include/linux/net.h
> index 70ac5e28e6b7..6b93ec234ce8 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -202,7 +202,7 @@ enum {
>   SOCK_WAKE_URG,
>  };
>  
> -int sock_wake_async(struct socket *sk, int how, int band);
> +int sock_wake_async(struct socket *sock, struct socket_wq *wq, int how, int 
> band);
>  int sock_register(const struct net_proto_family *fam);
>  void sock_unregister(int family);
>  int __sock_create(struct net *net, int family, int type, int proto,
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 7f89e4ba18d1..af78f9e7a218 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2007,8 +2007,12 @@ static inline unsigned long sock_wspace(struct sock 
> *sk)
>  
>  static inline void sk_wake_async(struct sock *sk, int how, int band)
>  {
> - if (sock_flag(sk, SOCK_FASYNC))
> - sock_wake_async(sk->sk_socket, how, band);
> + if (sock_flag(sk, SOCK_FASYNC)) {
> + rcu_read_lock();
> + sock_wake_async(sk->sk_socket, rcu_dereference(sk->sk_wq),
> + how, band);
> + rcu_read_unlock();
> + }
>  }
>  
>  /* Since sk_{r,w}mem_alloc sums skb->truesize, even a small frame might
> diff --git a/net/core/stream.c b/net/core/stream.c
> index d70f77a0c889..92682228919d 100644
> --- a/net/core/stream.c
> +++ b/net/core/stream.c
> @@ -39,7 +39,7 @@ void sk_stream_write_space(struct sock *sk)
>   wake_up_interruptible_poll(&wq->wait, POLLOUT |
>   POLLWRNORM | POLLWRBAND);
>   if (wq && wq->fasync_list && !(sk->sk_shutdown & SEND_SHUTDOWN))
> - sock_wake_async(sock, SOCK_WAKE_SPACE, POLL_OUT);
> + sock_wake_async(sock, wq, SOCK_WAKE_SPACE, POLL_OUT);
>   rcu_read_unlock();
>   }
>  }
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 897c01c029ca..6ab04866a1e7 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -6817,9 +6817,13 @@ static void __sctp_write_space(struct sctp_association 
> *asoc)
>* here by modeling from the current TCP/UDP code.
>* We have not tested with it yet.
>*/
> - if (!(sk->sk_shutdown & SEND_SHUTDOWN))
> + if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
> + rcu_read_lock();
>   sock_wake_async(sock,
> + rcu_dereference(sk->sk_wq),
>   SOCK_WAKE_SPACE, POLL_OUT);
> + rcu_read_unlock();
> + }
>   }
>   }
>  }
> diff --git a/net/socket.c b/net/socket.c
> index dd2c247c99e3..8df62c8bef90 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1058,18 +1058,12 @@ static int sock_fasync(int fd, struct file *filp, int 
> on)
>  
>  /* This function may be called only under socket lock or callback_lock or 
> rcu_lock */
>  
> -int sock_wake_async(struct socket *sock, int how, int band)
> +int sock_wake_async(struct socket *sock, struct socket_wq *wq,
> + int how, int band)
>  {
> - struct socket_wq *wq;
> -
> - if (!sock)
> - return -1;
> - rcu_read_lock();
> - wq = rcu_dereference(sock->wq);
> - if (!wq || !wq->fasync_list) {
> - rcu_read_unlock();
> + if (!sock || !wq || !wq->fasync_list)
>   return -1;
> - }
> +
>   switch (how) {
>   case SOCK_WAKE_WAITD:
>   if (test_bit(SOCK_ASYNC_WAITDATA, &sock->flags))
> @@ -1086,7 +1080,7 @@ call_kill:
>   case SOCK_WAKE_URG:
>   kill_fasync(&wq->fasync_list, SIGURG, band);
>   }
> - rcu_read_unlock();
> +
>   return 0;
>  }
>  EXPORT_SYMBOL(sock_wake_async);
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] thermal: rcar: add .set_trip_temp support

2015-11-24 Thread Kuninori Morimoto

From: Kuninori Morimoto 

You can set trip temp if your kernel has CONFIG_THERMAL_WRITABLE_TRIPS

echo $temp > /sys/class/thermal/thermal_zone0/trip_point_0_temp

-45000 < $temp < 125000 is supported
Default is 9

Signed-off-by: Kuninori Morimoto 
---
  This patch is v2 of "[PATCH] thermal: rcar: enable to set tripN-temp via DT"
  I think it will be full-DT feature if it uses of-thermal, but this driver is 
used
  from non-DT SoC too. We would like to keep non-DT support.
  And we would like to do is only exchange trip temp.
  .set_trip_temp is very enouth for it at this point.
  But, it can use of-thermal feature in the future.

 drivers/thermal/rcar_thermal.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
index 5d4ae7d..1eaa1be 100644
--- a/drivers/thermal/rcar_thermal.c
+++ b/drivers/thermal/rcar_thermal.c
@@ -63,6 +63,7 @@ struct rcar_thermal_priv {
struct mutex lock;
struct list_head list;
int id;
+   int trip_temp;
u32 ctemp;
 };
 
@@ -222,7 +223,7 @@ static int rcar_thermal_get_trip_type(struct 
thermal_zone_device *zone,
 
/* see rcar_thermal_get_temp() */
switch (trip) {
-   case 0: /* +90 <= temp */
+   case 0:
*type = THERMAL_TRIP_CRITICAL;
break;
default:
@@ -241,8 +242,8 @@ static int rcar_thermal_get_trip_temp(struct 
thermal_zone_device *zone,
 
/* see rcar_thermal_get_temp() */
switch (trip) {
-   case 0: /* +90 <= temp */
-   *temp = MCELSIUS(90);
+   case 0:
+   *temp = priv->trip_temp;
break;
default:
dev_err(dev, "rcar driver trip error\n");
@@ -270,10 +271,27 @@ static int rcar_thermal_notify(struct thermal_zone_device 
*zone,
return 0;
 }
 
+static int rcar_thermal_set_trip_temp(struct thermal_zone_device *zone,
+   int trip, int temp)
+{
+   struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
+
+   if (trip != 0)
+   return -EINVAL;
+
+   if (temp < -45000 || temp > 125000)
+   return -EINVAL;
+
+   priv->trip_temp = temp;
+
+   return 0;
+}
+
 static struct thermal_zone_device_ops rcar_thermal_zone_ops = {
.get_temp   = rcar_thermal_get_temp,
.get_trip_type  = rcar_thermal_get_trip_type,
.get_trip_temp  = rcar_thermal_get_trip_temp,
+   .set_trip_temp  = rcar_thermal_set_trip_temp,
.notify = rcar_thermal_notify,
 };
 
@@ -418,13 +436,14 @@ static int rcar_thermal_probe(struct platform_device 
*pdev)
 
priv->common = common;
priv->id = i;
+   priv->trip_temp = MCELSIUS(90); /* default*/
mutex_init(&priv->lock);
INIT_LIST_HEAD(&priv->list);
INIT_DELAYED_WORK(&priv->work, rcar_thermal_work);
rcar_thermal_update_temp(priv);
 
priv->zone = thermal_zone_device_register("rcar_thermal",
-   1, 0, priv,
+   1, 1, priv,
&rcar_thermal_zone_ops, NULL, 0,
idle);
if (IS_ERR(priv->zone)) {
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v9] PCI: Xilinx-NWL-PCIe: Added support for Xilinx NWL PCIe Host Controller

2015-11-24 Thread Bharat Kumar Gogada
> On Thu, 19 Nov 2015 11:05:23 +0530
> Bharat Kumar Gogada  wrote:
> 
> > Adding PCIe Root Port driver for Xilinx PCIe NWL bridge IP.
> >
> > Signed-off-by: Bharat Kumar Gogada 
> > Signed-off-by: Ravi Kiran Gummaluri 
> > Acked-by: Rob Herring 
> > ---
> > +
> > +#define MSI_ADDRESS0xDEED
> 
> How did you pick this value? What if it intersect with some actual RAM?
> What if a device actually does DMA to that location?
> 
> Wouldn't it make sense to actually pick a real *device* address (hint:
> your MSI controller itself) for this purpose, as the device will never DMA
> there?
>
> 
We have already mentioned in previous patch discussion, we don't have any 
device address on our SOC for MSI, that's 
 the reason we are allocating a page for MSI in RAM. Since our memory write is 
consumed by bridge and doesn't write to memory, you suggested to use 
some random address,  so using some random address.
> 
> 
> > +
> > +static int nwl_irq_domain_alloc(struct irq_domain *domain, unsigned int
> virq,
> > +   unsigned int nr_irqs, void *args) {
> > +   struct nwl_pcie *pcie = domain->host_data;
> > +   struct nwl_msi *msi = &pcie->msi;
> > +   int bit;
> > +   int i;
> > +   int ret;
> > +
> > +   mutex_lock(&msi->lock);
> > +   if (nr_irqs > 1) {
> > +   ret = nwl_check_hwirq(msi, nr_irqs);
> > +   if (ret < 0) {
> > +   mutex_unlock(&msi->lock);
> > +   return ret;
> > +   }
> > +   } else {
> > +   ret = find_first_zero_bit(msi->used, INT_PCI_MSI_NR);
> > +   if (ret == INT_PCI_MSI_NR) {
> > +   mutex_unlock(&msi->lock);
> > +   return -ENOSPC;
> > +   }
> > +   }
> 
> Let's be serious for a minute. What's wrong with
> bitmap_find_next_zero_area, for example?
Ok, will explore this API and do accordingly, and address in next patch.
> 
> > +
> > +   for (i = 0; i < nr_irqs; i++) {
> > +   bit = ret + i;
> > +   set_bit(bit, msi->used);
> > +
> > +   irq_domain_set_info(domain, virq + i, bit, &nwl_irq_chip,
> > +   domain->host_data, handle_simple_irq,
> > +   NULL, NULL);
> > +   }
> > +   mutex_unlock(&msi->lock);
> > +
> > +   return 0;
> > +}
> 
> Thanks,
> 
>   M.
> --
> Jazz is not dead. It just smells funny.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH V2 3/3] Ixgbevf: Add migration support for ixgbevf driver

2015-11-24 Thread Alexander Duyck
On Tue, Nov 24, 2015 at 1:20 PM, Michael S. Tsirkin  wrote:
> On Tue, Nov 24, 2015 at 09:38:18PM +0800, Lan Tianyu wrote:
>> This patch is to add migration support for ixgbevf driver. Using
>> faked PCI migration capability table communicates with Qemu to
>> share migration status and mailbox irq vector index.
>>
>> Qemu will notify VF via sending MSIX msg to trigger mailbox
>> vector during migration and store migration status in the
>> PCI_VF_MIGRATION_VMM_STATUS regs in the new capability table.
>> The mailbox irq will be triggered just befoe stop-and-copy stage
>> and after migration on the target machine.
>>
>> VF driver will put down net when detect migration and tell
>> Qemu it's ready for migration via writing PCI_VF_MIGRATION_VF_STATUS
>> reg. After migration, put up net again.
>>
>> Qemu will in charge of migrating PCI config space regs and MSIX config.
>>
>> The patch is to dedicate on the normal case that net traffic works
>> when mailbox irq is enabled. For other cases(such as the driver
>> isn't loaded, adapter is suspended or closed), mailbox irq won't be
>> triggered and VF driver will disable it via PCI_VF_MIGRATION_CAP
>> reg. These case will be resolved later.
>>
>> Signed-off-by: Lan Tianyu 
>
> I have to say, I was much more interested in the idea
> of tracking dirty memory. I have some thoughts about
> that one - did you give up on it then?

The tracking of dirty pages still needs to be addressed unless the
interface is being downed before migration even starts which based on
other comments I am assuming is not the case.

I still feel that having a means of marking a page as being dirty when
it is unmapped would be the best way to go.  That way you only have to
update the DMA API instead of messing with each and every driver
trying to add code to force the page to be dirtied.

- Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ARC: dw2 unwind: Remove falllback linear search thru FDE entries

2015-11-24 Thread Vineet Gupta
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"

| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/142/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:

in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
search (which iterates thru each of ~11K entries) thus takes 2 orders of
magnitude longer (~3 million cycles vs. 2000). Routines written in hand
assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
yet) fail the unwinder binary lookup, hit linear search, failing
nevertheless in the end.

However the linear search is pointless as binary lookup tables are created
from it in first place. It is impossible to have binary lookup fail while
succeed the linear search. It is pure waste of cycles thus removed by
this patch.

This manifested as RCU stalls / NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was perf counter overflowing in routine lacking dwarf info (like memset)
leading to patheic 3 million cycle unwinder slow path and by the time it
returned new interrupts were already pending (Timer, IPI) and taken
rightaway. The original memset didn't make forward progress, system kept
accruing more interrupts and more unwinder delayes in a vicious feedback
loop, ultimately triggering the NMI diagnostic.

Cc: sta...@vger.kernel.org
Signed-off-by: Vineet Gupta 
---
 arch/arc/kernel/unwind.c | 37 -
 1 file changed, 4 insertions(+), 33 deletions(-)

diff --git a/arch/arc/kernel/unwind.c b/arch/arc/kernel/unwind.c
index 93c6ea52b671..7352475451f6 100644
--- a/arch/arc/kernel/unwind.c
+++ b/arch/arc/kernel/unwind.c
@@ -986,42 +986,13 @@ int arc_unwind(struct unwind_frame_info *frame)
(const u8 *)(fde +
 1) +
*fde, ptrType);
-   if (pc >= endLoc)
+   if (pc >= endLoc) {
fde = NULL;
-   } else
-   fde = NULL;
-   }
-   if (fde == NULL) {
-   for (fde = table->address, tableSize = table->size;
-cie = NULL, tableSize > sizeof(*fde)
-&& tableSize - sizeof(*fde) >= *fde;
-tableSize -= sizeof(*fde) + *fde,
-fde += 1 + *fde / sizeof(*fde)) {
-   cie = cie_for_fde(fde, table);
-   if (cie == &bad_cie) {
cie = NULL;
-   break;
}
-   if (cie == NULL
-   || cie == ¬_fde
-   || (ptrType = fde_pointer_type(cie)) < 0)
-   continue;
-   ptr = (const u8 *)(fde + 2);
-   startLoc = read_pointer(&ptr,
-   (const u8 *)(fde + 1) +
-   *fde, ptrType);
-   if (!startLoc)
-   continue;
-   if (!(ptrType & DW_EH_PE_indirect))
-   ptrType &=
-   DW_EH_PE_FORM | DW_EH_PE_signed;
-   endLoc =
-   startLoc + read_pointer(&ptr,
-   (const u8 *)(fde +
-1) +
-   *fde, ptrType);
-   if (pc >= startLoc && pc < endLoc)
-   break;
+   } else {
+   fde = NULL;
+   cie = NULL;
}
}
}
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] mm/cma: always check which page cause allocation failure

2015-11-24 Thread Joonsoo Kim
Now, we have tracepoint in test_pages_isolated() to notify
pfn which cannot be isolated. But, in alloc_contig_range(),
some error path doesn't call test_pages_isolated() so it's still
hard to know exact pfn that causes allocation failure.

This patch change this situation by calling test_pages_isolated()
in almost error path. In allocation failure case, some overhead
is added by this change, but, allocation failure is really rare
event so it would not matter.

In fatal signal pending case, we don't call test_pages_isolated()
because this failure is intentional one.

There was a bogus outer_start problem due to unchecked buddy order
and this patch also fix it. Before this patch, it didn't matter,
because end result is same thing. But, after this patch,
tracepoint will report failed pfn so it should be accurate.

Signed-off-by: Joonsoo Kim 
---
 mm/page_alloc.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d0499ff..21e9172 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6748,8 +6748,12 @@ int alloc_contig_range(unsigned long start, unsigned 
long end,
if (ret)
return ret;
 
+   /*
+* In case of -EBUSY, we'd like to know which page causes problem.
+* So, just fall through. We will check it in test_pages_isolated().
+*/
ret = __alloc_contig_migrate_range(&cc, start, end);
-   if (ret)
+   if (ret && ret != -EBUSY)
goto done;
 
/*
@@ -6776,12 +6780,25 @@ int alloc_contig_range(unsigned long start, unsigned 
long end,
outer_start = start;
while (!PageBuddy(pfn_to_page(outer_start))) {
if (++order >= MAX_ORDER) {
-   ret = -EBUSY;
-   goto done;
+   outer_start = start;
+   break;
}
outer_start &= ~0UL << order;
}
 
+   if (outer_start != start) {
+   order = page_order(pfn_to_page(outer_start));
+
+   /*
+* outer_start page could be small order buddy page and
+* it doesn't include start page. Adjust outer_start
+* in this case to report failed page properly
+* on tracepoint in test_pages_isolated()
+*/
+   if (outer_start + (1UL << order) <= start)
+   outer_start = start;
+   }
+
/* Make sure the range is really isolated. */
if (test_pages_isolated(outer_start, end, false)) {
pr_info("%s: [%lx, %lx) PFNs busy\n",
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 2/5] Documentation: devicetree: Add DT bindings to eeprom_93xx46 driver.

2015-11-24 Thread Cory Tusar
This commit documents bindings to be added to the eeprom_93xx46 driver
which will allow:

  - Device word size and read-only attributes to be specified.
  - A device-specific compatible string for use with Atmel AT93C46D
EEPROMs.
  - Specifying a GPIO line to function as a 'select' or 'enable' signal
prior to accessing the EEPROM.

Signed-off-by: Cory Tusar 
Acked-by: Rob Herring 
---
 .../devicetree/bindings/misc/eeprom-93xx46.txt | 25 ++
 1 file changed, 25 insertions(+)

diff --git a/Documentation/devicetree/bindings/misc/eeprom-93xx46.txt 
b/Documentation/devicetree/bindings/misc/eeprom-93xx46.txt
new file mode 100644
index 000..a8ebb46
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/eeprom-93xx46.txt
@@ -0,0 +1,25 @@
+EEPROMs (SPI) compatible with Microchip Technology 93xx46 family.
+
+Required properties:
+- compatible : shall be one of:
+"atmel,at93c46d"
+"eeprom-93xx46"
+- data-size : number of data bits per word (either 8 or 16)
+
+Optional properties:
+- read-only : parameter-less property which disables writes to the EEPROM
+- select-gpios : if present, specifies the GPIO that will be asserted prior to
+  each access to the EEPROM (e.g. for SPI bus multiplexing)
+
+Property rules described in Documentation/devicetree/bindings/spi/spi-bus.txt
+apply.  In particular, "reg" and "spi-max-frequency" properties must be given.
+
+Example:
+   eeprom@0 {
+   compatible = "eeprom-93xx46";
+   reg = <0>;
+   spi-max-frequency = <100>;
+   spi-cs-high;
+   data-size = <8>;
+   select-gpios = <&gpio4 4 GPIO_ACTIVE_HIGH>;
+   };
-- 
2.4.10

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 5/5] misc: eeprom_93xx46: Add support for a GPIO 'select' line.

2015-11-24 Thread Cory Tusar
This commit adds support to the eeprom_93x46 driver allowing a GPIO line
to function as a 'select' or 'enable' signal prior to accessing the
EEPROM.

Signed-off-by: Cory Tusar 
---
 drivers/misc/eeprom/eeprom_93xx46.c | 35 +++
 include/linux/eeprom_93xx46.h   |  3 +++
 2 files changed, 38 insertions(+)

diff --git a/drivers/misc/eeprom/eeprom_93xx46.c 
b/drivers/misc/eeprom/eeprom_93xx46.c
index d50bc17..d28fac2 100644
--- a/drivers/misc/eeprom/eeprom_93xx46.c
+++ b/drivers/misc/eeprom/eeprom_93xx46.c
@@ -10,11 +10,13 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -343,6 +345,20 @@ static ssize_t eeprom_93xx46_store_erase(struct device 
*dev,
 }
 static DEVICE_ATTR(erase, S_IWUSR, NULL, eeprom_93xx46_store_erase);
 
+static void select_assert(void *context)
+{
+   struct eeprom_93xx46_dev *edev = context;
+
+   gpiod_set_value_cansleep(edev->pdata->select, 1);
+}
+
+static void select_deassert(void *context)
+{
+   struct eeprom_93xx46_dev *edev = context;
+
+   gpiod_set_value_cansleep(edev->pdata->select, 0);
+}
+
 static const struct of_device_id eeprom_93xx46_of_table[] = {
{ .compatible = "eeprom-93xx46", },
{ .compatible = "atmel,at93c46d", .data = &atmel_at93c46d_data, },
@@ -357,6 +373,8 @@ static int eeprom_93xx46_probe_dt(struct spi_device *spi)
struct device_node *np = spi->dev.of_node;
struct eeprom_93xx46_platform_data *pd;
u32 tmp;
+   int gpio;
+   enum of_gpio_flags of_flags;
int ret;
 
pd = devm_kzalloc(&spi->dev, sizeof(*pd), GFP_KERNEL);
@@ -381,6 +399,23 @@ static int eeprom_93xx46_probe_dt(struct spi_device *spi)
if (of_property_read_bool(np, "read-only"))
pd->flags |= EE_READONLY;
 
+   gpio = of_get_named_gpio_flags(np, "select-gpios", 0, &of_flags);
+   if (gpio_is_valid(gpio)) {
+   unsigned long flags =
+   of_flags == OF_GPIO_ACTIVE_LOW ? GPIOF_ACTIVE_LOW : 0;
+
+   ret = devm_gpio_request_one(&spi->dev, gpio, flags,
+   "eeprom_93xx46_select");
+   if (ret)
+   return ret;
+
+   pd->select = gpio_to_desc(gpio);
+   pd->prepare = select_assert;
+   pd->finish = select_deassert;
+
+   gpiod_direction_output(pd->select, 0);
+   }
+
if (of_id->data) {
const struct eeprom_93xx46_devtype_data *data = of_id->data;
 
diff --git a/include/linux/eeprom_93xx46.h b/include/linux/eeprom_93xx46.h
index 92fa4c3..03f3435 100644
--- a/include/linux/eeprom_93xx46.h
+++ b/include/linux/eeprom_93xx46.h
@@ -3,6 +3,8 @@
  * platform description for 93xx46 EEPROMs.
  */
 
+#include 
+
 struct eeprom_93xx46_platform_data {
unsigned char   flags;
 #define EE_ADDR8   0x01/*  8 bit addr. cfg */
@@ -21,4 +23,5 @@ struct eeprom_93xx46_platform_data {
 */
void (*prepare)(void *);
void (*finish)(void *);
+   struct gpio_desc *select;
 };
-- 
2.4.10

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 4/5] misc: eeprom_93xx46: Add quirks to support Atmel AT93C46D device.

2015-11-24 Thread Cory Tusar
Atmel devices in this family have some quirks not found in other similar
chips - they do not support a sequential read of the entire EEPROM
contents, and the control word sent at the start of each operation
varies in bit length.

This commit adds quirk support to the driver and modifies the read
implementation to support non-sequential reads for consistency with
other misc/eeprom drivers.

Tested on a custom Freescale VF610-based platform, with an AT93C46D
device attached via dspi2.  The spi-gpio driver was used to allow the
necessary non-byte-sized transfers.

Signed-off-by: Cory Tusar 
---
 drivers/misc/eeprom/eeprom_93xx46.c | 126 ++--
 include/linux/eeprom_93xx46.h   |   6 ++
 2 files changed, 97 insertions(+), 35 deletions(-)

diff --git a/drivers/misc/eeprom/eeprom_93xx46.c 
b/drivers/misc/eeprom/eeprom_93xx46.c
index cc27e11..d50bc17 100644
--- a/drivers/misc/eeprom/eeprom_93xx46.c
+++ b/drivers/misc/eeprom/eeprom_93xx46.c
@@ -27,6 +27,15 @@
 #define ADDR_ERAL  0x20
 #define ADDR_EWEN  0x30
 
+struct eeprom_93xx46_devtype_data {
+   unsigned int quirks;
+};
+
+static const struct eeprom_93xx46_devtype_data atmel_at93c46d_data = {
+   .quirks = EEPROM_93XX46_QUIRK_SINGLE_WORD_READ |
+ EEPROM_93XX46_QUIRK_INSTRUCTION_LENGTH,
+};
+
 struct eeprom_93xx46_dev {
struct spi_device *spi;
struct eeprom_93xx46_platform_data *pdata;
@@ -35,6 +44,16 @@ struct eeprom_93xx46_dev {
int addrlen;
 };
 
+static inline bool has_quirk_single_word_read(struct eeprom_93xx46_dev *edev)
+{
+   return edev->pdata->quirks & EEPROM_93XX46_QUIRK_SINGLE_WORD_READ;
+}
+
+static inline bool has_quirk_instruction_length(struct eeprom_93xx46_dev *edev)
+{
+   return edev->pdata->quirks & EEPROM_93XX46_QUIRK_INSTRUCTION_LENGTH;
+}
+
 static ssize_t
 eeprom_93xx46_bin_read(struct file *filp, struct kobject *kobj,
   struct bin_attribute *bin_attr,
@@ -42,58 +61,73 @@ eeprom_93xx46_bin_read(struct file *filp, struct kobject 
*kobj,
 {
struct eeprom_93xx46_dev *edev;
struct device *dev;
-   struct spi_message m;
-   struct spi_transfer t[2];
-   int bits, ret;
-   u16 cmd_addr;
+   ssize_t ret = 0;
 
dev = container_of(kobj, struct device, kobj);
edev = dev_get_drvdata(dev);
 
-   cmd_addr = OP_READ << edev->addrlen;
+   mutex_lock(&edev->lock);
 
-   if (edev->addrlen == 7) {
-   cmd_addr |= off & 0x7f;
-   bits = 10;
-   } else {
-   cmd_addr |= (off >> 1) & 0x3f;
-   bits = 9;
-   }
+   if (edev->pdata->prepare)
+   edev->pdata->prepare(edev);
 
-   dev_dbg(&edev->spi->dev, "read cmd 0x%x, %d Hz\n",
-   cmd_addr, edev->spi->max_speed_hz);
+   while (count) {
+   struct spi_message m;
+   struct spi_transfer t[2] = { { 0 } };
+   u16 cmd_addr = OP_READ << edev->addrlen;
+   size_t nbytes = count;
+   int bits;
+   int err;
+
+   if (edev->addrlen == 7) {
+   cmd_addr |= off & 0x7f;
+   bits = 10;
+   if (has_quirk_single_word_read(edev))
+   nbytes = 1;
+   } else {
+   cmd_addr |= (off >> 1) & 0x3f;
+   bits = 9;
+   if (has_quirk_single_word_read(edev))
+   nbytes = 2;
+   }
 
-   spi_message_init(&m);
-   memset(t, 0, sizeof(t));
+   dev_dbg(&edev->spi->dev, "read cmd 0x%x, %d Hz\n",
+   cmd_addr, edev->spi->max_speed_hz);
 
-   t[0].tx_buf = (char *)&cmd_addr;
-   t[0].len = 2;
-   t[0].bits_per_word = bits;
-   spi_message_add_tail(&t[0], &m);
+   spi_message_init(&m);
 
-   t[1].rx_buf = buf;
-   t[1].len = count;
-   t[1].bits_per_word = 8;
-   spi_message_add_tail(&t[1], &m);
+   t[0].tx_buf = (char *)&cmd_addr;
+   t[0].len = 2;
+   t[0].bits_per_word = bits;
+   spi_message_add_tail(&t[0], &m);
 
-   mutex_lock(&edev->lock);
+   t[1].rx_buf = buf;
+   t[1].len = count;
+   t[1].bits_per_word = 8;
+   spi_message_add_tail(&t[1], &m);
 
-   if (edev->pdata->prepare)
-   edev->pdata->prepare(edev);
+   err = spi_sync(edev->spi, &m);
+   /* have to wait at least Tcsl ns */
+   ndelay(250);
 
-   ret = spi_sync(edev->spi, &m);
-   /* have to wait at least Tcsl ns */
-   ndelay(250);
-   if (ret) {
-   dev_err(&edev->spi->dev, "read %zu bytes at %d: err. %d\n",
-   count, (int)off, ret);
+   if (err) {
+   dev_err(&edev->spi->dev, "read %zu bytes at %d: err. 
%d\n",
+   

[PATCH v3 0/5] Devicetree support for misc/eeprom/eeprom_93xx46.

2015-11-24 Thread Cory Tusar
This series of patches adds an initial set of devicetree bindings to the
eeprom_93xx46 driver which mirror the configuration options previously
available as a platform device.  These bindings are then extended to
include support for specific Atmel devices in this family and also to
support GPIO-based selection of the device (e.g. for use with an SPI bus
mux).

Additionally, an address aliasing issue with 16-bit read and write
accesses in the eeprom_93xx46 driver discovered during testing is fixed.

Changes since v2:
  - Changed node name to 'eeprom' in DT bindings example.
  - Simplified several bits of return logic.
  - Removed #ifdef CONFIG_OF.
  - Allow compiler to handle promotion to bool return values.
  - Reworked GPIO handling to use gpiod_* functions throughout (and
fixed an oversight where GPIO flags were being ignored).

Changes since v1:
  - Consolidated all Documentation/devictree additions into a single patch.
  - Clarified compatible string shall be only one of the supported values.
  - Renamed the 'select-gpio' binding to 'select-gpios'.

Cory Tusar (5):
  misc: eeprom_93xx46: Fix 16-bit read and write accesses.
  Documentation: devicetree: Add DT bindings to eeprom_93xx46 driver.
  misc: eeprom_93xx46: Implement eeprom_93xx46 DT bindings.
  misc: eeprom_93xx46: Add quirks to support Atmel AT93C46D device.
  misc: eeprom_93xx46: Add support for a GPIO 'select' line.

 .../devicetree/bindings/misc/eeprom-93xx46.txt |  25 +++
 drivers/misc/eeprom/eeprom_93xx46.c| 212 +
 include/linux/eeprom_93xx46.h  |   9 +
 3 files changed, 210 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/misc/eeprom-93xx46.txt

-- 
2.4.10

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 3/5] misc: eeprom_93xx46: Implement eeprom_93xx46 DT bindings.

2015-11-24 Thread Cory Tusar
This commit implements bindings in the eeprom_93xx46 driver allowing
device word size and read-only attributes to be specified via
devicetree.

Signed-off-by: Cory Tusar 
---
 drivers/misc/eeprom/eeprom_93xx46.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/drivers/misc/eeprom/eeprom_93xx46.c 
b/drivers/misc/eeprom/eeprom_93xx46.c
index e1bf0a5..cc27e11 100644
--- a/drivers/misc/eeprom/eeprom_93xx46.c
+++ b/drivers/misc/eeprom/eeprom_93xx46.c
@@ -13,6 +13,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -294,12 +296,58 @@ static ssize_t eeprom_93xx46_store_erase(struct device 
*dev,
 }
 static DEVICE_ATTR(erase, S_IWUSR, NULL, eeprom_93xx46_store_erase);
 
+static const struct of_device_id eeprom_93xx46_of_table[] = {
+   { .compatible = "eeprom-93xx46", },
+   {}
+};
+MODULE_DEVICE_TABLE(of, eeprom_93xx46_of_table);
+
+static int eeprom_93xx46_probe_dt(struct spi_device *spi)
+{
+   struct device_node *np = spi->dev.of_node;
+   struct eeprom_93xx46_platform_data *pd;
+   u32 tmp;
+   int ret;
+
+   pd = devm_kzalloc(&spi->dev, sizeof(*pd), GFP_KERNEL);
+   if (!pd)
+   return -ENOMEM;
+
+   ret = of_property_read_u32(np, "data-size", &tmp);
+   if (ret < 0) {
+   dev_err(&spi->dev, "data-size property not found\n");
+   return ret;
+   }
+
+   if (tmp == 8) {
+   pd->flags |= EE_ADDR8;
+   } else if (tmp == 16) {
+   pd->flags |= EE_ADDR16;
+   } else {
+   dev_err(&spi->dev, "invalid data-size (%d)\n", tmp);
+   return -EINVAL;
+   }
+
+   if (of_property_read_bool(np, "read-only"))
+   pd->flags |= EE_READONLY;
+
+   spi->dev.platform_data = pd;
+
+   return 0;
+}
+
 static int eeprom_93xx46_probe(struct spi_device *spi)
 {
struct eeprom_93xx46_platform_data *pd;
struct eeprom_93xx46_dev *edev;
int err;
 
+   if (spi->dev.of_node) {
+   err = eeprom_93xx46_probe_dt(spi);
+   if (err < 0)
+   return err;
+   }
+
pd = spi->dev.platform_data;
if (!pd) {
dev_err(&spi->dev, "missing platform data\n");
@@ -370,6 +418,7 @@ static int eeprom_93xx46_remove(struct spi_device *spi)
 static struct spi_driver eeprom_93xx46_driver = {
.driver = {
.name   = "93xx46",
+   .of_match_table = of_match_ptr(eeprom_93xx46_of_table),
},
.probe  = eeprom_93xx46_probe,
.remove = eeprom_93xx46_remove,
-- 
2.4.10

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC

2015-11-24 Thread Alexander Duyck
On Tue, Nov 24, 2015 at 7:18 PM, Lan Tianyu  wrote:
> On 2015年11月24日 22:20, Alexander Duyck wrote:
>> I'm still not a fan of this approach.  I really feel like this is
>> something that should be resolved by extending the existing PCI hot-plug
>> rather than trying to instrument this per driver.  Then you will get the
>> goodness for multiple drivers and multiple OSes instead of just one.  An
>> added advantage to dealing with this in the PCI hot-plug environment
>> would be that you could then still do a hot-plug even if the guest
>> didn't load a driver for the VF since you would be working with the PCI
>> slot instead of the device itself.
>>
>> - Alex
>
> Hi Alex:
> What's you mentioned seems the bonding driver solution.
> Paper "Live Migration with Pass-through Device for Linux VM" describes
> it. It does VF hotplug during migration. In order to maintain Network
> connection when VF is out, it takes advantage of Linux bonding driver to
> switch between VF NIC and emulated NIC. But the side affects, that
> requires VM to do additional configure and the performance during
> switching two NIC is not good.

No, what I am getting at is that you can't go around and modify the
configuration space for every possible device out there.  This
solution won't scale.  If you instead moved the logic for notifying
the device into a separate mechanism such as making it a part of the
hot-plug logic then you only have to write the code once per OS in
order to get the hot-plug capability to pause/resume the device.  What
I am talking about is not full hot-plug, but rather to extend the
existing hot-plug in Qemu and the Linux kernel to support a
"pause/resume" functionality.  The PCI hot-plug specification calls
out the option of implementing something like this, but we don't
currently have support for it.

I just feel doing it through PCI hot-plug messages will scale much
better as you could likely make use of the power management
suspend/resume calls to take care of most of the needed implementation
details.

- Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 1/5] misc: eeprom_93xx46: Fix 16-bit read and write accesses.

2015-11-24 Thread Cory Tusar
Compatible at93xx46 devices from both Microchip and Atmel expect a
word-based address, regardless of whether the device is strapped for 8-
or 16-bit operation.  However, the offset parameter passed in when
reading or writing at a specific location is always specified in terms
of bytes.

This commit fixes 16-bit read and write accesses by shifting the offset
parameter to account for this difference between a byte offset and a
word-based address.

Signed-off-by: Cory Tusar 
---
 drivers/misc/eeprom/eeprom_93xx46.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/eeprom/eeprom_93xx46.c 
b/drivers/misc/eeprom/eeprom_93xx46.c
index ff63f05..e1bf0a5 100644
--- a/drivers/misc/eeprom/eeprom_93xx46.c
+++ b/drivers/misc/eeprom/eeprom_93xx46.c
@@ -54,7 +54,7 @@ eeprom_93xx46_bin_read(struct file *filp, struct kobject 
*kobj,
cmd_addr |= off & 0x7f;
bits = 10;
} else {
-   cmd_addr |= off & 0x3f;
+   cmd_addr |= (off >> 1) & 0x3f;
bits = 9;
}
 
@@ -155,7 +155,7 @@ eeprom_93xx46_write_word(struct eeprom_93xx46_dev *edev,
bits = 10;
data_len = 1;
} else {
-   cmd_addr |= off & 0x3f;
+   cmd_addr |= (off >> 1) & 0x3f;
bits = 9;
data_len = 2;
}
-- 
2.4.10

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] mm/compaction: __compact_pgdat() code cleanuup

2015-11-24 Thread Joonsoo Kim
This patch uses is_via_compact_memory() to distinguish direct compaction.
And it also reduces indentation on compaction_defer_reset
by filtering failure case. There is no functional change.

Acked-by: Yaowei Bai 
Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index de3e1e7..01b1e5e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1658,14 +1658,15 @@ static void __compact_pgdat(pg_data_t *pgdat, struct 
compact_control *cc)
!compaction_deferred(zone, cc->order))
compact_zone(zone, cc);
 
-   if (cc->order > 0) {
-   if (zone_watermark_ok(zone, cc->order,
-   low_wmark_pages(zone), 0, 0))
-   compaction_defer_reset(zone, cc->order, false);
-   }
-
VM_BUG_ON(!list_empty(&cc->freepages));
VM_BUG_ON(!list_empty(&cc->migratepages));
+
+   if (is_via_compact_memory(cc->order))
+   continue;
+
+   if (zone_watermark_ok(zone, cc->order,
+   low_wmark_pages(zone), 0, 0))
+   compaction_defer_reset(zone, cc->order, false);
}
 }
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] arm64 updates for 4.4

2015-11-24 Thread Jon Masters

On 11/6/15, 11:04 AM, Catalin Marinas wrote:


On Fri, Nov 06, 2015 at 10:57:58AM +0100, Arnd Bergmann wrote:

On Thursday 05 November 2015 18:27:18 Catalin Marinas wrote:

On Wed, Nov 04, 2015 at 02:55:01PM -0800, Linus Torvalds wrote:

On Wed, Nov 4, 2015 at 10:25 AM, Catalin Marinas  
wrote:
It's good for single-process loads - if you do a lot of big fortran
jobs, or a lot of big database loads, and nothing else, you're fine.


These are some of the arguments from the server camp: specific
workloads.


On our end, I asked our performance folks (and many others) about 3 or 4 
years ago what they thought would make sense. The numbers suggested that 
16KB might have been ideal (for specific targeted workloads), but since 
that was optional in the architecture (as a later addition) that meant 
"does not exist" as far as server/general purpose goes. Which lead to 
more conversation, followed ultimately by the 64KB choice. The decision 
to go to 64KB was in part based upon various discussion that suggested 
this size was appropriate for workloads, but it is something that is 
under evaluation. And obviously the number of threads on the topic is 
not something that is ignored. 4KB with contiguous hint + huge pages 
might well end up being the sweet spot in the longer term.


One of the purposes of Red Hat Enterprise Linux Server for ARM (RHELSA) 
Development Preview (which I know just rolls off the tongue) is to test 
the water with various decisions and see what works out, and what does 
not. If 64KB does indeed turn out to be a poor decision then the page 
size will be reverted to 4KB at some future time. But it is only once we 
have some of the higher end mainstream systems running RHELSA (like we 
do now) that we can start to actually look at real data and decide.


In addition to the TLB/hardware walker (micro)cache impact of page size 
in terms of levels of walk through the tables (but we have cont. hint 
and aggressive microcaches of interim levels to help us with this), 
there is also the potential impact upon cache design. True we mostly 
claim to be PIPT but underneath implementations might well be able to 
optimize the (parallel) indexing stage given a larger page size. In many 
conversations over the past few years with the architects building the 
impending tsunami of high end v8 server cores, no objections have been 
raised against the choice of 64KB in the first go around.


Anyway. We'll all watch and see :)

Jon.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] PCI support added to ARC

2015-11-24 Thread Vineet Gupta
On Tuesday 24 November 2015 08:02 PM, Joao Pinto wrote:
> This patch adds PCI support to ARC and updates drivers/pci Makefile enabling
> the ARC arch to use the generic PCI setup functions.
>
> Signed-off-by: Joao Pinto 
> ---
>  arch/arc/Kconfig|  22 +++
>  arch/arc/include/asm/dma.h  |   5 +
>  arch/arc/include/asm/io.h   |   2 +
>  arch/arc/include/asm/mach/pci.h |  97 +++
>  arch/arc/include/asm/pci.h  |  34 
>  arch/arc/kernel/Makefile|   1 +
>  arch/arc/kernel/pcibios.c   | 360 
> 
>  arch/arc/mm/ioremap.c   |  29 +++-
>  arch/arc/plat-axs10x/Kconfig|   1 +
>  drivers/pci/Makefile|   1 +
>  10 files changed, 551 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arc/include/asm/mach/pci.h
>  create mode 100644 arch/arc/include/asm/pci.h
>  create mode 100644 arch/arc/kernel/pcibios.c
>
> diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
> index 2c2ac3f..5b526a3 100644
> --- a/arch/arc/Kconfig
> +++ b/arch/arc/Kconfig
> @@ -19,6 +19,7 @@ config ARC
>   select GENERIC_FIND_FIRST_BIT
>   # for now, we don't need GENERIC_IRQ_PROBE, CONFIG_GENERIC_IRQ_CHIP
>   select GENERIC_IRQ_SHOW
> + select GENERIC_PCI_IOMAP
>   select GENERIC_PENDING_IRQ if SMP
>   select GENERIC_SMP_IDLE_THREAD
>   select HAVE_ARCH_KGDB
> @@ -39,6 +40,9 @@ config ARC
>   select PERF_USE_VMALLOC
>   select HAVE_DEBUG_STACKOVERFLOW
>  
> +config MIGHT_HAVE_PCI
> + bool
> +
>  config TRACE_IRQFLAGS_SUPPORT
>   def_bool y
>  
> @@ -110,6 +114,24 @@ config ISA_ARCV2
>  
>  endchoice
>  
> +menu "Bus Support"
> +
> +config PCI
> + bool "PCI support" if MIGHT_HAVE_PCI
> + help
> +   PCI is the name of a bus system, i.e. the way the CPU talks to the 
> other stuff inside
> +   your box.Find out if your board/platform have PCI.
> +   Note: PCIE support for Synopsys Device will be available only when
> +   HAPS DX is configured with PCIE RC bitmap. If you have PCI, say Y, 
> otherwise N.
> +
> +config PCI_SYSCALL
> + def_bool PCI
> +
> +source "drivers/pci/Kconfig"
> +source "drivers/pci/pcie/Kconfig"
> +
> +endmenu
> +

Could you please move these towards end of file - preferably between sourcing of
drivers/Kconfig and fs/Kconfig to keep hardware related stuff together.

>  menu "ARC CPU Configuration"
>  
>  choice
> diff --git a/arch/arc/include/asm/dma.h b/arch/arc/include/asm/dma.h
> index ca7c451..37942fa 100644
> --- a/arch/arc/include/asm/dma.h
> +++ b/arch/arc/include/asm/dma.h
> @@ -10,5 +10,10 @@
>  #define ASM_ARC_DMA_H
>  
>  #define MAX_DMA_ADDRESS 0xC000
> +#ifdef CONFIG_PCI
> +extern int isa_dma_bridge_buggy;
> +#else
> +#define isa_dma_bridge_buggy(0)
> +#endif
>  
>  #endif
> diff --git a/arch/arc/include/asm/io.h b/arch/arc/include/asm/io.h
> index 694ece8..d86c2e3 100644
> --- a/arch/arc/include/asm/io.h
> +++ b/arch/arc/include/asm/io.h
> @@ -17,6 +17,8 @@ extern void __iomem *ioremap(unsigned long physaddr, 
> unsigned long size);
>  extern void __iomem *ioremap_prot(phys_addr_t offset, unsigned long size,
> unsigned long flags);
>  extern void iounmap(const void __iomem *addr);
> +extern void __iomem *ioport_map(unsigned long port, unsigned int size);
> +extern int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr);

ioport_unmap is missing. Anyhow you can define the empty ioport_{map,unmap} as
static inline here (see below)

>  #define ioremap_nocache(phy, sz) ioremap(phy, sz)
>  #define ioremap_wc(phy, sz)  ioremap(phy, sz)
> diff --git a/arch/arc/include/asm/mach/pci.h b/arch/arc/include/asm/mach/pci.h
> new file mode 100644
> index 000..9e75277
> --- /dev/null
> +++ b/arch/arc/include/asm/mach/pci.h
> @@ -0,0 +1,97 @@
> +/*
> + *  arch/arc/include/asm/mach/pci.h
> + *
> + *  Copyright (C) 2004-2014 Synopsys, Inc. (www.synopsys.com)

Perhaps extend this to 2016 (and other copyrights in the patch too if needed)

> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __ASM_MACH_PCI_H
> +#define __ASM_MACH_PCI_H
> +
> +#include 
> +
> +struct pci_sys_data;
> +struct pci_ops;
> +struct pci_bus;
> +struct device;
> +
> +struct hw_pci {
> +#ifdef CONFIG_PCI_DOMAINS
> + int domain;
> +#endif
> + struct pci_ops  *ops;
> + int nr_controllers;
> + void**private_data;
> + int (*setup)(int nr, struct pci_sys_data *);
> + struct pci_bus *(*scan)(int nr, struct pci_sys_data *);
> + void(*preinit)(void);
> + void(*postinit)(void);
> + u8  (*swizzle)(struct pci_dev *dev, u8 *pin);
> + int (*map_irq)(const struct pci_dev *dev, u8 slot, u8 pin);
> + resource_size_t (*align_resourc

Re: [PATCH] thermal: rcar: enable to set tripN-temp via DT

2015-11-24 Thread Kuninori Morimoto

Hi Eduardo

Thank you for your feedback

> > From: Kuninori Morimoto 
> > 
> > Current rcar thermal driver is using 90 degrees as trip temp, but it
> > should be based on each SoC / platform.
> > This patch enables to set trip temp via DT. (It uses db8500-thermal
> > style for it)
> > It will use 90 degrees as default trip temp if DT doesn't have it.
> > 
> > Signed-off-by: Kuninori Morimoto 
> > ---
> >  .../devicetree/bindings/thermal/rcar-thermal.txt   |  2 ++
> >  drivers/thermal/rcar_thermal.c | 34 
> > --
> >  2 files changed, 33 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/thermal/rcar-thermal.txt 
> > b/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> > index 332e625..6c57f7e 100644
> > --- a/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> > +++ b/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> > @@ -18,6 +18,8 @@ Required properties:
> >  Option properties:
> >  
> >  - interrupts   : use interrupt
> > +- tripN-temp   : temperature of trip point N. it will use 
> > 9 as default
> > + if DT doesn't have tripN-temp
> 
> First of all, you are creating an entry with is specific to your driver.
> That requires it to use proper prefixing.
> 
> Besides, your property is already covered by of-thermal. Please convert
> your driver to use of-thermal, this way it will give you the flexibility
> to configure thermal data in DT.

I see, but we need to keep compatibility for non-DT SoC.
(This driver is used from both DT, non-DT SoC)

Then, I noticed we can use .set_trip_temp, and it is very enough for us.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V2] clk: imx: clk-imx6q: Let OSC to be routed to anaclk2/2b

2015-11-24 Thread Shawn Guo
On Sun, Nov 15, 2015 at 11:38:04AM +0100, Michael Trimarchi wrote:
> OSC can be used as USB hub source clock. An example we can route to
> CLK2_P imx6 pin.
> 
> This show a usage example:
> 
> [...]
>   usb_hub: usb-hub {
>   compatible = "smsc,usb3503a";
>   clocks = <&clks IMX6QDL_CLK_LVDS2_GATE>;
>   clock-names = "refclk";
>   };
> };
> 
> [...]
> &clks {
> assigned-clocks = <&clks IMX6QDL_CLK_LVDS2_SEL>;
> assigned-clock-parents = <&clks IMX6QDL_CLK_OSC>;
> };
> 
> /sys/kernel/debug/clk/clk_summary
> 
> osc 552400  0 0
> [...]
> lvds2_sel   112400  0 0
>lvds2_gate   112400  0 0
> [...]
> 
> Signed-off-by: Michael Trimarchi 

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check

2015-11-24 Thread Gang He
Hi Junxiao,


>>> 
> Hi Gang,
> 
> On 11/25/2015 11:29 AM, Gang He wrote:
>> Hi Mark and Junxiao,
>> 
>> 
>
>>> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
 Hi Gang,

 On 11/03/2015 03:54 PM, Gang He wrote:
> Hi Junxiao,
>
> Thank for your reviewing.
> Current design, we use a sysfile as a interface to check/fix a file (via 
>>> pass a ino number).
> But, this operation is manually triggered by user, instead of 
> automatically 
>>>  fix in the kernel.
> Why?
> 1) we should let users make this decision, since some users do not want 
> to 
>>> fix when encountering a file system corruption, maybe they want to keep the 
>>> file system unchanged for a further investigation.
 If user don't want this, they should not use error=continue option, let
 fs go after a corruption is very dangerous.
>>>
>>> Maybe we need another errors=XXX flag (maybe errors=fix)?
>>>
>>> You both make good points, here's what I gather from the conversation:
>>>
>>>  - Some customers would be sad if they have to manually fix corruptions.
>>>This takes effort on their part, and if the FS can handle it
>>>automatically, it should.
>>>
>>>  - There are valid concerns that automatically fixing things is a change in
>>>behavior that might not be welcome, or worse might lead to unforseeable
>>>circumstances.
>>>
>>>  - I will add that fixing things automatically implies checking them
>>>automatically which could introduce some performance impact depending on
>>>how much checking we're doing.
>>>
>>> So if the user wants errors to be fixed automatically, they could mount with
>>> errros=fix, and everyone else would have no change in behavior unless they
>>> wanted to make use of the new feature.
>> That is what I want to say, add a mount option to let users to decide. Here, 
> I want to split "error=fix"
>> mount option  task out from online file check feature, I think this part 
> should be a independent feature.
>> We can implement this feature after online file check is done, I want to 
> split the feature into some more 
>> detailed features, implement them one by one. Do you agree this point?
> With error=fix, when a possible corruption is found, online fsck will
> start to check and fix things. So this doesn't looks like a independent
> feature.
My means is, we can implement online file check by manually triage feature 
first, then
Add a mount option "error=fix" feature, the second feature can be implemented 
after
the first part is done. I want to split them into more detailed items, maybe it 
is more helpful
to be reviewed, but the whole feature ideas are very OK, just need to do one by 
one.  

> 
> Thanks,
> Junxiao.
> 
>> 
>>>
>>>
> 2) frankly speaking, this feature will probably bring a second corruption 
>>> if there is some error in the code, I do not suggest to use automatically 
> fix 
>>> by default in the first version.
 I think if this feature could bring more corruption, then this should be
 fixed first.
>>>
>>> Btw, I am pretty sure that Gang is referring to the feature being new and
>>> thus more likely to have problems. There is nothing I see in here that is
>>> file system corrupting.
>>> --Mark
>>>
>>>
>>> --
>>> Mark Fasheh
>> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Hibernate resume bug around 3,18-rc2 - Full PAT support

2015-11-24 Thread Juergen Gross
On 24/11/15 23:46, Luis R. Rodriguez wrote:
> On Mon, Nov 23, 2015 at 03:19:16PM +0100, Juergen Gross wrote:
>> On 23/11/15 15:11, vas...@iit.demokritos.gr wrote:
>>> Ok I will send the .config when I get back home. I have all kernels I
>>> build in .deb archive. The problem is that the debian kernel build
>>> procedure does not hold somewhere in the deb file the git commit hash.
>>>
>>> Fow which kernel would you care to see the config? 4.3?
>>
>> Doesn't really matter anymore. I've posted a patch already to fix it and
>> got the reply, that the fix is okay, but no harm can come from the
>> current implementation, as the two config options are always either both
>> set or reset.
> 
> Hrm, Vassilis seems to be able to reproduce this more effectively by heating 
> up
> his CPU prior to hibernation though. I have no idea what adding 
> APIC_LVT_MASKED
> ((1 << 16)) to the Local Vector Table (LVT) Thermal Monitor (APIC_LVTTHMR 
> 0x330) does but
> clear_local_APIC() seems to be used to "cleanout any BIOS leftovers during
> boot." If we're suspending but the fan is still on I wonder if this could 
> cause
> an issue with some settings the BIOS may have set prior to hibernation, and
> a mismatch upon resume.
> 
> I can't find what APIC_LVT_MASKED does though, the best doc I found:

http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf

Local APIC (chapter 10.4).


Juergen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

2015-11-24 Thread David Ahern

On 11/24/15 8:50 PM, Wangnan (F) wrote:

Actually we are discussing about this problem.

For such tracking events (PERF_RECORD_FORK...), we have dummy event so
it is possible for us to receive tracking events from a separated
channel, therefore we don't have to parse every events to pick those
events out. Instead, we can process tracking events differently, then
more interesting things can be done. For example, squashing those tracking
events if it takes too much memory...


If you look at my daemon code I process task events (FORK, MMAP, EXIT) 
to maintain task state including flushing threads when they terminate. 
This is a trade-off to having the knowledge to pretty-print addresses 
(address to symbol resolution) yet not grow without bounds -- be it a 
file or memory.




Furthermore, there's another problem being discussed: if userspace
ringbuffer
is bytes based, parsing event is unavoidable. Without parsing event we are
unable to find the new 'head' pointer when overwriting. Instead, we are
thinking about a bucket-based ringbuffer that, let perf maintain a series
of bucket, each time 'poll' return, perf copies new events to the start of
a bucket. If all bucket is occupied, we drop the oldest bucket.
Bucket-based
ringbuffer watest some memory but can avoid event parsing.

And there's many other problems in this patch. For example, when SIGUSR2 is
received, we need to do something to let all perf events start dumping.
Current implementation can't ensure we receive events just before the
SIGUSR2 if we not set 'no-buffer'.

Also, output events are in one perf.data, which is not user friendly.
Our final goal is to make perf a daemonized moniter, which can run 7x24
in user's environment. Each time a glitch is detected, a framework sends
a signal to perf to get a perf.data from it perf. The framework manage
those perf.data like logrotate, help developer analysis those glitch.


Exactly. And that's why my daemon is written the way it is. It is 
intended to run 24x7x365. It retains the last N events which are dumped 
when some external trigger tells it to.


Arnaldo: you asked about an event in the stream but that is not 
possible. My scheduling daemon targets CPU usage prior to a significant 
event (what was running, how long, where, etc). The significant event in 
the motivating case was STP timeouts -- if stp daemon is not able to 
send BPDUs why? What was running leading up to the timeout. The point is 
something external to the perf daemon says 'hey, save the last N-events 
for analysis'.


This case sounds like a generalization of my problem with the desire to 
write a perf.data file instead of processing the events and dumping to a 
file. It is doable. For example, synthesize task events for all threads 
in memory and then write out the saved samples.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

2015-11-24 Thread Gang He
Hi Mark and Junxiao,


>>> 
> Hi Mark,
> 
> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>> Hi Junxiao,
>> 
>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>>> Hi Gang,
>>>
>>> This is not like a right patch.
>>> First, online file check only checks inode's block number, valid flag,
>>> fs generation value, and meta ecc. I never see a real corruption
>>> happened only on this field, if these fields are corrupted, that means
>>> something bad may happen on other place. So fix this field may not help
>>> and even cause corruption more hard.
>> 
>> I agree that these are rather uncommon, we might even consider removing the
>> VALID_FL fixup. I definitely don't think we're ready for anything more
>> complicated than this though either. We kind of have to start somewhere too.
>> 
> Yes, the fix is too simple, and just a start, I think we'd better wait
> more useful parts done before merging it.
I agree, just remark VALID_FL flag to fix this field is too simple, we should 
delay this field fix before 
I have a flawless solution, I will remove these lines code in the first version 
patches. In the future submits,
I also hope your guys to help review the code carefully, shout out your 
comments when you doubt somewhere.



>> 
>>> Second, the repair way is wrong. In
>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>> match the ones in memory, the ones in memory are used to update the disk
>>> fields. The question is how do you know these field in memory are
>>> right(they may be the real corrupted ones)?
>> 
>> Your second point (and the last part of your 1st point) makes a good
>> argument for why this shouldn't happen automatically. Some of these
>> corruptions might require a human to look at the log and decide what to do.
>> Especially as you point out, where we might not know where the source of the
>> corruption is. And if the human can't figure it out, then it's probably time
>> to unmount and fsck.
> The point is that the fix way is wrong, just flush memory info to disk
> is not right. I agree online fsck is good feature, but need carefully
> design, it should not involve more corruptions. A rough idea from mine
> is that maybe we need some "frezee" mechanism in fs, which can hung all
> fs op and let fs stop at a safe area. After freeze fs, we can do some
> fsck work on it and these works should not cost lots time. What's your idea?
If we need to touch some global data structures, freezing fs can be considered 
when we can't
get any way in case using the locks.
If we only handle some independent problem, we just need to lock the related 
data structures. 

> 
> Thanks,
> Junxiao.
> 
>> 
>> Thanks,
>>  --Mark
>> 
>> --
>> Mark Fasheh
>> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 5/5] misc: eeprom_93xx46: Add support for a GPIO 'select' line.

2015-11-24 Thread Cory Tusar
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/19/2015 01:05 AM, Vladimir Zapolskiy wrote:
> On 19.11.2015 05:29, Cory Tusar wrote:
>> This commit adds support to the eeprom_93x46 driver allowing a GPIO line
>> to function as a 'select' or 'enable' signal prior to accessing the
>> EEPROM.
>>
>> Signed-off-by: Cory Tusar 
>> ---
>>  drivers/misc/eeprom/eeprom_93xx46.c | 26 ++
>>  include/linux/eeprom_93xx46.h   |  1 +
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/misc/eeprom/eeprom_93xx46.c 
>> b/drivers/misc/eeprom/eeprom_93xx46.c
>> index 0386b03..375951f 100644
>> --- a/drivers/misc/eeprom/eeprom_93xx46.c
>> +++ b/drivers/misc/eeprom/eeprom_93xx46.c
>> @@ -10,11 +10,14 @@
>>  
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>>  #include 
>>  #include 
>> +#include 
> 
> Please double check, adding only linux/of_gpio.h header should work,
> linux/gpio.h and linux/gpio/consumer.h are redundant.

There was an error which turned up on a 0-day build related to this:

tree:   https://github.com/lunn/linux.git asl_v4.3-rc2-zii-stable-dsa-reset
head:   c91bad95b39a98e0d06809c4c70c9c26747c874a
commit: a3e1b85039c722799102366b527b6bab9543e4ac [4/41] misc: eeprom: 
93xx46: Add support for a GPIO 'select' line.
config: x86_64-randconfig-x007-11010710 (attached as .config)
reproduce:
git checkout a3e1b85039c722799102366b527b6bab9543e4ac
# save the attached .config to linux build tree
make ARCH=x86_64

All errors (new ones prefixed by >>):

   drivers/misc/eeprom/eeprom_93xx46.c: In function 'select_assert':
>> drivers/misc/eeprom/eeprom_93xx46.c:342:2: error: implicit declaration 
of function 'gpiod_set_value_cansleep' [-Werror=implicit-function-declaration]
 gpiod_set_value_cansleep(gpio_to_desc(edev->pdata->select_gpio), 1);
 ^
>> drivers/misc/eeprom/eeprom_93xx46.c:342:27: error: implicit declaration 
of function 'gpio_to_desc' [-Werror=implicit-function-declaration]
 gpiod_set_value_cansleep(gpio_to_desc(edev->pdata->select_gpio), 1);

I'll re-check with v3 (where everything uses the gpiod_*() interface) to
see if this can be eliminated...

>>  #include 
>>  #include 
>>  #include 
>> @@ -344,6 +347,20 @@ static ssize_t eeprom_93xx46_store_erase(struct device 
>> *dev,
>>  static DEVICE_ATTR(erase, S_IWUSR, NULL, eeprom_93xx46_store_erase);
>>  
>>  #ifdef CONFIG_OF
>> +static void select_assert(void *context)
>> +{
>> +struct eeprom_93xx46_dev *edev = context;
>> +
>> +gpiod_set_value_cansleep(gpio_to_desc(edev->pdata->select_gpio), 1);
> 
> I would suggest to use gpio_set_value()

v3 uses gpiod_*() throughout.  This also addresses an issue where flags
were not being tracked and used properly...

>> +}
>> +
>> +static void select_deassert(void *context)
>> +{
>> +struct eeprom_93xx46_dev *edev = context;
>> +
>> +gpiod_set_value_cansleep(gpio_to_desc(edev->pdata->select_gpio), 0);
> 
> Same here.

As above.

>> +}
>> +
>>  static const struct of_device_id eeprom_93xx46_of_table[] = {
>>  { .compatible = "eeprom-93xx46", },
>>  { .compatible = "atmel,at93c46d", .data = &atmel_at93c46d_data, },
>> @@ -385,6 +402,15 @@ static int eeprom_93xx46_probe_dt(struct spi_device 
>> *spi)
>>  if (of_property_read_bool(np, "read-only"))
>>  pd->flags |= EE_READONLY;
>>  
>> +ret = of_get_named_gpio(np, "select-gpios", 0);
> 
> gpios or gpio? I see only one requested gpio.

gpios - for consistency.

>> +if (ret < 0) {
>> +pd->select_gpio = -1;
>> +} else {
>> +pd->select_gpio = ret;
>> +pd->prepare = select_assert;
>> +pd->finish = select_deassert;
>> +}
>> +
>>  if (of_id->data) {
>>  const struct eeprom_93xx46_devtype_data *data = of_id->data;
>>  
>> diff --git a/include/linux/eeprom_93xx46.h b/include/linux/eeprom_93xx46.h
>> index 92fa4c3..aa472c7 100644
>> --- a/include/linux/eeprom_93xx46.h
>> +++ b/include/linux/eeprom_93xx46.h
>> @@ -21,4 +21,5 @@ struct eeprom_93xx46_platform_data {
>>   */
>>  void (*prepare)(void *);
>>  void (*finish)(void *);
>> +unsigned int select_gpio;
> 
> Same questions as in v2 4/5.

I simply see it as more straightforward to keep all platform-specific
data together, rather than mix-and-match between eeprom_93xx46_dev and
eeprom_93xx46_platform_data...

Also, the private eeprom_93xx46_dev structure has not been allocated
prior to parsing for DT bindings (without additional restructuring of
.probe() logic).

>>  };
>>
> 
> --
> With best wishes,
> Vladimir
> 


- -- 
Cory Tusar
Principal
PID 1 Solutions, Inc.


"There are two ways of constructing a software design.  One way is to
 make it so simple that there are obviously no deficiencies, and the
 other way is to make it so complicated that there are no obvious
 deficiencies."  --Sir Charles Anthony Richard Hoa

[PATCH] mm: vmscan: Obey indeed proportional scanning for kswapd and memcg

2015-11-24 Thread Yaowei Bai
Commit e82e0561dae9f3ae5 ("mm: vmscan: obey proportional scanning
requirements for kswapd") intended to preserve the proportional scanning
and reclaim what was requested by get_scan_count() for kswapd and memcg
by stopping reclaiming one type(anon or file) LRU and reducing the other's
amount of scanning proportional to the original scan target.

So the way to determine which LRU should be stopped reclaiming should be
comparing scanned/unscanned percentages to the original scan target of two
lru types instead of absolute values what implemented currently, because
larger absolute value doesn't mean larger percentage, there shall be
chance that larger absolute value with smaller percentage, for instance:

target_file = 1000
target_anon = 500
nr_file = 500
nr_anon = 400

in this case, because nr_file > nr_anon, according to current implement,
we will stop scanning anon lru and shrink file lru. This breaks
proportional scanning intent and makes more unproportional.

This patch changes to compare percentage to the original scan target to
determine which lru should be shrunk.

Signed-off-by: Yaowei Bai 
---
 mm/vmscan.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2aec424..09a37436 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2216,6 +2216,7 @@ static void shrink_lruvec(struct lruvec *lruvec, int 
swappiness,
while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
nr[LRU_INACTIVE_FILE]) {
unsigned long nr_anon, nr_file, percentage;
+   unsigned long percentage_anon, percentage_file;
unsigned long nr_scanned;
 
for_each_evictable_lru(lru) {
@@ -2250,16 +2251,17 @@ static void shrink_lruvec(struct lruvec *lruvec, int 
swappiness,
if (!nr_file || !nr_anon)
break;
 
-   if (nr_file > nr_anon) {
-   unsigned long scan_target = targets[LRU_INACTIVE_ANON] +
-   targets[LRU_ACTIVE_ANON] + 1;
+   percentage_anon = nr_anon * 100 / (targets[LRU_INACTIVE_ANON] +
+   targets[LRU_ACTIVE_ANON] + 1);
+   percentage_file = nr_file * 100 / (targets[LRU_INACTIVE_FILE] +
+   targets[LRU_ACTIVE_FILE] + 1);
+
+   if (percentage_file > percentage_anon) {
lru = LRU_BASE;
-   percentage = nr_anon * 100 / scan_target;
+   percentage = percentage_anon;
} else {
-   unsigned long scan_target = targets[LRU_INACTIVE_FILE] +
-   targets[LRU_ACTIVE_FILE] + 1;
lru = LRU_FILE;
-   percentage = nr_file * 100 / scan_target;
+   percentage = percentage_file;
}
 
/* Stop scanning the smaller of the LRU */
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check

2015-11-24 Thread Junxiao Bi
Hi Gang,

On 11/25/2015 11:29 AM, Gang He wrote:
> Hi Mark and Junxiao,
> 
> 

>> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>>> Hi Gang,
>>>
>>> On 11/03/2015 03:54 PM, Gang He wrote:
 Hi Junxiao,

 Thank for your reviewing.
 Current design, we use a sysfile as a interface to check/fix a file (via 
>> pass a ino number).
 But, this operation is manually triggered by user, instead of 
 automatically 
>>  fix in the kernel.
 Why?
 1) we should let users make this decision, since some users do not want to 
>> fix when encountering a file system corruption, maybe they want to keep the 
>> file system unchanged for a further investigation.
>>> If user don't want this, they should not use error=continue option, let
>>> fs go after a corruption is very dangerous.
>>
>> Maybe we need another errors=XXX flag (maybe errors=fix)?
>>
>> You both make good points, here's what I gather from the conversation:
>>
>>  - Some customers would be sad if they have to manually fix corruptions.
>>This takes effort on their part, and if the FS can handle it
>>automatically, it should.
>>
>>  - There are valid concerns that automatically fixing things is a change in
>>behavior that might not be welcome, or worse might lead to unforseeable
>>circumstances.
>>
>>  - I will add that fixing things automatically implies checking them
>>automatically which could introduce some performance impact depending on
>>how much checking we're doing.
>>
>> So if the user wants errors to be fixed automatically, they could mount with
>> errros=fix, and everyone else would have no change in behavior unless they
>> wanted to make use of the new feature.
> That is what I want to say, add a mount option to let users to decide. Here, 
> I want to split "error=fix"
> mount option  task out from online file check feature, I think this part 
> should be a independent feature.
> We can implement this feature after online file check is done, I want to 
> split the feature into some more 
> detailed features, implement them one by one. Do you agree this point?
With error=fix, when a possible corruption is found, online fsck will
start to check and fix things. So this doesn't looks like a independent
feature.

Thanks,
Junxiao.

> 
>>
>>
 2) frankly speaking, this feature will probably bring a second corruption 
>> if there is some error in the code, I do not suggest to use automatically 
>> fix 
>> by default in the first version.
>>> I think if this feature could bring more corruption, then this should be
>>> fixed first.
>>
>> Btw, I am pretty sure that Gang is referring to the feature being new and
>> thus more likely to have problems. There is nothing I see in here that is
>> file system corrupting.
>>  --Mark
>>
>>
>> --
>> Mark Fasheh
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 1/4] mm: mmap: Add new /proc tunable for mmap_base ASLR.

2015-11-24 Thread Michael Ellerman
On Wed, 2015-11-18 at 15:20 -0800, Daniel Cashman wrote:

> From: dcashman 
> 
> ASLR currently only uses 8 bits to generate the random offset for the
> mmap base address on 32 bit architectures. This value was chosen to
> prevent a poorly chosen value from dividing the address space in such
> a way as to prevent large allocations. This may not be an issue on all
> platforms. Allow the specification of a minimum number of bits so that
> platforms desiring greater ASLR protection may determine where to place
> the trade-off.

...

> diff --git a/arch/Kconfig b/arch/Kconfig
> index 4e949e5..141823f 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -511,6 +511,70 @@ config ARCH_HAS_ELF_RANDOMIZE
> - arch_mmap_rnd()
> - arch_randomize_brk()
>  
> +config HAVE_ARCH_MMAP_RND_BITS
> + bool
> + help
> +   An arch should select this symbol if it supports setting a variable
> +   number of bits for use in establishing the base address for mmap
> +   allocations and provides values for both:
> +   - ARCH_MMAP_RND_BITS_MIN
> +   - ARCH_MMAP_RND_BITS_MAX
> +
> +config ARCH_MMAP_RND_BITS_MIN
> + int
> +
> +config ARCH_MMAP_RND_BITS_MAX
> + int
> +
> +config ARCH_MMAP_RND_BITS_DEFAULT
> + int
> +
> +config ARCH_MMAP_RND_BITS
> + int "Number of bits to use for ASLR of mmap base address" if EXPERT
> + range ARCH_MMAP_RND_BITS_MIN ARCH_MMAP_RND_BITS_MAX
> + default ARCH_MMAP_RND_BITS_DEFAULT if ARCH_MMAP_RND_BITS_DEFAULT

Here you support a default which is separate from the minimum.

> + default ARCH_MMAP_RND_BITS_MIN
> + depends on HAVE_ARCH_MMAP_RND_BITS

...
> +
> +config ARCH_MMAP_RND_COMPAT_BITS
> + int "Number of bits to use for ASLR of mmap base address for compatible 
> applications" if EXPERT
> + range ARCH_MMAP_RND_COMPAT_BITS_MIN ARCH_MMAP_RND_COMPAT_BITS_MAX
> + default ARCH_MMAP_RND_COMPAT_BITS_MIN

But here you don't.

Just forgot?

I'd like to have a default which is separate from the minimum. That way we can
have a default which is reasonably large, but allow it to be lowered easily if
anything breaks.

cheers

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check

2015-11-24 Thread Junxiao Bi
On 11/25/2015 05:46 AM, Mark Fasheh wrote:
> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> On 11/03/2015 03:54 PM, Gang He wrote:
>>> Hi Junxiao,
>>>
>>> Thank for your reviewing.
>>> Current design, we use a sysfile as a interface to check/fix a file (via 
>>> pass a ino number).
>>> But, this operation is manually triggered by user, instead of automatically 
>>>  fix in the kernel.
>>> Why?
>>> 1) we should let users make this decision, since some users do not want to 
>>> fix when encountering a file system corruption, maybe they want to keep the 
>>> file system unchanged for a further investigation.
>> If user don't want this, they should not use error=continue option, let
>> fs go after a corruption is very dangerous.
> 
> Maybe we need another errors=XXX flag (maybe errors=fix)?
Sound great. This is a good option since user may have not enough
knowledge whether to fix the found issue.

Thanks,
Junxiao.
> 
> You both make good points, here's what I gather from the conversation:
> 
>  - Some customers would be sad if they have to manually fix corruptions.
>This takes effort on their part, and if the FS can handle it
>automatically, it should.
> 
>  - There are valid concerns that automatically fixing things is a change in
>behavior that might not be welcome, or worse might lead to unforseeable
>circumstances.
> 
>  - I will add that fixing things automatically implies checking them
>automatically which could introduce some performance impact depending on
>how much checking we're doing.
> 
> So if the user wants errors to be fixed automatically, they could mount with
> errros=fix, and everyone else would have no change in behavior unless they
> wanted to make use of the new feature.
> 
> 
>>> 2) frankly speaking, this feature will probably bring a second corruption 
>>> if there is some error in the code, I do not suggest to use automatically 
>>> fix by default in the first version.
>> I think if this feature could bring more corruption, then this should be
>> fixed first.
> 
> Btw, I am pretty sure that Gang is referring to the feature being new and
> thus more likely to have problems. There is nothing I see in here that is
> file system corrupting.
>   --Mark
> 
> 
> --
> Mark Fasheh
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] regulator: pv88060: Fix irq leak

2015-11-24 Thread Axel Lin
Use devm_request_threaded_irq to ensure the irq is freed when unload the
module.

Signed-off-by: Axel Lin 
---
 drivers/regulator/pv88060-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/pv88060-regulator.c 
b/drivers/regulator/pv88060-regulator.c
index 60b16d8..69893f2 100644
--- a/drivers/regulator/pv88060-regulator.c
+++ b/drivers/regulator/pv88060-regulator.c
@@ -365,7 +365,7 @@ static int pv88060_i2c_probe(struct i2c_client *i2c,
return ret;
}
 
-   ret = request_threaded_irq(i2c->irq, NULL,
+   ret = devm_request_threaded_irq(&i2c->dev, i2c->irq, NULL,
pv88060_irq_handler,
IRQF_TRIGGER_LOW|IRQF_ONESHOT,
"pv88060", chip);
-- 
2.1.4



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: LTO build errors (Re: linux-next: clean up the kbuild tree?)

2015-11-24 Thread Andi Kleen

Hi Takashi,

On Tue, Nov 24, 2015 at 05:33:36PM +0100, Takashi Iwai wrote:
>   LD  vmlinux
> arch/x86/kernel/cpu/perf_event_intel_rapl.c:66:20: error: rapl_domain_names 
> causes a section type conflict with __setup_str_set_reset_devices
>  static const char *rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
> ^
> init/main.c:159:19: note: ‘__setup_str_set_reset_devices’ was declared here
>  __setup("reset_devices", set_reset_devices);
> 
> Hmm...  I see no direct relation, but OK, let's try to get rid of
> __initconst.  Now it hits lots of other errors like:

I hit the same issue, will send a patch. The other symbol is typically some
random correct symbol because gcc detects the conflict on a pair of symbols.

The problem is that placing const correctly is too difficult, the correct line
would be 

static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {

> 
> `__sw_hweight32' referenced in section `.text' of 
> /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> lib/built-in.o (symbol from plugin)
> `__sw_hweight32' referenced in section `.text' of 
> /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> lib/built-in.o (symbol from plugin)
> `__sw_hweight32' referenced in section `.text' of 
> /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> lib/built-in.o (symbol from plugin)
> `__sw_hweight32' referenced in section `.text' of 
> /tmp/ccUCMU7n.ltrans13.ltrans.o: defined in discarded section `.text' of 
> lib/built-in.o (symbol from plugin)

This needs

https://git.kernel.org/cgit/linux/kernel/git/ak/linux-misc.git/commit/?h=lto-4.0&id=d826425f7a9d935d521989bd0a871b76fb4c59e2


> /tmp/ccUCMU7n.ltrans21.ltrans.o: In function `do_exit':
> :(.text+0xfc0): undefined reference to `sys_futex'
> /tmp/ccUCMU7n.ltrans22.ltrans.o: In function `_do_fork':
> :(.text+0x39f7): undefined reference to `ret_from_fork'
> :(.text+0x4428): undefined reference to `ret_from_kernel_thread'


That's new, but can be fixed by adding __visible or asmlinkage to these symbols
I guess it's from the recent entry* restructuring.

I'll do an updated tree later.

Everything that's called from assembler in C needs to be marked like this. It's
fairly mechanic.

-andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] arm64: mm: support ARCH_MMAP_RND_BITS.

2015-11-24 Thread Michael Ellerman
On Mon, 2015-11-23 at 10:55 -0800, Daniel Cashman wrote:
> On 11/23/2015 07:04 AM, Will Deacon wrote:
> > On Wed, Nov 18, 2015 at 03:20:07PM -0800, Daniel Cashman wrote:
> > > +config ARCH_MMAP_RND_BITS_MAX
> > > +   default 20 if ARM64_64K_PAGES && ARCH_VA_BITS=39
> > > +   default 24 if ARCH_VA_BITS=39
> > > +   default 23 if ARM64_64K_PAGES && ARCH_VA_BITS=42
> > > +   default 27 if ARCH_VA_BITS=42
> > > +   default 29 if ARM64_64K_PAGES && ARCH_VA_BITS=48
> > > +   default 33 if ARCH_VA_BITS=48
> > > +   default 15 if ARM64_64K_PAGES
> > > +   default 19
> > > +
> > > +config ARCH_MMAP_RND_COMPAT_BITS_MIN
> > > +   default 7 if ARM64_64K_PAGES
> > > +   default 11
> > 
> > FYI: we now support 16k pages too, so this might need updating. It would
> > be much nicer if this was somehow computed rather than have the results
> > all open-coded like this.
> 
> Yes, I ideally wanted this to be calculated based on the different page
> options and VA_BITS (which itself has a similar stanza), but I don't
> know how to do that/if it is currently supported in Kconfig. This would
> be even more desirable with the addition of 16K_PAGES, as with this
> setup we have a combinatorial problem.
> 
> We could move this logic into the code where min/max are initialized,
> but that would create its own mess, creating new Kconfig values to
> introduce it in an arch-agnostic way after patch-set v2 moved that to
> mm/mmap.c instead of arch/${arch}/mm/mmap.c Suggestions welcome.


Could we instead change the meaning of the mmap_rnd_bits value to be the number
of address space bits that may be randomised?

ie. 40 would mean "please randomise in a 1T range", which with PAGE_SIZE=4K
gives you 28 random bits. etc.

That would make the value independent of PAGE_SIZE, and only depend on the size
of the address space.

It would also mean the values userspace sets and sees don't need to change if 
the
kernel PAGE_SIZE changes. (which probably doesn't happen often but still)

cheers

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ARM: exynos_defconfig: Enable NFSv4 client

2015-11-24 Thread Krzysztof Kozlowski
NFS client is already enabled (NFS_FS) and by default it enables clients
for version 2 and 3. Enable explicitly the version 4 client to utilize
the newer protocol.

The NFS client is especially useful for testing kernel in automated
environments (network boot with network file system).

Signed-off-by: Krzysztof Kozlowski 
---
 arch/arm/configs/exynos_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/exynos_defconfig 
b/arch/arm/configs/exynos_defconfig
index b3f9c7558851..409adc1eaf33 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -224,6 +224,7 @@ CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_CRAMFS=y
 CONFIG_ROMFS_FS=y
 CONFIG_NFS_FS=y
+CONFIG_NFS_V4=y
 CONFIG_ROOT_NFS=y
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ASCII=y
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

2015-11-24 Thread Junxiao Bi
Hi Mark,

On 11/25/2015 06:16 AM, Mark Fasheh wrote:
> Hi Junxiao,
> 
> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
>> fs generation value, and meta ecc. I never see a real corruption
>> happened only on this field, if these fields are corrupted, that means
>> something bad may happen on other place. So fix this field may not help
>> and even cause corruption more hard.
> 
> I agree that these are rather uncommon, we might even consider removing the
> VALID_FL fixup. I definitely don't think we're ready for anything more
> complicated than this though either. We kind of have to start somewhere too.
> 
Yes, the fix is too simple, and just a start, I think we'd better wait
more useful parts done before merging it.
> 
>> Second, the repair way is wrong. In
>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>> match the ones in memory, the ones in memory are used to update the disk
>> fields. The question is how do you know these field in memory are
>> right(they may be the real corrupted ones)?
> 
> Your second point (and the last part of your 1st point) makes a good
> argument for why this shouldn't happen automatically. Some of these
> corruptions might require a human to look at the log and decide what to do.
> Especially as you point out, where we might not know where the source of the
> corruption is. And if the human can't figure it out, then it's probably time
> to unmount and fsck.
The point is that the fix way is wrong, just flush memory info to disk
is not right. I agree online fsck is good feature, but need carefully
design, it should not involve more corruptions. A rough idea from mine
is that maybe we need some "frezee" mechanism in fs, which can hung all
fs op and let fs stop at a safe area. After freeze fs, we can do some
fsck work on it and these works should not cost lots time. What's your idea?

Thanks,
Junxiao.

> 
> Thanks,
>   --Mark
> 
> --
> Mark Fasheh
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/1] thermal: setup monitor only once after handling trips

2015-11-24 Thread Eduardo Valentin
Instead of changing the monitoring setup every time after
handling each trip, this patch simplifies the monitoring
setup by moving the setup call to a place where all
trips have been treated already.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index d9e525c..6debb54 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -457,11 +457,6 @@ static void handle_thermal_trip(struct thermal_zone_device 
*tz, int trip)
handle_critical_trips(tz, trip, type);
else
handle_non_critical_trips(tz, trip, type);
-   /*
-* Alright, we handled this trip successfully.
-* So, start monitoring again.
-*/
-   monitor_thermal_zone(tz);
 }
 
 /**
@@ -547,6 +542,12 @@ void thermal_zone_device_update(struct thermal_zone_device 
*tz)
 
for (count = 0; count < tz->trips; count++)
handle_thermal_trip(tz, count);
+
+   /*
+* Alright, we handled this trip successfully.
+* So, start monitoring again.
+*/
+   monitor_thermal_zone(tz);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

2015-11-24 Thread Wangnan (F)



On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:

Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:

On 11/24/15 7:00 AM, Yunlong Song wrote:

+static int record__write(struct record *rec, void *bf, size_t size)
+{
+   if (rec->memory.size && memory_enabled) {
+   if (perf_memory__write(&rec->memory, bf, size) < 0) {
+   pr_err("failed to write memory data, error: %m\n");
+   return -1;
+   }
+   } else {
+   if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+   pr_err("failed to write perf data, error: %m\n");
+   return -1;
+   }
+   rec->bytes_written += size;
}

-   rec->bytes_written += size;
return 0;
  }

@@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int idx)
if (old == head)
return 0;

+   memory_enabled = 1;
+
rec->samples++;

size = head - old;
@@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int idx)
md->prev = old;
perf_evlist__mmap_consume(rec->evlist, idx);
  out:
+   memory_enabled = 0;
return rc;
  }


So you are basically ignoring all samples until SIGUSR2 is received. That

No, he is not, its just that his code is difficult to follow, has to be
rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
will..


means the resulting data file will have limited history of task events for

... have a complete history of task events, since PERF_RECORD_FORK, etc
are not being ignored.

No?


Actually we are discussing about this problem.

For such tracking events (PERF_RECORD_FORK...), we have dummy event so
it is possible for us to receive tracking events from a separated
channel, therefore we don't have to parse every events to pick those
events out. Instead, we can process tracking events differently, then
more interesting things can be done. For example, squashing those tracking
events if it takes too much memory...

Furthermore, there's another problem being discussed: if userspace 
ringbuffer

is bytes based, parsing event is unavoidable. Without parsing event we are
unable to find the new 'head' pointer when overwriting. Instead, we are
thinking about a bucket-based ringbuffer that, let perf maintain a series
of bucket, each time 'poll' return, perf copies new events to the start of
a bucket. If all bucket is occupied, we drop the oldest bucket. Bucket-based
ringbuffer watest some memory but can avoid event parsing.

And there's many other problems in this patch. For example, when SIGUSR2 is
received, we need to do something to let all perf events start dumping.
Current implementation can't ensure we receive events just before the
SIGUSR2 if we not set 'no-buffer'.

Also, output events are in one perf.data, which is not user friendly.
Our final goal is to make perf a daemonized moniter, which can run 7x24
in user's environment. Each time a glitch is detected, a framework sends
a signal to perf to get a perf.data from it perf. The framework manage
those perf.data like logrotate, help developer analysis those glitch.

We are seeking the route implementing the final monitor. This patch is
an attempt to let you know what we want and get your thought about it.
Looks like you agree out basic idea. That's good. Then we decide to
start from some small feature to support the final goal. For example:
snapshot mode for specific events:

 # perf record -a -e cycles/snapshot/

And when C-c is pressed, for cycles event, only those data still in
kernel would be dump.

Thank you.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 1/4] iio: adc: add IMX7D ADC driver support

2015-11-24 Thread Stefan Agner
Hi Haibo,

Some comments below:

On 2015-11-20 07:48, Haibo Chen wrote:
> Freescale i.MX7D soc contains a new ADC IP. This patch add this ADC
> driver support, and the driver only support ADC software trigger.
> 
> Signed-off-by: Haibo Chen 
> ---
>  drivers/iio/adc/Kconfig |   9 +
>  drivers/iio/adc/Makefile|   1 +
>  drivers/iio/adc/imx7d_adc.c | 570 
> 
>  3 files changed, 580 insertions(+)
>  create mode 100644 drivers/iio/adc/imx7d_adc.c
> 
> diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
> index 7868c74..bf0611c 100644
> --- a/drivers/iio/adc/Kconfig
> +++ b/drivers/iio/adc/Kconfig
> @@ -194,6 +194,15 @@ config HI8435
> This driver can also be built as a module. If so, the module will be
> called hi8435.
>  
> +config IMX7D_ADC
> + tristate "IMX7D ADC driver"
> + depends on OF

Hm, not sure, but shouldn't we use a proper depends here? Otherwise this
will show up as modules in all kinds of distributions.

> + help
> +   Say yes here to build support for IMX7D ADC.
> +
> +   This driver can also be built as a module. If so, the module will be
> +   called imx7d_adc.
> +
>  config LP8788_ADC
>   tristate "LP8788 ADC driver"
>   depends on MFD_LP8788
> diff --git a/drivers/iio/adc/Makefile b/drivers/iio/adc/Makefile
> index 99b37a9..282ffc01 100644
> --- a/drivers/iio/adc/Makefile
> +++ b/drivers/iio/adc/Makefile
> @@ -20,6 +20,7 @@ obj-$(CONFIG_CC10001_ADC) += cc10001_adc.o
>  obj-$(CONFIG_DA9150_GPADC) += da9150-gpadc.o
>  obj-$(CONFIG_EXYNOS_ADC) += exynos_adc.o
>  obj-$(CONFIG_HI8435) += hi8435.o
> +obj-$(CONFIG_IMX7D_ADC) += imx7d_adc.o
>  obj-$(CONFIG_LP8788_ADC) += lp8788_adc.o
>  obj-$(CONFIG_MAX1027) += max1027.o
>  obj-$(CONFIG_MAX1363) += max1363.o
> diff --git a/drivers/iio/adc/imx7d_adc.c b/drivers/iio/adc/imx7d_adc.c
> new file mode 100644
> index 000..d9547bf
> --- /dev/null
> +++ b/drivers/iio/adc/imx7d_adc.c
> @@ -0,0 +1,570 @@
> +/*
> + * Freescale i.MX7D ADC driver
> + *
> + * Copyright (C) 2015 Freescale Semiconductor, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

Can you sort these alphabetically

> +
> +#include 
> +#include 
> +#include 
> +
> +/* ADC register */
> +#define IMX7D_REG_ADC_CH_A_CFG1  0x00
> +#define IMX7D_REG_ADC_CH_A_CFG2  0x10
> +#define IMX7D_REG_ADC_CH_B_CFG1  0x20
> +#define IMX7D_REG_ADC_CH_B_CFG2  0x30
> +#define IMX7D_REG_ADC_CH_C_CFG1  0x40
> +#define IMX7D_REG_ADC_CH_C_CFG2  0x50
> +#define IMX7D_REG_ADC_CH_D_CFG1  0x60
> +#define IMX7D_REG_ADC_CH_D_CFG2  0x70
> +#define IMX7D_REG_ADC_CH_SW_CFG  0x80
> +#define IMX7D_REG_ADC_TIMER_UNIT 0x90
> +#define IMX7D_REG_ADC_DMA_FIFO   0xa0
> +#define IMX7D_REG_ADC_FIFO_STATUS0xb0
> +#define IMX7D_REG_ADC_INT_SIG_EN 0xc0
> +#define IMX7D_REG_ADC_INT_EN 0xd0
> +#define IMX7D_REG_ADC_INT_STATUS 0xe0
> +#define IMX7D_REG_ADC_CHA_B_CNV_RSLT 0xf0
> +#define IMX7D_REG_ADC_CHC_D_CNV_RSLT 0x100
> +#define IMX7D_REG_ADC_CH_SW_CNV_RSLT 0x110
> +#define IMX7D_REG_ADC_DMA_FIFO_DAT   0x120
> +#define IMX7D_REG_ADC_ADC_CFG0x130
> +
> +#define IMX7D_EACH_CHANNEL_REG_SHIF  0x20

I would call that OFFSET, SHIFT is typically used for a bit offset
within a register.

> +
> +#define IMX7D_REG_ADC_CH_CFG1_CHANNEL_EN (0x1 << 31)
> +#define IMX7D_REG_ADC_CH_CFG1_CHANNEL_DISABLE(0x0 << 
> 31)

I would just define the _EN definition (along with using BIT).
Bitshifting a 0 is not really useful.

> +#define IMX7D_REG_ADC_CH_CFG1_CHANNEL_SINGLE BIT(30)
> +#define IMX7D_REG_ADC_CH_CFG1_CHANNEL_AVG_EN BIT(29)
> +#define IMX7D_REG_ADC_CH_CFG1_CHANNEL_SEL_SHIF   24
> +
> +#define IMX7D_REG_ADC_CH_CFG2_AVG_NUM_4  (0x0 << 
> 12)
> +#define IMX7D_REG_ADC_CH_CFG2_AVG_NUM_8  (0x1 << 
> 12)
> +#define IMX7D_REG_ADC_CH_CFG2_AVG_NUM_16 (0x2 << 12)
> +#define IMX7D_REG_ADC_CH_CFG2_AVG_NUM_32 (0x3 << 12)
> +
> +#define IMX7D_REG_ADC_TIMER_UNIT_PRE_DIV_4   (0x0 << 29)
> +#define IMX7D_REG_ADC_TIMER_UNIT_PRE_DIV_8   (0x1 << 29)
> +#define IMX7D_REG_ADC_TIMER_UNIT_PRE_DIV_16  (0x2 << 29)
> +#define IMX7D_REG_ADC_TIMER_UNIT

Re: [PATCH] paravirt: remove paravirt ops pmd_update[_defer] and pte_update_defer

2015-11-24 Thread Rusty Russell
Juergen Gross  writes:
> Ping?

Acked-by: Rusty Russell 

Cheers,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] module: keep percpu symbols in module's symtab

2015-11-24 Thread Rusty Russell
Miroslav Benes  writes:
> Currently, percpu symbols from .data..percpu ELF section of a module are
> not copied over and stored in final symtab array of struct module.
> Consequently such symbol cannot be returned via kallsyms API (for
> example kallsyms_lookup_name). This can be especially confusing when the
> percpu symbol is exported. Only its __ksymtab et al. are present in its
> symtab.
>
> The culprit is in layout_and_allocate() function where SHF_ALLOC flag is
> dropped for .data..percpu section. There is in fact no need to copy the
> section to final struct module, because kernel module loader allocates
> extra percpu section by itself. Unfortunately only symbols from
> SHF_ALLOC sections are copied (see is_core_symbol()).
>
> The patch restores SHF_ALLOC flag for original percpu section. The
> section with its symbols is thus copied over, but not otherwise used.
> st_value of percpu symbols points to correct newly allocated section
> thanks to correction in simplify_symbols().
>
> Signed-off-by: Miroslav Benes 
> ---
>
> I don't deem the solution nice. The other one I came up with was to hack
> is_core_symbol() to copy percpu symbols. There is a catch though.
> Elf_Sym's st_shndx is an index to an associated section. If we do not
> preserve .data..percpu section the index would be invalid. But this is
> similar to other symbols as well I guess. The index is never valid after
> move_module(), right? The only relevant check I found in the kernel is
> in get_ksymbol() - 'st_shndx == SHN_UNDEF'. So it could be harmless.

Yes, you should do this instead, I think.

Cheers,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH Resend] clk: imx: add 'is_prepared' clk_ops callback for pllv3 clk

2015-11-24 Thread Shawn Guo
On Wed, Nov 25, 2015 at 12:06:53AM +0800, Bai Ping wrote:
> Add 'is_prepared' callback function for pllv3 type clk to make sure when
> the system is bootup, the unused clk is in a known state to match the
> prepare count info.
> 
> Signed-off-by: Bai Ping 
> Reviewed-by: Lucas Stach 

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] thermal: rcar: enable to set tripN-temp via DT

2015-11-24 Thread Eduardo Valentin
Morimoto-san,


On Wed, Nov 25, 2015 at 01:45:14AM +, Kuninori Morimoto wrote:
> 
> From: Kuninori Morimoto 
> 
> Current rcar thermal driver is using 90 degrees as trip temp, but it
> should be based on each SoC / platform.
> This patch enables to set trip temp via DT. (It uses db8500-thermal
> style for it)
> It will use 90 degrees as default trip temp if DT doesn't have it.
> 
> Signed-off-by: Kuninori Morimoto 
> ---
>  .../devicetree/bindings/thermal/rcar-thermal.txt   |  2 ++
>  drivers/thermal/rcar_thermal.c | 34 
> --
>  2 files changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/thermal/rcar-thermal.txt 
> b/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> index 332e625..6c57f7e 100644
> --- a/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> +++ b/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> @@ -18,6 +18,8 @@ Required properties:
>  Option properties:
>  
>  - interrupts : use interrupt
> +- tripN-temp : temperature of trip point N. it will use 9 as 
> default
> +   if DT doesn't have tripN-temp

First of all, you are creating an entry with is specific to your driver.
That requires it to use proper prefixing.

Besides, your property is already covered by of-thermal. Please convert
your driver to use of-thermal, this way it will give you the flexibility
to configure thermal data in DT.

BR,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/3] arcmsr: changes driver version number

2015-11-24 Thread Ching Huang
From: Ching Huang 

Changes driver version number.

Signed-of-by: Ching Huang 

---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2015-11-25 10:52:13.33447 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2015-10-19 15:57:08.0 +0800
@@ -52,7 +52,7 @@ struct device_attribute;
#define ARCMSR_MAX_FREECCB_NUM  320
 #define ARCMSR_MAX_OUTSTANDING_CMD 255
 #endif
-#define ARCMSR_DRIVER_VERSION  "v1.30.00.04-20140919"
+#define ARCMSR_DRIVER_VERSION  "v1.30.00.21-20151019"
 #define ARCMSR_SCSI_INITIATOR_ID   
255
 #define ARCMSR_MAX_XFER_SECTORS
512
 #define ARCMSR_MAX_XFER_SECTORS_B  
4096


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC

2015-11-24 Thread Lan Tianyu
On 2015年11月24日 22:20, Alexander Duyck wrote:
> I'm still not a fan of this approach.  I really feel like this is
> something that should be resolved by extending the existing PCI hot-plug
> rather than trying to instrument this per driver.  Then you will get the
> goodness for multiple drivers and multiple OSes instead of just one.  An
> added advantage to dealing with this in the PCI hot-plug environment
> would be that you could then still do a hot-plug even if the guest
> didn't load a driver for the VF since you would be working with the PCI
> slot instead of the device itself.
> 
> - Alex

Hi Alex:
What's you mentioned seems the bonding driver solution.
Paper "Live Migration with Pass-through Device for Linux VM" describes
it. It does VF hotplug during migration. In order to maintain Network
connection when VF is out, it takes advantage of Linux bonding driver to
switch between VF NIC and emulated NIC. But the side affects, that
requires VM to do additional configure and the performance during
switching two NIC is not good.

-- 
Best regards
Tianyu Lan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check

2015-11-24 Thread Gang He
Hi Mark and Junxiao,


>>> 
> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>> 
>> On 11/03/2015 03:54 PM, Gang He wrote:
>> > Hi Junxiao,
>> > 
>> > Thank for your reviewing.
>> > Current design, we use a sysfile as a interface to check/fix a file (via 
> pass a ino number).
>> > But, this operation is manually triggered by user, instead of 
>> > automatically 
>  fix in the kernel.
>> > Why?
>> > 1) we should let users make this decision, since some users do not want to 
> fix when encountering a file system corruption, maybe they want to keep the 
> file system unchanged for a further investigation.
>> If user don't want this, they should not use error=continue option, let
>> fs go after a corruption is very dangerous.
> 
> Maybe we need another errors=XXX flag (maybe errors=fix)?
> 
> You both make good points, here's what I gather from the conversation:
> 
>  - Some customers would be sad if they have to manually fix corruptions.
>This takes effort on their part, and if the FS can handle it
>automatically, it should.
> 
>  - There are valid concerns that automatically fixing things is a change in
>behavior that might not be welcome, or worse might lead to unforseeable
>circumstances.
> 
>  - I will add that fixing things automatically implies checking them
>automatically which could introduce some performance impact depending on
>how much checking we're doing.
> 
> So if the user wants errors to be fixed automatically, they could mount with
> errros=fix, and everyone else would have no change in behavior unless they
> wanted to make use of the new feature.
That is what I want to say, add a mount option to let users to decide. Here, I 
want to split "error=fix"
mount option  task out from online file check feature, I think this part should 
be a independent feature.
We can implement this feature after online file check is done, I want to split 
the feature into some more 
detailed features, implement them one by one. Do you agree this point?

> 
> 
>> > 2) frankly speaking, this feature will probably bring a second corruption 
> if there is some error in the code, I do not suggest to use automatically fix 
> by default in the first version.
>> I think if this feature could bring more corruption, then this should be
>> fixed first.
> 
> Btw, I am pretty sure that Gang is referring to the feature being new and
> thus more likely to have problems. There is nothing I see in here that is
> file system corrupting.
>   --Mark
> 
> 
> --
> Mark Fasheh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/3] arcmsr: adds code for support areca new adapter ARC1203

2015-11-24 Thread Ching Huang
From: Ching Huang 

Support areca new PCIe to SATA RAID adapter ARC1203

Signed-of-by: Ching Huang

---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2015-11-25 10:52:16.28647 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2015-11-25 10:52:13.33447 +0800
@@ -74,6 +74,9 @@ struct device_attribute;
 #ifndef PCI_DEVICE_ID_ARECA_1214
#define PCI_DEVICE_ID_ARECA_12140x1214
 #endif
+#ifndef PCI_DEVICE_ID_ARECA_1203
+   #define PCI_DEVICE_ID_ARECA_12030x1203
+#endif
 /*
 
**
 **
@@ -245,6 +248,12 @@ struct FIRMWARE_INFO
 /* window of "instruction flags" from iop to driver */
 #define ARCMSR_IOP2DRV_DOORBELL   0x00020408
 #define ARCMSR_IOP2DRV_DOORBELL_MASK  0x0002040C
+/* window of "instruction flags" from iop to driver */
+#define ARCMSR_IOP2DRV_DOORBELL_1203  0x00021870
+#define ARCMSR_IOP2DRV_DOORBELL_MASK_1203 0x00021874
+/* window of "instruction flags" from driver to iop */
+#define ARCMSR_DRV2IOP_DOORBELL_1203  0x00021878
+#define ARCMSR_DRV2IOP_DOORBELL_MASK_1203 0x0002187C
 /* ARECA FLAG LANGUAGE */
 /* ioctl transfer */
 #define ARCMSR_IOP2DRV_DATA_WRITE_OK  0x0001
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2015-11-24 11:35:26.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2015-11-24 18:58:40.640226000 +0800
@@ -114,6 +114,7 @@ static void arcmsr_hardware_reset(struct
 static const char *arcmsr_info(struct Scsi_Host *);
 static irqreturn_t arcmsr_interrupt(struct AdapterControlBlock *acb);
 static void arcmsr_free_irq(struct pci_dev *, struct AdapterControlBlock *);
+static void arcmsr_wait_firmware_ready(struct AdapterControlBlock *acb);
 static int arcmsr_adjust_disk_queue_depth(struct scsi_device *sdev, int 
queue_depth)
 {
if (queue_depth > ARCMSR_MAX_CMD_PERLUN)
@@ -157,6 +158,8 @@ static struct pci_device_id arcmsr_devic
.driver_data = ACB_ADAPTER_TYPE_B},
{PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1202),
.driver_data = ACB_ADAPTER_TYPE_B},
+   {PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1203),
+   .driver_data = ACB_ADAPTER_TYPE_B},
{PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1210),
.driver_data = ACB_ADAPTER_TYPE_A},
{PCI_DEVICE(PCI_VENDOR_ID_ARECA, PCI_DEVICE_ID_ARECA_1214),
@@ -2621,7 +2624,7 @@ static bool arcmsr_hbaA_get_config(struc
 }
 static bool arcmsr_hbaB_get_config(struct AdapterControlBlock *acb)
 {
-   struct MessageUnit_B *reg = acb->pmuB;
+   struct MessageUnit_B *reg;
struct pci_dev *pdev = acb->pdev;
void *dma_coherent;
dma_addr_t dma_coherent_handle;
@@ -2649,10 +2652,17 @@ static bool arcmsr_hbaB_get_config(struc
acb->dma_coherent2 = dma_coherent;
reg = (struct MessageUnit_B *)dma_coherent;
acb->pmuB = reg;
-   reg->drv2iop_doorbell= (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL);
-   reg->drv2iop_doorbell_mask = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_MASK);
-   reg->iop2drv_doorbell = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL);
-   reg->iop2drv_doorbell_mask = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_MASK);
+   if (acb->pdev->device == PCI_DEVICE_ID_ARECA_1203) {
+   reg->drv2iop_doorbell = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_1203);
+   reg->drv2iop_doorbell_mask = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_MASK_1203);
+   reg->iop2drv_doorbell = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_1203);
+   reg->iop2drv_doorbell_mask = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_MASK_1203);
+   } else {
+   reg->drv2iop_doorbell= (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL);
+   reg->drv2iop_doorbell_mask = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_DRV2IOP_DOORBELL_MASK);
+   reg->iop2drv_doorbell = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL);
+   reg->iop2drv_doorbell_mask = (uint32_t __iomem *)((unsigned 
long)acb->mem_base0 + ARCMSR_IOP2DRV_DOORBELL_MASK);
+   }
reg->message_wbuffer = (uint32_t __iomem *)((unsigned 
long)acb->mem_base1 + ARCMSR_MESSAGE_WBUFFER);
reg->message_rbuffer =  (uint32_t __iomem *)((unsigned 
long)acb->mem_base1 + ARCMSR_MESSAGE_RBUFFER);
reg->message_rwbuffer = (uint32_t __iomem *)((unsi

Re: [PATCH v6 01/17] arm64:ilp32: add documentation on the ILP32 ABI for ARM64

2015-11-24 Thread Iosif Harutyunov
Sonicwall is very interested in ILP32, is there a way we can get access to the 
SuSe builds?

Iosif,_

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/3] arcmsr: fixed getting wrong configuration data

2015-11-24 Thread Ching Huang
From: Ching Huang 

Fixed getting wrong configuration data of adapter type B and type D.

Signed-of-by: Ching Huang 

---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2015-11-23 16:25:22.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2015-11-24 11:35:26.0 +0800
@@ -2694,15 +2694,15 @@ static bool arcmsr_hbaB_get_config(struc
acb->firm_model,
acb->firm_version);
 
-   acb->signature = readl(®->message_rwbuffer[1]);
+   acb->signature = readl(®->message_rwbuffer[0]);
/*firm_signature,1,00-03*/
-   acb->firm_request_len = readl(®->message_rwbuffer[2]);
+   acb->firm_request_len = readl(®->message_rwbuffer[1]);
/*firm_request_len,1,04-07*/
-   acb->firm_numbers_queue = readl(®->message_rwbuffer[3]);
+   acb->firm_numbers_queue = readl(®->message_rwbuffer[2]);
/*firm_numbers_queue,2,08-11*/
-   acb->firm_sdram_size = readl(®->message_rwbuffer[4]);
+   acb->firm_sdram_size = readl(®->message_rwbuffer[3]);
/*firm_sdram_size,3,12-15*/
-   acb->firm_hd_channels = readl(®->message_rwbuffer[5]);
+   acb->firm_hd_channels = readl(®->message_rwbuffer[4]);
/*firm_ide_channels,4,16-19*/
acb->firm_cfg_version = readl(®->message_rwbuffer[25]);  
/*firm_cfg_version,25,100-103*/
/*firm_ide_channels,4,16-19*/
@@ -2880,15 +2880,15 @@ static bool arcmsr_hbaD_get_config(struc
iop_device_map++;
count--;
}
-   acb->signature = readl(®->msgcode_rwbuffer[1]);
+   acb->signature = readl(®->msgcode_rwbuffer[0]);
/*firm_signature,1,00-03*/
-   acb->firm_request_len = readl(®->msgcode_rwbuffer[2]);
+   acb->firm_request_len = readl(®->msgcode_rwbuffer[1]);
/*firm_request_len,1,04-07*/
-   acb->firm_numbers_queue = readl(®->msgcode_rwbuffer[3]);
+   acb->firm_numbers_queue = readl(®->msgcode_rwbuffer[2]);
/*firm_numbers_queue,2,08-11*/
-   acb->firm_sdram_size = readl(®->msgcode_rwbuffer[4]);
+   acb->firm_sdram_size = readl(®->msgcode_rwbuffer[3]);
/*firm_sdram_size,3,12-15*/
-   acb->firm_hd_channels = readl(®->msgcode_rwbuffer[5]);
+   acb->firm_hd_channels = readl(®->msgcode_rwbuffer[4]);
/*firm_hd_channels,4,16-19*/
acb->firm_cfg_version = readl(®->msgcode_rwbuffer[25]);
pr_notice("Areca RAID Controller%d: Model %s, F/W %s\n",


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] KVM: x86: Add lowest-priority support for vt-d posted-interrupts

2015-11-24 Thread Wu, Feng


> -Original Message-
> From: Radim Krčmář [mailto:rkrc...@redhat.com]
> Sent: Tuesday, November 24, 2015 10:32 PM
> To: Wu, Feng 
> Cc: pbonz...@redhat.com; k...@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH] KVM: x86: Add lowest-priority support for vt-d posted-
> interrupts
> 
> 2015-11-24 01:26+, Wu, Feng:
> > "I don't think we do any vector hashing on our client parts.  This may be
> why the customer is not able to detect this on Skylake client silicon.
> > The vector hashing is micro-architectural and something we had done on
> server parts.
> >
> > If you look at the haswell server CPU spec (https://www-
> ssl.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-
> e5-v3-datasheet-vol-2.pdf)
> > In section 4.1.2, you will see an IntControl register (this is a register
> controlled/configured by BIOS) - see below.
> 
> Thank you!
> 
> > If you look at bits 6:4 in that register, you see the option we offer in
> hardware for what kind of redirection is applied to lowest priority 
> interrupts.
> > There are three options:
> > 1.  Fixed priority
> > 2.  Redirect last
> > 3.  Hash Vector
> >
> > If picking vector hash, then bits 10:8 specifies the APIC-ID bits used for 
> > the
> hashing."
> 
> The hash function just interprets a subset of vector's bits as a number
> and uses that as a starting offset in a search for an enabled APIC
> within the destination set?
> 
> For example:
> The x2APIC destination is 0x0055 (= first four even APICs in cluster
> 0), the vector is 0b1110, and bits 10:8 of IntControl are 000.
> 
> 000 means that bits 7:4 of vector are selected, thus the vector hash is
> 0b1110 = 14, so the round-robin effectively does 14 % 4 (because we only
> have 4 destinations) and delivers to the 3rd possible APIC (= ID 6)?

In my current implementation, I don't select a subset of vector's bits as
the number, instead, I use the whole vector number. For software emulation
p. o. v, do we really need to select a subset of the vector's bits as the base
number? What is your opinion? Thanks a lot!

Thank,
Feng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] usb : replace dma_pool_alloc and memset with dma_pool_zalloc

2015-11-24 Thread Saurabh Sengar
any conclusion for this patch ?
any feedback ?

On Oct 28, 2015 2:46 PM, "Peter Chen"  wrote:
>
> On Wed, Oct 28, 2015 at 12:44:35PM +0530, Saurabh Sengar wrote:
> > replace dma_pool_alloc and memset with a single call to dma_pool_zalloc
> >
> > Signed-off-by: Saurabh Sengar 
> > ---
> >  drivers/usb/chipidea/udc.c  | 3 +--
> >  drivers/usb/gadget/udc/gr_udc.c | 3 +--
> >  drivers/usb/host/uhci-q.c   | 3 +--
> >  drivers/usb/host/whci/qset.c| 3 +--
> >  drivers/usb/host/xhci-mem.c | 6 ++
> >  5 files changed, 6 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> > index 8223fe7..235b948f 100644
> > --- a/drivers/usb/chipidea/udc.c
> > +++ b/drivers/usb/chipidea/udc.c
> > @@ -349,14 +349,13 @@ static int add_td_to_list(struct ci_hw_ep *hwep, 
> > struct ci_hw_req *hwreq,
> >   if (node == NULL)
> >   return -ENOMEM;
> >
> > - node->ptr = dma_pool_alloc(hwep->td_pool, GFP_ATOMIC,
> > + node->ptr = dma_pool_zalloc(hwep->td_pool, GFP_ATOMIC,
> >  &node->dma);
> >   if (node->ptr == NULL) {
> >   kfree(node);
> >   return -ENOMEM;
> >   }
> >
> > - memset(node->ptr, 0, sizeof(struct ci_hw_td));
> >   node->ptr->token = cpu_to_le32(length << __ffs(TD_TOTAL_BYTES));
> >   node->ptr->token &= cpu_to_le32(TD_TOTAL_BYTES);
> >   node->ptr->token |= cpu_to_le32(TD_STATUS_ACTIVE);
> > diff --git a/drivers/usb/gadget/udc/gr_udc.c 
> > b/drivers/usb/gadget/udc/gr_udc.c
> > index b9429bc..39b7136 100644
> > --- a/drivers/usb/gadget/udc/gr_udc.c
> > +++ b/drivers/usb/gadget/udc/gr_udc.c
> > @@ -253,13 +253,12 @@ static struct gr_dma_desc *gr_alloc_dma_desc(struct 
> > gr_ep *ep, gfp_t gfp_flags)
> >   dma_addr_t paddr;
> >   struct gr_dma_desc *dma_desc;
> >
> > - dma_desc = dma_pool_alloc(ep->dev->desc_pool, gfp_flags, &paddr);
> > + dma_desc = dma_pool_zalloc(ep->dev->desc_pool, gfp_flags, &paddr);
> >   if (!dma_desc) {
> >   dev_err(ep->dev->dev, "Could not allocate from DMA pool\n");
> >   return NULL;
> >   }
> >
> > - memset(dma_desc, 0, sizeof(*dma_desc));
> >   dma_desc->paddr = paddr;
> >
> >   return dma_desc;
> > diff --git a/drivers/usb/host/uhci-q.c b/drivers/usb/host/uhci-q.c
> > index da6f56d..c17ea15 100644
> > --- a/drivers/usb/host/uhci-q.c
> > +++ b/drivers/usb/host/uhci-q.c
> > @@ -248,11 +248,10 @@ static struct uhci_qh *uhci_alloc_qh(struct uhci_hcd 
> > *uhci,
> >   dma_addr_t dma_handle;
> >   struct uhci_qh *qh;
> >
> > - qh = dma_pool_alloc(uhci->qh_pool, GFP_ATOMIC, &dma_handle);
> > + qh = dma_pool_zalloc(uhci->qh_pool, GFP_ATOMIC, &dma_handle);
> >   if (!qh)
> >   return NULL;
> >
> > - memset(qh, 0, sizeof(*qh));
> >   qh->dma_handle = dma_handle;
> >
> >   qh->element = UHCI_PTR_TERM(uhci);
> > diff --git a/drivers/usb/host/whci/qset.c b/drivers/usb/host/whci/qset.c
> > index dc31c42..3297473 100644
> > --- a/drivers/usb/host/whci/qset.c
> > +++ b/drivers/usb/host/whci/qset.c
> > @@ -30,10 +30,9 @@ struct whc_qset *qset_alloc(struct whc *whc, gfp_t 
> > mem_flags)
> >   struct whc_qset *qset;
> >   dma_addr_t dma;
> >
> > - qset = dma_pool_alloc(whc->qset_pool, mem_flags, &dma);
> > + qset = dma_pool_zalloc(whc->qset_pool, mem_flags, &dma);
> >   if (qset == NULL)
> >   return NULL;
> > - memset(qset, 0, sizeof(struct whc_qset));
> >
> >   qset->qset_dma = dma;
> >   qset->whc = whc;
> > diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> > index 41f841f..060c20c 100644
> > --- a/drivers/usb/host/xhci-mem.c
> > +++ b/drivers/usb/host/xhci-mem.c
> > @@ -47,13 +47,12 @@ static struct xhci_segment *xhci_segment_alloc(struct 
> > xhci_hcd *xhci,
> >   if (!seg)
> >   return NULL;
> >
> > - seg->trbs = dma_pool_alloc(xhci->segment_pool, flags, &dma);
> > + seg->trbs = dma_pool_zalloc(xhci->segment_pool, flags, &dma);
> >   if (!seg->trbs) {
> >   kfree(seg);
> >   return NULL;
> >   }
> >
> > - memset(seg->trbs, 0, TRB_SEGMENT_SIZE);
> >   /* If the cycle state is 0, set the cycle bit to 1 for all the TRBs */
> >   if (cycle_state == 0) {
> >   for (i = 0; i < TRBS_PER_SEGMENT; i++)
> > @@ -517,12 +516,11 @@ static struct xhci_container_ctx 
> > *xhci_alloc_container_ctx(struct xhci_hcd *xhci
> >   if (type == XHCI_CTX_TYPE_INPUT)
> >   ctx->size += CTX_SIZE(xhci->hcc_params);
> >
> > - ctx->bytes = dma_pool_alloc(xhci->device_pool, flags, &ctx->dma);
> > + ctx->bytes = dma_pool_zalloc(xhci->device_pool, flags, &ctx->dma);
> >   if (!ctx->bytes) {
> >   kfree(ctx);
> >   return NULL;
> >   }
> > - memset(ctx->bytes, 0, ctx->size);
> >   return ctx;
> >  }
> >
> > --
>
> For chipidea changes:
> Acked-by: Peter

[PATCH v2 0/3] arcmsr: support areca new adapter ARC1203

2015-11-24 Thread Ching Huang
From: Ching Huang 

Patch 1 fixes getting wrong configuration data.

Pacth 2 adds codes to support new adapter ARC1203.

Patch 3 changes driver version number.

--


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-24 Thread Finn Thain

On Tue, 24 Nov 2015, Ondrej Zary wrote:

> On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> > Linux v2.1.105 changed the algorithm for polling for the BSY signal
> > in NCR5380_select() and NCR5380_main().
> > 
> > Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> > meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> > provided the recommended 250 ms selection time-out delay. This got broken
> > when HZ became configurable.
> > 
> > We could fix this but there's no need to reschedule the main loop. This
> > BSY polling presently happens when the NCR5380_main() work queue item
> > calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> > calls NCR5380_select() again, and so on.
> > 
> > This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> > The extra complexity and state is pointless. There's no reason to
> > stop selection half-way and return to to the main loop when the main
> > loop can do nothing useful until selection completes.
> > 
> > So just poll for BSY. We can sleep while polling now that we have a
> > suitable workqueue.
> 
> Bisecting slow module initialization pointed to this commit.

That's disappointing. This patch removed some nasty code. Anyway, thanks 
for taking the trouble to bisect.

> 
> Before this commit (2 seconds):
> [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options 
> { AUTOPROBE_IRQ PSEUDO_DMA }
> [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 
> After this commit (22 seconds):
> [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options 
> { AUTOPROBE_IRQ PSEUDO_DMA }
> [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' 
> as unstable because the skew is too large:
> [  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
> wd_last: f4da04 mask: ff
> [  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
> cs_last: 2c710aa17f mask: 
> [  145.032733] clocksource: Switched to clocksource acpi_pm

I figured that it was okay to sleep from an unbound CPU-intensive 
workqueue but doing so seems to cause problems. (See also patch 66/71 
"Fix soft lockups".)

Perhaps a kthread is needed instead of a workqueue? (This workqueue 
already has it's own kthread, but top shows that it doesn't accrue CPU 
time.)

> [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 

This problem doesn't show up on my hardware, and I'd like to know where 
those 22 seconds are being spent. Would you please apply the entire series 
and add,
#define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
to the top of g_NCR5380.c and send me the messages logged during modprobe?

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KEYS: Fix handling of stored error in a negatively instantiated user key

2015-11-24 Thread James Morris
On Tue, 24 Nov 2015, David Howells wrote:

> Hi James,
> 
> Can this be passed straight to Linus please?

Is this triggerable by normal users?


-- 
James Morris


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH (v2) 1/10] clocksource: Add brcm,bcm6345-timer device tree binding

2015-11-24 Thread Rob Herring
On Mon, Nov 23, 2015 at 06:55:38PM +, Simon Arlott wrote:
> Add device tree bindings for the BCM6345/BCM6318 timers. This is required
> for the BCM6345 watchdog which needs to respond to one of the timer
> interrupts.
> 
> Signed-off-by: Simon Arlott 

Acked-by: Rob Herring 

> ---
> On 23/11/15 15:33, Jonas Gorski wrote:
> > On Sat, Nov 21, 2015 at 8:02 PM, Simon Arlott  wrote:
> >> +- compatible: should be "brcm,bcm-timer", "brcm,bcm6345-timer"
> > 
> > Since bcm6318 uses a slightly different register layout than the
> > earlier SoCs, I'd argue that using bcm6345-timer as a compatible for
> > bcm6318 is wrong.
> 
> I've split them out into two very similar bindings.
> 
> Patches 1/4 and 2/4 are replaced with (v2) 1/10 and (v2) 2/10.
> 
>  .../bindings/timer/brcm,bcm6318-timer.txt  | 44 
>  .../bindings/timer/brcm,bcm6345-timer.txt  | 47 
> ++
>  2 files changed, 91 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/timer/brcm,bcm6318-timer.txt
>  create mode 100644 
> Documentation/devicetree/bindings/timer/brcm,bcm6345-timer.txt
> 
> diff --git a/Documentation/devicetree/bindings/timer/brcm,bcm6318-timer.txt 
> b/Documentation/devicetree/bindings/timer/brcm,bcm6318-timer.txt
> new file mode 100644
> index 000..cf4be7e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/timer/brcm,bcm6318-timer.txt
> @@ -0,0 +1,44 @@
> +Broadcom BCM6318 Timer
> +
> +This block is a timer that is connected to multiple interrupts on the main
> +interrupt controller and functions as a programmable interrupt controller for
> +timer events. There is a main timer interrupt for all timers.
> +
> +- 4 independent timers with their own interrupt, and own maskable level
> +  interrupt bit in the main timer interrupt
> +
> +- 1 watchdog timer with an unmaskable level interrupt bit in the main timer
> +  interrupt
> +
> +- Contains one enable/status word pair
> +
> +- No atomic set/clear operations
> +
> +Required properties:
> +
> +- compatible: should be "brcm,bcm-timer", "brcm,bcm6318-timer"
> +- reg: specifies the base physical address and size of the registers, 
> excluding
> +  the watchdog registers
> +- interrupt-controller: identifies the node as an interrupt controller
> +- #interrupt-cells: specifies the number of cells needed to encode an 
> interrupt
> +  source, should be 1.
> +- interrupt-parent: specifies the phandle to the parent interrupt 
> controller(s)
> +  this one is cascaded from
> +- interrupts: specifies the interrupt line(s) in the interrupt-parent 
> controller
> +  node for the main timer interrupt, followed by the individual timer
> +  interrupts; valid values depend on the type of parent interrupt controller
> +- clocks: phandle of timer reference clock (periph)
> +
> +Example:
> +
> +timer: timer@1040 {
> + compatible = "brcm,bcm63148-timer", "brcm,bcm6318-timer";
> + reg = <0x1040 0x28>;
> +
> + interrupt-controller;
> + #interrupt-cells = <1>;
> +
> + interrupt-parent = <&periph_intc>;
> + interrupts = <31>, <0>, <1>, <2>, <3>;
> + clock = <&periph_osc>;
> +};
> diff --git a/Documentation/devicetree/bindings/timer/brcm,bcm6345-timer.txt 
> b/Documentation/devicetree/bindings/timer/brcm,bcm6345-timer.txt
> new file mode 100644
> index 000..03250dd
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/timer/brcm,bcm6345-timer.txt
> @@ -0,0 +1,47 @@
> +Broadcom BCM6345 Timer
> +
> +This block is a timer that is connected to one interrupt on the main 
> interrupt
> +controller and functions as a programmable interrupt controller for timer
> +events.
> +
> +- 3 independent timers with their own maskable level interrupt bit (but not
> +  per CPU because there is only one parent interrupt and the timers share it)
> +
> +- 1 watchdog timer with an unmaskable level interrupt
> +
> +- Contains one enable/status word pair
> +
> +- No atomic set/clear operations
> +
> +The lack of per CPU ability of timers makes them unusable as a set of
> +clockevent devices, otherwise they could be attached to the remaining
> +interrupts.
> +
> +Required properties:
> +
> +- compatible: should be "brcm,bcm-timer", "brcm,bcm6345-timer"
> +- reg: specifies the base physical address and size of the registers, 
> excluding
> +  the watchdog registers
> +- interrupt-controller: identifies the node as an interrupt controller
> +- #interrupt-cells: specifies the number of cells needed to encode an 
> interrupt
> +  source, should be 1.
> +- interrupt-parent: specifies the phandle to the parent interrupt 
> controller(s)
> +  this one is cascaded from
> +- interrupts: specifies the interrupt line(s) in the interrupt-parent 
> controller
> +  node for the timer interrupt; valid values depend on the type of parent
> +  interrupt controller
> +- clocks: phandle of timer reference clock (periph)
> +
> +Example:
> +
> +timer: timer@1080 {
> + compatible = "brcm,bcm63168-timer", "brcm,bcm6345-timer

Re: [PATCH] mm/compaction: __compact_pgdat() code cleanuup

2015-11-24 Thread Joonsoo Kim
On Tue, Nov 24, 2015 at 09:49:49AM +0100, Vlastimil Babka wrote:
> On 11/24/2015 07:24 AM, Joonsoo Kim wrote:
> >This patch uses is_via_compact_memory() to distinguish direct compaction.
> >And it also reduces indentation on compaction_defer_reset
> >by filtering failure case. There is no functional change.
> >
> >Signed-off-by: Joonsoo Kim 
> >---
> >  mm/compaction.c | 15 +--
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> >diff --git a/mm/compaction.c b/mm/compaction.c
> >index de3e1e7..2b1a15e 100644
> >--- a/mm/compaction.c
> >+++ b/mm/compaction.c
> >@@ -1658,14 +1658,17 @@ static void __compact_pgdat(pg_data_t *pgdat, struct 
> >compact_control *cc)
> > !compaction_deferred(zone, cc->order))
> > compact_zone(zone, cc);
> >
> >-if (cc->order > 0) {
> >-if (zone_watermark_ok(zone, cc->order,
> >-low_wmark_pages(zone), 0, 0))
> >-compaction_defer_reset(zone, cc->order, false);
> >-}
> >-
> > VM_BUG_ON(!list_empty(&cc->freepages));
> > VM_BUG_ON(!list_empty(&cc->migratepages));
> >+
> >+if (is_via_compact_memory(cc->order))
> >+continue;
> 
> That's fine.
> 
> >+if (!zone_watermark_ok(zone, cc->order,
> >+low_wmark_pages(zone), 0, 0))
> >+continue;
> >+
> >+compaction_defer_reset(zone, cc->order, false);
> 
> Here I'd personally find the way of "if(watermark_ok) defer_reset()"
> logic easier to follow.

Okay. Will change it.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

2015-11-24 Thread Josh Triplett
On November 24, 2015 4:10:48 PM PST, Andy Lutomirski  
wrote:
>On Tue, Nov 24, 2015 at 2:42 PM, Josh Triplett 
>wrote:
>>>  textdata bss dec hex filename
>>> before:644896  127436 1189384 1961716  1deef4
>vmlinux
>>> after: 645446  131532 1189384 1966362  1e011a
>vmlinux
>>>
>>>   [Nr] Name  TypeAddr OffSize  
>ES Flg Lk Inf Al
>>> before:   [12] .altinstructions  PROGBITSc10bdf48 0bef48
>000680 00   A  0   0  1
>>> after:[12] .altinstructions  PROGBITSc10bff48 0c0f48
>0007d2 00   A  0   0  1
>>>
>>> before:   [13] .altinstr_replace PROGBITSc10be5c8 0bf5c8
>00016c 00  AX  0   0  1
>>> after:[13] .altinstr_replace PROGBITSc10c071a 0c171a
>0001ad 00  AX  0   0  1
>>>
>>> before:   [ 7] .data PROGBITSc1092000 093000
>0132a0 00  WA  0   0 4096
>>> after:[ 7] .data PROGBITSc1093000 094000
>0142a0 00  WA  0   0 4096
>>>
>>> So I'm wondering if we should make a config option which converts
>>> static_cpu_has* macros to boot_cpu_has()? That should slim down
>>> the kernel even more but it won't benefit from the speedup of the
>>> static_cpu_has* stuff.
>>>
>>> Josh, thoughts?
>>
>> Seems like a good idea to me: that would sacrifice a small amount of
>> runtime performance in favor of code size.  (Note that the config
>option
>> should use static_cpu_has when =y, and the slower, smaller method
>when
>> =n, so that "allnoconfig" can DTRT.)
>>
>> Given that many embedded systems will know exactly what CPU they want
>to
>> run on, I'd also love to see a way to set the capabilities of the CPU
>at
>> compile time, so that all those checks (and the code within them) can
>> constant-fold away.
>>
>
>As another idea, the alternatives infrastructure could plausibly be
>rearranged so that it never exists in memory in decompressed form.  We
>could decompress it streamily and process it as we go.

That doesn't help when running the uncompressed kernel in place, though. It'd 
be nice if every use of alternatives and similar mechanisms supported 
build-time resolution.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value

2015-11-24 Thread Joonsoo Kim
On Tue, Nov 24, 2015 at 09:36:09AM -0600, Christoph Lameter wrote:
> On Tue, 24 Nov 2015, Joonsoo Kim wrote:
> 
> > When I tested compaction in low memory condition, I found that
> > my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> > This stuck last for 1 sec and after then it can escape. More investigation
> > shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> > so it is stuck for 1 sec.
> 
> vmstat values are not designed to be accurate and are not guaranteed to be
> accurate. Comparing to specific values should not be done. If you need an
> accurate counter then please use another method of accounting like an
> atomic.

I think that maintaining duplicate counter to guarantee accuracy isn't
reasonable solution. It would cause more overhead to the system.

Although vmstat values aren't designed for accuracy, these are already
used by some sensitive places so it is better to be more accurate.
What this patch does is just adding current cpu's diff to global value
when retrieving in order to get more accurate value and this would not be
expensive. I think that it doesn't break any design principle of vmstat.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 00/16] MADV_FREE support

2015-11-24 Thread Minchan Kim
Hi Andrew,

On Tue, Nov 24, 2015 at 01:58:51PM -0800, Andrew Morton wrote:
> On Fri, 20 Nov 2015 17:02:32 +0900 Minchan Kim  wrote:
> 
> > I have been spent a lot of time to land MADV_FREE feature
> > by request of userland people(esp, Daniel and Jason, jemalloc guys.
> 
> A couple of things...
> 
> There's a massive and complex reject storm against Kirill's page-flags
> and thp changes.  The problems in hugetlb.c are more than I can
> reasonably fix up, sorry.  How would you feel about redoing the patches
> against next -mm?

No problem at all.

> 
> Secondly, "mm: introduce lazyfree LRU list" and "mm: support MADV_FREE
> on swapless system" are new, and require significant reviewer
> attention.  But there's so much other stuff flying around that I doubt
> if we'll get effective review.  So perhaps it would be best to shelve
> those new things and introduce them later, after the basic old
> MADV_FREE work has settled in?
> 

That's really what we(Daniel, Michael and me) want so far.
A people who is reluctant to it is Johannes who wanted to support
MADV_FREE on swapless system via new LRU from the beginning.

If Johannes is not strong against Andrew's plan, I will resend
new patchset(ie, not including new stuff) based on next -mmotm.

Hannes?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH (v2) 7/10] watchdog: bcm63xx_wdt: Add get_timeleft function

2015-11-24 Thread Guenter Roeck

On 11/24/2015 02:15 PM, Simon Arlott wrote:

Return the remaining time from the hardware control register.

Warn when the device is registered if the hardware watchdog is currently
running and report the remaining time left.


This is really two logical changes, isn't it ?

Nice trick to figure out if the watchdog is running.

What is the impact ? Will this result in interrupts ?
If so, would it make sense to _not_ reset the system after a timeout
in this case, but to keep pinging the watchdog while the watchdog device
is not open ?

Thanks,
Guenter



Signed-off-by: Simon Arlott 
---
Changed "if (timeleft > 0)" to "if (hw->running)" when checking if a
warning should be printed, in case the time left is truncated down to
0 seconds.

  drivers/watchdog/bcm63xx_wdt.c | 37 +
  1 file changed, 37 insertions(+)

diff --git a/drivers/watchdog/bcm63xx_wdt.c b/drivers/watchdog/bcm63xx_wdt.c
index 3c7667a..9d099e0 100644
--- a/drivers/watchdog/bcm63xx_wdt.c
+++ b/drivers/watchdog/bcm63xx_wdt.c
@@ -14,6 +14,7 @@
  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -75,6 +76,19 @@ static int bcm63xx_wdt_stop(struct watchdog_device *wdd)
return 0;
  }

+static unsigned int bcm63xx_wdt_get_timeleft(struct watchdog_device *wdd)
+{
+   struct bcm63xx_wdt_hw *hw = watchdog_get_drvdata(wdd);
+   unsigned long flags;
+   u32 val;
+
+   raw_spin_lock_irqsave(&hw->lock, flags);
+   val = __raw_readl(hw->regs + WDT_CTL_REG);
+   val /= hw->clock_hz;
+   raw_spin_unlock_irqrestore(&hw->lock, flags);
+   return val;
+}
+
  static int bcm63xx_wdt_set_timeout(struct watchdog_device *wdd,
unsigned int timeout)
  {
@@ -130,6 +144,7 @@ static struct watchdog_ops bcm63xx_wdt_ops = {
.owner = THIS_MODULE,
.start = bcm63xx_wdt_start,
.stop = bcm63xx_wdt_stop,
+   .get_timeleft = bcm63xx_wdt_get_timeleft,
.set_timeout = bcm63xx_wdt_set_timeout,
  };

@@ -144,6 +159,8 @@ static int bcm63xx_wdt_probe(struct platform_device *pdev)
struct bcm63xx_wdt_hw *hw;
struct watchdog_device *wdd;
struct resource *r;
+   u32 timeleft1, timeleft2;
+   unsigned int timeleft;
int ret;

hw = devm_kzalloc(&pdev->dev, sizeof(*hw), GFP_KERNEL);
@@ -197,6 +214,23 @@ static int bcm63xx_wdt_probe(struct platform_device *pdev)
watchdog_init_timeout(wdd, 0, &pdev->dev);
watchdog_set_nowayout(wdd, nowayout);

+   /* Compare two reads of the time left value, 2 clock ticks apart */
+   rmb();
+   timeleft1 = __raw_readl(hw->regs + WDT_CTL_REG);
+   udelay(DIV_ROUND_UP(100, hw->clock_hz / 2));
+   /* Ensure the register is read twice */
+   rmb();
+   timeleft2 = __raw_readl(hw->regs + WDT_CTL_REG);
+
+   /* If the time left is changing, the watchdog is running */
+   if (timeleft1 != timeleft2) {
+   hw->running = true;
+   timeleft = bcm63xx_wdt_get_timeleft(wdd);
+   } else {
+   hw->running = false;
+   timeleft = 0;
+   }
+
ret = bcm63xx_timer_register(TIMER_WDT_ID, bcm63xx_wdt_isr, wdd);
if (ret < 0) {
dev_err(&pdev->dev, "failed to register wdt timer isr\n");
@@ -214,6 +248,8 @@ static int bcm63xx_wdt_probe(struct platform_device *pdev)
dev_name(wdd->dev), hw->regs,
wdd->timeout, wdd->max_timeout);

+   if (hw->running)
+   dev_alert(wdd->dev, "running, reboot in %us\n", timeleft);
return 0;

  unregister_timer:
@@ -255,6 +291,7 @@ module_platform_driver(bcm63xx_wdt_driver);

  MODULE_AUTHOR("Miguel Gaio ");
  MODULE_AUTHOR("Florian Fainelli ");
+MODULE_AUTHOR("Simon Arlott");
  MODULE_DESCRIPTION("Driver for the Broadcom BCM63xx SoC watchdog");
  MODULE_LICENSE("GPL");
  MODULE_ALIAS("platform:bcm63xx-wdt");



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/10] watchdog: bcm63xx_wdt: Use WATCHDOG_CORE

2015-11-24 Thread Guenter Roeck

Hi Simon,

On 11/22/2015 06:06 AM, Simon Arlott wrote:

Convert bcm63xx_wdt to use WATCHDOG_CORE.

The default and maximum time constants that are only used once have been
moved to the initialisation of the struct watchdog_device.


Comments inline.

Thanks,
Guenter


Signed-off-by: Simon Arlott 
---
  drivers/watchdog/Kconfig   |   1 +
  drivers/watchdog/bcm63xx_wdt.c | 249 -
  2 files changed, 74 insertions(+), 176 deletions(-)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 7a8a6c6..6815b74 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1273,6 +1273,7 @@ config OCTEON_WDT
  config BCM63XX_WDT
tristate "Broadcom BCM63xx hardware watchdog"
depends on BCM63XX
+   select WATCHDOG_CORE
help
  Watchdog driver for the built in watchdog hardware in Broadcom
  BCM63xx SoC.
diff --git a/drivers/watchdog/bcm63xx_wdt.c b/drivers/watchdog/bcm63xx_wdt.c
index f88fc97..1d2a501 100644
--- a/drivers/watchdog/bcm63xx_wdt.c
+++ b/drivers/watchdog/bcm63xx_wdt.c
@@ -13,20 +13,15 @@

  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

-#include 
  #include 
-#include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
-#include 
  #include 
  #include 

@@ -38,53 +33,57 @@
  #define PFX KBUILD_MODNAME

  #define WDT_HZ5000/* Fclk */
-#define WDT_DEFAULT_TIME   30  /* seconds */
-#define WDT_MAX_TIME   (0x / WDT_HZ)   /* seconds */

  struct bcm63xx_wdt_hw {
raw_spinlock_t lock;
void __iomem *regs;
-   unsigned long inuse;
bool running;


The "running" flag should no longer be needed. watchdog_active()
should provide that information.


  };
-static struct bcm63xx_wdt_hw bcm63xx_wdt_device;

-static int expect_close;
-
-static int wdt_time = WDT_DEFAULT_TIME;
  static bool nowayout = WATCHDOG_NOWAYOUT;
  module_param(nowayout, bool, 0);
  MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
__MODULE_STRING(WATCHDOG_NOWAYOUT) ")");

-/* HW functions */
-static void bcm63xx_wdt_hw_start(void)
+static int bcm63xx_wdt_start(struct watchdog_device *wdd)
  {
+   struct bcm63xx_wdt_hw *hw = watchdog_get_drvdata(wdd);
unsigned long flags;

-   raw_spin_lock_irqsave(&bcm63xx_wdt_device.lock, flags);
-   bcm_writel(wdt_time * WDT_HZ, bcm63xx_wdt_device.regs + WDT_DEFVAL_REG);
-   bcm_writel(WDT_START_1, bcm63xx_wdt_device.regs + WDT_CTL_REG);
-   bcm_writel(WDT_START_2, bcm63xx_wdt_device.regs + WDT_CTL_REG);
-   bcm63xx_wdt_device.running = true;
-   raw_spin_unlock_irqrestore(&bcm63xx_wdt_device.lock, flags);
+   raw_spin_lock_irqsave(&hw->lock, flags);
+   bcm_writel(wdd->timeout * WDT_HZ, hw->regs + WDT_DEFVAL_REG);
+   bcm_writel(WDT_START_1, hw->regs + WDT_CTL_REG);
+   bcm_writel(WDT_START_2, hw->regs + WDT_CTL_REG);
+   hw->running = true;
+   raw_spin_unlock_irqrestore(&hw->lock, flags);
+   return 0;
  }

-static void bcm63xx_wdt_hw_stop(void)
+static int bcm63xx_wdt_stop(struct watchdog_device *wdd)
  {
+   struct bcm63xx_wdt_hw *hw = watchdog_get_drvdata(wdd);
unsigned long flags;

-   raw_spin_lock_irqsave(&bcm63xx_wdt_device.lock, flags);
-   bcm_writel(WDT_STOP_1, bcm63xx_wdt_device.regs + WDT_CTL_REG);
-   bcm_writel(WDT_STOP_2, bcm63xx_wdt_device.regs + WDT_CTL_REG);
-   bcm63xx_wdt_device.running = false;
-   raw_spin_unlock_irqrestore(&bcm63xx_wdt_device.lock, flags);
+   raw_spin_lock_irqsave(&hw->lock, flags);
+   bcm_writel(WDT_STOP_1, hw->regs + WDT_CTL_REG);
+   bcm_writel(WDT_STOP_2, hw->regs + WDT_CTL_REG);
+   hw->running = false;
+   raw_spin_unlock_irqrestore(&hw->lock, flags);
+   return 0;
+}
+
+static int bcm63xx_wdt_set_timeout(struct watchdog_device *wdd,
+   unsigned int timeout)
+{
+   wdd->timeout = timeout;
+   return bcm63xx_wdt_start(wdd);


If I see correctly, there is no ping function. In that case, the watchdog core
will call the start function after updating the timeout, so there is no need
to do it here.


  }

  /* The watchdog interrupt occurs when half the timeout is remaining */
  static void bcm63xx_wdt_isr(void *data)
  {
-   struct bcm63xx_wdt_hw *hw = &bcm63xx_wdt_device;
+   struct watchdog_device *wdd = data;
+   struct bcm63xx_wdt_hw *hw = watchdog_get_drvdata(wdd);
unsigned long flags;

raw_spin_lock_irqsave(&hw->lock, flags);
@@ -118,147 +117,36 @@ static void bcm63xx_wdt_isr(void *data)
}

ms = timeleft / (WDT_HZ / 1000);
-   pr_alert("warning timer fired, reboot in %ums\n", ms);
+   dev_alert(wdd->dev,
+   "warning timer fired, reboot in %ums\n", ms);
}
raw_spin_unlock_irqrestore(&hw->lock

Re: [PATCH 0/2] arcmsr: support areca new adapter ARC1203

2015-11-24 Thread Ching Huang
On Tue, 2015-11-24 at 16:07 +0100, Hannes Reinecke wrote:
> On 11/24/2015 09:00 AM, Ching Huang wrote:
> > From: Ching Huang 
> > 
> > Patch 1 fixes getting wrong configuration data.
> > 
> > Pacth 2 adds codes to support new adapter ARC1203.
> > 
Patch 3 changes driver version number.
> Please split off the driver version update into a separate patch and
> add it as the last patch in the series.
> Otherwise it'll be hard to keep track of which patch belongs to
> which version.
> Thanks.
> 
> Cheers,
> 
> Hannes
> 
OK. So I have patch 3 as above.
I will resend v2 all 3 patches later. 

Thanks, 
Ching
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm, vmstat: Allow WQ concurrency to discover memory reclaim doesn't make any progress

2015-11-24 Thread Joonsoo Kim
On Tue, Nov 24, 2015 at 03:44:48PM -0800, Andrew Morton wrote:
> On Thu, 19 Nov 2015 13:30:53 +0100 Michal Hocko  wrote:
> 
> > From: Michal Hocko 
> > 
> > Tetsuo Handa has reported that the system might basically livelock in OOM
> > condition without triggering the OOM killer. The issue is caused by
> > internal dependency of the direct reclaim on vmstat counter updates (via
> > zone_reclaimable) which are performed from the workqueue context.
> > If all the current workers get assigned to an allocation request,
> > though, they will be looping inside the allocator trying to reclaim
> > memory but zone_reclaimable can see stalled numbers so it will consider
> > a zone reclaimable even though it has been scanned way too much. WQ
> > concurrency logic will not consider this situation as a congested workqueue
> > because it relies that worker would have to sleep in such a situation.
> > This also means that it doesn't try to spawn new workers or invoke
> > the rescuer thread if the one is assigned to the queue.
> > 
> > In order to fix this issue we need to do two things. First we have to
> > let wq concurrency code know that we are in trouble so we have to do
> > a short sleep. In order to prevent from issues handled by 0e093d99763e
> > ("writeback: do not sleep on the congestion queue if there are no
> > congested BDIs or if significant congestion is not being encountered in
> > the current zone") we limit the sleep only to worker threads which are
> > the ones of the interest anyway.
> > 
> > The second thing to do is to create a dedicated workqueue for vmstat and
> > mark it WQ_MEM_RECLAIM to note it participates in the reclaim and to
> > have a spare worker thread for it.
> 
> This vmstat update thing is being a problem.  Please see Joonsoo's
> "mm/vmstat: retrieve more accurate vmstat value".
> 
> Joonsoo, might this patch help with that issue?

That issue cannot be solved by this patch. This patch solves blocking
vmstat updator problem but that issue is caused by long update delay
(not blocking). In there, update happens every 1 sec as usuall.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 4/4] devicetree: update documentation for fw_cfg ARM bindings

2015-11-24 Thread Rob Herring
On Mon, Nov 23, 2015 at 10:57:44AM -0500, Gabriel L. Somlo wrote:
> From: Gabriel Somlo 
> 
> Remove fw_cfg hardware interface details from
> Documentation/devicetree/bindings/arm/fw-cfg.txt,
> and replace them with a pointer to the authoritative
> documentation in the QEMU source tree.
> 
> Signed-off-by: Gabriel Somlo 
> Cc: Laszlo Ersek 

Acked-by: Rob Herring 

> ---
>  Documentation/devicetree/bindings/arm/fw-cfg.txt | 38 
> ++--
>  1 file changed, 2 insertions(+), 36 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/arm/fw-cfg.txt 
> b/Documentation/devicetree/bindings/arm/fw-cfg.txt
> index 953fb64..ce27386 100644
> --- a/Documentation/devicetree/bindings/arm/fw-cfg.txt
> +++ b/Documentation/devicetree/bindings/arm/fw-cfg.txt
> @@ -11,43 +11,9 @@ QEMU exposes the control and data register to ARM guests 
> as memory mapped
>  registers; their location is communicated to the guest's UEFI firmware in the
>  DTB that QEMU places at the bottom of the guest's DRAM.
>  
> -The guest writes a selector value (a key) to the selector register, and then
> -can read the corresponding data (produced by QEMU) via the data register. If
> -the selected entry is writable, the guest can rewrite it through the data
> -register.
> +The authoritative guest-side hardware interface documentation to the fw_cfg
> +device ca be found in "docs/specs/fw_cfg.txt" in the QEMU source tree.
>  
> -The selector register takes keys in big endian byte order.
> -
> -The data register allows accesses with 8, 16, 32 and 64-bit width (only at
> -offset 0 of the register). Accesses larger than a byte are interpreted as
> -arrays, bundled together only for better performance. The bytes constituting
> -such a word, in increasing address order, correspond to the bytes that would
> -have been transferred by byte-wide accesses in chronological order.
> -
> -The interface allows guest firmware to download various parameters and blobs
> -that affect how the firmware works and what tables it installs for the guest
> -OS. For example, boot order of devices, ACPI tables, SMBIOS tables, kernel 
> and
> -initrd images for direct kernel booting, virtual machine UUID, SMP 
> information,
> -virtual NUMA topology, and so on.
> -
> -The authoritative registry of the valid selector values and their meanings is
> -the QEMU source code; the structure of the data blobs corresponding to the
> -individual key values is also defined in the QEMU source code.
> -
> -The presence of the registers can be verified by selecting the "signature" 
> blob
> -with key 0x, and reading four bytes from the data register. The returned
> -signature is "QEMU".
> -
> -The outermost protocol (involving the write / read sequences of the control 
> and
> -data registers) is expected to be versioned, and/or described by feature 
> bits.
> -The interface revision / feature bitmap can be retrieved with key 0x0001. The
> -blob to be read from the data register has size 4, and it is to be 
> interpreted
> -as a uint32_t value in little endian byte order. The current value
> -(corresponding to the above outer protocol) is zero.
> -
> -The guest kernel is not expected to use these registers (although it is
> -certainly allowed to); the device tree bindings are documented here because
> -this is where device tree bindings reside in general.
>  
>  Required properties:
>  
> -- 
> 2.4.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   6   7   8   9   10   >