Re: [PATCH RESEND] powerpc/numa: initialize distance lookup table from drconf path

2015-08-11 Thread Nikunj A Dadhania

Hi Michael,

Nikunj A Dadhania  writes:
> In some situations, a NUMA guest that supports
> ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
> distances between nodes. This is because of two problems in the
> current code.
>
> 1) Different representations of associativity lists.
>
>There is an assumption about the associativity list in
>initialize_distance_lookup_table(). Associativity list has two forms:
>
>    a) [cpu,memory]@x/ibm,associativity has the following
>       format:
>
>         <length, domainID-1, domainID-2, ..., domainID-length>
>
>    b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays
>
>         <M, N, M associativity lists each having N entries>
>
>       M = the number of associativity lists
>       N = the number of entries per associativity list
>
>Fix initialize_distance_lookup_table() so that it does not assume
>"case a". And update the caller to skip the length field before
>sending the associativity list.
>
> 2) Distance table not getting updated from drconf path.
>
>Node distance table will not get initialized in certain cases as
>ibm,dynamic-reconfiguration-memory path does not initialize the
>lookup table.
>
>Call initialize_distance_lookup_table() from drconf path with
>appropriate associativity list.
>
> Reported-by: Bharata B Rao 
> Signed-off-by: Nikunj A Dadhania 
> Acked-by: Anton Blanchard 

Have you pulled this?

Regards,
Nikunj



[PATCH RESEND] powerpc/numa: initialize distance lookup table from drconf path

2015-07-01 Thread Nikunj A Dadhania
In some situations, a NUMA guest that supports
ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
distances between nodes. This is because of two problems in the
current code.

1) Different representations of associativity lists.

   There is an assumption about the associativity list in
   initialize_distance_lookup_table(). Associativity list has two forms:

   a) [cpu,memory]@x/ibm,associativity has the following
      format:

        <length, domainID-1, domainID-2, ..., domainID-length>

   b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays

        <M, N, M associativity lists each having N entries>

      M = the number of associativity lists
      N = the number of entries per associativity list

   Fix initialize_distance_lookup_table() so that it does not assume
   "case a". And update the caller to skip the length field before
   sending the associativity list.

2) Distance table not getting updated from drconf path.

   Node distance table will not get initialized in certain cases as
   ibm,dynamic-reconfiguration-memory path does not initialize the
   lookup table.

   Call initialize_distance_lookup_table() from drconf path with
   appropriate associativity list.

Reported-by: Bharata B Rao 
Signed-off-by: Nikunj A Dadhania 
Acked-by: Anton Blanchard 
---

* Rebased to mpe/next
* Dropped RFC tag
* Updated commit log

 arch/powerpc/mm/numa.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 5e80621..8b9502a 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -225,7 +225,7 @@ static void initialize_distance_lookup_table(int nid,
for (i = 0; i < distance_ref_points_depth; i++) {
const __be32 *entry;
 
-   entry = &associativity[be32_to_cpu(distance_ref_points[i])];
+   entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
distance_lookup_table[nid][i] = of_read_number(entry, 1);
}
 }
@@ -248,8 +248,12 @@ static int associativity_to_nid(const __be32 *associativity)
nid = -1;
 
if (nid > 0 &&
-   of_read_number(associativity, 1) >= distance_ref_points_depth)
-   initialize_distance_lookup_table(nid, associativity);
+   of_read_number(associativity, 1) >= distance_ref_points_depth) {
+   /*
+* Skip the length field and send start of associativity array
+*/
+   initialize_distance_lookup_table(nid, associativity + 1);
+   }
 
 out:
return nid;
@@ -507,6 +511,12 @@ static int of_drconf_to_nid_single(struct of_drconf_cell *drmem,
 
if (nid == 0xffff || nid >= MAX_NUMNODES)
nid = default_nid;
+
+   if (nid > 0) {
+   index = drmem->aa_index * aa->array_sz;
+   initialize_distance_lookup_table(nid,
+   &aa->arrays[index]);
+   }
}
 
return nid;
-- 
2.4.3
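
As an aside for readers of this archive, here is a minimal user-space sketch
of the two associativity-list layouts described in the commit message. All
values, array sizes and helper names below are illustrative assumptions, not
data from any real device tree; the point is only to show why the caller has
to skip the leading length cell in case (a), while the drconf path can pass
&aa->arrays[index] directly in case (b).

#include <stdio.h>
#include <stdint.h>

#define DEPTH 2	/* assumed distance_ref_points_depth */

/* assumed reference points: 1-based indexes into an associativity list */
static const uint32_t ref_points[DEPTH] = { 4, 2 };

/*
 * Mimics initialize_distance_lookup_table() after the fix: 'assoc' points
 * at the first domainID, i.e. there is no leading length cell.
 */
static void fill_distance_row(uint32_t row[DEPTH], const uint32_t *assoc)
{
	for (int i = 0; i < DEPTH; i++)
		row[i] = assoc[ref_points[i] - 1];	/* "- 1": assoc[0] is domainID-1 */
}

int main(void)
{
	/* case (a): ibm,associativity = <length, domainID-1 ... domainID-length> */
	uint32_t ibm_associativity[] = { 4, 0, 0, 1, 3 };

	/* case (b): <M, N, then M lists of N entries each> */
	uint32_t lookup_arrays[] = { 2, 4,		/* M = 2, N = 4 */
				     0, 0, 1, 3,	/* list 0 */
				     0, 0, 2, 5 };	/* list 1 */
	uint32_t array_sz = lookup_arrays[1];
	uint32_t *arrays = &lookup_arrays[2];
	uint32_t aa_index = 1;

	uint32_t row_a[DEPTH], row_b[DEPTH];

	/* the caller skips the length cell in case (a) ... */
	fill_distance_row(row_a, ibm_associativity + 1);
	/* ... while the drconf path indexes a whole list by aa_index */
	fill_distance_row(row_b, &arrays[aa_index * array_sz]);

	printf("case a: %u %u\n", row_a[0], row_a[1]);
	printf("case b: %u %u\n", row_b[0], row_b[1]);
	return 0;
}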



Re: [RFC PATCH] powerpc/numa: initialize distance lookup table from drconf path

2015-06-25 Thread Nikunj A Dadhania

Hi Anton/Michael,

Nikunj A Dadhania  writes:
> Hi Anton,
>
> Anton Blanchard  writes:
>> Hi Nikunj,
>>
>>> From: Nikunj A Dadhania 
>>> 
>>> powerpc/numa: initialize distance lookup table from drconf path
>>> 
>>> In some situations, a NUMA guest that supports
>>> ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
>>> distances between nodes. This is because of two problems in the
>>> current code.
>>
>> Thanks for the patch. Have we tested that this doesn't regress the
>> non dynamic representation?
>
> Yes, that is tested. And works as expected.

If the patch looks fine, can this be pushed upstream?

Regards,
Nikunj



Re: [RFC PATCH] powerpc/numa: initialize distance lookup table from drconf path

2015-06-23 Thread Nikunj A Dadhania

Hi Anton,

Anton Blanchard  writes:
> Hi Nikunj,
>
>> From: Nikunj A Dadhania 
>> 
>> powerpc/numa: initialize distance lookup table from drconf path
>> 
>> In some situations, a NUMA guest that supports
>> ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
>> distances between nodes. This is because of two problems in the
>> current code.
>
> Thanks for the patch. Have we tested that this doesn't regress the
> non dynamic representation?

Yes, that is tested. And works as expected.

Regards
Nikunj



Re: [RFC PATCH] powerpc/numa: initialize distance lookup table from drconf path

2015-06-15 Thread Nikunj A Dadhania
Nikunj A Dadhania  writes:

> Reworded commit log:
>
> From: Nikunj A Dadhania 
>
> powerpc/numa: initialize distance lookup table from drconf path
>

Ping ?

Regards
Nikunj



Re: [RFC PATCH] powerpc/numa: initialize distance lookup table from drconf path

2015-06-09 Thread Nikunj A Dadhania

Reworded commit log:

From: Nikunj A Dadhania 

powerpc/numa: initialize distance lookup table from drconf path

In some situations, a NUMA guest that supports
ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
distances between nodes. This is because of two problems in the
current code.

1) Different representations of associativity lists.

   There is an assumption about the associativity list in
   initialize_distance_lookup_table(). Associativity list has two forms:

   a) [cpu,memory]@x/ibm,associativity has the following
      format:

        <length, domainID-1, domainID-2, ..., domainID-length>

   b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays

        <M, N, M associativity lists each having N entries>

      M = the number of associativity lists
      N = the number of entries per associativity list

   Fix initialize_distance_lookup_table() so that it does not assume
   "case a". And update the caller to skip the length field before
   sending the associativity list.

2) Distance table not getting updated from drconf path.

   Node distance table will not get initialized in certain cases as
   ibm,dynamic-reconfiguration-memory path does not initialize the
   lookup table.

   Call initialize_distance_lookup_table() from drconf path with
   appropriate associativity list.

Reported-by: Bharata B Rao 
Signed-off-by: Nikunj A Dadhania 
---
 arch/powerpc/mm/numa.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 5e80621..8b9502a 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -225,7 +225,7 @@ static void initialize_distance_lookup_table(int nid,
for (i = 0; i < distance_ref_points_depth; i++) {
const __be32 *entry;
 
-   entry = &associativity[be32_to_cpu(distance_ref_points[i])];
+   entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
distance_lookup_table[nid][i] = of_read_number(entry, 1);
}
 }
@@ -248,8 +248,12 @@ static int associativity_to_nid(const __be32 *associativity)
nid = -1;
 
if (nid > 0 &&
-   of_read_number(associativity, 1) >= distance_ref_points_depth)
-   initialize_distance_lookup_table(nid, associativity);
+   of_read_number(associativity, 1) >= distance_ref_points_depth) {
+   /*
+* Skip the length field and send start of associativity array
+*/
+   initialize_distance_lookup_table(nid, associativity + 1);
+   }
 
 out:
return nid;
@@ -507,6 +511,12 @@ static int of_drconf_to_nid_single(struct of_drconf_cell *drmem,
 
if (nid == 0xffff || nid >= MAX_NUMNODES)
nid = default_nid;
+
+   if (nid > 0) {
+   index = drmem->aa_index * aa->array_sz;
+   initialize_distance_lookup_table(nid,
+   &aa->arrays[index]);
+   }
}
 
return nid;
-- 
1.8.3.1



[RFC PATCH] powerpc/numa: initialize distance lookup table from drconf path

2015-06-09 Thread Nikunj A Dadhania
Node distance will not get initialized in certain cases as
dynamic-reconfiguration path does not initialize the lookup table.

There is an assumption about the associativity list in
initialize_distance_lookup_table(). Associativity list has two forms:

a) [cpu,memory]@x/ibm,associativity has the following
   format:

     <length, domainID-1, domainID-2, ..., domainID-length>

b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays

     <M, N, M associativity lists each having N entries>

   M = the number of associativity lists
   N = the number of entries per associativity list

Fix initialize_distance_lookup_table() so that it does not assume
"case a". And update the caller to skip the length field before
sending the associativity list.

Call initialize_distance_lookup_table() from drconf path with
appropriate associativity list.

Reported-by: Bharata B Rao 
Signed-off-by: Nikunj A Dadhania 
---
 arch/powerpc/mm/numa.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 5e80621..8b9502a 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -225,7 +225,7 @@ static void initialize_distance_lookup_table(int nid,
for (i = 0; i < distance_ref_points_depth; i++) {
const __be32 *entry;
 
-   entry = &associativity[be32_to_cpu(distance_ref_points[i])];
+   entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
distance_lookup_table[nid][i] = of_read_number(entry, 1);
}
 }
@@ -248,8 +248,12 @@ static int associativity_to_nid(const __be32 *associativity)
nid = -1;
 
if (nid > 0 &&
-   of_read_number(associativity, 1) >= distance_ref_points_depth)
-   initialize_distance_lookup_table(nid, associativity);
+   of_read_number(associativity, 1) >= distance_ref_points_depth) {
+   /*
+* Skip the length field and send start of associativity array
+*/
+   initialize_distance_lookup_table(nid, associativity + 1);
+   }
 
 out:
return nid;
@@ -507,6 +511,12 @@ static int of_drconf_to_nid_single(struct of_drconf_cell *drmem,
 
if (nid == 0xffff || nid >= MAX_NUMNODES)
nid = default_nid;
+
+   if (nid > 0) {
+   index = drmem->aa_index * aa->array_sz;
+   initialize_distance_lookup_table(nid,
+   &aa->arrays[index]);
+   }
}
 
return nid;
-- 
1.8.3.1



Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

2013-01-11 Thread Nikunj A Dadhania
Hi Michael,

Michael Wang  writes:
>   Prev:
>   +-+-+---+
>   | 7484 MB |  32 | 42463 |
>   Post:
>   | 7483 MB |  32 | 44185 |   +0.18%
That should be +4.05%
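(44185 - 42463) / 42463 ≈ 0.0405, i.e. roughly a 4.05% improvement over the
42463 baseline, not 0.18%.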

Regards
Nikunj



Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-10-11 Thread Nikunj A Dadhania
On Wed, 10 Oct 2012 09:24:55 -0500, Andrew Theurer wrote:
> 
> Below is again 8 x 20-way VMs, but this time I tried out Nikunj's gang
> scheduling patches.  While I am not recommending gang scheduling, I
> think it's a good data point.  The performance is 3.88x the PLE result.
> 
> https://docs.google.com/open?id=0B6tfUNlZ-14wWXdscWcwNTVEY3M

That looks pretty good and serves the purpose. And the result says it all.

> Note that the task switching intervals of 4ms are quite obvious again,
> and this time all vCPUs from same VM run at the same time.  It
> represents the best possible outcome.
> 
> 
> Anyway, I thought the bitmaps might help better visualize what's going
> on.
> 
> -Andrew
> 

Regards
Nikunj



Re: [PATCH v2] printk: add option to print cpu id

2012-08-03 Thread Nikunj A Dadhania
On Fri, 3 Aug 2012 02:16:18 -0700, Vikram Pandita  wrote:
> From: Vikram Pandita 
> 
> Introduce config option to enable CPU id reporting for printk() calls.
> 
> Example logs with this option enabled look like:
>  [1] [2.328613] usbcore: registered new interface driver libusual
>  [1] [2.335418] usbcore: registered new interface driver usbtest
>  [1] [2.342803] mousedev: PS/2 mouse device common for all mice
>  [0] [2.352600] twl_rtc twl_rtc: Power up reset detected.
>  [0] [2.359191] twl_rtc twl_rtc: Enabling TWL-RTC
>  [1] [2.367797] twl_rtc twl_rtc: rtc core: registered twl_rtc as rtc0
>  [1] [2.375274] i2c /dev entries driver
>  [1] [2.382324] Driver for 1-wire Dallas network protocol.
> 
> Its sometimes very useful to have printk also print the CPU Identifier
> that executed the call. This has helped to debug various SMP issues on 
> shipping
> products.
> 
> Known limitation is if the system gets preempted between function call and
> actual printk, the reported cpu-id might not be accurate. But most of the
> times its seen to give a good feel of how the N cpu's in the system are
> getting loaded.
> 
> Signed-off-by: Vikram Pandita 
> Cc: Kay Sievers 
> Cc: Mike Turquette 
> Cc: Vimarsh Zutshi 
> ---
> v1: initial version - had wrong cpuid logging mechanism
> v2: fixed as per review comments from Kay Sievers
> 
>  kernel/printk.c   |   51 +--
>  lib/Kconfig.debug |   13 +
>  2 files changed, 54 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 6a76ab9..64f4a1b 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -208,6 +208,7 @@ struct log {
>   u8 facility;/* syslog facility */
>   u8 flags:5; /* internal record flags */
>   u8 level:3; /* syslog level */
> + u8 cpuid;   /* cpu invoking the log */
>  };

That would be sufficient for only 256 CPUs. Is that what you intend?

There are systems with far more CPUs than that limit.

>  /*
> @@ -305,7 +306,8 @@ static u32 log_next(u32 idx)
>  static void log_store(int facility, int level,
> enum log_flags flags, u64 ts_nsec,
> const char *dict, u16 dict_len,
> -   const char *text, u16 text_len)
> +   const char *text, u16 text_len,
> +   const u8 cpuid)
>  {
>   struct log *msg;
>   u32 size, pad_len;
> @@ -356,6 +358,7 @@ static void log_store(int facility, int level,
>   msg->ts_nsec = local_clock();
>   memset(log_dict(msg) + dict_len, 0, pad_len);
>   msg->len = sizeof(struct log) + text_len + dict_len + pad_len;
> + msg->cpuid = cpuid;
> 
>   /* insert message */
>   log_next_idx += msg->len;
> @@ -855,6 +858,25 @@ static size_t print_time(u64 ts, char *buf)
>  (unsigned long)ts, rem_nsec / 1000);
>  }
> 
> +#if defined(CONFIG_PRINTK_CPUID)
> +static bool printk_cpuid = 1;
> +#else
> +static bool printk_cpuid;
> +#endif
> +module_param_named(cpuid, printk_cpuid, bool, S_IRUGO | S_IWUSR);
> +
> +static size_t print_cpuid(u8 cpuid, char *buf)
> +{
> +
> + if (!printk_cpuid)
> + return 0;
> +
> + if (!buf)
> + return 4;
> +
Firstly, why this magic number?
Secondly, if buf is NULL, why should you increment?

> + return sprintf(buf, "[%1d] ", cpuid);
> +}
> +
>  static size_t print_prefix(const struct log *msg, bool syslog, char *buf)
>  {
>   size_t len = 0;

Regards
Nikunj



Re: [PATCH v2 6/7] kvm,x86: RCU based table free

2012-08-01 Thread Nikunj A Dadhania
Hi Stefano,

On Wed, 1 Aug 2012 12:23:37 +0100, Stefano Stabellini wrote:
> On Tue, 5 Jun 2012, Stefano Stabellini wrote:
> > On Tue, 5 Jun 2012, Peter Zijlstra wrote:
> > > On Tue, 2012-06-05 at 18:34 +0530, Nikunj A Dadhania wrote:
> > > > PeterZ, is 7/7 alright to be picked?
> > > 
> > > Yeah, I guess it is.. I haven't had time to rework my tlb series yet
> > > though. But these two patches together should make it work for x86.
> > > 
> > 
> > Good. Do you think they are OK for 3.5-rc2? Or is it better to wait for
> > 3.6?
> > 
> 
> Hello Nikunj,
> what happened to this patch series?
> In particular I am interested in the following two patches:
> 
> kvm,x86: RCU based table free
> Flush page-table pages before freeing them
> 
> do you still intend to carry on with the development? Is there anything
> missing that is preventing them from going upstream?
>
I have posted a v3 on the kvm-list:
http://www.spinics.net/lists/kvm/msg76955.html

I am carrying the above two patches (with one fix) in my series as well,
for completeness.

I have picked up the patches from PeterZ's "Unify TLB gather
implementations -v3"
http://article.gmane.org/gmane.linux.kernel.mm/81278

Regards
Nikunj



Re: [PATCH 02/20] mm: Add optional TLB flush to generic RCU page-table freeing

2012-07-23 Thread Nikunj A Dadhania
On Thu, 28 Jun 2012 01:01:46 +0200, Peter Zijlstra wrote:
  
> +#ifdef CONFIG_STRICT_TLB_FILL
> +/*
> + * Some archictures (sparc64, ppc) cannot refill TLBs after the they've 
> removed
> + * the PTE entries from their hash-table. Their hardware never looks at the
> + * linux page-table structures, so they don't need a hardware TLB invalidate
> + * when tearing down the page-table structure itself.
> + */
> +static inline void tlb_table_flush_mmu(struct mmu_gather *tlb) { }
> +#else
> +static inline void tlb_table_flush_mmu(struct mmu_gather *tlb)
> +{
> + tlb_flush_mmu(tlb);
> +}
> +#endif
> +
>  void tlb_table_flush(struct mmu_gather *tlb)
>  {
>   struct mmu_table_batch **batch = &tlb->batch;
>  
>   if (*batch) {
> + tlb_table_flush_mmu(tlb);
>   call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
>   *batch = NULL;
>   }

Hi Peter,

When running the munmap test (https://lkml.org/lkml/2012/5/17/59) with KVM
and the pvflush patches, I got a crash. I have verified that the crash also
happens on the base (non-virt) kernel when CONFIG_HAVE_RCU_TABLE_FREE is
defined. Here are the crash details, with my analysis below:

---

BUG: unable to handle kernel NULL pointer dereference at 0008
IP: [] __call_rcu+0x29/0x1c0
PGD 0 
Oops: 0002 [#1] SMP 
CPU 24 
Modules linked in: kvm_intel kvm [last unloaded: scsi_wait_scan]


Pid: 32643, comm: munmap Not tainted 3.5.0-rc7+ #46 IBM System x3850 X5 
-[7042CR6]-[root@mx3850x5 ~/Node 1, Processor Card]# 
RIP: 0010:[]  [] __call_rcu+0x29/0x1c0
RSP: 0018:88203164fc28  EFLAGS: 00010246
RAX: 88203164fba8 RBX:  RCX: 
RDX: 81e34280 RSI: 81130330 RDI: 
RBP: 88203164fc58 R08: ea00d2680340 R09: 
R10: 883c7fbd4ef8 R11: 0078 R12: 81130330
R13: 7f09ee803000 R14: 883c2fa5bab0 R15: 88203164fe08
FS:  7f09ee7ee700() GS:883c7fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0008 CR3: 01e0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process munmap (pid: 32643, threadinfo 88203164e000, task 882030458a70)
Stack:
 883c2fa5bab0 88203164fe08 88203164fc68 88203164fe08
 88203164fe08 7f09ee803000 88203164fc68 810d33c7
 88203164fc88 81130e0d 88203164fc88 ea00d28e54f8
Call Trace:
 [] call_rcu_sched+0x17/0x20
 [] tlb_table_flush+0x2d/0x40
 [] tlb_remove_table+0x60/0xc0
 [] ___pte_free_tlb+0x63/0x70
 [] free_pgd_range+0x298/0x4b0
 [] free_pgtables+0xce/0x120
 [] exit_mmap+0xa7/0x160
 [] mmput+0x6f/0xf0
 [] exit_mm+0x105/0x130
 [] ? taskstats_exit+0x17d/0x240
 [] do_exit+0x176/0x480
 [] do_group_exit+0x55/0xd0
 [] sys_exit_group+0x17/0x20
 [] system_call_fastpath+0x16/0x1b
Code: ff ff 55 48 89 e5 48 83 ec 30 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 
66 66 90 40 f6 c7 03 48 89 fb 49 89 f4 0f 85 19 01 00 00 <4c> 89 63 08 48 c7 03 
00 00 00 00 0f ae f0 9c 58 66 66 90 66 90 
RIP  [] __call_rcu+0x29/0x1c0
 RSP 
CR2: 0008
---[ end trace 3ed30a91ea7cb375 ]---



I think this is what is happening:

___pte_free_tlb
  tlb_remove_table
    tlb_table_flush
      tlb_table_flush_mmu
        tlb_flush_mmu
          Sets need_flush = 0
          tlb_table_flush (if CONFIG_HAVE_RCU_TABLE_FREE)
          [Gets called twice with the same *tlb!]
            tlb_table_flush_mmu
              tlb_flush_mmu (nop, as need_flush is 0)
            call_rcu_sched(&(*batch)->rcu, ...);
            *batch = NULL;
      call_rcu_sched(&(*batch)->rcu, ...);  <------ *batch would be NULL

I verified this by applying the following fix, and I no longer see the
crash:

diff --git a/mm/memory.c b/mm/memory.c
index 1797bc1..329fcb9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -367,7 +367,8 @@ void tlb_table_flush(struct mmu_gather *tlb)
 
if (*batch) {
tlb_table_flush_mmu(tlb);
-   call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
+   if(*batch)
+   call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
*batch = NULL;
}
 }
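
To make that re-entrancy easier to see outside the kernel, here is a small
user-space model of the flow above. All structure and function names are
illustrative stand-ins, not the kernel's own definitions, and the real code
queues an RCU callback instead of calling free().

#include <stdio.h>
#include <stdlib.h>

struct mmu_table_batch { int dummy; };
struct mmu_gather { struct mmu_table_batch *batch; int need_flush; };

static int guarded = 1;		/* 1 = with the "if (*batch)" fix above */

static void toy_table_flush(struct mmu_gather *tlb);

/* stands in for tlb_flush_mmu(): with RCU table free it re-enters the flush */
static void toy_flush_mmu(struct mmu_gather *tlb)
{
	if (!tlb->need_flush)
		return;
	tlb->need_flush = 0;
	toy_table_flush(tlb);		/* second call with the same *tlb */
}

static void toy_table_flush(struct mmu_gather *tlb)
{
	struct mmu_table_batch **batch = &tlb->batch;

	if (*batch) {
		toy_flush_mmu(tlb);	/* inner call frees and clears *batch */
		if (!guarded || *batch) {
			/*
			 * The kernel does call_rcu_sched(&(*batch)->rcu, ...)
			 * here; with *batch == NULL that dereferences a small
			 * offset off NULL, matching the oops address above.
			 */
			printf("queueing batch %p\n", (void *)*batch);
			free(*batch);
		}
		*batch = NULL;
	}
}

int main(void)
{
	struct mmu_gather tlb = { malloc(sizeof(struct mmu_table_batch)), 1 };

	toy_table_flush(&tlb);		/* set guarded = 0 to see the second, NULL call */
	return 0;
}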

Thanks
Nikunj



Re: [PATCH RFC 1/2] kvm vcpu: Note down pause loop exit

2012-07-12 Thread Nikunj A Dadhania
On Wed, 11 Jul 2012 16:22:29 +0530, Raghavendra K T wrote:
> On 07/11/2012 02:23 PM, Avi Kivity wrote:
> >
> > This adds some tiny overhead to vcpu entry.  You could remove it by
> > using the vcpu->requests mechanism to clear the flag, since
> > vcpu->requests is already checked on every entry.
> 
> So IIUC,  let's have request bit for indicating PLE,
> 
> pause_interception() /handle_pause()
> {
>   make_request(PLE_REQUEST)
>   vcpu_on_spin()
> 
> }
> 
> check_eligibility()
>   {
>   !test_request(PLE_REQUEST) || ( test_request(PLE_REQUEST)  && 
> dy_eligible())
> .
> .
> }
> 
> vcpu_run()
> {
> 
> check_request(PLE_REQUEST)
>
I know check_request will clear PLE_REQUEST, but you just need a
clear_request here, right?
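
For readers less familiar with the vcpu->requests idiom being discussed: a
minimal sketch of the request-bit pattern, with illustrative helper names
rather than the exact KVM API, showing how a "check" that also clears the
bit differs from a plain clear.

#include <stdbool.h>
#include <stdio.h>

#define PLE_REQUEST 0		/* illustrative request bit number */

struct toy_vcpu { unsigned long requests; };

static void make_request(int bit, struct toy_vcpu *vcpu)
{
	vcpu->requests |= 1UL << bit;
}

/* "check": reports whether the bit was set and clears it in one step */
static bool check_request(int bit, struct toy_vcpu *vcpu)
{
	if (vcpu->requests & (1UL << bit)) {
		vcpu->requests &= ~(1UL << bit);
		return true;
	}
	return false;
}

/* plain clear: drops the bit without reporting whether it was set */
static void clear_request(int bit, struct toy_vcpu *vcpu)
{
	vcpu->requests &= ~(1UL << bit);
}

int main(void)
{
	struct toy_vcpu vcpu = { 0 };

	make_request(PLE_REQUEST, &vcpu);	/* e.g. from the pause handler */

	/* the eligibility test only needs to look at the bit ... */
	bool in_ple = vcpu.requests & (1UL << PLE_REQUEST);

	/*
	 * ... while the entry path only needs the flag gone.  A plain
	 * clear_request() is enough there; check_request() also clears the
	 * bit, but its return value would go unused.
	 */
	clear_request(PLE_REQUEST, &vcpu);

	printf("was in PLE: %d, still pending: %d\n",
	       in_ple, check_request(PLE_REQUEST, &vcpu));
	return 0;
}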




Re: [PATCH RFC 0/2] kvm: Improving directed yield in PLE handler

2012-07-12 Thread Nikunj A Dadhania
On Wed, 11 Jul 2012 14:04:03 +0300, Avi Kivity  wrote:
> 
> > So this would probably improve guests that uses cpu_relax, for example
> > stop_machine_run. I have no measurements, though.
> 
> smp_call_function() too (though that can be converted to directed yield
> too).  It seems worthwhile.
> 
With https://lkml.org/lkml/2012/6/26/266 in tip:x86/mm,
native_flush_tlb_others() now uses smp_call_function_many(), so this will
help that case too.

Nikunj
