On Wed, 18 Jul 2007, Joe Jin wrote:
> Sorry for I use a unclean kernel tree, the panic have gone ;)
> also, I will testing it again with other server, if I hit it I'll let you
> know.
That's a relief! Thanks,
Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the
Sorry for I use a unclean kernel tree, the panic have gone ;)
also, I will testing it again with other server, if I hit it I'll let you know.
Thanks,
Joe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at
> >
> > That is... surprising to me.
>
> To me also. I'd like to double-check the code which Joe actually
> tested, please - have a sneaking suspicion that the 2.6.22 version
> of that function was tested ;)
Maybe I make a fault? I'll testing it again then give you a report.
Thank,
Joe
-
To
On 2007-07-18 05:49, Hugh Dickins wrote:
> On Wed, 18 Jul 2007, Joe Jin wrote:
> >
> > With your patch, I have reproduced the panic:
>
> That is... surprising to me. (I hadn't been able to reproduce it with
> or without the patches: maybe I just need to try harder.) Please post
> your gcc
On 2007-07-18 05:49, Hugh Dickins wrote:
On Wed, 18 Jul 2007, Joe Jin wrote:
With your patch, I have reproduced the panic:
That is... surprising to me. (I hadn't been able to reproduce it with
or without the patches: maybe I just need to try harder.) Please post
your gcc --version,
That is... surprising to me.
To me also. I'd like to double-check the code which Joe actually
tested, please - have a sneaking suspicion that the 2.6.22 version
of that function was tested ;)
Maybe I make a fault? I'll testing it again then give you a report.
Thank,
Joe
-
To
Sorry for I use a unclean kernel tree, the panic have gone ;)
also, I will testing it again with other server, if I hit it I'll let you know.
Thanks,
Joe
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at
On Wed, 18 Jul 2007, Joe Jin wrote:
Sorry for I use a unclean kernel tree, the panic have gone ;)
also, I will testing it again with other server, if I hit it I'll let you
know.
That's a relief! Thanks,
Hugh
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body
On Wed, 18 Jul 2007 05:49:50 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:
> > With your patch, I have reproduced the panic:
>
> That is... surprising to me.
To me also. I'd like to double-check the code which Joe actually
tested, please - have a sneaking suspicion that the 2.6.22
On Wed, 18 Jul 2007, Joe Jin wrote:
>
> With your patch, I have reproduced the panic:
That is... surprising to me. (I hadn't been able to reproduce it with
or without the patches: maybe I just need to try harder.) Please post
your gcc --version, and the disassembly (objdump -d) output for
With your patch, I have reproduced the panic:
Unable to handle kernel paging request at 186a RIP:
[] __alloc_pages+0x2f/0x2c3
PGD 72595067 PUD 72594067 PMD 0
Oops: [1] SMP
CPU 0
Modules linked in: xt_tcpudp iptable_filter ip_tables x_tables cpufreq_ondemand
dm_mirror
On Tue, 17 Jul 2007, Andrew Morton wrote:
>
> Given that we've now gone and added deliberate-but-we-hope-benign
> races into this code, an elaborate comment which explains and justifies
> it all is pretty much obligatory, IMO.
[PATCH] Remove nid_lock from alloc_fresh_huge_page
The fix to that
On Tue, 17 Jul 2007 20:49:25 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:
> --- 2.6.22-git9/mm/hugetlb.c 2007-07-17 20:29:33.0 +0100
> +++ linux/mm/hugetlb.c2007-07-17 20:32:51.0 +0100
> @@ -107,15 +107,12 @@ static int alloc_fresh_huge_page(void)
> {
>
On Tue, 17 Jul 2007, Andrew Morton wrote:
> On Tue, 17 Jul 2007 18:26:14 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]>
> wrote:
> > > I think I like the lock ;)
> >
> > I hate to waste your time, but I'm still puzzled. Wasn't the race fixed
> > by your changeover from use of "static int nid"
On Tue, 17 Jul 2007 18:26:14 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:
> On Tue, 17 Jul 2007, Andrew Morton wrote:
> > On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]>
> > wrote:
> > > On Thu, 12 Jul 2007, Andrew Morton wrote:
> > > >
> > > > It'd be much
On Tue, 17 Jul 2007, Andrew Morton wrote:
> On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]>
> wrote:
> > On Thu, 12 Jul 2007, Andrew Morton wrote:
> > >
> > > It'd be much better to fix the race within alloc_fresh_huge_page(). That
> > > function is pretty pathetic.
>
On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:
> On Thu, 12 Jul 2007, Andrew Morton wrote:
> >
> > It'd be much better to fix the race within alloc_fresh_huge_page(). That
> > function is pretty pathetic.
> >
> > Something like this?
> >
> > ---
On Thu, 12 Jul 2007, Andrew Morton wrote:
>
> It'd be much better to fix the race within alloc_fresh_huge_page(). That
> function is pretty pathetic.
>
> Something like this?
>
> --- a/mm/hugetlb.c~a
> +++ a/mm/hugetlb.c
> @@ -105,13 +105,20 @@ static void free_huge_page(struct page *
>
>
On Thu, 12 Jul 2007, Andrew Morton wrote:
It'd be much better to fix the race within alloc_fresh_huge_page(). That
function is pretty pathetic.
Something like this?
--- a/mm/hugetlb.c~a
+++ a/mm/hugetlb.c
@@ -105,13 +105,20 @@ static void free_huge_page(struct page *
static int
On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins [EMAIL PROTECTED] wrote:
On Thu, 12 Jul 2007, Andrew Morton wrote:
It'd be much better to fix the race within alloc_fresh_huge_page(). That
function is pretty pathetic.
Something like this?
--- a/mm/hugetlb.c~a
+++
On Tue, 17 Jul 2007, Andrew Morton wrote:
On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins [EMAIL PROTECTED]
wrote:
On Thu, 12 Jul 2007, Andrew Morton wrote:
It'd be much better to fix the race within alloc_fresh_huge_page(). That
function is pretty pathetic.
Something
On Tue, 17 Jul 2007 18:26:14 +0100 (BST) Hugh Dickins [EMAIL PROTECTED] wrote:
On Tue, 17 Jul 2007, Andrew Morton wrote:
On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins [EMAIL PROTECTED]
wrote:
On Thu, 12 Jul 2007, Andrew Morton wrote:
It'd be much better to fix the race
On Tue, 17 Jul 2007, Andrew Morton wrote:
On Tue, 17 Jul 2007 18:26:14 +0100 (BST) Hugh Dickins [EMAIL PROTECTED]
wrote:
I think I like the lock ;)
I hate to waste your time, but I'm still puzzled. Wasn't the race fixed
by your changeover from use of static int nid throughout, to
On Tue, 17 Jul 2007 20:49:25 +0100 (BST) Hugh Dickins [EMAIL PROTECTED] wrote:
--- 2.6.22-git9/mm/hugetlb.c 2007-07-17 20:29:33.0 +0100
+++ linux/mm/hugetlb.c2007-07-17 20:32:51.0 +0100
@@ -107,15 +107,12 @@ static int alloc_fresh_huge_page(void)
{
static int
On Tue, 17 Jul 2007, Andrew Morton wrote:
Given that we've now gone and added deliberate-but-we-hope-benign
races into this code, an elaborate comment which explains and justifies
it all is pretty much obligatory, IMO.
[PATCH] Remove nid_lock from alloc_fresh_huge_page
The fix to that race
With your patch, I have reproduced the panic:
Unable to handle kernel paging request at 186a RIP:
[8105d4e7] __alloc_pages+0x2f/0x2c3
PGD 72595067 PUD 72594067 PMD 0
Oops: [1] SMP
CPU 0
Modules linked in: xt_tcpudp iptable_filter ip_tables x_tables cpufreq_ondemand
On Wed, 18 Jul 2007, Joe Jin wrote:
With your patch, I have reproduced the panic:
That is... surprising to me. (I hadn't been able to reproduce it with
or without the patches: maybe I just need to try harder.) Please post
your gcc --version, and the disassembly (objdump -d) output for
On Wed, 18 Jul 2007 05:49:50 +0100 (BST) Hugh Dickins [EMAIL PROTECTED] wrote:
With your patch, I have reproduced the panic:
That is... surprising to me.
To me also. I'd like to double-check the code which Joe actually
tested, please - have a sneaking suspicion that the 2.6.22 version
of
On 7/14/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
On Sat, 14 Jul 2007 10:40:25 -0700 "Nish Aravamudan" <[EMAIL PROTECTED]> wrote:
> On 7/13/07, Joe Jin <[EMAIL PROTECTED]> wrote:
> > >
> > > Patch gone too ;) I deleted it. I was hoping that you'd send me the final
> > > finished product
On Sat, 14 Jul 2007 10:40:25 -0700 "Nish Aravamudan" <[EMAIL PROTECTED]> wrote:
> On 7/13/07, Joe Jin <[EMAIL PROTECTED]> wrote:
> > >
> > > Patch gone too ;) I deleted it. I was hoping that you'd send me the final
> > > finished product (please).
> > >
> >
> > Ha.., the patch against 2.6.22, at
On 7/13/07, Joe Jin <[EMAIL PROTECTED]> wrote:
>
> Patch gone too ;) I deleted it. I was hoping that you'd send me the final
> finished product (please).
>
Ha.., the patch against 2.6.22, at your patch have use htlb_alloc_mask, but I
cannot found it at 2.6.22 kernel tree, I think you must use
On 7/13/07, Joe Jin [EMAIL PROTECTED] wrote:
Patch gone too ;) I deleted it. I was hoping that you'd send me the final
finished product (please).
Ha.., the patch against 2.6.22, at your patch have use htlb_alloc_mask, but I
cannot found it at 2.6.22 kernel tree, I think you must use
On Sat, 14 Jul 2007 10:40:25 -0700 Nish Aravamudan [EMAIL PROTECTED] wrote:
On 7/13/07, Joe Jin [EMAIL PROTECTED] wrote:
Patch gone too ;) I deleted it. I was hoping that you'd send me the final
finished product (please).
Ha.., the patch against 2.6.22, at your patch have use
On 7/14/07, Andrew Morton [EMAIL PROTECTED] wrote:
On Sat, 14 Jul 2007 10:40:25 -0700 Nish Aravamudan [EMAIL PROTECTED] wrote:
On 7/13/07, Joe Jin [EMAIL PROTECTED] wrote:
Patch gone too ;) I deleted it. I was hoping that you'd send me the final
finished product (please).
On Fri, 2007-07-13 at 01:54 -0700, Paul Jackson wrote:
> > Something for the young ...
>
> ... ancient self smiles with relief, glad that Andrew is not looking
> this way ;).
>
> > We could do this: add a little bitops torture-test and run that
> > unconditionally, nice and early.
>
> Good
Joe wrote:
> --- linux-2.6.22/mm/hugetlb.c.orig2007-07-12 15:02:19.0 +0800
> +++ linux-2.6.22/mm/hugetlb.c 2007-07-13 17:33:45.0 +0800
> @@ -101,13 +101,20 @@ static void free_huge_page(struct page *
>
> static int alloc_fresh_huge_page(void)
> {
> ...
Patch looks ok
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
Something like this?
--- a/mm/hugetlb.c~a
+++ a/mm/hugetlb.c
@@ -105,13 +105,20 @@ static void free_huge_page(struct page *
static int alloc_fresh_huge_page(void)
{
- static int nid = 0;
+ static int
>
> Patch gone too ;) I deleted it. I was hoping that you'd send me the final
> finished product (please).
>
Ha.., the patch against 2.6.22, at your patch have use htlb_alloc_mask, but I
cannot found it at 2.6.22 kernel tree, I think you must use difference kernel
tree :)
Thanks,
Joe
---
> Something for the young ...
... ancient self smiles with relief, glad that Andrew is not looking
this way ;).
> We could do this: add a little bitops torture-test and run that
> unconditionally, nice and early.
Good idea. That would allow for wide spread coverage across all arch's
that they
On Fri, 13 Jul 2007 01:43:49 -0700 Paul Jackson <[EMAIL PROTECTED]> wrote:
> > Sigh. What crap. I guess we leave it as-is.
>
> That matches my conclusion.
>
> The thought of exposing a bug in some user of this find bit code by
> tightening up the API to return == N, not >= N, scares poor
On Fri, 13 Jul 2007 16:37:32 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> >
> > if (nid > MAX_NUMNODES) then that is a bug and we should report it (doing
> > this via a BUG() is OK) rather than quietly covering it up.
>
> I have create a patch to check if nid > MAX_NUMNODES, please apply it
>
> Sigh. What crap. I guess we leave it as-is.
That matches my conclusion.
The thought of exposing a bug in some user of this find bit code by
tightening up the API to return == N, not >= N, scares poor little old
me. I can imagine a nasty bug coming from this
If someone bold enough and
>
> if (nid > MAX_NUMNODES) then that is a bug and we should report it (doing
> this via a BUG() is OK) rather than quietly covering it up.
I have create a patch to check if nid > MAX_NUMNODES, please apply it
thanks
Signed-off-by: Joe Jin <[EMAIL PROTECTED]>
---
---
On Fri, 13 Jul 2007 01:29:06 -0700 Paul Jackson <[EMAIL PROTECTED]> wrote:
> > I'm scratching my head over that min_t in __first_node(), too. I don't
> > think
> > it's possible for find_first_bit(..., N) to return anything >N _anyway_.
> > And if
> > it does, we want to know about it.
> >
> I'm scratching my head over that min_t in __first_node(), too. I don't think
> it's possible for find_first_bit(..., N) to return anything >N _anyway_. And
> if
> it does, we want to know about it.
>
>
I'm not sure I've got this right, but looks like that min_t went in after
Zwane
On Fri, 13 Jul 2007 13:34:29 +0530 gurudas pai <[EMAIL PROTECTED]> wrote:
>
>
> Andrew Morton wrote:
> > On Fri, 13 Jul 2007 14:40:04 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> >
> >> On 2007-07-12 22:18, Andrew Morton wrote:
> >>> On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin <[EMAIL PROTECTED]>
On Fri, 13 Jul 2007 16:03:36 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> > >
> > > The patch looks good for this bug, thanks :)
> >
> > If you have time could you test it and sent it back at me please?
> >
>
> Test passed and panic gone.
Patch gone too ;) I deleted it. I was hoping that
> >
> > The patch looks good for this bug, thanks :)
>
> If you have time could you test it and sent it back at me please?
>
Test passed and panic gone.
thanks.
> > if other caller give a invalid nid to alloc_pages_node(), __alloc_pages
> > will crash again.
>
> That would be a buggy caller,
Andrew Morton wrote:
On Fri, 13 Jul 2007 14:40:04 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
On 2007-07-12 22:18, Andrew Morton wrote:
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
Something like this?
--- a/mm/hugetlb.c~a
+++ a/mm/hugetlb.c
@@ -105,13 +105,20 @@
On Thu, 12 Jul 2007 23:49:38 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote:
> first_node(node_online_map);
Incidentally, we have a helper for this operation:
#if MAX_NUMNODES > 1
...
#define first_online_node first_node(node_online_map)
...
#else
...
#define first_online_node 0
...
On Fri, 13 Jul 2007 14:40:04 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> On 2007-07-12 22:18, Andrew Morton wrote:
> > On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> >
> > Something like this?
> >
> > --- a/mm/hugetlb.c~a
> > +++ a/mm/hugetlb.c
> > @@ -105,13 +105,20 @@
On 2007-07-12 22:18, Andrew Morton wrote:
> On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
>
> > This patch add nid sanity check on alloc_pages_node().
> > While two process change nr_hugepages at a system, alloc_fresh_huge_page()
> > been
On 2007-07-12 22:18, Andrew Morton wrote:
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin [EMAIL PROTECTED] wrote:
This patch add nid sanity check on alloc_pages_node().
While two process change nr_hugepages at a system, alloc_fresh_huge_page()
been called, at this function, nid defined
On Fri, 13 Jul 2007 14:40:04 +0800 Joe Jin [EMAIL PROTECTED] wrote:
On 2007-07-12 22:18, Andrew Morton wrote:
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin [EMAIL PROTECTED] wrote:
Something like this?
--- a/mm/hugetlb.c~a
+++ a/mm/hugetlb.c
@@ -105,13 +105,20 @@ static void
On Thu, 12 Jul 2007 23:49:38 -0700 Andrew Morton [EMAIL PROTECTED] wrote:
first_node(node_online_map);
Incidentally, we have a helper for this operation:
#if MAX_NUMNODES 1
...
#define first_online_node first_node(node_online_map)
...
#else
...
#define first_online_node 0
...
On Fri, 13 Jul 2007 16:03:36 +0800 Joe Jin [EMAIL PROTECTED] wrote:
The patch looks good for this bug, thanks :)
If you have time could you test it and sent it back at me please?
Test passed and panic gone.
Patch gone too ;) I deleted it. I was hoping that you'd send me the
On Fri, 13 Jul 2007 13:34:29 +0530 gurudas pai [EMAIL PROTECTED] wrote:
Andrew Morton wrote:
On Fri, 13 Jul 2007 14:40:04 +0800 Joe Jin [EMAIL PROTECTED] wrote:
On 2007-07-12 22:18, Andrew Morton wrote:
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin [EMAIL PROTECTED] wrote:
Something
The patch looks good for this bug, thanks :)
If you have time could you test it and sent it back at me please?
Test passed and panic gone.
thanks.
if other caller give a invalid nid to alloc_pages_node(), __alloc_pages
will crash again.
That would be a buggy caller, so we should
Andrew Morton wrote:
On Fri, 13 Jul 2007 14:40:04 +0800 Joe Jin [EMAIL PROTECTED] wrote:
On 2007-07-12 22:18, Andrew Morton wrote:
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin [EMAIL PROTECTED] wrote:
Something like this?
--- a/mm/hugetlb.c~a
+++ a/mm/hugetlb.c
@@ -105,13 +105,20 @@ static
Patch gone too ;) I deleted it. I was hoping that you'd send me the final
finished product (please).
Ha.., the patch against 2.6.22, at your patch have use htlb_alloc_mask, but I
cannot found it at 2.6.22 kernel tree, I think you must use difference kernel
tree :)
Thanks,
Joe
---
On Fri, 2007-07-13 at 01:54 -0700, Paul Jackson wrote:
Something for the young ...
... ancient self smiles with relief, glad that Andrew is not looking
this way ;).
We could do this: add a little bitops torture-test and run that
unconditionally, nice and early.
Good idea. That
Joe wrote:
--- linux-2.6.22/mm/hugetlb.c.orig2007-07-12 15:02:19.0 +0800
+++ linux-2.6.22/mm/hugetlb.c 2007-07-13 17:33:45.0 +0800
@@ -101,13 +101,20 @@ static void free_huge_page(struct page *
static int alloc_fresh_huge_page(void)
{
...
Patch looks ok to me,
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin [EMAIL PROTECTED] wrote:
Something like this?
--- a/mm/hugetlb.c~a
+++ a/mm/hugetlb.c
@@ -105,13 +105,20 @@ static void free_huge_page(struct page *
static int alloc_fresh_huge_page(void)
{
- static int nid = 0;
+ static int
Something for the young ...
... ancient self smiles with relief, glad that Andrew is not looking
this way ;).
We could do this: add a little bitops torture-test and run that
unconditionally, nice and early.
Good idea. That would allow for wide spread coverage across all arch's
that they
if (nid MAX_NUMNODES) then that is a bug and we should report it (doing
this via a BUG() is OK) rather than quietly covering it up.
I have create a patch to check if nid MAX_NUMNODES, please apply it
thanks
Signed-off-by: Joe Jin [EMAIL PROTECTED]
---
---
I'm scratching my head over that min_t in __first_node(), too. I don't think
it's possible for find_first_bit(..., N) to return anything N _anyway_. And
if
it does, we want to know about it.
looks at Paul
I'm not sure I've got this right, but looks like that min_t went in after
Zwane
On Fri, 13 Jul 2007 01:29:06 -0700 Paul Jackson [EMAIL PROTECTED] wrote:
I'm scratching my head over that min_t in __first_node(), too. I don't
think
it's possible for find_first_bit(..., N) to return anything N _anyway_.
And if
it does, we want to know about it.
looks at Paul
On Fri, 13 Jul 2007 16:37:32 +0800 Joe Jin [EMAIL PROTECTED] wrote:
if (nid MAX_NUMNODES) then that is a bug and we should report it (doing
this via a BUG() is OK) rather than quietly covering it up.
I have create a patch to check if nid MAX_NUMNODES, please apply it
thanks
On Fri, 13 Jul 2007 01:43:49 -0700 Paul Jackson [EMAIL PROTECTED] wrote:
Sigh. What crap. I guess we leave it as-is.
That matches my conclusion.
The thought of exposing a bug in some user of this find bit code by
tightening up the API to return == N, not = N, scares poor little old
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> This patch add nid sanity check on alloc_pages_node().
> While two process change nr_hugepages at a system, alloc_fresh_huge_page()
> been called, at this function, nid defined as a static variable, but, the
This patch add nid sanity check on alloc_pages_node().
While two process change nr_hugepages at a system, alloc_fresh_huge_page()
been called, at this function, nid defined as a static variable, but, there
is not any protection of, if 2 process called at the same time, maybe pass a
invalid nid
This patch add nid sanity check on alloc_pages_node().
While two process change nr_hugepages at a system, alloc_fresh_huge_page()
been called, at this function, nid defined as a static variable, but, there
is not any protection of, if 2 process called at the same time, maybe pass a
invalid nid
On Fri, 13 Jul 2007 10:45:07 +0800 Joe Jin [EMAIL PROTECTED] wrote:
This patch add nid sanity check on alloc_pages_node().
While two process change nr_hugepages at a system, alloc_fresh_huge_page()
been called, at this function, nid defined as a static variable, but, there
is not any
73 matches
Mail list logo