[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-30 Thread mabshoff


Robert Bradshaw wrote:

Hi Robert,

> I had an error like this and it turned out to be caused by a GC
> error. Specifically, an object A was allocated, then erroneously
> deallocated (but there was still stuff pointing to it). Object B was
> then allocated, with memory overlapping that of object A. Object A
> was then modified, and that modified the refcount field of object B.
> That might be a variant of what's going on here.
>
> I tracked it down by using gdb's watch command on the object with the
> weird ref count...
>
> - Robert

As it turned out it was a bug in 2.8.9's Cython [and probably at least
some of the earlier versions] which Carl tracked down after I isolated
the cause:

[02:30]  Ok, every time c=0 the refcount for int(0) goes up
by one.
[02:42]  It's a Cython bug.
[02:42]  Yep, for c in cols:
[02:42]  increments the refcount on int(?)
[02:42]  I've been having a hard time debugging it because the
bug doesn't occur any more in the current Cython (so it's gone in
2.8.10).
[02:43]  Ok, let me upgrade and see if the problem goes
away.
[02:44]  Got any more technical details?
[02:45]  In 2.8.9's version of Cython, converting a Python
object (like the int objects we're dealing with) to a Py_Ssize_t goes
via this line of code:
[02:45]  #define __pyx_PyIndex_AsSsize_t(b)
PyInt_AsSsize_t(PyNumber_Index(b))
[02:46]  PyNumber_Index(b) returns a new object (although in
this case it's actually the exact same int object), with its refcount
incremented.
[02:46]  Ok, that makes sense.
[02:46]  So the caller of PyNumber_Index() is supposed to
decref the return value at some point.

With 2.8.10 the refcount of int(0) doesn't increase by invoking
dance(m) any more:

sage: a=int(0)
sage: sys.getrefcount(a)
5304
sage: dance(5)
h^5 - 5*h^4 + 25*h^3 - 55*h^2 + 80*h - 46
sage: sys.getrefcount(a)
5304
sage: dance(6)
h^6 - 9*h^5 + 60*h^4 - 225*h^3 + 555*h^2 - 774*h + 484
sage: sys.getrefcount(a)
5304

It used to be that sys.getrefcount(a) would end up around 48,000 after
dance(5), so now I consider this issues closed.

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-29 Thread mabshoff



On Oct 29, 10:03 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> Hi Machael,
>

Hi Jaap,

>
>
> You wrote:
>
> > On Oct 29, 9:18 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> >> Michael,
>
> > Hello Jaap,
>
> >> Remembering that dance(10) has a segmentation fault long before
> >> there was a function _choose, is it possible that some other
> >> part of the code is going wild?
>
> > Sure, but so far no lead. Which specific version did exhibit that
> > behavior? I can build older releases locally and see if I can come up
> > with anything.
>
> On 23 Oct I wrote:
>
> To day I got with sage-2.8.8.1:
>

Well, I just looked at the patch where you introduced _choose and the
lst you create is indentical.

>  > > sage: time dance(10)
>  > >
>  > >
>  > > 
>  > > Unhandled SIGSEGV: A segmentation fault occured in SAGE.
>  > > This probably occured because a *compiled* component
>  > > of SAGE has a bug in it (typically accessing invalid memory)
>  > > or is not properly wrapped with _sig_on, _sig_off.
>  > > You might want to run SAGE under gdb with 'sage -gdb' to debug this.
>  > > SAGE will now terminate (sorry).
>  > > 
>
> > Now that 2.8.10 with all your speed improvements is out I will do some
> > more testing with that release on sage.math with a 32 bit binary to
> > see if I can flush anything out with that.
>
> Warning: see the reopened trac #217, I stupid changed some Py_ssize_t
> in ints, but knowing the nrows and ncols are small, there will be no
> problem, I hope!
>

Nope, I understand the argument to change it back, but the change
shouldn't case any trouble.

> > As I wrote above: The issues with the refcount going insane is a nice
> > result from the debugging and that will hopefully lead to some
> > improvements in our fight against memleaks.
>
> Thanks and good luck,

Carl had some interesting input:

[02:03]  I was looking at those refcounts some more.
[02:03]  ok
[02:03]  It turns out that Python caches and shares small int
objects:
[02:04]  sage: a = int(0)
[02:04]  sage: sys.getrefcount(a)
[02:04]  6288
[02:04]  sage: b = int(0)
[02:04]  sage: sys.getrefcount(a)
[02:04]  6289
[02:04]  sage: a is b
[02:04]  True
[02:04]  ok
[02:04]  So the reference-counting problem could be anywhere
that deals with small Python ints.
[02:05]  mmmh.
[02:05]  So it is not the list [0], but the smallint 0 that
has the reference counter overflow?
[02:06]  Let me go look at your printouts again.  I thought
so, but I didn't look closely.
[02:06]  Yeah, that's what it looks like to me.  (Just judging
from your next-to-latest e-mail.)
[02:07]  Because the numbers "jump", while I would expect
them to increase by 1 each time.
[02:11]  So it looks like between your first and second
printouts, the refcount for "0" jumped by 135, and the refcount for
"1" jumped by at least 125 (to make it wrap).
[02:11]  Yep, otherwise it would take *forever* to even wrap
on a 32 bit box.

So now comes the question: Why doesn't the refcount get decremented?
How do we fix this?

>
> Jaap

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-29 Thread Robert Bradshaw

I had an error like this and it turned out to be caused by a GC  
error. Specifically, an object A was allocated, then erroneously  
deallocated (but there was still stuff pointing to it). Object B was  
then allocated, with memory overlapping that of object A. Object A  
was then modified, and that modified the refcount field of object B.  
That might be a variant of what's going on here.

I tracked it down by using gdb's watch command on the object with the  
weird ref count...

- Robert


On Oct 30, 2007, at 12:19 AM, mabshoff wrote:

>>> As I wrote above: The issues with the refcount going insane is a  
>>> nice
>>> result from the debugging and that will hopefully lead to some
>>> improvements in our fight against memleaks.
>>
>> Thanks and good luck,
>
> Carl had some interesting input:
>
> [02:03]  I was looking at those refcounts some more.
> [02:03]  ok
> [02:03]  It turns out that Python caches and shares small int
> objects:
> [02:04]  sage: a = int(0)
> [02:04]  sage: sys.getrefcount(a)
> [02:04]  6288
> [02:04]  sage: b = int(0)
> [02:04]  sage: sys.getrefcount(a)
> [02:04]  6289
> [02:04]  sage: a is b
> [02:04]  True
> [02:04]  ok
> [02:04]  So the reference-counting problem could be anywhere
> that deals with small Python ints.
> [02:05]  mmmh.
> [02:05]  So it is not the list [0], but the smallint 0 that
> has the reference counter overflow?
> [02:06]  Let me go look at your printouts again.  I thought
> so, but I didn't look closely.
> [02:06]  Yeah, that's what it looks like to me.  (Just judging
> from your next-to-latest e-mail.)
> [02:07]  Because the numbers "jump", while I would expect
> them to increase by 1 each time.
> [02:11]  So it looks like between your first and second
> printouts, the refcount for "0" jumped by 135, and the refcount for
> "1" jumped by at least 125 (to make it wrap).
> [02:11]  Yep, otherwise it would take *forever* to even wrap
> on a 32 bit box.
>
> So now comes the question: Why doesn't the refcount get decremented?
> How do we fix this?
>
>>
>> Jaap
>
> Cheers,
>
> Michael
>
>
> 

--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-29 Thread Jaap Spies

Hi Machael,

You wrote:
> 
> 
> On Oct 29, 9:18 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
>> Michael,
>>
> 
> Hello Jaap,
> 
>> Remembering that dance(10) has a segmentation fault long before
>> there was a function _choose, is it possible that some other
>> part of the code is going wild?
>>
> 
> Sure, but so far no lead. Which specific version did exhibit that
> behavior? I can build older releases locally and see if I can come up
> with anything.
> 

On 23 Oct I wrote:

To day I got with sage-2.8.8.1:

 > > sage: time dance(10)
 > >
 > >
 > > 
 > > Unhandled SIGSEGV: A segmentation fault occured in SAGE.
 > > This probably occured because a *compiled* component
 > > of SAGE has a bug in it (typically accessing invalid memory)
 > > or is not properly wrapped with _sig_on, _sig_off.
 > > You might want to run SAGE under gdb with 'sage -gdb' to debug this.
 > > SAGE will now terminate (sorry).
 > > 


> Now that 2.8.10 with all your speed improvements is out I will do some
> more testing with that release on sage.math with a 32 bit binary to
> see if I can flush anything out with that.
> 

Warning: see the reopened trac #217, I stupid changed some Py_ssize_t
in ints, but knowing the nrows and ncols are small, there will be no
problem, I hope!

> As I wrote above: The issues with the refcount going insane is a nice
> result from the debugging and that will hopefully lead to some
> improvements in our fight against memleaks.
> 

Thanks and good luck,

Jaap


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-29 Thread mabshoff



On Oct 29, 9:18 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> Michael,
>

Hello Jaap,

> Remembering that dance(10) has a segmentation fault long before
> there was a function _choose, is it possible that some other
> part of the code is going wild?
>

Sure, but so far no lead. Which specific version did exhibit that
behavior? I can build older releases locally and see if I can come up
with anything.

Now that 2.8.10 with all your speed improvements is out I will do some
more testing with that release on sage.math with a 32 bit binary to
see if I can flush anything out with that.

As I wrote above: The issues with the refcount going insane is a nice
result from the debugging and that will hopefully lead to some
improvements in our fight against memleaks.

> Jaap

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-29 Thread Jaap Spies

Michael,

Remembering that dance(10) has a segmentation fault long before
there was a function _choose, is it possible that some other
part of the code is going wild?

Jaap


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-29 Thread mabshoff

I did look into this some more and I did verify that the refcounts
wraps:

lst [[0, 1, 2, 3, 4, 5, 6]]
lst [[0], [1], [2], [3], [4], [5], [6]]
lst [[0, 1], [0, 2], [1, 2], [0, 3], [1, 3], [2, 3], [0, 4], [1, 4],
[2, 4], [3, 4], [0, 5], [1, 5], [2, 5], [3, 5], [4, 5],
 [0, 6], [1, 6], [2, 6], [3, 6], [4, 6], [5, 6]]
lst [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3], [0, 1, 4], [0, 2, 4],
[1, 2, 4], [0, 3, 4], [1, 3, 4], [2, 3, 4], [0, 1, 5]
, [0, 2, 5], [1, 2, 5], [0, 3, 5], [1, 3, 5], [2, 3, 5], [0, 4, 5],
[1, 4, 5], [2, 4, 5], [3, 4, 5], [0, 1, 6], [0, 2, 6], [
1, 2, 6], [0, 3, 6], [1, 3, 6], [2, 3, 6], [0, 4, 6], [1, 4, 6], [2,
4, 6], [3, 4, 6], [0, 5, 6], [1, 5, 6], [2, 5, 6], [3,
5, 6], [4, 5, 6]]
lst [[, 1, 2, 3], [, 1, 2, 4], [, 1, 3, 4], [, 2, 3, 4], [1,
2, 3, 4], [, 1, 2, 5],
 [, 1, 3, 5], [, 2, 3, 5], [1, 2, 3, 5], [, 1, 4, 5], [, 2, 4,
5], [1, 2, 4, 5], [,
 3, 4, 5], [1, 3, 4, 5], [2, 3, 4, 5], [, 1, 2, 6], [, 1,
3, 6], [, 2, 3, 6], [1, 2, 3, 6],
[, 1, 4, 6], [, 2, 4, 6], [1, 2, 4, 6], [, 3, 4, 6], [1, 3, 4, 6], [2, 3, 4, 6], [
, 1, 5, 6], [, 2, 5, 6], [1, 2, 5, 6], [, 3, 5, 6], [1, 3, 5, 6], [2, 3, 5, 6], [, 4, 5, 6], [1, 4, 5, 6], [2, 4, 5,
 6], [3, 4, 5, 6]]
lst [[, , 2, 3, 4], [, , 2, 3, 5], [, , 2, 4, 5],
[, ,
3, 4, 5], [, 2, 3, 4
, 5], [, 2, 3, 4, 5], [, 
, 2, 3, 6], [, , 2, 4, 6], [, , 3, 4, 6], [, 2, 3, 4, 6], [, 2, 3, 4, 6], [, , 2, 5, 6], [, , 3, 5, 6], [, 2, 3, 5, 6], [, 2, 3, 5, 6], [, , 4, 5, 6], [, 2, 4, 5, 6], [, 2, 4, 5, 6], [, 3, 4, 5, 6], [, 3, 4, 5, 6], [2, 3, 4, 5, 
>6]]

The crash happens when the refcount for [0] goes up to zero. I now
tend to blame Cython for this, so can somebody take a closer look at
the code Cython generates for _choose.

I would also conjecture that these kinds of refcounting issues cause a
lot of the "still reachable memory" in a valgrind run.

Carl Witty suggested to turn the refcount macros into functions that
jump to a predefined function in case the refcount exceeds a certain
threshold like 1,000 or even 1,000,000. That way we can search for
those issues and hopefully eliminate them by setting a breakpoint on
that predefined function.

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-28 Thread mabshoff

Okay, I got some new results:

Changing permanent() slightly in matrix2.pyx:

print "entering permanent() - m: ",m," n: ",n
from sage.rings.arith import binomial
for r from 1 <= r < m+1:
lst = _choose(n, r)
print "lst", lst
tmp = []
for cols in lst:
tmp.append(self.prod_of_row_sums(cols))
s = sum(tmp)
# sn = (-1)^(m-r)
if (m - r) % 2 == 0:
sn = 1
else:
sn = -1
perm = perm + sn * binomial(n-r, m-r) * s
print "exiting permanent() - perm: ", perm

Where _choose is defined as:

def _choose(int n, int t):
"""
Returns all possible sublists of length t from range(n)

Based on algoritm L from Knuth's taocp part 4: 7.2.1.3 p.4

AUTHOR:
-- Jaap Spies (2007-10-22)
"""

x = []
c = range(t)
c.append(n)
c.append(0)
j = 0

while j < t:
x.append(c[:t])
j = 0
while c[j]+1 == c[j+1]:
   c[j] = j
   j = j+1
c[j] = c[j]+1

return x
[end of _choose code]

At some point the refcount for the list elements goes FUBAR (this is
not the first occurence of this, but those scrolled off my screen):

We should get  (r=1 in this case):
entering permanent() - m:  7  n:  7
lst [[1],[2],[3],[4],[5],[6]]

But we get:

entering permanent() - m:  7  n:  7
lst [[], [], [], [], [4], [5], [6]]

I assume on 64 bit boxen the refcount is 64 bit, so we never see the
issue on sage.math.

So does anybody have any idea what is wrong?

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-27 Thread Jaap Spies

mabshoff wrote:
> 
> 
> On Oct 27, 1:48 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:

>> Now you have changed #217 to milestone sage-2.8.10, I'll
>> better send a patch here, so this old ticket can be closed.
> 
> :), but feel free to change the title of the ticket and also the
> milestone it is tagged against.

I attached a patch bundle. If it gets accepted #217 can be closed
and, as you said, more specific tickets can be opened.


Jaap


Permanents are here forever!


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-27 Thread mabshoff



On Oct 27, 1:48 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> Hi Michael,
>
> You wrote:
> > On a side note: Could you comment on #217? It is rather vague and
> > there have been some improvements. If you have more improvements
> > relative to #931 you should open another/more tickets for it/them.
>
> Now you have changed #217 to milestone sage-2.8.10, I'll
> better send a patch here, so this old ticket can be closed.

:), but feel free to change the title of the ticket and also the
milestone it is tagged against. William said that it is one of those
very vague generic milestones from the old days and we have been
invalidating those and usually open more specific tickets in place of
them.

>
> To be honnest, I never saw this ticket before!

I came across it today by accident searching for another ticket, so
you are not alone.

dance(10) crashed for me inside rook_vector(), so I will be taking it
from here by adding debug output to nail the exact place where it
happens.

> Jaap

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-27 Thread Jaap Spies

Hi Michael,

You wrote:


> On a side note: Could you comment on #217? It is rather vague and
> there have been some improvements. If you have more improvements
> relative to #931 you should open another/more tickets for it/them.
> 

Now you have changed #217 to milestone sage-2.8.10, I'll
better send a patch here, so this old ticket can be closed.

To be honnest, I never saw this ticket before!

Jaap




--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-26 Thread mabshoff



On Oct 27, 1:57 am, Jaap Spies <[EMAIL PROTECTED]> wrote:
> Jaap Spies wrote:

Hello Jaap,

>
> > Actually I'm running dance(10) now on my machine and I hope there will be no
> > segfaults, because I simplified the code. It will take less than 3.5 hours 
> > to finish.
>
> Sorry, my hope was idle: same error!!

Arrg, change of strategy: Since rook_vector() consumes the vast amount
of time to compute I am trying to compute it on my box. If it does
compute without segfaulting for dance(10) I will then run the rest of
the code under valgrind, otherwise I will start adding debug print
statements into the code the rook_vector() code and the components
used. I am also running 2.8.9 + your patch from #931, which has
already been merged for 2.8.10.alpha0.

On a side note: Could you comment on #217? It is rather vague and
there have been some improvements. If you have more improvements
relative to #931 you should open another/more tickets for it/them.

>
> Jaap

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-26 Thread Jaap Spies

Jaap Spies wrote:

> 
> Actually I'm running dance(10) now on my machine and I hope there will be no
> segfaults, because I simplified the code. It will take less than 3.5 hours to 
> finish.
> 

Sorry, my hope was idle: same error!!

Jaap


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-26 Thread Jaap Spies

mabshoff wrote:
> Hello Jaap,
> 
> I assume you care about the following (computed on sage.math):
> 
> sage: dance(11)
> h^11 - 44*h^10 + 1045*h^9 - 16500*h^8 + 187935*h^7 - 1595748*h^6 +
> 10199343*h^5 - 48691500*h^4 + 169140180*h^3 - 405230320*h^2 +
> 600311624*h - 415232800
> 

Whow! This is great! I can now updat the corresponding entry in the OEIS.
My next wish is to find a formula for this kind of permanents!


I'm improving speed of the methods in matrix2.pyx in the meantime.
I'm makeing progress.

>> sage: time dance(9)
>> h^9 - 27*h^8 + 414*h^7 - 4158*h^6 + 29421*h^5 - 148743*h^4 + 530796*h^3 - 
>> 1276992*h^2 + 1866384*h - 1255608
>> CPU times: user 1453.75 s, sys: 8.03 s, total: 1461.78 s
>> Wall time: 1487.04

Actually I'm running dance(10) now on my machine and I hope there will be no
segfaults, because I simplified the code. It will take less than 3.5 hours to 
finish.



> I am currently computing dance(12) on sage.math. William: feel free to
> kill pid  5655 in case you consider this a waste of your resources.
> 
> I had to kill the valgrind process on my local box because valgrind
> stopped counting errors after the 1,000,000th. I also extrapolated how
> long it would take and I am not willing to wait a month for it to
> finish (if it does at all).
> 

Cheers!!!

> I started looking at the code itself to see if we cannot come up with
> an easier to run test case that segfaults and currently I am looking
> at the components like permanent() and so on that are using via
> rook_vector. So if you have any ideas where that code could overflow
> on 32 bit please let me know.
>

There are some issues related to the very art of permanents. There is no
one who want to compute the permanent of a matrix larger than an int x int!
In the literature you only will find calculations of certain types
of (0,1) matrices larger than 50 x 50, except the trivial ones.

I did send a patch for trac #931 to speed up the _choose part,
and also boost things up with a local variant of the binomial
function, more to come!(?)

Jaap





--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-26 Thread mabshoff

Hello Jaap,

I assume you care about the following (computed on sage.math):

sage: dance(11)
h^11 - 44*h^10 + 1045*h^9 - 16500*h^8 + 187935*h^7 - 1595748*h^6 +
10199343*h^5 - 48691500*h^4 + 169140180*h^3 - 405230320*h^2 +
600311624*h - 415232800

I am currently computing dance(12) on sage.math. William: feel free to
kill pid  5655 in case you consider this a waste of your resources.

I had to kill the valgrind process on my local box because valgrind
stopped counting errors after the 1,000,000th. I also extrapolated how
long it would take and I am not willing to wait a month for it to
finish (if it does at all).

I started looking at the code itself to see if we cannot come up with
an easier to run test case that segfaults and currently I am looking
at the components like permanent() and so on that are using via
rook_vector. So if you have any ideas where that code could overflow
on 32 bit please let me know.

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-25 Thread mabshoff



On Oct 24, 7:00 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> On 10/24/07, Jaap Spies <[EMAIL PROTECTED]> wrote:
>
>
>
> > It is not in matrix2.pyx, It is on the bottom of my first message and here 
> > below.
>
> > It uses some functions/methods present in matrix2.pyx:
> > rook_vector, permanental_minor, prod_of_row_sums, permanent.
>
> I think you should submit a patch to Sage so that this code gets
> included standard.  It could go somewhere in the combinat directory.
> Is it a sloane sequences?   It shouldn't be in sage.all by default,
> but should be easy for users to import.
>
>
>
>
>
> >  increase the amount of swap you have in that box. Adding more physical
> >  RAM will probably makes the problem go away, too.
> > >>> I rebooted with no succes. Same error using 50-60% of 1 GB memory.
> > >>> Swap space is 5 GB.
> > >>> Jaap
>
> > > Ok, I will investigate on a 32 bit box then. Could you give us
> > > distribution/gcc and so on please? It might be related to specific
> > > compilers.
>
> > Linux paix 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 23:10:59 EDT 2007 i686 i686 
> > i386 GNU/Linux
>
> > gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)
>
> > Jaap
>
> > ---
>
> > ##
> > #  Copyright (C) 2006 Jaap Spies, [EMAIL PROTECTED]
> > #
> > #  Distributed under the terms of the GNU General Public License (GPL):
> > #
> > #  http://www.gnu.org/licenses/
> > ##
>
> > """
> >   Usage from sage
>
> >   sage: attach 'dancing.sage'
>
> >   sage: dance(4)
> >   h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
> > """
>
> > # use variable 'h' in the polynomial ring over the rationals
>
> > h = QQ['h'].gen()
>
> > def dance(m):
> >   """
> >   Generates the polynomial solutions of the Dancing School Problem
> >   Based on a modification of theorem 7.2.1 from Brualdi and Ryser,
> >   Combinatorial Matrix Theory.
>
> >   See NAW 5/7 nr. 4 december 2006 p. 285
>
> >   INPUT: integer m
>
> >   OUTPUT: polynomial in 'h'
>
> >   EXAMPLE:
> >   sage: dance(4)
> >   h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
> >   AUTHOR: Jaap Spies (2006)
> >   """
> >   n = 2*m-2
> >   M = MatrixSpace(ZZ, m, n)
> >   A = M([0 for i in range(m*n)])
> >   for i in range(m):
> >   for j in range(n):
> >   if i > j or j > i + n - m:
> >   A[i,j] = 1
> >   rv = A.rook_vector()
> > #   print rv
> >   s = sum([(-1)^k*rv[k]*falling_factorial(m+h-k, m-k) for k in 
> > range(m+1)])
> >   print s
>

Hello folks,

over night I got a better bt on my local 32 bit FC 7 box:

sage: time dance(10)

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208650048 (LWP 3143)]
0x00786473 in strlen () from /lib/libc.so.6
(gdb)
(gdb) bt
#0  0x00786473 in strlen () from /lib/libc.so.6
#1  0x0809354f in PyString_FromFormatV (format=0x811ef6c "'%.200s'
object cannot be interpreted as an index",
vargs=) at Objects/stringobject.c:202
#2  0x080d4ee0 in PyErr_Format (exception=0x813b560, format=0x811ef6c
"'%.200s' object cannot be interpreted as an index")
at Python/errors.c:522
#3  0x0805d578 in PyNumber_Index (item=0x0) at Objects/abstract.c:965
#4  0x011ac5ee in
__pyx_f_py_20matrix_integer_dense_20Matrix_integer_dense_prod_of_row_sums
(__pyx_v_self=0xa4a8104,
__pyx_v_cols=0xaf96ecc) at sage/matrix/matrix_integer_dense.c:
13576
#5  0x0805a277 in PyObject_Call (func=0x0, arg=0xa4a048c, kw=0x0) at
Objects/abstract.c:1860
#6  0x080be7cc in PyEval_CallObjectWithKeywords (func=0xaf9b62c,
arg=0xa4a048c, kw=0x0) at Python/ceval.c:3433
#7  0x0805a490 in PyObject_CallObject (o=0xaf9b62c, a=0xa4a048c) at
Objects/abstract.c:1851
#8  0x07db98c6 in __pyx_f_py_7matrix2_6Matrix_permanent
(__pyx_v_self=0xa4a8104, unused=0x0) at sage/matrix/matrix2.c:1657
#9  0x0805a277 in PyObject_Call (func=0x0, arg=0xb7f1702c, kw=0x0) at
Objects/abstract.c:1860
#10 0x080be7cc in PyEval_CallObjectWithKeywords (func=0xaf9694c,
arg=0xb7f1702c, kw=0x0) at Python/ceval.c:3433
#11 0x0805a490 in PyObject_CallObject (o=0xaf9694c, a=0x0) at Objects/
abstract.c:1851
#12 0x07db8711 in __pyx_f_py_7matrix2_6Matrix_permanental_minor
(__pyx_v_self=0xa4a8a94, __pyx_arg_k=0x97f8ee4)
at sage/matrix/matrix2.c:2077
#13 0x0805a277 in PyObject_Call (func=0x0, arg=0xa41c44c, kw=0x0) at
Objects/abstract.c:1860
#14 0x080be7cc in PyEval_CallObjectWithKeywords (func=0xaf87cec,
arg=0xa41c44c, kw=0x0) at Python/ceval.c:3433
#15 0x0805a490 in PyObject_CallObject (o=0xaf87cec, a=0xa41c44c) at
Objects/abstract.c:1851
#16 0x07daf6e2 in __pyx_f_py_7matrix2_6Matrix_rook_vector
(__pyx_v_self=0xa4a8a94, __pyx_args=0xb7f1702c, __pyx_kwds=0x0)
at sage/matrix/matrix2.c:2402
#17 0x080c539c in PyEval_EvalFrameEx (f=0xb00497c, throwflag=0) at
Python/ceval.c:3564
#18 0x08

[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread William Stein

On 10/24/07, Jaap Spies <[EMAIL PROTECTED]> wrote:
>
> It is not in matrix2.pyx, It is on the bottom of my first message and here 
> below.
>
> It uses some functions/methods present in matrix2.pyx:
> rook_vector, permanental_minor, prod_of_row_sums, permanent.

I think you should submit a patch to Sage so that this code gets
included standard.  It could go somewhere in the combinat directory.
Is it a sloane sequences?   It shouldn't be in sage.all by default,
but should be easy for users to import.

>
>  increase the amount of swap you have in that box. Adding more physical
>  RAM will probably makes the problem go away, too.
> >>> I rebooted with no succes. Same error using 50-60% of 1 GB memory.
> >>> Swap space is 5 GB.
> >>> Jaap
> >
> > Ok, I will investigate on a 32 bit box then. Could you give us
> > distribution/gcc and so on please? It might be related to specific
> > compilers.
> >
> Linux paix 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 23:10:59 EDT 2007 i686 i686 i386 
> GNU/Linux
>
> gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)
>
> Jaap
>
> ---
>
> ##
> #  Copyright (C) 2006 Jaap Spies, [EMAIL PROTECTED]
> #
> #  Distributed under the terms of the GNU General Public License (GPL):
> #
> #  http://www.gnu.org/licenses/
> ##
>
> """
>   Usage from sage
>
>   sage: attach 'dancing.sage'
>
>   sage: dance(4)
>   h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
> """
>
> # use variable 'h' in the polynomial ring over the rationals
>
> h = QQ['h'].gen()
>
> def dance(m):
>   """
>   Generates the polynomial solutions of the Dancing School Problem
>   Based on a modification of theorem 7.2.1 from Brualdi and Ryser,
>   Combinatorial Matrix Theory.
>
>   See NAW 5/7 nr. 4 december 2006 p. 285
>
>   INPUT: integer m
>
>   OUTPUT: polynomial in 'h'
>
>   EXAMPLE:
>   sage: dance(4)
>   h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
>   AUTHOR: Jaap Spies (2006)
>   """
>   n = 2*m-2
>   M = MatrixSpace(ZZ, m, n)
>   A = M([0 for i in range(m*n)])
>   for i in range(m):
>   for j in range(n):
>   if i > j or j > i + n - m:
>   A[i,j] = 1
>   rv = A.rook_vector()
> #   print rv
>   s = sum([(-1)^k*rv[k]*falling_factorial(m+h-k, m-k) for k in 
> range(m+1)])
>   print s
>
>
> >
>


-- 
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread mabshoff



On Oct 24, 6:52 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> mabshoff wrote:
>
> > On Oct 24, 6:27 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> >> On 10/24/07, Jaap Spies <[EMAIL PROTECTED]> wrote:
>
> > I am on that, I got a 32 bit build of 2.8.9.alpha0.
> >> By the way, could you remind me where dance is defined?
>
> >> sage: search_src('dance')
> >> [nothing]
> >> sage:
>
> It is not in matrix2.pyx, It is on the bottom of my first message and heer 
> below.
>
> It uses some functions/methods present in matrix2.pyx:
> rook_vector, permanental_minor, prod_of_row_sums, permanent.
>

Yep, you are obviously right, sorry for the confusion.

>  increase the amount of swap you have in that box. Adding more physical
>  RAM will probably makes the problem go away, too.
> >>> I rebooted with no succes. Same error using 50-60% of 1 GB memory.
> >>> Swap space is 5 GB.
> >>> Jaap
>
> > Ok, I will investigate on a 32 bit box then. Could you give us
> > distribution/gcc and so on please? It might be related to specific
> > compilers.
>
> Linux paix 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 23:10:59 EDT 2007 i686 i686 i386 
> GNU/Linux
>
> gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)

Great, I have

Linux localhost.localdomain 2.6.22.1-41.fc7 #1 SMP Fri Jul 27 17:26:33
EDT 2007 i686 i686 i386 GNU/Linux

gcc version 4.1.2 20070626 (Red Hat 4.1.2-13)

so I will apply the latest FC7 updates.

>
> Jaap
>

Cheers,

Michael

> ---
>
> ##
> #  Copyright (C) 2006 Jaap Spies, [EMAIL PROTECTED]
> #
> #  Distributed under the terms of the GNU General Public License (GPL):
> #
> #  http://www.gnu.org/licenses/
> ##
>
> """
>   Usage from sage
>
>   sage: attach 'dancing.sage'
>
>   sage: dance(4)
>   h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
> """
>
> # use variable 'h' in the polynomial ring over the rationals
>
> h = QQ['h'].gen()
>
> def dance(m):
>   """
>   Generates the polynomial solutions of the Dancing School Problem
>   Based on a modification of theorem 7.2.1 from Brualdi and Ryser,
>   Combinatorial Matrix Theory.
>
>   See NAW 5/7 nr. 4 december 2006 p. 285
>
>   INPUT: integer m
>
>   OUTPUT: polynomial in 'h'
>
>   EXAMPLE:
>   sage: dance(4)
>   h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
>   AUTHOR: Jaap Spies (2006)
>   """
>   n = 2*m-2
>   M = MatrixSpace(ZZ, m, n)
>   A = M([0 for i in range(m*n)])
>   for i in range(m):
>   for j in range(n):
>   if i > j or j > i + n - m:
>   A[i,j] = 1
>   rv = A.rook_vector()
> #   print rv
>   s = sum([(-1)^k*rv[k]*falling_factorial(m+h-k, m-k) for k in 
> range(m+1)])
>   print s


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread Jaap Spies

mabshoff wrote:
> 
> 
> On Oct 24, 6:27 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
>> On 10/24/07, Jaap Spies <[EMAIL PROTECTED]> wrote:

> 
> I am on that, I got a 32 bit build of 2.8.9.alpha0.
>> By the way, could you remind me where dance is defined?
>>
>> sage: search_src('dance')
>> [nothing]
>> sage:
>>
>>

It is not in matrix2.pyx, It is on the bottom of my first message and heer 
below.

It uses some functions/methods present in matrix2.pyx:
rook_vector, permanental_minor, prod_of_row_sums, permanent.


 increase the amount of swap you have in that box. Adding more physical
 RAM will probably makes the problem go away, too.
>>> I rebooted with no succes. Same error using 50-60% of 1 GB memory.
>>> Swap space is 5 GB.
>>> Jaap
> 
> Ok, I will investigate on a 32 bit box then. Could you give us
> distribution/gcc and so on please? It might be related to specific
> compilers.
> 
Linux paix 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 23:10:59 EDT 2007 i686 i686 i386 
GNU/Linux

gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)

Jaap

---

##
#  Copyright (C) 2006 Jaap Spies, [EMAIL PROTECTED]
#
#  Distributed under the terms of the GNU General Public License (GPL):
#
#  http://www.gnu.org/licenses/
##

"""
  Usage from sage

  sage: attach 'dancing.sage'

  sage: dance(4)
  h^4 - 2*h^3 + 9*h^2 - 8*h + 6

"""

# use variable 'h' in the polynomial ring over the rationals

h = QQ['h'].gen()

def dance(m):
  """
  Generates the polynomial solutions of the Dancing School Problem
  Based on a modification of theorem 7.2.1 from Brualdi and Ryser,
  Combinatorial Matrix Theory.

  See NAW 5/7 nr. 4 december 2006 p. 285

  INPUT: integer m

  OUTPUT: polynomial in 'h'

  EXAMPLE:
  sage: dance(4)
  h^4 - 2*h^3 + 9*h^2 - 8*h + 6

  AUTHOR: Jaap Spies (2006)
  """
  n = 2*m-2
  M = MatrixSpace(ZZ, m, n)
  A = M([0 for i in range(m*n)])
  for i in range(m):
  for j in range(n):
  if i > j or j > i + n - m:
  A[i,j] = 1
  rv = A.rook_vector()
#   print rv
  s = sum([(-1)^k*rv[k]*falling_factorial(m+h-k, m-k) for k in range(m+1)])
  print s


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread mabshoff



On Oct 24, 6:27 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> On 10/24/07, Jaap Spies <[EMAIL PROTECTED]> wrote:
>
> > > The above corresponds to the following lines in matrix2.pyx:279-281:
>
> > > tmp = []
> > > for cols in lst:
> > > tmp.append(self.prod_of_row_sums(cols))
>
> > > The crash itself happens when the tmp.append() fails. That indicates a
>
> It would perhaps be worth changing the code to
>
> some_large_integer = 3**(10**5)
> tmp = [some_large_integer]*len(lst)
> for i in xrange(len(lst)):
>  tmp[i] = self.prod_of_row_sums(cols[i])
>
> and see what happens.  I mean, maybe the problem is in
> prod_of_row_sums instead of append?
>
> We really need to replicate this on another 32-bit machine.

I am on that, I got a 32 bit build of 2.8.9.alpha0.
>
> By the way, could you remind me where dance is defined?
>
> sage: search_src('dance')
> [nothing]
> sage:
>
>

It is in matrix/matrix2.pyx

>
> > > problem with python's memory management and not an issue in the Sage
> > > code. I would expect python to throw some kind of allocation error,
> > > but since python uses pymalloc it usually asks for a "large" amount of
> > > memory to parcel out smaller chunks for the requests it receives. That
> > > could explain why despite having consumed roughly 80% of memory
> > > available the allocation fails. The other 20% might also be taken up
> > > to a large extent by the kernel and user space, so that makes a
> > > situation where you encounter a "no more memory" situation very
> > > likely. The matrix code in Sage does throw a RuntimeError when it
> > > fails to allocate memory, so I think it is off the hook for now.
>
> > > Because you run at about 80% memory consumption and have a large
> > > uptime I would suggest to do a reboot and see if the computation still
> > > fails; hopefully your heap fragmentation problem is gone due to the
> > > reboot and the computation will succeed. Depending on the version of
> > > Sage you last tried this with the memory footprint of Sage might have
> > > grown in relative terms so that the computation no longer succeeds
> > > with the amount of RAM you have, so another possibility would be to
> > > increase the amount of swap you have in that box. Adding more physical
> > > RAM will probably makes the problem go away, too.
>
> > I rebooted with no succes. Same error using 50-60% of 1 GB memory.
> > Swap space is 5 GB.
>
> > Jaap
>

Ok, I will investigate on a 32 bit box then. Could you give us
distribution/gcc and so on please? It might be related to specific
compilers.

> > > If anything pops up from the still running debug sessions I will let
> > > you know.
>

Cheers,

Michael

> --
> William Stein
> Associate Professor of Mathematics
> University of Washingtonhttp://wstein.org


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread William Stein

On 10/24/07, Jaap Spies <[EMAIL PROTECTED]> wrote:
> > The above corresponds to the following lines in matrix2.pyx:279-281:
> >
> > tmp = []
> > for cols in lst:
> > tmp.append(self.prod_of_row_sums(cols))
> >
> > The crash itself happens when the tmp.append() fails. That indicates a

It would perhaps be worth changing the code to

some_large_integer = 3**(10**5)
tmp = [some_large_integer]*len(lst)
for i in xrange(len(lst)):
 tmp[i] = self.prod_of_row_sums(cols[i])

and see what happens.  I mean, maybe the problem is in
prod_of_row_sums instead of append?

We really need to replicate this on another 32-bit machine.

By the way, could you remind me where dance is defined?

sage: search_src('dance')
[nothing]
sage:

> > problem with python's memory management and not an issue in the Sage
> > code. I would expect python to throw some kind of allocation error,
> > but since python uses pymalloc it usually asks for a "large" amount of
> > memory to parcel out smaller chunks for the requests it receives. That
> > could explain why despite having consumed roughly 80% of memory
> > available the allocation fails. The other 20% might also be taken up
> > to a large extent by the kernel and user space, so that makes a
> > situation where you encounter a "no more memory" situation very
> > likely. The matrix code in Sage does throw a RuntimeError when it
> > fails to allocate memory, so I think it is off the hook for now.
> >
> > Because you run at about 80% memory consumption and have a large
> > uptime I would suggest to do a reboot and see if the computation still
> > fails; hopefully your heap fragmentation problem is gone due to the
> > reboot and the computation will succeed. Depending on the version of
> > Sage you last tried this with the memory footprint of Sage might have
> > grown in relative terms so that the computation no longer succeeds
> > with the amount of RAM you have, so another possibility would be to
> > increase the amount of swap you have in that box. Adding more physical
> > RAM will probably makes the problem go away, too.
> >
>
> I rebooted with no succes. Same error using 50-60% of 1 GB memory.
> Swap space is 5 GB.
>
>
> Jaap
>
>
> > If anything pops up from the still running debug sessions I will let
> > you know.
> >
>
>
> >
>


-- 
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread Jaap Spies

mabshoff wrote:

>> #20 0x0805a213 in PyIter_Next (iter=0xa70a66c) at Objects/abstract.c:2375
>> #21 0x0121c5bd in __pyx_f_py_7matrix2_6Matrix_permanent 
>> (__pyx_v_self=0x9c37194, unused=0x0) at sage/matrix/matrix2.c:1633
> 
> The above corresponds to the following lines in matrix2.pyx:279-281:
> 
> tmp = []
> for cols in lst:
> tmp.append(self.prod_of_row_sums(cols))
> 
> The crash itself happens when the tmp.append() fails. That indicates a
> problem with python's memory management and not an issue in the Sage
> code. I would expect python to throw some kind of allocation error,
> but since python uses pymalloc it usually asks for a "large" amount of
> memory to parcel out smaller chunks for the requests it receives. That
> could explain why despite having consumed roughly 80% of memory
> available the allocation fails. The other 20% might also be taken up
> to a large extent by the kernel and user space, so that makes a
> situation where you encounter a "no more memory" situation very
> likely. The matrix code in Sage does throw a RuntimeError when it
> fails to allocate memory, so I think it is off the hook for now.
> 
> Because you run at about 80% memory consumption and have a large
> uptime I would suggest to do a reboot and see if the computation still
> fails; hopefully your heap fragmentation problem is gone due to the
> reboot and the computation will succeed. Depending on the version of
> Sage you last tried this with the memory footprint of Sage might have
> grown in relative terms so that the computation no longer succeeds
> with the amount of RAM you have, so another possibility would be to
> increase the amount of swap you have in that box. Adding more physical
> RAM will probably makes the problem go away, too.
> 

I rebooted with no succes. Same error using 50-60% of 1 GB memory.
Swap space is 5 GB.


Jaap


> If anything pops up from the still running debug sessions I will let
> you know.
> 


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread mabshoff



On Oct 24, 12:31 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> Michael,

Hello Jaap,

>
> You wrote:
>
> > dance(10) computes fine on sage.math (in about 6 hours under gdb), I
> > am running dance(11) to see if it finishes [I guess you would like the
> > result ;)].
>
> Yes, sure!

Ok, should it finish I will send you the output.

>
> > So, any chance you are running the computation on a 32 bit
> > box and/or run out of memory/have highly fragmented memory after a
> > long uptime?
>
> There is a long uptime, but I saw in the system console a steady line
> on 80% memory usage.

Ok.

>
>   I have to wait for the valgrind run to finish (which will> take a long, 
> long time), to say something definite, but I will look
> > into it some more.
>
> Below the output from bt

Thanks.

>
> Thanks!
>
> Jaap
>
> sage: time dance(10)
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1208297792 (LWP 3914)]
> 0x0a6c2a64 in ?? ()
> (gdb) bt
> #0  0x0a6c2a64 in ?? ()
> #1  0x08087aa2 in PyObject_RichCompare (v=0x8f94f44, w=0x8f94f44, op=2) at 
> Objects/object.c:914
> #2  0x080c3862 in PyEval_EvalFrameEx (f=0xa7bb864, throwflag=0) at 
> Python/ceval.c:3980
> #3  0x0810b4ed in gen_send_ex (gen=0xa70a3ec, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #4  0x080c00cd in PyEval_EvalFrameEx (f=0xa7bd5e4, throwflag=0) at 
> Python/ceval.c:2164
> #5  0x0810b4ed in gen_send_ex (gen=0xa70a32c, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #6  0x080c00cd in PyEval_EvalFrameEx (f=0xa7bd15c, throwflag=0) at 
> Python/ceval.c:2164
> #7  0x0810b4ed in gen_send_ex (gen=0xa70a72c, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #8  0x080c00cd in PyEval_EvalFrameEx (f=0xa7bc4fc, throwflag=0) at 
> Python/ceval.c:2164
> #9  0x0810b4ed in gen_send_ex (gen=0xa70a70c, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #10 0x080c00cd in PyEval_EvalFrameEx (f=0xa7bea2c, throwflag=0) at 
> Python/ceval.c:2164
> #11 0x0810b4ed in gen_send_ex (gen=0xa70a6ec, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #12 0x080c00cd in PyEval_EvalFrameEx (f=0xa7be134, throwflag=0) at 
> Python/ceval.c:2164
> #13 0x0810b4ed in gen_send_ex (gen=0xa70a68c, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #14 0x080c00cd in PyEval_EvalFrameEx (f=0xa7bf77c, throwflag=0) at 
> Python/ceval.c:2164
> #15 0x0810b4ed in gen_send_ex (gen=0xa70a6cc, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #16 0x080c00cd in PyEval_EvalFrameEx (f=0xa7be434, throwflag=0) at 
> Python/ceval.c:2164
> #17 0x0810b4ed in gen_send_ex (gen=0xa70aa2c, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #18 0x080c00cd in PyEval_EvalFrameEx (f=0xa7bb9dc, throwflag=0) at 
> Python/ceval.c:2164
> #19 0x0810b4ed in gen_send_ex (gen=0xa70a66c, arg=0x16, exc=0) at 
> Objects/genobject.c:82
> #20 0x0805a213 in PyIter_Next (iter=0xa70a66c) at Objects/abstract.c:2375
> #21 0x0121c5bd in __pyx_f_py_7matrix2_6Matrix_permanent 
> (__pyx_v_self=0x9c37194, unused=0x0) at sage/matrix/matrix2.c:1633

The above corresponds to the following lines in matrix2.pyx:279-281:

tmp = []
for cols in lst:
tmp.append(self.prod_of_row_sums(cols))

The crash itself happens when the tmp.append() fails. That indicates a
problem with python's memory management and not an issue in the Sage
code. I would expect python to throw some kind of allocation error,
but since python uses pymalloc it usually asks for a "large" amount of
memory to parcel out smaller chunks for the requests it receives. That
could explain why despite having consumed roughly 80% of memory
available the allocation fails. The other 20% might also be taken up
to a large extent by the kernel and user space, so that makes a
situation where you encounter a "no more memory" situation very
likely. The matrix code in Sage does throw a RuntimeError when it
fails to allocate memory, so I think it is off the hook for now.

Because you run at about 80% memory consumption and have a large
uptime I would suggest to do a reboot and see if the computation still
fails; hopefully your heap fragmentation problem is gone due to the
reboot and the computation will succeed. Depending on the version of
Sage you last tried this with the memory footprint of Sage might have
grown in relative terms so that the computation no longer succeeds
with the amount of RAM you have, so another possibility would be to
increase the amount of swap you have in that box. Adding more physical
RAM will probably makes the problem go away, too.

If anything pops up from the still running debug sessions I will let
you know.

Cheers,

Michael

> #22 0x0805a277 in PyObject_Call (func=0x92c1a54, arg=0xb7f6d02c, kw=0x0) at 
> Objects/abstract.c:1860
> ---Type  to continue, or q  to quit---
> #23 0x080be7cc in PyEval_CallObjectWithKeywords (func=0xa71282c, 
> arg=0xb7f6d02c, kw=0x0) at Python/ceval.c:3433
> #24 0x0805a490 in PyObject_CallObject (o=0xa71282c, a=0x0) at 
> Objects/abstract.c:1851
> #25 0x0121b491 in __pyx_f_py_7matrix2_6Matrix_pe

[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-24 Thread Jaap Spies

Michael,
You wrote:
> 
> dance(10) computes fine on sage.math (in about 6 hours under gdb), I
> am running dance(11) to see if it finishes [I guess you would like the
> result ;)]. 

Yes, sure!

So, any chance you are running the computation on a 32 bit
> box and/or run out of memory/have highly fragmented memory after a
> long uptime?

There is a long uptime, but I saw in the system console a steady line
on 80% memory usage.


  I have to wait for the valgrind run to finish (which will
> take a long, long time), to say something definite, but I will look
> into it some more.
> 
Below the output from bt

Thanks!

Jaap



sage: time dance(10)

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208297792 (LWP 3914)]
0x0a6c2a64 in ?? ()
(gdb) bt
#0  0x0a6c2a64 in ?? ()
#1  0x08087aa2 in PyObject_RichCompare (v=0x8f94f44, w=0x8f94f44, op=2) at 
Objects/object.c:914
#2  0x080c3862 in PyEval_EvalFrameEx (f=0xa7bb864, throwflag=0) at 
Python/ceval.c:3980
#3  0x0810b4ed in gen_send_ex (gen=0xa70a3ec, arg=0x16, exc=0) at 
Objects/genobject.c:82
#4  0x080c00cd in PyEval_EvalFrameEx (f=0xa7bd5e4, throwflag=0) at 
Python/ceval.c:2164
#5  0x0810b4ed in gen_send_ex (gen=0xa70a32c, arg=0x16, exc=0) at 
Objects/genobject.c:82
#6  0x080c00cd in PyEval_EvalFrameEx (f=0xa7bd15c, throwflag=0) at 
Python/ceval.c:2164
#7  0x0810b4ed in gen_send_ex (gen=0xa70a72c, arg=0x16, exc=0) at 
Objects/genobject.c:82
#8  0x080c00cd in PyEval_EvalFrameEx (f=0xa7bc4fc, throwflag=0) at 
Python/ceval.c:2164
#9  0x0810b4ed in gen_send_ex (gen=0xa70a70c, arg=0x16, exc=0) at 
Objects/genobject.c:82
#10 0x080c00cd in PyEval_EvalFrameEx (f=0xa7bea2c, throwflag=0) at 
Python/ceval.c:2164
#11 0x0810b4ed in gen_send_ex (gen=0xa70a6ec, arg=0x16, exc=0) at 
Objects/genobject.c:82
#12 0x080c00cd in PyEval_EvalFrameEx (f=0xa7be134, throwflag=0) at 
Python/ceval.c:2164
#13 0x0810b4ed in gen_send_ex (gen=0xa70a68c, arg=0x16, exc=0) at 
Objects/genobject.c:82
#14 0x080c00cd in PyEval_EvalFrameEx (f=0xa7bf77c, throwflag=0) at 
Python/ceval.c:2164
#15 0x0810b4ed in gen_send_ex (gen=0xa70a6cc, arg=0x16, exc=0) at 
Objects/genobject.c:82
#16 0x080c00cd in PyEval_EvalFrameEx (f=0xa7be434, throwflag=0) at 
Python/ceval.c:2164
#17 0x0810b4ed in gen_send_ex (gen=0xa70aa2c, arg=0x16, exc=0) at 
Objects/genobject.c:82
#18 0x080c00cd in PyEval_EvalFrameEx (f=0xa7bb9dc, throwflag=0) at 
Python/ceval.c:2164
#19 0x0810b4ed in gen_send_ex (gen=0xa70a66c, arg=0x16, exc=0) at 
Objects/genobject.c:82
#20 0x0805a213 in PyIter_Next (iter=0xa70a66c) at Objects/abstract.c:2375
#21 0x0121c5bd in __pyx_f_py_7matrix2_6Matrix_permanent 
(__pyx_v_self=0x9c37194, unused=0x0) at sage/matrix/matrix2.c:1633
#22 0x0805a277 in PyObject_Call (func=0x92c1a54, arg=0xb7f6d02c, kw=0x0) at 
Objects/abstract.c:1860
---Type  to continue, or q  to quit---
#23 0x080be7cc in PyEval_CallObjectWithKeywords (func=0xa71282c, 
arg=0xb7f6d02c, kw=0x0) at Python/ceval.c:3433
#24 0x0805a490 in PyObject_CallObject (o=0xa71282c, a=0x0) at 
Objects/abstract.c:1851
#25 0x0121b491 in __pyx_f_py_7matrix2_6Matrix_permanental_minor 
(__pyx_v_self=0x9c37fa4, __pyx_arg_k=0x8f94ee4)
 at sage/matrix/matrix2.c:2074
#26 0x0805a277 in PyObject_Call (func=0x92c1a54, arg=0x9186b4c, kw=0x0) at 
Objects/abstract.c:1860
#27 0x080be7cc in PyEval_CallObjectWithKeywords (func=0xa70ab2c, arg=0x9186b4c, 
kw=0x0) at Python/ceval.c:3433
#28 0x0805a490 in PyObject_CallObject (o=0xa70ab2c, a=0x9186b4c) at 
Objects/abstract.c:1851
#29 0x01212462 in __pyx_f_py_7matrix2_6Matrix_rook_vector 
(__pyx_v_self=0x9c37fa4, __pyx_args=0xb7f6d02c, __pyx_kwds=0x0)
 at sage/matrix/matrix2.c:2399
#30 0x080c539c in PyEval_EvalFrameEx (f=0x9c58bf4, throwflag=0) at 
Python/ceval.c:3564
#31 0x080c57c5 in PyEval_EvalFrameEx (f=0x9c662b4, throwflag=0) at 
Python/ceval.c:3650
#32 0x080c65d5 in PyEval_EvalCodeEx (co=0x9c37068, globals=0xa7048ac, 
locals=0xa7048ac, args=0x0, argcount=0, kws=0x0,
 kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2831
#33 0x080c6647 in PyEval_EvalCode (co=0x9c37068, globals=0xa7048ac, 
locals=0xa7048ac) at Python/ceval.c:494
#34 0x080bdc01 in builtin_eval (self=0x0, args=0xa70886c) at 
Python/bltinmodule.c:571
#35 0x080c539c in PyEval_EvalFrameEx (f=0x9ba3cbc, throwflag=0) at 
Python/ceval.c:3564
#36 0x080c65d5 in PyEval_EvalCodeEx (co=0x91c8338, globals=0x91bd1c4, 
locals=0x0, args=0x9c5af7c, argcount=2, kws=0x9c5af84,
 kwcount=0, defs=0x91dc6d8, defcount=1, closure=0x0) at Python/ceval.c:2831
#37 0x080c4a89 in PyEval_EvalFrameEx (f=0x9c5ae2c, throwflag=0) at 
Python/ceval.c:3660
#38 0x080c57c5 in PyEval_EvalFrameEx (f=0x9b8a354, throwflag=0) at 
Python/ceval.c:3650
#39 0x080c65d5 in PyEval_EvalCodeEx (co=0x9ae78d8, globals=0xa7048ac, 
locals=0xa7048ac, args=0x0, argcount=0, kws=0x0,
 kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2831
#40 0x080c5852 in PyEval_EvalFrameEx (f=0x9c5a61c, throwflag=0) at 
Python/ceval.c:494
---Type  to continue, or q  to 

[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-23 Thread mabshoff

Jaap,

dance(10) computes fine on sage.math (in about 6 hours under gdb), I
am running dance(11) to see if it finishes [I guess you would like the
result ;)]. So, any chance you are running the computation on a 32 bit
box and/or run out of memory/have highly fragmented memory after a
long uptime? I have to wait for the valgrind run to finish (which will
take a long, long time), to say something definite, but I will look
into it some more.

Cheers,

Michael


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---



[sage-devel] Re: Unhandled SIGSEGV: Ticket #973

2007-10-23 Thread mabshoff



On Oct 23, 1:23 pm, Jaap Spies <[EMAIL PROTECTED]> wrote:
> Last year after less than two days I could finish
> a calculation and write to William:
>
>
>
> >  Original Message 
> > Subject: dance(10)
> > Date: Thu, 09 Mar 2006 00:10:19 +0100
> > From: Jaap Spies <[EMAIL PROTECTED]>
> > To: William Stein <[EMAIL PROTECTED]>
>
> > William,
>
> > Some time ago, dance(10) finished:
>
> > sage: time dance(10)
> > [1, 90, 3405, 71040, 901152, 7225344, 36862960, 117340800, 221170456,
> > 220658800, 87396728]
> > h^10 - 35*h^9 + 675*h^8 - 8610*h^7 + 78435*h^6 - 523467*h^5 +
> > 2562525*h^4 - 9008160*h^3 + 21623220*h^2 - 31840760*h + 21750840
> > CPU times: user 141822.70 s, sys: 18387.03 s, total: 160209.74 s
> > Wall time: 165997.13
>
> > Jaap
>
> To day I got with sage-2.8.8.1:
>
> > sage: time dance(10)
>
> > 
> > Unhandled SIGSEGV: A segmentation fault occured in SAGE.
> > This probably occured because a *compiled* component
> > of SAGE has a bug in it (typically accessing invalid memory)
> > or is not properly wrapped with _sig_on, _sig_off.
> > You might want to run SAGE under gdb with 'sage -gdb' to debug this.
> > SAGE will now terminate (sorry).
> > 
>
> With sage -gdb:
>
> > sage: time dance(9)
> > h^9 - 27*h^8 + 414*h^7 - 4158*h^6 + 29421*h^5 - 148743*h^4 + 530796*h^3 - 
> > 1276992*h^2 + 1866384*h - 1255608
> > CPU times: user 1786.82 s, sys: 23.05 s, total: 1809.87 s
> > Wall time: 1831.52
>
> > sage: time dance(10)
>
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread -1208523072 (LWP 30162)]
> > 0x0064d473 in strlen () from /lib/libc.so.6
> > (gdb)
>
> The program (see below) uses methods from sage.matrix.matrix2.pyx
>
> Jaap

Hello Jaap,

I am looking into this, but I am pressed for time until Thursday, so
don't expect any solution before that. It will take forever to
valgrind this anyway. If you still have that gdb session it would be
great to get a backtrace. I did run the code with smaller parameters,
i.e. dance(4) and dance(5), under valgrind and nothing popped up
there. So I am suspecting that the code might smash the stack with
large parameters, but that ought to be taken with a grain of salt
until I get a backtrace and some valgrind output

Cheers,

Michael

>
> ##
> #  Copyright (C) 2006 Jaap Spies, [EMAIL PROTECTED]
> #
> #  Distributed under the terms of the GNU General Public License (GPL):
> #
> #  http://www.gnu.org/licenses/
> ##
>
> """
>  Usage from sage
>
>  sage: attach 'dancing.sage'
>
>  sage: dance(4)
>  h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
> """
>
> # use variable 'h' in the polynomial ring over the rationals
>
> h = QQ['h'].gen()
>
> def dance(m):
>  """
>  Generates the polynomial solutions of the Dancing School Problem
>  Based on a modification of theorem 7.2.1 from Brualdi and Ryser,
>  Combinatorial Matrix Theory.
>
>  See NAW 5/7 nr. 4 december 2006 p. 285
>
>  INPUT: integer m
>
>  OUTPUT: polynomial in 'h'
>
>  EXAMPLE:
>  sage: dance(4)
>  h^4 - 2*h^3 + 9*h^2 - 8*h + 6
>
>  AUTHOR: Jaap Spies (2006)
>  """
>  n = 2*m-2
>  M = MatrixSpace(ZZ, m, n)
>  A = M([0 for i in range(m*n)])
>  for i in range(m):
>  for j in range(n):
>  if i > j or j > i + n - m:
>  A[i,j] = 1
>  rv = A.rook_vector()
> #   print rv
>  s = sum([(-1)^k*rv[k]*falling_factorial(m+h-k, m-k) for k in range(m+1)])
>  print s


--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~--~~~~--~~--~--~---