On Thu, Aug 5, 2010 at 4:26 AM, Mitesh Patel <qed...@gmail.com> wrote:
> On 08/04/2010 03:10 AM, Robert Bradshaw wrote:
>> On Sat, Jul 31, 2010 at 2:51 AM, Mitesh Patel <qed...@gmail.com> wrote:
>>> On 07/30/2010 01:54 AM, Craig Citro wrote:
>>>> So we're currently working on a long-overdue release of Cython with
>>>> all kinds of snazzy new features. However, our automated testing
>>>> system seems to keep turning up sporadic segfaults when running the
>>>> sage doctest suite. This is obviously bad, but we're having a hard
>>>> time reproducing this -- they seem to be *very* occasional failures
>>>> while starting up sage, and thus far the only consistent appearance
>>>> has been *within* our automated testing system (hudson). We've got a
>>>> pile of dumped cores, which have mostly led us to the conclusions that
>>>> (1) the problem occurs at a seemingly random point, so we should
>>>> suspect some sort of memory corruption, and (2) sage does a *whole*
>>>> lot of stuff when it starts up. ;)
>>>> [...]
>>>> After that, run the full test suite as many times as you're willing,
>>>> hopefully with and without parallel doctesting (i.e. sage -tp). Then
>>>> let us know what you turn up -- lots of random failures, or does
>>>> everything pass? Points for machines we can ssh into and generated
>>>> core files (ulimit -c unlimited), and even more points for anyone
>>>> seeing consistent/repeatable failures. I'd also be very interested in
>>>> reports that you've run the test suite N times with no failures.
>>>
>>> Below are some results from parallel doctests on sage.math.  In each of
>>>
>>> /mnt/usb1/scratch/mpatel/tmp/sage-4.4.4-cython
>>> /mnt/usb1/scratch/mpatel/tmp/sage-4.5.1-cython
>>>
>>> I have run (or am still running)
>>>
>>> ./tester  | tee -a ztester &
>>>
>>> where 'tester' contains
>>>
>>> #!/bin/bash
>>> ulimit -c unlimited
>>>
>>> RUNS=20
>>> for I in `seq 1 $RUNS`;
>>> do
>>>    LOG="ptestlong-j20-$I.log"
>>>    if [ ! -f "$LOG" ]; then
>>>        echo "Run $I of $RUNS"
>>>        nice ./sage -tp 20 -long -sagenb devel/sage > "$LOG" 2>&1
>>>
>>>        # grep -A2 -B1 dumped "$LOG"
>>>        ls -lsFtr `find -type f -name core` | grep core | tee -a "$LOG"
>>>
>>>        # Rename each core to core_cy.$I
>>>        rm -f _ren
>>>        find -name core -type f | awk '{print "mv "$0" "$0"_cy.'${I}'"}' > _ren
>>>        . _ren
>>>    fi
>>> done
>>>
>>> The log files and cores (renamed to core_cy.1, etc.) are still in/under
>>> SAGE_ROOT.
>>>
>>> I don't know if the results tell you more than you already know.  For
>>> example,
>>>
>>> sage-4.5.1-cython$ for x in `\ls ptestlong-j20-*`; do grep "doctests failed" $x; done | grep -v "0 doctests failed" | sort | uniq -c
>>>      1         sage -t -long  devel/sage/sage/graphs/graph.py # 2 doctests failed
>>>     19         sage -t -long  devel/sage/sage/tests/startup.py # 1 doctests failed
>>>
>>> But
>>>
>>> sage-4.5.1-cython$ find -name core_cy\* | sort
>>> ./data/extcode/genus2reduction/core_cy.1
>>> ./data/extcode/genus2reduction/core_cy.10
>>> ./data/extcode/genus2reduction/core_cy.11
>>> ./data/extcode/genus2reduction/core_cy.12
>>> ./data/extcode/genus2reduction/core_cy.13
>>> ./data/extcode/genus2reduction/core_cy.14
>>> ./data/extcode/genus2reduction/core_cy.15
>>> ./data/extcode/genus2reduction/core_cy.16
>>> ./data/extcode/genus2reduction/core_cy.17
>>> ./data/extcode/genus2reduction/core_cy.18
>>> ./data/extcode/genus2reduction/core_cy.19
>>> ./data/extcode/genus2reduction/core_cy.2
>>> ./data/extcode/genus2reduction/core_cy.20
>>> ./data/extcode/genus2reduction/core_cy.3
>>> ./data/extcode/genus2reduction/core_cy.4
>>> ./data/extcode/genus2reduction/core_cy.5
>>> ./data/extcode/genus2reduction/core_cy.6
>>> ./data/extcode/genus2reduction/core_cy.7
>>> ./data/extcode/genus2reduction/core_cy.8
>>> ./data/extcode/genus2reduction/core_cy.9
>>> ./devel/sage-main/doc/fr/tutorial/core_cy.17
>>> ./devel/sage-main/sage/algebras/core_cy.17
>>> ./devel/sage-main/sage/categories/core_cy.4
>>> ./devel/sage-main/sage/categories/core_cy.6
>>> ./devel/sage-main/sage/combinat/root_system/core_cy.12
>>> ./devel/sage-main/sage/databases/core_cy.1
>>> ./devel/sage-main/sage/databases/core_cy.18
>>> ./devel/sage-main/sage/ext/core_cy.18
>>> ./devel/sage-main/sage/groups/matrix_gps/core_cy.5
>>> ./devel/sage-main/sage/gsl/core_cy.4
>>> ./devel/sage-main/sage/misc/core_cy.10
>>> ./devel/sage-main/sage/misc/core_cy.17
>>> ./devel/sage-main/sage/misc/core_cy.2
>>> ./devel/sage-main/sage/modular/abvar/core_cy.7
>>> ./devel/sage-main/sage/plot/plot3d/core_cy.19
>>> ./devel/sage-main/sage/rings/core_cy.20
>>> ./local/lib/python2.6/site-packages/sagenb-0.8.1-py2.6.egg/sagenb/testing/tests/core_cy.19
>>>
>>> Should I test differently?
>>
>> So it looks like you're getting segfaults all over the place as
>> well... Hmm... Could you test with
>> https://sage.math.washington.edu:8091/hudson/job/sage-build/163/artifact/cython-devel.spkg
>> ?
>
> With the new package, I get similar results, i.e., apparently random
> segfaults.  The core score is about the same:

Well, it's clear there's something going on. What about testing on a
plain-vanilla sage (with the old Cython)?
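
For a plain-vanilla run it would help to keep its cores distinguishable
from the core_cy.* and core_cy2.* ones. A minimal sketch of the rename
step from the quoted tester script, done without the generated _ren file.
The function name rename_cores and the tag "van" are made up for
illustration, and this has not been run against the actual tree:

```shell
#!/bin/bash
# Hypothetical helper: rename every file named "core" under ROOT
# to core_<TAG>.<RUN>, leaving it in the directory where it was found.
rename_cores() {
    local root="$1" tag="$2" run="$3"
    # find hands the paths to sh as arguments, so paths containing
    # spaces or other odd characters are renamed safely.
    find "$root" -type f -name core -exec sh -c '
        tag=$1; run=$2; shift 2
        for f; do
            mv -- "$f" "${f}_${tag}.${run}"
        done
    ' sh "$tag" "$run" {} +
}
```

Inside the loop, `rename_cores . van "$I"` would then take the place of
the find/awk/_ren lines.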

> $ find -name core_cy2.\* | sort
> ./data/extcode/genus2reduction/core_cy2.1
> ./data/extcode/genus2reduction/core_cy2.10
> ./data/extcode/genus2reduction/core_cy2.11
> ./data/extcode/genus2reduction/core_cy2.12
> ./data/extcode/genus2reduction/core_cy2.13
> ./data/extcode/genus2reduction/core_cy2.14
> ./data/extcode/genus2reduction/core_cy2.15
> ./data/extcode/genus2reduction/core_cy2.16
> ./data/extcode/genus2reduction/core_cy2.17
> ./data/extcode/genus2reduction/core_cy2.18
> ./data/extcode/genus2reduction/core_cy2.19
> ./data/extcode/genus2reduction/core_cy2.2
> ./data/extcode/genus2reduction/core_cy2.20
> ./data/extcode/genus2reduction/core_cy2.3
> ./data/extcode/genus2reduction/core_cy2.4
> ./data/extcode/genus2reduction/core_cy2.5
> ./data/extcode/genus2reduction/core_cy2.6
> ./data/extcode/genus2reduction/core_cy2.7
> ./data/extcode/genus2reduction/core_cy2.8
> ./data/extcode/genus2reduction/core_cy2.9
> ./devel/sage-main/doc/en/reference/core_cy2.20
> ./devel/sage-main/doc/en/reference/sagenb/misc/core_cy2.9
> ./devel/sage-main/sage/categories/core_cy2.3
> ./devel/sage-main/sage/combinat/core_cy2.13
> ./devel/sage-main/sage/combinat/matrices/core_cy2.16
> ./devel/sage-main/sage/crypto/core_cy2.17
> ./devel/sage-main/sage/interfaces/core_cy2.12
> ./devel/sage-main/sage/interfaces/core_cy2.9
> ./devel/sage-main/sage/matrix/core_cy2.16
> ./devel/sage-main/sage/misc/core_cy2.17
> ./devel/sage-main/sage/misc/core_cy2.2
> ./devel/sage-main/sage/misc/core_cy2.5
> ./devel/sage-main/sage/modular/abvar/core_cy2.20
> ./devel/sage-main/sage/rings/core_cy2.6
> ./devel/sage-main/sage/rings/polynomial/core_cy2.10
> ./local/lib/python2.6/site-packages/sagenb-0.8.1-py2.6.egg/sagenb/interfaces/core_cy2.5
>
> (I've moved the earlier logs to SAGE_ROOT/oldlogs but left the
> corresponding core_cy.* in place.)
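
To compare the "core score" between the two listings, the run number can
be pulled off the end of each path and counted. A rough sketch, assuming
the core_cy.N / core_cy2.N naming used above; cores_per_run is a
hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: read core-file paths on stdin, one per line,
# and print "<run> <count>" for each run number, in numeric run order.
cores_per_run() {
    # Keep only the trailing ".N" run number of each path.
    sed -n 's/.*\.\([0-9][0-9]*\)$/\1/p' |
        sort -n |
        uniq -c |
        awk '{print $2, $1}'
}
```

For example, `find . -name 'core_cy2.*' | grep -v genus2reduction | cores_per_run`
would leave out genus2reduction, which shows a core in every one of the
20 runs in both listings, and count only the sporadic ones.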
>
> By the way, in order to get 'sage -ba' to succeed for the 4.5.2 series,
> I added an 8th patch (attached).  I don't know if the changes are OK,
> but in limited testing, the doctests pass, modulo the heisenbug.

Yeah, it looks fine (and similar to some of the other stuff we had to do).

- Robert

-- 
To post to this group, send an email to sage-devel@googlegroups.com
To unsubscribe from this group, send an email to 
sage-devel+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org
