Re: VT_WAITACTIVE (was: Re: ffreep support on Geode LX (XO-1))

2009-12-03 Thread Martin Langhoff
On Thu, Dec 3, 2009 at 4:13 PM, Sascha Silbe wrote:
> There's been some work in 2.6.32 on some shiny new interface that is
> supposed not to have the VT_WAITACTIVE bugs: [1]
> Maybe ul-warning can switch to that once the olpc tree has been rebased on
> top of 2.6.32... (or maybe not; haven't taken any look at it)

Interesting. For anyone interested in this, I've reimplemented dsd's
trick in chvt.pyx so it doesn't affect us anymore. The upcoming 8.2.2
has the fix.

For the F11 track, dsd was going to merge the fix into
modules-dracut-olpc to keep things consistent, even though that
codepath is now only hit on boot, and there's AFAIK nothing to race
against.

Not sure what handles the shutdown-time VT switch to the ul-warning,
but dsd sounded confident that it's not racy.

The bug report:
http://dev.laptop.org/ticket/7531



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


VT_WAITACTIVE (was: Re: ffreep support on Geode LX (XO-1))

2009-12-03 Thread Sascha Silbe

On Thu, May 07, 2009 at 04:21:35PM -0400, Daniel Drake wrote:

> > i think you're exactly right -- the fact that ul-warning completes
> > when you manually switch screens is pretty convincing.  i'm
> > amazed the vt system has never grown a new api of some sort to
> > fix this problem.  because powerd puts up an additional shutdown
> > splash screen prior to the ul-warning screen, the sequencing is
> > perturbed, and i see it (the race) more often.
> >
> > i think we should apply the bandaid of a retry to the chvt code --
> > i.e., add a timeout to the wait, and retry (or, at the very
> > least, exit) if it expires.  i actually went in to make the code
> > changes the other day, but i backed off when i realized the
> > source was in pyrex, not C.  i realized i wasn't sure i'd be able
> > to do the signal handling successfully (in the time i'd allotted
> > myself, at any rate).
>
> I encountered this bug while working on another product, and came up
> with a solution that isn't quite so complex.  It involves not using
> VT_WAITACTIVE and instead just polling the v_active member of the
> VT_GETSTATE result in a loop, retrying the VT_ACTIVATE until happy.
>
> Site seems down at the moment but here it is from google's cache:
> http://74.125.47.132/search?q=cache:dmF2nkv9GL0J:www.brontes3d.com/opensource/dist/v1.2/overlay/sys-apps/kbd/files/kbd-1.12-chvt-userwait.patch+chvt+userwait&cd=1&hl=en&ct=clnk&gl=py&client=firefox-a
>
> AFAIK the patch was never accepted upstream (due to it being very
> inactive or dead) but I definitely submitted it.
>
> This is something that we could perhaps roll into bobby's work on
> bootanim/ul-warning, assuming it is affected in the first place...

[Full quote to refresh everyone's memory, because it's a rather old mail
I'm replying to]


There's been some work in 2.6.32 on some shiny new interface that is 
supposed not to have the VT_WAITACTIVE bugs: [1]
Maybe ul-warning can switch to that once the olpc tree has been rebased 
on top of 2.6.32... (or maybe not; haven't taken any look at it)



[1] 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=8b92e87d39bfd046e7581e1fe0f40eac40f88608


CU Sascha

--
http://sascha.silbe.org/
http://www.infra-silbe.de/



Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread Daniel Drake
2009/5/7  :
> i think you're exactly right -- the fact that ul-warning completes
> when you manually switch screens is pretty convincing.  i'm
> amazed the vt system has never grown a new api of some sort to
> fix this problem.  because powerd puts up an additional shutdown
> splash screen prior to the ul-warning screen, the sequencing is
> perturbed, and i see it (the race) more often.
>
> i think we should apply the bandaid of a retry to the chvt code --
> i.e., add a timeout to the wait, and retry (or, at the very
> least, exit) if it expires.  i actually went in to make the code
> changes the other day, but i backed off when i realized the
> source was in pyrex, not C.  i realized i wasn't sure i'd be able
> to do the signal handling successfully (in the time i'd allotted
> myself, at any rate).

I encountered this bug while working on another product, and came up
with a solution that isn't quite so complex.  It involves not using
VT_WAITACTIVE and instead just polling the v_active member of the
VT_GETSTATE result in a loop, retrying the VT_ACTIVATE until happy.

Site seems down at the moment but here it is from google's cache:
http://74.125.47.132/search?q=cache:dmF2nkv9GL0J:www.brontes3d.com/opensource/dist/v1.2/overlay/sys-apps/kbd/files/kbd-1.12-chvt-userwait.patch+chvt+userwait&cd=1&hl=en&ct=clnk&gl=py&client=firefox-a

AFAIK the patch was never accepted upstream (due to it being very
inactive or dead) but I definitely submitted it.

This is something that we could perhaps roll into bobby's work on
bootanim/ul-warning, assuming it is affected in the first place...
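
A minimal sketch of the polling approach described above, written in
Python rather than taken from the kbd patch (which is not reproduced
here). The ioctl numbers and the vt_stat layout come from <linux/vt.h>;
the function names, the attempts/delay parameters and the injectable
ioctl/sleep arguments are assumptions made for illustration.

```python
import fcntl
import struct
import time

# ioctl request numbers and struct vt_stat layout, from <linux/vt.h>
VT_GETSTATE = 0x5603
VT_ACTIVATE = 0x5606
VT_STAT = struct.Struct("HHH")  # v_active, v_signal, v_state


def active_vt(fd, ioctl=fcntl.ioctl):
    """Return the v_active member of the VT_GETSTATE result."""
    raw = ioctl(fd, VT_GETSTATE, bytes(VT_STAT.size))
    return VT_STAT.unpack(raw)[0]


def chvt_polling(fd, vt, attempts=50, delay=0.1,
                 ioctl=fcntl.ioctl, sleep=time.sleep):
    """Switch to `vt` without VT_WAITACTIVE.

    Poll v_active and re-issue VT_ACTIVATE until the requested VT is
    active or `attempts` (a hypothetical limit) is exhausted.
    """
    for _ in range(attempts):
        if active_vt(fd, ioctl) == vt:
            return True
        ioctl(fd, VT_ACTIVATE, vt)  # (re)request the switch
        sleep(delay)                # give the kernel time to switch
    return active_vt(fd, ioctl) == vt
```

On a real system this would be called with a file descriptor for a
console device such as /dev/tty0, which typically requires root. Since
nothing ever blocks in the kernel, a competing switch can at worst cost
a few extra iterations.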

Daniel


Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread pgf
daniel wrote:
 > 2009/5/7 NoiseEHC :
 > >
 > >> please file a ticket at dev.laptop.org, with details on how to
 > >> reproduce the ffreep issue using build 802.  (if it's only
 > >> reproducible with debxo (unclear from what's been written so
 > >> far), then the priority (and the fix) will likely be very
 > >> different.)
 > >>
 > >>
 > > I cannot reproduce it reliably so I do not know what should I file as a
 > > bug. What I have seen with 800 (months ago) can be seen here:
 > > http://wiki.laptop.org/go/Image:Xo_freeze_on_shutdown.png
 > > I think that it is sure that at least some parts of the 80x versions
 > > contain some code compiled with illegal instructions.
 > 
 > I'm pretty sure the shutdown "freeze" (which can be unfrozen by using
 > ctrl+alt+f1/f2 to change terminals) is totally unrelated to any
 > instruction problems. I am pretty sure I have an explanation of the
 > shutdown freeze, it's a race that occurs when 2 processes try to chvt
 > at the same time (the 2 processes being X as it terminates and moves
 > to the console, and ul-warning as it tries to move to the graphical
 > terminal). Examining chvt source code makes it pretty obvious why this
 > might happen... The switch process is actually switch-and-wait,
 > performed by 2 separate ioctls, and hence is not atomic.
 > 
 > I tried adding appropriate diagnostics once, but this occurs so
 > frustratingly rarely that I never got to see if my theory is correct.

i think you're exactly right -- the fact that ul-warning completes
when you manually switch screens is pretty convincing.  i'm
amazed the vt system has never grown a new api of some sort to
fix this problem.  because powerd puts up an additional shutdown
splash screen prior to the ul-warning screen, the sequencing is
perturbed, and i see it (the race) more often.

i think we should apply the bandaid of a retry to the chvt code --
i.e., add a timeout to the wait, and retry (or, at the very
least, exit) if it expires.  i actually went in to make the code
changes the other day, but i backed off when i realized the
source was in pyrex, not C.  i realized i wasn't sure i'd be able
to do the signal handling successfully (in the time i'd allotted
myself, at any rate).
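
A rough sketch of what such a timeout-and-retry bandaid might look
like, in plain Python rather than pyrex. It is not the chvt.pyx code;
the function name, the timeout/retries parameters and the injectable
ioctl argument are assumptions. It uses SIGALRM to interrupt the
blocking VT_WAITACTIVE call, so it only works from the main thread.

```python
import fcntl
import signal

# ioctl request numbers from <linux/vt.h>
VT_ACTIVATE = 0x5606
VT_WAITACTIVE = 0x5607


class WaitTimeout(Exception):
    """Raised by the SIGALRM handler when the wait takes too long."""


def _on_alarm(signum, frame):
    raise WaitTimeout()


def chvt_with_timeout(fd, vt, timeout=1.0, retries=5, ioctl=fcntl.ioctl):
    """Switch-and-wait, retrying if VT_WAITACTIVE exceeds `timeout`.

    Returns True once a wait completes, False if every retry timed out.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        for _ in range(retries):
            ioctl(fd, VT_ACTIVATE, vt)
            signal.setitimer(signal.ITIMER_REAL, timeout)  # arm timer
            try:
                ioctl(fd, VT_WAITACTIVE, vt)
                return True
            except (WaitTimeout, InterruptedError):
                continue  # wait expired: request the switch again
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)    # disarm
        return False
    finally:
        signal.signal(signal.SIGALRM, old_handler)
```

Returning False instead of blocking forever covers the "at the very
least, exit" case: the caller can give up rather than stall shutdown.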

paul

 > 
 > Daniel

=-
 paul fox, p...@laptop.org


Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread Daniel Drake
2009/5/7 NoiseEHC :
>
>> please file a ticket at dev.laptop.org, with details on how to
>> reproduce the ffreep issue using build 802.  (if it's only
>> reproducible with debxo (unclear from what's been written so
>> far), then the priority (and the fix) will likely be very
>> different.)
>>
>>
> I cannot reproduce it reliably so I do not know what should I file as a
> bug. What I have seen with 800 (months ago) can be seen here:
> http://wiki.laptop.org/go/Image:Xo_freeze_on_shutdown.png
> I think that it is sure that at least some parts of the 80x versions
> contain some code compiled with illegal instructions.

I'm pretty sure the shutdown "freeze" (which can be unfrozen by using
ctrl+alt+f1/f2 to change terminals) is totally unrelated to any
instruction problems. I am pretty sure I have an explanation of the
shutdown freeze, it's a race that occurs when 2 processes try to chvt
at the same time (the 2 processes being X as it terminates and moves
to the console, and ul-warning as it tries to move to the graphical
terminal). Examining chvt source code makes it pretty obvious why this
might happen... The switch process is actually switch-and-wait,
performed by 2 separate ioctls, and hence is not atomic.
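
The interleaving described above can be illustrated with a toy model.
This is only a simulation of the theory: no real ioctls are issued,
FakeConsole and its methods are hypothetical stand-ins for the kernel's
VT state, and the VT numbers are arbitrary.

```python
class FakeConsole:
    """Toy stand-in for the kernel's notion of the active VT."""

    def __init__(self, active):
        self.active = active

    def activate(self, vt):
        """Models ioctl(fd, VT_ACTIVATE, vt): request a switch."""
        self.active = vt

    def wait_would_block(self, vt):
        """Models ioctl(fd, VT_WAITACTIVE, vt): in this model the call
        blocks for as long as some other VT is the active one."""
        return self.active != vt


def chvt_interleaved(console, mine, theirs):
    """Our switch-and-wait, with another process's VT_ACTIVATE landing
    between the two steps. Returns True if our wait would hang."""
    console.activate(mine)                 # step 1: our VT_ACTIVATE
    console.activate(theirs)               # other process switches away
    return console.wait_would_block(mine)  # step 2: our VT_WAITACTIVE


console = FakeConsole(active=2)
# ul-warning wants VT 2 while X, shutting down, moves to VT 1.
hangs = chvt_interleaved(console, mine=2, theirs=1)
print("ul-warning would hang:", hangs)

# A manual ctrl+alt+Fn switch to the awaited VT releases the waiter.
console.activate(2)
print("still blocked after manual switch:", console.wait_would_block(2))
```

In the model, the wait only hangs when the other switch lands between
the two steps, which would be consistent with the freeze being rare.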

I tried adding appropriate diagnostics once, but this occurs so
frustratingly rarely that I never got to see if my theory is correct.

Daniel


Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread Sascha Silbe

On Tue, May 05, 2009 at 02:20:42PM +0200, Sascha Silbe wrote:

> As ffreep was added to gcc 3.4 [3] and Build 801 seems to use 4.3.0,
> I'm wondering whether it has been patched/configured in some way to
> avoid this issue

Further tests have indicated (though not proven, at least for point a)
that
a) ffreep is only emitted if -march=native is used (which I added in an
attempt to fix point b) and
b) xulrunner-1.9.0.7 mistakenly enables SSE1 (not ffreep, but also
causing SIGILL) on all x86 hosts [1].


So while still a serious compiler issue (and reported before for a 
different processor [2]), it only seems to happen if additional compiler 
options are passed.
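
One crude way to check whether a given build contains ffreep at all is
to scan its machine code for the instruction's two-byte encoding,
DF C0+i. The sketch below does that; the function name is made up for
illustration, and a raw byte scan cannot tell instructions from data or
operand bytes, so hits are only candidates. A disassembler such as
objdump -d gives a more reliable answer.

```python
def find_ffreep_candidates(code: bytes):
    """Return offsets where the byte pair DF C0..C7 occurs.

    DF C0+i is the encoding of ffreep %st(i). Matches may be false
    positives, since the bytes are not decoded as instructions.
    """
    hits = []
    for i in range(len(code) - 1):
        if code[i] == 0xDF and 0xC0 <= code[i + 1] <= 0xC7:
            hits.append(i)
    return hits


# nop; ffreep %st(0); nop; fstp %st(0); ffreep %st(7)
sample = bytes([0x90, 0xDF, 0xC0, 0x90, 0xDD, 0xD8, 0xDF, 0xC7])
print(find_ffreep_candidates(sample))  # [1, 6]
```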



[1] https://bugzilla.mozilla.org/show_bug.cgi?id=491829
[2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37179

CU Sascha

--
http://sascha.silbe.org/
http://www.infra-silbe.de/



Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread NoiseEHC

> please file a ticket at dev.laptop.org, with details on how to
> reproduce the ffreep issue using build 802.  (if it's only
> reproducible with debxo (unclear from what's been written so
> far), then the priority (and the fix) will likely be very
> different.)
>
>   
I cannot reproduce it reliably, so I do not know what I should file as a
bug. What I have seen with 800 (months ago) can be seen here:
http://wiki.laptop.org/go/Image:Xo_freeze_on_shutdown.png
I think it is certain that at least some parts of the 80x versions
contain code compiled with illegal instructions.

> the failure to switch to the UL-warning screen during shutdown
> is a secondary effect of whatever it is you're seeing, and if
> reproducible should have a second ticket filed.
>
>   
On Windows, if a debugger is installed and a program crashes, Windows
asks whether I want to attach a debugger. Can something like this be
done on the XO? Or should I always run it from a debugger? If so, how
can I do it? Or can I create a crash dump file somehow? It happens quite
regularly, but I cannot give you instructions to reproduce it. Don't you
experience it?
> paul
>
>  > Sascha Silbe wrote:
>  > > Hi!
>  > >
>  > > While trying to use sugar-jhbuild on DebXO (Debian on XO-1), I 
>  > > encountered several programs that crashed with SIGILL, apparently 
>  > > during execution of ffreep. While the "AMD Athlon Processor x86 Code 
>  > > Optimization Guide" [1] claims that "although insufficiently 
>  > > documented in the past, [ffreep] is supported by all 32-bit x86 
>  > > processors", the AMD Geode LX datasheet [2] doesn't list ffreep.
>  > >
>  > > As ffreep was added to gcc 3.4 [3] and Build 801 seems to use 4.3.0, 
>  > > I'm wondering whether it has been patched/configured in some way to 
>  > > avoid this issue or whether the processor actually supports it and 
>  > > something else on my machine is broken.
>  > >
>  > >
>  > > [1] http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf
>  > >
>  > > [2] http://www.amd.com/files/connectivitysolutions/geode/geode_lx/33234d_lx_ds.pdf
>  > >
>  > > [3] http://gcc.gnu.org/ml/gcc-patches/2002-11/msg01386.html
>  > >
>  > > CU Sascha
>  > >


Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread pgf
noiseehc wrote:
 > I have just tested it on my XO and the Geode DOES NOT support the ffreep 
 > instruction. It could explain the halting shutdown when it stalls with a 
 > signal 15 (which happens to be SIGILL) and only continuing it when I 
 > switch to the other console (as I reported in [1]). So fixing it and 
 > creating a 803 is absolutely necessary IMHO.
 > 
 > [1] http://lists.laptop.org/pipermail/devel/2009-May/024356.html
 > 

please file a ticket at dev.laptop.org, with details on how to
reproduce the ffreep issue using build 802.  (if it's only
reproducible with debxo (unclear from what's been written so
far), then the priority (and the fix) will likely be very
different.)

the failure to switch to the UL-warning screen during shutdown
is a secondary effect of whatever it is you're seeing, and if
reproducible should have a second ticket filed.

paul

 > Sascha Silbe wrote:
 > > Hi!
 > >
 > > While trying to use sugar-jhbuild on DebXO (Debian on XO-1), I 
 > > encountered several programs that crashed with SIGILL, apparently 
 > > during execution of ffreep. While the "AMD Athlon Processor x86 Code 
 > > Optimization Guide" [1] claims that "although insufficiently 
 > > documented in the past, [ffreep] is supported by all 32-bit x86 
 > > processors", the AMD Geode LX datasheet [2] doesn't list ffreep.
 > >
 > > As ffreep was added to gcc 3.4 [3] and Build 801 seems to use 4.3.0, 
 > > I'm wondering whether it has been patched/configured in some way to 
 > > avoid this issue or whether the processor actually supports it and 
 > > something else on my machine is broken.
 > >
 > >
 > > [1] http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf
 > >
 > > [2] http://www.amd.com/files/connectivitysolutions/geode/geode_lx/33234d_lx_ds.pdf
 > >
 > > [3] http://gcc.gnu.org/ml/gcc-patches/2002-11/msg01386.html
 > >
 > > CU Sascha
 > >

=-
 paul fox, p...@laptop.org


Re: ffreep support on Geode LX (XO-1)

2009-05-07 Thread NoiseEHC
I have just tested it on my XO and the Geode DOES NOT support the ffreep 
instruction. It could explain the halting shutdown when it stalls with a 
signal 15 (which happens to be SIGILL) and only continuing it when I 
switch to the other console (as I reported in [1]). So fixing it and 
creating a 803 is absolutely necessary IMHO.
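
The signal number is worth double-checking: on Linux, signal 15 is
SIGTERM (the signal normally sent to processes at shutdown), while
SIGILL is signal 4. A quick way to look the numbers up, here using
Python's standard signal module:

```python
import signal

# Print the numeric value of each signal on the running platform.
for name in ("SIGILL", "SIGTERM"):
    print(name, int(getattr(signal, name)))

# Reverse lookup: which signal has number 15?
print(signal.Signals(15).name)
```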


[1] http://lists.laptop.org/pipermail/devel/2009-May/024356.html

Sascha Silbe wrote:

> Hi!
>
> While trying to use sugar-jhbuild on DebXO (Debian on XO-1), I
> encountered several programs that crashed with SIGILL, apparently
> during execution of ffreep. While the "AMD Athlon Processor x86 Code
> Optimization Guide" [1] claims that "although insufficiently
> documented in the past, [ffreep] is supported by all 32-bit x86
> processors", the AMD Geode LX datasheet [2] doesn't list ffreep.
>
> As ffreep was added to gcc 3.4 [3] and Build 801 seems to use 4.3.0,
> I'm wondering whether it has been patched/configured in some way to
> avoid this issue or whether the processor actually supports it and
> something else on my machine is broken.
>
> [1] http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf
> [2] http://www.amd.com/files/connectivitysolutions/geode/geode_lx/33234d_lx_ds.pdf
> [3] http://gcc.gnu.org/ml/gcc-patches/2002-11/msg01386.html
>
> CU Sascha





ffreep support on Geode LX (XO-1)

2009-05-05 Thread Sascha Silbe

Hi!

While trying to use sugar-jhbuild on DebXO (Debian on XO-1), I encountered several programs that 
crashed with SIGILL, apparently during execution of ffreep. While the "AMD Athlon Processor 
x86 Code Optimization Guide" [1] claims that "although insufficiently documented in the 
past, [ffreep] is supported by all 32-bit x86 processors", the AMD Geode LX datasheet [2] 
doesn't list ffreep.

As ffreep was added to gcc 3.4 [3] and Build 801 seems to use 4.3.0, I'm 
wondering whether it has been patched/configured in some way to avoid this 
issue or whether the processor actually supports it and something else on my 
machine is broken.


[1] 
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf
[2] 
http://www.amd.com/files/connectivitysolutions/geode/geode_lx/33234d_lx_ds.pdf
[3] http://gcc.gnu.org/ml/gcc-patches/2002-11/msg01386.html

CU Sascha

--
http://sascha.silbe.org/
http://www.infra-silbe.de/

