Re: [Mono-list] Help debugging program failing randomly

2013-04-09 Thread Danny

(Sent this earlier, but it didn't post to the list)
Thanks to Ian and Alan for the replies.  I have done some further
elimination (by removing runtime components) and I don't think it is the
new board interface causing this.  I think it is another component that
isn't quite as new, but that I had forgotten is new in this context
(Ubuntu server).  This component periodically uses a graphing library
(ZedGraph) to generate line graphs from the data collected from the
input boards.  I have included the entire capture of the stack trace
that Mono sends to stdout.  Note that this is a capture of the console,
not a log file or core dump (which I'd like to know how to get from
monoservice2), so it includes system status messages from my code as
well.  I left them in for context, whether it matters or not.


http://pastebin.com/kQFF4TUB

I currently have a test running that eliminates this graphing component, 
but includes the new board component, and it seems promising so far. 
I'll feel better after it runs for a week though, since I've had it run 
for almost 5 days before it crashed.


At any rate, if it is this new component and the graphing library
causing this issue, I need to figure out how to fix it.  Also, I have
used ZedGraph for a very long time to generate images like this, but the
frequency used to be limited to once per day.  Now it can be once per
minute.  The once/day generation is done in yet another component, so it
could be the two 'walking' on each other if the underlying code isn't
thread-safe.  I would expect some kind of time correlation if that were
the case, and I just don't see that.  I have some ideas on how to
serialize all of these operations onto a single thread, but I'd need to
be fairly sure of the problem before I went to the effort of
implementing that.
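
A minimal sketch of what "serialize to a single thread" could look like,
in case it helps to judge the effort involved.  Everything here is an
assumption rather than code from the application: SerialWorker is a
hypothetical helper name, and it relies on BlockingCollection and Task,
so it needs a .NET 4.0 profile.  Each component would hand its drawing
work to the one shared worker instead of calling ZedGraph directly.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs every queued job on one dedicated thread, so
// jobs submitted by different components never overlap each other.
public sealed class SerialWorker : IDisposable
{
    private readonly BlockingCollection<Action> _jobs =
        new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SerialWorker(string name)
    {
        _thread = new Thread(Run) { IsBackground = true, Name = name };
        _thread.Start();
    }

    // Queue a job; the returned Task completes (or faults) once it has run.
    public Task Enqueue(Action job)
    {
        var done = new TaskCompletionSource<bool>();
        _jobs.Add(() =>
        {
            try { job(); done.SetResult(true); }
            catch (Exception ex) { done.SetException(ex); }
        });
        return done.Task;
    }

    private void Run()
    {
        // Blocks while the queue is empty; the loop ends after
        // CompleteAdding() has been called and the queue has drained.
        foreach (var job in _jobs.GetConsumingEnumerable())
            job();
    }

    // Must not be called from inside a queued job (Join would never return).
    public void Dispose()
    {
        _jobs.CompleteAdding();
        _thread.Join();
        _jobs.Dispose();
    }
}
```

Usage would be along the lines of
worker.Enqueue(() => RenderGraph(data)).Wait(), where RenderGraph is a
stand-in for the existing drawing code.  This only serializes the work
that is queued; any fonts or bitmaps left for the garbage collector are
still released on the finalizer thread, so the jobs would also need to
dispose what they create.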


If I could get a good bead on what I'm doing that causes this error I
can work around it.

Thanks again for the help,
Danny

On 04/08/2013 06:21 AM, Alan wrote:

> I'm not sure if fontconfig is threadsafe and the finalizer thread is
> directly unreffing some fontconfig objects. This could easily be causing
> the corruption you're seeing if that's the case. Can you paste the full
> stacktrace of your crash (including all threads!) in a pastebin, or
> attach it to your email in some way?
>
> Alan
>
> On 8 April 2013 08:42, Ian Norton wrote:
>
> > I'd be sure to check your struct packing and call conventions
> > properly. And perhaps be sure that you aren't passing in any
> > "ref System.String" instead of StringBuilders
> >
> > Ian

Re: [Mono-list] Help debugging program failing randomly

2013-04-08 Thread Alan
I'm not sure if fontconfig is threadsafe and the finalizer thread is
directly unreffing some fontconfig objects. This could easily be causing
the corruption you're seeing if that's the case. Can you paste the full
stacktrace of your crash (including all threads!) in a pastebin, or attach
it to your email in some way?

Alan
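
If the finalizer thread is indeed the problem, one possible workaround
is sketched below.  It is an illustration under assumptions, not a
confirmed fix: ChartRenderer, RenderLabel and GdiLock are hypothetical
names, and the drawing calls stand in for whatever the real component
does.  The idea is to dispose every System.Drawing object explicitly, on
the thread that created it and under one shared lock, so that nothing is
left for Font.Finalize to release later.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

public static class ChartRenderer
{
    // One process-wide lock for everything that touches System.Drawing.
    private static readonly object GdiLock = new object();

    // Hypothetical rendering routine: draws a line of text into a PNG.
    public static void RenderLabel(string text, string path)
    {
        lock (GdiLock)
        {
            using (var bitmap = new Bitmap(320, 80))
            using (var graphics = Graphics.FromImage(bitmap))
            using (var font = new Font(FontFamily.GenericSansSerif, 12f))
            {
                graphics.Clear(Color.White);
                graphics.DrawString(text, font, Brushes.Black, 10f, 10f);
                bitmap.Save(path, ImageFormat.Png);
            } // Dispose runs here, on this thread, inside the lock.
        }
    }
}
```

Fonts that a library such as ZedGraph creates internally are outside the
reach of this pattern, so it can only cover objects the application
allocates itself.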


On 8 April 2013 08:42, Ian Norton wrote:

> I'd be sure to check your struct packing and call conventions properly. And
> perhaps be sure that you aren't passing in any "ref System.String" instead
> of StringBuilders
>
> Ian
___
Mono-list maillist  -  Mono-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-list


Re: [Mono-list] Help debugging program failing randomly

2013-04-08 Thread Ian Norton
I'd be sure to check your struct packing and calling conventions properly. And
perhaps be sure that you aren't passing in any "ref System.String" instead of
StringBuilders.

Ian
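
A short sketch of the points above, for reference.  It assumes Linux and
uses the real libc function gethostname only as a stand-in for a vendor
library; SampleRecord is a hypothetical struct, not one from the
application.  The native side writes into the buffer, so the managed
side passes a pre-sized StringBuilder rather than "ref string".

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class NativeMethods
{
    // C prototype: int gethostname(char *name, size_t len);
    // size_t is pointer-sized, hence UIntPtr rather than int.
    [DllImport("libc", CallingConvention = CallingConvention.Cdecl,
               CharSet = CharSet.Ansi)]
    private static extern int gethostname(StringBuilder name, UIntPtr len);

    public static string HostName()
    {
        var buffer = new StringBuilder(256);
        if (gethostname(buffer, (UIntPtr)(uint)buffer.Capacity) != 0)
            throw new InvalidOperationException("gethostname failed");
        return buffer.ToString();
    }
}

// Hypothetical device record. Pack = 1 is only correct if the C header
// declares the struct packed (e.g. #pragma pack(1)); otherwise omit it.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SampleRecord
{
    public byte Channel;
    public uint Timestamp;
    public ushort Value;
}
```

With Pack = 1, Marshal.SizeOf(typeof(SampleRecord)) is 7 bytes; with the
default packing it is 12.  If the managed and native sides disagree
about this, the native code reads or writes past the end of the buffer.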



[Mono-list] Help debugging program failing randomly

2013-04-07 Thread Danny

Hello,

I'm having a difficult time with an application I have written.  I 
recently made some changes and I'm having a problem with it failing at 
seemingly random times and locations (within the code), with SIGSEGV 
errors.  This is a multithreaded plugin-style daemon/service (can be 
launched from CLI) and I recently added a new component to it to poll a 
data acquisition board via USB using FTDI.


Almost all of our integrations like this use a shared library (or DLL on 
Windows) and p/invoke to access hardware.  I have done dozens of these 
integrations over USB without a persistent issue like this.  Still, at
first I suspected this new component, because the problems I had while
developing the shared library made me think it was trashing RAM.


However, at the same time as I made this addition, I was also (somewhat) 
forced to upgrade our base OS to the latest LTS Ubuntu 12.04 (was on 
10.04).  So unfortunately, I have more than one variable changing at a 
time.  So I confirmed, with a configuration that eliminates the newly 
developed component, that this problem occurs without that running.


That's good and bad, since now it seems likely that the offending code 
is out of my control.  I am hoping to get some information on the 
error(s) I was able to capture, or some advice on how to debug the root 
cause of this problem.


I have a couple of stack traces captured and I'll include what I believe 
is the crucial part of one here.  It's worth noting that not all of the 
stack traces are the same.  It's also worth noting that I have seen 
libgdiplus.so in other traces that I didn't get captured.


I tried setting up a 10.04 machine to test with, but one of our newer 
dependencies (ServiceStack) introduced a class that is not in the 
default mono on that platform, giving a startup error trying to resolve 
the IgnoreDataMemberAttribute class.  So I have now set up the latest Mono
on that machine, but I fear that this will result in the same error I am
reporting (i.e., I believe this to be a Mono problem), since it should be
the same Mono framework running there.


Any help is greatly appreciated.





Stacktrace:

  at (wrapper managed-to-native) System.Drawing.GDIPlus.GdipDeleteFont (intptr) <0x>
  at System.Drawing.Font.Dispose () <0x0002b>
  at (wrapper remoting-invoke-with-check) System.Drawing.Font.Dispose () <0x>
  at System.Drawing.Font.Finalize () <0x00013>
  at (wrapper runtime-invoke) object.runtime_invoke_virtual_void__this__ (object,intptr,intptr,intpt$

Native stacktrace:

  mono() [0x80e16fc]
  mono() [0x81209fc]
  mono() [0x806094d]
  [0xb770240c]
  /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcCharSetDestroy+0x15) [0xb4b1b9b5]
  /usr/lib/i386-linux-gnu/libfontconfig.so.1(+0x17b43) [0xb4b29b43]
  /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcPatternDestroy+0x82) [0xb4b29e12]
  /usr/lib/libgdiplus.so.0(GdipDeleteFontFamily+0x132) [0xb5004642]
  /usr/lib/libgdiplus.so.0(GdipDeleteFont+0x2c) [0xb500510c]
  [0xaf711940]
  [0xaf7118cc]
  [0xaf711870]
  [0xaf7117ec]
  [0xb5cddf41]
  mono() [0x8150107]



=
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=




Danny