Re: [Mono-list] Help debugging program failing randomly
(Sent this earlier, but it didn't post to the list)

Thanks to Ian and Alan for the replies.

I have done some further elimination (by removing runtime components) and I don't think the new board interface is causing this. I think it is another component that isn't quite as new, but which I had forgotten is new in this context (Ubuntu server). This component periodically uses a graphing library (ZedGraph) to generate line graphs from the data collected from the input boards.

I have included the entire capture of the stack trace that mono sends to stdout. Note that this is a capture of the console, not a log file or core dump (which I'd like to know how to get from monoservice2), so it includes system status messages from my code as well. I left them in for context, whether it matters or not.

http://pastebin.com/kQFF4TUB

I currently have a test running that eliminates this graphing component but includes the new board component, and it seems promising so far. I'll feel better after it runs for a week though, since I've had it run for almost 5 days before it crashed.

At any rate, if it is this new component and the graphing library causing this issue, I need to figure out how to fix it. Also, I have used ZedGraph for a very long time to generate images like this, but the frequency used to be limited to once per day. Now it can be once per minute. The once/day generation is done in yet another component, so it could be the two 'walking' on each other if the underlying code isn't thread-safe. I would expect some kind of time correlation if that were the case, and I just don't see that.

I have some ideas on how to serialize all of these operations to a single thread, but I'd need to be fairly sure of the problem before I went to the effort to implement that. If I could get a good bead on what I'm doing that causes this error I can work around it.
Thanks again for the help,
Danny

On 04/08/2013 06:21 AM, Alan wrote:

I'm not sure if fontconfig is threadsafe and the finalizer thread is directly unreffing some fontconfig objects. This could easily be causing the corruption you're seeing if that's the case.

Can you paste the full stacktrace of your crash (including all threads!) in a pastebin, or attach it to your email in some way?

Alan

On 8 April 2013 08:42, Ian Norton <ian.norton-bad...@thales-esecurity.com> wrote:

I'd be sure to check your struct packing and call conventions properly. And perhaps be sure that you aren't passing in any "ref System.String" instead of StringBuilders

Ian

On Mon, Apr 08, 2013 at 04:21:32AM +0100, Danny wrote:
> Hello,
>
> I'm having a difficult time with an application I have written. I
> recently made some changes and I'm having a problem with it failing at
> seemingly random times and locations (within the code), with sigsegv
> errors. This is a multithreaded plugin-style daemon/service (can be
> launched from CLI) and I recently added a new component to it to poll a
> data acquisition board via USB using FTDI.
>
> Almost all of our integrations like this use a shared library (or DLL on
> Windows) and p/invoke to access hardware. I have done dozens of these
> integrations over USB without a persistent issue like this. But still
> at first I suspected this new component, as I had initially thought it
> was trashing RAM because of the problems I had developing the shared library
>
> However, at the same time as I made this addition, I was also (somewhat)
> forced to upgrade our base OS to the latest LTS Ubuntu 12.04 (was on
> 10.04). So unfortunately, I have more than one variable changing at a
> time. So I confirmed, with a configuration that eliminates the newly
> developed component, that this problem occurs without that running.
>
> That's good and bad, since now it seems likely that the offending code
> is out of my control. I am hoping to get some information on the
> error(s) I was able to capture, or some advice on how to debug the root
> cause of this problem.
>
> I have a couple of stack traces captured and I'll include what I believe
> is the crucial part of one here. It's worth noting that not all of the
> stack traces are the same. It's also worth noting that I have seen
> libgdiplus.so in other traces that I didn't get captured.
>
> I tried setting up a 10.04 machine to test with, but one of our newer
> dependencies (ServiceStack) introduced a class that is not in the
> default mono on that platform, giving a startup error trying to resolve
> the IgnoreDataMemberAttribute class. So I then got the latest mono set
> up on that machine now, but fear that this will result in the same error
> I am reporting (ie: I believe this to be a mono problem),
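Danny's idea of serializing the graphing operations can be illustrated with a small sketch. The version below is a simpler variant than a dedicated rendering thread: every render job goes through one process-wide lock, so only one job at a time is inside the drawing code. It is written in C purely for illustration; `render_fn`, `render_serialized` and `count_render` are hypothetical names, not part of ZedGraph, libgdiplus or Danny's application. In the managed application the comparable construct would presumably be a shared object used with the C# `lock` statement.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical job type: stands in for "render one graph to an image". */
typedef int (*render_fn)(void *ctx);

/* One process-wide lock shared by every component that renders. */
static pthread_mutex_t render_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Run one render job while holding the lock, so that at most one job is
 * inside the (possibly non-thread-safe) drawing code at any time.
 * Returns the job's own return value, or -1 if job is NULL or the lock
 * could not be taken or released.
 */
int render_serialized(render_fn job, void *ctx)
{
    int result;

    if (job == NULL)
        return -1;
    if (pthread_mutex_lock(&render_lock) != 0)
        return -1;
    result = job(ctx);
    if (pthread_mutex_unlock(&render_lock) != 0)
        return -1;
    return result;
}

/* Illustration-only job: increments a counter and returns its new value. */
int count_render(void *ctx)
{
    int *counter = ctx;

    assert(counter != NULL);
    *counter += 1;
    return *counter;
}
```

Both the once-per-minute and the once-per-day component would call `render_serialized` instead of calling the renderer directly. Note that a lock like this only covers calls that go through it. If cleanup of drawing objects happens elsewhere (Alan's reply suspects the finalizer thread), that cleanup would also have to be brought under the same lock for the serialization to be complete.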
Re: [Mono-list] Help debugging program failing randomly
I'm not sure if fontconfig is threadsafe and the finalizer thread is directly unreffing some fontconfig objects. This could easily be causing the corruption you're seeing if that's the case.

Can you paste the full stacktrace of your crash (including all threads!) in a pastebin, or attach it to your email in some way?

Alan

___
Mono-list maillist - Mono-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-list
Re: [Mono-list] Help debugging program failing randomly
I'd be sure to check your struct packing and call conventions properly. And perhaps be sure that you aren't passing in any "ref System.String" instead of StringBuilders

Ian
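Ian's two checks can be made concrete from the native side. The sketch below is an assumption-laden illustration, not Danny's actual library: `sample_natural`, `sample_packed` and `get_board_name` are hypothetical. It shows that the same two fields have a different size and field offset depending on packing (so the managed `StructLayout` declaration, including its `Pack` value, has to agree with whatever the native header uses), and it shows the caller-allocated buffer pattern that a `StringBuilder` parameter is meant to pair with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sample record with natural (compiler-chosen) alignment. */
struct sample_natural {
    uint8_t  channel;
    uint32_t value;
};

/* The same fields packed to 1-byte alignment. */
#pragma pack(push, 1)
struct sample_packed {
    uint8_t  channel;
    uint32_t value;
};
#pragma pack(pop)

/* With 1-byte packing there is no padding: 1 + 4 bytes. */
static_assert(sizeof(struct sample_packed) == 5, "packed layout changed");

/* Layout facts that the managed declaration has to agree with. */
size_t natural_size(void)         { return sizeof(struct sample_natural); }
size_t natural_value_offset(void) { return offsetof(struct sample_natural, value); }
size_t packed_size(void)          { return sizeof(struct sample_packed); }
size_t packed_value_offset(void)  { return offsetof(struct sample_packed, value); }

/*
 * Caller-allocated string buffer: the native side writes into memory the
 * caller owns and never writes more than cap bytes (including the NUL).
 * Returns the number of characters written, or -1 on invalid arguments.
 */
int get_board_name(char *buf, size_t cap)
{
    static const char name[] = "demo-board";
    size_t n;

    if (buf == NULL || cap == 0)
        return -1;
    n = strlen(name);
    if (n >= cap)
        n = cap - 1;            /* truncate instead of overrunning */
    memcpy(buf, name, n);
    buf[n] = '\0';
    return (int)n;
}
```

On common ABIs where `uint32_t` is 4-byte aligned, the naturally aligned struct is 8 bytes with `value` at offset 4, while the packed one is 5 bytes with `value` at offset 1. A managed struct declared with the wrong packing would therefore read or write `value` at the wrong address, which is the kind of silent memory corruption that can surface later as a crash in unrelated code.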
[Mono-list] Help debugging program failing randomly
Hello,

I'm having a difficult time with an application I have written. I recently made some changes and I'm having a problem with it failing at seemingly random times and locations (within the code), with SIGSEGV errors. This is a multithreaded plugin-style daemon/service (can be launched from CLI) and I recently added a new component to it to poll a data acquisition board via USB using FTDI.

Almost all of our integrations like this use a shared library (or DLL on Windows) and p/invoke to access hardware. I have done dozens of these integrations over USB without a persistent issue like this. But still, at first I suspected this new component, as I had initially thought it was trashing RAM because of the problems I had developing the shared library.

However, at the same time as I made this addition, I was also (somewhat) forced to upgrade our base OS to the latest LTS Ubuntu 12.04 (was on 10.04). So unfortunately, I have more than one variable changing at a time. So I confirmed, with a configuration that eliminates the newly developed component, that this problem occurs without that running.

That's good and bad, since now it seems likely that the offending code is out of my control. I am hoping to get some information on the error(s) I was able to capture, or some advice on how to debug the root cause of this problem.

I have a couple of stack traces captured and I'll include what I believe is the crucial part of one here. It's worth noting that not all of the stack traces are the same. It's also worth noting that I have seen libgdiplus.so in other traces that I didn't manage to capture.

I tried setting up a 10.04 machine to test with, but one of our newer dependencies (ServiceStack) introduced a class that is not in the default mono on that platform, giving a startup error trying to resolve the IgnoreDataMemberAttribute class. So I then got the latest mono set up on that machine, but fear that this will result in the same error I am reporting (i.e., I believe this to be a mono problem), since it should be the same mono framework running there.

Any help is greatly appreciated.

Stacktrace:

  at (wrapper managed-to-native) System.Drawing.GDIPlus.GdipDeleteFont (intptr) <0x>
  at System.Drawing.Font.Dispose () <0x0002b>
  at (wrapper remoting-invoke-with-check) System.Drawing.Font.Dispose () <0x>
  at System.Drawing.Font.Finalize () <0x00013>
  at (wrapper runtime-invoke) object.runtime_invoke_virtual_void__this__ (object,intptr,intptr,intpt$

Native stacktrace:

    mono() [0x80e16fc]
    mono() [0x81209fc]
    mono() [0x806094d]
    [0xb770240c]
    /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcCharSetDestroy+0x15) [0xb4b1b9b5]
    /usr/lib/i386-linux-gnu/libfontconfig.so.1(+0x17b43) [0xb4b29b43]
    /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcPatternDestroy+0x82) [0xb4b29e12]
    /usr/lib/libgdiplus.so.0(GdipDeleteFontFamily+0x132) [0xb5004642]
    /usr/lib/libgdiplus.so.0(GdipDeleteFont+0x2c) [0xb500510c]
    [0xaf711940]
    [0xaf7118cc]
    [0xaf711870]
    [0xaf7117ec]
    [0xb5cddf41]
    mono() [0x8150107]

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

Danny