Re: Why is troubleshooting Linux so hard?
On Fri, 19 Nov 2010 11:18:09 -0500, Borden Rhodes wrote:

> Can I get a second on Teddy's opinion? I tend to believe that I just share the Linux experience, and if I can get something useful done whilst the computer is willing, so much the better. Is this the truth about open source software? Maybe I am in the wrong distribution and I'm wasting the list's time.

Or maybe you are not interested in getting your problems solved. There is not much room for your point. Anyone who has worked (and still works) with Windows systems, as I do, can easily see a big difference between troubleshooting problems on Linux and on Windows. I mean debugging real problems. Windows logs tell nothing about the nature of the error and applications are not usually very verbose.

> What's important to emphasise here is that I'm not an idiot.

And nobody said that.

> Camaleón, I can tell the difference between a hardware problem and a software problem. I fix computers for profit so I know how to troubleshoot problems when I have a clear set of symptoms which I can isolate and test. Yes, I test my hard drive and RAM so, by the process of elimination, the problem *is* the software.

Congrats. For me it is very difficult to provide an accurate diagnosis when hardware problems arise, because no logs are being written and you have to make use of your crystal ball (if available) and/or your previous experience. It is not an easy task for a newbie or for a skilled user. You can get your bdd to crash just by using a RAM module not recommended by the manufacturer of the board, or by mixing two different brands. And you get neither a memtest warning nor a single line advising you about this. And this can happen in any OS.

You seem to have concluded that yours is a software problem. Good. I'm still waiting for a detailed explanation about it, because you want to solve the issue, right? :-)

Greetings,

--
Camaleón

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/pan.2010.11.21.11.36...@gmail.com
Re: Why is troubleshooting Linux so hard?
Can I get a second on Teddy's opinion? I tend to believe that I just share the Linux experience, and if I can get something useful done whilst the computer is willing, so much the better. Is this the truth about open source software? Maybe I am in the wrong distribution and I'm wasting the list's time.

What's important to emphasise here is that I'm not an idiot. Camaleón, I can tell the difference between a hardware problem and a software problem. I fix computers for profit so I know how to troubleshoot problems when I have a clear set of symptoms which I can isolate and test. Yes, I test my hard drive and RAM so, by the process of elimination, the problem *is* the software.

--
Archive: http://lists.debian.org/1290183489.3477.10.ca...@firefly.bordenrhodes.com
Re: Why is troubleshooting Linux so hard?
In 1290183489.3477.10.ca...@firefly.bordenrhodes.com, Borden Rhodes wrote:

> Can I get a second on Teddy's opinion? I tend to believe that I just share the Linux experience, and if I can get something useful done whilst the computer is willing, so much the better. Is this the truth about open source software? Maybe I am in the wrong distribution and I'm wasting the list's time.

That's not my experience at all. Crashes are exceedingly rare for me. I have had a few single-application lockups, but that is usually bad JavaScript or Flash causing issues in my browser. Killing a few processes or switching to a browser with those features turned off lets me pick up right where I left off.

I mainly use Debian stable (first Etch, now Lenny), but I use a mixed system to pull some packages from testing or unstable. It started with KDE 4.2 from unstable; once the freeze started I installed a number of things from testing; at this point, most of my system is testing, but testing is about to become the new stable (Squeeze).

Prior to moving to Debian, I used Gentoo. Crashes weren't really more frequent, but I encountered more issues when trying to keep the system up-to-date. I felt I was spending too much time administering my system and not enough time using it.

--
Boyd Stephen Smith Jr.
b...@iguanasuicide.net
ICQ: 514984 YM/AIM: DaTwinkDaddy
http://iguanasuicide.net/
Re: Why is troubleshooting Linux so hard?
On Fri, 19 Nov 10, 11:18:09, Borden Rhodes wrote:

> What's important to emphasise here is that I'm not an idiot. Camaleón, I can tell the difference between a hardware problem and a software problem. I fix computers for profit so I know how to troubleshoot problems when I have a clear set of symptoms which I can isolate and test. Yes, I test my hard drive and RAM so, by the process of elimination, the problem *is* the software.

You can also have crashes due to a bad PSU or the wrong combination of hardware, and just recently I had to pull the PS/2 mouse out of my internet radio machine because it was affecting the sound card (IRQ conflict?).

But I assume Windows is rock stable on that machine, so it must be Linux, no?

Regards,
Andrei

--
Offtopic discussions among Debian users and developers: http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic
Re: Why is troubleshooting Linux so hard?
On 17. 11. 2010 08:46:23, Andrei Popescu wrote:

> You got it wrong, Debian does NOT work this way. Policy is not something to beat maintainers with who don't obey it, but rather to document sane packaging practices which come out of 17 years of packaging experience.

Well, setting a set of guidelines is not about beating maintainers with anything. At all. It's the other way around; it's about letting maintainers intercommunicate and voice their suggestions and comments in order to avoid duplicating the efforts over and over again. I even think that such a mechanism exists already, in the form of various Debian mailing lists (such as debian-legal) that make it easier for developers, maintainers and packagers to ask their peers for comments.

> Also, I consider the lack of a body to make rules about how FLOSS software should be written to be an advantage, because it would hinder innovation.

Well, sticking to the DFSG (for licensing), or to the i18n (for internationalization), or to the FHS (for file placement), or to the (for what it's worth) POSIX standard hasn't hindered innovation in any essential way so far, so why should we infer that any set of additional, well designed guidelines should hinder it? Again, such rules could help software developers and package maintainers avoid duplicating efforts. The FLOSS world has enough self-healing mechanisms in place that any guidelines, when they are nothing but a burden, get deprecated fairly soon anyway.

> You also forget that all Developers (in Debian or upstream) work on a voluntary basis. You cannot enforce program writing rules, because they would rather just not do it. After all, writing code based on other people's specs is something that you do at a paid job ;)

I guess it boils down to what exactly the phrase "enforce program writing rules" means to a particular person. If you want your program to compile under GNU/Linux at all, you must stick to a whole set of requirements anyway. If you want it to run under Gnome (or KDE), the rules are even stricter. And so on. If we are to freely use the roads, we have to abide by the rule that we all drive on the right (or left, depending on where you are) side of the road. Freedom is no more synonymous with anarchy than it is with dictatorship.

Suggested further reading: http://www.jonmasters.org/blog/2010/11/14/rant-linux-wars/

--
Cheerio,
Klistvud
http://bufferoverflow.tiddlyspot.com
Certifiable Loonix User #481801
Please reply to the list, not to me.

--
Archive: http://lists.debian.org/1289991083.560...@compax
Re: Why is troubleshooting Linux so hard?
Hi, Steve:

On Monday 15 November 2010 21:34:03 Steve Kemp wrote:

> [...]
> Debian policy wouldn't arbitrarily try to mandate how the software we include is written because we simply have no control over that.

Not to state a position, but I think what you say is basically irrelevant: Debian has no control over how people distribute their software either, yet Debian firmly establishes that these kinds of licenses are acceptable while those kinds are not. It would be absolutely within Debian's abilities to establish, say, that only software developed in C is acceptable (to name just the stupidest thing that came to my mind).

> Sure we can and do patch some software, but to implement your suggestion we'd have to patch many many many pieces of unrelated software and that is not a simple thing.

Again, that's just in line with other things already being done: packaging 10,000 programs is not a simple thing either, but that's exactly what Debian does.

> Nor would maintaining those patches be easy.

Only those that weren't accepted upstream would have to be maintained.

> (Not that I disapprove of your general idea; but consider would *you* personally download the source to 100 applications, update them to log in a consistent fashion, post the patches to the appropriate project's discussion lists (if they even exist), then keep them updated for a year or two? Even if you did who would handle the other few thousand application binaries..)

Consider: would *you* personally download the source to 100 applications, massage them so they are acceptable within Debian policy bounds, etc., then keep them updated for a year or two? Well, that's exactly what Debian does while, obviously, being an impossibility for you alone, so it seems you have a non-argument.

Cheers.

--
Archive: http://lists.debian.org/201011171538.35837.jesus.nava...@undominio.net
Re: Why is troubleshooting Linux so hard?
Hi, Borden:

On Tuesday 16 November 2010 22:43:38 Borden Rhodes wrote:

> [...]
> Why are there so many duplicate and incomplete bug reports and fora which ask the same questions over and over? I've been guilty of submitting duplicate bug reports even after I spent an hour searching Google to make sure it hadn't been reported or solved already. I'm not asking to be able to understand the error messages. I'm asking for them to be useful in a search or forum post so we can solve the problem and help the other Linux users.
>
> But how would such a utopian scheme be implemented? Well, my training is in accounting so I'll tell you how they solve these problems. A governing body, like the SEC or AICPA, recognises a problem in its standards and rules which, for example, allowed Enron to get away with what it did for as long as it did. They sit down and they say 'this shouldn't happen again if accountants do this.' They pass a regulation and they say 'anyone who wants to issue compliant financial statements needs to play by these rules.'

And that's exactly why this wouldn't work for software: where's the governing body for programs? Where's the authority to prosecute those not abiding by regulations? It's not only that those don't exist but that it would be extremely negative if they existed. The best you can do is promote sane standards and hope for others to follow you.

> They don't chase down every practising accountant and every registered company and convince them to use the new standards. They just tell them that, to be part of the club, they have to play by the new rules. Debian, to my understanding, works that way.

Yes, it could do that.

> A package which doesn't follow the rules has a grave bug filed against it and isn't included in the new release until it's fixed. Why does it have to be any more complicated for making error messages useful?

It's an interpersonal matter as much as a technical one. While Debian Developers call themselves developers, they are not *code* developers (not necessarily, at least). It would probably be more descriptive if they were called Debian Packagers, but that's the way it is. As such, most of them are not so much into the software quality itself as into the packaging effort. That's, again, neither good nor bad, but simply the way it is and, hey, it is not as if packaging were not already quite a significant effort.

> I have a working knowledge of C, Java and a few other languages. I can't even read the source code to the simplest projects let alone figure out why it crashed on me!

That's probably the same thing quite a lot of DDs could say about themselves (see my previous point).

Of course I'd surely want Debian Developers putting effort not only into packaging (and, heck, they usually do a damn good job of it) but also wisely/magically choosing what software to package under the highest levels of quality, engineering and maintainability and, on top of that, either developing software themselves to fill the holes they found against such high standards, or successfully tutorizing upstream maintainers in such a convincing manner that they would either thankfully accept their patches or make their own developments of better quality.

But am I really in a position to realistically ask for that? Would it even be honest on my side, especially given that I'm not even doing what they already do, that is, packaging software that is very useful to me? I don't think so. It's just wishful thinking, like my desire to win a multimillion lotto: nice to dream of, but nothing to base planning upon.

Cheers.

--
Archive: http://lists.debian.org/201011171556.09539.jesus.nava...@undominio.net
Re: Why is troubleshooting Linux so hard?
On Wed, Nov 17 2010, Klistvud wrote:

> On 17. 11. 2010 08:46:23, Andrei Popescu wrote:
>
> Well, setting a set of guidelines is not about beating maintainers with anything. At all. It's the other way around; it's about letting maintainers intercommunicate and voice their suggestions and comments in order to avoid duplicating the efforts over and over again. I even think that such a mechanism exists already, in the form of various Debian mailing lists (such as debian-legal) that make it easier for developers, maintainers and packagers to ask their peers for comments.

Seems like what the DPE process is all about, not policy.

>> Also, I consider the lack of a body to make rules about how FLOSS software should be written to be an advantage, because it would hinder innovation.
>
> Well, sticking to the DFSG (for licensing), or to the i18n (for internationalization), or to the FHS (for file placement), or to the (for what it's worth) POSIX standard hasn't hindered innovation in any essential way so far, so why should we infer that any set of additional, well designed guidelines should hinder it? Again, such rules could help software developers and package maintainers avoid duplicating efforts. The FLOSS world has enough self-healing mechanisms in place that any guidelines, when they are nothing but a burden, get deprecated fairly soon anyway.

In most cases, the design, the initial implementation, and buy-in from developers were in place before these things became policy. For the most part (though not always), policy tends to ratify and encode _tested_ practices, and only in a fashion that does not make most packages instantly buggy.

manoj

--
Computers are the most fun you can have with anything that isn't breathing. Bruce Walker, CACM Forum
Manoj Srivastava sriva...@acm.org http://www.golden-gryphon.com/
4096R/C5779A1C E37E 5EC5 2A01 DA25 AD20 05B6 CF48 9438 C577 9A1C

--
Archive: http://lists.debian.org/87zkt750jp@anzu.internal.golden-gryphon.com
Re: Why is troubleshooting Linux so hard?
On 17. 11. 2010 15:56:09, Jesús M. Navarro wrote:

> And that's exactly why this wouldn't work for software: where's the governing body for programs? Where's the authority to prosecute those not abiding by regulations?

There's no need for that, because, as you said:

> The best you can do is promote sane standards and hope for others to follow you.

Exactly. It's like promoting non-HTML mail -- if the standard you promote is sane enough, you should have no problem in finding a more or less considerable number of followers.

> Of course I'd surely want Debian Developers putting effort not only into packaging (and, heck, they usually do a damn good job of it) but also wisely/magically choosing what software to package under the highest levels of quality, engineering and maintainability and, on top of that, either developing software themselves to fill the holes they found against such high standards, or successfully tutorizing upstream maintainers in such a convincing manner that they would either thankfully accept their patches or make their own developments of better quality.

Agreed, Debian Developers are already doing a huge -- and excellent -- job and I would never dream of burdening them with additional tasks. However, Debian is generally well respected within the FLOSS world, and Debian Developers could leverage that respect in order to gently suggest or recommend certain best practices. What I have in mind is *not* a Linux police, not even tutorizing (as you say), but something simple and unpretentious, like pointing out that, to Debian, certain best practices, although not binding in any way, are, let's put it this way, more welcome than others. If these best practices were sensible, surely [some] upstream developers would embrace [some of] them simply on the ground of them being sane, sensible, practical, and for the common good.

Actually, I'm sure this is already happening all the time. For example, I'm subscribed to debian-legal, and there I see many examples of Debian Packagers suggesting various license modifications to upstream developers in order to bring them into compliance with the DFSG. Well, it may come as a surprise, but more often than not, the responses from upstream are positive! If you give them sane, sensible, well-founded suggestions, many developers are actually willing to modify their software licenses without much further ado. Of course, the respect enjoyed by Debian in the FLOSS community -- specifically, Debian's firm stance about software freedom and licensing -- *may* have something to do with it. Why not leverage that?

--
Cheerio,
Klistvud
http://bufferoverflow.tiddlyspot.com
Certifiable Loonix User #481801
Please reply to the list, not to me.

--
Archive: http://lists.debian.org/1290015284.3051...@compax
Re: Why is troubleshooting Linux so hard?
On Wed, 17 Nov 10, 18:34:44, Klistvud wrote:

> Agreed, Debian Developers are already doing a huge -- and excellent -- job and I would never dream of burdening them with additional tasks. However, Debian is generally well respected within the FLOSS world, and Debian Developers could leverage that respect in order to gently suggest or recommend certain best practices. What I have in mind is *not* a Linux police, not even tutorizing (as you say), but something simple and unpretentious, like pointing out that, to Debian, certain best practices, although not binding in any way, are, let's put it this way, more welcome than others. If

You mean something like this?

http://wiki.debian.org/UpstreamGuide

Of course, these guidelines only talk about making software easy to package. The kind of changes you are suggesting would better come from some other project, like freedesktop.org or so...

Regards,
Andrei

--
Offtopic discussions among Debian users and developers: http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic
Re: Why is troubleshooting Linux so hard?
Well, I'm pleased with the discussion and particularly grateful to Klistvud, who expresses many of my ideas far more eloquently than I can.

I want to digress briefly and remind everyone that, as controversial as you may think software standards are, accounting standards are far worse. The SEC, AICPA, IASB, CICA (in Canada) and many other acronyms have to balance on a razor-thin line between protecting the public from deception and making business too expensive and bureaucratic to run. Standards bodies know that a new regulation which strengthens ethics could cost the market billions in restated earnings and new compliance systems, but they make it work! So, no, I'm not recommending that we emulate them, but certainly we can learn from their effort.

Therefore, let's return to the concept of public interest. I am neither a programmer nor a packager so I consider myself part of the open source public. My expectations of software, which I hope aren't controversial, are that software should do what it's advertised to do, keep my data intact, and not prevent other parts of the computer from doing their jobs. Obviously, if software did this, there wouldn't be any need for bug reports. So, yes, software breaks. But that's okay because I'm patient and understanding and I can usually recover from the crashes or work around them... just as long as I believe that some day it'll be fixed and not break anymore.

So, the idea, and the point of the subject of this thread, is what do the public like me do when software breaks? My thesis is that FLOSS software currently breaks in a way that doesn't give me enough of the right symptoms to fix the problem myself or ask for help intelligently. If anyone still doesn't believe me, I'll subscribe you to my computer logs and a commentary of my problems!

So, what I want are better symptoms from software. Ideally, I want an error message which I can plug into Google and be directed to a probable cause of the problem. I can usually handle things from there. Currently, I can't tell what error messages and log entries are related to a problem I'm having. Worse, if I plug the error message into Google, I get directed to old source repositories, bug reports totally unrelated to my problem, flame wars and a tedious variety of dead ends and wild goose chases. Surely there must be a better way to troubleshoot FLOSS!

Finally, I don't care how software reaches this utopian state. It can be top-down, bottom-up, sideways, revolutionary, explosive or any which organisation or movement or argument or death threat which lets me participate in the community without having to specialise in computer science.

Borden

--
Archive: http://lists.debian.org/1290052808.2515.56.ca...@firefly.bordenrhodes.com
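One way to narrow down which log entries relate to a crash is to keep only the lines written within a few minutes of it. The following is a minimal sketch, not an existing tool: it assumes the traditional syslog timestamp prefix (`Nov 16 16:12:01`), which carries no year, so the year is taken from the crash time supplied by the caller. The function names and the sample lines are hypothetical.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse a leading 'Nov 16 16:12:01' stamp; return None if absent."""
    try:
        # Traditional syslog stamps are 15 characters long and omit the year.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, crash_time, window_minutes=5):
    """Yield the log lines stamped within window_minutes of crash_time."""
    window = timedelta(minutes=window_minutes)
    for line in lines:
        stamp = parse_syslog_time(line, crash_time.year)
        if stamp is not None and abs(stamp - crash_time) <= window:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical sample entries; a real run would read files under /var/log/.
    sample = [
        "Nov 16 16:05:00 host kernel: early message",
        "Nov 16 16:11:30 host evolution: warning before crash",
        "no timestamp here",
    ]
    for entry in lines_near(sample, datetime(2010, 11, 16, 16, 12)):
        print(entry)
```

Entries without a timestamp, such as those in .xsession-errors, are skipped by this approach, which is exactly the limitation complained about in this thread.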
Re: Why is troubleshooting Linux so hard?
Jesús;

Your argument is bogus. How many threads are you sitting on arguing about this? How many OSes are you trying to convince???

What you seem to be missing, and it has been pointed out over and over, is that yes, Debian as a distribution is developed by volunteers, and yes, you can go around trying to make demands, but in reality all you're gonna get is told to go do unholy things to yourself. Yeah, you can choose to exclude packages or software for whatever reason, but then it wouldn't be much of an OS, would it?? Sure, you can do a lot of things, but if you want 100% idiot proof, go with Windows; they specialize in dumbing things down.

Linux, as well as most open source software, is written for effectiveness by the coders who wish to use it. They are taking their free time to collaborate on projects, doing what they would otherwise be paid to do. The fact is that most developers in the Linux and FOSS world don't care if you can get their code working on your system or not. They provide it free of charge and without warranty. If it works for you, good, we're glad; if not, we'll see if we can help. But complaining and biting isn't gonna get you anywhere because, when it comes down to it, nobody's gonna lose sleep because somebody couldn't use their code.

I don't mean to flame, but this conversation just keeps going back and forth, back and forth. I think your time would be better spent learning and studying, figuring out why your Google results are limited, what you can change to maybe find more relevant info, or what other search engines may be helpful. Maybe studying the Linux system as a whole to understand what it does, why, and how.

Bottom line: why is Linux so difficult? Because it's freaking free. It is written by the very PhD candidates you speak of, and their number one interest is that it works for them. If you can use it, work with it and learn from it, excellent; a lot of people are willing to help you. If you want your hand held, well, you gotta pay for that: either Microsoft or maybe a distro that offers trouble ticket licenses (like Red Hat Enterprise or SUSE Enterprise).

TeddyB
Re: Why is troubleshooting Linux so hard?
On Wed, 17 Nov 2010 23:00:08 -0500, Borden Rhodes wrote:

(...)

> So, yes, software breaks. But that's okay because I'm patient and understanding and I can usually recover from the crashes or work around them... just as long as I believe that some day it'll be fixed and not break anymore.

Over time, it will break again.

> So, the idea, and the point of the subject of this thread, is what do the public like me do when software breaks?

File bugs and/or use another program while it gets fixed.

> My thesis is that FLOSS software currently breaks in a way that doesn't give me enough of the right symptoms to fix the problem myself or ask for help intelligently.

I've never reached such a situation. When something goes wrong, and Google does not seem to solve the problem, I explain the symptoms in mailing lists or forums, attach the logs and get feedback.

> If anyone still doesn't believe me, I'll subscribe you to my computer logs and a commentary of my problems!

You'll first have to explain what your problem is.

> So, what I want are better symptoms from software. Ideally, I want an error message which I can plug into Google and be directed to a probable cause of the problem. I can usually handle things from there.

The nature of the problem may prevent this from happening. For instance, a hardware failure (bad RAM or a damaged processor, lack of power, a loose cable...) can be very difficult to record in your logs or to report in a warning message box. Your system can give you hints, but nowadays it won't tell you: "hey, this happens because your SATA #3 hard disk cable is faulty, please correct."

IBM is (was?) working on self-diagnosing system software that heals itself... so in the near future your wishes could come true :-)

> Currently, I can't tell what error messages and log entries are related to a problem I'm having. Worse, if I plug the error message into Google, I get directed to old source repositories, bug reports totally unrelated to my problem, flame wars and a tedious variety of dead ends and wild goose chases. Surely there must be a better way to troubleshoot FLOSS!

Success in solving a problem is proportional to your interest in getting it solved. Nobody will care if you don't care...

> Finally, I don't care how software reaches this utopian state. It can be top-down, bottom-up, sideways, revolutionary, explosive or any which organisation or movement or argument or death threat which lets me participate in the community without having to specialise in computer science.

You can participate in the community in many ways that require no advanced skills in computer science: mailing list support, translation and documentation, design...

Greetings,

--
Camaleón

--
Archive: http://lists.debian.org/pan.2010.11.18.07.26...@gmail.com
Re: Why is troubleshooting Linux so hard?
Thank you for the response. Indeed, you are correct in that my problem isn't specific to Linux kernel troubleshooting (although I could dedicate a website to things that don't work there) but with the software that runs on Debian in general. To clarify, the problem I have is when the computer freezes and crashes, I forcibly restart the computer, and I try to trace what caused the problem and cannot do so. I pick through the dozens of files in /var/log/ and cannot find any clues about what caused the crash. Even if I can find a suspicious log entry or two, Googling them directs me to bug reports and forum posts from 2006. Almost none of these are relevant to tracing what caused the problem. I'm not asking that every piece of software that's ever been written be fixed overnight as the proposed 'solution' implies. Rather, I want to have the information to be able to troubleshoot problems. This will also help the package maintainers and volunteers who dedicate their time to helping plebs like me. Why are there so many duplicate and incomplete bug reports and fora which ask the same questions over and over? I've been guilty of submitting duplicate bug reports even after I spent an hour searching Google to make sure it hadn't been reported or solved already. I'm not asking to be able to understand the error messages. I'm asking for them to be useful in a search or forum post so we can solve the problem and help the other Linux users. But how would such a utopian scheme be implemented? Well, my training is in accounting so I'll tell you how they solve these problems. A governing body, like the SEC or AICPA, recognises a problem in its standards and rules which, for example, allowed Enron to get away with what it did for as long as it did. They sit down and they say 'this shouldn't happen again if accountants do this.' They pass a regulation and they say 'anyone who wants to issue compliant financial statements needs to play by these rules.' 
They don't chase down every practising accountant and every registered company and convince them to use the new standards. They just tell them that, to be part of the club, they have to play by the new rules. Debian, to my understanding, works that way. A package which doesn't follow the rules has a grave bug filed against it and isn't included in the new release until it's fixed. Why does it have to be any more complicated for making error messages useful? The suggestion is that a PhD-level mastery of computer science is not necessary to find a problem in open source software; a thorough understanding of source code, languages, architectures, engineering and the esoteric disciplines which software is supposed to simplify should suffice. Ironically, it is on those topics which PhD candidates write their dissertations so I don't see the difference. Is the conclusion that the only people who use GNU/Linux/FOSS software should also be able to write the software themselves? I have a working knowledge of C, Java and a few other languages. I can't even read the source code to the simplest projects let alone figure out why it crashed on me! And, no, valgrind is not a solution to this problem either. Valgrind is for debugging programs in development, not as a shell in which to run every program in case it crashes. Example: Evolution just closed on me whilst I was writing this e-mail. .xsession-errors reports (evolution:5186): Gtk-WARNING **: A floating object was finalized. This means that someone called g_object_unref() on an object that had only a floating reference; the initial floating reference is not owned by anyone and must be removed with g_object_ref_sink(). The entry isn't time stamped so I don't know whether it's relevant to the crash or not. A Google search on this message (predictably) produces no results. 
A modified Google search reveals 9 results, one from 2007, one from 2009 and one talking about pinning the calendar on the Ubuntu Netbook Remix or something. How much easier would it be to trace this crash if the entry said "16 November 2010 16:12: Evolution: Illegal call to xyz_(); Error 0x1EE7: Debian hates you too" or something to that effect? That way, I wouldn't have to burden the mailing list and bug reports with a "now what do I do? This happens randomly but happened several times this week" message! I'm sorry for the length of this message. I would use fewer words if I knew which ones to cut and still retain the point. -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1289943818.5186.31.ca...@firefly.bordenrhodes.com
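To make the wished-for log entry above concrete, here is a minimal sketch of how a program could emit a line in roughly that shape (timestamp, program name, message, error code). It uses Python's standard logging module purely for illustration; the helper name make_logger, the "code" field and the error number are assumptions taken from the example in the message, not anything Evolution or Debian actually provides.

```python
import io
import logging


def make_logger(program, stream):
    """Return a logger whose records carry a timestamp, program name and error code."""
    handler = logging.StreamHandler(stream)
    # "code" is a hypothetical extra field; every call must supply it via extra={...},
    # otherwise formatting the record fails.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s: %(name)s: %(message)s; Error 0x%(code)04X",
        datefmt="%d %B %Y %H:%M",
    ))
    logger = logging.getLogger(program)
    logger.handlers.clear()      # avoid duplicate handlers if called twice
    logger.addHandler(handler)
    logger.propagate = False     # keep the demo output in our stream only
    logger.setLevel(logging.ERROR)
    return logger


# Demo: write one entry into an in-memory buffer and print it.
buffer = io.StringIO()
log = make_logger("Evolution", buffer)
log.error("Illegal call to xyz_()", extra={"code": 0x1EE7})
print(buffer.getvalue(), end="")
```

The point of the fixed "program: message; Error code" layout is that the tail of the line stays identical between occurrences, so it can be pasted into a search engine or bug tracker, while the timestamp in front tells you whether the entry belongs to the crash you are chasing.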
Re: Why is troubleshooting Linux so hard?
Borden Rhodes wrote:

Thank you for the response. Indeed, you are correct in that my problem isn't specific to Linux kernel troubleshooting (although I could dedicate a website to things that don't work there) but with the software that runs on Debian in general. To clarify, the problem I have is when the computer freezes and crashes, I forcibly restart the computer, and I try to trace what caused the problem and cannot do so. I pick through the dozens of files in /var/log/ and cannot find any clues about what caused the crash. Even if I can find a suspicious log entry or two, Googling them directs me to bug reports and forum posts from 2006. Almost none of these are relevant to tracing what caused the problem.

Now that is a problem with a more focused solution - better crash dump/analysis tools. The Linux kernel has always lagged behind Solaris and the BSD variants in terms of built-in crash dump tools - it takes compiling a custom kernel to enable some of the Linux capabilities (particularly if you're running Xen). Otherwise, about the only clue to be had about kernel panics is from the console - if you have a console that captures the panic message.

Sigh

Miles

-- In theory, there is no difference between theory and practice. Infnord practice, there is. Yogi Berra

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4ce32f1c.7000...@meetinghouse.net
Re: Re: Why is troubleshooting Linux so hard?
Much obliged for the insight. I think I understand now the point that Steve was trying to get at. If I understood correctly, Debian's role in package maintenance is the packaging; the actual coding (and related policies) are handled farther upstream. All the same, I still struggle to understand how it's possible that F/OSS has gone this long without a coherent approach to logging and diagnosing problems. You make a doctor's job awfully difficult when you only have vague symptoms which could refer to any number of ailments of varying severity. The doctor, in this case, are the poor sods who've agreed to manage the package. It may not be Debian's policy to tell a programmer how to write software, but surely a policy on generating useful troubleshooting output is consistent with a goal of producing a stable operating system and, therefore, an area where Debian can lead. I recommended to the Debian policy people (who promptly ignored the idea) that packages should produce time-stamped logs. I specifically mentioned .xsession-errors which isn't time-stamped and the response was that the output was part of an error stream and therefore impossible to time stamp. Seriously? Java and C++ are the only languages capable of catch (exception e) { logfile.writeln( time() + e ); } (yes, I know it's butchered C++/Java but you get the idea) This idea alone could cut bug reports in half and make them twice as useful! Instead of saying attach Xorg.log you could say attach the messages from Xorg.log which occurred 5 minutes before/after X went blank. Yes, I know that Xorg is in the process of time-stamping their logs, but the point is that there are still many other packages that have no intention of switching. Again, I appreciate the feedback. I'm a lot less frustrated that we're talking about it. I have hope that some good idea or initiative will come of this. Borden -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? 
Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1289972440.9837.31.ca...@firefly.bordenrhodes.com
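On the claim in the message above that an error stream is impossible to time-stamp: a stream can at least be stamped by whatever reads it. The following is a minimal sketch, assuming the untimestamped output (for example what ends up in .xsession-errors) is passed line by line through a small filter. The function name stamp_lines and the clock parameter are made up for this illustration, and the stamp records when the line was read, which may be slightly later than when it was written.

```python
import io
import time


def stamp_lines(source, sink, clock=time.time):
    """Copy lines from source to sink, prefixing each with a local timestamp."""
    for line in source:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        text = line.rstrip("\n")
        sink.write(f"{stamp} {text}\n")
        sink.flush()  # flush per line so a later crash does not lose the entry


# Demo with in-memory streams standing in for a program's stderr and a log file.
demo_in = io.StringIO(
    "(evolution:5186): Gtk-WARNING **: A floating object was finalized.\n"
)
demo_out = io.StringIO()
stamp_lines(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

Passing sys.stdin and sys.stdout instead of the in-memory streams would turn this into a filter that a session script could pipe stderr through. Whether a display manager makes that easy to arrange is a separate question that this sketch does not answer.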
Re: Why is troubleshooting Linux so hard?
On Tue, 16 Nov 2010, 16:43:38, Borden Rhodes wrote:

But how would such a utopian scheme be implemented? Well, my training is in accounting so I'll tell you how they solve these problems. A governing body, like the SEC or AICPA, recognises a problem in its standards and rules which, for example, allowed Enron to get away with what it did for as long as it did. They sit down and they say 'this shouldn't happen again if accountants do this.' They pass a regulation and they say 'anyone who wants to issue compliant financial statements needs to play by these rules.' They don't chase down every practising accountant and every registered company and convince them to use the new standards. They just tell them that, to be part of the club, they have to play by the new rules. Debian, to my understanding, works that way. A package which doesn't follow the rules has a grave bug filed against it and isn't included in the new release until it's fixed. Why does it have to be any more complicated for making error messages useful?

You got it wrong, Debian does NOT work this way. Policy is not something with which to beat maintainers who don't obey it, but rather a way to document sane packaging practices which come out of 17 years of packaging experience. Also, I consider the lack of a body making rules about how FLOSS software should be written to be an advantage, because such a body would hinder innovation. You also forget that all developers (in Debian or upstream) work on a voluntary basis. You cannot enforce program-writing rules, because they would rather just not do it. After all, writing code based on other people's specs is something that you do at a paid job ;)

Regards, Andrei

-- Offtopic discussions among Debian users and developers: http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic
Re: Why is troubleshooting Linux so hard?
(Sorry for the late reply to a thread started way back) I'm pleased for all of the feedback and that I'm not the only person who's frustrated. I tried proposing to debian-policy that it be mandatory that all logs have timestamps http://lists.debian.org/debian-policy/2010/02/msg00035.html but my suggestion was dismissed because it was considered too hard to enforce. I responded http://lists.debian.org/debian-policy/2010/08/msg00043.html saying that it shouldn't be very difficult to enforce or implement at all. Linux already has a huge troubleshooting database: Google. The trouble is that simply copying-and-pasting an error message into Google with a program or package name (assuming you know whose fault it is) doesn't generate very useful results. The most useful change, therefore, would be to improve the quality of the error messages. I just want to say that I like KDE's auto-crash popup and it would be nice to have implemented Linux-wide. Of course, the trick is when you don't have the appropriate debugging packages installed to install them and regenerate the crash report before it goes away. What would it take to get some error message standards in place so that troubleshooting Linux is possible for those of us who aren't computer science PhD candidates? -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1289847115.4309.9.ca...@firefly.bordenrhodes.com
Re: Why is troubleshooting Linux so hard?
On Mon, 15 Nov 2010 13:51:55 -0500, Borden Rhodes wrote:

(Sorry for the late reply to a thread started way back) I'm pleased for all of the feedback and that I'm not the only person who's frustrated. I tried proposing to debian-policy that it be mandatory that all logs have timestamps http://lists.debian.org/debian-policy/2010/02/msg00035.html but my suggestion was dismissed because it was considered too hard to enforce. I responded http://lists.debian.org/debian-policy/2010/08/msg00043.html saying that it shouldn't be very difficult to enforce or implement at all.

I guess you posted your wish to the wrong list. Debian Policy is about (sic) "Discussion and editing of the Debian Policy Manual". Maybe you should have addressed your concern to debian-devel (where the meal is being cooked) or filed a bug report directly. But I'm afraid that feature will have to be addressed upstream or, in case another distribution is already adding that option, you should have pointed out that it is already available.

Linux already has a huge troubleshooting database: Google. The trouble is that simply copying-and-pasting an error message into Google with a program or package name (assuming you know whose fault it is) doesn't generate very useful results. The most useful change, therefore, would be to improve the quality of the error messages.

I'd say Google _and_ experience. Because Google means nothing to people who do not know where to look or how to search.

I just want to say that I like KDE's auto-crash popup and it would be nice to have implemented Linux-wide. Of course, the trick is when you don't have the appropriate debugging packages installed to install them and regenerate the crash report before it goes away.

Every project provides its own bug tracking tool (GNOME -bugbuddy-, KDE, kernel -kerneloops-, mozilla...). There are tools for handling crashes but you have to install the associated debug packages for each application. But, OTOH, these crash reports do not say much to the plain user: they are targeted at devels and won't help us to solve our mere mortal problem ;-)

What would it take to get some error message standards in place so that troubleshooting Linux is possible for those of us who aren't computer science PhD candidates?

Linux provides one of the best tools I've ever seen for debugging errors: understandable logs. They're priceless :-)

Greetings, -- Camaleón

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/pan.2010.11.15.19.25...@gmail.com
Re: Why is troubleshooting Linux so hard?
On Mon Nov 15, 2010 at 13:51:55 -0500, Borden Rhodes wrote:

What would it take to get some error message standards in place so that troubleshooting Linux is possible for those of us who aren't computer science PhD candidates?

1. Make a list of all the programs which exist, but which do not log useful information.
2. Persuade every single one of them that your suggestion to add useful logging information is a good one.
3. Wait for them all to update.

This is the problem - there is no single place this change could be made. Even if it were in KDE and all KDE applications got it, you'd be missing other things such as sudo, screen, less, GNOME, etc. You suggest troubleshooting Linux, but what you really mean (I guess) is troubleshooting any available program which just happens to run upon Linux systems (and possibly others). Even if you were to get a wide agreement I suspect you'd still be frustrated - as what is useful information to me is probably not useful information to you.

As things stand you typically get access to source code, documentation, and tutorials online. I suspect the pragmatic thing to do if you encounter problems is to explore that particular program/system thoroughly and learn in the process the things you're missing. No PhD candidacy required.

Steve -- Let me steal your soul? http://stolen-souls.com

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20101115193828.ga3...@steve.org.uk
Re: Why is troubleshooting Linux so hard?
Steve Kemp wrote:

On Mon Nov 15, 2010 at 13:51:55 -0500, Borden Rhodes wrote:

What would it take to get some error message standards in place so that troubleshooting Linux is possible for those of us who aren't computer science PhD candidates?

1. Make a list of all the programs which exist, but which do not log useful information.
2. Persuade every single one of them that your suggestion to add useful logging information is a good one.
3. Wait for them all to update.

This is the problem - there is no single place this change could be made. Even if it were in KDE and all KDE applications got it, you'd be missing other things such as sudo, screen, less, GNOME, etc. You suggest troubleshooting Linux, but what you really mean (I guess) is troubleshooting any available program which just happens to run upon Linux systems (and possibly others).

Actually, that does suggest a policy-level, or perhaps kernel-level approach -- creating a stronger framework for logging, error-reporting, tracebacks, etc. The further upstream that's implemented, the more likely developers are to utilize well-defined and well-supported hooks.

Miles Fidelman

-- In theory, there is no difference between theory and practice. Infnord practice, there is. Yogi Berra

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4ce195cf.10...@meetinghouse.net
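As a small-scale analogy for the "well-defined and well-supported hooks" mentioned above, some language runtimes already expose a single place where every uncaught error passes through. The sketch below uses Python's sys.excepthook to write a time-stamped entry with a traceback when a program dies from an unhandled exception. It is an illustration at the language-runtime level only, not the policy-level or kernel-level framework the message proposes; the function name install_crash_logger and the logger naming are assumptions made for this example.

```python
import io
import logging
import sys


def install_crash_logger(program, stream):
    """Route uncaught exceptions to a time-stamped log entry with a traceback."""
    logger = logging.getLogger(program + ".crash")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.propagate = False

    def hook(exc_type, exc_value, exc_tb):
        # exc_info makes the logging module append the formatted traceback.
        logger.critical("uncaught %s: %s", exc_type.__name__, exc_value,
                        exc_info=(exc_type, exc_value, exc_tb))

    sys.excepthook = hook  # called by the interpreter for unhandled exceptions
    return hook


# Demo: trigger the hook by hand instead of actually crashing the interpreter.
out = io.StringIO()
hook = install_crash_logger("demo", out)
try:
    1 / 0
except ZeroDivisionError:
    hook(*sys.exc_info())
sys.excepthook = sys.__excepthook__  # restore the default after the demo
print(out.getvalue(), end="")
```

The design point is the one made in the thread: the application author writes no logging code at all, because the hook sits upstream of the application. The catch, also made in the thread, is that this only covers programs running on a runtime that offers such a hook.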
Re: Why is troubleshooting Linux so hard?
On Mon Nov 15, 2010 at 15:19:27 -0500, Miles Fidelman wrote:

Actually, that does suggest a policy-level, or perhaps kernel-level approach -- creating a stronger framework for logging, error-reporting, tracebacks, etc. The further upstream that's implemented, the more likely developers are to utilize well-defined and well-supported hooks.

What I was trying to suggest is that a policy is meaningless because programs you use are written by different people, in different places. Debian policy generally refers to how Debian systems are put together - i.e. things we control (such as what information is provided in package meta-data). Even in that case policy is only written after it has been implemented. There is generally not a discussion, then a policy, then an implementation. That is not the kind of bureaucratic way that Debian is organized.

Debian policy wouldn't arbitrarily try to mandate how the software we include is written because we simply have no control over that. Sure, we can and do patch some software, but to implement your suggestion we'd have to patch many, many, many pieces of unrelated software and that is not a simple thing. Nor would maintaining those patches be easy.

(Not that I disapprove of your general idea; but consider: would *you* personally download the source to 100 applications, update them to log in a consistent fashion, post the patches to the appropriate project's discussion lists (if they even exist), then keep them updated for a year or two? Even if you did, who would handle the other few thousand application binaries?)

Steve

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20101115203403.gb7...@steve.org.uk
Re: Why is troubleshooting Linux so hard?
On 15. 11. 2010 at 19:51:55, Borden Rhodes wrote:

I'm pleased for all of the feedback and that I'm not the only person who's frustrated. I tried proposing to debian-policy that it be mandatory that all logs have timestamps http://lists.debian.org/debian-policy/2010/02/msg00035.html but my suggestion was dismissed because it was considered too hard to enforce. I responded http://lists.debian.org/debian-policy/2010/08/msg00043.html saying that it shouldn't be very difficult to enforce or implement at all.

Well, even if that was difficult to enforce or implement, it shouldn't be difficult at all to at least suggest or strongly encourage, right?

What would it take to get some error message standards in place so that troubleshooting Linux is possible for those of us who aren't computer science PhD candidates?

I second that. Moreover, I'm confident that within, say, five years from now, such standards will be a given, making us wonder how people managed to live without them for so long ...

-- Cheerio, Klistvud http://bufferoverflow.tiddlyspot.com Certifiable Loonix User #481801 Please reply to the list, not to me.

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1289854564.1273...@compax
Why is troubleshooting Linux so hard?
Good morning, I'm going to list some of the frustrations I've been having with troubleshooting Linux's quirks, crashes and problems in hopes that someone may be able to help me (and the community) become better bug reporters and troubleshooters. I'll make comparisons to Windows only because I am used to fixing the same problems in Windows a certain way - maybe there are analogies in Linux or maybe I'm approaching these problems the wrong way. I'm not trying to troll or flame-bait. I'm using Debian Squeeze, by the way. 1) Is there a way to apply debugging symbols retroactively to a dump? A few times I've had Linux crash on me and spit out a debugging dump. I do my best to install debugging symbols for all 1400 packages I have on my system (when I can find them) but this requires a huge amount of hard disk space and, invariably, the odd dump is missing symbols. Recreating the crash isn't always possible. Is there (or could someone invent) a way to save a dump without the symbols, download the symbol tables and then regenerate the dump with the symbols so it's useful to developers? 2) I find that the logs contain lots of facts but not a whole lot of useful information (if any) when something goes wrong. I've had KDE go black-screen on me, for example, and force a hard reboot but there's no mention whatsoever (that I can find) in xorg.log, kdm.log, messages, syslog or dmesg. Windows seems to be fairly good at making its last breath a stop error before it dies which means when I get back into the system (or when I'm looking at a client's computer days after) I can find that stop error, look it up and figure out what went wrong. Are Linux's logs designed for troubleshooting or only for monitoring? Are proper troubleshooting logs kept somewhere else or in a special file? Is there a guide on how to read Linux's logs so I can make sense out of them like I can Windows' logs? 3) Linux needs better troubleshooting and recovery systems. 
The answer I usually get when I get an unexplained error is to run the program inside a dbg or with valgrind. I'm not convinced that this is a practical way to troubleshoot serious problems (like kernel panics) and it requires a certain amount of foresight that a problem will occur. According to this logic, the only way that someone can produce useful reports and feedback (or even get a clue as to what happened) on the day-to-day crashes and bugs is to start Linux and all of its sub process inside valgrind and/or gdb. This is obviously not an intended use of these programs. This is what would make it easier (at least for me) to troubleshoot Linux problems. If these features exist, please let me know so I can start using them (they should probably be documented in the man pages too). 1) Logs need to have useful information. When I look at a client's Windows box days after they report something going wrong, the logs tell me at what time the problem happened, which process failed and what error it threw just before it blew. I can look those error codes up and (usually) fix the problem within an hour. When something dies on Linux, the log entry (assuming it even makes one) only tells me how many seconds into that particular boot the problem occurred. I've never been able to go back a few days later and find the log entries related to a particular crash - maybe because they've been purged. I know that the Linux tradition is to identify processes only by ID but surely there must be a way that it can print a file or package name or anything more useful than memory addresses and registers so at least I know where to start pointing fingers. Several people have told me that it's pointless trying to debug a dump in the logs. What's the point of dumping it in the first place if nobody can read it? 2) I wish error logs had simple codes or messages (which have documentation) like Windows Stop errors so I can look them up and figure out why something died. 
Often times I try to Google the whole error message and either get directed to source code or totally irrelevant postings (since it seems that many messages are reused for all kinds of problems). For example, 'segfault' gets thrown so much that it only tells you that the program crashed - something I already know. 3) Logs need better organisation. I'm looking at the most recent dump and each message is printed on its own line. The problem is that interspersed in those individual lines may be other entries from other events not related to the problem in question. When I look at a Windows log, each event is entirely contained in one entry. It doesn't make one entry for Stop, another entry for the Stop number, another 4 entries for the parameters and more entries for whatever other information usually is in them - whilst having other entries amid the list with what other things were doing at the time. I find Linux logs very frustrating to read for that
Re: Why is troubleshooting Linux so hard?
My suggestion, can't we create troubleshooting database??

On Sun, Aug 15, 2010 at 11:30 AM, Borden Rhodes j...@bordenrhodes.com wrote:

[full quote of the original message snipped]
Re: Why is troubleshooting Linux so hard?
On Sun, 2010-08-15 at 02:00 -0400, Borden Rhodes wrote:

Good morning, I'm going to list some of the frustrations I've been having with troubleshooting Linux's quirks, crashes and problems in hopes that someone may be able to help me (and the community) become better bug reporters and troubleshooters. [snip]

Very interesting and helpful post. Thank you. I've snipped most of it out for the sake of those for whom long emails are a problem or expensive. As someone without deep troubleshooting experience, I'll be curious to see the responses. The only thing I can help with is the old log information. Linux usually rotates the logs. As I'd imagine you've seen, you can find older (often compressed) versions of the logs next to the current one, e.g., messages.1, messages.2.gz

- John

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1281853909.24518.43.ca...@denise.theartistscloset.com
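Following up on the point about rotated logs: when the entry you want may already have been rotated away, it helps to search the current log and its rotated siblings in one pass. Below is a minimal sketch, assuming the common naming scheme mentioned above (messages, messages.1, messages.2.gz). The function names open_log and search_rotated are made up for this illustration, and the demo builds its own throwaway files rather than touching /var/log.

```python
import glob
import gzip
import os
import tempfile


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def search_rotated(base, needle):
    """Return (path, line) pairs containing needle in base and its rotations."""
    hits = []
    # base + "*" matches messages, messages.1, messages.2.gz, and so on
    # (and any other file sharing the prefix).
    for path in sorted(glob.glob(glob.escape(base) + "*")):
        with open_log(path) as handle:
            for line in handle:
                if needle in line:
                    hits.append((path, line.rstrip("\n")))
    return hits


# Demo on a temporary directory laid out like a rotated log set.
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "messages")
    with open(base, "w") as f:
        f.write("boot ok\n")
    with open(base + ".1", "w") as f:
        f.write("kernel: segfault at 0\n")
    with gzip.open(base + ".2.gz", "wt") as f:
        f.write("kernel: segfault at 8\n")
    hits = search_rotated(base, "segfault")
    names = [os.path.basename(p) for p, _ in hits]
    print(names)
```

On a real system the base path would be something like /var/log/messages or /var/log/syslog, and reading those files usually needs root or membership in a log-reading group.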
Re: Why is troubleshooting Linux so hard?
On Sun, Aug 15, 2010 at 02:00:52AM -0400, Borden Rhodes wrote:

Good morning, I'm going to list some of the frustrations I've been having with troubleshooting Linux's quirks, crashes and problems in hopes that someone may be able to help me (and the community) become better bug reporters and troubleshooters. I'll make comparisons to Windows only because I am used to fixing the same problems in Windows a certain way - maybe there are analogies in Linux or maybe I'm approaching these problems the wrong way. I'm not trying to troll or flame-bait. I'm using Debian Squeeze, by the way. ..snip..

That's been my main complaint about Linux ever since I started using it 10 years or more ago (Redhat 6.0).

-- Bob Holtzman Key ID: 8D549279 If you think you're getting free lunch, check the price of the beer
Re: Why is troubleshooting Linux so hard?
In 201008150200.52677.j...@bordenrhodes.com, Borden Rhodes wrote:

1) Is there a way to apply debugging symbols retroactively to a dump? A few times I've had Linux crash on me and spit out a debugging dump. I do my best to install debugging symbols for all 1400 packages I have on my system (when I can find them) but this requires a huge amount of hard disk space and, invariably, the odd dump is missing symbols. Recreating the crash isn't always possible. Is there (or could someone invent) a way to save a dump without the symbols, download the symbol tables and then regenerate the dump with the symbols so it's useful to developers?

Yes, it is, sometimes. Ubuntu has a process to do it automatically that mostly gets it right. Modern versions of strip et al. allow you to save the debugging information to a separate .so that just contains debugging information. gdb (et al.) can then use the debugging-info-only .so to decorate an existing backtrace. This is actually how a lot of distributions produce separate -dbg or -DEBUG packages.

However, these debugging-info-only files only match the *same exact build* of the real .so. Taking a random backtrace, determining which build it came from and finding the appropriate -dbg packages is a bit difficult. Also, things like prelink, which modify existing .so files, result in the debugging-info-only .so not matching. This might also happen with some types of hardening that reduce the impact of heap/stack overflow/underflow attacks.

Compounding this problem is the large number of programs that are being written with parts in scripting languages, or otherwise non-C/C++ languages, where the path from a symbol in an ELF file to the problematic code is not as simple.

In short, it can be done in some cases and there are programmers working on making backtraces from Joe Sixpack or Jane Boxwine more useful. It does seem to me like there may need to be more people working on this, but it is not very sexy work.
Most programmers would rather spend their time improving the user experience when things are working; IME, that is where the user spends most of their time.

> 2) I find that the logs contain lots of facts but not a whole lot of useful information (if any) when something goes wrong. I've had KDE go black-screen on me, for example, and force a hard reboot but there's no mention whatsoever (that I can find) in xorg.log, kdm.log, messages, syslog or dmesg. Windows seems to be fairly good at making its last breath a stop error before it dies which means when I get back into the system (or when I'm looking at a client's computer days after) I can find that stop error, look it up and figure out what went wrong. Are Linux's logs designed for troubleshooting or only for monitoring? Are proper troubleshooting logs kept somewhere else or in a special file? Is there a guide on how to read Linux's logs so I can make sense out of them like I can Windows' logs?

In the case of a kernel crash, the last breath of the system is unfortunately not writing to dmesg/syslog and sync()ing disks. Depending on the nature of the crash, there are some good reasons not to do this, though. (E.g. in the case of a PANIC(), the kernel developer is basically indicating that the kernel image has been compromised -- doing FS operations with a compromised kernel might cause [more] data loss.)

I think that logs in general are... dropping in quality. They seem to be less focused on failed sanity checks, mis-configuration warnings, and I-was-here-before-I-called-exit() messages. They seem to be more filled with I-didn't-comment-this-out-before-our-release-build debugging messages for random developers. This is not true of kernel logs for the most part; I find them informative, but it is rarely my kernel that causes me problems.

I speak as someone who has been working as a developer in some capacity for 8 years. Take that for what you will.

> 3) Linux needs better troubleshooting and recovery systems.
> The answer I usually get when I get an unexplained error is to run the program inside gdb or with valgrind. I'm not convinced that this is a practical way to troubleshoot serious problems (like kernel panics) and it requires a certain amount of foresight that a problem will occur. According to this logic, the only way that someone can produce useful reports and feedback (or even get a clue as to what happened) on the day-to-day crashes and bugs is to start Linux and all of its subprocesses inside valgrind and/or gdb. This is obviously not an intended use of these programs.

If we don't know how to reproduce the problem, we can't fix it. If we do know how to reproduce the problem, the foresight needed to use gdb/valgrind is not much more. They shouldn't be your first tools, but they are necessary.

I've also had gdb/valgrind mask errors, which is truly unfortunate. Still, if you know a way to make it crash every time EXCEPT when in gdb/valgrind,
Re: Why is troubleshooting Linux so hard?
On Sun, 15 Aug 2010 02:00:52 -0400, Borden Rhodes wrote:

> I'm going to list some of the frustrations I've been having with troubleshooting Linux's quirks, crashes and problems in hopes that someone may be able to help me (and the community) become better bug reporters and troubleshooters. I'll make comparisons to Windows only because I am used to fixing the same problems in Windows a certain way - maybe there are analogies in Linux or maybe I'm approaching these problems the wrong way. I'm not trying to troll or flame-bait. I'm using Debian Squeeze, by the way.

> 1) Is there a way to apply debugging symbols retroactively to a dump?

(...)

Dunno, but in all the time I have been using Linux, I have never had to install a -dbg package to solve 99% of the problems :-)

Anyway, it is true that every piece of software (kernel, GNOME apps, KDE apps, Mozilla apps...) has its own system for debugging problems.

> 2) I find that the logs contain lots of facts but not a whole lot of useful information (if any) when something goes wrong.

(...)

Ugh, in Windows it is even worse :-(

I find Linux logging to be very useful (for finding real problems) and highly configurable. It is not uncommon for a program to have a verbose switch that provides very detailed information. The problem here is knowing which program is involved in the crash, and therefore where to look. Expertise helps here.

> 3) Linux needs better troubleshooting and recovery systems.

Yes, I agree this point can be improved and automated. But, generally speaking, I find the Linux debugging system to be far more useful than the Windows one.

Greetings,

--
Camaleón

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/pan.2010.08.15.17.03...@gmail.com
Re: Why is troubleshooting Linux so hard?
On Sun, Aug 15, 2010 at 02:31:49AM -0400, John A. Sullivan III wrote:

> Very interesting and helpful post. Thank you. I've snipped most of it out for the sake of those for whom long emails are a problem or expensive.

You should ALWAYS trim your messages, cutting out the irrelevant cruft and leaving only enough of the original message to which you're replying that others can make sense of your reply.

Thank you for trimming. Now if only everyone else would learn that lesson.

--
. O . O . O . . O O . . . O . . . O . O O O . O . O O . . O O O O . O . . O O O O . O O O