RE: large messages from Jenkins failures
The Groovy-scripted email length reduction method appears to be working now (after quite a few false starts). I've added the following script, which limits email body text to 200K chars, to all Lucene and Solr jobs on all three Jenkins instances that regularly send failure emails to this list (ASF, SDDS, and flonkings):

    maxLength = 200000;
    trailingLength = 1;
    // the body text is the first part of the multipart message
    bodyPart = msg.getContent().getBodyPart(0);
    body = bodyPart.getContent();
    bodyLength = body.length();
    logger.println("[Email-ext] Notification email body length: " + bodyLength);
    if (bodyLength > maxLength) {
      // keep the head and the tail of the body, drop the middle
      text = new StringBuilder();
      text.append(body.substring(0, maxLength - trailingLength));
      text.append("\n\n[...truncated too long message...]\n\n");
      text.append(body.substring(bodyLength - trailingLength));
      bodyPart.setText(text.toString(), "UTF-8");
      logger.println("[Email-ext] Reduced notification email body length to: " + text.length());
    }

You can see the first successfully length-reduced email to this list, sent from Policeman Jenkins Server a little less than an hour before this post, with subject "[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_06) - Build # 533 - Failure!".

I chose the 200K chars limit somewhat arbitrarily - please let me know if you think it should be different.

Steve

-----Original Message-----
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: Tuesday, August 28, 2012 3:56 PM
To: dev@lucene.apache.org
Subject: RE: large messages from Jenkins failures

Actually, after discussing with Uwe on #lucene-dev IRC, I'm looking into another mechanism to reduce the size of email messages: the Jenkins Email-Ext plugin has a per-build-job configuration item named "Pre-send script" that allows you to modify the MimeMessage object representing an email via a Groovy script.

Here's what I've got so far - I'm going to enable this now on all the jobs on Uwe's Jenkins (the "msg" variable, of type MimeMessage, is made available by the plugin to the script):

    maxLength = 20;
    trailingLength = 1;
    content = msg.getContent(); // assumption: mime type is text/plain
    contentLength = content.length();
    if (content.length() > maxLength) {
      text = content.substring(0, maxLength - trailingLength) +
             "\n\n[... truncated too long message ...]\n\n" +
             content.substring(contentLength - trailingLength);
      msg.setText(text, "UTF-8");
    }

Steve

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Tuesday, August 28, 2012 3:11 PM
To: dev@lucene.apache.org
Subject: Re: large messages from Jenkins failures

On Mon, Aug 20, 2012 at 2:22 PM, Dawid Weiss <dawid.we...@cs.put.poznan.pl> wrote:
> Oh, one more thing -- if we suppress the console output we would absolutely have to keep (at jenkins) multiple tests-report.txt files because these always contain full output dumps (regardless of console settings). Otherwise we'd suppress potentially important info.

+1 to not forward truckloads of info to the mailing lists, as long as we can easily get at it via jenkins or some other mechanism.

-Yonik
http://lucidworks.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
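For anyone who wants to try the head-plus-tail truncation outside Jenkins, here is a minimal Java sketch of the same idea on a plain String. It is an illustration only: the class name BodyTruncator and its method are hypothetical, and it does not touch MimeMessage or the Email-Ext plugin at all.

```java
// Sketch only: plain-String version of the head-plus-tail truncation
// described in the message above. BodyTruncator is a hypothetical name.
final class BodyTruncator {
    static final String MARKER = "\n\n[...truncated too long message...]\n\n";

    // Keeps the first (maxLength - trailingLength) chars and the last
    // trailingLength chars, joined by MARKER. Short bodies are returned as-is.
    static String truncate(String body, int maxLength, int trailingLength) {
        if (trailingLength < 0 || trailingLength > maxLength) {
            throw new IllegalArgumentException("need 0 <= trailingLength <= maxLength");
        }
        if (body.length() <= maxLength) {
            return body;
        }
        return body.substring(0, maxLength - trailingLength)
                + MARKER
                + body.substring(body.length() - trailingLength);
    }

    public static void main(String[] args) {
        // 26-char body, limit 10, keep the last 3 chars
        System.out.println(truncate("abcdefghijklmnopqrstuvwxyz", 10, 3));
    }
}
```

Note that the result is slightly longer than maxLength, because the marker is added on top of the kept head and tail; the script above behaves the same way.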
Re: large messages from Jenkins failures
Created SOLR-3766 for this.

Dawid

On Wed, Aug 29, 2012 at 2:32 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> On Tue, Aug 28, 2012 at 5:42 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>> If folks are concerned that certain tests fail too frequently to be considered "stable" and included in the main build, then let's:
>> 1) slap a special @UnstableTest annotation on them
>> 2) set up a new jenkins job that *only* runs these @UnstableTest jobs
>> 3) configure this new jenkins job to not send any email
>
> +1
>
> Mike McCandless
> http://blog.mikemccandless.com
RE: large messages from Jenkins failures
Hi Dawid,

Unfortunately, -Dtests.showOutput=never penalizes all tests that don't have megabytes of failure output because some do.

What do you think of adding an option to limit output size (e.g. -Dtests.outputLimitKB=10), and truncating to that size if it's exceeded? If you think this would be reasonable, I'm willing to (try to) do the work.

Steve

-----Original Message-----
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: Monday, August 20, 2012 4:10 PM
To: dev@lucene.apache.org
Subject: RE: large messages from Jenkins failures

+1 to using -Dtests.showOutput=never for Jenkins jobs. - Steve

-----Original Message-----
From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss
Sent: Monday, August 20, 2012 2:20 PM
To: dev@lucene.apache.org
Subject: Re: large messages from Jenkins failures

This is partially aggregated by solr failure logs (the output for successful suites is not emitted to the console). As for myself, I don't look at those e-mails directly; I typically click on the jenkins link to see the full output.

Alternatively we could suppress the console output for failures too (it would still show the stack trace and everything, just not the stdout/sysouts) -- this is relatively easy to override even from jenkins level:

-Dtests.showOutput=never

Dawid

On Fri, Aug 17, 2012 at 5:04 PM, Dyer, James <james.d...@ingramcontent.com> wrote:
> Is there any way we can limit the size of the messages Jenkins emails this list? Responding to a "your mailbox is full" warning, I found I had 32 recent Jenkins messages all over 1 MB (a few were 10 MB). A few weeks ago I returned from vacation to find my mail account partially disabled because Jenkins had used up most of my storage.
>
> Maybe, if the log is more than so many lines, it could just supply a link to it rather than have the whole thing in the email? I realize a lot of you have unlimited storage on your email accounts, but unfortunately I do not.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
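To make the output-limit proposal above concrete, here is a rough Java sketch of a stream wrapper that passes output through until a byte limit is reached and then writes a single truncation notice. Everything here is an assumption for illustration: tests.outputLimitKB is only the property name floated in the message (not an existing runner option), and LimitedOutputStream is a hypothetical class, not part of any test framework.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: caps how many bytes reach the wrapped stream.
final class LimitedOutputStream extends OutputStream {
    private final OutputStream delegate;
    private final long limitBytes;   // negative means "no limit"
    private long written;
    private boolean noticeWritten;

    LimitedOutputStream(OutputStream delegate, long limitBytes) {
        this.delegate = delegate;
        this.limitBytes = limitBytes;
    }

    // Reads the hypothetical -Dtests.outputLimitKB=N property;
    // returns -1 (no limit) when the property is not set.
    static long limitFromProperty() {
        Integer kb = Integer.getInteger("tests.outputLimitKB");
        return kb == null ? -1L : kb * 1024L;
    }

    @Override
    public void write(int b) throws IOException {
        if (limitBytes < 0 || written < limitBytes) {
            delegate.write(b);
            written++;
        } else if (!noticeWritten) {
            // first byte past the limit: emit the notice once, then drop the rest
            delegate.write("\n[... output truncated ...]\n".getBytes(StandardCharsets.UTF_8));
            noticeWritten = true;
        }
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }
}
```

A wrapper like this only keeps the head of the output; keeping the tail as well would need buffering, which is what the circular-buffer suggestion later in the thread is about.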
Re: large messages from Jenkins failures
> Unfortunately, -Dtests.showOutput=never penalizes all tests that don't have megabytes of failure output because some do.

It doesn't penalize them, it says exactly what it does. A full report is always written to disk, including all sysouts -- look at tests-report.txt if I recall right.

> What do you think of adding an option to limit output size (e.g. -Dtests.outputLimitKB=10), and truncating to that size if it's exceeded? If you think this would be reasonable, I'm willing to (try to) do the work.

I don't know... this seems like monkey patching for something that is wrong in the first place. Here are my thoughts on this:

1) the problem is not really in big e-mails but that they're frequent failures resulting from pretty much a fixed set of classes that we don't know how to stabilize.

2) I think Solr emits a LOT of logging information to the console. I don't know if all of it is really useful -- I doubt it, really.

The solutions I see are simple -- disable the tests that fail 3-5 times and we still don't know what causes the problem. Disable them and file a JIRA issue.

An alternative is to redirect these logs on Solr tests to a file or a circular memory buffer and only emit like a tail of N most recent messages if we know a test failed (which is easy to do with a simple rule).

Patching the test runner to truncate log output is doable of course but I think it's "powdering the corpse" or whatever the English idiom for that is, you get me.

Dawid
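The circular-buffer alternative mentioned above could look roughly like the following Java sketch: log lines go into a fixed-size buffer, and only the N most recent ones are available to print when a test fails. TailLogBuffer is a hypothetical class written for illustration; how it would be hooked into Solr's logging or a test rule is left open.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch only: remembers the most recent `capacity` log lines.
final class TailLogBuffer {
    private final int capacity;
    private final Deque<String> lines = new ArrayDeque<>();

    TailLogBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds a line, evicting the oldest one when the buffer is full.
    synchronized void add(String line) {
        if (lines.size() == capacity) {
            lines.removeFirst();
        }
        lines.addLast(line);
    }

    // Returns a copy of the retained lines, oldest first.
    synchronized List<String> tail() {
        return new ArrayList<>(lines);
    }

    public static void main(String[] args) {
        TailLogBuffer buffer = new TailLogBuffer(3);
        for (int i = 1; i <= 5; i++) {
            buffer.add("log line " + i);
        }
        boolean testFailed = true; // in a real runner this would come from the test result
        if (testFailed) {
            buffer.tail().forEach(System.out::println);
        }
    }
}
```

Memory use is bounded by the capacity regardless of how much a test logs, which is the property that makes this attractive compared with buffering everything.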
Re: large messages from Jenkins failures
On Tue, Aug 28, 2012 at 2:43 PM, Dawid Weiss <dawid.we...@cs.put.poznan.pl> wrote:
> 2) I think Solr emits a LOT of logging information to the console. I don't know if all of it is really useful -- I doubt it, really.
>
> The solutions I see are simple -- disable the tests that fail 3-5 times and we still don't know what causes the problem. Disable them and file a JIRA issue.

Another option is to redirect solr fails to a different mailing list that only those that care about solr development can follow.

Tests that fail a small percent of the time are still hugely valuable (i.e. when they fail for a different reason than usual, or they start failing much more often). Simply disabling them is far worse for the project.

-Yonik
http://lucidworks.com
Re: large messages from Jenkins failures
On Mon, Aug 20, 2012 at 2:22 PM, Dawid Weiss <dawid.we...@cs.put.poznan.pl> wrote:
> Oh, one more thing -- if we suppress the console output we would absolutely have to keep (at jenkins) multiple tests-report.txt files because these always contain full output dumps (regardless of console settings). Otherwise we'd suppress potentially important info.

+1 to not forward truckloads of info to the mailing lists, as long as we can easily get at it via jenkins or some other mechanism.

-Yonik
http://lucidworks.com
RE: large messages from Jenkins failures
Actually, after discussing with Uwe on #lucene-dev IRC, I'm looking into another mechanism to reduce the size of email messages: the Jenkins Email-Ext plugin has a per-build-job configuration item named "Pre-send script" that allows you to modify the MimeMessage object representing an email via a Groovy script.

Here's what I've got so far - I'm going to enable this now on all the jobs on Uwe's Jenkins (the "msg" variable, of type MimeMessage, is made available by the plugin to the script):

    maxLength = 20;
    trailingLength = 1;
    content = msg.getContent(); // assumption: mime type is text/plain
    contentLength = content.length();
    if (content.length() > maxLength) {
      // keep the head and the tail of the content, drop the middle
      text = content.substring(0, maxLength - trailingLength) +
             "\n\n[... truncated too long message ...]\n\n" +
             content.substring(contentLength - trailingLength);
      msg.setText(text, "UTF-8");
    }

Steve

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Tuesday, August 28, 2012 3:11 PM
To: dev@lucene.apache.org
Subject: Re: large messages from Jenkins failures

On Mon, Aug 20, 2012 at 2:22 PM, Dawid Weiss <dawid.we...@cs.put.poznan.pl> wrote:
> Oh, one more thing -- if we suppress the console output we would absolutely have to keep (at jenkins) multiple tests-report.txt files because these always contain full output dumps (regardless of console settings). Otherwise we'd suppress potentially important info.

+1 to not forward truckloads of info to the mailing lists, as long as we can easily get at it via jenkins or some other mechanism.

-Yonik
http://lucidworks.com
Re: large messages from Jenkins failures
> Another option is to redirect solr fails to a different mailing list that only those that care about solr development can follow.

I don't make a distinction between solr and lucene development, call me odd. I did try to help with those few tests (and I fixed some others) but no luck.

> Tests that fail a small percent of the time are still hugely valuable (i.e. when they fail for a different reason than usual, or they start failing much more often). Simply disabling them is far worse for the project.

I don't agree with you here. I think having two or three failures daily from the same test (and typically with the same message) is far worse than not having it at all. You get used to having failing tests and this is bad. A test failure should be a red flag, something you eagerly look into because you're curious about what happened. I stopped having that feeling after a while, this seems bad to me.

Dawid
Re: large messages from Jenkins failures
On Tue, Aug 28, 2012 at 3:04 PM, Yonik Seeley <yo...@lucidworks.com> wrote:
> On Tue, Aug 28, 2012 at 2:43 PM, Dawid Weiss <dawid.we...@cs.put.poznan.pl> wrote:
>> 2) I think Solr emits a LOT of logging information to the console. I don't know if all of it is really useful -- I doubt it, really.
>>
>> The solutions I see are simple -- disable the tests that fail 3-5 times and we still don't know what causes the problem. Disable them and file a JIRA issue.
>
> Another option is to redirect solr fails to a different mailing list that only those that care about solr development can follow.

I don't think splintering the dev community is healthy.

What I really want is for the tests (or the bugs in Solr/Lucene causing the test failures) to be fixed, for a Solr dev who understands the test to dig into it.

> Tests that fail a small percent of the time are still hugely valuable (i.e. when they fail for a different reason than usual, or they start failing much more often). Simply disabling them is far worse for the project.

I agree, for tests that don't fail frequently. This is the power/purpose of having a test.

The problem is certain Solr tests fail very frequently and nobody jumps on those failures / we become complacent: such failures quickly stop being helpful. I know Mark has jumped on some of the test failures (thank you!), but he's only one person and we still have certain Solr tests failing frequently.

This really reflects a deeper problem: Solr doesn't have enough dev coverage, or devs that have time/itch/energy to dig into hard test failures. When a test fails devs should be eager to fix it. That's the polar opposite of Solr's failures today.

Mike McCandless
http://blog.mikemccandless.com
Re: large messages from Jenkins failures
On Tue, Aug 28, 2012 at 3:57 PM, Dawid Weiss <dawid.we...@cs.put.poznan.pl> wrote:
> I don't agree with you here. I think having two or three failures daily from the same test (and typically with the same message) is far worse than not having it at all.

Imperfect test coverage is better than no test coverage? Seems like we could simply disable all of our tests and then be happy because they will never fail ;-)

Some of these tests fail because of threads left over that are hard to control - we have a lot more moving parts like jetty and zookeeper. Some tests started failing more often because of more stringent checks (like threads left over after a test). If these can't be fixed in a timely manner, it seems like the most logical thing to do is relax the checks - that maximises test coverage.

> You get used to having failing tests and this is bad. A test failure should be a red flag, something you eagerly look into because you're curious about what happened. I stopped having that feeling after a while, this seems bad to me.

It is bad, but disabling seems even worse, unless we're just not worried about test code coverage at all.

-Yonik
http://lucidworks.com
Re: large messages from Jenkins failures
On Tue, Aug 28, 2012 at 4:03 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>> Another option is to redirect solr fails to a different mailing list that only those that care about solr development can follow.
>
> I don't think splintering the dev community is healthy.

Well, it seems like some people would prefer tests that fail sometimes to be disabled so they don't see the failure messages. Others (like me) find those tests to be extremely valuable since they represent coverage for key features.

How else to resolve that? "Just fix the test" isn't an answer... unless one is personally committing the time to do it themselves.

-Yonik
http://lucidworks.com
Re: large messages from Jenkins failures
> Imperfect test coverage is better than no test coverage? Seems like we could simply disable all of our tests and then be happy because they will never fail ;-)

I didn't say that. I said the opposite - that having imperfect tests (or rather tests that cannot be fixed for whatever reason) discourages people from looking at test failures and makes one just unsubscribe from the jenkins mails. If this is the case then yes, I think not having a test like that at all is better than having it.

> Some of these tests fail because of threads left over that are hard to control - we have a lot more moving parts like jetty and zookeeper.

I understand that but these tests have been failing long before those checks were added. I also understand the complexity involved -- like I said, I also tried to fix those tests and failed.

> timely manner, it seems like the most logical thing to do is relax the checks - that maximises test coverage.

These thread leak checks are meant to isolate test suites from each other and I think they do a good job at it.

> It is bad, but disabling seems even worse, unless we're just not worried about test code coverage at all.

We have different viewpoints on this, sorry.

Dawid
Re: large messages from Jenkins failures
: I didn't say that. I said the opposite - that having imperfect tests
: (or rather tests that cannot be fixed for whatever reason) discourages
: from looking at test failures and makes one just unsubscribe from the
: jenkins mails. If this is the case then yes, I think not having a test
: like that at all is better than having it.

As I've said before...

Running these problematic tests in jenkins on machines like builds.apache.org is still very helpful because in many cases folks are unable to reproduce the failures anywhere else (or in some cases: some people can reproduce them, but not the people who have the knowledge/energy to fix them).

If folks are concerned that certain tests fail too frequently to be considered "stable" and included in the main build, then let's:

1) slap a special @UnstableTest annotation on them
2) set up a new jenkins job that *only* runs these @UnstableTest jobs
3) configure this new jenkins job to not send any email

...seems like that would satisfy everyone, right?

-Hoss
Re: large messages from Jenkins failures
> 1) slap a special @UnstableTest annotation on them
> 2) set up a new jenkins job that *only* runs these @UnstableTest jobs
> 3) configure this new jenkins job to not send any email
>
> ...seems like that would satisfy everyone right?

I'm all for it. We can rename @BadApple to @Unstable and make it disabled by default. As for (2), this will be tricky because there's no way to run just a specific group. I like this idea as a feature though, so if there are no vetoes I'll add it to the runner.

Dawid
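As a rough illustration of the annotation idea discussed above, the following Java sketch marks suites with a runtime annotation and selects either the stable suites (the default build) or only the unstable ones (the separate, email-free job). All names here are assumptions: UnstableTest mirrors the name proposed in the thread, UnstableFilter and the tests.unstable property are hypothetical, and this is not how the actual test runner implements test groups.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Marker for suites that fail too often to belong in the main build.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface UnstableTest {}

// Sketch only: picks which suites a given job should run.
final class UnstableFilter {
    // Hypothetical switch, e.g. -Dtests.unstable=true on the extra Jenkins job.
    static final String RUN_UNSTABLE_PROPERTY = "tests.unstable";

    // Unstable suites are disabled unless the property is explicitly set to true.
    static boolean unstableRequested() {
        return Boolean.getBoolean(RUN_UNSTABLE_PROPERTY);
    }

    // unstableOnly == false: everything except @UnstableTest suites.
    // unstableOnly == true:  only @UnstableTest suites.
    static List<Class<?>> select(List<Class<?>> suites, boolean unstableOnly) {
        List<Class<?>> selected = new ArrayList<>();
        for (Class<?> suite : suites) {
            if (suite.isAnnotationPresent(UnstableTest.class) == unstableOnly) {
                selected.add(suite);
            }
        }
        return selected;
    }
}

// Two placeholder suites used to demonstrate the selection.
class StableSuite {}

@UnstableTest
class FlakySuite {}
```

With this shape, the main build would call select(suites, false) and the extra job would call select(suites, true), so no suite is dropped entirely; it only moves to the job that does not send email.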
Re: large messages from Jenkins failures
On Tue, Aug 28, 2012 at 5:42 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
> If folks are concerned that certain tests fail too frequently to be considered "stable" and included in the main build, then let's:
> 1) slap a special @UnstableTest annotation on them
> 2) set up a new jenkins job that *only* runs these @UnstableTest jobs
> 3) configure this new jenkins job to not send any email

+1

Mike McCandless
http://blog.mikemccandless.com