Hey Guys,

Updated JIRA with current theory on what's causing the failure:
https://issues.apache.org/jira/browse/SAMZA-166

Still investigating.

Cheers,
Chris

On 3/4/14 2:27 PM, "TJ Giuli" <[email protected]> wrote:

>Great, thanks Chris.
>
>Also, I should mention that when I build on my Mac, this is sprinkled
>throughout the build output:
>
>objc[52666]: Class JavaLaunchHelper is implemented in both
>/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/bin/java
>and
>/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre/lib/libinstrument.dylib.
>One of the two will be used. Which one is undefined.
>
>—T
>On Mar 4, 2014, at 2:11 PM, Chris Riccomini <[email protected]>
>wrote:
>
>> Hey Guys,
>>
>> Able to reproduce the change log issue. Opening a JIRA and investigating.
>>
>> https://issues.apache.org/jira/browse/SAMZA-166
>>
>> TJ, I'll dig into the cache issue afterwards.
>>
>> Cheers,
>> Chris
>>
>> On 3/4/14 2:04 PM, "TJ Giuli" <[email protected]> wrote:
>>
>>> Sure, Chris,
>>>
>>> 1.) d38277ff83956f5885dd6596db9c0e15761964c7
>>> 2.) ./gradlew clean test
>>> 3.) It doesn’t happen every time. I just ran three consecutive tests, 2
>>> failed with different failures and one succeeded.
>>> Failure 1: http://pastebin.com/YG7KBjJz
>>> Failure 2: http://pastebin.com/7NqES1rS
>>>
>>> Thanks for getting on this!
>>> —T
>>>
>>> On Mar 4, 2014, at 1:37 PM, Chris Riccomini <[email protected]>
>>> wrote:
>>>
>>>> Hey Guys,
>>>>
>>>> Having a look, but nothing yet.
>>>>
>>>> Regarding the TestStatefulTask bugs, Martin did find a bug this morning
>>>> in the SAMZA-142 commit. The issue is that KafkaSystemAdmin can
>>>> occasionally return empty metadata information for a change-log stream.
>>>> This results in an NPE later in the TaskStorageManager. The issue is
>>>> triggered when there is no lead Kafka broker for a given change-log's
>>>> topic/partition.
>>>>
>>>> That said, I don't *think* this should cause a failure in
>>>> TestStatefulTask, since TestStatefulTask.validateTopics is run before
>>>> the tests are run, and validateTopics checks to make sure that the
>>>> metadata is available and there is no error code.
>>>>
>>>> As for the testBasicMetadataCacheFunctionality, I haven't seen that
>>>> issue, and can't reproduce it. TJ, can you send:
>>>>
>>>> 1. The git checksum you're working off of.
>>>> 2. The command you're using to run the test.
>>>> 3. Does the failure happen every time, or just randomly?
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> On 3/3/14 11:57 PM, "TJ Giuli" <[email protected]> wrote:
>>>>
>>>>> Hey, guys,
>>>>>
>>>>> I’m also having build and test problems on both my Mac OS X (10.9.2)
>>>>> box and a relatively fresh Ubuntu 12.04 install. On Ubuntu, I’m getting
>>>>> the error that Garry describes (http://pastebin.com/4w3qr11K). I was
>>>>> getting the same error on my Mac, but now I seem to have moved onto a
>>>>> failure in the testBasicMetadataCacheFunctionality test
>>>>> (http://pastebin.com/YNxrNC7q).
>>>>> —T
>>>>>
>>>>> On Mar 3, 2014, at 4:25 PM, Garry Turkington
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Jakob,
>>>>>>
>>>>>> Yep, here's the output:
>>>>>>
>>>>>> devel@vm17:~/samza$ git bisect bad
>>>>>> f50f022c7d0fbe648412c26c9d6dc677e7758006 is the first bad commit
>>>>>> commit f50f022c7d0fbe648412c26c9d6dc677e7758006
>>>>>> Author: Chris Riccomini <[email protected]>
>>>>>> Date: Fri Feb 28 09:26:54 2014 -0800
>>>>>>
>>>>>> SAMZA-142; changelog stores should restore from beginning of stream,
>>>>>> not the end
>>>>>>
>>>>>> Garry
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jakob Homan [mailto:[email protected]]
>>>>>> Sent: 03 March 2014 23:25
>>>>>> To: [email protected]
>>>>>> Subject: Re: TestStatefulTask failures
>>>>>>
>>>>>> Garry, can you run git bisect against the commits for the past few
>>>>>> days on the wheezy box?
>>>>>>
>>>>>>
>>>>>> On Monday, March 3, 2014 at 3:11 PM, Garry Turkington wrote:
>>>>>>
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Posted the test log at :
>>>>>>>
>>>>>>> http://pastebin.com/LFEdfQqX
>>>>>>>
>>>>>>> Highlight is that it is timing out, and indeed line 325 of the test
>>>>>>> is task.awaitMessage. Which seems slightly odd as if there was
>>>>>>> something badly broken with the instantiation of Kafka and sending
>>>>>>> messages to/from it wouldn't we expect failures in the samza-kafka
>>>>>>> tests?
>>>>>>>
>>>>>>> On the Wheezy box this is failing every time.
>>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>> Garry
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>>>> Sent: 03 March 2014 22:55
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: TestStatefulTask failures
>>>>>>>
>>>>>>> Hey Garry,
>>>>>>>
>>>>>>> Master successfully tested on my Mac OSX box with:
>>>>>>>
>>>>>>> $ ./gradlew clean test
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/3/14 2:49 PM, "Chris Riccomini" <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Garry,
>>>>>>>>
>>>>>>>> Hmm. This is alarming.
>>>>>>>>
>>>>>>>> This test is really more of an integration test than a unit test,
>>>>>>>> which makes it a bit trickier to tell why it's failed. It is,
>>>>>>>> however, extraordinarily useful in catching a ton of obscure bugs
>>>>>>>> that sneak through most of the other tests.
>>>>>>>>
>>>>>>>> Questions:
>>>>>>>>
>>>>>>>> 1. What is the error you see in the resulting test logs?
>>>>>>>> 2. Does it ALWAYS fail on your Wheezy box, or just sometimes?
>>>>>>>>
>>>>>>>> I will try and re-run on my end. It's working fine on a branch of
>>>>>>>> mine that was rebased mid-last week, but perhaps something has
>>>>>>>> broken.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/3/14 2:44 PM, "Garry Turkington"
>>>>>>>> <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> Anyone else having issues doing a clean build of master? I was
>>>>>>>>> happily doing rebuilds on a repo that I hadn't pulled from origin
>>>>>>>>> since mid-last week.
>>>>>>>>> Then I did a git pull today and I get the following on each build
>>>>>>>>> attempt:
>>>>>>>>>
>>>>>>>>> org.apache.samza.test.integration.TestStatefulTask > testShouldStartAndRestore FAILED
>>>>>>>>>     java.lang.AssertionError at TestStatefulTask.scala:325
>>>>>>>>>
>>>>>>>>> The slightly curious thing is that if I go do a clone of master on
>>>>>>>>> a different host (Centos 5.2 64-bit) it builds fine but on my
>>>>>>>>> usual development VM (Debian Wheezy 64-bit) the above happens.
>>>>>>>>>
>>>>>>>>> This could be specific to my environment (not the first time!) but
>>>>>>>>> I also know there have been changes around state and that specific
>>>>>>>>> test recently so anyone else seeing odd behaviour?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Garry
>>>>>>>>>
>>>>>>>
>>>>>>> -----
>>>>>>> No virus found in this message.
>>>>>>> Checked by AVG - www.avg.com
>>>>>>> Version: 2014.0.4259 / Virus Database: 3705/7144 - Release Date:
>>>>>>> 03/03/14
