Hey Guys,

Able to reproduce the change log issue. Opening a JIRA and investigating.
https://issues.apache.org/jira/browse/SAMZA-166

TJ, I'll dig into the cache issue afterwards.

Cheers,
Chris

On 3/4/14 2:04 PM, "TJ Giuli" <[email protected]> wrote:

>Sure, Chris,
>
>1.) d38277ff83956f5885dd6596db9c0e15761964c7
>2.) ./gradlew clean test
>3.) It doesn’t happen every time. I just ran three consecutive tests, 2
>failed with different failures and one succeeded.
>Failure 1: http://pastebin.com/YG7KBjJz
>Failure 2: http://pastebin.com/7NqES1rS
>
>Thanks for getting on this!
>—T
>
>On Mar 4, 2014, at 1:37 PM, Chris Riccomini <[email protected]> wrote:
>
>> Hey Guys,
>>
>> Having a look, but nothing yet.
>>
>> Regarding the TestStatefulTask bugs, Martin did find a bug this
>> morning in the SAMZA-142 commit. The issue is that KafkaSystemAdmin
>> can occasionally return empty metadata information for a change-log
>> stream. This results in an NPE later in the TaskStorageManager. The
>> issue is triggered when there is no lead Kafka broker for a given
>> change-log's topic/partition.
>>
>> That said, I don't *think* this should cause a failure in
>> TestStatefulTask, since TestStatefulTask.validateTopics is run before
>> the tests are run, and validateTopics checks to make sure that the
>> metadata is available and there is no error code.
>>
>> As for the testBasicMetadataCacheFunctionality, I haven't seen that
>> issue, and can't reproduce it. TJ, can you send:
>>
>> 1. The git checksum you're working off of.
>> 2. The command you're using to run the test.
>> 3. Does the failure happen every time, or just randomly?
>>
>> Cheers,
>> Chris
>>
>> On 3/3/14 11:57 PM, "TJ Giuli" <[email protected]> wrote:
>>
>>> Hey, guys,
>>>
>>> I’m also having build and test problems on both my Mac OS X (10.9.2)
>>> box and a relatively fresh Ubuntu 12.04 install. On Ubuntu, I’m
>>> getting the error that Garry describes (http://pastebin.com/4w3qr11K).
>>> I was getting the same error on my Mac, but now I seem to have moved
>>> onto a failure in the testBasicMetadataCacheFunctionality test
>>> (http://pastebin.com/YNxrNC7q).
>>> —T
>>>
>>> On Mar 3, 2014, at 4:25 PM, Garry Turkington
>>> <[email protected]> wrote:
>>>
>>>> Jakob,
>>>>
>>>> Yep, here's the output:
>>>>
>>>> devel@vm17:~/samza$ git bisect bad
>>>> f50f022c7d0fbe648412c26c9d6dc677e7758006 is the first bad commit
>>>> commit f50f022c7d0fbe648412c26c9d6dc677e7758006
>>>> Author: Chris Riccomini <[email protected]>
>>>> Date: Fri Feb 28 09:26:54 2014 -0800
>>>>
>>>> SAMZA-142; changelog stores should restore from beginning of stream,
>>>> not the end
>>>>
>>>> Garry
>>>>
>>>> -----Original Message-----
>>>> From: Jakob Homan [mailto:[email protected]]
>>>> Sent: 03 March 2014 23:25
>>>> To: [email protected]
>>>> Subject: Re: TestStatefulTask failures
>>>>
>>>> Garry, can you run git bisect against the commits for the past few
>>>> days on the wheezy box?
>>>>
>>>>
>>>> On Monday, March 3, 2014 at 3:11 PM, Garry Turkington wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> Posted the test log at:
>>>>>
>>>>> http://pastebin.com/LFEdfQqX
>>>>>
>>>>> Highlight is that it is timing out, and indeed line 325 of the test
>>>>> is task.awaitMessage. Which seems slightly odd as if there was
>>>>> something badly broken with the instantiation of Kafka and sending
>>>>> messages to/from it wouldn't we expect failures in the samza-kafka
>>>>> tests?
>>>>>
>>>>> On the Wheezy box this is failing every time.
>>>>>
>>>>> Regards
>>>>> Garry
>>>>>
>>>>> -----Original Message-----
>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>> Sent: 03 March 2014 22:55
>>>>> To: [email protected]
>>>>> Subject: Re: TestStatefulTask failures
>>>>>
>>>>> Hey Garry,
>>>>>
>>>>> Master successfully tested on my Mac OSX box with:
>>>>>
>>>>> $ ./gradlew clean test
>>>>>
>>>>> Cheers,
>>>>> Chris
>>>>>
>>>>> On 3/3/14 2:49 PM, "Chris Riccomini" <[email protected]> wrote:
>>>>>
>>>>>> Hey Garry,
>>>>>>
>>>>>> Hmm. This is alarming.
>>>>>>
>>>>>> This test is really more of an integration test than a unit test,
>>>>>> which makes it a bit trickier to tell why it's failed. It is,
>>>>>> however, extraordinarily useful in catching a ton of obscure bugs
>>>>>> that sneak through most of the other tests.
>>>>>>
>>>>>> Questions:
>>>>>>
>>>>>> 1. What is the error you see in the resulting test logs?
>>>>>> 2. Does it ALWAYS fail on your Wheezy box, or just sometimes?
>>>>>>
>>>>>> I will try and re-run on my end. It's working fine on a branch of
>>>>>> mine that was rebased mid-last week, but perhaps something has
>>>>>> broken.
>>>>>>
>>>>>> Cheers,
>>>>>> Chris
>>>>>>
>>>>>> On 3/3/14 2:44 PM, "Garry Turkington"
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> Anyone else having issues doing a clean build of master? I was
>>>>>>> happily doing rebuilds on a repo that I hadn't pulled from origin
>>>>>>> since mid-last week. Then I did a git pull today and I get the
>>>>>>> following on each build attempt:
>>>>>>>
>>>>>>> org.apache.samza.test.integration.TestStatefulTask >
>>>>>>> testShouldStartAndRestore FAILED
>>>>>>> java.lang.AssertionError at TestStatefulTask.scala:325
>>>>>>>
>>>>>>> The slightly curious thing is that if I go do a clone of master on
>>>>>>> a different host (Centos 5.2 64-bit) it builds fine but on my
>>>>>>> usual development VM (Debian Wheezy 64-bit) the above happens.
>>>>>>>
>>>>>>> This could be specific to my environment (not the first time!) but
>>>>>>> I also know there have been changes around state and that specific
>>>>>>> test recently so anyone else seeing odd behaviour?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Garry
>>>>>
>>>>> -----
>>>>> No virus found in this message.
>>>>> Checked by AVG - www.avg.com
>>>>> Version: 2014.0.4259 / Virus Database: 3705/7144 - Release Date: 03/03/14
