Hey Guys,

I have a patch up at:

https://issues.apache.org/jira/browse/SAMZA-166

Could you please apply and see if this fixes your problem? I ran the TestStatefulTask test for an hour, and it passed every time.

TJ, regarding your cache issue, can you try running with Java 1.6 instead of 1.7, and see if that fixes the issue? Samza has had known issues with Java 1.7.

Cheers,
Chris

On 3/4/14 4:12 PM, "Jakob Homan" <[email protected]> wrote:

>Hey TJ-
>  Java 1.7 is known to be flaky right now. Garry had planned on taking a
>look at the issue. Not sure where he is on this. We definitely want to
>get better 1.7 support.
>-jg
>
>
>On Tue, Mar 4, 2014 at 2:27 PM, TJ Giuli <[email protected]> wrote:
>
>> Great, thanks Chris.
>>
>> Also, I should mention that when I build on my Mac, this is sprinkled
>> throughout the build output:
>>
>> objc[52666]: Class JavaLaunchHelper is implemented in both
>> /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/bin/java
>> and
>> /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre/lib/libinstrument.dylib.
>> One of the two will be used. Which one is undefined.
>>
>> -T
>> On Mar 4, 2014, at 2:11 PM, Chris Riccomini <[email protected]> wrote:
>>
>> > Hey Guys,
>> >
>> > Able to reproduce the change log issue. Opening a JIRA and investigating.
>> >
>> > https://issues.apache.org/jira/browse/SAMZA-166
>> >
>> > TJ, I'll dig into the cache issue afterwards.
>> >
>> > Cheers,
>> > Chris
>> >
>> > On 3/4/14 2:04 PM, "TJ Giuli" <[email protected]> wrote:
>> >
>> >> Sure, Chris,
>> >>
>> >> 1.) d38277ff83956f5885dd6596db9c0e15761964c7
>> >> 2.) ./gradlew clean test
>> >> 3.) It doesn't happen every time. I just ran three consecutive tests, 2
>> >> failed with different failures and one succeeded.
>> >> Failure 1: http://pastebin.com/YG7KBjJz
>> >> Failure 2: http://pastebin.com/7NqES1rS
>> >>
>> >> Thanks for getting on this!
>> >> -T
>> >>
>> >> On Mar 4, 2014, at 1:37 PM, Chris Riccomini <[email protected]> wrote:
>> >>
>> >>> Hey Guys,
>> >>>
>> >>> Having a look, but nothing yet.
>> >>>
>> >>> Regarding the TestStatefulTask bugs, Martin did find a bug this morning in
>> >>> the SAMZA-142 commit. The issue is that KafkaSystemAdmin can occasionally
>> >>> return empty metadata information for a change-log stream. This results in
>> >>> an NPE later in the TaskStorageManager. The issue is triggered when there
>> >>> is no lead Kafka broker for a given change-log's topic/partition.
>> >>>
>> >>> That said, I don't *think* this should cause a failure in
>> >>> TestStatefulTask, since TestStatefulTask.validateTopics is run before the
>> >>> tests are run, and validateTopics checks to make sure that the metadata is
>> >>> available and there is no error code.
>> >>>
>> >>> As for the testBasicMetadataCacheFunctionality, I haven't seen that issue,
>> >>> and can't reproduce it. TJ, can you send:
>> >>>
>> >>> 1. The git checksum you're working off of.
>> >>> 2. The command you're using to run the test.
>> >>> 3. Does the failure happen every time, or just randomly?
>> >>>
>> >>> Cheers,
>> >>> Chris
>> >>>
>> >>> On 3/3/14 11:57 PM, "TJ Giuli" <[email protected]> wrote:
>> >>>
>> >>>> Hey, guys,
>> >>>>
>> >>>> I'm also having build and test problems on both my Mac OS X (10.9.2) box
>> >>>> and a relatively fresh Ubuntu 12.04 install. On Ubuntu, I'm getting the
>> >>>> error that Garry describes (http://pastebin.com/4w3qr11K). I was getting
>> >>>> the same error on my Mac, but now I seem to have moved onto a failure in
>> >>>> the testBasicMetadataCacheFunctionality test
>> >>>> (http://pastebin.com/YNxrNC7q).
>> >>>> -T
>> >>>>
>> >>>> On Mar 3, 2014, at 4:25 PM, Garry Turkington
>> >>>> <[email protected]> wrote:
>> >>>>
>> >>>>> Jakob,
>> >>>>>
>> >>>>> Yep, here's the output:
>> >>>>>
>> >>>>> devel@vm17:~/samza$ git bisect bad
>> >>>>> f50f022c7d0fbe648412c26c9d6dc677e7758006 is the first bad commit
>> >>>>> commit f50f022c7d0fbe648412c26c9d6dc677e7758006
>> >>>>> Author: Chris Riccomini <[email protected]>
>> >>>>> Date: Fri Feb 28 09:26:54 2014 -0800
>> >>>>>
>> >>>>> SAMZA-142; changelog stores should restore from beginning of stream,
>> >>>>> not the end
>> >>>>>
>> >>>>> Garry
>> >>>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Jakob Homan [mailto:[email protected]]
>> >>>>> Sent: 03 March 2014 23:25
>> >>>>> To: [email protected]
>> >>>>> Subject: Re: TestStatefulTask failures
>> >>>>>
>> >>>>> Garry, can you run git bisect against the commits for the past few days
>> >>>>> on the wheezy box?
>> >>>>>
>> >>>>>
>> >>>>> On Monday, March 3, 2014 at 3:11 PM, Garry Turkington wrote:
>> >>>>>
>> >>>>>> Hi Chris,
>> >>>>>>
>> >>>>>> Posted the test log at:
>> >>>>>>
>> >>>>>> http://pastebin.com/LFEdfQqX
>> >>>>>>
>> >>>>>> Highlight is that it is timing out, and indeed line 325 of the test is
>> >>>>>> task.awaitMessage. Which seems slightly odd as if there was something
>> >>>>>> badly broken with the instantiation of Kafka and sending messages
>> >>>>>> to/from it wouldn't we expect failures in the samza-kafka tests?
>> >>>>>>
>> >>>>>> On the Wheezy box this is failing every time.
>> >>>>>>
>> >>>>>> Regards
>> >>>>>> Garry
>> >>>>>>
>> >>>>>> -----Original Message-----
>> >>>>>> From: Chris Riccomini [mailto:[email protected]]
>> >>>>>> Sent: 03 March 2014 22:55
>> >>>>>> To: [email protected]
>> >>>>>> Subject: Re: TestStatefulTask failures
>> >>>>>>
>> >>>>>> Hey Garry,
>> >>>>>>
>> >>>>>> Master successfully tested on my Mac OSX box with:
>> >>>>>>
>> >>>>>> $ ./gradlew clean test
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Chris
>> >>>>>>
>> >>>>>> On 3/3/14 2:49 PM, "Chris Riccomini" <[email protected]> wrote:
>> >>>>>>
>> >>>>>>> Hey Garry,
>> >>>>>>>
>> >>>>>>> Hmm. This is alarming.
>> >>>>>>>
>> >>>>>>> This test is really more of an integration test than a unit test,
>> >>>>>>> which makes it a bit trickier to tell why it's failed. It is,
>> >>>>>>> however, extraordinarily useful in catching a ton of obscure bugs
>> >>>>>>> that sneak through most of the other tests.
>> >>>>>>>
>> >>>>>>> Questions:
>> >>>>>>>
>> >>>>>>> 1. What is the error you see in the resulting test logs?
>> >>>>>>> 2. Does it ALWAYS fail on your Wheezy box, or just sometimes?
>> >>>>>>>
>> >>>>>>> I will try and re-run on my end. It's working fine on a branch of
>> >>>>>>> mine that was rebased mid-last week, but perhaps something has
>> >>>>>>> broken.
>> >>>>>>>
>> >>>>>>> Cheers,
>> >>>>>>> Chris
>> >>>>>>>
>> >>>>>>> On 3/3/14 2:44 PM, "Garry Turkington"
>> >>>>>>> <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>>> Hi guys,
>> >>>>>>>>
>> >>>>>>>> Anyone else having issues doing a clean build of master? I was
>> >>>>>>>> happily doing rebuilds on a repo that I hadn't pulled from origin
>> >>>>>>>> since mid-last week.
>> >>>>>>>> Then I did a git pull today and I get the following on each build
>> >>>>>>>> attempt:
>> >>>>>>>>
>> >>>>>>>> org.apache.samza.test.integration.TestStatefulTask > testShouldStartAndRestore FAILED
>> >>>>>>>>     java.lang.AssertionError at TestStatefulTask.scala:325
>> >>>>>>>>
>> >>>>>>>> The slightly curious thing is that if I go do a clone of master on
>> >>>>>>>> a different host (Centos 5.2 64-bit) it builds fine but on my
>> >>>>>>>> usual development VM (Debian Wheezy 64-bit) the above happens.
>> >>>>>>>>
>> >>>>>>>> This could be specific to my environment (not the first time!) but
>> >>>>>>>> I also know there have been changes around state and that specific
>> >>>>>>>> test recently so anyone else seeing odd behaviour?
>> >>>>>>>>
>> >>>>>>>> Thanks
>> >>>>>>>> Garry
