OK, yes, things look good on my end when I compile 12594fb710 with Java 6 on Mac OS X. Do you guys have a better sense of whether the Java 7 issues are confined to the build, or is the runtime stability of Samza on Java 7 also in question? Should I run my Samza tasks exclusively on Java 6? Thanks for looking into this!

—T

On Mar 5, 2014, at 1:10 PM, Garry Turkington <[email protected]> wrote:

> Woot, yes, I can confirm this patch fixes things on my host. Thanks Chris!
>
> Regarding the failings in TestTopicMetadataCache this is Java 7 related,
> something I need to update SAMZA-16 about, this is an intermittent build failure
> that we don't see on JDK6.
>
> Garry
>
> -----Original Message-----
> From: Chris Riccomini [mailto:[email protected]]
> Sent: 05 March 2014 20:10
> To: [email protected]
> Subject: Re: TestStatefulTask failures
>
> Hey Guys,
>
> I have a patch up at:
>
> https://issues.apache.org/jira/browse/SAMZA-166
>
> Could you please apply and see if this fixes your problem? I ran the
> TestStatefulTask test for an hour, and it passed every time.
>
> TJ, regarding your cache issue, can you try running with Java 1.6 instead of
> 1.7, and see if that fixes the issue? Samza has had known issues with Java
> 1.7.
>
> Cheers,
> Chris
>
> On 3/4/14 4:12 PM, "Jakob Homan" <[email protected]> wrote:
>
>> Hey TJ-
>> Java 1.7 is known to be flaky right now. Garry had planned on
>> taking a look at the issue. Not sure where he is on this. We
>> definitely want to get better 1.7 support.
>> -jg
>>
>> On Tue, Mar 4, 2014 at 2:27 PM, TJ Giuli <[email protected]> wrote:
>>
>>> Great, thanks Chris.
>>>
>>> Also, I should mention that when I build on my Mac, this is sprinkled
>>> throughout the build output:
>>>
>>> objc[52666]: Class JavaLaunchHelper is implemented in both
>>> /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/bin/java
>>> and
>>> /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre/lib/libinstrument.dylib.
>>> One of the two will be used. Which one is undefined.
>>>
>>> —T
>>> On Mar 4, 2014, at 2:11 PM, Chris Riccomini <[email protected]> wrote:
>>>
>>>> Hey Guys,
>>>>
>>>> Able to reproduce the change log issue. Opening a JIRA and
>>>> investigating.
>>>>
>>>> https://issues.apache.org/jira/browse/SAMZA-166
>>>>
>>>> TJ, I'll dig into the cache issue afterwards.
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> On 3/4/14 2:04 PM, "TJ Giuli" <[email protected]> wrote:
>>>>
>>>>> Sure, Chris,
>>>>>
>>>>> 1.) d38277ff83956f5885dd6596db9c0e15761964c7
>>>>> 2.) ./gradlew clean test
>>>>> 3.) It doesn’t happen every time. I just ran three consecutive tests, 2
>>>>> failed with different failures and one succeeded.
>>>>> Failure 1: http://pastebin.com/YG7KBjJz
>>>>> Failure 2: http://pastebin.com/7NqES1rS
>>>>>
>>>>> Thanks for getting on this!
>>>>> —T
>>>>>
>>>>> On Mar 4, 2014, at 1:37 PM, Chris Riccomini <[email protected]> wrote:
>>>>>
>>>>>> Hey Guys,
>>>>>>
>>>>>> Having a look, but nothing yet.
>>>>>>
>>>>>> Regarding the TestStatefulTask bugs, Martin did find a bug this morning in
>>>>>> the SAMZA-142 commit. The issue is that KafkaSystemAdmin can
>>>>>> occasionally return empty metadata information for a change-log
>>>>>> stream. This results in an NPE later in the TaskStorageManager. The
>>>>>> issue is triggered when there is no lead Kafka broker for a given
>>>>>> change-log's topic/partition.
>>>>>>
>>>>>> That said, I don't *think* this should cause a failure in
>>>>>> TestStatefulTask, since TestStatefulTask.validateTopics is run before the
>>>>>> tests are run, and validateTopics checks to make sure that the metadata is
>>>>>> available and there is no error code.
>>>>>>
>>>>>> As for the testBasicMetadataCacheFunctionality, I haven't seen
>>>>>> that issue, and can't reproduce it. TJ, can you send:
>>>>>>
>>>>>> 1. The git checksum you're working off of.
>>>>>> 2. The command you're using to run the test.
>>>>>> 3. Does the failure happen every time, or just randomly?
>>>>>>
>>>>>> Cheers,
>>>>>> Chris
>>>>>>
>>>>>> On 3/3/14 11:57 PM, "TJ Giuli" <[email protected]> wrote:
>>>>>>
>>>>>>> Hey, guys,
>>>>>>>
>>>>>>> I'm also having build and test problems on both my Mac OS X (10.9.2) box
>>>>>>> and a relatively fresh Ubuntu 12.04 install. On Ubuntu, I'm getting the
>>>>>>> error that Garry describes (http://pastebin.com/4w3qr11K). I
>>>>>>> was getting the same error on my Mac, but now I seem to have
>>>>>>> moved onto a failure in the testBasicMetadataCacheFunctionality test
>>>>>>> (http://pastebin.com/YNxrNC7q).
>>>>>>> —T
>>>>>>>
>>>>>>> On Mar 3, 2014, at 4:25 PM, Garry Turkington
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Jakob,
>>>>>>>>
>>>>>>>> Yep, here's the output:
>>>>>>>>
>>>>>>>> devel@vm17:~/samza$ git bisect bad
>>>>>>>> f50f022c7d0fbe648412c26c9d6dc677e7758006 is the first bad commit
>>>>>>>> commit f50f022c7d0fbe648412c26c9d6dc677e7758006
>>>>>>>> Author: Chris Riccomini <[email protected]>
>>>>>>>> Date: Fri Feb 28 09:26:54 2014 -0800
>>>>>>>>
>>>>>>>> SAMZA-142; changelog stores should restore from beginning of stream,
>>>>>>>> not the end
>>>>>>>>
>>>>>>>> Garry
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jakob Homan [mailto:[email protected]]
>>>>>>>> Sent: 03 March 2014 23:25
>>>>>>>> To: [email protected]
>>>>>>>> Subject: Re: TestStatefulTask failures
>>>>>>>>
>>>>>>>> Garry, can you run git bisect against the commits for the past
>>>>>>>> few days on the wheezy box?
>>>>>>>>
>>>>>>>> On Monday, March 3, 2014 at 3:11 PM, Garry Turkington wrote:
>>>>>>>>
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> Posted the test log at :
>>>>>>>>>
>>>>>>>>> http://pastebin.com/LFEdfQqX
>>>>>>>>>
>>>>>>>>> Highlight is that it is timing out, and indeed line 325 of the test is
>>>>>>>>> task.awaitMessage.
>>>>>>>>> Which seems slightly odd as if there was something
>>>>>>>>> badly broken with the instantiation of Kafka and sending
>>>>>>>>> messages to/from it wouldn't we expect failures in the samza-kafka
>>>>>>>>> tests?
>>>>>>>>>
>>>>>>>>> On the Wheezy box this is failing every time.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Garry
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>>>>>> Sent: 03 March 2014 22:55
>>>>>>>>> To: [email protected]
>>>>>>>>> Subject: Re: TestStatefulTask failures
>>>>>>>>>
>>>>>>>>> Hey Garry,
>>>>>>>>>
>>>>>>>>> Master successfully tested on my Mac OSX box with:
>>>>>>>>>
>>>>>>>>> $ ./gradlew clean test
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/3/14 2:49 PM, "Chris Riccomini" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hey Garry,
>>>>>>>>>>
>>>>>>>>>> Hmm. This is alarming.
>>>>>>>>>>
>>>>>>>>>> This test is really more of an integration test than a unit test,
>>>>>>>>>> which makes it a bit trickier to tell why it's failed. It is,
>>>>>>>>>> however, extraordinarily useful in catching a ton of obscure bugs
>>>>>>>>>> that sneak through most of the other tests.
>>>>>>>>>>
>>>>>>>>>> Questions:
>>>>>>>>>>
>>>>>>>>>> 1. What is the error you see in the resulting test logs?
>>>>>>>>>> 2. Does it ALWAYS fail on your Wheezy box, or just sometimes?
>>>>>>>>>>
>>>>>>>>>> I will try and re-run on my end. It's working fine on a
>>>>>>>>>> branch of mine that was rebased mid-last week, but perhaps something
>>>>>>>>>> has broken.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/3/14 2:44 PM, "Garry Turkington"
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi guys,
>>>>>>>>>>>
>>>>>>>>>>> Anyone else having issues doing a clean build of master? I
>>>>>>>>>>> was happily doing rebuilds on a repo that I hadn't pulled
>>>>>>>>>>> from origin since mid-last week.
>>>>>>>>>>> Then I did a git pull today and I get
>>>>>>>>>>> the following on each build attempt:
>>>>>>>>>>>
>>>>>>>>>>> org.apache.samza.test.integration.TestStatefulTask > testShouldStartAndRestore FAILED
>>>>>>>>>>> java.lang.AssertionError at TestStatefulTask.scala:325
>>>>>>>>>>>
>>>>>>>>>>> The slightly curious thing is that if I go do a clone of master on
>>>>>>>>>>> a different host (Centos 5.2 64-bit) it builds fine but on
>>>>>>>>>>> my usual development VM (Debian Wheezy 64-bit) the above happens.
>>>>>>>>>>>
>>>>>>>>>>> This could be specific to my environment (not the first
>>>>>>>>>>> time!) but I also know there have been changes around state and that
>>>>>>>>>>> specific test recently so anyone else seeing odd behaviour?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Garry
>>>>>>>>>
>>>>>>>>> -----
>>>>>>>>> No virus found in this message.
>>>>>>>>> Checked by AVG - www.avg.com
>>>>>>>>> Version: 2014.0.4259 / Virus Database: 3705/7144 - Release Date:
>>>>>>>>> 03/03/14
