Hey Guys,

Able to reproduce the change log issue. Opening a JIRA and investigating.

  https://issues.apache.org/jira/browse/SAMZA-166


TJ, I'll dig into the cache issue afterwards.

Cheers,
Chris

On 3/4/14 2:04 PM, "TJ Giuli" <[email protected]> wrote:

>Sure, Chris, 
>
>1.)  d38277ff83956f5885dd6596db9c0e15761964c7
>2.)  ./gradlew clean test
>3.)  It doesn’t happen every time.  I just ran three consecutive tests: two
>failed with different failures and one succeeded.
>Failure 1: http://pastebin.com/YG7KBjJz
>Failure 2: http://pastebin.com/7NqES1rS
>
>Thanks for getting on this!
>—T
>
>On Mar 4, 2014, at 1:37 PM, Chris Riccomini <[email protected]>
>wrote:
>
>> Hey Guys,
>> 
>> Having a look, but nothing yet.
>> 
>> Regarding the TestStatefulTask bugs, Martin did find a bug this morning in
>> the SAMZA-142 commit. The issue is that KafkaSystemAdmin can occasionally
>> return empty metadata information for a change-log stream. This results in
>> an NPE later in the TaskStorageManager. The issue is triggered when there
>> is no lead Kafka broker for a given change-log's topic/partition.
>> 
>> That said, I don't *think* this should cause a failure in
>> TestStatefulTask, since TestStatefulTask.validateTopics is run before the
>> tests are run, and validateTopics checks to make sure that the metadata is
>> available and there is no error code.
>> 
>> As for the testBasicMetadataCacheFunctionality, I haven't seen that issue,
>> and can't reproduce it. TJ, can you send:
>> 
>> 1. The git checksum you're working off of.
>> 2. The command you're using to run the test.
>> 3. Whether the failure happens every time, or just randomly.
>> 
>> Cheers,
>> Chris
>> 
>> On 3/3/14 11:57 PM, "TJ Giuli" <[email protected]> wrote:
>> 
>>> Hey, guys,
>>> 
>>> I'm also having build and test problems on both my Mac OS X (10.9.2) box
>>> and a relatively fresh Ubuntu 12.04 install.  On Ubuntu, I'm getting the
>>> error that Garry describes (http://pastebin.com/4w3qr11K).  I was getting
>>> the same error on my Mac, but now I seem to have moved onto a failure in
>>> the testBasicMetadataCacheFunctionality test
>>> (http://pastebin.com/YNxrNC7q).
>>> -T
>>> 
>>> On Mar 3, 2014, at 4:25 PM, Garry Turkington
>>> <[email protected]> wrote:
>>> 
>>>> Jakob,
>>>> 
>>>> Yep, here's the output:
>>>> 
>>>> devel@vm17:~/samza$ git bisect bad
>>>> f50f022c7d0fbe648412c26c9d6dc677e7758006 is the first bad commit
>>>> commit f50f022c7d0fbe648412c26c9d6dc677e7758006
>>>> Author: Chris Riccomini <[email protected]>
>>>> Date:   Fri Feb 28 09:26:54 2014 -0800
>>>> 
>>>>   SAMZA-142; changelog stores should restore from beginning of stream,
>>>> not the end
>>>> 
>>>> Garry
>>>> 
>>>> -----Original Message-----
>>>> From: Jakob Homan [mailto:[email protected]]
>>>> Sent: 03 March 2014 23:25
>>>> To: [email protected]
>>>> Subject: Re: TestStatefulTask failures
>>>> 
>>>> Garry, can you run git bisect against the commits for the past few days
>>>> on the wheezy box?
>>>> 
>>>> 
>>>> On Monday, March 3, 2014 at 3:11 PM, Garry Turkington wrote:
>>>> 
>>>>> Hi Chris,
>>>>> 
>>>>> Posted the test log at:
>>>>> 
>>>>> http://pastebin.com/LFEdfQqX
>>>>> 
>>>>> The highlight is that it is timing out, and indeed line 325 of the test
>>>>> is task.awaitMessage. That seems slightly odd: if something were badly
>>>>> broken with the instantiation of Kafka and sending messages to/from
>>>>> it, wouldn't we expect failures in the samza-kafka tests?
>>>>> 
>>>>> On the Wheezy box this is failing every time.
>>>>> 
>>>>> 
>>>>> Regards
>>>>> Garry
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>> Sent: 03 March 2014 22:55
>>>>> To: [email protected]
>>>>> Subject: Re: TestStatefulTask failures
>>>>> 
>>>>> Hey Garry,
>>>>> 
>>>>> Master successfully tested on my Mac OSX box with:
>>>>> 
>>>>> $ ./gradlew clean test
>>>>> 
>>>>> Cheers,
>>>>> Chris
>>>>> 
>>>>> On 3/3/14 2:49 PM, "Chris Riccomini" <[email protected]> wrote:
>>>>> 
>>>>>> Hey Garry,
>>>>>> 
>>>>>> Hmm. This is alarming.
>>>>>> 
>>>>>> This test is really more of an integration test than a unit test,
>>>>>> which makes it a bit trickier to tell why it's failed. It is,
>>>>>> however, extraordinarily useful in catching a ton of obscure bugs
>>>>>> that sneak through most of the other tests.
>>>>>> 
>>>>>> Questions:
>>>>>> 
>>>>>> 1. What is the error you see in the resulting test logs?
>>>>>> 2. Does it ALWAYS fail on your Wheezy box, or just sometimes?
>>>>>> 
>>>>>> I will try and re-run on my end. It's working fine on a branch of
>>>>>> mine that was rebased mid-last week, but perhaps something has
>>>>>> broken.
>>>>>> 
>>>>>> Cheers,
>>>>>> Chris
>>>>>> 
>>>>>> On 3/3/14 2:44 PM, "Garry Turkington"
>>>>>> <[email protected]>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi guys,
>>>>>>> 
>>>>>>> Anyone else having issues doing a clean build of master? I was
>>>>>>> happily doing rebuilds on a repo that I hadn't pulled from origin
>>>>>>> since mid-last week. Then I did a git pull today and I get the
>>>>>>> following on each build attempt:
>>>>>>> 
>>>>>>> org.apache.samza.test.integration.TestStatefulTask > testShouldStartAndRestore FAILED
>>>>>>>     java.lang.AssertionError at TestStatefulTask.scala:325
>>>>>>> 
>>>>>>> The slightly curious thing is that if I go do a clone of master on
>>>>>>> a different host (Centos 5.2 64-bit) it builds fine but on my
>>>>>>> usual development VM (Debian Wheezy 64-bit) the above happens.
>>>>>>> 
>>>>>>> This could be specific to my environment (not the first time!), but
>>>>>>> I also know there have been changes around state and that specific
>>>>>>> test recently, so is anyone else seeing odd behaviour?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> Garry
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -----
>>>>> No virus found in this message.
>>>>> Checked by AVG - www.avg.com
>>>>> Version: 2014.0.4259 / Virus Database: 3705/7144 - Release Date:
>>>>> 03/03/14
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 
>
