Re: Travis CI

2014-03-29 Thread Nan Zhu
Hi,   

Is the migration from Jenkins to Travis finished?

Based on what I have observed over the past few days, I think Travis is actually 
not stable (and Jenkins has become unstable too... :-( ). I’m actively working on 
two PRs related to DAGScheduler, and this is what I saw:

Problems on Travis:

1. the test “large number of iterations” in BagelSuite sometimes fails, because 
it doesn’t output anything within 10 seconds

2. hive/test is usually aborted because it doesn’t output anything within 10 
minutes

3. a test case in Streaming.CheckpointSuite failed  

4. hive/test didn’t finish in 50 minutes, and was aborted

Problems on Jenkins:

1. the build didn’t finish in 90 minutes, and the process was aborted

2. the same as item 3 in the Travis list

Some of these problems appeared on Jenkins months ago, but not so often.

I’m not complaining; I know that the admins are working hard to keep everything 
running well for the community.

I’m just reporting what I saw, in the hope that it helps you identify the problems.

Thank you  

--  
Nan Zhu


On Tuesday, March 25, 2014 at 10:11 PM, Patrick Wendell wrote:

 Yeah, it's been a little bit slow lately because of a high error rate in
 interactions with the GitHub API. Unfortunately we are pretty slammed
 for the release and haven't had a ton of time to do further debugging.
  
 - Patrick
  



Re: Travis CI

2014-03-29 Thread Michael Armbrust

 Is the migration from Jenkins to Travis finished?


It is not finished and really at this point it is only something we are
considering, not something that will happen for sure.  We turned it on in
addition to Jenkins so that we could start finding issues exactly like the
ones you described below to determine if Travis is going to be a viable
option.

Basically it seems to me that the Travis environment is a little less
predictable (probably because of virtualization) and this is pointing out
some existing flakiness in the tests.

If there are tests that are regularly flaky we should probably file JIRAs
so they can be fixed or switched off.  If you have seen a test fail 2-3
times and then pass with no changes, I'd say go ahead and file an issue for
it (others should feel free to chime in if we want some other process here).
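
The rule of thumb above (a test that fails 2-3 times and then passes with no
changes) could be checked mechanically. A minimal Python sketch, assuming a
hypothetical list of (commit, test name, passed) records collected from CI
runs; the record format, the function name, and the threshold are illustrative
and not part of any existing Spark tooling:

```python
from collections import defaultdict


def find_flaky_tests(runs, min_failures=2):
    """Return the sorted names of tests that look flaky.

    `runs` is a hypothetical list of (commit, test_name, passed) tuples.
    A test counts as flaky when, for a single commit (so no code changes),
    it failed at least `min_failures` times and also passed at least once.
    The order of the runs is not taken into account.
    """
    # (commit, test_name) -> [number of failures, number of passes]
    outcomes = defaultdict(lambda: [0, 0])
    for commit, test_name, passed in runs:
        outcomes[(commit, test_name)][1 if passed else 0] += 1

    flaky = {
        test_name
        for (_, test_name), (failures, passes) in outcomes.items()
        if failures >= min_failures and passes >= 1
    }
    return sorted(flaky)


# Made-up example data: the commit ids and results are placeholders.
runs = [
    ("abc123", "BagelSuite: large number of iterations", False),
    ("abc123", "BagelSuite: large number of iterations", False),
    ("abc123", "BagelSuite: large number of iterations", True),
    ("abc123", "CheckpointSuite", True),
    ("def456", "CheckpointSuite", False),
]
print(find_flaky_tests(runs))
```

Grouping by commit is what separates a flaky test from one that was simply
fixed by a later change; anything the function returns would be a candidate
for a JIRA.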

A few more specific comments inline below.


 2. hive/test usually aborted because it doesn't output anything within 10
 minutes


Hmm, this is a little confusing.  Do you have a pointer to this one?  Was
there any other error?


 4. hive/test didn't finish in 50 minutes, and was aborted


Here I think the right thing to do is probably to break the Hive tests in two
and run them in parallel.  There is already machinery for doing this; we
just need to turn the options on in the .travis.yml to make it happen.  This
is only going to get more critical as we whitelist more Hive tests.  We
also talked about checking the PR and skipping the Hive tests when there
have been no changes in catalyst/sql/hive.  I'm okay with this plan; we just
need to find someone with time to implement it.
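
Both ideas above are easy to prototype. A rough Python sketch, assuming the
changed file paths of a PR and the names of the Hive test suites are already
available as plain lists; the directory prefixes, function names, and suite
names below are illustrative assumptions, not the existing machinery:

```python
# Assumed locations of the catalyst/sql/hive code; adjust to the real layout.
HIVE_RELATED_PREFIXES = ("sql/catalyst/", "sql/core/", "sql/hive/")


def should_run_hive_tests(changed_paths, prefixes=HIVE_RELATED_PREFIXES):
    """Return True if any changed file lives under one of the prefixes."""
    return any(path.startswith(prefixes) for path in changed_paths)


def split_into_shards(suites, num_shards=2):
    """Deal suites round-robin into `num_shards` lists to run in parallel."""
    return [suites[i::num_shards] for i in range(num_shards)]


# A PR that only touches the scheduler would skip the Hive tests.
changed = ["core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala"]
print(should_run_hive_tests(changed))

# Placeholder suite names, split into two parallel jobs.
suites = ["SuiteA", "SuiteB", "SuiteC", "SuiteD", "SuiteE"]
print(split_into_shards(suites))
```

Round-robin splitting balances the number of suites per job, not their running
time; if a few suites dominate the 50 minutes, the split would need to take
measured durations into account.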


Travis CI

2014-03-25 Thread Michael Armbrust
Just a quick note to everyone that Patrick and I are playing around with
Travis CI on the Spark GitHub repository.  For now, Travis does not run all
of the test cases, so it will only be turned on experimentally.  Long term it
looks like Travis might give better integration with GitHub, so we are
going to see if it is feasible to get all of our tests running on it.

*Jenkins remains the reference CI and should be consulted before merging
pull requests, independent of what Travis says.*

If you have any questions or want to help out with the investigation, let
me know!

Michael


Re: Travis CI

2014-03-25 Thread Nan Zhu
I assume Jenkins is not working now? 

Best, 

-- 
Nan Zhu



On Tuesday, March 25, 2014 at 6:42 PM, Michael Armbrust wrote:

 Just a quick note to everyone that Patrick and I are playing around with
 Travis CI on the Spark GitHub repository. For now, Travis does not run all
 of the test cases, so it will only be turned on experimentally. Long term it
 looks like Travis might give better integration with GitHub, so we are
 going to see if it is feasible to get all of our tests running on it.
 
 *Jenkins remains the reference CI and should be consulted before merging
 pull requests, independent of what Travis says.*
 
 If you have any questions or want to help out with the investigation, let
 me know!
 
 Michael 



Re: Travis CI

2014-03-25 Thread Patrick Wendell
That's not correct - like Michael said the Jenkins build remains the
reference build for now.

On Tue, Mar 25, 2014 at 7:03 PM, Nan Zhu zhunanmcg...@gmail.com wrote:
 I assume Jenkins is not working now?

 Best,

 --
 Nan Zhu






Re: Travis CI

2014-03-25 Thread Nan Zhu
I just found that Jenkins has not been working since this afternoon.

For one PR, the first build failed after 90 minutes; the second has 
been running for more than 2 hours with no result returned.

Best, 

-- 
Nan Zhu



On Tuesday, March 25, 2014 at 10:06 PM, Patrick Wendell wrote:

 That's not correct - like Michael said the Jenkins build remains the
 reference build for now.
 



Re: Travis CI

2014-03-25 Thread Patrick Wendell
Yeah, it's been a little bit slow lately because of a high error rate in
interactions with the GitHub API. Unfortunately we are pretty slammed
for the release and haven't had a ton of time to do further debugging.

- Patrick

On Tue, Mar 25, 2014 at 7:13 PM, Nan Zhu zhunanmcg...@gmail.com wrote:
 I just found that Jenkins has not been working since this afternoon.

 For one PR, the first build failed after 90 minutes; the second has
 been running for more than 2 hours with no result returned.

 Best,

 --
 Nan Zhu

