Re: Tests failed after assembling the latest code from github

2014-04-15 Thread Aaron Davidson
By all means, it would be greatly appreciated!


On Mon, Apr 14, 2014 at 10:34 PM, Ye Xianjin advance...@gmail.com wrote:

 Hi, I think I have found the cause of the tests failing.

 I have two disks on my laptop. The Spark project dir is on an HDD, while
 the temp dir created by com.google.common.io.Files.createTempDir is under
 /var/folders/5q/, which is on the system disk, an SSD.
 The ExecutorClassLoaderSuite test uses the
 org.apache.spark.TestUtils.createCompiledClass method.
 The createCompiledClass method first generates the compiled class in the
 working directory (spark/repl), then uses renameTo to move
 the file. The renameTo call fails because the destination file is on a
 different filesystem than the source file.

 I modified TestUtils.scala to first copy the file to the destination and then
 delete the original file. The tests now run smoothly.
 Should I file a JIRA for this problem? Then I can send a PR on GitHub.
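
 For illustration only, here is a minimal sketch of the copy-then-delete
 fallback described above (not the actual patch; the helper name and error
 handling are hypothetical):

     import java.io.File
     import com.google.common.io.Files

     // Sketch only: File.renameTo returns false (rather than throwing) when the
     // source and destination live on different filesystems, so fall back to an
     // explicit copy followed by deleting the original.
     def moveAcrossFilesystems(src: File, dest: File) {
       if (!src.renameTo(dest)) {
         Files.copy(src, dest) // Guava copies the bytes onto the destination volume
         if (!src.delete()) {
           throw new java.io.IOException("Could not delete original file " + src)
         }
       }
     }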

 --
 Ye Xianjin
 Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


 On Tuesday, April 15, 2014 at 3:43 AM, Ye Xianjin wrote:

  Well, this is very strange.
  I looked into ExecutorClassLoaderSuite.scala and ReplSuite.scala and
  made small changes to ExecutorClassLoaderSuite.scala (mostly printing some
  internal variables). After that, when running the repl tests, I noticed that
  ReplSuite ran first and its results were OK, but the
  ExecutorClassLoaderSuite test was weird.
  Here is the output:
  [info] ExecutorClassLoaderSuite:
  [error] Uncaught exception when running org.apache.spark.repl.ExecutorClassLoaderSuite: java.lang.OutOfMemoryError: PermGen space
  [error] Uncaught exception when running org.apache.spark.repl.ExecutorClassLoaderSuite: java.lang.OutOfMemoryError: PermGen space
  Internal error when running tests: java.lang.OutOfMemoryError: PermGen space
  Exception in thread "Thread-3" java.io.EOFException
  at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2577)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
  at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1685)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1323)
  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
  at sbt.React.react(ForkTests.scala:116)
  at sbt.ForkTests$$anonfun$mainTestTask$1$Acceptor$2$.run(ForkTests.scala:75)
  at java.lang.Thread.run(Thread.java:695)
 
 
  I reverted my changes. The test results were the same.
 
   After I touched the ReplSuite.scala file (with the touch command), the test
  order was reversed, the same as at the very beginning, and the output was
  also the same (the result in my first post).
 
 
  --
  Ye Xianjin
  Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
 
 
  On Tuesday, April 15, 2014 at 3:14 AM, Aaron Davidson wrote:
 
   This may have something to do with running the tests on a Mac, as
 there is
   a lot of File/URI/URL stuff going on in that test which may just have
   happened to work if run on a Linux system (like Jenkins). Note that
 this
   suite was added relatively recently:
   https://github.com/apache/spark/pull/217
  
  
   On Mon, Apr 14, 2014 at 12:04 PM, Ye Xianjin advance...@gmail.com wrote:
  
Thank you for your reply.
   
After building the assembly jar, the repl test still failed. The error
output is the same as I posted before.
   
--
Ye Xianjin
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
   
   
On Tuesday, April 15, 2014 at 1:39 AM, Michael Armbrust wrote:
   
 I believe you may need an assembly jar to run the ReplSuite:
 sbt/sbt assembly/assembly

 Michael


 On Mon, Apr 14, 2014 at 3:14 AM, Ye Xianjin advance...@gmail.com wrote:

  Hi, everyone:
  I am new to Spark development. I downloaded Spark's latest code from
  github. After running sbt/sbt assembly, I began running sbt/sbt test in
  the spark source code dir, but it failed when running the repl module test.
 
  Here are some output details.
 
  command:
  sbt/sbt test-only org.apache.spark.repl.*
  output:
 
  [info] Loading project definition from
  /Volumes/MacintoshHD/github/spark/project/project
  [info] Loading project definition from
  /Volumes/MacintoshHD/github/spark/project
  [info] Set current project to root (in build
  file:/Volumes/MacintoshHD/github/spark/)
  [info] Passed: Total 0, Failed 0, Errors 0, Passed 0
  [info] No tests to run for graphx/test:testOnly
  [info] Passed: Total 0, Failed 0, Errors 0, Passed 0
  [info] No tests to run for bagel/test:testOnly
  [info] Passed: Total 0, Failed 0, Errors 0, Passed 0
  [info] No tests to run for streaming/test:testOnly
  [info] Passed: Total 0, Failed 0, Errors 0, Passed 0
  [info] No tests to run for mllib/test:testOnly
  

Re: Tests failed after assembling the latest code from github

2014-04-15 Thread Sean Owen
Good call -- indeed that same Files class has a move() method that
will try to use renameTo() and then fall back to copy() and delete()
if needed for this very reason.
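
For reference, a hedged sketch of what using that Guava method looks like
(the file names here are illustrative, not the actual TestUtils code):

    import java.io.File
    import com.google.common.io.Files

    object MoveExample {
      def main(args: Array[String]) {
        // Files.move tries File.renameTo first and falls back to copy + delete
        // when the rename fails, e.g. when source and destination are on
        // different filesystems.
        val destDir = Files.createTempDir()          // may land on another volume
        val compiled = new File("MyTestClass.class") // hypothetical compiler output
        Files.move(compiled, new File(destDir, compiled.getName))
      }
    }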


On Tue, Apr 15, 2014 at 6:34 AM, Ye Xianjin advance...@gmail.com wrote:
 Hi, I think I have found the cause of the tests failing.

 I have two disks on my laptop. The Spark project dir is on an HDD, while
 the temp dir created by com.google.common.io.Files.createTempDir is under
 /var/folders/5q/, which is on the system disk, an SSD.
 The ExecutorClassLoaderSuite test uses the
 org.apache.spark.TestUtils.createCompiledClass method.
 The createCompiledClass method first generates the compiled class in the
 working directory (spark/repl), then uses renameTo to move
 the file. The renameTo call fails because the destination file is on a
 different filesystem than the source file.

 I modified TestUtils.scala to first copy the file to the destination and then
 delete the original file. The tests now run smoothly.
 Should I file a JIRA for this problem? Then I can send a PR on GitHub.


Re: It seems that jenkins for PR is not working

2014-04-15 Thread Patrick Wendell
There are a few things going on here wrt tests.

1. I fixed up the RAT issues with a hotfix.

2. The Hive tests were actually disabled for a while accidentally. A recent
fix correctly re-enabled them. Without Hive, the Spark tests run in about 40
minutes; with Hive, they take 1 hour and 15 minutes, so it's a big
difference.

To ease things I committed a patch today that only runs the Hive tests if
the change touches Spark SQL. So this should make it simpler for normal
tests.

We can actually generalize this to do much finer-grained testing, e.g. if
something in MLlib changes we don't need to re-run the streaming tests.
I've added this JIRA to track it:
https://issues.apache.org/jira/browse/SPARK-1455

3. Overall we've experienced more race conditions with tests recently. I
noticed a few zombie test processes on Jenkins hogging 100% of the CPU, so I
think this has triggered several previously unseen races due to CPU
contention on the test cluster. I killed them and we'll see if they crop up
again.

4. Please try to keep an eye on the length of new tests that get committed.
It's common to see people commit tests that e.g. sleep for several seconds
or do things that take a long time. Almost always this can be avoided, and
usually avoiding it makes the test cleaner anyway (e.g. use proper
synchronization instead of sleeping).
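
As a hedged illustration (not taken from any actual Spark suite), the usual
replacement for a fixed sleep is something like ScalaTest's Eventually, which
polls an assertion until it passes or a timeout expires:

    import org.scalatest.FunSuite
    import org.scalatest.concurrent.Eventually._
    import org.scalatest.time.SpanSugar._

    class NoSleepSuite extends FunSuite {
      test("asynchronous result is observed without a fixed sleep") {
        @volatile var result: Option[Int] = None
        new Thread(new Runnable {
          override def run() { Thread.sleep(100); result = Some(42) }
        }).start()

        // Poll for up to 5 seconds instead of sleeping a fixed amount and
        // asserting once; the test finishes as soon as the value shows up.
        eventually(timeout(5.seconds), interval(50.millis)) {
          assert(result === Some(42))
        }
      }
    }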

- Patrick


On Tue, Apr 15, 2014 at 9:34 AM, Mark Hamstra m...@clearstorydata.com wrote:

 The RAT path issue is now fixed, but it appears to me that some recent
 change has dramatically altered the behavior of the testing framework, so
 that I am now seeing many individual tests taking more than a minute to run
 and the complete test run taking a very, very long time. I expect that
 this is what is causing Jenkins to now time out repeatedly.


 On Mon, Apr 14, 2014 at 1:32 PM, Nan Zhu zhunanmcg...@gmail.com wrote:

  +1
 
  --
  Nan Zhu
 
 
  On Friday, April 11, 2014 at 5:35 PM, DB Tsai wrote:
 
   I always got
  
 =
  
   Could not find Apache license headers in the following files:
   !? /root/workspace/SparkPullRequestBuilder/python/metastore/db.lck
   !? /root/workspace/SparkPullRequestBuilder/python/metastore/service.properties
  
  
   Sincerely,
  
   DB Tsai
   ---
   My Blog: https://www.dbtsai.com
   LinkedIn: https://www.linkedin.com/in/dbtsai