Re: Failing Spark Unit Tests
Got it, I opened a PR.
Re: Failing Spark Unit Tests
I tried making a change for it, but I was unable to reproduce. Anyway, I am seeing some unrelated errors in other PRs too, so there might be (or might have been) something wrong at some point. But I'd expect the test to pass locally anyway.

2018-01-23 15:23 GMT+01:00 Sean Owen <so...@cloudera.com>:
> That's odd. The current master build is failing for unrelated reasons (Jenkins jobs keep getting killed), so it's possible a very recent change did break something, though it would have had to pass tests in the PR builder first. You can go ahead and open a PR for your change and see what the PR builder tests say.
>
> On Tue, Jan 23, 2018 at 4:42 AM Yacine Mazari <y.maz...@gmail.com> wrote:
>> Hi All,
>>
>> I am currently working on SPARK-23166 (https://issues.apache.org/jira/browse/SPARK-23166), but after running "./dev/run-tests", the Python unit tests (supposedly unrelated to my change) are failing for the following reason:
>>
>> ===
>> File "/home/yacine/spark/python/pyspark/ml/linalg/__init__.py", line 895, in __main__.DenseMatrix.__str__
>> Failed example:
>>     print(dm)
>> Expected:
>>     DenseMatrix([[ 0., 2.],
>>                  [ 1., 3.]])
>> Got:
>>     DenseMatrix([[0., 2.],
>>                  [1., 3.]])
>> ===
>>
>> Notice that the missing space in the output is causing the failure.
>>
>> Any hints on what is causing this? Are there any specific versions of Python and/or other libraries I should be using?
>>
>> Thanks.
Failing Spark Unit Tests
Hi All,

I am currently working on SPARK-23166 (https://issues.apache.org/jira/browse/SPARK-23166), but after running "./dev/run-tests", the Python unit tests (supposedly unrelated to my change) are failing for the following reason:

===
File "/home/yacine/spark/python/pyspark/ml/linalg/__init__.py", line 895, in __main__.DenseMatrix.__str__
Failed example:
    print(dm)
Expected:
    DenseMatrix([[ 0., 2.],
                 [ 1., 3.]])
Got:
    DenseMatrix([[0., 2.],
                 [1., 3.]])
===

Notice that the missing space in the output is causing the failure.

Any hints on what is causing this? Are there any specific versions of Python and/or other libraries I should be using?

Thanks.
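The lost leading space is consistent with the array-printing change NumPy made in 1.14; that attribution is an assumption, not something confirmed in this thread. A quick way to check locally, using only NumPy itself:

    import numpy as np

    # Assumed cause, for illustration: NumPy >= 1.14 dropped the extra space
    # it used to print before non-negative floats, which is exactly the
    # "DenseMatrix([[ 0., ...]" vs "DenseMatrix([[0., ...]" difference above.
    print(np.__version__)
    m = np.array([[0., 2.], [1., 3.]])
    print(m)                            # new-style formatting on 1.14+
    np.set_printoptions(legacy='1.13')  # opt back into the pre-1.14 formatting
    print(m)                            # old-style formatting, with the spaces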
Re: internal unit tests failing against the latest spark master
I confirmed that an Encoder[Array[Int]] is no longer serializable, whereas with my Spark build from March 7 it was. I believe the issue is commit 295747e59739ee8a697ac3eba485d3439e4a04c3, and I sent Wenchen an email about it.

On Wed, Apr 12, 2017 at 4:31 PM, Koert Kuipers <ko...@tresata.com> wrote:
> I believe the error is related to an org.apache.spark.sql.expressions.Aggregator where the buffer type (BUF) is Array[Int].
>
> On Wed, Apr 12, 2017 at 4:19 PM, Koert Kuipers <ko...@tresata.com> wrote:
>> Hey all,
>>
>> Today I tried upgrading the Spark version we use internally by creating a new internal release from the Spark master branch. The last time I did this was March 7.
>>
>> With this updated Spark I am seeing some serialization errors in the unit tests for our own libraries. It looks like a Scala reflection type that is not serializable is getting sucked into serialization for the encoder? See below.
>>
>> Best,
>> Koert
>>
>> [info] org.apache.spark.SparkException: Task not serializable
>> [info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
>> [info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>> [info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>> [info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2284)
>> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2058)
>> ...
>> [info] Serialization stack:
>> [info] - object not serializable (class: scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, value: BTS(Int,AnyVal,Any))
>> [info] - field (class: scala.reflect.internal.Types$TypeRef, name: baseTypeSeqCache, type: class scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq)
>> [info] - object (class scala.reflect.internal.Types$ClassNoArgsTypeRef, Int)
>> [info] - field (class: org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, name: elementType$2, type: class scala.reflect.api.Types$TypeApi)
>> [info] - object (class org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, )
>> [info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, name: function, type: interface scala.Function1)
>> [info] - object (class org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)))
>> [info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.WrapOption, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
>> [info] - object (class org.apache.spark.sql.catalyst.expressions.objects.WrapOption, wrapoption(unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq)))
>> [info] - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
>> [info] - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@69040c85)
>> [info] - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
>> [info] - object (class scala.collection.immutable.$colon$colon, List(wrapoption(unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq
>> [info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.NewInstance, name: arguments, type: interface scala.collection.Seq)
>> [info] - object (class org.apache.spark.sql.catalyst.expressions.objects.NewInstance, newInstance(class scala.Tuple1))
>> [info] - field (class: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, name: deserializer, type: class org.apache.spark.sql.catalyst.expressions.Expression)
>> [info] - object (class org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, class[_1[0]: array])
>> ...
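A minimal way to reproduce this without running a job, sketched under the assumption that checking the encoder's own Java-serializability is enough to trigger it (the class and method here are real catalyst API; the repro itself is not from the thread):

    import java.io.{ByteArrayOutputStream, ObjectOutputStream}
    import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

    object EncoderSerializationCheck {
      def main(args: Array[String]): Unit = {
        // Derive the encoder the same way Dataset operations do.
        val enc = ExpressionEncoder[Array[Int]]()
        // On a broken build this should throw NotSerializableException citing
        // scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, as in the stack above.
        val out = new ObjectOutputStream(new ByteArrayOutputStream())
        out.writeObject(enc)
        out.close()
        println("Encoder[Array[Int]] serialized OK")
      }
    }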
Re: internal unit tests failing against the latest spark master
I believe the error is related to an org.apache.spark.sql.expressions.Aggregator where the buffer type (BUF) is Array[Int].

On Wed, Apr 12, 2017 at 4:19 PM, Koert Kuipers <ko...@tresata.com> wrote:
> Hey all,
>
> Today I tried upgrading the Spark version we use internally by creating a new internal release from the Spark master branch. The last time I did this was March 7.
>
> With this updated Spark I am seeing some serialization errors in the unit tests for our own libraries. It looks like a Scala reflection type that is not serializable is getting sucked into serialization for the encoder? See below.
>
> Best,
> Koert
>
> [info] org.apache.spark.SparkException: Task not serializable
> [info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
> [info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
> [info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
> [info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2284)
> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2058)
> ...
> [info] Serialization stack:
> [info] - object not serializable (class: scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, value: BTS(Int,AnyVal,Any))
> [info] - field (class: scala.reflect.internal.Types$TypeRef, name: baseTypeSeqCache, type: class scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq)
> [info] - object (class scala.reflect.internal.Types$ClassNoArgsTypeRef, Int)
> [info] - field (class: org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, name: elementType$2, type: class scala.reflect.api.Types$TypeApi)
> [info] - object (class org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, )
> [info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, name: function, type: interface scala.Function1)
> [info] - object (class org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)))
> [info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.WrapOption, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
> [info] - object (class org.apache.spark.sql.catalyst.expressions.objects.WrapOption, wrapoption(unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq)))
> [info] - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
> [info] - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@69040c85)
> [info] - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
> [info] - object (class scala.collection.immutable.$colon$colon, List(wrapoption(unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq
> [info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.NewInstance, name: arguments, type: interface scala.collection.Seq)
> [info] - object (class org.apache.spark.sql.catalyst.expressions.objects.NewInstance, newInstance(class scala.Tuple1))
> [info] - field (class: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, name: deserializer, type: class org.apache.spark.sql.catalyst.expressions.Expression)
> [info] - object (class org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, class[_1[0]: array])
> ...
internal unit tests failing against the latest spark master
Hey all,

Today I tried upgrading the Spark version we use internally by creating a new internal release from the Spark master branch. The last time I did this was March 7.

With this updated Spark I am seeing some serialization errors in the unit tests for our own libraries. It looks like a Scala reflection type that is not serializable is getting sucked into serialization for the encoder? See below.

Best,
Koert

[info] org.apache.spark.SparkException: Task not serializable
[info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
[info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
[info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
[info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2284)
[info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2058)
...
[info] Serialization stack:
[info] - object not serializable (class: scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, value: BTS(Int,AnyVal,Any))
[info] - field (class: scala.reflect.internal.Types$TypeRef, name: baseTypeSeqCache, type: class scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq)
[info] - object (class scala.reflect.internal.Types$ClassNoArgsTypeRef, Int)
[info] - field (class: org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, name: elementType$2, type: class scala.reflect.api.Types$TypeApi)
[info] - object (class org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, )
[info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, name: function, type: interface scala.Function1)
[info] - object (class org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)))
[info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.WrapOption, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
[info] - object (class org.apache.spark.sql.catalyst.expressions.objects.WrapOption, wrapoption(unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq)))
[info] - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
[info] - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@69040c85)
[info] - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
[info] - object (class scala.collection.immutable.$colon$colon, List(wrapoption(unresolvedmapobjects(, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq
[info] - field (class: org.apache.spark.sql.catalyst.expressions.objects.NewInstance, name: arguments, type: interface scala.collection.Seq)
[info] - object (class org.apache.spark.sql.catalyst.expressions.objects.NewInstance, newInstance(class scala.Tuple1))
[info] - field (class: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, name: deserializer, type: class org.apache.spark.sql.catalyst.expressions.Expression)
[info] - object (class org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, class[_1[0]: array])
...
Re: Running Unit Tests in pyspark failure
I could resolve this by passing the argument below (run-tests was defaulting to python2.6, while pip had installed unittest2 under Python 2.7's site-packages):

./python/run-tests --python-executables=python2.7

Thanks,
Krishna

On Thu, Nov 3, 2016 at 4:16 PM, Krishna Kalyan <krishnakaly...@gmail.com> wrote:
> Hello,
> I am trying to run unit tests on PySpark.
>
> When I try to run the unit tests I am faced with errors.
>
> krishna@Krishna:~/Experiment/spark$ ./python/run-tests
> Running PySpark tests. Output is in /Users/krishna/Experiment/spark/python/unit-tests.log
> Will test against the following Python executables: ['python2.6']
> Will test the following Python modules: ['pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
> Please install unittest2 to test with Python 2.6 or earlier
> Had test failures in pyspark.sql.tests with python2.6; see logs.
>
> And when I try to install unittest2, it says the requirement is already satisfied:
>
> krishna@Krishna:~/Experiment/spark$ sudo pip install --upgrade unittest2
> Password:
> Requirement already up-to-date: unittest2 in /usr/local/lib/python2.7/site-packages
> Requirement already up-to-date: argparse in /usr/local/lib/python2.7/site-packages (from unittest2)
> Requirement already up-to-date: six>=1.4 in /usr/local/lib/python2.7/site-packages (from unittest2)
> Requirement already up-to-date: traceback2 in /usr/local/lib/python2.7/site-packages (from unittest2)
> Requirement already up-to-date: linecache2 in /usr/local/lib/python2.7/site-packages (from traceback2->unittest2)
>
> Help!
>
> Thanks,
> Krishna
Running Unit Tests in pyspark failure
Hello,

I am trying to run unit tests on PySpark. When I try to run the unit tests I am faced with errors.

krishna@Krishna:~/Experiment/spark$ ./python/run-tests
Running PySpark tests. Output is in /Users/krishna/Experiment/spark/python/unit-tests.log
Will test against the following Python executables: ['python2.6']
Will test the following Python modules: ['pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
Please install unittest2 to test with Python 2.6 or earlier
Had test failures in pyspark.sql.tests with python2.6; see logs.

And when I try to install unittest2, it says the requirement is already satisfied:

krishna@Krishna:~/Experiment/spark$ sudo pip install --upgrade unittest2
Password:
Requirement already up-to-date: unittest2 in /usr/local/lib/python2.7/site-packages
Requirement already up-to-date: argparse in /usr/local/lib/python2.7/site-packages (from unittest2)
Requirement already up-to-date: six>=1.4 in /usr/local/lib/python2.7/site-packages (from unittest2)
Requirement already up-to-date: traceback2 in /usr/local/lib/python2.7/site-packages (from unittest2)
Requirement already up-to-date: linecache2 in /usr/local/lib/python2.7/site-packages (from traceback2->unittest2)

Help!

Thanks,
Krishna
Machine learning unit tests guidelines
Dear Spark developers,

Are there any best practices or guidelines for machine learning unit tests in Spark? After taking a brief look at the unit tests in ML and MLlib, I have found that each algorithm is tested in a different way. There are a few kinds of tests:
1) Partial checks of internal algorithm correctness. This can be anything.
2) Generate test data with a distribution specific to the algorithm, run the machine learning, and check the outcomes. This is also very specific.
3) Compare the parameters (weights) of the machine learning model with parameters from existing implementations, such as R or SciPy. This looks like the most useful kind of test, since it assures you will get the same result from the algorithm as other people get using other software. (A sketch of this style of test follows below.)

After googling a bit, I've found the following guidelines rather relevant:
http://blog.mpacula.com/2011/02/17/unit-testing-statistical-software/

I am wondering: should we come up with specific guidelines for machine learning, such as guaranteeing that the user gets the expected result? This might also be considered an additional benefit for Spark: standardized ML.

Best regards,
Alexander
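To make kind (3) concrete, here is a hypothetical ScalaTest sketch; the suite name, the training stub, and the numbers are all invented for illustration, and the reference values are assumed to come from fitting the same data in R:

    import org.scalatest.FunSuite

    class ReferenceParitySuite extends FunSuite {

      // Stand-in for a real training call, e.g. something like
      // LinearRegressionWithSGD.train(data).weights; values are made up.
      def fittedWeights: Array[Double] = Array(2.997, -0.498)

      test("weights agree with the R reference within tolerance") {
        val reference = Array(3.0, -0.5) // assumed output of R's glm on the same data
        fittedWeights.zip(reference).foreach { case (w, r) =>
          assert(math.abs(w - r) <= 0.01, s"weight $w deviates from reference $r")
        }
      }
    }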
Re: Unit tests can generate spurious shutdown messages
Can you submit a pull request for it? Thanks.

On Tue, Jun 2, 2015 at 4:25 AM, Mick Davies <michael.belldav...@gmail.com> wrote:
> If I write unit tests that indirectly initialize org.apache.spark.util.Utils (for example by using SQL types) but produce no logging, I get the following unpleasant stack trace in my test output.
>
> This is caused by the Utils class adding a shutdown hook which logs the message logDebug("Shutdown hook called"). We are using log4j 2 for logging, and if there has been no logging before this point, the static initialization of log4j 2 tries to add a shutdown hook itself but can't, because the JVM is already shutting down.
>
> It's only slightly annoying, but it could easily be 'fixed' by adding a line like logDebug("Adding shutdown hook") to Utils before it adds the shutdown hook, ensuring logging is always initialized. I am happy to make this change, unless there is a better approach or it is considered too trivial.
>
> ERROR StatusLogger catching java.lang.IllegalStateException: Shutdown in progress
>   at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:66)
>   at java.lang.Runtime.addShutdownHook(Runtime.java:211)
>   at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownHook(DefaultShutdownCallbackRegistry.java:136)
>   at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.start(DefaultShutdownCallbackRegistry.java:125)
>   at org.apache.logging.log4j.core.impl.Log4jContextFactory.initializeShutdownCallbackRegistry(Log4jContextFactory.java:123)
>   at org.apache.logging.log4j.core.impl.Log4jContextFactory.<init>(Log4jContextFactory.java:89)
>   at org.apache.logging.log4j.core.impl.Log4jContextFactory.<init>(Log4jContextFactory.java:54)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
>   at java.lang.Class.newInstance(Class.java:438)
>   at org.apache.logging.log4j.LogManager.<clinit>(LogManager.java:96)
>   at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:102)
>   at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
>   at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
>   at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
>   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285)
>   at org.apache.spark.Logging$class.log(Logging.scala:52)
>   at org.apache.spark.util.Utils$.log(Utils.scala:62)
>   at org.apache.spark.Logging$class.initializeLogging(Logging.scala:138)
>   at org.apache.spark.Logging$class.initializeIfNecessary(Logging.scala:107)
>   at org.apache.spark.Logging$class.log(Logging.scala:51)
>   at org.apache.spark.util.Utils$.log(Utils.scala:62)
>   at org.apache.spark.Logging$class.logDebug(Logging.scala:63)
>   at org.apache.spark.util.Utils$.logDebug(Utils.scala:62)
>   at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply$mcV$sp(Utils.scala:178)
>   at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
>   at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>   at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:177)
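For reference, a sketch of the one-line fix Mick describes; the placement inside Utils is paraphrased rather than quoted, and the Logging trait is assumed to be visible from the call site:

    import org.apache.spark.Logging

    object ShutdownHookExample extends Logging {
      def install(): Unit = {
        // The proposed fix: log once *before* registering the hook, so log4j 2's
        // static initialization runs while the JVM is still live instead of
        // being triggered from inside the shutdown hook itself.
        logDebug("Adding shutdown hook")
        Runtime.getRuntime.addShutdownHook(new Thread {
          override def run(): Unit = logDebug("Shutdown hook called")
        })
      }
    }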
Unit tests can generate spurious shutdown messages
If I write unit tests that indirectly initialize org.apache.spark.util.Utils (for example by using SQL types) but produce no logging, I get the following unpleasant stack trace in my test output.

This is caused by the Utils class adding a shutdown hook which logs the message logDebug("Shutdown hook called"). We are using log4j 2 for logging, and if there has been no logging before this point, the static initialization of log4j 2 tries to add a shutdown hook itself but can't, because the JVM is already shutting down.

It's only slightly annoying, but it could easily be 'fixed' by adding a line like logDebug("Adding shutdown hook") to Utils before it adds the shutdown hook, ensuring logging is always initialized. I am happy to make this change, unless there is a better approach or it is considered too trivial.

ERROR StatusLogger catching java.lang.IllegalStateException: Shutdown in progress
  at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:66)
  at java.lang.Runtime.addShutdownHook(Runtime.java:211)
  at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownHook(DefaultShutdownCallbackRegistry.java:136)
  at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.start(DefaultShutdownCallbackRegistry.java:125)
  at org.apache.logging.log4j.core.impl.Log4jContextFactory.initializeShutdownCallbackRegistry(Log4jContextFactory.java:123)
  at org.apache.logging.log4j.core.impl.Log4jContextFactory.<init>(Log4jContextFactory.java:89)
  at org.apache.logging.log4j.core.impl.Log4jContextFactory.<init>(Log4jContextFactory.java:54)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
  at java.lang.Class.newInstance(Class.java:438)
  at org.apache.logging.log4j.LogManager.<clinit>(LogManager.java:96)
  at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:102)
  at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
  at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
  at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
  at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285)
  at org.apache.spark.Logging$class.log(Logging.scala:52)
  at org.apache.spark.util.Utils$.log(Utils.scala:62)
  at org.apache.spark.Logging$class.initializeLogging(Logging.scala:138)
  at org.apache.spark.Logging$class.initializeIfNecessary(Logging.scala:107)
  at org.apache.spark.Logging$class.log(Logging.scala:51)
  at org.apache.spark.util.Utils$.log(Utils.scala:62)
  at org.apache.spark.Logging$class.logDebug(Logging.scala:63)
  at org.apache.spark.util.Utils$.logDebug(Utils.scala:62)
  at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply$mcV$sp(Utils.scala:178)
  at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
  at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
  at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
  at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:177)
Re: Unit tests
Thanks, Josh, I missed that PR.

On Mon, Feb 9, 2015 at 7:45 PM, Josh Rosen <rosenvi...@gmail.com> wrote:
> Hi Iulian,
>
> I think the AkkaUtilsSuite failure that you observed has been fixed in https://issues.apache.org/jira/browse/SPARK-5548 / https://github.com/apache/spark/pull/4343
>
> On February 9, 2015 at 5:47:59 AM, Iulian Dragoș (iulian.dra...@typesafe.com) wrote:
>> Hi Patrick,
>>
>> Thanks for the heads up. I was trying to set up our own infrastructure for testing Spark (essentially, running `run-tests` every night) on EC2. I stumbled upon a number of flaky tests, but none of them look similar to anything in JIRA with the flaky-test tag. I wonder if there's something wrong with our infrastructure, or whether I should simply open JIRA tickets for the failures I find.
>>
>> For example, one that appears fairly often on our setup is in AkkaUtilsSuite, "remote fetch ssl on - untrusted server" (exception `ActorNotFound` instead of `TimeoutException`).
>>
>> thanks,
>> iulian
>>
>> On Fri, Feb 6, 2015 at 9:55 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>> Hey All,
>>>
>>> The tests are in a not-amazing state right now due to a few compounding factors:
>>> 1. We've merged a large volume of patches recently.
>>> 2. The load on Jenkins has been relatively high, exposing races and other behavior not seen at lower load.
>>>
>>> For those not familiar, the main issue is flaky (non-deterministic) test failures. Right now I'm trying to prioritize keeping the PullRequestBuilder in good shape, since it will block development if it is down. For other tests, let's try to keep filing JIRAs when we see issues and use the flaky-test label (see http://bit.ly/1yRif9S); I may contact people regarding specific tests. This is a very high priority to get in good shape. This kind of thing is no one's fault but just the result of a lot of concurrent development, and everyone needs to pitch in to get back in a good place.
>>>
>>> - Patrick
>>
>> --
>> Iulian Dragos
>> Reactive Apps on the JVM
>> www.typesafe.com

--
Iulian Dragos
Reactive Apps on the JVM
www.typesafe.com
Re: Unit tests
Hi Iulian,

I think the AkkaUtilsSuite failure that you observed has been fixed in https://issues.apache.org/jira/browse/SPARK-5548 / https://github.com/apache/spark/pull/4343

On February 9, 2015 at 5:47:59 AM, Iulian Dragoș (iulian.dra...@typesafe.com) wrote:
> Hi Patrick,
>
> Thanks for the heads up. I was trying to set up our own infrastructure for testing Spark (essentially, running `run-tests` every night) on EC2. I stumbled upon a number of flaky tests, but none of them look similar to anything in JIRA with the flaky-test tag. I wonder if there's something wrong with our infrastructure, or whether I should simply open JIRA tickets for the failures I find.
>
> For example, one that appears fairly often on our setup is in AkkaUtilsSuite, "remote fetch ssl on - untrusted server" (exception `ActorNotFound` instead of `TimeoutException`).
>
> thanks,
> iulian
>
> On Fri, Feb 6, 2015 at 9:55 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>> Hey All,
>>
>> The tests are in a not-amazing state right now due to a few compounding factors:
>> 1. We've merged a large volume of patches recently.
>> 2. The load on Jenkins has been relatively high, exposing races and other behavior not seen at lower load.
>>
>> For those not familiar, the main issue is flaky (non-deterministic) test failures. Right now I'm trying to prioritize keeping the PullRequestBuilder in good shape, since it will block development if it is down. For other tests, let's try to keep filing JIRAs when we see issues and use the flaky-test label (see http://bit.ly/1yRif9S); I may contact people regarding specific tests. This is a very high priority to get in good shape. This kind of thing is no one's fault but just the result of a lot of concurrent development, and everyone needs to pitch in to get back in a good place.
>>
>> - Patrick
>
> --
> Iulian Dragos
> Reactive Apps on the JVM
> www.typesafe.com
Re: Unit tests
Hi Patrick,

Thanks for the heads up. I was trying to set up our own infrastructure for testing Spark (essentially, running `run-tests` every night) on EC2. I stumbled upon a number of flaky tests, but none of them look similar to anything in JIRA with the flaky-test tag. I wonder if there's something wrong with our infrastructure, or whether I should simply open JIRA tickets for the failures I find.

For example, one that appears fairly often on our setup is in AkkaUtilsSuite, "remote fetch ssl on - untrusted server" (exception `ActorNotFound` instead of `TimeoutException`).

thanks,
iulian

On Fri, Feb 6, 2015 at 9:55 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Hey All,
>
> The tests are in a not-amazing state right now due to a few compounding factors:
> 1. We've merged a large volume of patches recently.
> 2. The load on Jenkins has been relatively high, exposing races and other behavior not seen at lower load.
>
> For those not familiar, the main issue is flaky (non-deterministic) test failures. Right now I'm trying to prioritize keeping the PullRequestBuilder in good shape, since it will block development if it is down. For other tests, let's try to keep filing JIRAs when we see issues and use the flaky-test label (see http://bit.ly/1yRif9S); I may contact people regarding specific tests. This is a very high priority to get in good shape. This kind of thing is no one's fault but just the result of a lot of concurrent development, and everyone needs to pitch in to get back in a good place.
>
> - Patrick

--
Iulian Dragos
Reactive Apps on the JVM
www.typesafe.com
Unit tests
Hey All,

The tests are in a not-amazing state right now due to a few compounding factors:
1. We've merged a large volume of patches recently.
2. The load on Jenkins has been relatively high, exposing races and other behavior not seen at lower load.

For those not familiar, the main issue is flaky (non-deterministic) test failures. Right now I'm trying to prioritize keeping the PullRequestBuilder in good shape, since it will block development if it is down. For other tests, let's try to keep filing JIRAs when we see issues and use the flaky-test label (see http://bit.ly/1yRif9S); I may contact people regarding specific tests. This is a very high priority to get in good shape. This kind of thing is no one's fault but just the result of a lot of concurrent development, and everyone needs to pitch in to get back in a good place.

- Patrick
Re: Unit tests in 5 minutes
Ted,

I posted some updates on JIRA (https://issues.apache.org/jira/browse/SPARK-3431?focusedCommentId=14236540&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14236540) on my progress (or lack thereof) getting SBT to parallelize test suites properly. I'm currently stuck with SBT / ScalaTest, so I may move on to trying Maven.

Andrew,

Once we have a basic grasp of how to parallelize some of the tests, the next step will probably be to use containers (i.e. Docker) to allow more parallelization, especially for those tests that, for example, contend for ports.

Nick

On Fri Dec 05 2014 at 2:05:29 PM Andrew Or <and...@databricks.com> wrote:
> @Patrick and Josh, actually we went even further than that. We simply disable the UI for most tests, and these used to be the single largest source of port conflicts.
Re: Unit tests in 5 minutes
bq. I may move on to trying Maven.

Maven is my favorite :-)

On Sat, Dec 6, 2014 at 10:54 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> Ted,
>
> I posted some updates on JIRA (https://issues.apache.org/jira/browse/SPARK-3431?focusedCommentId=14236540&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14236540) on my progress (or lack thereof) getting SBT to parallelize test suites properly. I'm currently stuck with SBT / ScalaTest, so I may move on to trying Maven.
>
> Andrew,
>
> Once we have a basic grasp of how to parallelize some of the tests, the next step will probably be to use containers (i.e. Docker) to allow more parallelization, especially for those tests that, for example, contend for ports.
>
> Nick
>
> On Fri Dec 05 2014 at 2:05:29 PM Andrew Or <and...@databricks.com> wrote:
>> @Patrick and Josh, actually we went even further than that. We simply disable the UI for most tests, and these used to be the single largest source of port conflicts.
Re: Unit tests in 5 minutes
@Patrick and Josh, actually we went even further than that. We simply disable the UI for most tests, and these used to be the single largest source of port conflicts.
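In test-fixture terms that looks roughly like the following; the helper and its names are illustrative, but spark.ui.enabled is the real configuration key:

    import org.apache.spark.{SparkConf, SparkContext}

    object UiDisabledFixture {
      // Run a test body against a context whose web UI never binds a port,
      // removing the most common source of port conflicts between
      // concurrently running suites.
      def withContext[T](body: SparkContext => T): T = {
        val conf = new SparkConf()
          .setMaster("local[2]")
          .setAppName("unit-test")
          .set("spark.ui.enabled", "false")
        val sc = new SparkContext(conf)
        try body(sc) finally sc.stop()
      }
    }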
Re: Unit tests in 5 minutes
fwiw, when we did this work in HBase, we categorized the tests. Then some tests can share a single JVM, while others need to be isolated in their own JVM. Nevertheless, Surefire can still run them in parallel by starting/stopping several JVMs.

I think we need to do this as well. Perhaps the test naming hierarchy can be used to group non-parallelizable tests in the same JVM. For example, here are some Hive tests from our project:

org.apache.spark.sql.hive.StatisticsSuite
org.apache.spark.sql.hive.execution.HiveQuerySuite
org.apache.spark.sql.QueryTest
org.apache.spark.sql.parquet.HiveParquetSuite

If we group tests by the first 5 parts of their name (e.g. org.apache.spark.sql.hive), then we’d have the first 2 tests run in the same JVM, and the next 2 tests each run in their own JVM.

I’m new to this stuff so I’m not sure if I’m going about this in the right way, but you can see my attempt with this approach on GitHub (https://github.com/nchammas/spark/blob/ab127b798dbfa9399833d546e627f9651b060918/project/SparkBuild.scala#L388-L397), as well as the related discussion on JIRA (https://issues.apache.org/jira/browse/SPARK-3431). If anyone has more feedback on this, I’d love to hear it (either on this thread or in the JIRA issue).

Nick

On Sun Sep 07 2014 at 8:28:51 PM Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> On Fri, Aug 8, 2014 at 1:12 PM, Reynold Xin <r...@databricks.com> wrote:
>> Nick,
>> Would you like to file a ticket to track this?
>
> SPARK-3431 (https://issues.apache.org/jira/browse/SPARK-3431): Parallelize execution of tests
> Sub-task: SPARK-3432 (https://issues.apache.org/jira/browse/SPARK-3432): Fix logging of unit test execution time
>
> Nick
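In SBT the grouping idea above sketches out roughly like this, using sbt 0.13's Tests.Group API; the five-segment cutoff is the assumption from the email, and this fragment is illustrative rather than the exact code in the linked branch:

    // Bucket suites by the first five segments of their fully-qualified name;
    // each bucket runs in its own forked JVM, and sbt can run the forked
    // groups in parallel while suites inside a group stay together.
    testGrouping in Test := (definedTests in Test).value
      .groupBy(_.name.split('.').take(5).mkString("."))
      .map { case (prefix, suites) =>
        Tests.Group(prefix, suites, Tests.SubProcess(ForkOptions()))
      }
      .toSeq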
Re: Unit tests in 5 minutes
Have you seen this thread: http://search-hadoop.com/m/JW1q5xxSAa2 ?

Test categorization in HBase is done through maven-surefire-plugin.

Cheers

On Thu, Dec 4, 2014 at 4:05 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> fwiw, when we did this work in HBase, we categorized the tests. Then some tests can share a single JVM, while others need to be isolated in their own JVM. Nevertheless, Surefire can still run them in parallel by starting/stopping several JVMs.
>
> I think we need to do this as well. Perhaps the test naming hierarchy can be used to group non-parallelizable tests in the same JVM. For example, here are some Hive tests from our project:
>
> org.apache.spark.sql.hive.StatisticsSuite
> org.apache.spark.sql.hive.execution.HiveQuerySuite
> org.apache.spark.sql.QueryTest
> org.apache.spark.sql.parquet.HiveParquetSuite
>
> If we group tests by the first 5 parts of their name (e.g. org.apache.spark.sql.hive), then we’d have the first 2 tests run in the same JVM, and the next 2 tests each run in their own JVM.
>
> I’m new to this stuff so I’m not sure if I’m going about this in the right way, but you can see my attempt with this approach on GitHub (https://github.com/nchammas/spark/blob/ab127b798dbfa9399833d546e627f9651b060918/project/SparkBuild.scala#L388-L397), as well as the related discussion on JIRA (https://issues.apache.org/jira/browse/SPARK-3431). If anyone has more feedback on this, I’d love to hear it (either on this thread or in the JIRA issue).
>
> Nick
>
> On Sun Sep 07 2014 at 8:28:51 PM Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>> On Fri, Aug 8, 2014 at 1:12 PM, Reynold Xin <r...@databricks.com> wrote:
>>> Nick,
>>> Would you like to file a ticket to track this?
>>
>> SPARK-3431 (https://issues.apache.org/jira/browse/SPARK-3431): Parallelize execution of tests
>> Sub-task: SPARK-3432 (https://issues.apache.org/jira/browse/SPARK-3432): Fix logging of unit test execution time
>>
>> Nick
Re: Troubleshooting JVM OOM during Spark Unit Tests
What does /tmp/jvm-21940/hs_error.log tell you? It might give hints as to which threads are allocating the extra off-heap memory.

On Fri, Nov 21, 2014 at 1:50 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> Howdy folks,
>
> I’m trying to understand why I’m getting “insufficient memory” errors when trying to run Spark unit tests within a CentOS Docker container. I’m building Spark and running the tests as follows:
>
> # build
> sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver package assembly/assembly
>
> # Scala unit tests
> sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver catalyst/test sql/test hive/test mllib/test
>
> The build completes successfully. After humming along for many minutes, the unit tests fail with this:
>
> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00074a58, 30932992, 0) failed; error='Cannot allocate memory' (errno=12)
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (malloc) failed to allocate 30932992 bytes for committing reserved memory.
> # An error report file with more information is saved as:
> # /tmp/jvm-21940/hs_error.log
>
> Exception in thread Thread-20 Exception in thread Thread-16 java.io.EOFException
>   at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2598)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>   at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1$React.react(Framework.scala:945)
>   at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1.run(Framework.scala:934)
>   at java.lang.Thread.run(Thread.java:745)
> java.net.SocketException: Connection reset
>   at java.net.SocketInputStream.read(SocketInputStream.java:196)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.net.SocketInputStream.read(SocketInputStream.java:210)
>   at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293)
>   at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586)
>   at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>   at sbt.React.react(ForkTests.scala:114)
>   at sbt.ForkTests$$anonfun$mainTestTask$1$Acceptor$2$.run(ForkTests.scala:74)
>   at java.lang.Thread.run(Thread.java:745)
>
> Here are some (I think) relevant environment variables I have set:
>
> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71-2.5.3.1.el7_0.x86_64
> export JAVA_OPTS="-Xms128m -Xmx1g -XX:MaxPermSize=128m"
> export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=128m"
>
> How do I narrow down why this is happening? I know that running this thing within a Docker container may be playing a role here, but before poking around with Docker configs I want to make an effort at getting the Java setup right within the container. I’ve already tried giving the container 2GB of memory, so I don’t think at this point it’s a restriction on the container. Any pointers on how to narrow the problem down?
>
> Nick
>
> P.S. If you’re wondering why I’m trying to run unit tests within a Docker container, I’m exploring a different angle on SPARK-3431 (https://issues.apache.org/jira/browse/SPARK-3431).
Re: Troubleshooting JVM OOM during Spark Unit Tests
Here’s that log file (https://gist.github.com/nchammas/08d3a3a02486cf602ceb) from a different run of the unit tests that also failed. I’m not sure what to look for.

If it matters any, I also changed JAVA_OPTS as follows for this run:

export JAVA_OPTS="-Xms512m -Xmx1024m -XX:PermSize=64m -XX:MaxPermSize=128m -Xss512k"

Nick

On Sat Nov 22 2014 at 3:09:55 AM Reynold Xin <r...@databricks.com> wrote:
> What does /tmp/jvm-21940/hs_error.log tell you? It might give hints as to which threads are allocating the extra off-heap memory.
>
> On Fri, Nov 21, 2014 at 1:50 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>> Howdy folks,
>>
>> I’m trying to understand why I’m getting “insufficient memory” errors when trying to run Spark unit tests within a CentOS Docker container. I’m building Spark and running the tests as follows:
>>
>> # build
>> sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver package assembly/assembly
>>
>> # Scala unit tests
>> sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver catalyst/test sql/test hive/test mllib/test
>>
>> The build completes successfully. After humming along for many minutes, the unit tests fail with this:
>>
>> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00074a58, 30932992, 0) failed; error='Cannot allocate memory' (errno=12)
>> #
>> # There is insufficient memory for the Java Runtime Environment to continue.
>> # Native memory allocation (malloc) failed to allocate 30932992 bytes for committing reserved memory.
>> # An error report file with more information is saved as:
>> # /tmp/jvm-21940/hs_error.log
>>
>> Exception in thread Thread-20 Exception in thread Thread-16 java.io.EOFException
>>   at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2598)
>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>   at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1$React.react(Framework.scala:945)
>>   at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1.run(Framework.scala:934)
>>   at java.lang.Thread.run(Thread.java:745)
>> java.net.SocketException: Connection reset
>>   at java.net.SocketInputStream.read(SocketInputStream.java:196)
>>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>   at java.net.SocketInputStream.read(SocketInputStream.java:210)
>>   at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293)
>>   at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586)
>>   at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>   at sbt.React.react(ForkTests.scala:114)
>>   at sbt.ForkTests$$anonfun$mainTestTask$1$Acceptor$2$.run(ForkTests.scala:74)
>>   at java.lang.Thread.run(Thread.java:745)
>>
>> Here are some (I think) relevant environment variables I have set:
>>
>> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71-2.5.3.1.el7_0.x86_64
>> export JAVA_OPTS="-Xms128m -Xmx1g -XX:MaxPermSize=128m"
>> export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=128m"
>>
>> How do I narrow down why this is happening? I know that running this thing within a Docker container may be playing a role here, but before poking around with Docker configs I want to make an effort at getting the Java setup right within the container. I’ve already tried giving the container 2GB of memory, so I don’t think at this point it’s a restriction on the container. Any pointers on how to narrow the problem down?
>>
>> Nick
>>
>> P.S. If you’re wondering why I’m trying to run unit tests within a Docker container, I’m exploring a different angle on SPARK-3431 (https://issues.apache.org/jira/browse/SPARK-3431).
Troubleshooting JVM OOM during Spark Unit Tests
Howdy folks,

I’m trying to understand why I’m getting “insufficient memory” errors when trying to run Spark unit tests within a CentOS Docker container. I’m building Spark and running the tests as follows:

# build
sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver package assembly/assembly

# Scala unit tests
sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver catalyst/test sql/test hive/test mllib/test

The build completes successfully. After humming along for many minutes, the unit tests fail with this:

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00074a58, 30932992, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 30932992 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/jvm-21940/hs_error.log

Exception in thread Thread-20 Exception in thread Thread-16 java.io.EOFException
  at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2598)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
  at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1$React.react(Framework.scala:945)
  at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1.run(Framework.scala:934)
  at java.lang.Thread.run(Thread.java:745)
java.net.SocketException: Connection reset
  at java.net.SocketInputStream.read(SocketInputStream.java:196)
  at java.net.SocketInputStream.read(SocketInputStream.java:122)
  at java.net.SocketInputStream.read(SocketInputStream.java:210)
  at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293)
  at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586)
  at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
  at sbt.React.react(ForkTests.scala:114)
  at sbt.ForkTests$$anonfun$mainTestTask$1$Acceptor$2$.run(ForkTests.scala:74)
  at java.lang.Thread.run(Thread.java:745)

Here are some (I think) relevant environment variables I have set:

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71-2.5.3.1.el7_0.x86_64
export JAVA_OPTS="-Xms128m -Xmx1g -XX:MaxPermSize=128m"
export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=128m"

How do I narrow down why this is happening? I know that running this thing within a Docker container may be playing a role here, but before poking around with Docker configs I want to make an effort at getting the Java setup right within the container. I’ve already tried giving the container 2GB of memory, so I don’t think at this point it’s a restriction on the container. Any pointers on how to narrow the problem down?

Nick

P.S. If you’re wondering why I’m trying to run unit tests within a Docker container, I’m exploring a different angle on SPARK-3431 (https://issues.apache.org/jira/browse/SPARK-3431).
Exception while running unit tests that make use of local-cluster mode
Hi All,

When I try to run unit tests that make use of local-cluster mode (e.g. "Accessing HttpBroadcast variables in a local cluster" in BroadcastSuite.scala), they fail with the exception below. I'm using Java version 1.8.0_05 and Scala version 2.10. I looked at the Jenkins build report and the tests pass there. Please let me know how I can resolve this issue.

Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, 192.168.43.112): java.lang.ClassNotFoundException: org.apache.spark.broadcast.BroadcastSuite$$anonfun$3$$anonfun$19
  java.net.URLClassLoader$1.run(URLClassLoader.java:372)
  java.net.URLClassLoader$1.run(URLClassLoader.java:361)
  java.security.AccessController.doPrivileged(Native Method)
  java.net.URLClassLoader.findClass(URLClassLoader.java:360)
  java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  java.lang.Class.forName0(Native Method)
  java.lang.Class.forName(Class.java:340)
  org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
  java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
  java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
  java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
  java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
  java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
  java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
  java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
  java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
  java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
  java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
  org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
  org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
  org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
  org.apache.spark.scheduler.Task.run(Task.scala:56)
  org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  java.lang.Thread.run(Thread.java:745)

Driver stacktrace: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, 192.168.43.112): java.lang.ClassNotFoundException: org.apache.spark.broadcast.BroadcastSuite$$anonfun$3$$anonfun$19
  java.net.URLClassLoader$1.run(URLClassLoader.java:372)
  java.net.URLClassLoader$1.run(URLClassLoader.java:361)
  java.security.AccessController.doPrivileged(Native Method)
  java.net.URLClassLoader.findClass(URLClassLoader.java:360)
  java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  java.lang.Class.forName0(Native Method)
  java.lang.Class.forName(Class.java:340)
  org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
  java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
  java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
  java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
  java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
  java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
  java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
  java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
  java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
  java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
  java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
  org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
  org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87
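One note worth making explicit (my reading, not something stated in the thread): unlike local[*], local-cluster mode forks real executor JVMs, so the test classes (the BroadcastSuite$$anonfun above) must be on the executors' classpath, which usually means building the assembly jar before running such suites. A minimal sketch of the mode itself:

    import org.apache.spark.SparkContext

    object LocalClusterExample {
      def main(args: Array[String]): Unit = {
        // local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB] launches
        // separate executor JVMs; closures defined in test code must be
        // loadable there, which is why a missing or stale assembly jar shows
        // up as ClassNotFoundException rather than a logic failure.
        val sc = new SparkContext("local-cluster[2,1,512]", "local-cluster-check")
        try println(sc.parallelize(1 to 100).count()) finally sc.stop()
      }
    }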
Re: Unit tests in 5 minutes
The issue with supporting this, imo, is the fact that ScalaTest uses the same VM for all the tests (the Surefire plugin supports forking, but ScalaTest ignores it, iirc). So different tests would initialize different Spark contexts and can potentially step on each other's toes.

Regards,
Mridul

On Fri, Aug 8, 2014 at 9:31 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> Howdy,
>
> Do we think it's both feasible and worthwhile to invest in getting our unit tests to finish in under 5 minutes (or something similarly brief) when run by Jenkins? Unit tests currently seem to take anywhere from 30 min to 2 hours. As people add more tests, I imagine this time will only grow. I think it would be better for both contributors and reviewers if they didn't have to wait so long for test results; PR reviews would be shorter, if nothing else.
>
> I don't know how this is normally done, but maybe it wouldn't be too much work to get a test cycle to feel lighter. Most unit tests are independent and can be run concurrently, right? Would it make sense to build a given patch on many servers at once and send disjoint sets of unit tests to each? I'd be interested in working on something like that if possible (and sensible).
>
> Nick
Unit tests in 5 minutes
Howdy,

Do we think it's both feasible and worthwhile to invest in getting our unit tests to finish in under 5 minutes (or something similarly brief) when run by Jenkins? Unit tests currently seem to take anywhere from 30 min to 2 hours. As people add more tests, I imagine this time will only grow. I think it would be better for both contributors and reviewers if they didn't have to wait so long for test results; PR reviews would be shorter, if nothing else.

I don't know how this is normally done, but maybe it wouldn't be too much work to get a test cycle to feel lighter. Most unit tests are independent and can be run concurrently, right? Would it make sense to build a given patch on many servers at once and send disjoint sets of unit tests to each? I'd be interested in working on something like that if possible (and sensible).

Nick
Re: Unit tests in 5 minutes
A common approach is to separate unit tests from integration tests; Maven has support for this distinction. I'm not sure it helps a lot, though, since it only helps you to not run integration tests all the time, and lots of Spark tests are integration-test-like and are important to run to know a change works. I haven't heard of a plugin to run different test suites remotely on many machines, but I would not be surprised if one exists.

The Jenkins servers aren't CPU-bound as far as I can tell. It's that the tests spend a lot of time waiting for bits to start up or complete. That implies the existing tests could be sped up by just running in parallel locally. I recall someone recently proposed this? And I think the problem with that is simply that some of the tests collide with each other, by opening up the same port at the same time, for example. I know that kind of problem is being attacked even right now. But if all the tests were made parallel-friendly, I imagine parallelism could be enabled and speed up builds greatly without any remote machines.

On Fri, Aug 8, 2014 at 5:01 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> Howdy,
>
> Do we think it's both feasible and worthwhile to invest in getting our unit tests to finish in under 5 minutes (or something similarly brief) when run by Jenkins? Unit tests currently seem to take anywhere from 30 min to 2 hours. As people add more tests, I imagine this time will only grow. I think it would be better for both contributors and reviewers if they didn't have to wait so long for test results; PR reviews would be shorter, if nothing else.
>
> I don't know how this is normally done, but maybe it wouldn't be too much work to get a test cycle to feel lighter. Most unit tests are independent and can be run concurrently, right? Would it make sense to build a given patch on many servers at once and send disjoint sets of unit tests to each? I'd be interested in working on something like that if possible (and sensible).
>
> Nick
Re: Unit tests in 5 minutes
How about using the parallel-execution feature of the maven-surefire-plugin (assuming all the tests were made parallel-friendly)? http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html Cheers On Fri, Aug 8, 2014 at 9:14 AM, Sean Owen so...@cloudera.com wrote: > A common approach is to separate unit tests from integration tests. [...]
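For reference, a minimal sketch of the surefire options the linked page describes. The forkCount/threadCount values here are illustrative, not a recommendation for Spark's POM, and this assumes a surefire version recent enough (2.14+) to support forkCount.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Run test classes concurrently within one JVM... -->
    <parallel>classes</parallel>
    <threadCount>4</threadCount>
    <!-- ...and/or spread suites across several forked JVMs. -->
    <forkCount>2</forkCount>
    <reuseForks>true</reuseForks>
  </configuration>
</plugin>
```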
Re: Unit tests in 5 minutes
ScalaTest actually has built-in support for parallelization; we can use that. The main challenge is making sure all the test suites can run in parallel alongside each other. On Fri, Aug 8, 2014 at 9:47 AM, Ted Yu yuzhih...@gmail.com wrote: > How about using parallel execution feature of maven-surefire-plugin (assuming all the tests were made parallel friendly)? [...]
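A minimal sketch of what turning that on looks like from sbt; the setting is standard sbt 0.13-era syntax, and whether Spark's suites can tolerate it is exactly the open question in this thread.

```scala
import sbt._
import sbt.Keys._

// With this enabled, suites within a project run concurrently; it is
// only safe once suites stop colliding on ports, temp directories,
// and SparkContexts.
lazy val concurrencySettings = Seq(
  parallelExecution in Test := true
)

// ScalaTest can additionally run the tests *within* a single suite in
// parallel if the suite mixes in org.scalatest.ParallelTestExecution.
```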
Re: Unit tests in 5 minutes
Nick, Would you like to file a ticket to track this? I think the first baby step is to log the amount of time each test case takes. This is supposed to happen already (see the flag), but somehow the times are not showing up. If you have some time to figure that out, that'd be great. https://github.com/apache/spark/blob/master/project/SparkBuild.scala#L350 On Fri, Aug 8, 2014 at 10:10 AM, Reynold Xin r...@databricks.com wrote: > ScalaTest actually has support for parallelization built-in. We can use that. [...]
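For anyone chasing that down, the flag in question is presumably ScalaTest's standard-output reporter argument; a minimal sbt sketch follows (the exact argument combination in Spark's build may differ).

```scala
import sbt._
import sbt.Keys._

// "-oD" tells ScalaTest's stdout reporter to append each test's
// duration; "-oDF" would also print full stack traces on failure.
lazy val timingSettings = Seq(
  testOptions in Test += Tests.Argument(TestFrameworks.ScalaTest, "-oD")
)
```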
Re: Unit tests in 5 minutes
FWIW, when we did this work in HBase, we categorized the tests. Some tests can then share a single JVM, while others need to be isolated in their own JVM; surefire can still run them in parallel by starting and stopping several JVMs. Nicolas On Fri, Aug 8, 2014 at 7:10 PM, Reynold Xin r...@databricks.com wrote: > ScalaTest actually has support for parallelization built-in. We can use that. [...]
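The ScalaTest analogue of HBase's JUnit categories is tags; a minimal sketch, with tag and suite names made up for illustration:

```scala
import org.scalatest.{FunSuite, Tag}

// Hypothetical categories mirroring HBase's small/large split.
object SmallTest extends Tag("org.example.SmallTest")
object LargeTest extends Tag("org.example.LargeTest")

class ExampleSuite extends FunSuite {
  test("fast, JVM-shareable check", SmallTest) {
    assert(1 + 1 === 2)
  }
  test("slow, isolation-needing check", LargeTest) {
    assert(Seq.fill(1000)(1).sum === 1000)
  }
}

// From sbt, run one category at a time:
//   test-only * -- -n org.example.SmallTest   // include by tag
//   test-only * -- -l org.example.LargeTest   // exclude by tag
```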
Re: Unit tests in 5 minutes
Just as a note: when you're developing stuff, you can use test-only in sbt, or the equivalent feature in Maven, to run just some of the tests. That's what I do; I don't wait for Jenkins. 90% of the time, if a change passes the tests I know it could break, it will pass all of Jenkins. Jenkins should always run all the integration tests, so I don't think the full run will become *that* much shorter in the long run, though it can certainly be improved. Matei On August 8, 2014 at 10:20:35 AM, Nicolas Liochon (nkey...@gmail.com) wrote: > fwiw, when we did this work in HBase, we categorized the tests. [...]
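A usage example, with an illustrative suite name; the wildcard form is handy when you don't remember the full package path.

```scala
// From the sbt prompt (sbt 0.13-era command name):
//   test-only org.apache.spark.rdd.RDDSuite     // one suite
//   test-only *RDDSuite                         // wildcard match
//   test-only *RDDSuite -- -z "repartition"     // only tests whose
//                                               // names contain a string
```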
Re: Unit tests in 5 minutes
I dug around in this a bit a while ago; I think if someone sat down and profiled the tests, we could likely find things to optimize. In particular, there may be overhead in starting up a local SparkContext that could be minimized, which would speed up all the tests. Also, some tests (especially in Streaming) take really long, like 60 seconds for a single test (see some of the new Flume tests); those could almost certainly be optimized. I think 5 minutes might be out of reach, but something like a 2X improvement might be possible, and that would be very valuable if accomplished. - Patrick On Fri, Aug 8, 2014 at 11:24 AM, Matei Zaharia matei.zaha...@gmail.com wrote: > Just as a note, when you're developing stuff, you can use test-only in sbt, or the equivalent feature in Maven, to run just some of the tests. [...]
Re: Unit tests in 5 minutes
One simple optimization might be to disable the application web UI in tests that don't need it. When running tests on my local machine while also running another Spark shell, I've noticed the test logs fill up with errors as the web UI attempts to bind to the default port, fails, and tries a higher one. - Josh On August 8, 2014 at 11:54:24 AM, Patrick Wendell (pwend...@gmail.com) wrote: > I dug around this a bit a while ago, I think if someone sat down and profiled the tests it's likely we could find some things to optimize. [...]
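A minimal sketch of what Josh suggests, using Spark's spark.ui.enabled setting; whether every suite can tolerate a disabled UI is the part that would need checking.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Skip the web UI entirely in tests that don't exercise it, avoiding
// the bind-retry noise in the logs (and a little startup time).
val conf = new SparkConf()
  .setMaster("local[2]")
  .setAppName("ui-free-test")
  .set("spark.ui.enabled", "false")
val sc = new SparkContext(conf)
```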
Re: Unit tests in 5 minutes
Josh - that was actually fixed recently; we now just bind to a random port when running tests. On Fri, Aug 8, 2014 at 12:00 PM, Josh Rosen rosenvi...@gmail.com wrote: > One simple optimization might be to disable the application web UI in tests that don't need it. [...]
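For reference, a sketch of the random-port idea Patrick describes: binding to port 0 delegates port selection to the OS, so concurrent test JVMs can't collide. This shows the standard JVM mechanism, not necessarily the exact code of Spark's fix.

```scala
import java.net.ServerSocket

// Port 0 means "any free port"; getLocalPort reads back the port the
// OS actually assigned, which the test can then hand to the server.
val socket = new ServerSocket(0)
val assignedPort = socket.getLocalPort
println(s"bound to free port $assignedPort")
socket.close()
```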