Some of our tests actually require spinning up a small multi-process
Spark cluster. These use the normal deployment codepath for Spark,
which means they rely on the Spark "assembly jar" being present. That
jar is generated when you run "mvn package", via a special subproject
called assembly in our build. This is a bit non-standard; the reason
is that some of our tests are really mini integration tests.
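
In practice that usually means something like the following two steps from
the repo root (build profiles as appropriate for your environment); the
package step has to come before the test step so the assembly jar exists:

    mvn -DskipTests clean package   # builds the assembly jar under assembly/target/
    mvn test                        # local-cluster tests can now launch the assembly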

- Patrick

On Thu, Oct 30, 2014 at 4:36 AM, Sean Owen <so...@cloudera.com> wrote:
> You are right that this is a bit weird compared to the Maven lifecycle
> semantics. Maven wants assembly to come after tests, but here some tests
> want to launch the final assembly themselves. Yes, you would not normally
> have to do this in two stages.
>
> On Oct 30, 2014 12:28 PM, "Niklas Wilcke"
> <1wil...@informatik.uni-hamburg.de> wrote:
>>
>> Can you please briefly explain why packaging is necessary? I thought
>> packaging would only build the jar and place it in the target folder.
>> How does that affect the tests? If the tests depend on the assembly, a
>> "mvn install" would seem more sensible to me.
>> Probably I misunderstand the Maven build lifecycle.
>>
>> Thanks,
>> Niklas
>>
>> On 29.10.2014 19:01, Patrick Wendell wrote:
>> > One thing: you need to do a "mvn package" before you run the tests.
>> > The "local-cluster" tests depend on Spark already being packaged.
>> >
>> > - Patrick
>> >
>> > On Wed, Oct 29, 2014 at 10:02 AM, Niklas Wilcke
>> > <1wil...@informatik.uni-hamburg.de> wrote:
>> >> Hi Sean,
>> >>
>> >> Thanks for your reply. The tests still don't work. I focused on the
>> >> MLlib and core tests and made some observations.
>> >>
>> >> The core tests seem to fail because of my German locale. Some tests
>> >> are locale dependent, for example in UtilsSuite.scala (a small
>> >> illustration follows the list):
>> >>  - "string formatting of time durations" - checks for locale-dependent
>> >> separators like "." and ","
>> >>  - "isBindCollision" - checks for the locale-dependent exception
>> >> message
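>> >>
>> >> To illustrate the kind of locale sensitivity involved, here is a minimal
>> >> standalone sketch (not the actual test code) of how the same format call
>> >> changes with the locale:
>> >>
>> >>   import java.util.Locale
>> >>
>> >>   object LocaleDemo extends App {
>> >>     // Same value, different decimal separator depending on the locale:
>> >>     println("%.1f".formatLocal(Locale.US, 1.5))      // prints 1.5
>> >>     println("%.1f".formatLocal(Locale.GERMANY, 1.5)) // prints 1,5
>> >>     // A test that asserts on the "." form therefore fails when the
>> >>     // default locale is German.
>> >>   }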
>> >>
>> >> In MLlib it seems to be just one source of failure. The same
>> >> exception I described in my first mail appears several times in
>> >> different tests.
>> >> The reason for all the similar failures is line 29 in
>> >> LocalClusterSparkContext.scala.
>> >> When I change the line
>> >> .setMaster("local-cluster[2, 1, 512]")
>> >> to
>> >> .setMaster("local")
>> >> all tests run without a failure. The local-cluster mode seems to be the
>> >> reason for the failure. I tried some different configurations like
>> >> [1,1,512], [2,1,1024] etc. but couldn't get the tests to run without a
>> >> failure.
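>> >>
>> >> For reference, a rough sketch of what that setup amounts to (not the
>> >> exact contents of LocalClusterSparkContext.scala):
>> >>
>> >>   import org.apache.spark.{SparkConf, SparkContext}
>> >>
>> >>   object LocalClusterSketch {
>> >>     def main(args: Array[String]): Unit = {
>> >>       // "local-cluster[2, 1, 512]" = 2 worker processes, 1 core and
>> >>       // 512 MB each. Unlike "local", this mode launches separate worker
>> >>       // JVMs from the packaged assembly, so it only works after
>> >>       // "mvn package" has been run.
>> >>       val conf = new SparkConf()
>> >>         .setMaster("local-cluster[2, 1, 512]")
>> >>         .setAppName("test-cluster")
>> >>       val sc = new SparkContext(conf)
>> >>       try {
>> >>         println(sc.parallelize(1 to 100).count()) // exercises the workers
>> >>       } finally {
>> >>         sc.stop()
>> >>       }
>> >>     }
>> >>   }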
>> >>
>> >> Could this be a configuration issue?
>> >>
>> >> On 28.10.2014 19:03, Sean Owen wrote:
>> >>> On Tue, Oct 28, 2014 at 6:18 PM, Niklas Wilcke
>> >>> <1wil...@informatik.uni-hamburg.de> wrote:
>> >>>> 1. via dev/run-tests script
>> >>>>     This script executes all tests and takes several hours to finish.
>> >>>> Some tests failed, but I can't say which ones. Should this really take
>> >>>> that long? Can I run only the MLlib tests?
>> >>> Yes, running all tests takes a long long time. It does print which
>> >>> tests failed, and you can see the errors in the test output.
>> >>>
>> >>> Did you read
>> >>> http://spark.apache.org/docs/latest/building-with-maven.html#spark-tests-in-maven
>> >>> ? This shows how to run just one test suite.
>> >>>
>> >>> In any Maven project you can try things like "mvn test -pl [module]"
>> >>> to run just one module's tests.
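>> >>>
>> >>> For example, something like this should work from the Spark root (same
>> >>> profiles as in your commands; the suite name is only an example, and you
>> >>> may need a prior "mvn install" so the other modules are in your local
>> >>> repo):
>> >>>
>> >>>   mvn -Pyarn -Phadoop-2.3 -Phive -pl mllib test
>> >>>   mvn -Pyarn -Phadoop-2.3 -Phive -DwildcardSuites=org.apache.spark.mllib.util.MLUtilsSuite test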
>> >> Yes, I tried that, as described below under point 2.
>> >>>> 2. directly via maven
>> >>>> I did the following, as described in the docs [0].
>> >>>>
>> >>>> export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M
>> >>>> -XX:ReservedCodeCacheSize=512m"
>> >>>> mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package
>> >>>> mvn -Pyarn -Phadoop-2.3 -Phive test
>> >>>>
>> >>>> This also doesn't work.
>> >>>> Why do I have to package Spark before running the tests?
>> >>> What doesn't work?
>> >>> Some tests use the built assembly, which requires packaging.
>> >> I get the same exceptions as with every other approach.
>> >>>> 3. via sbt
>> >>>> I tried the following. I freshly cloned Spark and checked out the tag
>> >>>> v1.1.0-rc4.
>> >>>>
>> >>>> sbt/sbt "project mllib" test
>> >>>>
>> >>>> and got the following exception in several cluster tests.
>> >>>>
>> >>>> [info] - task size should be small in both training and prediction *** FAILED ***
>> >>> This just looks like a flaky test failure; I'd try again.
>> >>>
>> >> I don't think so. I have tried several times now, in several different
>> >> ways.
>> >>
>> >> Thanks,
>> >> Niklas
>> >>
>> >>
>> >>
>> >>
>>
>>
>>
>
