Re: unit testing for spark code

2021-03-22 Thread Attila Zsolt Piros
Hi! Let me draw your attention to Holden's *spark-testing-base* project. The documentation is at https://github.com/holdenk/spark-testing-base/wiki. As I usually write tests for Spark internal features, I haven't needed to test at such a high level, but I am interested in your experiences. Best

Re: unit testing for spark code

2021-03-22 Thread Nicholas Gustafson
I've found pytest works well if you're using PySpark. Though if you have a lot of tests, running them all can be pretty slow. On Mon, Mar 22, 2021 at 6:32 AM Amit Sharma wrote: > Hi, can we write unit tests for spark code. Is there any specific > framework? > > > Thanks > Amit >

Re: unit testing for spark code

2021-03-22 Thread Mich Talebzadeh
Coding in Scala or Python? Are you using any IDE (IntelliJ, PyCharm)?

unit testing for spark code

2021-03-22 Thread Amit Sharma
Hi, can we write unit tests for Spark code? Is there any specific framework? Thanks, Amit

Re: Unit testing Spark/Scala code with Mockito

2020-05-20 Thread ZHANG Wei
AFAICT, it depends on the testing goal: unit test, integration test, or E2E test. A unit test mostly tests an individual class or its methods; Mockito can help mock and verify dependent instances or methods. For an integration test, some Spark testing helper methods can set up the environment, such

Re: Unit testing Spark/Scala code with Mockito

2020-05-20 Thread Mich Talebzadeh
On a second note, with regard to Spark reads and writes: as I understand it, unit tests are not meant to test database connections. That should be done in integration tests, which check that all the parts work together. Unit tests are just meant to test the functional logic, and not Spark's ability to read from
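One way to act on this advice is to keep the record-level logic in plain functions that need neither a Spark session nor a database. The following is a minimal sketch under stated assumptions: `classify_record`, `split_records`, and the validation rule are hypothetical and not taken from the thread.

```python
def classify_record(record):
    """Return "good" or "exception" for one parsed record (hypothetical rule)."""
    # Hypothetical validation rule: a record needs a non-empty id
    # and an amount that can be read as a number.
    if not record.get("id"):
        return "exception"
    try:
        float(record.get("amount"))
    except (TypeError, ValueError):
        return "exception"
    return "good"


def split_records(records):
    """Partition records into (good, exceptions) without touching Spark or a database."""
    good, exceptions = [], []
    for record in records:
        target = good if classify_record(record) == "good" else exceptions
        target.append(record)
    return good, exceptions
```

In a Spark job the same function could be applied inside a map or a UDF, while the actual reads and writes are left to integration tests.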

Unit testing Spark/Scala code with Mockito

2020-05-20 Thread Mich Talebzadeh
Hi, I have a Spark job that reads an XML file from HDFS, processes it, and ports the data to Hive tables: one good table and one exception table. The code itself works fine. I need to create unit tests with Mockito for it. A unit test should test

Unit testing PySpark Code and doing assertion

2019-09-03 Thread Rahul Nandi
Hi, I'm trying to do unit testing of my pyspark DataFrame code. My goal is to do an assertion on the schema and data of the DataFrames. I'm looking for options if there are any known libraries that I can use for doing the same. Any library which can work on 10-15 records in the DataFrame is good
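For a handful of records, schema and data assertions can be written without any extra library. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the caller passes `df.dtypes` (a list of column name and type string pairs) and `df.collect()` from the PySpark DataFrames being compared.

```python
from collections import Counter


def assert_same_schema(actual_dtypes, expected_dtypes):
    """Compare (column name, type string) pairs, as returned by DataFrame.dtypes."""
    assert list(actual_dtypes) == list(expected_dtypes), (
        f"schema mismatch: {actual_dtypes} != {expected_dtypes}"
    )


def assert_same_rows(actual_rows, expected_rows):
    """Compare rows as multisets, so row order does not matter."""
    # Rows must be hashable once converted to tuples (no nested lists or dicts).
    actual = Counter(tuple(row) for row in actual_rows)
    expected = Counter(tuple(row) for row in expected_rows)
    missing = expected - actual
    unexpected = actual - expected
    assert not missing and not unexpected, (
        f"missing rows: {dict(missing)}, unexpected rows: {dict(unexpected)}"
    )
```

Comparing as multisets avoids false failures from partition ordering while still catching duplicated or dropped rows.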

Re: Configuration for unit testing and sql.shuffle.partitions

2017-09-18 Thread Vadim Semenov
You can create a superclass "FunSuiteWithSparkContext" that creates a Spark session, Spark context, and SQLContext with all the desired properties. Then you add the class to all the relevant test suites, and that's pretty much it. The other option is to pass it as a VM

Re: Configuration for unit testing and sql.shuffle.partitions

2017-09-16 Thread Femi Anthony
How are you specifying it, as an option to spark-submit ? On Sat, Sep 16, 2017 at 12:26 PM, Akhil Das wrote: > spark.sql.shuffle.partitions is still used I believe. I can see it in the > code >

Re: Configuration for unit testing and sql.shuffle.partitions

2017-09-16 Thread Akhil Das
spark.sql.shuffle.partitions is still used I believe. I can see it in the code and in the documentation page

Configuration for unit testing and sql.shuffle.partitions

2017-09-12 Thread peay
Hello, I am running unit tests with Spark DataFrames, and I am looking for configuration tweaks that would make tests faster. Usually, I use a local[2] or local[4] master. Something that has been bothering me is that most of my stages end up using 200 partitions, independently of whether I
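The 200 partitions mentioned here match the default of `spark.sql.shuffle.partitions`, which can be lowered for small test inputs. A hedged sketch of a test-only configuration follows; the specific values, `TEST_SPARK_CONF`, and `build_test_session` are illustrative assumptions, not recommendations from the thread.

```python
# Settings commonly reduced for small local test data; values are illustrative.
TEST_SPARK_CONF = {
    # Default is 200, which creates many near-empty tasks on tiny inputs.
    "spark.sql.shuffle.partitions": "2",
    # RDD shuffles such as reduceByKey and join use this setting instead.
    "spark.default.parallelism": "2",
    # The web UI is rarely needed inside a test run.
    "spark.ui.enabled": "false",
}


def build_test_session(master="local[2]", app_name="unit-tests", conf=None):
    """Create a local SparkSession with the test settings applied."""
    # Imported lazily so this module can be imported without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master(master).appName(app_name)
    for key, value in (conf or TEST_SPARK_CONF).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Keeping the settings in one dictionary makes it easy to reuse them across test suites and to override a single value for a specific test.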

Re: unit testing in spark

2017-04-11 Thread Elliot West
then it might be worth to test the function with many different data inputs > as unit tests and have integrated job/pipeline tests in addition. > > On 10. Apr 2017, at 15:46, Gokula Krishnan D <email2...@gmail.com> wrote: > > Hello Shiv, > > Unit Testing is really helping wh

Re: unit testing in spark

2017-04-11 Thread Steve Loughran
(sorry sent an empty reply by accident) Unit testing is one of the easiest ways to isolate problems in an internal class, things you can get wrong. But: time spent writing unit tests is time *not* spent writing integration tests. Which biases me towards the integration. What I do find

Re: unit testing in spark

2017-04-10 Thread Jörn Franke
:46, Gokula Krishnan D <email2...@gmail.com> wrote: > > Hello Shiv, > > Unit Testing is really helping when you follow TDD approach. And it's a safe > way to code a program locally and also you can make use those test cases > during the build process by using any of the c

Re: unit testing in spark

2017-04-10 Thread Gokula Krishnan D
Hello Shiv, Unit testing really helps when you follow a TDD approach. It's a safe way to code a program locally, and you can also make use of those test cases during the build process with any of the continuous integration tools (Bamboo, Jenkins). If so you can ensure that artifacts

Re: unit testing in spark

2017-04-05 Thread Shiva Ramagopal
in the >>>> various platforms, spark-testing-base >>>> <http://github.com/holdenk/spark-testing-base> for Scala/Java/Python >>>> (& video https://www.youtube.com/watch?v=f69gSGSLGrY), sscheck >>>> <https://github.com/juanrh/sscheck>

Re: unit testing in spark

2016-12-11 Thread Juan Rodríguez Hortalá
<http://github.com/holdenk/spark-testing-base> for Scala/Java/Python (& >>> video https://www.youtube.com/watch?v=f69gSGSLGrY), sscheck >>> <https://github.com/juanrh/sscheck> (scala focused property based), >>> pyspark.test (python focused with py.te

Re: unit testing in spark

2016-12-09 Thread Michael Stratton
latforms, spark-testing-base >> <http://github.com/holdenk/spark-testing-base> for Scala/Java/Python (& >> video https://www.youtube.com/watch?v=f69gSGSLGrY), sscheck >> <https://github.com/juanrh/sscheck> (scala focused property based), >> pyspark.test (python focused wi

Re: unit testing in spark

2016-12-09 Thread Marco Mistroni
(scala focused property based), > pyspark.test (python focused with py.test instead of unittest2) (& blog > post from nextdoor https://engblog.nextdoor.com/unit-testing- > apache-spark-with-py-test-3b8970dc013b#.jw3bdcej9 ) > > Good luck on your Spark Adventures :) > > P.S.

Re: unit testing in spark

2016-12-08 Thread Miguel Morales
atch?v=f69gSGSLGrY), sscheck (scala focused >>> property based), pyspark.test (python focused with py.test instead of >>> unittest2) (& blog post from nextdoor >>> https://engblog.nextdoor.com/unit-testing-apache-spark-with-py-test-3b8970dc013b#.jw3bdcej9 >>>

Re: unit testing in spark

2016-12-08 Thread Holden Karau
e.com/watch?v=f69gSGSLGrY), sscheck > <https://github.com/juanrh/sscheck> (scala focused property based), > pyspark.test (python focused with py.test instead of unittest2) (& blog > post from nextdoor https://engblog.nextdoor.com/unit-testing- > apache-spark-with-py-test-3b89

Re: unit testing in spark

2016-12-08 Thread Miguel Morales
pyspark.test (python focused with py.test instead of unittest2) (& > blog post from nextdoor > https://engblog.nextdoor.com/unit-testing-apache-spark-with-py-test-3b8970dc013b#.jw3bdcej9 > ) > > Good luck on your Spark Adventures :) > > P.S. > > If anyone is inter

Re: unit testing in spark

2016-12-08 Thread Holden Karau
focused property based), pyspark.test (python focused with py.test instead of unittest2) (& blog post from nextdoor https://engblog.nextdoor.com/unit-testing-apache-spark-with-py-test-3b8970dc013b#.jw3bdcej9 ) Good luck on your Spark Adventures :) P.S. If anyone is interested in helping i

Re: unit testing in spark

2016-12-08 Thread Lars Albertsson
I wrote some advice in a previous post on the list: http://markmail.org/message/bbs5acrnksjxsrrs It does not mention python, but the strategy advice is the same. Just replace JUnit/Scalatest with pytest, unittest, or your favourite python test framework. I recently held a presentation on the

Re: unit testing in spark

2016-12-08 Thread ndjido
Hi Pseudo, Just use unittest https://docs.python.org/2/library/unittest.html . > On 8 Dec 2016, at 19:14, pseudo oduesp wrote: > > somone can tell me how i can make unit test on pyspark ? > (book, tutorial ...)
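As an illustration of the `unittest` route, the sketch below shares one local SparkSession across the tests of a class. It is a minimal example under assumptions: the class name, the `local[2]` master, and the sample data are all hypothetical, and the class is skipped when PySpark is not installed.

```python
import importlib.util
import unittest

# True only when the pyspark package can be found on this machine.
HAVE_PYSPARK = importlib.util.find_spec("pyspark") is not None


@unittest.skipUnless(HAVE_PYSPARK, "pyspark is not installed")
class WordLengthTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # One session per test class keeps startup cost out of each test.
        from pyspark.sql import SparkSession

        cls.spark = (
            SparkSession.builder.master("local[2]")
            .appName("unittest-example")
            .getOrCreate()
        )

    @classmethod
    def tearDownClass(cls):
        cls.spark.stop()

    def test_length_column(self):
        from pyspark.sql import functions as F

        df = self.spark.createDataFrame([("spark",), ("test",)], ["word"])
        rows = df.withColumn("n", F.length("word")).collect()
        self.assertEqual(
            {(row["word"], row["n"]) for row in rows},
            {("spark", 5), ("test", 4)},
        )
```

Such a class can be run with `python -m unittest`; creating the session in `setUpClass` rather than `setUp` is a design choice that trades test isolation for speed.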

unit testing in spark

2016-12-08 Thread pseudo oduesp
Can someone tell me how I can write unit tests for PySpark? (book, tutorial ...)

Driver/Executor Memory values during Unit Testing

2016-12-07 Thread Aleksander Eskilson
Hi there, I've been trying to increase the spark.driver.memory and spark.executor.memory during some unit tests. Most of the information I can find about increasing memory for Spark is based on either flags to spark-submit, or settings in the spark-defaults.conf file. Running unit tests with

Re: Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-22 Thread Bedrytski Aliaksandr
-base/issues/123 ). >>> I'll try and include this in the next release :) >>> >>> On Mon, Aug 1, 2016 at 9:22 AM, Koert Kuipers >>> <ko...@tresata.com> wrote: >>> we share a single sparksession across tests, and they can run >>> in par

Re: Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-21 Thread Everett Anderson
n Mon, Aug 1, 2016 at 9:22 AM, Koert Kuipers > <ko...@tresata.com> wrote: > we share a single sparksession across tests, and they can run > in parallel. is pretty fast > > On Mon, Aug 1, 2016 at 12:02 PM, Everett Anderson > <ever...@nuna.com.invalid> wrote: >

Re: Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-21 Thread Bedrytski Aliaksandr
l. is pretty fast > > On Mon, Aug 1, 2016 at 12:02 PM, Everett Anderson > <ever...@nuna.com.invalid> wrote: > Hi, > > Right now, if any code uses DataFrame/Dataset, I need a test setup > that brings up a local master as in this article[1]. > > That's a lot of overhead

Re: Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-19 Thread Everett Anderson
>>> >>> Do people have any tricks to get around this? Maybe using spy mocks on >>> fake DataFrame/Datasets? >>> >>> Anyone know if there are plans to make more traditional unit testing >>> possible with Spark SQL, perhaps with a stripped down in-me

Re: Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-01 Thread Holden Karau
f any code uses DataFrame/Dataset, I need a test setup that >> brings up a local master as in this article >> <http://blog.cloudera.com/blog/2015/09/making-apache-spark-testing-easy-with-spark-testing-base/> >> . >> >> That's a lot of overhead for unit testing and

Re: Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-01 Thread Koert Kuipers
up a local master as in this article > <http://blog.cloudera.com/blog/2015/09/making-apache-spark-testing-easy-with-spark-testing-base/> > . > > That's a lot of overhead for unit testing and the tests can't run in > parallel, so testing is slow -- this is more like what I'd call an

Plans for improved Spark DataFrame/Dataset unit testing?

2016-08-01 Thread Everett Anderson
Hi, Right now, if any code uses DataFrame/Dataset, I need a test setup that brings up a local master as in this article <http://blog.cloudera.com/blog/2015/09/making-apache-spark-testing-easy-with-spark-testing-base/> . That's a lot of overhead for unit testing and the tests can

Re: Unit testing framework for Spark Jobs?

2016-05-21 Thread Lars Albertsson
>> > 3. It doesn't support some fail-fast exception which your code can >> raise to indicate that the desired state is never going to be reached, and >> so the test should fail fast. Here a new exception and another entry in >> anExceptionThatShouldCauseAnAbort() may be the a

Re: Unit testing framework for Spark Jobs?

2016-05-18 Thread Todd Nist
Perhaps these may be of some use: https://github.com/mkuthan/example-spark http://mkuthan.github.io/blog/2015/03/01/spark-unit-testing/ https://github.com/holdenk/spark-testing-base On Wed, May 18, 2016 at 2:14 PM, swetha kasireddy <swethakasire...@gmail.com > wrote: > Hi Lars, >

Re: Unit testing framework for Spark Jobs?

2016-05-18 Thread swetha kasireddy
at some more. > > > > > >> > >> This poll and sleep strategy both makes tests quick in successful > >> cases, but still robust to occasional delays. The strategy does not > >> work if you want to test for absence, e.g. ensure that a particular >

Re: Scala: Perform Unit Testing in spark

2016-04-06 Thread Shishir Anshuman
an > <shishiranshu...@gmail.com> wrote: > > Hello, > > > > I have a code written in scala using Mllib. I want to perform unit > testing > > it. I cant decide between Junit 4 and ScalaTest. > > I am new to Spark. Please guide me how to proceed with the testing. > > > > Thank you. >

Re: Scala: Perform Unit Testing in spark

2016-04-06 Thread Lars Albertsson
www.mapflat.com +46 70 7687109 On Fri, Apr 1, 2016 at 10:31 PM, Shishir Anshuman <shishiranshu...@gmail.com> wrote: > Hello, > > I have a code written in scala using Mllib. I want to perform unit testing > it. I cant decide between Junit 4 and ScalaTest. > I am new to Spark. Please gu

Re: Scala: Perform Unit Testing in spark

2016-04-02 Thread Ted Yu
> >>>> *libraryDependencies ++= Seq( "org.apache.spark" % "spark-core_2.10" % >>>> "1.6.0", "org.apache.spark" % "spark-mllib_2.10" % "1.6.0" )* >>> >>> >>> >>> >>>

Re: Scala: Perform Unit Testing in spark

2016-04-01 Thread Shishir Anshuman
.10" % "1.6.0" )* >> >> >> >> >> On Sat, Apr 2, 2016 at 2:21 AM, Ted Yu <yuzhih...@gmail.com> wrote: >> >>> Assuming your code is written in Scala, I would suggest using ScalaTest. >>> >>> Please take a look at the XXSu

Re: Scala: Perform Unit Testing in spark

2016-04-01 Thread Ted Yu
sing ScalaTest. >> >> Please take a look at the XXSuite.scala files under mllib/ >> >> On Fri, Apr 1, 2016 at 1:31 PM, Shishir Anshuman < >> shishiranshu...@gmail.com> wrote: >> >>> Hello, >>> >>> I have a code written in scala using Mllib. I want to perform unit >>> testing it. I cant decide between Junit 4 and ScalaTest. >>> I am new to Spark. Please guide me how to proceed with the testing. >>> >>> Thank you. >>> >> >> >

Re: Scala: Perform Unit Testing in spark

2016-04-01 Thread Holden Karau
ase take a look at the XXSuite.scala files under mllib/ > > On Fri, Apr 1, 2016 at 1:31 PM, Shishir Anshuman < > shishiranshu...@gmail.com> wrote: > >> Hello, >> >> I have a code written in scala using Mllib.

Re: Scala: Perform Unit Testing in spark

2016-04-01 Thread Ted Yu
Assuming your code is written in Scala, I would suggest using ScalaTest. Please take a look at the XXSuite.scala files under mllib/ On Fri, Apr 1, 2016 at 1:31 PM, Shishir Anshuman <shishiranshu...@gmail.com> wrote: > Hello, > > I have a code written in scala using Mllib. I want

Scala: Perform Unit Testing in spark

2016-04-01 Thread Shishir Anshuman
Hello, I have code written in Scala using MLlib. I want to unit test it. I can't decide between JUnit 4 and ScalaTest. I am new to Spark. Please guide me on how to proceed with the testing. Thank you.

Re: Unit testing framework for Spark Jobs?

2016-03-30 Thread Lars Albertsson
l delays. The strategy does not >> work if you want to test for absence, e.g. ensure that a particular >> message is filtered. You can work around it by adding another message >> afterwards and polling for its effect before testing for absence of >> the first. Be aware that
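The poll-and-sleep strategy and the absence workaround quoted here can be sketched as follows. This is an illustration only: `wait_until`, `assert_absent`, and the default timeouts are hypothetical names and values, and the predicates stand in for whatever check the test makes against the job's output.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.1):
    """Poll predicate until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True  # fast path: successful tests finish as soon as possible
        if time.monotonic() >= deadline:
            return False  # slow path: give up only after the full timeout
        time.sleep(interval)


def assert_absent(is_filtered_seen, is_sentinel_seen, timeout=10.0):
    """Check absence: wait for a later sentinel message, then verify the first never appeared."""
    assert wait_until(is_sentinel_seen, timeout), "sentinel message never arrived"
    assert not is_filtered_seen(), "message that should have been filtered was seen"
```

The absence check relies on the sentinel being processed after the message under test, so it inherits any ordering caveats of the pipeline being tested.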

Re: Unit testing framework for Spark Jobs?

2016-03-28 Thread Steve Loughran
ribed above, > and it is straightforward to set up. Let me know if you want > clarifications or assistance. > > Regards, > > > > Lars Albertsson > Data engineering consultant > www.mapflat.com > +46 70 7687109 > > > On Wed, Mar

Re: Unit testing framework for Spark Jobs?

2016-03-24 Thread Shiva Ramagopal
hat messages can be processed out of order in > Spark Streaming depending on partitioning, however. > > > I have tested Spark applications with both strategies described above, > and it is straightforward to set up. Let me know if you want > clarifications or assistance. > > R

Re: Unit testing framework for Spark Jobs?

2016-03-19 Thread Vikas Kawadia
I just wrote a blog post on Unit testing Apache Spark with py.test https://engblog.nextdoor.com/unit-testing-apache-spark-with-py-test-3b8970dc013b If you prefer using the py.test framework, then it might be useful. -vikas On Wed, Mar 2, 2016 at 10:59 AM, radoburansky <radoburan...@gmail.

Re: Unit testing framework for Spark Jobs?

2016-03-19 Thread Lars Albertsson
clarifications or assistance. Regards, Lars Albertsson Data engineering consultant www.mapflat.com +46 70 7687109 On Wed, Mar 2, 2016 at 6:54 PM, SRK <swethakasire...@gmail.com> wrote: > Hi, > > What is a good unit testing framework for Spark batch/streaming jobs? I have > cor

Re: Unit testing framework for Spark Jobs?

2016-03-02 Thread radoburansky
I am sure you have googled this: https://github.com/holdenk/spark-testing-base On Wed, Mar 2, 2016 at 6:54 PM, SRK [via Apache Spark User List] < ml-node+s1001560n2638...@n3.nabble.com> wrote: > Hi, > > What is a good unit testing framework for Spark batch/streaming jobs? I >

Re: Unit testing framework for Spark Jobs?

2016-03-02 Thread Ricardo Paiva
ias("num")) assertEquals(5, numDf.count) } } On Wed, Mar 2, 2016 at 2:54 PM, SRK [via Apache Spark User List] < ml-node+s1001560n26380...@n3.nabble.com> wrote: > Hi, > > What is a good unit testing framework for Spark batch/streaming jobs? I > have core spark

Re: Unit testing framework for Spark Jobs?

2016-03-02 Thread Silvio Fiorito
Please check out the following for some good resources: https://github.com/holdenk/spark-testing-base https://spark-summit.org/east-2016/events/beyond-collect-and-parallelize-for-tests/ On 3/2/16, 12:54 PM, "SRK" <swethakasire...@gmail.com> wrote: >Hi, > >Wh

Re: Unit testing framework for Spark Jobs?

2016-03-02 Thread Yin Yang
Cycling prior bits: http://search-hadoop.com/m/q3RTto4sby1Cd2rt=Re+Unit+test+with+sqlContext On Wed, Mar 2, 2016 at 9:54 AM, SRK <swethakasire...@gmail.com> wrote: > Hi, > > What is a good unit testing framework for Spark batch/streaming jobs? I > have > core spark, spar

Unit testing framework for Spark Jobs?

2016-03-02 Thread SRK
Hi, What is a good unit testing framework for Spark batch/streaming jobs? I use core Spark, Spark SQL with DataFrames, and the streaming API. Is there any good framework to cover unit tests for these APIs? Thanks!

Re: Allow multiple SparkContexts in Unit Testing

2015-11-04 Thread Priya Ch
I already tried setting spark.driver.allowMultipleContexts to true, but it was not successful. I think the problem is that we have different test suites, which of course run in parallel. How do we stop the SparkContext after each test suite and start it in the next one, or is there any way to share the SparkContext

Re: Allow multiple SparkContexts in Unit Testing

2015-11-04 Thread Bryan Jeffrey
Priya, If you're trying to get unit tests running local spark contexts, you can just set up your spark context with 'spark.driver.allowMultipleContexts' set to true. Example: def create(seconds : Int, appName : String): StreamingContext = { val master = "local[*]" val conf = new

Re: Allow multiple SparkContexts in Unit Testing

2015-11-04 Thread Ted Yu
Are you trying to speed up tests where each test suite uses single SparkContext ? You may want to read: https://issues.apache.org/jira/browse/SPARK-2243 Cheers On Wed, Nov 4, 2015 at 4:59 AM, Priya Ch wrote: > Hello All, > > How to use multiple Spark Context in

Allow multiple SparkContexts in Unit Testing

2015-11-04 Thread Priya Ch
Hello All, How can I use multiple SparkContexts when executing multiple test suites of Spark code? Can someone shed some light on this?

Re: Mock Cassandra DB Connection in Unit Testing

2015-10-29 Thread Priya Ch
One more question: if I have a function which takes an RDD as a parameter, how do we mock an RDD? On Thu, Oct 29, 2015 at 5:20 PM, Priya Ch wrote: > How do we do it for Cassandra..can we use the same Mocking ? > EmbeddedCassandra Server is available with

Re: Mock Cassandra DB Connection in Unit Testing

2015-10-29 Thread Priya Ch
How do we do it for Cassandra? Can we use the same mocking? An EmbeddedCassandra server is available with CassandraUnit. Can this be used in Spark code as well, I mean with Scala code? On Thu, Oct 29, 2015 at 5:03 PM, Василец Дмитрий wrote: > there is example how i

Mock Cassandra DB Connection in Unit Testing

2015-10-29 Thread Priya Ch
Hi All, For my Spark Streaming code, which writes the results to Cassandra DB, I need to write unit test cases. What are the available test frameworks to mock the connection to Cassandra DB?

Re: Mock Cassandra DB Connection in Unit Testing

2015-10-29 Thread Василец Дмитрий
Here is an example of how I mock MySQL: import org.scalamock.scalatest.MockFactory val connectionMock = mock[java.sql.Connection] val statementMock = mock[PreparedStatement] (connectionMock.prepareStatement(_: String)).expects(sql.toString).returning(statementMock) (statementMock.executeUpdate
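The same idea of mocking the connection rather than the database can be sketched in Python with the standard library's `unittest.mock`. This is an analogue, not a translation of the ScalaMock snippet: it assumes a DB-API style connection, and `save_count`, the SQL string, and the table name are hypothetical.

```python
from unittest import mock


def save_count(connection, sql, count):
    """Hypothetical code under test: run one parameterised update and return affected rows."""
    cursor = connection.cursor()
    cursor.execute(sql, (count,))
    connection.commit()
    return cursor.rowcount


def test_save_count_records_update():
    # The mock stands in for a real database connection.
    connection = mock.Mock()
    cursor = connection.cursor.return_value
    cursor.rowcount = 1

    sql = "UPDATE counts SET n = %s"
    assert save_count(connection, sql, 42) == 1

    # Verify the interaction instead of any database state.
    cursor.execute.assert_called_once_with(sql, (42,))
    connection.commit.assert_called_once_with()
```

A test like this checks that the code issues the expected statement; whether the statement works against a real database is left to an integration test.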

Re: Mock Cassandra DB Connection in Unit Testing

2015-10-29 Thread Adrian Tanase
s.datastax.com>" Subject: Re: Mock Cassandra DB Connection in Unit Testing One more question, if i have a function which takes RDD as a parameter, how do we mock an RDD ?? On Thu, Oct 29, 2015 at 5:20 PM, Priya Ch <learnings.chitt...@gmail.com>

Re: What are best practices from Unit Testing Spark Code?

2015-09-26 Thread ehrlichja

Re: Unit Testing

2015-08-13 Thread jay vyas
to use intellij with the set plugins for scala development. It allows you to run everything from inside the IDE. I've written up setup instructions here: http://jayunit100.blogspot.com/2014/07/set-up-spark-application-devleopment.html Now, regarding local unit testing: As an example, here

Re: Unit Testing

2015-08-13 Thread Burak Yavuz
I would recommend this spark package for your unit testing needs ( http://spark-packages.org/package/holdenk/spark-testing-base). Best, Burak On Thu, Aug 13, 2015 at 5:51 AM, jay vyas jayunit100.apa...@gmail.com wrote: yes there certainly is, so long as eclipse has the right plugins and so

Unit Testing

2015-08-12 Thread Mohit Anchlia
Is there a way to run Spark Streaming methods in a standalone Eclipse environment to test out the functionality?

Unit Testing Spark Transformations/Actions

2015-06-16 Thread Mark Tse
Hi there, I am looking to use Mockito to mock out some functionality while unit testing a Spark application. I currently have code that happily runs on a cluster, but fails when I try to run unit tests against it, throwing a SparkException: org.apache.spark.SparkException: Job aborted due

Re: Spark Unit Testing

2015-04-21 Thread James King
a JavaPairDStream<String, String> to my spark class. Is there a way to create a JavaPairDStream using Java API? Also is there a good resource that covers an approach (or approaches) for unit testing using Java. Regards jk -- Emre Sevinc

Re: Spark Unit Testing

2015-04-21 Thread Emre Sevinc
resource that covers an approach (or approaches) for unit testing using Java. Regards jk -- Emre Sevinc

Spark Unit Testing

2015-04-21 Thread James King
I'm trying to write some unit tests for my Spark code. I need to pass a JavaPairDStream<String, String> to my Spark class. Is there a way to create a JavaPairDStream using the Java API? Also, is there a good resource that covers an approach (or approaches) for unit testing using Java? Regards jk

Re: Unit testing with HiveContext

2015-04-09 Thread Daniel Siegmann
Thanks Ted, using HiveTest as my context worked. It still left a metastore directory and Derby log in my current working directory though; I manually added a shutdown hook to delete them and all was well. On Wed, Apr 8, 2015 at 4:33 PM, Ted Yu yuzhih...@gmail.com wrote: Please take a look at

Unit testing with HiveContext

2015-04-08 Thread Daniel Siegmann
I am trying to unit test some code which takes an existing HiveContext and uses it to execute a CREATE TABLE query (among other things). Unfortunately I've run into some hurdles trying to unit test this, and I'm wondering if anyone has a good approach. The metastore DB is automatically created in

Re: Unit testing with HiveContext

2015-04-08 Thread Ted Yu
Please take a look at sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala : protected def configure(): Unit = { warehousePath.delete() metastorePath.delete() setConf("javax.jdo.option.ConnectionURL", s"jdbc:derby:;databaseName=$metastorePath;create=true")

Unit testing and Spark Streaming

2014-12-12 Thread Eric Loots
Hi, I’ve started my first experiments with Spark Streaming and started with setting up an environment using ScalaTest to do unit testing. Poked around on this mailing list and googled the topic. One of the things I wanted to be able to do is to use Scala Sequences as data source in the tests

Re: Unit testing and Spark Streaming

2014-12-12 Thread Emre Sevinc
On Fri, Dec 12, 2014 at 2:17 PM, Eric Loots eric.lo...@gmail.com wrote: How can the log level in test mode be reduced (or extended when needed) ? Hello Eric, The following might be helpful for reducing the log messages during unit testing: http://stackoverflow.com/a/2736/236007 -- Emre

Re: Unit testing and Spark Streaming

2014-12-12 Thread Jay Vyas
PM, Eric Loots eric.lo...@gmail.com wrote: How can the log level in test mode be reduced (or extended when needed) ? Hello Eric, The following might be helpful for reducing the log messages during unit testing: http://stackoverflow.com/a/2736/236007 -- Emre Sevinç https

embedded spark for unit testing..

2014-11-09 Thread Kevin Burton
What’s the best way to embed spark to run local mode in unit tests? Some of our jobs are mildly complex and I want to keep verifying that they work including during schema changes / migration. I think for some of this I would just run local mode, read from a few text files via resources, and

Re: embedded spark for unit testing..

2014-11-09 Thread DB Tsai
You can write unit tests with a local Spark context by mixing in the LocalSparkContext trait. See https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/mllib/classification/LogisticRegressionSuite.scala

Re: Unit Testing (JUnit) with Spark

2014-10-29 Thread touchdown

Unit testing: Mocking out Spark classes

2014-10-16 Thread Saket Kumar
Hello all, I am trying to unit test the classes involved in my Spark job. I am trying to mock out the Spark classes (like SparkContext and Broadcast) so that I can unit test my classes in isolation. However I have realised that these are classes instead of traits. My first question is why? It is

Re: Unit testing: Mocking out Spark classes

2014-10-16 Thread Daniel Siegmann
Mocking these things is difficult; executing your unit tests in a local Spark context is preferred, as recommended in the programming guide http://spark.apache.org/docs/latest/programming-guide.html#unit-testing. I know this may not technically be a unit test, but it is hopefully close enough

Unit testing jar request

2014-10-15 Thread Jean Charles Jabouille
Hi, we are Spark users and we use some of Spark's test classes for our own application unit tests. We use LocalSparkContext and SharedSparkContext. But these classes are not included in the spark-core library. This is a good option as it's not a good idea to include test classes in the runtime

Unit Testing (JUnit) with Spark

2014-07-29 Thread soumick86
Is there any example out there for unit testing a Spark application in Java? Even a trivial application like word count would be very helpful. I am very new to this and I am struggling to understand how I can use JavaSparkContext with JUnit

Re: Unit Testing (JUnit) with Spark

2014-07-29 Thread jay vyas
:29 AM, soumick86 sdasgu...@dstsystems.com wrote: Is there any example out there for unit testing a Spark application in Java? Even a trivial application like word count will be very helpful. I am very new to this and I am struggling to understand how I can use JavaSpark Context for JUnit

Re: Unit Testing (JUnit) with Spark

2014-07-29 Thread Kostiantyn Kudriavtsev
: Is there any example out there for unit testing a Spark application in Java? Even a trivial application like word count will be very helpful. I am very new to this and I am struggling to understand how I can use JavaSpark Context for JUnit -- View this message in context: http

Re: Unit Testing (JUnit) with Spark

2014-07-29 Thread Sonal Goyal
29, 2014, at 6:29 PM, soumick86 sdasgu...@dstsystems.com wrote: Is there any example out there for unit testing a Spark application in Java? Even a trivial application like word count will be very helpful. I am very new to this and I am struggling to understand how I can use JavaSpark

Re: guidance on simple unit testing with Spark

2014-06-16 Thread Daniel Siegmann
testing at http://spark.apache.org/docs/latest/programming-guide.html#unit-testing, but still dont have a good understanding of writing unit tests using the Spark framework. Previously, I have written unit tests using specs2 framework and have got them to work in Scalding. I tried to use

Re: guidance on simple unit testing with Spark

2014-06-14 Thread Gerard Maas
looked through some of the test examples and also the brief documentation on unit testing at http://spark.apache.org/docs/latest/programming-guide.html#unit-testing, but still dont have a good understanding of writing unit tests using the Spark framework. Previously, I have written unit

Re: guidance on simple unit testing with Spark

2014-06-13 Thread Matei Zaharia
("line 1", "line 2")) val result = GetInfo.processData(myLines).collect() assert(result.toSet === Set("res 1", "res 2")) Matei On Jun 13, 2014, at 2:42 PM, SK skrishna...@gmail.com wrote: Hi, I have looked through some of the test examples and also the brief documentation on unit testing

Re: Spark unit testing best practices

2014-05-16 Thread Andras Nemeth
Thanks for the answers! On a concrete example, here is what I did to test my (wrong :) ) hypothesis before writing my email: class SomethingNotSerializable { def process(a: Int): Int = 2 * a } object NonSerializableClosure extends App { val sc = new spark.SparkContext( "local",

Re: Spark unit testing best practices

2014-05-16 Thread Nan Zhu
+1, at least with current code just watch the log printed by DAGScheduler… -- Nan Zhu On Wednesday, May 14, 2014 at 1:58 PM, Mark Hamstra wrote: serDe

Spark unit testing best practices

2014-05-15 Thread Andras Nemeth
Hi, Spark's local mode is great to create simple unit tests for our spark logic. The disadvantage however is that certain types of problems are never exposed in local mode because things never need to be put on the wire. E.g. if I accidentally use a closure which has something non-serializable

Re: Spark unit testing best practices

2014-05-14 Thread Andrew Ash
There's an undocumented mode that looks like it simulates a cluster: SparkContext.scala: // Regular expression for simulating a Spark cluster of [N, cores, memory] locally val LOCAL_CLUSTER_REGEX = """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*]""".r Can you try running your tests
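To make the shape of that master string concrete, here is a Python transcription of the quoted pattern. It only illustrates which strings match and what the three numbers are; `parse_local_cluster` is a hypothetical helper, not part of Spark, and the field names follow the `[N, cores, memory]` comment quoted above.

```python
import re

# Python transcription of the pattern quoted above: local-cluster[N, cores, memory]
LOCAL_CLUSTER_REGEX = re.compile(
    r"local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*\]"
)


def parse_local_cluster(master):
    """Return (n, cores, memory) as integers, or None if master does not match."""
    match = LOCAL_CLUSTER_REGEX.fullmatch(master)
    if match is None:
        return None
    n, cores, memory = (int(group) for group in match.groups())
    return n, cores, memory
```

A master such as `local-cluster[2, 1, 1024]` matches, while the usual `local[2]` does not, which is how the two modes are told apart.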

Re: Spark unit testing best practices

2014-05-14 Thread Philip Ogren
Have you actually found this to be true? I have found Spark local mode to be quite good about blowing up if there is something non-serializable and so my unit tests have been great for detecting this. I have never seen something that worked in local mode that didn't work on the cluster