Hi,

I've been following this thread for a while.

I'm trying to introduce a test strategy on my team for testing a number of
data pipelines before they go to production. I have watched Lars'
presentation and found it great. However, I'm debating whether unit tests
are worth the effort if there are already good job-level and pipeline-level
tests. Has anybody seen unit tests pay off in such a case?
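
To make the question concrete, this is roughly what I mean by a unit test
here: a minimal sketch, assuming the per-record logic is kept in a plain
Python function so it can be tested without a SparkContext. The function
`normalize_event` and its field names are hypothetical and only serve as an
illustration, they are not taken from a real pipeline.

```python
# Hypothetical per-record transformation, kept free of Spark so that
# it can be unit tested without starting a SparkContext.
def normalize_event(event):
    """Return a cleaned copy of an event dict, or None if it is invalid."""
    user_id = event.get("user_id")
    if not user_id:
        return None  # drop records without a user
    return {
        "user_id": str(user_id).strip().lower(),
        "amount_cents": int(round(float(event.get("amount", 0)) * 100)),
    }


# Unit tests in pytest style (plain functions with bare asserts).
def test_normalize_event_cleans_fields():
    result = normalize_event({"user_id": " Alice ", "amount": "12.50"})
    assert result == {"user_id": "alice", "amount_cents": 1250}


def test_normalize_event_drops_records_without_user():
    assert normalize_event({"amount": "1.00"}) is None
```

In a job, the same function could be passed to something like
`rdd.map(normalize_event)`, and the job-level test would then exercise it
end to end on a small input.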

Cheers,
Shiv

On Mon, Dec 12, 2016 at 6:00 AM, Juan Rodríguez Hortalá
<juan.rodriguez.hort...@gmail.com> wrote:

> Hi all,
>
> I would also like to participate in that.
>
> Greetings,
>
> Juan
>
> On Fri, Dec 9, 2016 at 6:03 AM, Michael Stratton
> <michael.stratton@komodohealth.com> wrote:
>
>> That sounds great, please include me so I can get involved.
>>
>> On Fri, Dec 9, 2016 at 7:39 AM, Marco Mistroni <mmistr...@gmail.com>
>> wrote:
>>
>>> Me too, as I spent most of my time writing unit/integration tests.
>>> Please advise on where I can start.
>>> Kind regards
>>>
>>> On 9 Dec 2016 12:15 am, "Miguel Morales" <therevolti...@gmail.com>
>>> wrote:
>>>
>>>> I would be interested in contributing. I've created my own library for
>>>> this as well. In my blog post I talk about testing with Spark in RSpec
>>>> style:
>>>> https://medium.com/@therevoltingx/test-driven-development-w-apache-spark-746082b44941
>>>>
>>>> On Dec 8, 2016, at 4:09 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>
>>>> There are also libraries designed to simplify testing Spark on the
>>>> various platforms: spark-testing-base
>>>> <http://github.com/holdenk/spark-testing-base> for Scala/Java/Python
>>>> (and a video: https://www.youtube.com/watch?v=f69gSGSLGrY), sscheck
>>>> <https://github.com/juanrh/sscheck> (Scala-focused, property-based), and
>>>> pyspark.test (Python-focused, using py.test instead of unittest2).
>>>> There is also a blog post from Nextdoor:
>>>> https://engblog.nextdoor.com/unit-testing-apache-spark-with-py-test-3b8970dc013b#.jw3bdcej9
>>>>
>>>> Good luck on your Spark Adventures :)
>>>>
>>>> P.S.
>>>>
>>>> If anyone is interested in helping improve Spark testing libraries, I'm
>>>> always looking for more people to be involved with spark-testing-base
>>>> because I'm lazy :p
>>>>
>>>> On Thu, Dec 8, 2016 at 2:05 PM, Lars Albertsson <la...@mapflat.com>
>>>> wrote:
>>>>
>>>>> I wrote some advice in a previous post on the list:
>>>>> http://markmail.org/message/bbs5acrnksjxsrrs
>>>>>
>>>>> It does not mention Python, but the strategy advice is the same. Just
>>>>> replace JUnit/Scalatest with pytest, unittest, or your favourite
>>>>> Python test framework.
>>>>>
>>>>>
>>>>> I recently gave a presentation on the subject. There is a video
>>>>> recording at https://vimeo.com/192429554 and slides at
>>>>> http://www.slideshare.net/lallea/test-strategies-for-data-processing-pipelines-67244458
>>>>>
>>>>> You can find more material on test strategies at
>>>>> http://www.mapflat.com/lands/resources/reading-list/index.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Lars Albertsson
>>>>> Data engineering consultant
>>>>> www.mapflat.com
>>>>> https://twitter.com/lalleal
>>>>> +46 70 7687109
>>>>> Calendar: https://goo.gl/6FBtlS, https://freebusy.io/la...@mapflat.com
>>>>>
>>>>>
>>>>> On Thu, Dec 8, 2016 at 4:14 PM, pseudo oduesp <pseudo20...@gmail.com>
>>>>> wrote:
>>>>> > Can someone tell me how I can write unit tests for PySpark?
>>>>> > (book, tutorial, ...)
>>>>>
>>>>
>>>>
>>>> --
>>>> Cell: 425-233-8271
>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>>
>>
>
