Hey all, I think I've mostly met the goal with a minimal fix that keeps the existing framework and options. See
https://github.com/apache/spark/pull/23203
https://github.com/apache/spark-website/pull/161

I know it's not perfect, and other Python testing frameworks provide many
other good features, but this should be good enough for now. Thanks!

On Thu, Aug 17, 2017 at 2:38 AM, Nicholas Chammas
<nicholas.cham...@gmail.com> wrote:

> Looks like it doesn't take too much work to get pytest working on our code
> base, since it knows how to run unittest tests.
>
> https://github.com/apache/spark/compare/master...nchammas:pytest
>
> For example, I was able to do this from that branch and it did the right
> thing, running only the tests with "string" in their name:
>
> python [pytest *]$ ../bin/spark-submit ./pytest-run-tests.py ./pyspark/sql/tests.py -v -k string
>
> However, looking more closely at the whole test setup, I'm hesitant to
> work any further on this.
>
> My intention was to see if we could leverage pytest, tox, and other test
> tools that are standard in the Python ecosystem to replace some of the
> homegrown stuff we have. We have our own test dependency tracking code,
> our own breakdown of tests into module-scoped chunks, and our own
> machinery to parallelize test execution. It seems like it would be a lot
> of work to reap the benefits of using the standard tools while ensuring
> that we don't lose any of the benefits our current test setup provides.
>
> Nick
>
> On Tue, Aug 15, 2017 at 3:26 PM, Bryan Cutler <cutl...@gmail.com> wrote:
>
>> This generally works for me to just run tests within a class or even a
>> single test. Not as flexible as pytest -k, which would be nice.
>>
>> $ SPARK_TESTING=1 bin/pyspark pyspark.sql.tests ArrowTests
>>
>> On Tue, Aug 15, 2017 at 5:49 AM, Nicholas Chammas
>> <nicholas.cham...@gmail.com> wrote:
>>
>>> Pytest does support unittest-based tests
>>> <https://docs.pytest.org/en/latest/unittest.html>, allowing for
>>> incremental adoption.
>>> I'll see how convenient it is to use with our
>>> current test layout.
>>>
>>> On Tue, Aug 15, 2017 at 1:03 AM, Hyukjin Kwon <gurwls...@gmail.com>
>>> wrote:
>>>
>>>> For me, I would like this if it can be done with relatively small
>>>> changes. How about adding more granular options, for example, for
>>>> specifying or filtering a smaller set of test goals in the
>>>> run-tests.py script? I think it'd be quite a small change, and we
>>>> could roughly reach this goal if I understood correctly.
>>>>
>>>> 2017-08-15 3:06 GMT+09:00 Nicholas Chammas <nicholas.cham...@gmail.com>:
>>>>
>>>>> Say you're working on something and you want to rerun the PySpark
>>>>> tests, focusing on a specific test or group of tests. Is there a way
>>>>> to do that?
>>>>>
>>>>> I know that you can test entire modules with this:
>>>>>
>>>>> ./python/run-tests --modules pyspark-sql
>>>>>
>>>>> But I'm looking for something more granular, like pytest's -k option.
>>>>>
>>>>> On that note, does anyone else think it would be valuable to use a
>>>>> test runner like pytest to run our Python tests? The biggest benefits
>>>>> would be the use of fixtures
>>>>> <https://docs.pytest.org/en/latest/fixture.html> and more flexibility
>>>>> in test running and reporting. Just wondering if we've already
>>>>> considered this.
>>>>>
>>>>> Nick
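[Editor's note: the interoperability discussed above can be sketched in a few lines. This is a standalone toy example — the class and test names below are made up for illustration, not taken from the Spark code base — showing that plain unittest.TestCase classes are collected by pytest unchanged, which is why name-based filtering with -k works without any rewrite:]

```python
import unittest

class StringTests(unittest.TestCase):
    # A hypothetical test class; `pytest -k string` would select it
    # because "string" appears in its name.
    def test_upper(self):
        self.assertEqual("spark".upper(), "SPARK")

class ArrowTests(unittest.TestCase):
    # Skipped by `pytest -k string`; selected by `pytest -k Arrow`.
    def test_range(self):
        self.assertEqual(list(range(3)), [0, 1, 2])

# The very same classes still run under the stock unittest runner,
# which is what makes incremental adoption of pytest possible.
runner = unittest.TextTestRunner()
result = runner.run(
    unittest.defaultTestLoader.loadTestsFromTestCase(StringTests)
)
```

[Saving the two classes to a file and running `pytest -v -k string` on it would collect only StringTests.test_upper, matching the behavior Nicholas observed on his branch.]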