Looks like it doesn’t take too much work to get pytest working on our code
base, since it knows how to run unittest tests.

https://github.com/apache/spark/compare/master...nchammas:pytest

For example, I was able to run this from that branch and it did the right
thing, running only the tests with “string” in their name:

(from the python/ directory, on the pytest branch)
$ ../bin/spark-submit ./pytest-run-tests.py ./pyspark/sql/tests.py -v -k string
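For context, pytest collects plain unittest.TestCase classes as-is. A test
shaped like this made-up one (the names are purely illustrative, not taken
from tests.py) is the kind of thing -k string matches against:

    import unittest

    class StringFunctionsTests(unittest.TestCase):
        # -k string matches on the class/test name, so this gets selected.
        def test_string_concat(self):
            self.assertEqual("spark" + "sql", "sparksql")

    if __name__ == "__main__":
        unittest.main()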

However, looking more closely at the whole test setup, I’m hesitant to work
any further on this.

My intention was to see if we could leverage pytest, tox, and other test
tools that are standard in the Python ecosystem to replace some of the
homegrown stuff we have. We have our own test dependency tracking code, our
own breakdown of tests into module-scoped chunks, and our own machinery to
parallelize test execution. It seems like it would be a lot of work to reap
the benefits of using the standard tools while ensuring that we don’t lose
any of the benefits our current test setup provides.
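For the record, the kind of thing I had in mind was roughly the sketch below:
a session-scoped SparkSession fixture in a conftest.py (names are illustrative,
nothing from the branch), with pytest-xdist covering parallel runs. Wiring that
into our module and dependency tracking is the part I'm not sure is worth it:

    # conftest.py -- rough sketch only, not something in the branch above
    import pytest
    from pyspark.sql import SparkSession

    @pytest.fixture(scope="session")
    def spark():
        # One shared SparkSession for the whole test run, stopped at the end.
        session = (SparkSession.builder
                   .master("local[4]")
                   .appName("pyspark-pytest-sketch")
                   .getOrCreate())
        yield session
        session.stop()

Running that in parallel with pytest-xdist (pytest -n auto) is where it would
start to overlap with our own parallelization machinery.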

Nick

On Tue, Aug 15, 2017 at 3:26 PM Bryan Cutler <cutl...@gmail.com> wrote:

> This generally works for me to just run tests within a class or even a
> single test. Not as flexible as pytest -k, which would be nice...
>
> $ SPARK_TESTING=1 bin/pyspark pyspark.sql.tests ArrowTests
> On Tue, Aug 15, 2017 at 5:49 AM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> Pytest does support unittest-based tests
>> <https://docs.pytest.org/en/latest/unittest.html>, allowing for
>> incremental adoption. I'll see how convenient it is to use with our current
>> test layout.
>>
>> On Tue, Aug 15, 2017 at 1:03 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>
>>> For me, I would like this if it can be done with relatively small
>>> changes.
>>> How about adding more granular options, for example, specifying or
>>> filtering a smaller set of test goals in the run-tests.py script?
>>> I think it'd be a fairly small change and we could roughly reach this
>>> goal if I understood correctly.
>>>
>>>
>>> 2017-08-15 3:06 GMT+09:00 Nicholas Chammas <nicholas.cham...@gmail.com>:
>>>
>>>> Say you’re working on something and you want to rerun the PySpark
>>>> tests, focusing on a specific test or group of tests. Is there a way to do
>>>> that?
>>>>
>>>> I know that you can test entire modules with this:
>>>>
>>>> ./python/run-tests --modules pyspark-sql
>>>>
>>>> But I’m looking for something more granular, like pytest’s -k option.
>>>>
>>>> On that note, does anyone else think it would be valuable to use a test
>>>> runner like pytest to run our Python tests? The biggest benefits would be
>>>> the use of fixtures <https://docs.pytest.org/en/latest/fixture.html>,
>>>> and more flexibility in test running and reporting. Just wondering if we’ve
>>>> already considered this.
>>>>
>>>> Nick
