Looks like it doesn’t take too much work to get pytest working on our code base, since it knows how to run unittest tests.
https://github.com/apache/spark/compare/master...nchammas:pytest

For example I was able to do this from that branch and it did the right thing, running only the tests with string in their name:

python [pytest *]$ ../bin/spark-submit ./pytest-run-tests.py ./pyspark/sql/tests.py -v -k string

However, looking more closely at the whole test setup, I’m hesitant to work any further on this.

My intention was to see if we could leverage pytest, tox, and other test tools that are standard in the Python ecosystem to replace some of the homegrown stuff we have. We have our own test dependency tracking code, our own breakdown of tests into module-scoped chunks, and our own machinery to parallelize test execution. It seems like it would be a lot of work to reap the benefits of using the standard tools while ensuring that we don’t lose any of the benefits our current test setup provides.

Nick

On Tue, Aug 15, 2017 at 3:26 PM Bryan Cutler cutl...@gmail.com wrote:

> This generally works for me to just run tests within a class or even a
> single test. Not as flexible as pytest -k, which would be nice..
>
> $ SPARK_TESTING=1 bin/pyspark pyspark.sql.tests ArrowTests
>
> On Tue, Aug 15, 2017 at 5:49 AM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> Pytest does support unittest-based tests
>> <https://docs.pytest.org/en/latest/unittest.html>, allowing for
>> incremental adoption. I'll see how convenient it is to use with our current
>> test layout.
>>
>> On Tue, Aug 15, 2017 at 1:03 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>
>>> For me, I would like this if this can be done with relatively small
>>> changes.
>>> How about adding more granular options, for example, specifying or
>>> filtering smaller set of test goals in the run-tests.py script?
>>> I think it'd be quite small change and we could roughly reach this goal
>>> if I understood correctly.
>>>
>>>
>>> 2017-08-15 3:06 GMT+09:00 Nicholas Chammas <nicholas.cham...@gmail.com>:
>>>
>>>> Say you’re working on something and you want to rerun the PySpark
>>>> tests, focusing on a specific test or group of tests. Is there a way to do
>>>> that?
>>>>
>>>> I know that you can test entire modules with this:
>>>>
>>>> ./python/run-tests --modules pyspark-sql
>>>>
>>>> But I’m looking for something more granular, like pytest’s -k option.
>>>>
>>>> On that note, does anyone else think it would be valuable to use a test
>>>> runner like pytest to run our Python tests? The biggest benefits would be
>>>> the use of fixtures <https://docs.pytest.org/en/latest/fixture.html>,
>>>> and more flexibility on test running and reporting. Just wondering if we’ve
>>>> already considered this.
>>>>
>>>> Nick
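
As a rough illustration of the more granular filtering discussed in this thread, here is a minimal sketch that selects unittest tests by name using only the standard library. It does not use pytest and is not taken from run-tests.py or the branch above. ExampleTests, iter_tests, select_tests, and run_selected are hypothetical names made up for the example, and the plain case-sensitive substring match on the method name is a simplification of what pytest's -k does.

```python
import unittest


class ExampleTests(unittest.TestCase):
    """Hypothetical stand-in for a real test class such as those in pyspark.sql.tests."""

    def test_string_upper(self):
        self.assertEqual("spark".upper(), "SPARK")

    def test_numeric_add(self):
        self.assertEqual(1 + 1, 2)


def iter_tests(suite):
    """Flatten a possibly nested TestSuite into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def select_tests(suite, keyword):
    """Keep only tests whose method name contains `keyword` (plain substring match)."""
    selected = unittest.TestSuite()
    for test in iter_tests(suite):
        # test.id() looks like "module.Class.method"; match on the method part only.
        method_name = test.id().rsplit(".", 1)[-1]
        if keyword in method_name:
            selected.addTest(test)
    return selected


def run_selected(keyword):
    """Load ExampleTests, filter by keyword, run, and report (count, success)."""
    suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTests)
    selected = select_tests(suite, keyword)
    result = unittest.TextTestRunner(verbosity=2).run(selected)
    return selected.countTestCases(), result.wasSuccessful()
```

Calling run_selected("string") would run only test_string_upper and skip test_numeric_add. A filter along these lines could in principle sit behind a new option in run-tests.py, though how it would interact with the existing module-level chunking and parallelism is not something this sketch addresses.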