Re: Assertion of return value of dataframe in pytest

2021-02-03 Thread Mich Talebzadeh
Thanks Marco. This is an approach

# Start as we defined the dataframe to write to Oracle
df2 = house_df. \
    select( \
        F.date_format('datetaken', '').cast("Integer").alias('YEAR') \
        , 'REGIONNAME' \
        ,

Re: Assertion of return value of dataframe in pytest

2021-02-03 Thread Sofia’s World
Hello, my 2 cents: that will be an integration test that writes to a 'dev' database (which you might pre-populate and clean up after your runs, so you have repeatable data). Then either you 1 - use normal SQL and assert that the values you store in your dataframe are the same as what you get
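
The write-then-read-back idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the thread: an in-memory SQLite database stands in for the 'dev' Oracle/JDBC database, a plain list of tuples stands in for the rows collected from the DataFrame, and the table name house_summary, the AVGPRICE column, the sample values, and the helper functions are all made up for the example.

```python
import sqlite3

# Made-up sample rows standing in for what the DataFrame would hold,
# e.g. [tuple(r) for r in df2.collect()] in PySpark (assumption).
EXPECTED_ROWS = [
    (2019, "Region A", 100.0),
    (2020, "Region A", 110.5),
]


def make_dev_db():
    """Create a throwaway in-memory database with a hypothetical target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE house_summary (YEAR INTEGER, REGIONNAME TEXT, AVGPRICE REAL)"
    )
    return conn


def write_rows(conn, rows):
    """Hypothetical writer; stands in for the function that writes over JDBC."""
    conn.executemany(
        "INSERT INTO house_summary (YEAR, REGIONNAME, AVGPRICE) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def read_rows(conn):
    """Read the rows back with plain SQL, in a deterministic order."""
    cur = conn.execute(
        "SELECT YEAR, REGIONNAME, AVGPRICE FROM house_summary ORDER BY YEAR"
    )
    return cur.fetchall()


def test_write_then_read_back():
    conn = make_dev_db()          # fresh database, so the run is repeatable
    try:
        write_rows(conn, EXPECTED_ROWS)
        # Compare the stored values themselves, not just the row count.
        assert read_rows(conn) == EXPECTED_ROWS
    finally:
        conn.close()              # clean up after the run
```

Comparing the values read back against the values written is stricter than checking a row count alone, and creating the database inside the test keeps each run independent of the previous one.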

Re: Assertion of return value of dataframe in pytest

2021-02-03 Thread Mich Talebzadeh
It appears that the following assertion works, assuming that the result set can be = 0 (no data) or > 0 (there is data): assert df2.count() >= 0. However, if I wanted to write to a JDBC database from PySpark through a function (already defined in another module) as below def

Assertion of return value of dataframe in pytest

2021-02-03 Thread Mich Talebzadeh
Hi, In Pytest you want to ensure that the composed DF has the correct return. Example

df2 = house_df. \
    select( \
        F.date_format('datetaken', '').cast("Integer").alias('YEAR') \
        , 'REGIONNAME' \
        ,