Thanks Marco.
Here is one approach:
# Start as we defined the dataframe to write to Oracle
df2 = house_df.select(
    F.date_format('datetaken', 'yyyy').cast("Integer").alias('YEAR'),
    'REGIONNAME',
    # ... more columns ...
Hello,
my 2 cents: that will be an integration test writing to a 'dev' database
(which you might pre-populate and clean up after your runs, so you can have
repeatable data).
Then you either:
1 - use plain SQL and assert that the values you stored from your dataframe
are the same as what you read back.
It appears that the following assertion works, assuming the result set can
be = 0 (no data) or > 0 (there is data):
assert df2.count() >= 0
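A minimal sketch of that pre-populate / assert / clean-up round trip, using the stdlib sqlite3 module as a stand-in for the dev Oracle/JDBC database (the houseprices table name and the sample rows are made up for illustration):

```python
import sqlite3

def setup_dev_db(conn):
    # Pre-populate the 'dev' database so runs are repeatable.
    conn.execute("CREATE TABLE houseprices (YEAR INTEGER, REGIONNAME TEXT)")
    conn.executemany(
        "INSERT INTO houseprices VALUES (?, ?)",
        [(2020, "Kensington"), (2021, "Camden")],
    )

def test_round_trip():
    conn = sqlite3.connect(":memory:")
    setup_dev_db(conn)
    # The values we stored from the dataframe...
    expected = [(2020, "Kensington"), (2021, "Camden")]
    # ...must match what plain SQL reads back.
    got = conn.execute(
        "SELECT YEAR, REGIONNAME FROM houseprices ORDER BY YEAR"
    ).fetchall()
    assert got == expected
    # Clean up after the run, so the next run starts fresh.
    conn.execute("DROP TABLE houseprices")
    conn.close()

test_round_trip()
```

The same pattern applies against the real dev database; only the connection and the SQL dialect change.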
However, if I wanted to write to a JDBC database from PySpark through a
function (already defined in another module), as below:
def
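One common way to unit-test such a writer without a live database is to stub the DataFrame with unittest.mock and assert the JDBC writer was invoked as expected. The helper below (write_df_to_jdbc, its url/table/props parameters, and the Oracle connection values) is a hypothetical sketch, not the actual function from the other module; DataFrame.write.jdbc itself is the real PySpark API:

```python
from unittest.mock import MagicMock

def write_df_to_jdbc(df, url, table, props):
    # Hypothetical helper: in the real module this would receive a
    # pyspark DataFrame and delegate to its JDBC writer.
    df.write.jdbc(url=url, table=table, mode="append", properties=props)

def test_write_df_to_jdbc_calls_writer():
    df = MagicMock()  # stands in for a pyspark DataFrame
    props = {"user": "scott", "driver": "oracle.jdbc.OracleDriver"}
    write_df_to_jdbc(df, "jdbc:oracle:thin:@//host:1521/db", "HOUSEPRICES", props)
    # Assert the DataFrame's writer was invoked with the arguments we passed.
    df.write.jdbc.assert_called_once_with(
        url="jdbc:oracle:thin:@//host:1521/db",
        table="HOUSEPRICES",
        mode="append",
        properties=props,
    )

test_write_df_to_jdbc_calls_writer()
```

This keeps the unit test fast and database-free; the integration test against the dev database then covers the real JDBC path.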
Hi,
in pytest you want to ensure that the composed DataFrame returns the correct
result. Example:
df2 = house_df.select(
    F.date_format('datetaken', 'yyyy').cast("Integer").alias('YEAR'),
    'REGIONNAME',
    # ... more columns ...