Hi all,

My DataFrame (df) looks as follows:

Situation:
MainKey, SubKey, Val1, Val2, Val3, ...
1, 2, a, null, c
1, 2, null, null, c
1, 3, null, b, null
1, 3, a, null, c


Desired outcome:
1, 2, a, b, c
1, 2, a, b, c
1, 3, a, b, c
1, 3, a, b, c


How can I fill the empty cells of all records that share the same combination
of MainKey and SubKey with the corresponding non-null value from other rows
with the same key combination?
Within a column, a value that is not null is guaranteed to be unique within the
df. If a column exists, then at least one row has a non-null value in it.

I am using pyspark.
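
To make the question concrete, here is a minimal sketch of the fill rule I have in mind. The first function is a plain-Python reference for the semantics. The second is only my assumption of how it could map to pyspark, taking the first non-null value per column over a window partitioned by the keys. The function names, the `n_keys`/`keys` parameters, and the tuple layout of `sample` are made up for illustration.

```python
from collections import defaultdict


def fill_within_keys(rows, n_keys=2):
    """Plain-Python reference: fill None cells from rows sharing the first n_keys fields."""
    # Collect the first non-null value seen per (key, column position).
    seen = defaultdict(dict)
    for row in rows:
        key = row[:n_keys]
        for i, value in enumerate(row[n_keys:]):
            if value is not None:
                seen[key].setdefault(i, value)
    # Rebuild each row, substituting the group's value where the cell is None.
    return [
        row[:n_keys] + tuple(
            value if value is not None else seen[row[:n_keys]].get(i)
            for i, value in enumerate(row[n_keys:])
        )
        for row in rows
    ]


def fill_within_keys_spark(df, keys=("MainKey", "SubKey")):
    """Assumed pyspark counterpart: first non-null value per column over a key window."""
    # Imported lazily so the reference function above runs without Spark installed.
    from pyspark.sql import Window, functions as F

    # The frame spans the whole partition, so every row sees every other row of its key.
    w = Window.partitionBy(*keys).rowsBetween(
        Window.unboundedPreceding, Window.unboundedFollowing
    )
    value_cols = [c for c in df.columns if c not in keys]
    return df.select(
        *keys,
        *[F.first(F.col(c), ignorenulls=True).over(w).alias(c) for c in value_cols],
    )


# The four sample rows from above, with null written as None.
sample = [
    (1, 2, "a", None, "c"),
    (1, 2, None, None, "c"),
    (1, 3, None, "b", None),
    (1, 3, "a", None, "c"),
]
filled_by_both_keys = fill_within_keys(sample, n_keys=2)
filled_by_main_key = fill_within_keys(sample, n_keys=1)
```

Note that grouping by both keys leaves Val2 null for the (1, 2) rows, because no row with that key carries a value for it. Grouping by MainKey alone gives the table shown under "Desired outcome".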

Thanks for any hint,
Best
Meikel
