Hello,

Let's say I have a very simple DataFrame, as below.

+---+----+
| id|datA|
+---+----+
|  1|  a1|
|  2|  a2|
|  3|  a3|
+---+----+

Let's say I have a requirement to write this to a bizarre JSON structure.
For example:

{
  "id": 1,
  "stuff": {
    "datA": "a1"
  }
}

How can I achieve this with PySpark? The only approaches I have found so far are:
- writing the DataFrame out as-is (doesn't meet the requirement)
- using a UDF (which seems to be frowned upon)

What I have tried is building the JSON inside a `foreach`. I have had some
success with that, but also ran into problems with other requirements
(serializing other objects).
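To be concrete, the per-row reshaping I'm after is trivial in plain Python (a sketch using the example columns above; `row_to_nested` is just a name I made up, and this is roughly what I apply per row inside the `foreach`):

```python
import json

def row_to_nested(row):
    # Reshape a flat {"id": ..., "datA": ...} record so that "id" stays
    # top-level and "datA" is nested under a new "stuff" key.
    return {"id": row["id"], "stuff": {"datA": row["datA"]}}

rows = [{"id": 1, "datA": "a1"},
        {"id": 2, "datA": "a2"},
        {"id": 3, "datA": "a3"}]

for r in rows:
    print(json.dumps(row_to_nested(r)))
```

The question is how to express this reshaping idiomatically in PySpark itself, rather than row by row on the driver or executors.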

Any advice? Please and thank you,
Marco.
