Re: PySpark .collect() output to Scala Array[Row]

2020-05-25 Thread Wim Van Leuven
Looking at the stack trace, your data from Spark gets serialized to an ArrayList (of something), whereas your Scala code uses an Array of Rows. So the types don't line up. That's the exception you are seeing: the JVM searches for a method signature that simply does not exist. Try to turn the

PySpark .collect() output to Scala Array[Row]

2020-05-25 Thread Nick Ruest
Hi, I've hit a wall trying to implement a couple of Scala methods in a Python version of our project. I've implemented a number of these already, but I'm getting hung up on this one. My Python function looks like this: def Write_Graphml(data, graphml_path, sc): return