[ https://issues.apache.org/jira/browse/DATAFU-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778002#comment-16778002 ]
Eyal Allweil commented on DATAFU-148:
-------------------------------------

Ohad and I had some time to work on this, so we added the "scala-python bridge" to the [spark-tmp|https://github.com/apache/datafu/tree/spark-tmp/datafu-spark] branch. [~russell.jurney], you can take it and try testing it out in PySpark; it should work.

We still need to add proper documentation, but I've put a rudimentary version in our README that explains how to call the DataFu Scala APIs from PySpark. I'll add instructions for calling arbitrary Python code from Scala later - for now, you can look at the [test which does this|https://github.com/apache/datafu/blob/spark-tmp/datafu-spark/src/test/scala/datafu/spark/TestScalaPythonBridge.scala#L73].

> Setup Spark sub-project
> -----------------------
>
> Key: DATAFU-148
> URL: https://issues.apache.org/jira/browse/DATAFU-148
> Project: DataFu
> Issue Type: New Feature
> Reporter: Eyal Allweil
> Assignee: Eyal Allweil
> Priority: Major
> Attachments: patch.diff, patch.diff
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Create a skeleton Spark sub-project for Spark code to be contributed to DataFu

--
This message was sent by Atlassian JIRA (v7.6.3#76005)