[ https://issues.apache.org/jira/browse/SYSTEMML-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256597#comment-15256597 ]
Mike Dusenberry edited comment on SYSTEMML-650 at 4/25/16 5:06 PM:
-------------------------------------------------------------------

Hi [~kartikkanna...@gmail.com], thanks for bringing this up. Are you using the 0.9 release? If so, the issue here is that {{ml.executeScript(...)}} did not exist for the Python API in the 0.9 release. Can you please try building the latest master of the project, **or** grabbing a nightly JAR build [1] and loading the latest Python file in programmatically (rather than with {{--py-files}}) [2]?

[1]: {{SystemML.jar}} from [our nightly repo | https://sparktc.ibmcloud.com/repo/latest/].
[2]: Load the Python API file {{SystemML.py}} programmatically via {{sc.addPyFile("https://raw.githubusercontent.com/apache/incubator-systemml/master/src/main/java/org/apache/sysml/api/python/SystemML.py")}}.

was (Author: mwdus...@us.ibm.com):

Hi [~kartikkanna...@gmail.com], thanks for bringing this up. Are you using the 0.9 release? If so, the issue here is that {{ml.executeScript(...)}} did not exist for the Python API in the 0.9 release. Can you please try building the latest master of the project, **or** grabbing a nightly JAR build [1] and loading the latest Python file in programmatically [2]?

[1]: {{SystemML.jar}} from [our nightly repo | https://sparktc.ibmcloud.com/repo/latest/].
[2]: Load the Python API file {{SystemML.py}} programmatically via {{sc.addPyFile("https://raw.githubusercontent.com/apache/incubator-systemml/master/src/main/java/org/apache/sysml/api/python/SystemML.py")}}.
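
Option [2] from the comment above could look roughly like the following. This is only a sketch: the helper name {{add_systemml_python_api}} is hypothetical and not part of SystemML, the URL is the one quoted in the comment, and it assumes the nightly {{SystemML.jar}} from [1] is already on the Spark classpath (for example via {{--jars}}). The helper accepts any object exposing {{addPyFile}}, which a PySpark {{SparkContext}} does.

```python
# Raw URL of the latest Python API file, copied from the comment above.
SYSTEMML_PY_URL = (
    "https://raw.githubusercontent.com/apache/incubator-systemml/"
    "master/src/main/java/org/apache/sysml/api/python/SystemML.py"
)


def add_systemml_python_api(sc, url=SYSTEMML_PY_URL):
    """Hypothetical helper: register SystemML.py with a running SparkContext.

    `sc` is expected to be a pyspark SparkContext (anything exposing
    addPyFile works). This replaces passing the file with --py-files.
    """
    # SparkContext.addPyFile accepts a local path, an HDFS path,
    # or an HTTP/HTTPS/FTP URI.
    sc.addPyFile(url)
    return url
```

In a notebook this would be called once as {{add_systemml_python_api(sc)}} before importing the SystemML module. Whether {{ml.executeScript(...)}} then works still depends on the JAR being a build that contains the matching Java method.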

> Error while trying to load data as a DataFrame in PySpark
> ---------------------------------------------------------
>
> Key: SYSTEMML-650
> URL: https://issues.apache.org/jira/browse/SYSTEMML-650
> Project: SystemML
> Issue Type: Bug
> Affects Versions: SystemML 0.9
> Environment: Cloudera Distribution CDH 5.5.0
>              Hadoop 2.6.0
>              Spark 1.5.0
>              SystemML 0.9.0
>              Python 2.7.6
> Reporter: Kartik Kannapur
> Labels: documentation, newbie
>
> I tried to run the sample code for "Jupyter (PySpark) Notebook Example - Poisson Nonnegative Matrix Factorization" as provided in the documentation. The code fails at the line where we try to run the PNMF script on SystemML with Spark:
> {code:xml}
> outputs = ml.executeScript(pnmf, {"X": X_train, "maxiter": 100, "rank": 10}, ["W", "H", "losses"])
> {code}
> The script seems to fail at the first line itself, where *X_train* is passed as a DataFrame into the variable *X*.
> The error message is as below:
> {code:xml}
> /tmp/spark-e7974be5-4438-44b2-ae83-574b2c2bad21/userFiles-5a3c99c5-9bb7-46fe-af83-5119f9358e0f/SystemML.py in executeScript(self, dmlScript, nargs, outputs, configFilePath)
>     126
>     127         # Execute script
> --> 128         jml_out = self.ml.executeScript(dmlScript, nargs, configFilePath)
>     129         ml_out = MLOutput(jml_out, self.sc)
>     130         return ml_out
> /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
>     536         answer = self.gateway_client.send_command(command)
>     537         return_value = get_return_value(answer, self.gateway_client,
> --> 538                 self.target_id, self.name)
>     539
>     540         for temp_arg in temp_args:
> /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/spark/python/pyspark/sql/utils.pyc in deco(*a, **kw)
>      34     def deco(*a, **kw):
>      35         try:
> ---> 36             return f(*a, **kw)
>      37         except py4j.protocol.Py4JJavaError as e:
>      38             s = e.java_exception.toString()
> /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
>     302                 raise Py4JError(
>     303                     'An error occurred while calling {0}{1}{2}. Trace:\n{3}\n'.
> --> 304                     format(target_id, '.', name, value))
>     305         else:
>     306             raise Py4JError(
> Py4JError: An error occurred while calling o79.executeScript. Trace:
> py4j.Py4JException: Method executeScript([class java.lang.String, class java.util.HashMap, null]) does not exist
>       at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
>       at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
>       at py4j.Gateway.invoke(Gateway.java:252)
>       at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>       at py4j.commands.CallCommand.execute(CallCommand.java:79)
>       at py4j.GatewayConnection.run(GatewayConnection.java:207)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> Is there any workaround for this?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)