RE: PySpark 1.2 Hadoop version mismatch

2015-02-12 Thread Michael Nazario
...@cloudera.com] Sent: Thursday, February 12, 2015 12:13 AM To: Akhil Das Cc: Michael Nazario; user@spark.apache.org Subject: Re: PySpark 1.2 Hadoop version mismatch No, mr1 should not be the issue here, and I think that would break other things. The OP is not using mr1. client 4 / server 7 means roughly

PySpark 1.2 Hadoop version mismatch

2015-02-11 Thread Michael Nazario
Hi Spark users, I keep hitting a consistent error that I have been trying to reproduce and narrow down. I've been running a PySpark application on Spark 1.2 reading Avro files from Hadoop. I was consistently seeing the following error: py4j.protocol.Py4JJavaError: An

RE: PySpark 1.2 Hadoop version mismatch

2015-02-11 Thread Michael Nazario
From: Michael Nazario Sent: Wednesday, February 11, 2015 10:13 PM To: user@spark.apache.org Subject: PySpark 1.2 Hadoop version mismatch Hi Spark users, I keep hitting a consistent error that I have been trying to reproduce and narrow down. I've been

Job status from Python

2014-12-11 Thread Michael Nazario
In PySpark, is there a way to get the status of a job which is currently running? My use case is a long-running job where users may not know whether it is still running. It would be nice to have an idea of whether or not the job is progressing even if it isn't very
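
For readers with the same question, here is a minimal sketch of one way to poll job progress from Python. It assumes a PySpark release that exposes SparkContext.statusTracker(), which as far as I know arrived after the 1.2 release discussed in this thread, so it may not apply to the poster's setup. The helper names active_stage_progress and format_progress are hypothetical, not part of PySpark; only the tracker methods (getActiveJobsIds, getJobInfo, getStageInfo) and the fields read from their results come from the library.

```python
from collections import namedtuple

# One row of progress per stage of an active job (hypothetical helper type).
StageProgress = namedtuple("StageProgress", "job_id stage_id name done total")


def active_stage_progress(tracker):
    """Collect task counts for every stage of every active job.

    `tracker` is assumed to behave like the object returned by
    SparkContext.statusTracker() in newer PySpark releases.
    """
    rows = []
    for job_id in tracker.getActiveJobsIds():
        job = tracker.getJobInfo(job_id)
        if job is None:  # the tracker may have no info for this job
            continue
        for stage_id in job.stageIds:
            stage = tracker.getStageInfo(stage_id)
            if stage is None:  # likewise for a stage
                continue
            rows.append(StageProgress(job_id, stage_id, stage.name,
                                      stage.numCompletedTasks, stage.numTasks))
    return rows


def format_progress(rows):
    """Render the rows as short, user-facing status lines."""
    if not rows:
        return "no active jobs"
    return "\n".join(
        "job %d stage %d (%s): %d/%d tasks" % row for row in rows)
```

Because a Spark action blocks the calling thread, the polling would typically run in a second thread, for example calling print(format_progress(active_stage_progress(sc.statusTracker()))) every few seconds, where sc is the application's existing SparkContext.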