I want to add that there is a regression when using PySpark to read data
from HDFS: performance during map tasks has dropped to roughly half
(approx. 1x -> 0.5x). I tested 1.0.2 and the performance was fine, but the
1.1 release candidate has this issue. I tested with the following
properties set to make sure they were not the cause:

set("spark.io.compression.codec","lzf").set("spark.shuffle.spill","false")

in the conf object. Let me know if you need further information.
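
For reference, a minimal sketch of how those two properties might be applied in a PySpark script. The names TEST_PROPERTIES and build_conf and the app name are only illustrative assumptions, not part of my actual job; SparkConf.setAppName and SparkConf.set are the standard PySpark calls.

```python
# The two properties quoted in the message above.
TEST_PROPERTIES = [
    ("spark.io.compression.codec", "lzf"),
    ("spark.shuffle.spill", "false"),
]


def build_conf(app_name="hdfs-read-check"):
    """Return a SparkConf with the properties above set.

    The app name is a hypothetical placeholder. pyspark is imported
    lazily so this module can be loaded without Spark installed.
    """
    from pyspark import SparkConf

    conf = SparkConf().setAppName(app_name)
    for key, value in TEST_PROPERTIES:
        # SparkConf.set returns the conf itself, so calls can be chained.
        conf = conf.set(key, value)
    return conf
```

The resulting conf would then be passed as SparkContext(conf=build_conf()) before reading from HDFS, so that the same settings are in effect on both 1.0.2 and the 1.1 release candidate when comparing timings.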

Regards,
Gurvinder
On 09/04/2014 07:47 AM, Denny Lee wrote:
> When I start the thrift server (on Spark 1.1 RC4) via:
> ./sbin/start-thriftserver.sh --master spark://hostname:7077
> --driver-class-path $CLASSPATH
> 
> It appears that the thrift server is starting off of localhost as
> opposed to hostname.  I have set the spark-env.sh to use the hostname,
> modified the /etc/hosts for the hostname, and it appears to work properly.
> 
> But when I start the thrift server, connectivity can only be via
> localhost:10000 as opposed to hostname:10000.
> 
> Any ideas on what configurations I may be setting incorrectly here?
> 
> Thanks!
> Denny
> 

