parkAppPort//tcp, or, better yet, use a
port-deterministic strategy mentioned earlier. (Hopefully the verbosity here
will help someone in their future search. Fedora aside, the original problem
here can be network-related, as I discovered.)

Sincerely,
didata
Hello friends:
I have a theory question about call blocking in a Spark driver.
Consider this (admittedly contrived =:)) snippet to illustrate this question...
x = rdd01.reduceByKey(lambda a, b: a + b)  # or some other shuffle-requiring transformation
b = sc.broadcast(x.take(20))  # Or any statement that r
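For what it's worth, a hypothetical pure-Python analogy (not the actual Spark API, and `LazyRDD` is an invented name) may help frame the blocking question: transformations only record a plan and return immediately, while an action such as `take()` forces evaluation and blocks the caller until the result exists, so the `sc.broadcast(...)` call above cannot start until the shuffle behind `x` has finished.

```python
# Analogy only: a lazy "transformation" appends to a deferred plan (like
# Spark's DAG of transformations); an eager "action" evaluates the whole
# plan right then, blocking the driver until the data is materialized.

class LazyRDD:
    def __init__(self, data, plan=None):
        self._data = data
        self._plan = plan or []          # deferred operations, nothing run yet

    def map(self, f):                    # transformation: returns immediately
        return LazyRDD(self._data, self._plan + [f])

    def take(self, n):                   # action: evaluates the plan now
        out = list(self._data)
        for f in self._plan:
            out = [f(v) for v in out]
        return out[:n]

rdd = LazyRDD(range(10)).map(lambda v: v * 2)   # no work performed here
first_three = rdd.take(3)                        # the caller blocks here
# first_three == [0, 2, 4]
```

The same structure is why, in the real snippet, the driver sits at the `take(20)` until the shuffle completes before the broadcast can happen.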
Thanks for asking this.
I've had this issue with pyspark on YARN 100% of the time: I quit out
of pyspark and, while my Unix shell prompt returns, a 'yarn application
-list' always shows (as does the UI) that the application is still running (or
at least not totally dead). When I then log onto
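As a workaround, a lingering application can usually be cleaned up by hand with the stock YARN CLI (a sketch; `<application-id>` is a placeholder for whatever id `-list` actually prints):

```shell
# List applications YARN still considers running, then kill the stale one.
yarn application -list -appStates RUNNING
yarn application -kill <application-id>
```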
); export MASTER=local[NN];
pyspark --master local[NN]
Without temporarily moving the Hadoop/YARN configuration directory, how do I
dynamically instruct
pyspark on the CLI to not use HDFS? (i.e. without hard-coded settings
anywhere, such as in
*/etc/spark/spark-env.sh*)
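One possible approach (a sketch, assuming a Spark version whose pyspark accepts `--conf`, and relying on the `spark.hadoop.*` prefix being forwarded to the underlying Hadoop Configuration) is to override the default filesystem on the command line instead of in spark-env.sh:

```shell
# Hedged sketch: run pyspark against the local filesystem without touching
# /etc/spark/spark-env.sh or the Hadoop/YARN configuration directory.
# spark.hadoop.fs.defaultFS overrides what core-site.xml would supply.
pyspark --master 'local[4]' --conf spark.hadoop.fs.defaultFS=file:///
```

If your pyspark predates `--conf`, the same property would have to travel through whatever configuration mechanism that version does support.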
Thank you in advance!
didata staff
ries on and not compiling
from source... Is there a reason why you aren't just using the
binaries?
On Thu, Apr 10, 2014 at 1:30 PM, DiData wrote:
Hello friends:
I recently compiled and installed Spark v0.9 from the Apache distribution.
Note: I have the Cloudera/CDH5 Spark RPMs co-installe
namenode:8020 failed on connection exception:
java.net.ConnectException: Connection refused; For more details
see: http://wiki.apache.org/hadoop/ConnectionRefused
[ ... snip ... ]
--
Sincerely,
DiData