Hi, I'm using Spark 1.1.0. There is no error on the executors -- it appears
as if the job never gets properly dispatched -- the only message is the
Broken Pipe message in the driver.
--
I have a dataset of ~200k labeled points whose features are
SparseVectors with ~20M dimensions. I take 5% of the data as a training set.
Calling
    model = LogisticRegressionWithSGD.train(training_set)
fails with
    ERROR:py4j.java_gateway:Error while sending or receiving.
    Traceback (most recent
Hi Rok,
You could debug this by first collecting your training_set and checking that
you get something back before passing it to the train method. Then step
through the train method line by line, including the serializer, and check
exactly where it fails.
Thanks,
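A minimal sketch of that first check, in case it helps. It assumes training_set is an RDD of LabeledPoint objects with SparseVector features, and it uses take(n) rather than a full collect() so only a few records come back to the driver. The helper name summarize_training_set is made up for illustration and is not part of Spark.

```python
def summarize_training_set(rdd, n=5):
    """Pull a few records to the driver and report basic facts about them.

    Assumes each record has .features with .size and .indices, as a
    LabeledPoint holding a SparseVector does. Hypothetical helper, not a
    Spark API.
    """
    sample = rdd.take(n)   # brings at most n records back to the driver
    total = rdd.count()    # forces a full pass over the data on the executors
    dims = set(p.features.size for p in sample)
    nonzeros = [len(p.features.indices) for p in sample]
    return {
        "count": total,
        "sampled": len(sample),
        "feature_dims": sorted(dims),
        "max_nonzeros": max(nonzeros) if nonzeros else 0,
    }

# Usage, assuming training_set already exists in your session:
# print(summarize_training_set(training_set))
```

If this returns sensible numbers, the RDD itself can be evaluated and serialized back to Python, which points the search at the train call rather than at the data.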
--
yes, the training set is fine, I've verified it.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/using-LogisticRegressionWithSGD-train-in-Python-crashes-with-Broken-pipe-tp18182p18195.html
Sent from the Apache Spark User List mailing list archive at
Which Spark version did you use? Could you check the WebUI and attach
the error message on executors? -Xiangrui
On Wed, Nov 5, 2014 at 8:23 AM, rok rokros...@gmail.com wrote:
> yes, the training set is fine, I've verified it.