java.io.IOException when using KryoSerializer

2015-11-24 Thread Piero Cinquegrana
Hello, I am using Spark 1.4.1 with Zeppelin. When using the Kryo serializer, spark.serializer = org.apache.spark.serializer.KryoSerializer, instead of the default Java serializer, I am getting the following error. Is this a known issue? Thanks, Piero java.io.IOException: Failed to connect to
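For context, a minimal sketch of how the Kryo serializer is usually enabled programmatically (in Zeppelin the property is typically set in the interpreter settings instead). The app name and registered class are illustrative, not taken from the original post:

```scala
// Hypothetical sketch: enabling Kryo serialization via SparkConf.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoSerializer

val conf = new SparkConf()
  .setAppName("KryoExample")
  .set("spark.serializer", classOf[KryoSerializer].getName)
  // Registering classes up front avoids writing full class names per record.
  .registerKryoClasses(Array(classOf[org.apache.spark.mllib.linalg.DenseVector]))

val sc = new SparkContext(conf)
```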

com.esotericsoftware.kryo.KryoException: java.io.IOException: failed to read chunk

2015-06-24 Thread Piero Cinquegrana
Hello Spark Experts, I am facing the following issue. 1) I am converting an org.apache.spark.sql.Row into org.apache.spark.mllib.linalg.Vectors using sparse notation. 2) After the parsing proceeds successfully, I try to look at the result and I get the following error:
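A sketch of the kind of Row-to-sparse-Vector conversion described in step 1. The column layout (an array of indices and an array of values) and the names are assumptions, not the original poster's code:

```scala
// Illustrative only: build a sparse mllib Vector from a Row holding
// (indices, values) columns; column positions are assumed.
import org.apache.spark.sql.Row
import org.apache.spark.mllib.linalg.{Vector, Vectors}

def rowToSparse(row: Row, numFeatures: Int): Vector = {
  val indices = row.getAs[Seq[Int]](0).toArray
  val values  = row.getAs[Seq[Double]](1).toArray
  Vectors.sparse(numFeatures, indices, values)
}
```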

SparkSQL: leftOuterJoin is VERY slow!

2015-06-19 Thread Piero Cinquegrana
rows)? Much appreciated, Piero Cinquegrana val tv_key

RE: SparkSQL: leftOuterJoin is VERY slow!

2015-06-19 Thread Piero Cinquegrana
Any tips on how to implement a broadcast left outer join using Scala? From: Michael Armbrust [mailto:mich...@databricks.com] Sent: Friday, June 19, 2015 12:40 PM To: Piero Cinquegrana Cc: user@spark.apache.org Subject: Re: SparkSQL: leftOuterJoin is VERY slow! Broadcast outer joins are on my
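A hand-rolled broadcast left outer join was a common workaround when the optimizer did not broadcast outer joins automatically. The sketch below assumes hypothetical DataFrames `smallDF` and `bigDF` keyed by a string column; it is not the code from this thread:

```scala
// Collect the small side to the driver and broadcast it, then map over the
// big side, keeping rows with no match (Option is None) as a left outer join would.
val smallMap = smallDF.rdd
  .map(r => (r.getString(0), r.getDouble(1)))
  .collectAsMap()
val bSmall = sc.broadcast(smallMap)

val joined = bigDF.rdd.map { r =>
  val key = r.getString(0)
  (key, r, bSmall.value.get(key))   // None where there is no match
}
```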

RE: Standard Scaler taking 1.5hrs

2015-06-04 Thread Piero Cinquegrana
each step. Thanks, Piero From: DB Tsai [mailto:dbt...@dbtsai.com] Sent: Wednesday, June 03, 2015 10:33 PM To: Piero Cinquegrana Cc: user@spark.apache.org Subject: Re: Standard Scaler taking 1.5hrs Can you do count() before fit to force materialize the RDD? I think something before fit is slow
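A sketch of the suggestion in this thread: cache and count() the input before fit(), so any slowness in the upstream lineage is paid (and measured) before the scaler runs. `data` is an assumed RDD[LabeledPoint]:

```scala
import org.apache.spark.mllib.feature.StandardScaler

data.cache()
data.count()   // forces materialization of everything upstream of fit()

val scaler = new StandardScaler(withMean = false, withStd = true)
  .fit(data.map(_.features))   // now the timing reflects the scaler itself
```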

Standard Scaler taking 1.5hrs

2015-06-03 Thread Piero Cinquegrana
scala> algorithm.optimizer.setNumIterations(numIterations) scala> algorithm.optimizer.setStepSize(stepSize) scala> algorithm.optimizer.setMiniBatchFraction(miniBatchFraction) scala> val model = algorithm.run(scaledData) Best, Piero Cinquegrana
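The thread does not say which algorithm is being trained; the sketch below uses LinearRegressionWithSGD purely as a stand-in so the optimizer calls in the snippet above have context, and assumes `scaledData` is an RDD[LabeledPoint]:

```scala
import org.apache.spark.mllib.regression.LinearRegressionWithSGD

// Configure the gradient-descent optimizer before training.
val algorithm = new LinearRegressionWithSGD()
algorithm.optimizer
  .setNumIterations(100)
  .setStepSize(0.1)
  .setMiniBatchFraction(1.0)

val model = algorithm.run(scaledData)
```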

Re: Standard Scaler taking 1.5hrs

2015-06-03 Thread Piero Cinquegrana
and give us feedback. Thanks. On Wednesday, June 3, 2015, Piero Cinquegrana pcinquegr...@marketshare.com wrote: Hello User group, I have an RDD of LabeledPoint composed of sparse vectors as shown below. In the next step, I am standardizing the columns
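A minimal sketch (not the original code) of standardizing the features of an RDD[LabeledPoint] with sparse vectors; variable names are assumptions:

```scala
import org.apache.spark.mllib.feature.StandardScaler
import org.apache.spark.mllib.regression.LabeledPoint

// withMean = false keeps sparse vectors sparse; withMean = true would
// densify every vector, which can be very expensive on wide data.
val scaler = new StandardScaler(withMean = false, withStd = true)
  .fit(data.map(_.features))

val scaledData = data.map(lp =>
  LabeledPoint(lp.label, scaler.transform(lp.features)))
```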