using Kryo with pyspark?

2014-04-14 Thread Diana Carroll
I'm looking at the Tuning Guide suggestion to use Kryo instead of the default serialization. My questions: does PySpark use Java serialization by default, as Scala Spark does? If so, can I use Kryo with PySpark instead? The instructions say I should register my classes with the Kryo …

Re: using Kryo with pyspark?

2014-04-14 Thread Matei Zaharia
Kryo won’t make a major impact on PySpark because it just stores data as byte[] objects, which are fast to serialize even with Java. But it may be worth a try — you would just set spark.serializer and not try to register any classes. What might make more impact is storing data …
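As a concrete illustration of the reply above, here is a minimal sketch of enabling Kryo from PySpark. The master setting and app name are assumptions for a local test; the serializer class name is the standard Spark one, and as noted above no class registration is done from the Python side:

```python
# Enable Kryo in PySpark by setting spark.serializer only.
# No class registration is needed: PySpark hands the JVM opaque
# byte arrays rather than typed Java objects.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("local[*]")       # assumed: local test setup
        .setAppName("kryo-sketch")   # hypothetical app name
        .set("spark.serializer",
             "org.apache.spark.serializer.KryoSerializer"))
sc = SparkContext(conf=conf)
```

Since the data crossing the JVM serializer is already byte[], the gain from this switch is usually small, per the reply above.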

Re: Kryo serialization does not compress

2014-03-07 Thread pradeeps8
Hi Patrick, Thanks for your reply. I am guessing even an array type will be registered automatically. Is this correct? Thanks, Pradeep

Re: Kryo serialization does not compress

2014-03-06 Thread pradeeps8
We are trying to use Kryo serialization, but with Kryo serialization enabled the memory consumption does not change. We have tried this on multiple data sets. We have also checked the logs and confirmed that Kryo is being used. Can somebody please help us …
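One common reason for this symptom, offered here as a hedged guess rather than a diagnosis of this specific case: on the JVM side, the default MEMORY_ONLY storage level caches deserialized objects, so the serializer never touches the cached data and changing it cannot change the cache footprint. A sketch (PySpark API of that era; the data set is hypothetical) of enabling Kryo together with an explicitly serialized storage level:

```python
# Kryo only shrinks data that is actually serialized, so persist with a
# serialized storage level to see a difference in memory consumption.
from pyspark import SparkConf, SparkContext, StorageLevel

conf = SparkConf().set("spark.serializer",
                       "org.apache.spark.serializer.KryoSerializer")
sc = SparkContext(conf=conf)

rdd = sc.parallelize(range(100000))        # hypothetical data set
rdd.persist(StorageLevel.MEMORY_ONLY_SER)  # serialized, so Kryo applies
rdd.count()                                # materialize the cache
```

Comparing the cached size in the web UI with and without the spark.serializer setting (under the same serialized storage level) is one way to confirm Kryo is taking effect.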
