Re: How to set KryoRegistrator class in spark-shell
Or launch spark-shell with --conf spark.kryo.registrator=foo.bar.MyClass

2015-06-11 14:30 GMT+02:00 Igor Berman:
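Spelled out as a full launch command, that suggestion might look like the sketch below (the jar name and foo.bar.MyClass are placeholders; the registrator class also has to be shipped to the executors, e.g. via --jars, or Kryo cannot load it):

```shell
# Hypothetical launch: enable Kryo and point spark.kryo.registrator at a
# pre-compiled registrator class shipped in my-registrators.jar.
spark-shell \
  --jars my-registrators.jar \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryo.registrator=foo.bar.MyClass
```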
Re: How to set KryoRegistrator class in spark-shell
Another option would be to close sc and open a new context with your custom configuration.

On Jun 11, 2015 01:17, "bhomass" wrote:
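A minimal sketch of that approach from inside spark-shell (assuming the registrator class is already on the driver and executor classpath; foo.bar.MyRegistrator is a placeholder name):

```scala
// Stop the shell's default context, then rebuild it with Kryo configured.
// foo.bar.MyRegistrator is hypothetical and must already be on the classpath.
import org.apache.spark.{SparkConf, SparkContext}

sc.stop()

val conf = new SparkConf()
  .setAppName("kryo-shell")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "foo.bar.MyRegistrator")

// Shadows the old `sc` binding in the REPL.
val sc = new SparkContext(conf)
```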
Re: How to set KryoRegistrator class in spark-shell
You need to register it using spark-defaults.conf, as explained here:

https://books.google.com/books?id=WE_GBwAAQBAJ&pg=PA239&lpg=PA239&dq=spark+shell+register+kryo+serialization&source=bl&ots=vCxgEfz1-2&sig=dHU8FY81zVoBqYIJbCFuRwyFjAw&hl=en&sa=X&ved=0CEwQ6AEwB2oVChMIn_iujpCGxgIVDZmICh3kYADW#v=onepage&q=spark%20shell%20register%20kryo%20serialization&f=false

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-KryoRegistrator-class-in-spark-shell-tp12498p23265.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
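For reference, spark-shell reads its defaults from conf/spark-defaults.conf, which is a plain whitespace-separated properties file (not XML); a sketch with a placeholder class name:

```properties
# conf/spark-defaults.conf — foo.bar.MyRegistrator is hypothetical;
# the jar containing it must still reach the executors (e.g. spark.jars).
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator  foo.bar.MyRegistrator
```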
Re: How to set KryoRegistrator class in spark-shell
I can do that in my application, but I really want to know how I can do it in spark-shell, because I usually prototype in spark-shell before I put the code into an application.

On Wed, Aug 20, 2014 at 12:47 PM, Sameer Tilak wrote:
RE: How to set KryoRegistrator class in spark-shell
Hi Wang,

Have you tried doing this in your application?

    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    conf.set("spark.kryo.registrator", "yourpackage.MyKryoRegistrator")

You then don't need to specify it via the command line.
How to set KryoRegistrator class in spark-shell
I want to use opencsv's CSVParser to parse CSV lines using a script like the one below in spark-shell:

    import au.com.bytecode.opencsv.CSVParser
    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.serializer.KryoRegistrator
    import org.apache.hadoop.fs.{Path, FileSystem}

    class MyKryoRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo) {
        kryo.register(classOf[CSVParser])
      }
    }

    val outDir = "/tmp/dmc-out"

    val fs = FileSystem.get(sc.hadoopConfiguration)
    fs.delete(new Path(outDir), true)

    val largeLines = sc.textFile("/tmp/dmc-03-08/*.gz")
    val parser = new CSVParser('|', '"')
    largeLines.map(parser.parseLine(_).toList).saveAsTextFile(outDir,
      classOf[org.apache.hadoop.io.compress.GzipCodec])

If I start spark-shell with spark.kryo.registrator like this

    SPARK_JAVA_OPTS="-Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.kryo.registrator=MyKryoRegistrator" spark-shell

it complains that MyKryoRegistrator is not found when I run ":load my_script" in spark-shell:

    14/08/20 12:14:01 ERROR KryoSerializer: Failed to run spark.kryo.registrator
    java.lang.ClassNotFoundException: MyKryoRegistrator

What's wrong?
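[Editor's note] The likely cause of the ClassNotFoundException is that a class defined via :load lives only inside the REPL, so it is not visible when the executor JVMs instantiate the Kryo serializer by class name. One workaround, sketched under the assumption of a Spark 1.x setup (the file, package, and jar names below are all hypothetical), is to compile the registrator ahead of time and ship the jar:

```shell
# MyKryoRegistrator.scala is assumed to declare `package my`.
# Compile it against the Spark assembly jar into its own jar:
scalac -cp "$SPARK_HOME/lib/spark-assembly.jar" \
  -d registrator.jar MyKryoRegistrator.scala

# Launch the shell with the jar shipped to executors and the
# fully-qualified class name in spark.kryo.registrator:
SPARK_JAVA_OPTS="-Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.kryo.registrator=my.MyKryoRegistrator" \
  spark-shell --jars registrator.jar
```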