Some Spark MLLIB tests failing due to some classes not being registered with Kryo
Hi Dev,

I'm running the MLlib tests on the current master branch and the following suites are failing because some classes are not registered with Kryo:

org.apache.spark.mllib.MatricesSuite
org.apache.spark.mllib.VectorsSuite
org.apache.spark.ml.InstanceSuite

I can solve it by registering the failing classes with Kryo, but I'm wondering if I'm missing something, as these tests shouldn't be failing on master.

Any suggestions on what I may be doing wrong?

Thank you.
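For reference, the registration mentioned above can also be done through configuration rather than code. A hedged sketch of the relevant spark-defaults.conf properties — the class names listed here are illustrative only; the actual classes to register are the ones named in the test failure messages:

```properties
# Use Kryo and fail fast when an unregistered class is serialized
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired  true

# Comma-separated list of classes to register (names below are illustrative)
spark.kryo.classesToRegister     org.apache.spark.mllib.linalg.DenseMatrix,org.apache.spark.mllib.linalg.DenseVector
```

The same effect can be achieved programmatically with SparkConf.registerKryoClasses, which is what the test suites themselves would normally rely on.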
Re: Some Spark MLLIB tests failing due to some classes not being registered with Kryo
Hi Jorge,

then try running the tests not from the mllib folder, but from the Spark base directory. If you want to run only the tests in mllib, you can specify the project using the -pl argument of mvn.

Thanks,
Marco

2017-11-11 13:37 GMT+01:00 Jorge Sánchez:
> Hi Marco,
>
> Just mvn test from the mllib folder.
>
> Thank you.
>
> 2017-11-11 12:36 GMT+00:00 Marco Gaido:
>> Hi Jorge,
>>
>> how are you running those tests?
>>
>> Thanks,
>> Marco
Re: Some Spark MLLIB tests failing due to some classes not being registered with Kryo
No luck running the full test suite with mvn test from the main folder, or with just mvn -pl mllib.

Any other suggestions would be much appreciated.

Thank you.
how to replace hdfs with a custom distributed fs?
Hi, I have my own distributed Java filesystem and I would like to use it for storing data in Spark. How can I do this? Is there an example of how to do it?
Re: how to replace hdfs with a custom distributed fs?
You can implement the Hadoop FileSystem API for your distributed Java fs and just plug it into Spark through the Hadoop API.
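Concretely, Hadoop maps a URI scheme to a FileSystem implementation through configuration, and Spark forwards any property prefixed with spark.hadoop. into the Hadoop Configuration. Assuming a hypothetical scheme myfs:// backed by a hypothetical class com.example.MyFileSystem that extends org.apache.hadoop.fs.FileSystem, a spark-defaults.conf sketch might look like:

```properties
# Map the myfs:// scheme to the custom FileSystem class
# (scheme and class name here are hypothetical)
spark.hadoop.fs.myfs.impl    com.example.MyFileSystem
```

Paths like myfs://host/some/path can then be used anywhere Spark accepts a filesystem path. The custom class has to implement the abstract FileSystem contract — open, create, rename, delete, listStatus, getFileStatus, mkdirs and so on — for Spark to be able to read and write through it.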
is there a way for removing hadoop from spark
Considering the case where I don't need HDFS, is there a way to remove Hadoop completely from Spark? Is YARN the only Hadoop dependency in Spark? Is there a Java or Scala (JDK languages) YARN-like library I can embed in a project instead of calling external servers? Is the YARN library difficult to customize? I'm asking these different questions to understand which is the best approach for me.
Re: is there a way for removing hadoop from spark
Hey Cristian,

You don't need to remove anything. Spark has a standalone mode; actually that's the default.
https://spark.apache.org/docs/latest/spark-standalone.html

When building Spark (and you should build it yourself), just use the options that suit you:
https://spark.apache.org/docs/latest/building-spark.html

Regards,
Yohann Jardin
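As a concrete sketch of the standalone route: a master is started with sbin/start-master.sh, workers attach to it (in the Spark 2.x of this thread the script is sbin/start-slave.sh; it was later renamed start-worker.sh), and applications only need the master URL, e.g. via spark-defaults.conf (hostname below is hypothetical):

```properties
# Run against a standalone master -- no YARN or HDFS services involved
spark.master    spark://master-host:7077
```

Local paths or any other supported filesystem can then be used for storage. Note that the Hadoop client libraries still ship with Spark as jar dependencies, but no Hadoop services need to be running.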
Spark json data - avro schema validation
Hi,

I have a Spark Streaming application which receives logs that have encoded JSON in them. The JSON complies with an Avro schema, and as part of the process I'm converting the JSON to a data class, which of course is a row in a Dataset. It's a nested object, in fact.

In this scenario I'm looking to validate the inbound JSON to see whether it complies with the definition of the Avro schema. I haven't found any existing approach to perform this validation, or I'm not aware of one. I'm hoping to get some direction from this group to get going on the validation front.

Thanks,
Barath.
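One common route is Avro's own tooling: on the JVM, parsing the schema with org.apache.avro.Schema.Parser and decoding each JSON record with DecoderFactory.get().jsonDecoder(schema, json) throws on non-conforming input, which can be caught per record. To make the idea concrete without pulling in a library, here is a minimal hand-rolled sketch in Python that checks a JSON document against a small subset of Avro schema shapes (records, arrays, and a few primitives — no unions, maps, or logical types; the schema and its field names are invented for illustration):

```python
import json

# Hypothetical example schema; names and fields are illustrative only.
SCHEMA = {
    "type": "record",
    "name": "LogEvent",
    "fields": [
        {"name": "host", "type": "string"},
        {"name": "status", "type": "int"},
        {"name": "tags", "type": {"type": "array", "items": "string"}},
    ],
}

# Avro primitive name -> Python type, for the subset we handle.
PRIMITIVES = {
    "string": str,
    "int": int,
    "long": int,
    "double": float,
    "boolean": bool,
}

def conforms(value, schema):
    """Check `value` against a small subset of Avro schema types."""
    if isinstance(schema, str):  # primitive named by a bare string
        expected = PRIMITIVES.get(schema)
        # bool is a subclass of int in Python, so rule it out for int/long
        if expected is int and isinstance(value, bool):
            return False
        return expected is not None and isinstance(value, expected)
    if schema["type"] == "record":
        if not isinstance(value, dict):
            return False
        return all(
            f["name"] in value and conforms(value[f["name"]], f["type"])
            for f in schema["fields"]
        )
    if schema["type"] == "array":
        return isinstance(value, list) and all(
            conforms(v, schema["items"]) for v in value
        )
    return False  # unions, maps, enums, etc. not handled in this sketch

good = json.loads('{"host": "web1", "status": 200, "tags": ["a", "b"]}')
bad = json.loads('{"host": "web1", "status": "200", "tags": []}')
print(conforms(good, SCHEMA))  # True
print(conforms(bad, SCHEMA))   # False: status is a string, not an int
```

In a streaming job the same check would run inside the map/filter that parses each record, routing failures to a side output rather than failing the batch. A real implementation should lean on the Avro library's resolution rules instead of a subset like this.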