Glad to hear it. Could you please share your solution on the user mailing list? -Xiangrui
On Mon, Jul 18, 2016 at 2:26 AM Alger Remirata <abremirat...@gmail.com> wrote:

> Hi Xiangrui,
>
> We have now solved the problem. Thanks for all the tips you've given.
>
> Best Regards,
>
> Alger
>
> On Thu, Jul 14, 2016 at 2:43 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>
>> By using Cloudera Manager as the standalone cluster manager.
>>
>> On Thu, Jul 14, 2016 at 2:20 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>
>>> It looks like a lot of people have already posted about
>>> ClassNotFoundException in cluster mode for version 1.5.1:
>>>
>>> https://www.mail-archive.com/user@spark.apache.org/msg43089.html
>>>
>>> On Thu, Jul 14, 2016 at 12:45 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>
>>>> Hi Xiangrui,
>>>>
>>>> I checked all the nodes of the cluster. The code works locally on each
>>>> node, but there's an error when it is deployed to the cluster itself. I
>>>> still don't understand why it works locally on every individual node yet
>>>> fails with the error mentioned when deployed to the Hadoop cluster.
>>>>
>>>> Thanks,
>>>>
>>>> Alger
>>>>
>>>> On Wed, Jul 13, 2016 at 4:38 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>>
>>>>> Since we're using mvn to build, it looks like mvn didn't add the
>>>>> class. Is there something to add in pom.xml so that the new class can
>>>>> be recognized?
>>>>>
>>>>> On Wed, Jul 13, 2016 at 4:21 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for the reply; however, I couldn't locate the MLlib jar. What
>>>>>> I have is a fat 'spark-assembly-1.5.1-hadoop2.6.0.jar'.
>>>>>>
>>>>>> There's an error when I copy user@spark.apache.org: the message
>>>>>> suddenly is not sent when I do that.
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 4:13 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the reply; however, I couldn't locate the MLlib jar. What
>>>>>>> I have is a fat 'spark-assembly-1.5.1-hadoop2.6.0.jar'.
>>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 3:23 AM, Xiangrui Meng <m...@databricks.com> wrote:
>>>>>>>
>>>>>>>> (+user@spark. Please copy user@ so other people can see and help.)
>>>>>>>>
>>>>>>>> The error message means you have an MLlib jar on the classpath, but
>>>>>>>> it didn't contain ALS$StandardNNLSSolver. So either the modified jar
>>>>>>>> was not deployed to the workers, or an unmodified MLlib jar sits in
>>>>>>>> front of the modified one on the classpath. You can check the worker
>>>>>>>> logs to see the classpath used in launching the worker, and then
>>>>>>>> check the MLlib jars on that classpath. -Xiangrui
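>>>>>>>>
>>>>>>>> A minimal sketch of that check, assuming spark-shell can connect to
>>>>>>>> the cluster (sc is the shell's predefined SparkContext): each task
>>>>>>>> reports the jar that backs the ALS class on its executor, so an old
>>>>>>>> assembly jar shadowing the modified one shows up immediately.
>>>>>>>>
>>>>>>>> import org.apache.spark.ml.recommendation.ALS
>>>>>>>> // Collect one string per distinct code source across the executors.
>>>>>>>> val locs = sc.parallelize(1 to 100, 20).map { _ =>
>>>>>>>>   // getCodeSource is null for bootstrap-loaded classes
>>>>>>>>   val src = classOf[ALS].getProtectionDomain.getCodeSource
>>>>>>>>   if (src != null) src.getLocation.toString else "bootstrap"
>>>>>>>> }.distinct().collect()
>>>>>>>> locs.foreach(println)
>>>>>>>>
>>>>>>>> If anything other than the modified assembly jar is printed, that
>>>>>>>> jar wins on the executors' classpath.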
>>>>>>>> On Sun, Jul 10, 2016 at 10:18 PM Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Xiangrui,
>>>>>>>>>
>>>>>>>>> We have the modified jars deployed on both master and slave nodes.
>>>>>>>>>
>>>>>>>>> What do you mean by this line: "1. The unmodified Spark jars were
>>>>>>>>> not on the classpath (already existed on the cluster or pulled in by
>>>>>>>>> other packages)"?
>>>>>>>>>
>>>>>>>>> How would I check that the unmodified Spark jars are not on the
>>>>>>>>> classpath? We replaced the contents of the SPARK_HOME directory
>>>>>>>>> entirely; the newly built customized Spark is now what our current
>>>>>>>>> SPARK_HOME contains.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Alger
>>>>>>>>>
>>>>>>>>> On Fri, Jul 8, 2016 at 1:32 PM, Xiangrui Meng <m...@databricks.com> wrote:
>>>>>>>>>
>>>>>>>>>> This seems like a deployment or dependency issue. Please check
>>>>>>>>>> the following:
>>>>>>>>>> 1. The unmodified Spark jars were not on the classpath (already
>>>>>>>>>> existed on the cluster or pulled in by other packages).
>>>>>>>>>> 2. The modified jars were indeed deployed to both master and
>>>>>>>>>> slave nodes.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 5, 2016 at 12:29 PM Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> First of all, we would like to thank you for developing Spark. It
>>>>>>>>>>> helps us a lot in our data science tasks.
>>>>>>>>>>>
>>>>>>>>>>> I have a question. We have built a customized Spark using the
>>>>>>>>>>> following command:
>>>>>>>>>>> mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive
>>>>>>>>>>> -Phive-thriftserver -DskipTests clean package
>>>>>>>>>>>
>>>>>>>>>>> In the custom Spark we built, we added a new Scala file called
>>>>>>>>>>> StandardNNLS.scala; however, we get an error saying:
>>>>>>>>>>>
>>>>>>>>>>> Name: org.apache.spark.SparkException
>>>>>>>>>>> Message: Job aborted due to stage failure: Task 21 in stage 34.0
>>>>>>>>>>> failed 4 times, most recent failure: Lost task 21.3 in stage 34.0
>>>>>>>>>>> (TID 2547, 192.168.60.115): java.lang.ClassNotFoundException:
>>>>>>>>>>> org.apache.spark.ml.recommendation.ALS$StandardNNLSSolver
>>>>>>>>>>>
>>>>>>>>>>> StandardNNLSSolver is defined in that separate Scala file,
>>>>>>>>>>> StandardNNLS.scala, since we replaced the original NNLS solver
>>>>>>>>>>> with StandardNNLS.
>>>>>>>>>>> Do you have any idea about the error? Is there a config file we
>>>>>>>>>>> need to edit to add the classpath? Even if we insert the added
>>>>>>>>>>> code directly in ALS.scala instead of creating another file like
>>>>>>>>>>> StandardNNLS.scala, the inserted code is not recognized; it still
>>>>>>>>>>> fails with the same ClassNotFoundException.
>>>>>>>>>>>
>>>>>>>>>>> However, when we run this on our local machine rather than on the
>>>>>>>>>>> Hadoop cluster, it works. We don't know whether the error occurs
>>>>>>>>>>> because we use mvn to build the custom Spark or because of
>>>>>>>>>>> something in communicating with the Hadoop cluster.
>>>>>>>>>>>
>>>>>>>>>>> We would like to ask you for ideas on how to solve this problem.
>>>>>>>>>>> We could create a separate package that does not depend on Apache
>>>>>>>>>>> Spark, but that would be very slow. As of now, we are still
>>>>>>>>>>> learning Scala and Spark; using the Apache Spark utilities makes
>>>>>>>>>>> the code faster. If we made a separate package not dependent on
>>>>>>>>>>> Apache Spark, we would have to recode the utilities that are
>>>>>>>>>>> private in Apache Spark. So it is better to use Apache Spark and
>>>>>>>>>>> insert the code we need.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Alger
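A complementary sanity check, sketched under the assumption that the Scala
REPL (or spark-shell) is available on a worker node: list the assembly jar's
entries and confirm the custom class was actually packaged. The jar path
below is illustrative, not the real deployed location.

import java.util.jar.JarFile
import scala.collection.JavaConverters._

// Illustrative path: substitute the assembly jar actually deployed on the worker.
val jar = new JarFile("/path/to/spark-assembly-1.5.1-hadoop2.6.0.jar")
jar.entries().asScala
  .map(_.getName)
  .filter(_.contains("StandardNNLSSolver"))
  .foreach(println)
// Expect an entry like org/apache/spark/ml/recommendation/ALS$StandardNNLSSolver.class;
// no output means the class never made it into this jar.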