Glad to hear it. Could you please share your solution on the user mailing list? -Xiangrui
On Mon, Jul 18, 2016 at 2:26 AM Alger Remirata <abremirat...@gmail.com> wrote:

> Hi Xiangrui,
>
> We have now solved the problem. Thanks for all the tips you've given.
>
> Best Regards,
>
> Alger
>
> On Thu, Jul 14, 2016 at 2:43 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>
>> By using Cloudera Manager as the standalone cluster manager.
>>
>> On Thu, Jul 14, 2016 at 2:20 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>
>>> It looks like a lot of people have already posted about
>>> ClassNotFoundException in cluster mode for version 1.5.1:
>>>
>>> https://www.mail-archive.com/user@spark.apache.org/msg43089.html
>>>
>>> On Thu, Jul 14, 2016 at 12:45 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>
>>>> Hi Xiangrui,
>>>>
>>>> I checked all the nodes of the cluster. The code works locally on each
>>>> node, but there's an error when it is deployed to the cluster itself. I
>>>> still don't understand why it works locally on every individual node yet
>>>> fails with the error mentioned when deployed to the Hadoop cluster.
>>>>
>>>> Thanks,
>>>>
>>>> Alger
>>>>
>>>> On Wed, Jul 13, 2016 at 4:38 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>>
>>>>> Since we're using mvn to build, it looks like mvn didn't add the
>>>>> class. Is there something to add in pom.xml so that the new class can
>>>>> be recognized?
>>>>>
>>>>> On Wed, Jul 13, 2016 at 4:21 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for the reply; however, I couldn't locate the MLlib jar. What
>>>>>> I have is a fat 'spark-assembly-1.5.1-hadoop2.6.0.jar'.
>>>>>>
>>>>>> There's an error when I copy user@spark.apache.org: the message
>>>>>> suddenly is not sent when I do that.
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 4:13 AM, Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the reply; however, I couldn't locate the MLlib jar. What
>>>>>>> I have is a fat 'spark-assembly-1.5.1-hadoop2.6.0.jar'.
>>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 3:23 AM, Xiangrui Meng <m...@databricks.com> wrote:
>>>>>>>
>>>>>>>> (+user@spark. Please copy user@ so other people can see and help.)
>>>>>>>>
>>>>>>>> The error message means you have an MLlib jar on the classpath, but
>>>>>>>> it didn't contain ALS$StandardNNLSSolver. So either the modified jar
>>>>>>>> was not deployed to the workers, or an unmodified MLlib jar sits in
>>>>>>>> front of the modified one on the classpath. You can check the worker
>>>>>>>> logs to see the classpath used in launching the worker, and then
>>>>>>>> check the MLlib jars on that classpath. -Xiangrui
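>>>>>>>>
>>>>>>>> A minimal sketch of that check, assuming spark-shell can connect to
>>>>>>>> the cluster (sc is the shell's predefined SparkContext): each task
>>>>>>>> reports the jar that backs the ALS class on its executor, so an old
>>>>>>>> assembly jar shadowing the modified one shows up immediately.
>>>>>>>>
>>>>>>>> import org.apache.spark.ml.recommendation.ALS
>>>>>>>> // Collect one string per distinct code source across the executors.
>>>>>>>> val locs = sc.parallelize(1 to 100, 20).map { _ =>
>>>>>>>>   // getCodeSource is null for bootstrap-loaded classes
>>>>>>>>   val src = classOf[ALS].getProtectionDomain.getCodeSource
>>>>>>>>   if (src != null) src.getLocation.toString else "bootstrap"
>>>>>>>> }.distinct().collect()
>>>>>>>> locs.foreach(println)
>>>>>>>>
>>>>>>>> If anything other than the modified assembly jar is printed, that
>>>>>>>> jar wins on the executors' classpath.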
>>>>>>>> On Sun, Jul 10, 2016 at 10:18 PM Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Xiangrui,
>>>>>>>>>
>>>>>>>>> We have the modified jars deployed on both master and slave nodes.
>>>>>>>>>
>>>>>>>>> What do you mean by this line: "1. The unmodified Spark jars were
>>>>>>>>> not on the classpath (already existed on the cluster or pulled in by
>>>>>>>>> other packages)"?
>>>>>>>>>
>>>>>>>>> How would I check that the unmodified Spark jars are not on the
>>>>>>>>> classpath? We replaced the contents of the SPARK_HOME directory
>>>>>>>>> entirely; the newly built customized Spark is now what our current
>>>>>>>>> SPARK_HOME contains.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Alger
>>>>>>>>>
>>>>>>>>> On Fri, Jul 8, 2016 at 1:32 PM, Xiangrui Meng <m...@databricks.com> wrote:
>>>>>>>>>
>>>>>>>>>> This seems like a deployment or dependency issue. Please check
>>>>>>>>>> the following:
>>>>>>>>>> 1. The unmodified Spark jars were not on the classpath (already
>>>>>>>>>> existed on the cluster or pulled in by other packages).
>>>>>>>>>> 2. The modified jars were indeed deployed to both master and
>>>>>>>>>> slave nodes.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 5, 2016 at 12:29 PM Alger Remirata <abremirat...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> First of all, we would like to thank you for developing Spark. It
>>>>>>>>>>> helps us a lot in our data science tasks.
>>>>>>>>>>>
>>>>>>>>>>> I have a question. We have built a customized Spark using the
>>>>>>>>>>> following command:
>>>>>>>>>>> mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive
>>>>>>>>>>> -Phive-thriftserver -DskipTests clean package
>>>>>>>>>>>
>>>>>>>>>>> In the custom Spark we built, we added a new Scala file called
>>>>>>>>>>> StandardNNLS.scala; however, we get an error saying:
>>>>>>>>>>>
>>>>>>>>>>> Name: org.apache.spark.SparkException
>>>>>>>>>>> Message: Job aborted due to stage failure: Task 21 in stage 34.0
>>>>>>>>>>> failed 4 times, most recent failure: Lost task 21.3 in stage 34.0
>>>>>>>>>>> (TID 2547, 192.168.60.115): java.lang.ClassNotFoundException:
>>>>>>>>>>> org.apache.spark.ml.recommendation.ALS$StandardNNLSSolver
>>>>>>>>>>>
>>>>>>>>>>> StandardNNLSSolver is defined in that separate Scala file,
>>>>>>>>>>> StandardNNLS.scala, since we replaced the original NNLS solver
>>>>>>>>>>> with StandardNNLS.
>>>>>>>>>>> Do you have any idea about the error? Is there a config file we
>>>>>>>>>>> need to edit to add the classpath? Even if we insert the added
>>>>>>>>>>> code directly in ALS.scala instead of creating another file like
>>>>>>>>>>> StandardNNLS.scala, the inserted code is not recognized; it still
>>>>>>>>>>> fails with the same ClassNotFoundException.
>>>>>>>>>>>
>>>>>>>>>>> However, when we run this on our local machine rather than on the
>>>>>>>>>>> Hadoop cluster, it works. We don't know whether the error occurs
>>>>>>>>>>> because we use mvn to build the custom Spark or because of
>>>>>>>>>>> something in communicating with the Hadoop cluster.
>>>>>>>>>>>
>>>>>>>>>>> We would like to ask you for ideas on how to solve this problem.
>>>>>>>>>>> We could create a separate package that does not depend on Apache
>>>>>>>>>>> Spark, but that would be very slow. As of now, we are still
>>>>>>>>>>> learning Scala and Spark; using the Apache Spark utilities makes
>>>>>>>>>>> the code faster. If we made a separate package not dependent on
>>>>>>>>>>> Apache Spark, we would have to recode the utilities that are
>>>>>>>>>>> private in Apache Spark. So it is better to use Apache Spark and
>>>>>>>>>>> insert the code we need.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Alger
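A complementary sanity check, sketched under the assumption that the Scala
REPL (or spark-shell) is available on a worker node: list the assembly jar's
entries and confirm the custom class was actually packaged. The jar path
below is illustrative, not the real deployed location.

import java.util.jar.JarFile
import scala.collection.JavaConverters._

// Illustrative path: substitute the assembly jar actually deployed on the worker.
val jar = new JarFile("/path/to/spark-assembly-1.5.1-hadoop2.6.0.jar")
jar.entries().asScala
  .map(_.getName)
  .filter(_.contains("StandardNNLSSolver"))
  .foreach(println)
// Expect an entry like org/apache/spark/ml/recommendation/ALS$StandardNNLSSolver.class;
// no output means the class never made it into this jar.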