Hi Shahid,

To be honest, I think this question is better suited for Stack Overflow than for a PhD thesis.
On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf <sha...@trialx.com> wrote:
> Hi,
>
> I have a 10-node cluster. I loaded the data onto HDFS, and the number of
> partitions I get is 9. I am running a Spark application, and it gets stuck
> on one of the tasks. Looking at the UI, it seems the application is not
> using all nodes for the computation. Attached is a screenshot of the
> tasks; it seems tasks are placed on each node more than once. Of the 9
> tasks, 8 complete within 7-8 minutes and one takes around 30 minutes,
> which delays the results.
>
> On Tue, Jul 14, 2015 at 10:48 AM, Shashidhar Rao
> <raoshashidhar...@gmail.com> wrote:
>
>> Hi,
>>
>> I am doing my PhD thesis on large-scale machine learning, e.g. online
>> learning, batch and mini-batch learning.
>>
>> Could somebody help me with ideas, especially in the context of Spark
>> and the above learning methods?
>>
>> Some ideas would be improvements to existing algorithms, or implementing
>> new features, especially the above learning methods and algorithms that
>> have not been implemented yet.
>>
>> If somebody could help me with some ideas, it would really accelerate my
>> work.
>>
>> A few ideas on research papers regarding Spark or Mahout would also be
>> welcome.
>>
>> Thanks in advance.
>>
>> Regards
>
> --
> With regards,
> Shahid Ashraf