There seems to be a bit of confusion here - the OP (doing the PhD) had the
thread hijacked by someone with a similar name asking a mundane question.

It would be a shame to so rudely send away someone who may do valuable
work on Spark.

Shashidhar (not Shahid!), I'm personally interested in running graph
algorithms for image segmentation using MLlib and Spark.  I've got many
questions though - like is it even going to give me a speed-up?
(http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html)

It's not obvious to me which classes of graph algorithms can be implemented
correctly and efficiently in a highly parallel manner.  There's tons of
work to be done here, I'm sure. Also, look at parallel geospatial
algorithms - there's a lot of work being done on this.

Best, Will



On 15 July 2015 at 09:01, Vineel Yalamarthy <vineelyalamar...@gmail.com>
wrote:

> Hi Daniel
>
> Well said
>
> Regards
> Vineel
>
> On Tue, Jul 14, 2015, 6:11 AM Daniel Darabos <
> daniel.dara...@lynxanalytics.com> wrote:
>
>> Hi Shahid,
>> To be honest I think this question is better suited for Stack Overflow
>> than for a PhD thesis.
>>
>> On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf <sha...@trialx.com> wrote:
>>
>>> hi
>>>
>>> I have a 10-node cluster. I loaded the data onto HDFS, and the number of
>>> partitions I get is 9. I am running a Spark application and it gets stuck
>>> on one of the tasks. Looking at the UI, it seems the application is not
>>> using all nodes for the computation. Attached is a screenshot of the
>>> tasks; it seems tasks are put on each node more than once. Looking at the
>>> tasks, 8 complete within 7-8 minutes and one takes around 30 minutes,
>>> which delays the results.
>>>
>>>
>>> On Tue, Jul 14, 2015 at 10:48 AM, Shashidhar Rao <
>>> raoshashidhar...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am doing my PhD thesis on large-scale machine learning, e.g. online
>>>> learning, batch and mini-batch learning.
>>>>
>>>> Could somebody help me with ideas, especially in the context of Spark
>>>> and the above learning methods?
>>>>
>>>> Some ideas like improvement to existing algorithms, implementing new
>>>> features especially the above learning methods and algorithms that have not
>>>> been implemented etc.
>>>>
>>>> If somebody could help me with some ideas it would really accelerate my
>>>> work.
>>>>
>>>> Plus a few ideas on research papers regarding Spark or Mahout.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards
>>>>
>>>
>>>
>>>
>>> --
>>> with Regards
>>> Shahid Ashraf
>>>
>>>
>>>
>>
>>
