Sorry guys! I mistakenly added my question to this thread (Research ideas using Spark). That said, people can ask any question; this Spark user group is for that.
Cheers! 😊

On Wed, Jul 15, 2015 at 9:43 PM, Robin East <robin.e...@xense.co.uk> wrote:

> Well said, Will. I would add that you might want to investigate GraphChi,
> which claims to run a number of large-scale graph processing tasks on a
> single workstation much faster than a very large Hadoop cluster. It would
> be interesting to know how widely applicable GraphChi's approach is and
> what implications it has for parallel/distributed computing approaches. A
> rich seam to mine indeed.
>
> Robin
>
> On 15 Jul 2015, at 14:48, William Temperley <willtemper...@gmail.com> wrote:
>
> There seems to be a bit of confusion here - the OP (doing the PhD) had the
> thread hijacked by someone with a similar name asking a mundane question.
> It would be a shame to send away so rudely someone who may do valuable
> work on Spark.
>
> Shashidhar (not Shahid!) - I'm personally interested in running graph
> algorithms for image segmentation using MLlib and Spark. I've got many
> questions, though - like, is it even going to give me a speed-up?
> (http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html)
>
> It's not obvious to me which classes of graph algorithms can be
> implemented correctly and efficiently in a highly parallel manner. There's
> tons of work to be done here, I'm sure. Also, look at parallel geospatial
> algorithms - there's a lot of work being done on this.
>
> Best, Will
>
> On 15 July 2015 at 09:01, Vineel Yalamarthy <vineelyalamar...@gmail.com> wrote:
>
>> Hi Daniel
>>
>> Well said
>>
>> Regards
>> Vineel
>>
>> On Tue, Jul 14, 2015, 6:11 AM Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:
>>
>>> Hi Shahid,
>>> To be honest, I think this question is better suited for Stack Overflow
>>> than for a PhD thesis.
>>>
>>> On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf <sha...@trialx.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I have a 10-node cluster. I loaded the data onto HDFS, so the number of
>>>> partitions I get is 9.
>>>> I am running a Spark application, and it gets stuck on one of the
>>>> tasks. Looking at the UI, it seems the application is not using all
>>>> nodes for the computation. Attached is a screenshot of the tasks; it
>>>> looks like tasks are placed on each node more than once. Eight tasks
>>>> complete within 7-8 minutes, while one task takes around 30 minutes,
>>>> causing the delay in the results.
>>>>
>>>> On Tue, Jul 14, 2015 at 10:48 AM, Shashidhar Rao <raoshashidhar...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am doing my PhD thesis on large-scale machine learning, e.g. online
>>>>> learning, batch and mini-batch learning.
>>>>>
>>>>> Could somebody help me with ideas, especially in the context of Spark
>>>>> and the above learning methods?
>>>>>
>>>>> Some ideas like improvements to existing algorithms, implementing new
>>>>> features, especially the above learning methods, and algorithms that
>>>>> have not been implemented yet, etc.
>>>>>
>>>>> If somebody could help me with some ideas, it would really accelerate
>>>>> my work.
>>>>>
>>>>> Plus a few ideas on research papers regarding Spark or Mahout.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Regards
>>>>
>>>> --
>>>> with Regards
>>>> Shahid Ashraf
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>> For additional commands, e-mail: user-h...@spark.apache.org

--
with Regards
Shahid Ashraf
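
[Editor's note] Shahid's symptom - eight tasks finishing in 7-8 minutes while one takes 30 - is the classic signature of data skew: one partition holds a "hot" key with far more records than the rest, and with only 9 partitions on a 10-node cluster at most 9 tasks run at once anyway. A common mitigation is to "salt" hot keys with a random suffix before the shuffle and combine the partial results afterwards. Below is a minimal sketch in plain Python (an assumption: a stand-in for Spark's hash partitioner, since no cluster is at hand here; the key names, counts, and `SALT_BUCKETS` value are all hypothetical illustration, not from the thread):

```python
# Sketch: why one hot key makes a single task slow, and how salting
# a key with a random suffix spreads its records across partitions.
import random
from collections import Counter

NUM_PARTITIONS = 9   # matches the 9 partitions reported in the thread
SALT_BUCKETS = 9     # hypothetical salt range; tune to the observed skew

def partition_of(key, num_partitions=NUM_PARTITIONS):
    # Plain-Python stand-in for hash partitioning (Spark hashes the key
    # and takes it modulo the number of partitions).
    return hash(key) % num_partitions

# Skewed data: one hot key dominates, like the 30-minute straggler task.
records = [("hot", i) for i in range(90_000)] + \
          [(f"key{i}", i) for i in range(10_000)]

def partition_sizes(keyed_records):
    return Counter(partition_of(k) for k, _ in keyed_records)

plain = partition_sizes(records)

# Salting: rewrite each key as (key, random_salt) before the shuffle, so
# the hot key's records fan out; aggregate per salted key, then combine
# the partial aggregates per real key in a second, much smaller pass.
salted = partition_sizes(
    ((k, random.randrange(SALT_BUCKETS)), v) for k, v in records
)

print("max partition size before salting:", max(plain.values()))
print("max partition size after salting: ", max(salted.values()))
```

In Spark itself the same idea applies to `reduceByKey`/`groupByKey` pipelines; `repartition` alone raises parallelism but cannot split a single hot key, which is why the straggler survives it.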
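
[Editor's note] The COST page Will links makes a concrete point worth seeing in code: for graphs that fit on one machine, a simple single-threaded algorithm often beats a distributed one outright, so "will it even give me a speed-up?" is a fair question. Here is a minimal sketch of that style - connected components by iterated minimum-label propagation - on a tiny hypothetical graph (not an implementation from the thread or from GraphChi):

```python
# Single-threaded connected components by label propagation: every vertex
# starts with its own id as its label, and we repeatedly push the minimum
# label across each edge until a full pass changes nothing.
def connected_components(edges, num_vertices):
    labels = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            lo = min(labels[u], labels[v])
            if labels[u] != lo or labels[v] != lo:
                labels[u] = labels[v] = lo
                changed = True
    return labels

edges = [(0, 1), (1, 2), (3, 4)]  # two components: {0,1,2} and {3,4}
print(connected_components(edges, 5))  # → [0, 0, 0, 3, 3]
```

The same min-label propagation maps naturally onto GraphX's Pregel-style API, which makes it a useful baseline for the distributed-vs-single-machine comparison the COST paper advocates.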