Re: Research ideas using spark

Ravindra Wed, 15 Jul 2015 11:20:48 -0700

Look at this:
http://www.forbes.com/sites/lisabrownlee/2015/07/10/the-11-trillion-internet-of-things-big-data-and-pattern-of-life-pol-analytics/


On Wed, Jul 15, 2015 at 10:19 PM shahid ashraf <sha...@trialx.com> wrote:

> Sorry Guys!
>
> I mistakenly added my question to this thread (Research ideas using
> spark). Moreover, people can ask any question; this Spark user group is for
> that.
>
> Cheers!
> 😊
>
> On Wed, Jul 15, 2015 at 9:43 PM, Robin East <robin.e...@xense.co.uk>
> wrote:
>
>> Well said Will. I would add that you might want to investigate GraphChi
>> which claims to be able to run a number of large-scale graph processing
>> tasks on a workstation much quicker than a very large Hadoop cluster. It
>> would be interesting to know how widely applicable the approach GraphChi
>> takes is and what implications it has for parallel/distributed computing
>> approaches. A rich seam to mine indeed.
>>
>> Robin
>>
>> On 15 Jul 2015, at 14:48, William Temperley <willtemper...@gmail.com>
>> wrote:
>>
>> There seems to be a bit of confusion here - the OP (doing the PhD) had
>> the thread hijacked by someone with a similar name asking a mundane
>> question.
>>
>> It would be a shame to send someone away so rudely, who may do valuable
>> work on Spark.
>>
>> Shashidhar (not Shahid!) I'm personally interested in running graph
>> algorithms for image segmentation using MLlib and Spark.  I've got many
>> questions though - like is it even going to give me a speed-up?  (
>> http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html)
>>
>> It's not obvious to me which classes of graph algorithms can be
>> implemented correctly and efficiently in a highly parallel manner.  There's
>> tons of work to be done here, I'm sure. Also, look at parallel geospatial
>> algorithms - there's a lot of work being done on this.
>>
>> Best, Will
>>
>>
>>
>> On 15 July 2015 at 09:01, Vineel Yalamarthy <vineelyalamar...@gmail.com>
>> wrote:
>>
>>> Hi Daniel
>>>
>>> Well said
>>>
>>> Regards
>>> Vineel
>>>
>>> On Tue, Jul 14, 2015, 6:11 AM Daniel Darabos <
>>> daniel.dara...@lynxanalytics.com> wrote:
>>>
>>>> Hi Shahid,
>>>> To be honest I think this question is better suited for Stack Overflow
>>>> than for a PhD thesis.
>>>>
>>>> On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf <sha...@trialx.com>
>>>> wrote:
>>>>
>>>>> hi
>>>>>
>>>>> I have a 10-node cluster. I loaded the data onto HDFS, so the number of
>>>>> partitions I get is 9. I am running a Spark application, and it gets stuck
>>>>> on one of the tasks. Looking at the UI, it seems the application is not
>>>>> using all nodes to do the calculations. Attached is a screenshot of the
>>>>> tasks; it seems tasks are put on each node more than once. Looking at the
>>>>> tasks, 8 complete in under 7-8 minutes and one takes around 30 minutes,
>>>>> which delays the results.
>>>>>
>>>>>
>>>>> On Tue, Jul 14, 2015 at 10:48 AM, Shashidhar Rao <
>>>>> raoshashidhar...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am doing my PhD thesis on large-scale machine learning, e.g. online
>>>>>> learning, batch and mini-batch learning.
>>>>>>
>>>>>> Could somebody help me with ideas, especially in the context of Spark
>>>>>> and the above learning methods?
>>>>>>
>>>>>> Some ideas like improvements to existing algorithms, implementing new
>>>>>> features (especially for the above learning methods), and algorithms that
>>>>>> have not been implemented, etc.
>>>>>>
>>>>>> If somebody could help me with some ideas it would really accelerate
>>>>>> my work.
>>>>>>
>>>>>> Plus a few ideas on research papers regarding Spark or Mahout.
>>>>>>
>>>>>> Thanks in advance.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> with Regards
>>>>> Shahid Ashraf
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>
>>
>
>
> --
> with Regards
> Shahid Ashraf
>
