Ok…
After some off-line exchanges with Shashidhar Rao, I came up with an idea…
Apply machine learning to either implement or improve autoscaling up or down
within a Storm/Akka cluster.
While I don’t know what constitutes an acceptable PhD thesis, or senior project
for undergrads… this
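To make the autoscaling idea concrete, here is a minimal rule-based scaler of the kind a machine learning approach would improve on: instead of the fixed thresholds below, an ML policy would learn when to scale from observed load. This is an illustrative pure-Python sketch; all names and thresholds are assumptions, not from any Storm/Akka API.

```python
from collections import deque

class ReactiveAutoscaler:
    """Baseline rule-based autoscaler. An ML approach would replace the
    hard-coded thresholds with a learned scaling policy (illustrative)."""

    def __init__(self, min_workers=1, max_workers=16, window=5):
        self.min_workers = min_workers
        self.max_workers = max_workers
        self.samples = deque(maxlen=window)  # recent queue depth per worker

    def decide(self, queue_depth, current_workers):
        self.samples.append(queue_depth / current_workers)
        avg = sum(self.samples) / len(self.samples)
        if avg > 100:   # overloaded: double the workers (capped)
            return min(current_workers * 2, self.max_workers)
        if avg < 10:    # underloaded: halve the workers (floored)
            return max(current_workers // 2, self.min_workers)
        return current_workers

scaler = ReactiveAutoscaler()
print(scaler.decide(queue_depth=900, current_workers=4))  # overloaded, scales up
```

The research angle would be to replace `decide` with a model trained on cluster metrics, so scaling anticipates load rather than reacting to it.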
Hi Daniel
Well said
Regards
Vineel
On Tue, Jul 14, 2015, 6:11 AM Daniel Darabos
daniel.dara...@lynxanalytics.com wrote:
Hi Shahid,
To be honest I think this question is better suited for Stack Overflow
than for a PhD thesis.
On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf
Try to repartition it to a higher number (at least 3-4 times the total # of
CPU cores). What operation are you doing? If you are doing a join/groupBy
sort of operation, it may happen that the task which is taking a long time
has all the values; in that case you need to use a Partitioner which
will
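The custom-Partitioner advice above is about key skew: one hot key sends all its records to a single task. A common fix is to salt the hot keys so their records spread over several partitions, then merge the partial results in a second pass. A minimal pure-Python sketch of the idea (not actual Spark API; names are illustrative):

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 8
SALT_BUCKETS = 4  # spread each hot key across up to this many partitions

def partition_of(key: str) -> int:
    # Deterministic hash partitioner (md5 instead of hash() so results
    # do not depend on PYTHONHASHSEED).
    return hashlib.md5(key.encode()).digest()[0] % NUM_PARTITIONS

def salted_key(key, record_id, hot_keys):
    # Mimic a custom Partitioner: append a salt to hot keys so their
    # records no longer all land in a single partition.
    if key in hot_keys:
        return f"{key}#{record_id % SALT_BUCKETS}"
    return key

# Simulate a skewed dataset where one key dominates.
records = [("hot", i) for i in range(1000)] + [("cold", i) for i in range(10)]
partitions = defaultdict(list)
for k, i in records:
    partitions[partition_of(salted_key(k, i, hot_keys={"hot"}))].append((k, i))
# A second aggregation pass would merge the per-salt partial results.
```

In Spark this corresponds to subclassing `Partitioner` (or pre-salting keys before a `reduceByKey`), at the cost of an extra merge step.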
Sorry Guys!
I mistakenly added my question to this thread (Research ideas using Spark).
Moreover, people can ask any question; this Spark user group is for that.
Cheers!
On Wed, Jul 15, 2015 at 9:43 PM, Robin East robin.e...@xense.co.uk wrote:
Well said Will. I would add that you might want
Look at this :
http://www.forbes.com/sites/lisabrownlee/2015/07/10/the-11-trillion-internet-of-things-big-data-and-pattern-of-life-pol-analytics/
Silly question…
When thinking about a PhD thesis… do you want to tie it to a specific
technology, or do you want to investigate an idea and then use a specific
technology?
Or is this an outdated way of thinking?
I am doing my PhD thesis on large scale machine learning, e.g. online learning.
I would suggest studying Spark, Flink, and Storm, and based on your
understanding and findings, prepare your research paper.
Maybe you will invent a new Spark ☺
Regards,
Vaquar khan
On 16 Jul 2015 00:47, Michael Segel msegel_had...@hotmail.com wrote:
Silly question…
When thinking about a PhD thesis…
Well, one of the strengths of Spark is standardized general distributed
processing, allowing many different types of processing, such as graph
processing, stream processing, etc. The limitation is that it is less
performant than a system focusing on only one type of processing (e.g.
graph processing).
Well said Will. I would add that you might want to investigate GraphChi, which
claims to be able to run a number of large-scale graph processing tasks on a
workstation much quicker than a very large Hadoop cluster. It would be
interesting to know how widely applicable the approach GraphChi takes
There seems to be a bit of confusion here - the OP (doing the PhD) had the
thread hijacked by someone with a similar name asking a mundane question.
It would be a shame to rudely send away someone who may do valuable
work on Spark.
Shashidhar (not Shahid!), I'm personally interested in running
On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf sha...@trialx.com wrote:
Hi,
I have a 10-node cluster. I loaded the data onto HDFS, so the no. of
partitions I get is 9. I am running a Spark
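The 9 partitions come from HDFS, not from the cluster size: Spark's initial partition count for an HDFS file is roughly one partition per HDFS block. A sketch of the arithmetic (assuming the default 128 MB block size; helper names are illustrative):

```python
import math

HDFS_BLOCK_SIZE = 128 * 1024 * 1024  # common HDFS default (assumption)

def default_partitions(file_size_bytes, block_size=HDFS_BLOCK_SIZE):
    # Roughly one input partition per HDFS block the file occupies.
    return max(1, math.ceil(file_size_bytes / block_size))

def suggested_partitions(total_cores, factor=3):
    # Rule of thumb from this thread: 3-4x the total number of cores.
    return total_cores * factor

print(default_partitions(1_200_000_000))   # a ~1.2 GB file yields 9 partitions
print(suggested_partitions(total_cores=40))
```

So a `repartition()` to several times the core count is needed to keep all executors busy, independent of how many blocks HDFS produced.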
Hi,
I am doing my PhD thesis on large scale machine learning, e.g. online
learning, batch and mini-batch learning.
Could somebody help me with ideas, especially in the context of Spark and
the above learning methods?
Some ideas like improvements to existing algorithms, implementing new
features
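To make the online vs. mini-batch distinction in the question concrete, here is a minimal pure-Python SGD sketch (illustrative, not Spark/MLlib code): with batch_size=1 each record updates the model immediately (online learning), while larger batches average the gradient over a small group of records (mini-batch learning).

```python
import random

def minibatch_sgd(data, lr=0.001, batch_size=1, epochs=100):
    """Fit y = w * x by gradient descent on squared error.
    batch_size=1 is online learning; batch_size>1 is mini-batch."""
    w = 0.0
    for _ in range(epochs):
        random.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Average gradient of (w*x - y)^2 over the batch.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w

data = [(x, 3.0 * x) for x in range(1, 11)]  # noiseless data, true w = 3
w_online = minibatch_sgd(list(data), batch_size=1)
w_mini = minibatch_sgd(list(data), batch_size=5)
```

In a distributed setting the interesting research questions start here: mini-batches parallelize naturally across partitions, whereas purely online updates serialize on the model state.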