As far as I can understand, your requirements are pretty straightforward
and doable with just simple SQL queries. Take a look at Spark SQL in the Spark
documentation.
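
To give a rough idea of what I mean, here is a minimal sketch of such a
query. Everything in it is an assumption made for illustration: the table
name `activations`, the columns `unit_id` and `had_problem`, the sample rows,
and the choice to rank by most activations first and fewest problems second.
It runs the query with Python's built-in sqlite3 module rather than Spark so
that it works without a cluster; the query is plain SQL, so the same text
could be given to Spark SQL once your data is registered as a table.

```python
import sqlite3

# Hypothetical activation log: one row per activation of a unit,
# with had_problem = 1 when that activation ran into an issue.
rows = [
    ("unit_a", 0), ("unit_a", 0), ("unit_a", 1),
    ("unit_b", 0), ("unit_b", 1), ("unit_b", 1), ("unit_b", 0),
    ("unit_c", 0), ("unit_c", 0),
]

# Assumed ranking rule: more activations rank higher, and ties are
# broken by fewer problems, then by unit name for a stable order.
RANKING_QUERY = """
SELECT unit_id,
       COUNT(*)         AS activations,
       SUM(had_problem) AS problems
FROM activations
GROUP BY unit_id
ORDER BY activations DESC, problems ASC, unit_id ASC
"""


def rank_units(activation_rows):
    """Load the rows into an in-memory table and return the ranked units."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE activations (unit_id TEXT, had_problem INTEGER)")
    conn.executemany("INSERT INTO activations VALUES (?, ?)", activation_rows)
    result = conn.execute(RANKING_QUERY).fetchall()
    conn.close()
    return result


ranking = rank_units(rows)
for position, (unit, activations, problems) in enumerate(ranking, start=1):
    print(position, unit, activations, problems)
```

Adding more criteria later would mostly mean adding columns to the SELECT
and terms to the ORDER BY, which is why I don't think you need an MLlib
algorithm to start with.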

Prashant Sharma



On Tue, Apr 12, 2016 at 8:13 PM, Joe San <codeintheo...@gmail.com> wrote:

> <http://datascience.stackexchange.com/questions/11167/algorithm-suggestion-for-a-specific-problem/11174?noredirect=1#>
>
> I'm working on a problem in which I have some data sets about some power
> generating units. Each of these units has been activated to run in the
> past, and during activation some units ran into issues. I now have all
> this data and I would like to come up with some sort of ranking for these
> generating units. The criteria for ranking would be pretty simple to start
> with. They are:
>
>    1. Maximum number of times a particular generating unit was activated
>    2. Number of times the generating unit ran into problems during
>    activation
>
> Later on I would expand on this ranking algorithm by adding more criteria.
> I will be using the Apache Spark MLlib library, and I can already see that
> quite a few algorithms are already in place.
>
> http://spark.apache.org/docs/latest/mllib-guide.html
>
> I'm just not sure which algorithm would fit my purpose. Any suggestions?
>
