SparkR and MLlib are becoming more integrated (we recently added R formula support), but the integration is still limited. If you learn only R and SparkR, you will not be able to leverage most of the distributed algorithms in MLlib (e.g. all the algorithms you cited). You could use the equivalent R implementations instead (e.g. glm for logistic regression), but be aware that these run on a single machine and will not scale to the large-scale datasets Spark is designed to handle.
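To make the scaling point concrete, here is a rough sketch in plain Python (no Spark needed) of why logistic regression lends itself to a distributed implementation: the gradient of the loss is a sum over rows, so each partition of the data can compute its own piece and only the small per-partition vectors have to be added together. The toy data, the starting weights, and the function names are made up for illustration; this is not MLlib's actual code, only the general idea.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def partial_gradient(rows, w):
    """Logistic-loss gradient contribution of one chunk of rows:
    the sum over its rows of (predicted probability - label) * features."""
    grad = [0.0] * len(w)
    for x, y in rows:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    return grad

def add_vectors(a, b):
    return [ai + bi for ai, bi in zip(a, b)]

# Hypothetical toy dataset: (features, label); the leading 1.0 is the intercept term.
data = [
    ([1.0, 0.5], 0), ([1.0, 1.5], 0), ([1.0, 2.5], 1),
    ([1.0, 3.5], 1), ([1.0, 1.0], 0), ([1.0, 3.0], 1),
]
w = [0.1, -0.2]  # arbitrary current weights

# Single-machine view: one process has to see every row.
full = partial_gradient(data, w)

# Distributed view: each "partition" is processed independently
# (on a cluster, by a different worker), then the results are summed.
partitions = [data[0:2], data[2:4], data[4:6]]
combined = [0.0] * len(w)
for part in partitions:
    combined = add_vectors(combined, partial_gradient(part, w))
```

Both routes give the same gradient up to floating-point rounding. The difference is that an in-memory fit such as R's glm needs all rows in one process, whereas in the partitioned version only vectors the size of the weight vector move between machines.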
On Thu, Aug 6, 2015 at 8:06 PM, praveen S <mylogi...@gmail.com> wrote:
> I am starting off with classification models: Logistic, RandomForest.
> Basically, I wanted to learn machine learning.
> Since I have a Java background I started off with MLlib, but later heard
> that R works as well (with scaling issues only).
>
> So I was wondering whether SparkR would resolve the scaling issue; hence
> my question: why not go with R and SparkR alone (keeping aside my
> inclination towards Java)?
>
> On Thu, Aug 6, 2015 at 12:28 AM, Charles Earl <charles.ce...@gmail.com>
> wrote:
>
>> What machine learning algorithms are you interested in exploring or
>> using? Start from there, or better yet from the problem you are trying
>> to solve, and then the selection may be evident.
>>
>>
>> On Wednesday, August 5, 2015, praveen S <mylogi...@gmail.com> wrote:
>>
>>> I was wondering when one should go for MLlib or SparkR. What are the
>>> criteria, or what should be considered, before choosing either of the
>>> solutions for data analysis?
>>> Or what are the advantages of Spark MLlib over SparkR, or the
>>> advantages of SparkR over MLlib?
>>>
>>
>>
>> --
>> - Charles
>>