RStudio is a separate product with its own support. Post there, not here.

-- Bert
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom."
Clifford Stoll

On Tue, Jul 8, 2014 at 7:34 PM, Phan, Truong Q <troung.p...@team.telstra.com> wrote:
> Hi R'er,
>
> I have a dataset which is a matrix of 7502 x 1426 (rows x columns).
> The data is in CSV format and is around 68 MB in size. This dataset is
> less than 10% of our full dataset.
> I have been adopting the anomaly detection method described at
> http://www.mattpeeples.net/kmeans.html .
> It has been running for more than 24 hrs and still hasn't completed the
> calculation.
> I did manage to run it with a smaller dataset (i.e., 2100 rows x 1426 columns).
> It took around 12 hrs to run.
>
> I have a few questions and need your expert guidance.
>
> 1) Are there any better open source tools that do everything in one tool
> (e.g., RStudio): prepare data, build models, validate models, test models
> and present data? I am looking for a tool which will allow me to do the
> same as in the above link (Matt Peeples' blog).
>
> 2) Is there an open source tool to perform the above which will allow
> me to run on top of the Hadoop ecosystem?
>
> 3) Can we use RStudio for Windows as a client to run on top of the Hadoop
> ecosystem? If yes, please point me to a site that has use cases
> or samples.
>
> Thanks and Regards,
> Truong Phan
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.