There must be an algorithmic way to figure out which of these factors contribute the least and remove them from the analysis. I am hoping someone can shed some light on this.
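To make the idea concrete, here is a rough sketch of the kind of filter I have in mind. It is plain Python, not a Spark API, and everything in it is my own assumption: the function name select_features, the two thresholds, and the rule itself (drop near-constant factors, then drop factors that are almost perfectly correlated with one already kept). Unlike PCA, it returns the indices of the original factors, so you can see which ones survive. Variance depends on the units of each factor, so real data would likely need scaling first.

```python
import math


def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(rows, k, min_variance=1e-8, max_corr=0.95):
    """Return sorted indices of at most k original features.

    Hypothetical heuristic (thresholds are illustrative assumptions):
      1. discard features whose variance is not above min_variance;
      2. visit the rest in order of decreasing variance and keep a
         feature only if its absolute correlation with every feature
         already kept is below max_corr.
    """
    n_features = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_features)]
    variances = [variance(col) for col in cols]

    candidates = sorted(
        (j for j in range(n_features) if variances[j] > min_variance),
        key=lambda j: variances[j],
        reverse=True,
    )

    kept = []
    for j in candidates:
        if len(kept) == k:
            break
        if all(abs(pearson(cols[j], cols[i])) < max_corr for i in kept):
            kept.append(j)
    return sorted(kept)


# Toy data with 4 factors per data-point:
#   factor 0: 0..9
#   factor 1: exactly twice factor 0 (redundant with it)
#   factor 2: constant (carries no information)
#   factor 3: a permutation of 0..9, only weakly correlated with factor 0
rows = [[float(i), 2.0 * i, 5.0, float((i * 7) % 10)] for i in range(10)]

kept = select_features(rows, k=2)
print(kept)  # -> [1, 3]
```

In my case the call would be select_features(rows, k=20) on the 112-factor data, followed by clustering on the returned columns only. This only removes uninformative and redundant factors; it does not tell you which factors are causative, so domain knowledge would still be needed for that.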
On Mon, Aug 8, 2016 at 7:41 PM, Sivakumaran S <siva.kuma...@me.com> wrote:

> Not an expert here, but the first step would be to devote some time and
> identify which of these 112 factors are actually causative. Some domain
> knowledge of the data may be required. Then, you can start off with PCA.
>
> HTH,
>
> Regards,
>
> Sivakumaran S
>
> On 08-Aug-2016, at 3:01 PM, Tony Lane <tonylane....@gmail.com> wrote:
>
> Great question Rohit. I am in my early days of ML as well and it would be
> great if we get some idea on this from other experts on this group.
>
> I know we can reduce dimensions by using PCA, but I think that does not
> allow us to understand which factors from the original we are using in the
> end.
>
> - Tony L.
>
> On Mon, Aug 8, 2016 at 5:12 PM, Rohit Chaddha <rohitchaddha1...@gmail.com>
> wrote:
>
>> I have a data-set where each data-point has 112 factors.
>>
>> I want to remove the factors which are not relevant, and say reduce to 20
>> factors out of these 112 and then do clustering of data-points using these
>> 20 factors.
>>
>> How do I do this, and how do I figure out which of the 20 factors are
>> useful for analysis?
>>
>> I see SVD and PCA implementations, but I am not sure if these give which
>> elements are removed and which are remaining.
>>
>> Can someone please help me understand what to do here?
>>
>> thanks,
>> -Rohit