I would rather have fewer features, so that I can make better inferences on the data based on a smaller number of factors. Any suggestions, Sean?
On Mon, Aug 8, 2016 at 11:37 PM, Sean Owen <so...@cloudera.com> wrote:

> Yes, that's exactly what PCA is for, as Sivakumaran noted. Do you
> really want to select features or just obtain a lower-dimensional
> representation of them, with less redundancy?
>
> On Mon, Aug 8, 2016 at 4:10 PM, Tony Lane <tonylane....@gmail.com> wrote:
> > There must be an algorithmic way to figure out which of these factors
> > contribute the least and remove them in the analysis.
> > I am hoping someone can throw some insight on this.
> >
> > On Mon, Aug 8, 2016 at 7:41 PM, Sivakumaran S <siva.kuma...@me.com> wrote:
> >>
> >> Not an expert here, but the first step would be to devote some time and
> >> identify which of these 112 factors are actually causative. Some domain
> >> knowledge of the data may be required. Then, you can start off with PCA.
> >>
> >> HTH,
> >>
> >> Regards,
> >>
> >> Sivakumaran S
> >>
> >> On 08-Aug-2016, at 3:01 PM, Tony Lane <tonylane....@gmail.com> wrote:
> >>
> >> Great question Rohit. I am in my early days of ML as well and it would be
> >> great if we get some idea on this from other experts on this group.
> >>
> >> I know we can reduce dimensions by using PCA, but I think that does not
> >> allow us to understand which factors from the original we are using in
> >> the end.
> >>
> >> - Tony L.
> >>
> >> On Mon, Aug 8, 2016 at 5:12 PM, Rohit Chaddha <rohitchaddha1...@gmail.com>
> >> wrote:
> >>>
> >>> I have a data-set where each data-point has 112 factors.
> >>>
> >>> I want to remove the factors which are not relevant, and say reduce to 20
> >>> factors out of these 112 and then do clustering of data-points using these
> >>> 20 factors.
> >>>
> >>> How do I do this, and how do I figure out which of the 20 factors are
> >>> useful for analysis?
> >>>
> >>> I see SVD and PCA implementations, but I am not sure if these tell me
> >>> which elements are removed and which are remaining.
> >>>
> >>> Can someone please help me understand what to do here?
> >>>
> >>> thanks,
> >>> -Rohit
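
For the "select original features" side of Sean's question, here is a minimal sketch of one unsupervised approach: drop (near-)constant factors, then drop any factor that is almost perfectly correlated with one already kept. It is not from the thread and does not use any Spark API. It is plain Python, and the helper name `select_features`, the thresholds `min_variance` and `max_abs_corr`, and the toy data are all assumptions chosen for illustration. Unlike PCA, the result is a list of original column indices, so it is clear which factors remain.

```python
from math import sqrt


def column(rows, j):
    """Return feature j as a list of values, one per data point."""
    return [row[j] for row in rows]


def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(rows, min_variance=1e-8, max_abs_corr=0.95):
    """Return indices of the original features to keep.

    Hypothetical helper: both thresholds are illustrative assumptions.
    """
    n_features = len(rows[0])
    cols = [column(rows, j) for j in range(n_features)]
    kept = []
    for j in range(n_features):
        # Step 1: drop (near-)constant features, which carry no information.
        if variance(cols[j]) <= min_variance:
            continue
        # Step 2: drop features that nearly duplicate an already kept one.
        if any(abs(pearson(cols[j], cols[k])) > max_abs_corr for k in kept):
            continue
        kept.append(j)
    return kept


# Toy data (made up): feature 1 is 2 * feature 0, feature 2 is constant,
# and feature 3 is uncorrelated with feature 0.
rows = [
    [1.0, 2.0, 5.0, 3.0],
    [2.0, 4.0, 5.0, 1.0],
    [3.0, 6.0, 5.0, 4.0],
    [4.0, 8.0, 5.0, 2.0],
]
kept = select_features(rows)
print(kept)  # [0, 3]
```

This filter only removes redundancy; it does not say which factors are relevant to the clusters, and it will not by itself bring 112 factors down to exactly 20. The pairwise check is quadratic in the number of factors, which is fine at this size.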