Re: Does MLLib has attribute importance?

2015-06-18 Thread Debasish Das
Running L1 regularization and picking the features with non-zero coefficients gives a good estimate of the interesting features as well... On Jun 17, 2015 4:51 PM, Xiangrui Meng men...@gmail.com wrote: We don't have it in MLlib. The closest would be the ChiSqSelector, which works for categorical data. -Xiangrui On Thu, Jun 11,
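
The L1 suggestion above can be illustrated with a small sketch. This is not MLlib code: it is a plain-Python lasso (L1-regularized least squares) solved by coordinate descent, and the function names, the toy data, and the penalty value are assumptions chosen only for illustration. The point is that the L1 penalty drives weak coefficients to exactly zero, so the surviving non-zero coefficients mark the interesting features.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(rows, targets, l1_penalty, num_iterations=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + l1_penalty * ||w||_1 (hypothetical helper)."""
    n = len(rows)
    num_features = len(rows[0])
    weights = [0.0] * num_features
    for _ in range(num_iterations):
        for j in range(num_features):
            rho = 0.0   # correlation of feature j with the residual that ignores feature j
            norm = 0.0  # squared norm of feature j
            for row, target in zip(rows, targets):
                prediction_without_j = sum(
                    w * x for k, (w, x) in enumerate(zip(weights, row)) if k != j
                )
                rho += row[j] * (target - prediction_without_j)
                norm += row[j] * row[j]
            if norm == 0.0:
                weights[j] = 0.0  # an all-zero feature carries no information
                continue
            weights[j] = soft_threshold(rho / n, l1_penalty) / (norm / n)
    return weights


# Toy data (assumed): the target is driven mostly by feature 0,
# very weakly by feature 1, and not at all by feature 2.
rows = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
targets = [2.1, -1.9, 1.9, -2.1]

weights = lasso_coordinate_descent(rows, targets, l1_penalty=0.5)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print(selected)
```

With this penalty only feature 0 keeps a non-zero weight. In practice the penalty strength would have to be tuned, and correlated features can make the selection unstable, so the result is an estimate of importance rather than a definitive ranking.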

Re: Does MLLib has attribute importance?

2015-06-18 Thread Xiangrui Meng
ChiSqSelector takes an RDD of labeled points, where the label is the target. See https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala#L120 On Wed, Jun 17, 2015 at 10:22 PM, Ruslan Dautkhanov dautkha...@gmail.com wrote: Thank you
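
To make the role of the label concrete, here is a minimal plain-Python sketch of chi-squared feature selection against a target. It does not call Spark and does not reproduce MLlib's implementation: the `(label, features)` tuples merely stand in for labeled points, and `chi_squared`, `select_top_features`, and the toy data are hypothetical names and values used for illustration. Each categorical feature is scored by Pearson's chi-squared statistic of independence from the label, and the highest-scoring features are kept.

```python
from collections import Counter


def chi_squared(feature_values, labels):
    """Pearson chi-squared statistic between one categorical feature and the label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * l_count / n
            diff = observed.get((f, l), 0) - expected
            stat += diff * diff / expected
    return stat


def select_top_features(points, num_top_features):
    """points: list of (label, features) pairs; returns (selected indices, scores)."""
    labels = [label for label, _ in points]
    num_features = len(points[0][1])
    scores = [
        chi_squared([features[i] for _, features in points], labels)
        for i in range(num_features)
    ]
    ranked = sorted(range(num_features), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:num_top_features]), scores


# Toy data (assumed): feature 0 matches the label exactly, feature 1 is unrelated to it.
points = [
    (1.0, [1.0, 0.0]),
    (1.0, [1.0, 1.0]),
    (0.0, [0.0, 0.0]),
    (0.0, [0.0, 1.0]),
]

selected, scores = select_top_features(points, num_top_features=1)
print(selected, scores)
```

The feature that tracks the label gets the larger statistic and is the one selected, which is the supervised ranking the original question asks about, restricted to categorical features.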

Re: Does MLLib has attribute importance?

2015-06-18 Thread Ruslan Dautkhanov
Got it. Thanks! -- Ruslan Dautkhanov On Thu, Jun 18, 2015 at 1:02 PM, Xiangrui Meng men...@gmail.com wrote: ChiSqSelector takes an RDD of labeled points, where the label is the target. See

Re: Does MLLib has attribute importance?

2015-06-17 Thread Xiangrui Meng
We don't have it in MLlib. The closest would be the ChiSqSelector, which works for categorical data. -Xiangrui On Thu, Jun 11, 2015 at 4:33 PM, Ruslan Dautkhanov dautkha...@gmail.com wrote: What would be the closest equivalent in MLlib to Oracle Data Miner's Attribute Importance mining function?

Re: Does MLLib has attribute importance?

2015-06-17 Thread Ruslan Dautkhanov
Thank you, Xiangrui. Oracle's attribute importance mining function has a target variable. Attribute importance is a supervised function that ranks attributes according to their significance in predicting a target. MLlib's ChiSqSelector does not have a target variable. -- Ruslan Dautkhanov

Does MLLib has attribute importance?

2015-06-11 Thread Ruslan Dautkhanov
What would be the closest equivalent in MLlib to Oracle Data Miner's Attribute Importance mining function? http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/feature_extr.htm#i1005920 Attribute importance is a supervised function that ranks attributes according to their significance in