[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Philippe Quemener updated SPARK-4494:
------------------------------------------
    Summary: IDFModel.transform() add support for single vector  (was: IDFModel.transform() add support for single vectors)

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>         Key: SPARK-4494
>         URL: https://issues.apache.org/jira/browse/SPARK-4494
>     Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>    Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation in MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vectors on IDFModel.transform().

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
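The four-step workaround quoted in the issue, and the per-vector transform it asks for, can be compared in a small Spark-free sketch. Everything here is an illustrative assumption: plain Python lists stand in for RDDs and MLlib vectors, and `fit_idf` and `transform_one` are hypothetical helpers, not MLlib API. The weights use the smoothed form log((m + 1) / (df + 1)), which is how I understand MLlib's IDF to be defined.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def fit_idf(term_freqs: Sequence[Vector]) -> Vector:
    """Return one IDF weight per term, using log((m + 1) / (df + 1))."""
    m = len(term_freqs)
    size = len(term_freqs[0])
    # Document frequency: in how many documents does term j occur?
    doc_freq = [sum(1 for tf in term_freqs if tf[j] > 0) for j in range(size)]
    return [math.log((m + 1) / (df + 1)) for df in doc_freq]


def transform_one(idf: Vector, tf: Vector) -> Vector:
    """Scale a single term-frequency vector by the fitted IDF weights."""
    return [w * x for w, x in zip(idf, tf)]


# Hypothetical labeled data: (label, term-frequency vector).
labeled: List[Tuple[float, Vector]] = [
    (1.0, [1.0, 0.0, 2.0]),
    (0.0, [0.0, 3.0, 1.0]),
    (1.0, [1.0, 1.0, 0.0]),
]

idf = fit_idf([tf for _, tf in labeled])

# Workaround from the issue: strip the labels, transform the vectors as a
# batch, then zip the results back onto the original records.
batch = [transform_one(idf, tf) for _, tf in labeled]
zipped = [(label, vec) for (label, _), vec in zip(labeled, batch)]

# Requested behaviour: transform each vector where it sits, so the label
# (or id) never has to be separated from its vector.
direct = [(label, transform_one(idf, tf)) for label, tf in labeled]

print(zipped == direct)
```

Both paths give the same labeled vectors. The second needs no separate persisted copy of the input and does not rely on two collections staying aligned, which is the motivation given in the issue.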
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Philippe Quemener updated SPARK-4494:
------------------------------------------
    Description:
For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
{quote}
1. Persist input RDD.
2. Transform it to just vectors and apply IDFModel.
3. Zip with original RDD.
4. Transform label and new vector to LabeledPoint.
{quote}
Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vectors on IDFModel.transform().

  was:
For now, when using the TF-IDF implementation in MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
{quote}
1. Persist input RDD.
2. Transform it to just vectors and apply IDFModel.
3. Zip with original RDD.
4. Transform label and new vector to LabeledPoint.
{quote}
Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vectors on IDFModel.transform().

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>         Key: SPARK-4494
>         URL: https://issues.apache.org/jira/browse/SPARK-4494
>     Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>    Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vectors on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Philippe Quemener updated SPARK-4494:
------------------------------------------
    Description:
For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
{quote}
1. Persist input RDD.
2. Transform it to just vectors and apply IDFModel.
3. Zip with original RDD.
4. Transform label and new vector to LabeledPoint.
{quote}
Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vector on IDFModel.transform().

  was:
For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
{quote}
1. Persist input RDD.
2. Transform it to just vectors and apply IDFModel.
3. Zip with original RDD.
4. Transform label and new vector to LabeledPoint.
{quote}
Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vectors on IDFModel.transform().

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>         Key: SPARK-4494
>         URL: https://issues.apache.org/jira/browse/SPARK-4494
>     Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>    Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Philippe Quemener updated SPARK-4494:
------------------------------------------
    Description:
For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
{quote}
1. Persist input RDD.
2. Transform it to just vectors and apply IDFModel.
3. Zip with original RDD.
4. Transform label and new vector to LabeledPoint.
{quote}
Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().

  was:
For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
{quote}
1. Persist input RDD.
2. Transform it to just vectors and apply IDFModel.
3. Zip with original RDD.
4. Transform label and new vector to LabeledPoint.
{quote}
Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using single vector on IDFModel.transform().

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>         Key: SPARK-4494
>         URL: https://issues.apache.org/jira/browse/SPARK-4494
>     Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>    Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Philippe Quemener updated SPARK-4494:
------------------------------------------
    Shepherd: Xiangrui Meng

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>         Key: SPARK-4494
>         URL: https://issues.apache.org/jira/browse/SPARK-4494
>     Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>    Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Philippe Quemener updated SPARK-4494:
------------------------------------------
    Affects Version/s: 1.1.0

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>              Key: SPARK-4494
>              URL: https://issues.apache.org/jira/browse/SPARK-4494
>          Project: Spark
>       Issue Type: New Feature
>       Components: MLlib
> Affects Versions: 1.1.0
>         Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-4494:
---------------------------------
    Affects Version/s: 1.1.1  (was: 1.1.0)

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>              Key: SPARK-4494
>              URL: https://issues.apache.org/jira/browse/SPARK-4494
>          Project: Spark
>       Issue Type: New Feature
>       Components: MLlib
> Affects Versions: 1.1.1, 1.2.0
>         Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-4494:
---------------------------------
    Affects Version/s: 1.2.0

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>              Key: SPARK-4494
>              URL: https://issues.apache.org/jira/browse/SPARK-4494
>          Project: Spark
>       Issue Type: New Feature
>       Components: MLlib
> Affects Versions: 1.1.1, 1.2.0
>         Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-4494:
---------------------------------
    Target Version/s: 1.3.0  (was: 1.1.1)

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>              Key: SPARK-4494
>              URL: https://issues.apache.org/jira/browse/SPARK-4494
>          Project: Spark
>       Issue Type: New Feature
>       Components: MLlib
> Affects Versions: 1.1.1, 1.2.0
>         Reporter: Jean-Philippe Quemener
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-4494:
---------------------------------
    Priority: Minor  (was: Major)

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>              Key: SPARK-4494
>              URL: https://issues.apache.org/jira/browse/SPARK-4494
>          Project: Spark
>       Issue Type: New Feature
>       Components: MLlib
> Affects Versions: 1.1.1, 1.2.0
>         Reporter: Jean-Philippe Quemener
>         Priority: Minor
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().
[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-4494:
---------------------------------
    Assignee: Yu Ishikawa

> IDFModel.transform() add support for single vector
> ---------------------------------------------------
>
>              Key: SPARK-4494
>              URL: https://issues.apache.org/jira/browse/SPARK-4494
>          Project: Spark
>       Issue Type: New Feature
>       Components: MLlib
> Affects Versions: 1.1.1, 1.2.0
>         Reporter: Jean-Philippe Quemener
>         Assignee: Yu Ishikawa
>         Priority: Minor
>          Fix For: 1.3.0
>
> For now, when using the TF-IDF implementation of MLlib, you have no way to map your data back onto, e.g., labels or ids other than a hackish workaround with zipping:
> {quote}
> 1. Persist input RDD.
> 2. Transform it to just vectors and apply IDFModel.
> 3. Zip with original RDD.
> 4. Transform label and new vector to LabeledPoint.
> {quote}
> Source: [http://stackoverflow.com/questions/26897908/spark-mllib-tfidf-implementation-for-logisticregression]
> I think that, since a lot of users in production want to map their data back to some identifier, it would be a good improvement to allow using a single vector on IDFModel.transform().