RE: Feedback: Feature request
This is great and much appreciated. Thank you.

- Jim

From: Manish Amde [mailto:manish...@gmail.com]
Sent: Friday, August 28, 2015 9:20 AM
To: Cody Koeninger
Cc: Murphy, James; user@spark.apache.org; d...@spark.apache.org
Subject: Re: Feedback: Feature request

Sounds good. It's a request I have seen a few times in the past and have needed it personally. Maybe Joseph Bradley has something to add.

I think a JIRA to capture this will be great. We can move this discussion to the JIRA then.

On Friday, August 28, 2015, Cody Koeninger <c...@koeninger.org> wrote:

I wrote some code for this a while back, pretty sure it didn't need access to anything private in the decision tree / random forest model. If people want it added to the api I can put together a PR.

I think it's important to have separately parseable operators / operands though. E.g.

    lhs:0,op:<=,rhs:-35.0

On Aug 28, 2015 12:03 AM, Manish Amde <manish...@gmail.com> wrote:

Hi James,

It's a good idea. A JSON format is more convenient for visualization though a little inconvenient to read. How about a toJson() method? It might make the mllib api inconsistent across models though.

You should probably create a JIRA for this.

CC: dev list

-Manish

On Aug 26, 2015, at 11:29 AM, Murphy, James <james.mur...@disney.com> wrote:

Hey all,

In working with the DecisionTree classifier, I found it difficult to extract rules that could easily facilitate visualization with libraries like D3.

So for example, using print(model.toDebugString()), I get the following result:

    If (feature 0 <= -35.0)
      If (feature 24 <= 176.0)
        Predict: 2.1
      If (feature 24 = 176.0)
        Predict: 4.2
      Else (feature 24 > 176.0)
        Predict: 6.3
    Else (feature 0 > -35.0)
      If (feature 24 <= 11.0)
        Predict: 4.5
      Else (feature 24 > 11.0)
        Predict: 10.2

But ideally, I could see results in a more parseable format like JSON:

    {
      "node": [
        {
          "name": "node1",
          "rule": "feature 0 <= -35.0",
          "children": [
            {
              "name": "node2",
              "rule": "feature 24 <= 176.0",
              "children": [
                { "name": "node4", "rule": "feature 20 < 116.0", "predict": 2.1 },
                { "name": "node5", "rule": "feature 20 = 116.0", "predict": 4.2 },
                { "name": "node5", "rule": "feature 20 > 116.0", "predict": 6.3 }
              ]
            },
            {
              "name": "node3",
              "rule": "feature 0 > -35.0",
              "children": [
                { "name": "node7", "rule": "feature 3 <= 11.0", "predict": 4.5 },
                { "name": "node8", "rule": "feature 3 > 11.0", "predict": 10.2 }
              ]
            }
          ]
        }
      ]
    }

Food for thought!

Thanks,
Jim
Re: Feedback: Feature request
I wrote some code for this a while back, pretty sure it didn't need access to anything private in the decision tree / random forest model. If people want it added to the api I can put together a PR.

I think it's important to have separately parseable operators / operands though. E.g.

    lhs:0,op:<=,rhs:-35.0
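The idea of separately parseable operators and operands could be sketched roughly as below. This is only an illustration, not the code mentioned in the message above and not part of Spark's API: `parse_rule` and `RULE_RE` are hypothetical names, and the sketch assumes conditions of the form `feature N <op> threshold` as they appear in `toDebugString()` output. Categorical splits (`in {...}`) are not handled.

```python
import json
import re

# Hypothetical helper, not part of Spark: matches one continuous-feature
# condition such as "feature 0 <= -35.0".
RULE_RE = re.compile(
    r"feature (\d+) (<=|>=|<|>|=) (-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_rule(rule):
    """Split a rule string into feature index, operator and threshold."""
    match = RULE_RE.fullmatch(rule.strip())
    if match is None:
        raise ValueError("unrecognized rule: %r" % rule)
    feature, op, threshold = match.groups()
    return {"lhs": int(feature), "op": op, "rhs": float(threshold)}


if __name__ == "__main__":
    print(json.dumps(parse_rule("feature 0 <= -35.0")))
```

Keeping the operator in its own field means a consumer such as a D3 script can branch on `op` directly instead of re-parsing the rule text.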
Re: Feedback: Feature request
Sounds good. It's a request I have seen a few times in the past and have needed it personally. Maybe Joseph Bradley has something to add.

I think a JIRA to capture this will be great. We can move this discussion to the JIRA then.
Re: Feedback: Feature request
Hi James,

It's a good idea. A JSON format is more convenient for visualization though a little inconvenient to read. How about a toJson() method? It might make the mllib api inconsistent across models though.

You should probably create a JIRA for this.

CC: dev list

-Manish
Feedback: Feature request
Hey all,

In working with the DecisionTree classifier, I found it difficult to extract rules that could easily facilitate visualization with libraries like D3.

So for example, using print(model.toDebugString()), I get the following result:

    If (feature 0 <= -35.0)
      If (feature 24 <= 176.0)
        Predict: 2.1
      If (feature 24 = 176.0)
        Predict: 4.2
      Else (feature 24 > 176.0)
        Predict: 6.3
    Else (feature 0 > -35.0)
      If (feature 24 <= 11.0)
        Predict: 4.5
      Else (feature 24 > 11.0)
        Predict: 10.2

But ideally, I could see results in a more parseable format like JSON:

    {
      "node": [
        {
          "name": "node1",
          "rule": "feature 0 <= -35.0",
          "children": [
            {
              "name": "node2",
              "rule": "feature 24 <= 176.0",
              "children": [
                { "name": "node4", "rule": "feature 20 < 116.0", "predict": 2.1 },
                { "name": "node5", "rule": "feature 20 = 116.0", "predict": 4.2 },
                { "name": "node5", "rule": "feature 20 > 116.0", "predict": 6.3 }
              ]
            },
            {
              "name": "node3",
              "rule": "feature 0 > -35.0",
              "children": [
                { "name": "node7", "rule": "feature 3 <= 11.0", "predict": 4.5 },
                { "name": "node8", "rule": "feature 3 > 11.0", "predict": 10.2 }
              ]
            }
          ]
        }
      ]
    }

Food for thought!

Thanks,
Jim
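Until something like this exists in the API, one possible workaround is to parse the debug string itself. The sketch below is an assumption-laden illustration, not a Spark feature: `debug_string_to_tree` is a hypothetical helper, it assumes nesting is expressed purely by leading whitespace on the `If (...)`, `Else (...)` and `Predict:` lines, and the `sample` string is a shortened, hand-typed two-way tree using values from the request rather than real model output.

```python
import json


def debug_string_to_tree(debug_string):
    """Turn indented If/Else/Predict lines into nested dicts."""
    root = {}
    stack = [(-1, root)]  # (indent, node) for each branch that is still open
    for line in debug_string.splitlines():
        text = line.strip()
        if not text.startswith(("If (", "Else (", "Predict:")):
            continue  # skip the model header line and blank lines
        indent = len(line) - len(line.lstrip())
        # Close every branch that is not an ancestor of this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if text.startswith("Predict:"):
            parent["predict"] = float(text.split(":", 1)[1])
        else:
            # Keep the text between the outermost parentheses as the rule.
            node = {"rule": text[text.index("(") + 1:text.rindex(")")]}
            parent.setdefault("children", []).append(node)
            stack.append((indent, node))
    return root


# Hand-typed sample for illustration only, not real model output.
sample = """\
If (feature 0 <= -35.0)
 If (feature 24 <= 176.0)
  Predict: 2.1
 Else (feature 24 > 176.0)
  Predict: 6.3
Else (feature 0 > -35.0)
 Predict: 4.5
"""

if __name__ == "__main__":
    print(json.dumps(debug_string_to_tree(sample), indent=2))
```

Parsing printed output is fragile if the format changes between Spark versions, which is part of why a dedicated export method would be preferable.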