RE: Feedback: Feature request

2015-08-28 Thread Murphy, James
This is great and much appreciated. Thank you.
- Jim



Re: Feedback: Feature request

2015-08-28 Thread Cody Koeninger
I wrote some code for this a while back; I'm pretty sure it didn't need access
to anything private in the decision tree / random forest model. If people
want it added to the API, I can put together a PR.

I think it's important to have separately parseable operators / operands,
though. E.g.:

lhs:0,op:<=,rhs:-35.0
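
To make the lhs / op / rhs idea concrete, here is a minimal sketch in plain Python (no Spark involved) that splits a rule string into separately parseable fields. The split_rule name and the regex are assumptions for illustration only; it handles just numeric threshold rules of the "feature N op value" shape seen in this thread, and anything else raises an error.

```python
import json
import re

# Two-character operators come first so "<=" is not read as "<" plus "=".
RULE = re.compile(r"^feature (\d+) (<=|>=|<|>|=) (\S+)$")


def split_rule(rule):
    """Split 'feature 0 <= -35.0' into lhs (feature index), op, and rhs (threshold)."""
    m = RULE.match(rule.strip())
    if m is None:
        raise ValueError(f"unrecognised rule: {rule!r}")
    # float() raises ValueError if the threshold is not numeric.
    return {"lhs": int(m.group(1)), "op": m.group(2), "rhs": float(m.group(3))}


print(json.dumps(split_rule("feature 0 <= -35.0")))
# prints {"lhs": 0, "op": "<=", "rhs": -35.0}
```

Keeping the operator as its own field means a consumer such as a D3 script can branch on it without re-parsing the rule text.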



Re: Feedback: Feature request

2015-08-28 Thread Manish Amde
Sounds good. It's a request I have seen a few times in the past, and I have
needed it personally. Maybe Joseph Bradley has something to add.

I think a JIRA to capture this would be great. We can move this discussion
to the JIRA then.




Re: Feedback: Feature request

2015-08-27 Thread Manish Amde
Hi James,

It's a good idea. A JSON format is more convenient for visualization, though a
little inconvenient to read. How about a toJson() method? It might make the MLlib
API inconsistent across models, though.
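
One way to sidestep the consistency concern, sketched here under assumptions, is a standalone function rather than a method on each model class. The Node class below is a hypothetical stand-in for a tree node, not an MLlib type, and to_json is an illustrative name rather than an existing API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a tree node; not an MLlib class."""
    rule: Optional[str] = None       # split condition, set on internal nodes
    predict: Optional[float] = None  # prediction, set on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def node_to_dict(node):
    """Recursively convert a Node into plain dicts and lists."""
    out = {}
    if node.rule is not None:
        out["rule"] = node.rule
    if node.predict is not None:
        out["predict"] = node.predict
    children = [node_to_dict(c) for c in (node.left, node.right) if c is not None]
    if children:
        out["children"] = children
    return out


def to_json(root):
    """Serialize a tree from outside the model class, leaving its API unchanged."""
    return json.dumps(node_to_dict(root))


root = Node(rule="feature 0 <= -35.0", left=Node(predict=2.1), right=Node(predict=6.3))
doc = to_json(root)
print(doc)
```

Because the helper lives outside the model, other model types are not forced to grow a matching method.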

You should probably create a JIRA for this.

CC: dev list

-Manish



Feedback: Feature request

2015-08-26 Thread Murphy, James
Hey all,

In working with the DecisionTree classifier, I found it difficult to extract 
rules that could easily facilitate visualization with libraries like D3.

For example, using print(model.toDebugString()), I get the following result:

If (feature 0 <= -35.0)
  If (feature 24 <= 176.0)
    Predict: 2.1
  If (feature 24 = 176.0)
    Predict: 4.2
  Else (feature 24 > 176.0)
    Predict: 6.3
Else (feature 0 > -35.0)
  If (feature 24 <= 11.0)
    Predict: 4.5
  Else (feature 24 > 11.0)
    Predict: 10.2

But ideally, I could see results in a more parseable format like JSON:


{
  "node": [
    {
      "name": "node1",
      "rule": "feature 0 <= -35.0",
      "children": [
        {
          "name": "node2",
          "rule": "feature 24 <= 176.0",
          "children": [
            { "name": "node4", "rule": "feature 20 < 116.0", "predict": 2.1 },
            { "name": "node5", "rule": "feature 20 = 116.0", "predict": 4.2 },
            { "name": "node6", "rule": "feature 20 > 116.0", "predict": 6.3 }
          ]
        },
        {
          "name": "node3",
          "rule": "feature 0 > -35.0",
          "children": [
            { "name": "node7", "rule": "feature 3 <= 11.0", "predict": 4.5 },
            { "name": "node8", "rule": "feature 3 > 11.0", "predict": 10.2 }
          ]
        }
      ]
    }
  ]
}
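
As a stopgap, the debug-string text can be parsed into nested JSON outside of Spark. The following is a rough sketch, not an MLlib feature: parse_debug_string is a hypothetical helper, and it assumes the If / Else / Predict line shapes shown above, with children indented further than their parent. Lines that match neither shape, such as a model summary header, are skipped.

```python
import json
import re

RULE = re.compile(r"^(If|Else) \((.+)\)$")
PREDICT = re.compile(r"^Predict: (\S+)$")


def parse_debug_string(text):
    """Turn indented If/Else/Predict lines into a list of nested dicts."""
    root = {}
    stack = [(-1, root)]  # (indent, node) pairs, innermost node last
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        indent = len(raw) - len(raw.lstrip())
        rule = RULE.match(line)
        leaf = PREDICT.match(line)
        if not rule and not leaf:
            continue  # not a tree line; ignore it
        # Drop nodes at the same or deeper indentation: they are finished.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if leaf:
            parent["predict"] = float(leaf.group(1))
        else:
            node = {"rule": rule.group(2)}
            parent.setdefault("children", []).append(node)
            stack.append((indent, node))
    return root.get("children", [])


SAMPLE = """\
If (feature 0 <= -35.0)
  If (feature 24 <= 176.0)
    Predict: 2.1
  Else (feature 24 > 176.0)
    Predict: 6.3
Else (feature 0 > -35.0)
  If (feature 24 <= 11.0)
    Predict: 4.5
  Else (feature 24 > 11.0)
    Predict: 10.2
"""

tree = parse_debug_string(SAMPLE)
print(json.dumps(tree, indent=2))
```

The resulting list of dicts uses the same rule / predict / children keys as the JSON above, so it could be handed to a D3 hierarchy layout after wrapping it in a single root object.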

Food for thought!

Thanks,

Jim