I took a look at JPMML. At its core, they have run a JAXB compiler on the PMML v4 schema to generate Java bindings. I didn't see much added value in JPMML beyond that.

I'd say just add the schema and bindings generation to Mahout. The real added value here is the mapping from the JAXB-generated model into the Mahout models.
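
To make the mapping idea concrete, here is a minimal sketch for a regression model. Everything in it is an assumption for illustration: `PmmlRegressionTable` stands in for a class that a JAXB compiler might generate from the PMML schema, and `LinearModel` stands in for a Mahout-side model. Neither is a real JPMML or Mahout type.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ModelMappingSketch {

    // Hypothetical stand-in for a JAXB-generated binding of a PMML <RegressionTable>:
    // an intercept plus one coefficient per named numeric predictor.
    public static class PmmlRegressionTable {
        public double intercept;
        public final Map<String, Double> coefficients = new LinkedHashMap<>();
    }

    // Hypothetical stand-in for a Mahout-side linear model: a dense weight array
    // whose positions follow a fixed feature order.
    public static class LinearModel {
        public final double bias;
        public final double[] weights;

        public LinearModel(double bias, double[] weights) {
            this.bias = bias;
            this.weights = weights;
        }

        // Plain dot product plus bias.
        public double predict(double[] features) {
            double sum = bias;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * features[i];
            }
            return sum;
        }
    }

    // The mapping step: place each named coefficient at its feature index.
    // Features that the PMML table does not mention keep a weight of 0.
    public static LinearModel toLinearModel(PmmlRegressionTable table, List<String> featureOrder) {
        double[] weights = new double[featureOrder.size()];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = table.coefficients.getOrDefault(featureOrder.get(i), 0.0);
        }
        return new LinearModel(table.intercept, weights);
    }

    public static void main(String[] args) {
        PmmlRegressionTable table = new PmmlRegressionTable();
        table.intercept = 1.0;
        table.coefficients.put("x", 2.0);
        table.coefficients.put("y", -0.5);

        LinearModel model = toLinearModel(table, Arrays.asList("x", "y"));
        System.out.println(model.predict(new double[] {3.0, 4.0})); // prints 5.0
    }
}
```

The point of the sketch is only that the generated bindings carry names and values, while the target model wants positions in a vector, so the mapper needs an explicit feature order.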

On 12/20/2012 06:13 AM, Grant Ingersoll wrote:
From looking at PMML (http://www.dmg.org/v4-1/GeneralStructure.html), it seems 
that JPMML is not really going to get us there if it only supports the 4 models 
listed below.  I would think we could go through the structures supported in 
the link above and then map them to the algorithms that Mahout supports.  To 
start, perhaps it would make sense to focus on a few, like clustering and Naive 
Bayes, and perhaps SGD will fit into the regression models.  Perhaps try to get 
K-Means and Naive Bayes to work first.
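
As a rough illustration of the K-Means case, the sketch below reads cluster centers out of a PMML `ClusteringModel` using only the JDK's DOM parser. The class name `PmmlKMeansSketch`, the method `readCenters`, and the XML fragment are assumptions made for this example; the fragment is hand-written and omits elements that a real PMML file carries (Header, DataDictionary, MiningSchema and so on). Turning the resulting arrays into Mahout cluster objects is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PmmlKMeansSketch {

    // Hand-written fragment for illustration only, not a complete PMML document.
    public static final String SAMPLE =
        "<PMML version=\"4.1\">"
      + "<ClusteringModel modelClass=\"centerBased\" functionName=\"clustering\" numberOfClusters=\"2\">"
      + "<Cluster name=\"c1\"><Array n=\"2\" type=\"real\">1.0 2.0</Array></Cluster>"
      + "<Cluster name=\"c2\"><Array n=\"2\" type=\"real\">3.5 4.5</Array></Cluster>"
      + "</ClusteringModel>"
      + "</PMML>";

    // Returns one double[] per <Cluster>, parsed from its first <Array> child.
    public static List<double[]> readCenters(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList clusters = doc.getElementsByTagName("Cluster");
            List<double[]> centers = new ArrayList<>();
            for (int i = 0; i < clusters.getLength(); i++) {
                Element cluster = (Element) clusters.item(i);
                Element array = (Element) cluster.getElementsByTagName("Array").item(0);
                // PMML stores the coordinates as whitespace-separated text.
                String[] tokens = array.getTextContent().trim().split("\\s+");
                double[] center = new double[tokens.length];
                for (int j = 0; j < tokens.length; j++) {
                    center[j] = Double.parseDouble(tokens[j]);
                }
                centers.add(center);
            }
            return centers;
        } catch (Exception e) {
            // Wrap parser and I/O errors so callers need no checked exceptions.
            throw new IllegalStateException("Could not read PMML clustering model", e);
        }
    }

    public static void main(String[] args) {
        for (double[] center : readCenters(SAMPLE)) {
            System.out.println(Arrays.toString(center));
        }
    }
}
```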

FTR, I can only imagine how bloated these files are going to get since they use 
XML.  Thankfully, they won't be used to power the internals, just to support 
interoperability.

-Grant

On Dec 19, 2012, at 8:12 AM, Simon Vocella wrote:

Hi All,

As Grant suggested, I am forwarding the email about mahout-pmml.
I have already tried JPMML standalone and it works fine for me. The next 
important step is to understand, or maybe create, an example using only Mahout 
for each of the models that JPMML supports:
NeuralNetwork
RandomForest (implemented via Segmentation, which is a PMML version 4.0 feature)
RegressionModel
TreeModel
The step after that is to write a converter that creates Mahout objects from 
JPMML objects. This covers only import; I think export would work in a 
similar way.

Do you agree? Are you interested in these models, or should Mahout focus on other ones?

regards,
Simon

---------- Forwarded message ----------
From: Simon Vocella <vox...@gmail.com>
Date: Mon, Dec 17, 2012 at 1:50 AM
Subject: mahout-pmml
To: Grant Ingersoll <gsing...@apache.org>
Cc: Marty Kube <martyk...@beavercreekconsulting.com>


Hi Grant,

I have started this project: https://github.com/voxsim/mahout-pmml (I have 
pushed only the skeleton for now). It integrates Mahout with JPMML 
(http://code.google.com/p/jpmml/).

I read the wiki page about the Weka converter: 
https://cwiki.apache.org/MAHOUT/creating-vectors-from-wekas-arff-format.html
I also read about the integration with Lucene: 
http://searchhub.org/2010/03/16/integrating-apache-mahout-with-apache-lucene-and-solr-part-i-of-3/

In theory we need to do something similar to these, with one difference: we 
don't transform vectors but models. Do I understand correctly?

I am asking you directly because you already have this idea in mind. For now, 
JPMML supports these models:
NeuralNetwork
RandomForest (implemented via Segmentation, which is a PMML version 4.0 feature)
RegressionModel
TreeModel
Are you interested in these models, or should Mahout focus on other ones?

Simon

PS: Marty, before I start I need some answers, sorry XD

--------------------------------------------
Grant Ingersoll
http://www.lucidworks.com





