I took a look at JPMML... At the bottom of it they have run a JAXB
compiler on the PMML V4 schema to generate Java bindings. I didn't see
a lot of added value in JPMML beyond that.
I'd say just add the schema and bindings generation to Mahout. The
value add here is model mapping from the JAXB generated model into the
Mahout models.
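To make the "model mapping" idea concrete, here is a minimal sketch of what such a mapping layer could look like. Everything named here is an assumption for illustration: `PmmlCluster` stands in for a JAXB-generated binding class, `TargetCluster` stands in for a Mahout-side model, and `ModelMapper` is a hypothetical interface. None of them are actual JPMML or Mahout classes.

```java
import java.util.Arrays;
import java.util.List;

public class ModelMappingSketch {

    /** Hypothetical stand-in for a JAXB-generated PMML binding class. */
    static final class PmmlCluster {
        final String name;
        final List<Double> center;

        PmmlCluster(String name, List<Double> center) {
            this.name = name;
            this.center = center;
        }
    }

    /** Hypothetical stand-in for the Mahout-side model object. */
    static final class TargetCluster {
        final String id;
        final double[] centroid;

        TargetCluster(String id, double[] centroid) {
            this.id = id;
            this.centroid = centroid;
        }
    }

    /** One mapper per PMML model type: S is the binding, T the target model. */
    interface ModelMapper<S, T> {
        T map(S source);
    }

    /** Copies the cluster name and unboxes the centre into a dense array. */
    static final ModelMapper<PmmlCluster, TargetCluster> CLUSTER_MAPPER = source -> {
        double[] centroid = new double[source.center.size()];
        for (int i = 0; i < centroid.length; i++) {
            centroid[i] = source.center.get(i);
        }
        return new TargetCluster(source.name, centroid);
    };

    public static void main(String[] args) {
        PmmlCluster binding = new PmmlCluster("c1", Arrays.asList(1.0, 2.5));
        TargetCluster mapped = CLUSTER_MAPPER.map(binding);
        System.out.println(mapped.id + " " + Arrays.toString(mapped.centroid));
    }
}
```

The point of the interface is that the generated bindings stay untouched, and each supported model type gets its own small, separately testable mapper.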
On 12/20/2012 06:13 AM, Grant Ingersoll wrote:
From looking at PMML (http://www.dmg.org/v4-1/GeneralStructure.html), it seems
that JPMML is not really going to get us there if it only supports the 4 models
listed below. I would think we could go through the structures supported in
the link above and then map them to the algorithms that Mahout supports. To start,
perhaps it would make sense to focus on a few, like clustering and Naive Bayes;
perhaps SGD will fit into the regression models. Perhaps try to get K-Means
and Naive Bayes to work first.
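As a rough illustration of the K-Means import path, the sketch below reads the cluster centres out of a PMML `ClusteringModel` document. It is only a sketch under stated assumptions: it uses the JDK's built-in DOM parser rather than JAXB-generated bindings so that it runs without extra dependencies, it assumes each `Cluster` element carries its centre as a whitespace-separated real-valued `Array`, and the sample document is a simplified fragment that is not schema-valid. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class PmmlCentroidReader {

    /** Returns one dense double[] per Cluster element that has a non-empty Array. */
    public static List<double[]> readCentroids(String pmmlXml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new ByteArrayInputStream(pmmlXml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse PMML document", e);
        }

        List<double[]> centroids = new ArrayList<>();
        NodeList clusters = doc.getElementsByTagName("Cluster");
        for (int i = 0; i < clusters.getLength(); i++) {
            Element cluster = (Element) clusters.item(i);
            NodeList arrays = cluster.getElementsByTagName("Array");
            if (arrays.getLength() == 0) {
                continue; // cluster without an explicit centre
            }
            String text = arrays.item(0).getTextContent().trim();
            if (text.isEmpty()) {
                continue;
            }
            // Assumption: real-valued entries separated by whitespace.
            String[] tokens = text.split("\\s+");
            double[] centroid = new double[tokens.length];
            for (int j = 0; j < tokens.length; j++) {
                centroid[j] = Double.parseDouble(tokens[j]);
            }
            centroids.add(centroid);
        }
        return centroids;
    }

    public static void main(String[] args) {
        // Simplified fragment for illustration only; not a schema-valid PMML file.
        String xml =
            "<PMML version=\"4.1\">"
          + "<ClusteringModel functionName=\"clustering\" modelClass=\"centerBased\" numberOfClusters=\"2\">"
          + "<Cluster name=\"c1\"><Array n=\"2\" type=\"real\">1.0 2.0</Array></Cluster>"
          + "<Cluster name=\"c2\"><Array n=\"2\" type=\"real\">3.5 -4.0</Array></Cluster>"
          + "</ClusteringModel>"
          + "</PMML>";
        for (double[] centroid : readCentroids(xml)) {
            System.out.println(Arrays.toString(centroid));
        }
    }
}
```

On the Mahout side, each `double[]` could then be wrapped in a vector type such as `org.apache.mahout.math.DenseVector` and handed to the clustering code. That step is left out here to keep the example dependency-free.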
FTR, I can only imagine how bloated these files are going to get since they use
XML. Thankfully, they won't be used to power the internals, just to support
interoperability.
-Grant
On Dec 19, 2012, at 8:12 AM, Simon Vocella wrote:
Hi All,
As Grant suggested, I am forwarding the email about mahout-pmml.
I have already tried jpmml standalone and it works fine for me. The next
important step is to understand, or maybe create, an example for each of the
models listed in the forwarded message:
NeuralNetwork
RandomForest (implemented via Segmentation, which is a PMML version 4.0 feature)
RegressionModel
TreeModel
using only Mahout, and after that to write a converter that creates Mahout
objects from jpmml objects. This covers only the import side; as I see it,
export would work in much the same way.
Do you agree? Are you interested in these models, or does Mahout focus on other ones?
regards,
Simon
---------- Forwarded message ----------
From: Simon Vocella <vox...@gmail.com>
Date: Mon, Dec 17, 2012 at 1:50 AM
Subject: mahout-pmml
To: Grant Ingersoll <gsing...@apache.org>
Cc: Marty Kube <martyk...@beavercreekconsulting.com>
Hi Grant,
I have started this project, https://github.com/voxsim/mahout-pmml (I have
pushed only the skeleton for now), to integrate Mahout with jpmml
(http://code.google.com/p/jpmml/).
I read the wiki page about the Weka converter:
https://cwiki.apache.org/MAHOUT/creating-vectors-from-wekas-arff-format.html
And I read about the integration with Lucene:
http://searchhub.org/2010/03/16/integrating-apache-mahout-with-apache-lucene-and-solr-part-i-of-3/
In theory we need to do something similar to these, with one difference: we
transform models rather than vectors. Do I understand correctly?
I am asking you directly because you have this idea in mind. For now,
jpmml supports these models:
NeuralNetwork
RandomForest (implemented via Segmentation, which is a PMML version 4.0 feature)
RegressionModel
TreeModel
Are you interested in these models, or does Mahout focus on other ones?
Simon
PS Marty, before I start I need some answers, sorry XD
--------------------------------------------
Grant Ingersoll
http://www.lucidworks.com