Hi,
Just so you know, I added PMML export for linear models (linear, ridge and lasso),
as suggested by Xiangrui.
I will be looking at SVMs and logistic regression next.
Vincenzo
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Status-of-MLLib-exporting-models
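For readers who have not seen PMML, here is a minimal sketch of what exporting a linear model could look like. It is not the code from the PR mentioned in this thread. The function name `linear_model_to_pmml` and the feature names `field_0`, `field_1`, ... are made up for illustration, and the output has not been validated against the PMML schema. Linear, ridge and lasso regression all end up as a weight vector plus an intercept, which is why one export path can cover all three.

```python
import xml.etree.ElementTree as ET


def linear_model_to_pmml(weights, intercept, target="target"):
    """Build a minimal PMML RegressionModel document for a linear model.

    Hypothetical helper: feature names (field_0, field_1, ...) are invented
    for this sketch, and the result is not schema-validated.
    """
    pmml = ET.Element("PMML", {"version": "4.2",
                               "xmlns": "http://www.dmg.org/PMML-4_2"})
    ET.SubElement(pmml, "Header", {"description": "linear regression (illustrative)"})

    fields = ["field_%d" % i for i in range(len(weights))]

    # Declare every input feature plus the target as a continuous double.
    data_dict = ET.SubElement(pmml, "DataDictionary",
                              {"numberOfFields": str(len(fields) + 1)})
    for name in fields + [target]:
        ET.SubElement(data_dict, "DataField",
                      {"name": name, "optype": "continuous", "dataType": "double"})

    model = ET.SubElement(pmml, "RegressionModel",
                          {"modelName": "linear regression",
                           "functionName": "regression"})

    # The mining schema says which fields are inputs and which is predicted.
    schema = ET.SubElement(model, "MiningSchema")
    for name in fields:
        ET.SubElement(schema, "MiningField", {"name": name, "usageType": "active"})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})

    # One RegressionTable holds the intercept and one coefficient per feature.
    table = ET.SubElement(model, "RegressionTable",
                          {"intercept": repr(float(intercept))})
    for name, w in zip(fields, weights):
        ET.SubElement(table, "NumericPredictor",
                      {"name": name, "coefficient": repr(float(w))})

    return ET.tostring(pmml, encoding="unicode")


xml_text = linear_model_to_pmml([0.5, -1.25], 2.0)
print(xml_text)
```

The weights and intercept would come from whatever trained model you have; the values above are placeholders.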
Yes,
The case is convincing for PMML with Oryx. I will also investigate
parameter server.
Cheers,
Charles
On Tuesday, November 18, 2014, Sean Owen so...@cloudera.com wrote:
I'm just using PMML. I haven't hit any limitation of its
expressiveness, for the model types it supports. I don't think
Hi Charles,
I am not aware of other storage formats. Perhaps Sean or Sandy can
elaborate more given their experience with Oryx.
There is work by Smola et al at Google that talks about large scale model
update and deployment.
I'm just using PMML. I haven't hit any limitation of its
expressiveness, for the model types it supports. I don't think there
is a point in defining a new format for models, except that PMML
can get very big. Still, just compressing the XML gets it down to a
manageable size for just about any
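A rough way to check the compression point on your own models, sketched with Python's standard `gzip` module. The XML generated here is a made-up stand-in for a large model file (the helper `make_verbose_pmml_like_xml` is hypothetical), so the sizes it prints say nothing about real PMML documents; the idea is only that repetitive XML markup tends to compress well.

```python
import gzip


def make_verbose_pmml_like_xml(n_coefficients):
    """Produce a repetitive XML string that mimics a large model file.

    Hypothetical content: the element names follow PMML, but the values
    are synthetic and the fragment is not a complete PMML document.
    """
    rows = ['<NumericPredictor name="field_%d" coefficient="%r"/>' % (i, i * 0.001)
            for i in range(n_coefficients)]
    return '<RegressionTable intercept="0.0">\n' + "\n".join(rows) + "\n</RegressionTable>"


def gzip_roundtrip(xml_text):
    """Compress the XML, verify it decompresses losslessly, return both sizes in bytes."""
    raw = xml_text.encode("utf-8")
    packed = gzip.compress(raw)
    assert gzip.decompress(packed) == raw
    return len(raw), len(packed)


raw_size, packed_size = gzip_roundtrip(make_verbose_pmml_like_xml(10000))
print("raw bytes:", raw_size, "gzipped bytes:", packed_size)
```

To try it on an actual export, read the file's text and pass it to `gzip_roundtrip` instead of the synthetic string.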
Manish and others,
A follow-up question on my mind is whether there are protobuf (or other
binary format) frameworks in the vein of PMML. Perhaps scientific data
storage frameworks like NetCDF or ROOT are also possible.
I like the comprehensiveness of PMML but as you mention the complexity of
@Aris, we are closely following the PMML work that is going on and as
Xiangrui mentioned, it might be easier to migrate models such as logistic
regression and then migrate trees. Some of the models get fairly large (as
pointed out by Sung Chung) with deep trees as building blocks and we might
have
to receive a
single thank you message for my previous work in this field, and many
other fields.
VR
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Status-of-MLLib-exporting-models-to-PMML-tp18514p18729.html
Sent from the Apache Spark User List mailing list
Vincenzo sent a PR and included k-means as an example. Sean is helping
review it. The PMML standard is quite large, so we may start with simple
model export, like linear methods, then move forward to tree-based models.
-Xiangrui
On Mon, Nov 10, 2014 at 11:27 AM, Aris arisofala...@gmail.com wrote:
Hello
JPMML evaluator just changed its license to AGPL or a commercial
license, and I think AGPL is not compatible with Apache projects. Any
advice?
https://github.com/jpmml/jpmml-evaluator
Sincerely,
DB Tsai
---
My Blog: https://www.dbtsai.com
Yes, jpmml-evaluator is AGPL, but things like jpmml-model are not; they're
3-clause BSD:
https://github.com/jpmml/jpmml-model
So some of the scoring components are off-limits for an AL2 project but the
core model components are OK.
On Tue, Nov 11, 2014 at 7:40 PM, DB Tsai dbt...@dbtsai.com
I also worry that the author of JPMML changed the license of
jpmml-evaluator because of his commercial interests, and that he
might change the license of jpmml-model in the future.
Sincerely,
DB Tsai
---
My Blog:
Yes, although I think this difference is on purpose, as part of that
commercial strategy. If future versions change license, it would still be
possible to not upgrade, or to fork / recreate the bean classes. I'm not too
worried, but it is a good point.
On Nov 11, 2014 10:06 PM, DB Tsai dbt...@dbtsai.com
Hello Spark and MLLib folks,
So a common problem in the real world of using machine learning is that
some data analysts use tools like R, but more of the data engineers out
there will use more advanced systems like Spark MLlib or even Python
scikit-learn.
In the real world, I want to have a system