Dear Spark Users and Developers, 

(We apologize if you receive multiple copies of this email; we are resending
it because the original was not delivered to the user mailing list correctly.)

We are happy to announce the release of XGBoost4J
(http://dmlc.ml/2016/03/14/xgboost4j-portable-distributed-xgboost-in-spark-flink-and-dataflow.html),
a portable distributed XGBoost for Spark, Flink and Dataflow.

XGBoost is an optimized distributed gradient boosting library designed to be
highly efficient, flexible and portable. XGBoost provides parallel tree
boosting (also known as GBDT or GBM) that solves many data science problems
quickly and accurately. It has been the winning solution in many machine
learning scenarios, ranging from machine learning challenges to industrial
use cases.

XGBoost4J is a new package in XGBoost that aims to provide clean Scala/Java
APIs and seamless integration with mainstream data processing platforms such
as Apache Spark. With XGBoost4J, users can run XGBoost as a stage of a Spark
job and build a unified pipeline from ETL to model training to data product
service within Spark, instead of jumping between two different systems, i.e.
XGBoost and Spark.

Today, we release the first version of XGBoost4J to bring more choices to
Spark users who are looking for ways to build a highly efficient data
analytics platform, and to enrich the Spark ecosystem. We will keep moving
forward to integrate with more features of Spark. Of course, you are more
than welcome to join us and contribute to the project!

For more details on distributed XGBoost, you can refer to the recently
published paper: http://arxiv.org/abs/1603.02754

Best,

--
Nan Zhu
http://codingcat.me


