Xiangrui Meng created SPARK-9941:
------------------------------------

             Summary: Try ML pipeline API on Kaggle competitions
                 Key: SPARK-9941
                 URL: https://issues.apache.org/jira/browse/SPARK-9941
             Project: Spark
          Issue Type: Umbrella
          Components: ML
            Reporter: Xiangrui Meng
            Assignee: Xiangrui Meng


This is an umbrella JIRA to track some fun tasks :)

We have built many features under the ML pipeline API, and we want to see how
well it works on real-world datasets, e.g., Kaggle competition datasets. We
invite community members to help test. The goal is NOT to win the competitions
but to provide code examples and to identify missing features and other issues
to help shape the roadmap.

For people who are interested, please do the following:

1. Create a subtask (or leave a comment if you cannot create a subtask) to 
claim a Kaggle dataset.
2. Use the ML pipeline API to build and tune an ML pipeline that works for the 
Kaggle dataset.
3. Paste the code into a gist (https://gist.github.com/) and provide the link.
4. Report missing features, issues, running times, and accuracy.


