Re: [Scikit-learn-general] sample data for anomaly detection (OT)

2015-05-02 Thread Nicolas Goix
Hey, you have the classical AD datasets from the KDD Cup 99: SA, SF, http, smtp. The original dataset has a high proportion of anomalies, around 80% (originally it was a classification task between different types of intrusion). SA is obtained by selecting all the normal data, and a small proportion
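
As a rough illustration of the subsampling idea described in the message above, here is a minimal sketch. It assumes the truncated sentence refers to keeping a small proportion of the anomalous records; the function name `subsample_anomalies`, the `anomaly_fraction` value, and the toy labels are all hypothetical and are not taken from the original post or from the KDD Cup 99 data itself.

```python
import random


def subsample_anomalies(labels, anomaly_fraction, normal_label="normal", seed=0):
    """Return sorted indices keeping every normal record and a random
    fraction of the non-normal ones (hypothetical helper, for illustration)."""
    rng = random.Random(seed)
    normal_idx = [i for i, y in enumerate(labels) if y == normal_label]
    anomaly_idx = [i for i, y in enumerate(labels) if y != normal_label]
    # Number of anomalous records to keep; the fraction is an assumption.
    n_keep = int(round(anomaly_fraction * len(anomaly_idx)))
    kept = rng.sample(anomaly_idx, n_keep)
    return sorted(normal_idx + kept)


# Toy labels mimicking the roughly 80% anomaly rate mentioned in the message.
labels = ["normal"] * 200 + ["attack"] * 800
idx = subsample_anomalies(labels, anomaly_fraction=0.05)
subset = [labels[i] for i in idx]
n_anom = sum(y != "normal" for y in subset)
print(len(subset), n_anom)
```

With real data, the same index list could be used to select the matching feature rows, so that anomalies become rare enough for an unsupervised detector to be evaluated sensibly.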

[Scikit-learn-general] [GSoC] Project Metric Learning

2015-05-02 Thread Artem
Hello Andreas, hello Michael. First, I'm happy to be selected as this year's scikit-learn student, and hope to do great work. According to my timeline, I'm going to use community