Re: [scikit-learn] [ANN] scikit-learn 1.2.0rc1 is online!

2022-11-29 Thread Olivier Grisel
Thanks Jeremie for pushing this release out! Now is the time to test downstream projects against this release candidate to make sure it will not break too many things when we publish the 1.2.0 final release in a week or two! -- Olivier ___ scikit-learn mailing list

Re: [scikit-learn] [ANN] scikit-learn 1.1.3 is online!

2022-10-30 Thread Olivier Grisel
Thank you so much Guillaume for getting this release out and to Chiara for pushing forward with the Python 3.11 wheel building infrastructure update and related fixes! -- Olivier

Re: [scikit-learn] [ANN] scikit-learn 1.1.1 is online!

2022-05-19 Thread Olivier Grisel
BTW, this is now stable so the URL https://scikit-learn.org/stable/whats_new/v1.1.html#version-1-1-1 also works :) ___ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn

Re: [scikit-learn] [ANN] scikit-learn 1.1.1 is online!

2022-05-19 Thread Olivier Grisel
Thank you to all the contributors who reported bugs and contributed minimal reproducers and fixes, and thank you Guillaume for getting this bugfix release out so quickly \o/ -- Olivier

Re: [scikit-learn] Experience with black formatting in scikit-learn for astropy

2022-05-19 Thread Olivier Grisel
I agree with Guillaume's answers. I think it was a net benefit, even though it might be a bit annoying for first-time contributors to get the tooling right. We can probably improve this by making the error messages on the CI more directive on how to fix formatting issues by giving copy-pastable

Re: [scikit-learn] [ANN] scikit-learn 1.1 release

2022-05-12 Thread Olivier Grisel
Congrats Jeremie and everybody who contributed to this release! This is a great achievement. -- Olivier

Re: [scikit-learn] [ANN] scikit-learn 1.1.0rc1 is online!

2022-04-28 Thread Olivier Grisel
Thanks Jeremie for leading the efforts to get this release out! -- Olivier

Re: [scikit-learn] scikit-learn 1 - pytest - multiprocessing Pool - hangs?

2021-12-09 Thread Olivier Grisel
Maybe you can try to use faulthandler.dump_traceback_later https://docs.python.org/3/library/faulthandler.html#faulthandler.dump_traceback_later to get a traceback of all the threads of the main process. But the fact that you are using the default `p = multiprocessing.Pool()` makes me think that
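A minimal sketch of the suggestion above, assuming the hang happens inside some callable you control. The helper name `run_with_watchdog` and the 30-second default are hypothetical; only the `faulthandler` calls come from the standard library.

```python
import faulthandler
import sys


def run_with_watchdog(func, timeout=30.0, file=None):
    """Call func(); if it runs longer than `timeout` seconds, dump the
    tracebacks of all threads of this process to `file` (stderr by default).

    `run_with_watchdog` is a hypothetical helper, not a scikit-learn or
    pytest API.
    """
    target = sys.stderr if file is None else file
    # Arm the watchdog: the process keeps running after the dump (exit=False).
    faulthandler.dump_traceback_later(timeout, exit=False, file=target)
    try:
        return func()
    finally:
        # Disarm the watchdog once func() has returned or raised.
        faulthandler.cancel_dump_traceback_later()
```

In a hanging test, wrapping the suspicious call this way should print where each thread of the main process is blocked once the timeout expires. It does not show what the worker processes of the pool are doing.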

Re: [scikit-learn] scikit-learn office hours on Friday Oct. 8 2021

2021-10-08 Thread Olivier Grisel
To summarize, the office hours for today are: - 15:00-16:00 UTC / 17:00-18:00 CEST (this one starts in less than 10min) - 18:00-19:00 UTC / 20:00-21:00 CEST (with Guillaume) Sorry for the confusion and see you soon. -- Olivier

[scikit-learn] scikit-learn office hours on Friday Oct. 8 2021

2021-10-06 Thread Olivier Grisel
Hi all, Some of us will be online on the scikit-learn Discord this Friday at 15:00 UTC and 20:00 UTC. First-time and occasional contributors are welcome to join us on Discord using this invitation link: https://discord.gg/YBdN45kD The focus of these office hour sessions is to answer questions

Re: [scikit-learn] [ANNOUNCEMENT] scikit-learn 1.0 release

2021-09-24 Thread Olivier Grisel
Yeah! Thank you so much Adrin for all your efforts in getting this release out! Congratulations everyone, time to celebrate! -- Olivier

[scikit-learn] Dataframe protocol RFC

2021-08-25 Thread Olivier Grisel
Hi all, This is an email to notify everybody interested that the discussion on interoperability of Python dataframe libraries has moved to an official repo under the data-apis.org initiative: https://data-apis.org/blog/dataframe_protocol_rfc/ https://github.com/data-apis/dataframe-api and they

Re: [scikit-learn] Pandas copy-on-write proposal

2021-08-25 Thread Olivier Grisel
Thanks for the heads up! This is interesting. We rarely update dataframe values in-place in scikit-learn, but it is good to know that we could leverage this for more efficient pandas-in pandas-out support, for instance for missing value imputation.

Re: [scikit-learn] [TC Vote] Technical Committee vote: line length

2021-07-28 Thread Olivier Grisel
Many very active core devs not represented in the TC voted for 88 and my previous vote for 79 was not that strong. So I feel that I should now vote for 88: Keep current 88 characters: Olivier Revert to 79 characters: -- Olivier

[scikit-learn] scikit-learn monthly developer meeting: Monday June 28 2021

2021-06-25 Thread Olivier Grisel
Dear all, The scikit-learn developer monthly meeting will take place on Monday June 28th at 3PM UTC. - Video call link: https://meet.google.com/qbg-ucpe-ngz - Meeting notes / agenda: https://hackmd.io/0yokz72CTZSny8y3Re648Q - Local times:

Re: [scikit-learn] New member of the triage team: Norbert

2021-06-21 Thread Olivier Grisel
> I have only one question related to scikit-learn. > how to compute topic coherence of lda models in scikit-lean. I don't find > any function that calculate a coherence value. > please, reply me. We don't have such a metric in scikit-learn. I assume you are referring to:

Re: [scikit-learn] New member of the triage team: Norbert

2021-06-21 Thread Olivier Grisel
I am a bit late but I am very happy to see Norbert joining the triage team! Welcome!

Re: [scikit-learn] running examples

2021-03-24 Thread Olivier Grisel
Alternatively, you can edit the code to use fetch_openml(..., as_frame=False) to use a numpy array instead of a pandas dataframe for this example. -- Olivier

[scikit-learn] [ANN] scikit-learn 0.24.0rc1 is online!

2020-12-03 Thread Olivier Grisel
Please help us test the first release candidate for scikit-learn 0.24.0: pip install scikit-learn==0.24.0rc1 Changelog: https://scikit-learn.org/0.24/whats_new/v0.24.html In particular, if you maintain a project with a dependency on scikit-learn, please let us know about any regression.

Re: [scikit-learn] Changes in Travis billing

2020-11-05 Thread Olivier Grisel
> Shall I contact them? Any other volunteers? +1. I think we are still dependent on Travis for ARM-based release builds and cron jobs. I believe we can move the rest to Azure Pipelines or GitHub Actions. -- Olivier

Re: [scikit-learn] About the Boston housing prices dataset

2020-10-14 Thread Olivier Grisel
On Tue, Oct 13, 2020 at 16:19, Adrin wrote: > > Isn't the Boston dataset available through openml? Maybe here: > https://www.openml.org/d/531 > > I'm happy to have the dataset out there on openml, and for any material that > addresses some of the issues with it. > But for educational

Re: [scikit-learn] About the Boston housing prices dataset

2020-10-13 Thread Olivier Grisel
Thanks for your input, this is also an extension I was thinking of.

[scikit-learn] About the Boston housing prices dataset

2020-10-13 Thread Olivier Grisel
Hi all, Thanks to the sustained effort of several contributors (thanks Maria and Lucy in particular), the Boston housing price dataset is no longer used in the examples of scikit-learn (nor in the test suite) in the master branch. To give some context on why this dataset is problematic, please

Re: [scikit-learn] scikit-learn monthly meeting September 28th 2020

2020-09-28 Thread Olivier Grisel
Shall we start rolling meetings with a switch between 2 or 3 time slots? -- Olivier

Re: [scikit-learn] climate friendly software licence

2020-06-29 Thread Olivier Grisel
Hi Sole, I personally support climate change actions very much and I am convinced climate change is the number 1 challenge of our time. In an attempt to act in a consistent way with that belief, I declined several times to keynote at conferences either organized by the fossil fuel industry or to

Re: [scikit-learn] ANN scikit-learn 0.23.0 release

2020-05-13 Thread Olivier Grisel
Congrats on the release! And thank you very much to all those who were involved in making it happen (and Adrin in particular)! -- Olivier

Re: [scikit-learn] Monthly meetings

2020-03-30 Thread Olivier Grisel
I get a message for an invalid meeting id. -- Olivier

[scikit-learn] scikit-learn 0.22.1 is out!

2020-01-02 Thread Olivier Grisel
This is a minor release that includes many bug fixes and solves a number of packaging issues with Windows wheels in particular. Here is the full changelog: https://scikit-learn.org/stable/whats_new/v0.22.html#version-0-22-1 The conda package will follow soon (hopefully). Thank you very much to

Re: [scikit-learn] Paris Sprint in January and wiki update

2019-12-17 Thread Olivier Grisel
Indeed I do not see the "circle add" button in the tweetdeck UI anymore. But it's ok not to prepare the threads before tweeting the first tweet. We can build the thread progressively by publishing the first tweet and then replying one tweet after the other by hitting the reply button of the last

Re: [scikit-learn] scikit-learn twitter account

2019-12-03 Thread Olivier Grisel
Ok the twitter accounts are now switched: https://twitter.com/scikit_learn/status/1201794032650932224 The notifications for commits pushed to master are live: https://twitter.com/sklearn_commits Ready for the release :) -- Olivier

Re: [scikit-learn] scikit-learn twitter account

2019-12-02 Thread Olivier Grisel
Alright, I have configured the new github action for the tweets on @sklearn_commits: https://github.com/scikit-learn/scikit-learn/pull/15758 I tested it from my repo and it worked fine (I deleted the test tweet though). We can do the switch as soon as this PR is merged. -- Olivier

Re: [scikit-learn] scikit-learn twitter account

2019-12-02 Thread Olivier Grisel
It might actually be possible to use GitHub Actions with https://github.com/xorilog/twitter-action for instance. I will give it a try with a test repo. -- Olivier

Re: [scikit-learn] scikit-learn twitter account

2019-12-02 Thread Olivier Grisel
Alright, it seems that I can create Twitter apps (and generate API tokens) for the @sklearn_commits account. However, https://github.com/filearts/tweethook does not work, as it relies on a third-party webtask.io service that does not accept new subscriptions... I am looking for an alternative

Re: [scikit-learn] scikit-learn twitter account

2019-11-25 Thread Olivier Grisel
I have created the https://twitter.com/sklearn_commits twitter account. I have applied to make this account a "Twitter Developer" account to be able to use https://github.com/filearts/tweethook to register it as a webhook for the main scikit-learn github repo. Once ready, I will remove the old

Re: [scikit-learn] scikit-learn twitter account

2019-11-22 Thread Olivier Grisel
On Fri, Nov 22, 2019 at 17:24, Gael Varoquaux wrote: > > > I would like to create @sklearn_commits instead of > > @scikit_learn_commits that is too long for my taste. Any opinion? > > Some people do not make the link between "sklearn" and "scikit-learn" :) People who are likely to follow a

Re: [scikit-learn] scikit-learn twitter account

2019-11-22 Thread Olivier Grisel
Ok, I have sent some invites. I would like to create @sklearn_commits instead of @scikit_learn_commits, which is too long for my taste. Any opinion? -- Olivier

Re: [scikit-learn] scikit-learn twitter account

2019-11-22 Thread Olivier Grisel
Thanks Tom, let me try to configure this. -- Olivier

Re: [scikit-learn] scikit-learn twitter account

2019-11-15 Thread Olivier Grisel
I am not sure who has the rights to manage the twitter account. I just sent a password reset request to "sc**@a..***" I suspect that this is Andreas but I am not so sure.

Re: [scikit-learn] scikit-learn twitter account

2019-11-15 Thread Olivier Grisel
On Fri, Nov 15, 2019 at 17:31, Nicolas Hug wrote: > > What's the status of this? Would be great to have it for the 0.22 release :) ! > +1 and we could also announce / thank / RT new sources of funding (CZI and Fujitsu).

Re: [scikit-learn] scikit-learn twitter account

2019-11-15 Thread Olivier Grisel
On Tue, Nov 5, 2019 at 12:46, Gael Varoquaux wrote: > > On Mon, Nov 04, 2019 at 10:14:26PM -0700, Andreas Mueller wrote: > > Should we re-purpose the existing twitter account or make a new one? > > https://twitter.com/scikit_learn > > I think that we should repurpose it: > > - Make a

Re: [scikit-learn] Monthly meetings between core developers

2019-07-18 Thread Olivier Grisel
I just found this planner to give it a try: https://www.timeanddate.com/worldclock/meetingtime.html?day=29=7=2019=240=33=37=179=0 (Berlin and Paris are in the same timezone so I put only Berlin). It's going to be challenging to find a timeslot for everybody. The least extreme timeslot

Re: [scikit-learn] Monthly meetings between core developers

2019-07-18 Thread Olivier Grisel
On Thu, Jul 18, 2019 at 08:29, Adrin wrote: > > BTW, where was the meeting for last Monday organized? I don't think I knew it > was happening. I do not understand what you are referring to. My email was about the organization of future meetings as suggested by Andreas.

[scikit-learn] New core developer: jeremiedbb

2019-07-03 Thread Olivier Grisel
The core developers of scikit-learn have recently voted to welcome Jérémie Du Boisberranger to the team, in recognition of his efforts and trustworthiness as a contributor. Jérémie works at Inria Saclay and is supported by the scikit-learn initiative at Fondation Inria and its partners.

Re: [scikit-learn] Scikit Learn in a Cray computer

2019-06-29 Thread Olivier Grisel
You have to use a dedicated framework to distribute the computation on a cluster like your Cray system. You can use MPI, or dask with dask-jobqueue, but you also need to run parallel algorithms that are efficient when running in a distributed setting with a high cost for communication between distributed

Re: [scikit-learn] Scikit Learn in a Cray computer

2019-06-19 Thread Olivier Grisel
How many cores do you have on this machine? joblib.cpu_count()
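A small sketch of how one might answer that question from Python. The helper name `available_cpu_count` is hypothetical, and the fallback to `os.cpu_count()` is an assumption added here so the snippet also runs where joblib is not installed.

```python
import os


def available_cpu_count():
    """Return the number of CPUs as reported by joblib when it is installed,
    otherwise fall back to the standard library.

    `available_cpu_count` is a hypothetical helper for illustration only.
    """
    try:
        import joblib
    except ImportError:
        # os.cpu_count() can return None on exotic platforms.
        return os.cpu_count() or 1
    return joblib.cpu_count()


if __name__ == "__main__":
    print("CPUs visible to this process:", available_cpu_count())
```

On a cluster node, this number describes a single machine only. It says nothing about the other nodes of the system.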

Re: [scikit-learn] [Copyright] Skicit-learn graphic

2019-05-24 Thread Olivier Grisel
I think it's ok to do as you said. -- Olivier

Re: [scikit-learn] Release Candidate for Scikit-learn 0.21

2019-05-01 Thread Olivier Grisel
\o/

Re: [scikit-learn] VOTE: scikit-learn governance document

2019-02-20 Thread Olivier Grisel
+1

Re: [scikit-learn] Sprint discussion points?

2019-02-15 Thread Olivier Grisel
I would also add generalizing early stopping options to most estimators. This is a bit related to Joel's point on max_iter consistency in LogisticRegression. -- Olivier

Re: [scikit-learn] Next Sprint

2018-12-21 Thread Olivier Grisel
> say that they might be available at this time. It is good for many people, or should we organize a doodle? > > G > > On Wed, Dec 19, 2018 at 05:27:21PM -0500, Andreas Mueller wrote:

Re: [scikit-learn] MLPClassifier on WIndows 10 is 4 times slower than that on macOS?

2018-12-18 Thread Olivier Grisel
You should probably just "conda update scikit-learn": scikit-learn 0.20.1 is available on the official anaconda channel for all supported operating systems: https://anaconda.org/anaconda/scikit-learn -- Olivier

Re: [scikit-learn] Difference between linear model and tree-based regressor?

2018-12-13 Thread Olivier Grisel
They are very different statistical models from a mathematical point of view. See the online scikit-learn documentation or reference textbooks such as "Elements of Statistical Learning" for more details. In practice, linear models tend to be faster to fit on large data, especially when the

Re: [scikit-learn] New core dev: Adrin Jalali

2018-12-06 Thread Olivier Grisel
Congrats and welcome Adrin! -- Olivier

Re: [scikit-learn] benchmarking TargetEncoder Was: ANN Dirty_cat: learning on dirty categories

2018-11-23 Thread Olivier Grisel
Maybe a subset of the criteo TB dataset?

Re: [scikit-learn] Next Sprint

2018-11-20 Thread Olivier Grisel
We can also do Paris in April / May or June if that's ok with Joel and better for Andreas. I am teaching on Fridays from end of January to March. But I can miss half a day of sprint to teach my class. -- Olivier

Re: [scikit-learn] Random Forest Regressor -- Implementation in C++

2018-11-07 Thread Olivier Grisel
You might also want to have a look at https://github.com/onnx/onnxmltools although I am not sure if there are RF optimized ONNX runtimes at this point. -- Olivier

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Olivier Grisel
> I think model serialization should be a priority. There is also the ONNX specification that is gaining industrial adoption and that already includes open source exporters for several families of scikit-learn models: https://github.com/onnx/onnxmltools -- Olivier

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Olivier Grisel
On Wed, Sep 26, 2018 at 23:02, Joel Nothman wrote: > And for those interested in what's in the pipeline, we are trying to draft > a roadmap... > https://github.com/scikit-learn/scikit-learn/wiki/Draft-Roadmap-2018 > > But there are no doubt many features that are absent there too. > Indeed,

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Olivier Grisel
Joy!

Re: [scikit-learn] Bootstrapping in sklearn

2018-09-20 Thread Olivier Grisel
I believe it would fit in sklearn-contrib even if it's more for statistical inference rather than machine learning style prediction. Others might disagree. Anyways, joining efforts to improve documentation, CI, testing and so on is always a good thing for your future users. -- Olivier

Re: [scikit-learn] Bootstrapping in sklearn

2018-09-18 Thread Olivier Grisel
This looks like a very useful project. There is also scikits-bootstraps [1]. Personally I prefer the flat package namespace of resample (I am not a fan of the 'scikits' namespace package) but I still think it would be great to contact the author to know if he would be interested in joining

[scikit-learn] New core dev: Joris Van den Bossche

2018-06-23 Thread Olivier Grisel
Hi everyone! Let's welcome Joris Van den Bossche (@jorisvdbossche) officially as a scikit-learn core developer! Joris is one of the maintainers of the pandas project and recently contributed many new great PRs to scikit-learn (notably the ColumnTransformer and a refactoring of the categorical

Re: [scikit-learn] Announcing modAL: a modular active learning framework

2018-02-19 Thread Olivier Grisel
It looks nice, thanks for sharing. Do you plan to couple the active learner with a UX-optimized labeling interface (for instance with a react.js or similar frontend and a flask or similar backend)? -- Olivier

Re: [scikit-learn] clustering on big dataset

2018-01-02 Thread Olivier Grisel
Have you had a look at BIRCH? http://scikit-learn.org/stable/modules/clustering.html#birch -- Olivier

Re: [scikit-learn] Announcing sklearn-xarray

2017-12-04 Thread Olivier Grisel
Interesting project! BTW, do you know about dask-ml [1]? It might be interesting to think about generalizing the input validation of fit and predict / transform as a private method of the BaseEstimator class instead of directly calling into sklearn.utils.validation functions so as to make it

Re: [scikit-learn] Error while running 'python setup.py build_ext --inplace'

2017-12-04 Thread Olivier Grisel
Maybe update your version of Cython? -- Olivier

Re: [scikit-learn] Rapid Outlier Detection via Sampling

2017-11-27 Thread Olivier Grisel
> Do I need to write object oriented or are functions also ok? If you want to contribute an implementation as a new project on scikit-learn contrib, you should be careful to follow the scikit-learn estimators API:

Re: [scikit-learn] New core devs: Hanmin Qin, Guillaume Lemaître, and Roman Yurchak

2017-11-09 Thread Olivier Grisel
Congrats to all three of you! Thank you very much for your contributions and in particular in reviewing contributions by others. -- Olivier

Re: [scikit-learn] scikit-learn-commits mailing list defunct?

2017-08-28 Thread Olivier Grisel
+1 for python.org if they accept this kind of mailing lists.

Re: [scikit-learn] scikit-learn-commits mailing list defunct?

2017-08-28 Thread Olivier Grisel
+1

[scikit-learn] scikit-learn 0.19.0 is out!

2017-08-11 Thread Olivier Grisel
Grab it with pip or conda! Quoting the release highlights from the website: We are excited to release a number of great new features including neighbors.LocalOutlierFactor for anomaly detection, preprocessing.QuantileTransformer for robust feature transformation, and the

Re: [scikit-learn] Truncated svd not working for complex matrices

2017-08-10 Thread Olivier Grisel
I have no idea whether the randomized SVD method is supposed to work for complex data or not (from a mathematical point of view). I think that all scikit-learn estimators assume real data (or integer data for class labels) and our input validation utilities will cast numeric values to float64 by

Re: [scikit-learn] Extra trees tuning parameters

2017-08-04 Thread Olivier Grisel
I believe so even though it's always better to check in the code to see how this parameter is actually used. -- Olivier

[scikit-learn] scikit-learn 0.19b2 is available for testing

2017-07-17 Thread Olivier Grisel
The new release is coming and we are seeking feedback from beta testers! pip install scikit-learn==0.19b2 conda-forge packages should follow in the coming hours / days. Note that many models have changed behaviors and some things have been deprecated, see the full changelog at:

Re: [scikit-learn] Which algorithm is used in sklearn SGDClassifier when modified huber loss is used?

2017-07-07 Thread Olivier Grisel
The name of the algorithm / model would be "L2-penalized linear model with modified Huber loss trained with Stochastic Gradient Descent". SVM is traditionally used to describe models that use the hinge loss only (or sometimes the squared hinge loss too). Only the log loss can lead to a
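To make the distinction between the losses concrete, here is a minimal per-sample sketch. It assumes binary labels encoded as -1 / +1 and uses the usual textbook definitions; the function names are illustrative and this is not scikit-learn's internal implementation.

```python
def hinge_loss(y_true, decision):
    """Hinge loss for one sample; y_true is -1 or +1, decision is w.x + b."""
    return max(0.0, 1.0 - y_true * decision)


def modified_huber_loss(y_true, decision):
    """Modified Huber loss for one sample.

    Quadratic (squared hinge) while the margin z = y * f(x) is >= -1,
    linear for badly misclassified samples (z < -1).
    """
    z = y_true * decision
    if z >= -1.0:
        return max(0.0, 1.0 - z) ** 2
    return -4.0 * z


if __name__ == "__main__":
    for margin in (2.0, 0.0, -1.0, -2.0):
        print(margin, hinge_loss(1, margin), modified_huber_loss(1, margin))
```

The two branches of the modified Huber loss meet at z = -1, where both give 4. The linear tail is what makes it less sensitive to outliers than a plain squared hinge loss.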

Re: [scikit-learn] Typo in online documentation on Matrix Factorization

2017-07-06 Thread Olivier Grisel
I think the documentation is correct. U, a.k.a. "the code" or "the activations", has shape (n_samples, n_components) and V, a.k.a. "the dictionary" or "the components", has shape (n_components, n_features) in both cases. We could use n_components uniformly instead of n_atoms for consistency's sake
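The shapes quoted above can be written out as follows. The symbol X for the data matrix is an assumption introduced here; only U, V and the dimension names come from the message.

```latex
X \approx U V, \qquad
X \in \mathbb{R}^{n_{\text{samples}} \times n_{\text{features}}}, \quad
U \in \mathbb{R}^{n_{\text{samples}} \times n_{\text{components}}}, \quad
V \in \mathbb{R}^{n_{\text{components}} \times n_{\text{features}}}
```

The inner dimension n_components cancels in the product, so U V has the same shape as X.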

Re: [scikit-learn] Fwd: [SciPy-User] EuroSciPy 2017 call for contributions - extension of deadline

2017-06-30 Thread Olivier Grisel
I am pretty sure this is exactly the kind of presentation that the EuroScipy audience would enjoy. Please submit! -- Olivier

Re: [scikit-learn] Scikit-learn at Data Intelligence this past weekend

2017-06-30 Thread Olivier Grisel
Thanks for this report! -- Olivier

Re: [scikit-learn] Agglomerative clustering

2017-06-30 Thread Olivier Grisel
You can have a look at the test named "test_agglomerative_clustering" in: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/tests/test_hierarchical.py -- Olivier

[scikit-learn] Scikit-learn workshop and sprint at EuroScipy 2017 in Erlangen

2017-06-23 Thread Olivier Grisel
Hi all, FYI I have just submitted a 90 min tutorial on scikit-learn to the EuroScipy CFP. If anybody is interested in co-teaching / TA-ing this workshop please let me know. I also plan to stay for the one-day sprint to help people make their first contribution to the project. Last year we had

Re: [scikit-learn] XGboost Classifier error

2017-04-19 Thread Olivier Grisel
Please provide the full traceback. Without it it's impossible to tell whether the problem is in scikit-learn or xgboost. Also, please provide a minimal reproduction script as explained in: http://scikit-learn.org/stable/faq.html#what-s-the-best-way-to-get-help-on-scikit-learn-usage -- Olivier

Re: [scikit-learn] Logistic regression with elastic net regularization

2017-03-14 Thread Olivier Grisel
From a generalization point of view (test accuracy), the optimal sparsity support should not matter much though, but it can be helpful to find the optimally sparse solution, either for computational constraints (smaller models with a lower prediction latency) or for interpretation of the weights

Re: [scikit-learn] Logistic regression with elastic net regularization

2017-03-14 Thread Olivier Grisel
Note that SGD is not very good at optimizing finely with a non-smooth penalty (e.g. l1 or elasticnet). The future SAGA solver is going to be much better at finding the optimal sparsity support (although this support is not guaranteed to be stable across re-sampling of the training set if the

Re: [scikit-learn] GSOC call for mentors

2017-02-18 Thread Olivier Grisel
Personally I don't feel like mentoring this year. I would really like to focus my scikit-learn time on finishing the joblib process refactoring with Thomas Moreau and the binning / thread-based parallelization of boosted trees with Guillaume and Raghav. -- Olivier

Re: [scikit-learn] Modelling event rates

2017-02-17 Thread Olivier Grisel
I don't think we have any model dedicated to this, but it's possible that expressive non-parametric models such as RF and GBRT or richly parameterized models such as MLP with a regression loss can do a good enough job at giving you a point estimate. -- Olivier

Re: [scikit-learn] Preparing a scikit-learn 0.18.2 bugfix release

2017-01-09 Thread Olivier Grisel
Ideally I would like to get it out before April. Instead of setting up a roadmap, I would rather just identify the bugs that are blockers, fix only those, and not wait for any feature before cutting 0.19.X. -- Olivier

Re: [scikit-learn] Preparing a scikit-learn 0.18.2 bugfix release

2017-01-09 Thread Olivier Grisel
In retrospect, making a small 0.19 release is probably a good idea. I would like to get https://github.com/scikit-learn/scikit-learn/pull/8002 in before cutting the 0.19.X branch. -- Olivier Grisel

[scikit-learn] Preparing a scikit-learn 0.18.2 bugfix release

2017-01-09 Thread Olivier Grisel
Hi all, I think we should release 0.18.2 to get some important fixes and make it easy to release Python 3.6 wheel packages for all the operating systems using the automated procedure. I identified a couple of PRs to backport to 0.18.X to prepare the 0.18.2 release. Are there any other important

Re: [scikit-learn] modifying CV score

2017-01-04 Thread Olivier Grisel
You can indeed derive from BaseEstimator and implement fit, predict and optionally score. Here is the documentation for the expected estimator API: http://scikit-learn.org/stable/developers/contributing.html#apis-of-scikit-learn-objects As this is a linear regression model, you may also want to
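A minimal sketch of that pattern, using a toy model rather than the linear regression discussed in the thread. The class name `MeanRegressor` and the choice of negative mean squared error as the score are assumptions made for illustration; the `ImportError` fallback is only there so the snippet runs without scikit-learn installed.

```python
try:
    from sklearn.base import BaseEstimator
except ImportError:
    # Fallback so the sketch still runs without scikit-learn installed.
    BaseEstimator = object


class MeanRegressor(BaseEstimator):
    """Toy estimator (hypothetical): always predicts the training mean."""

    def fit(self, X, y):
        # Attributes learned from data end with an underscore by convention.
        self.mean_ = sum(y) / len(y)
        return self  # fit returns self so that calls can be chained

    def predict(self, X):
        return [self.mean_ for _ in X]

    def score(self, X, y):
        # Custom score: negative mean squared error, so higher is better,
        # which is what model selection tools expect from score().
        predictions = self.predict(X)
        squared_errors = [(p - t) ** 2 for p, t in zip(predictions, y)]
        return -sum(squared_errors) / len(y)


if __name__ == "__main__":
    model = MeanRegressor().fit([[0], [1]], [1.0, 3.0])
    print(model.predict([[5]]), model.score([[0], [1]], [1.0, 3.0]))
```

When no `scoring` argument is passed, cross-validation helpers fall back to the estimator's own `score` method, which is why overriding it changes the CV score.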

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Olivier Grisel
I cannot reproduce such a degradation on my machine: (sklearn-0.17)ogrisel@is146148:~/code/scikit-learn$ python ~/tmp/bench_vectorizer.py scikit-learn 0.17.1. Numpy 1.11.2. Python 3.5.0 x86_64 Vectorizing 20newsgroup 11314 documents Vectorization completed in 4.033604383468628 seconds,

Re: [scikit-learn] Latent Semantic Analysis (LSA) and TrucatedSVD

2016-08-27 Thread Olivier Grisel
BTW Roman, the examples in your gist would make a great non-regression test for this new feature. Please feel free to submit a PR. -- Olivier

Re: [scikit-learn] 0.18?

2016-07-25 Thread Olivier Grisel
Sorry for the late reply, Before working on this release I would like to automate the wheel generation process (for the release wheels) in a single repo that will generate wheels for linux, osx and windows based on https://github.com/matthew-brett/multibuild I plan to put that repo under

Re: [scikit-learn] How to test on PYTHON_ARCH=32 with mac?

2016-07-20 Thread Olivier Grisel
> I believe this `arch -i386` only works as a prefix for Python.org Python, > but I'm happy to be corrected. Then the following should work: arch -i386 python -c "import nose; nose.main()" sklearn

Re: [scikit-learn] NB-SVM Implementation

2016-06-07 Thread Olivier Grisel
I think it could be implemented as a preprocessing step: this is the approach followed by: https://github.com/ryankiros/skip-thoughts/blob/master/eval_classification.py Note that in that case LogisticRegression is used as the final classifier instead of a squared hinge loss SVM but that should