[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-25 Thread Raymond Hettinger
Raymond Hettinger added the comment: New changeset a6825197e9f2bd730d8da38f223608411e508695 by Miss Islington (bot) in branch '3.10': bpo-44151: Various grammar, word order, and markup fixes (GH-26344) (GH-26345)

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-25 Thread miss-islington
Change by miss-islington: pull_requests: +24937; pull_request: https://github.com/python/cpython/pull/26345

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-24 Thread Raymond Hettinger
Change by Raymond Hettinger: pull_requests: +24936; pull_request: https://github.com/python/cpython/pull/26344

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-24 Thread Raymond Hettinger
Change by Raymond Hettinger: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-24 Thread Raymond Hettinger
Raymond Hettinger added the comment: New changeset 86779878dfc0bcb74b4721aba7fd9a84e9cbd5c7 by Miss Islington (bot) in branch '3.10': bpo-44151: linear_regression() minor API improvements (GH-26199) (GH-26338) https://github.com/python/cpython/commit/86779878dfc0bcb74b4721aba7fd9a84e9cbd5c7

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-24 Thread miss-islington
Change by miss-islington: nosy: +miss-islington; nosy_count: 6.0 -> 7.0; pull_requests: +24930; pull_request: https://github.com/python/cpython/pull/26338

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-21 Thread Raymond Hettinger
Raymond Hettinger added the comment: Zachary, unless someone steps in with an objection, I think you can go forward with the PR to implement this signature: linear_regression(x, y, /) -> LinearRegression(slope, intercept)

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-19 Thread Raymond Hettinger
Raymond Hettinger added the comment: Steven, do you approve of this? linear_regression(x, y) -> LinearRegression(slope, intercept)

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-19 Thread Raymond Hettinger
Raymond Hettinger added the comment:
> Just to clarify, is the proposal to return a regular tuple instead of named tuple?
No, it should still have named fields. Either Line or LinearRegression would suffice.
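The distinction between a regular tuple and a named tuple can be sketched as follows. The LinearRegression type defined here is a hypothetical stand-in built with collections.namedtuple, not the class from the statistics module, and the values are illustrative.

```python
from collections import namedtuple

# Hypothetical stand-in for the result type discussed in the thread.
LinearRegression = namedtuple("LinearRegression", ["slope", "intercept"])

result = LinearRegression(slope=2.0, intercept=1.0)

# Named fields keep call sites self-documenting.
predicted = result.slope * 10 + result.intercept

# Plain tuple unpacking still works, so nothing is lost
# compared with returning a regular tuple.
slope, intercept = result

# A named tuple compares equal to a regular tuple with the same items.
same_as_plain_tuple = result == (2.0, 1.0)
```

A named tuple therefore serves both styles of use: callers who want `slope, intercept = ...` and callers who prefer attribute access.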

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-18 Thread Zachary Kneupper
Zachary Kneupper added the comment:
> Any objections to linear_regression(x, y) -> (slope, intercept)?
I think `linear_regression(x, y)` would be intuitive for a wide range of users. Just to clarify, is the proposal to return a regular tuple instead of named tuple? Would we do this:

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-18 Thread Raymond Hettinger
Raymond Hettinger added the comment: Any objections to linear_regression(x, y) -> (slope, intercept)?

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-17 Thread Raymond Hettinger
Raymond Hettinger added the comment: Looking over the comments so far, it looks like (x, y) would be best and (independent variable, dependent variable) would be second best. The (x, y) also has the advantage of matching correlation() and covariance(). For output order, it seems that

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-17 Thread Zachary Kneupper
Zachary Kneupper added the comment:
> The ML world has collapsed on the terms X and y. (With that capitalization).
The ML community will probably use 3rd party packages for their linear regressions in any case. In my estimation, the ML community would be comfortable with any of these

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-17 Thread Steven D'Aprano
Steven D'Aprano added the comment:
> The ML world has collapsed on the terms X and y. (With that capitalization).
I just googled for "ML linear regression" and there is no consistency in either the variable used or the parameters. But most seem to use lowercase x,y. Out of the top 6 links

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-17 Thread Zachary Kneupper
Change by Zachary Kneupper: keywords: +patch; nosy: +zkneupper; nosy_count: 5.0 -> 6.0; pull_requests: +24816; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/26199

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Miki Tebeka
Miki Tebeka added the comment: I'm +1 on the changes proposed by Raymond. In my teaching experience most developers who will use the built-in statistics package will have high-school-level math experience. On the other hand, they'll probably go to Wikipedia and the entry there uses dependent

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Matt Harrison
Matt Harrison added the comment: And by "if your model is in the correct layout", I meant "if your data is in the correct layout"

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Matt Harrison
Matt Harrison added the comment: The ML world has collapsed on the terms X and y. (With that capitalization). Moreover, most (Python libraries) follow the interface of scikit-learn [0]. Training a model looks like this:

    model = LinearRegression()
    model.fit(X, y)

After that, the

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Steven D'Aprano
Steven D'Aprano added the comment:
> The named tuple should be called Line because that is what it describes. Also, a Line class would be reusable for other purposes than linear regression.
I think that most people would expect that a Line class would represent a straight line widget

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Steven D'Aprano
Steven D'Aprano added the comment: I agree with you that "regressor" is too obscure and should be changed. I disagree about the "y = mx + c". Haven't we already discussed this? That form is used in linear algebra, but not used in statistics. Quoting from Yale: "A linear regression line has

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Raymond Hettinger
Raymond Hettinger added the comment: Related links: * https://support.microsoft.com/en-us/office/linest-function-84d7d0d9-6e50-4101-977a-fa7abf772b6d *

[issue44151] Improve parameter names and return value ordering for linear_regression

2021-05-16 Thread Raymond Hettinger
New submission from Raymond Hettinger: The current signature is: linear_regression(regressor, dependent_variable). While the term "regressor" is used in some problem domains, it isn't well known outside of those domains. The term "independent_variable" would be better because it is