realno commented on pull request #1525: URL: https://github.com/apache/arrow-datafusion/pull/1525#issuecomment-1007822391
> This is a really nice piece of work @realno - thank you so much ❤️
>
> I especially like the thorough testing.
>
> I did not review the referenced papers, but I did run some basic tests against postgres and I got some strange results
>
> ```shell
> cargo run datafusion-cli
> ```
>
> Sometimes the answers are different from query to query
>
> ```
> ❯ select stddev(sq.column1) from (values (1.1), (2.0), (3.0)) as sq;
> +--------------------+
> | STDDEV(sq.column1) |
> +--------------------+
> | 0.7760297817881877 |
> +--------------------+
> 1 row in set. Query took 0.008 seconds.
> ❯ select stddev(sq.column1) from (values (1.1), (2.0), (3.0)) as sq;
> +--------------------+
> | STDDEV(sq.column1) |
> +--------------------+
> | NaN                |
> +--------------------+
> ```
>
> And neither of those answers matches postgres:
>
> ```
> alamb=# select stddev(sq.column1) from (values (1.1), (2.0), (3.0)) as sq;
>          stddev
> ------------------------
>  0.95043849529221686157
> (1 row)
> ```
>
> Postgres and google sheets do match
>
> <img alt="Screen Shot 2022-01-07 at 4 15 53 PM" width="368" src="https://user-images.githubusercontent.com/490673/148608489-9e64e81c-a627-4991-b3b5-079f19030c16.png">
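
For reference, the two non-NaN figures quoted above are consistent with the difference between population and sample standard deviation. The sketch below is a quick check with Python's standard `statistics` module. It is not code from the PR and makes no claim about how DataFusion or Postgres compute the value internally. It assumes only the three input values from the query.

```python
import statistics

# The three values used in the queries quoted above.
values = [1.1, 2.0, 3.0]

# Population standard deviation: divide the sum of squared deviations by n.
population = statistics.pstdev(values)

# Sample standard deviation: divide by n - 1 (Bessel's correction).
sample = statistics.stdev(values)

print(f"population stddev: {population:.6f}")  # population stddev: 0.776030
print(f"sample stddev:     {sample:.6f}")      # sample stddev:     0.950438
```

The population figure agrees with the first `datafusion-cli` result and the sample figure agrees with the Postgres result, which suggests the mismatch may come down to which divisor is used. This does not account for the intermittent `NaN`.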