I'm posting this in a separate thread so it doesn't get mixed up with the
one on migration cases.

Although many people are used to tools like MS Excel, and often treat it
as the "gold standard", in reality it is not a very reliable tool if you
need precise and accurate results, let alone stable behavior in its
calculations.

For example, there are serious problems and errors in MS Excel's
statistical functions that have survived since very old versions:

*Title:* On the Accuracy of Statistical Distributions in Microsoft Excel
2010

*Content:* Most of the errors in Microsoft Excel 97 and Excel 2003 pointed
out in my previous papers have been eliminated in Excel 2010. But there are
still too many deficiencies to be found in Excel 2010 and in my opinion one
cannot yet say that Excel is a good and user-friendly program for
scientific statistical purposes.

*Contact Information:* [email protected]
*URL:* http://www.csdassn.org/reportdetail.cfm?ID=1520

*Title:* Microsoft Excel 2000 and 2003 Faults, Problems, Workarounds and
Fixes

*Content:* This project attempts to consolidate most of the criticisms,
reported errors and faults in Excel's statistical applications, and to
evaluate their claims; it also extends the analysis to additional functions
and routines in both Excel 2000 and 2003 versions as of February 2005. More
extensive testing was done on many functions and routines to discover
regions of the parameter space where the functions and routines would
return erroneous values, or return error codes.

The other purpose is to describe workarounds and fixes that overcome these
faults and deficiencies in Excel-2000. Problems, faults and errors that
still remain in Excel-2003 are also discussed. If the problem in Excel-2000
has been fixed in Excel-2003, it will be discussed. If there is no explicit
indication of a change in Excel-2003, then it can be assumed that the
problem still occurs in Excel-2003.

In addition, some more intensive investigations were made on some routines
and functions to identify “hidden” properties or computation limits.
Further, some of Microsoft’s Knowledge Base Articles (KBA’s) relevant to
the functions and routines are also evaluated.

The project comprises 13 main papers (sections), and 26 separate notes that
expand on some issues or areas of concern:

Section 1: Introduction. A general review on the use of Excel in teaching
introductory statistics and for general data analysis.

Section 2: General problems with Excel. Introduces ideas about the meaning
and use of problem, fault, defect and error terms, as they relate to the
user and to the programmer (software developer).

Section 3: Excel computation and display issues. Gives a road map on how
things occur in Excel. Describes the IEEE-754 standard and its limitations.
Differences between exact mathematical equation results and the
implementation in Excel. Describes the difficulties of linguistic
translations to Excel inputs. Describes the problems of using the display
as criteria for accuracy.

Section 4: The testing program for accuracy. Describes the basic methods
used to test Excel outputs for accuracy. The STRD data sets. Discusses
issues regarding the ability to obtain test data sets and precise output
values. In many cases there is no agreed on computational method. Describes
accuracy rating methods.

Section 5: Univariate analysis. Tests on the 22 Excel univariate (or
descriptive statistics) functions.

Section 6: Analysis of variance (ANOVA). Tests on the ANOVA routines.

Section 7: Covariance and correlation. Tests on the covariance and
correlation functions.

Section 8: Linear and polynomial regression. Reviews the problems with
Excel 2000 regression, the improvements in Excel 2003, and the remaining
deficiencies.

Section 9: Nonlinear regression. Lists previous tests on non-linear
equation fitting using Solver. Solver basic deficiencies are discussed.
Comparisons to other software products are made.

Section 10: Statistical distributions and related functions. A general
description of the distribution functions available in Excel.

Section 11: Testing for accuracy and reliability of statistical
distributions. Descriptions and methods on testing these distributions.

Section 12: Results of new tests on statistical distributions. Discrete,
continuous density, continuous cumulative, and continuous inverse are
covered in four subsections.

Section 13: Statistical tests, tests of significance and tests of a
hypothesis. Tests on the t test, F test and Z test functions and routines.
Discusses the problems and reported faults with these. Discusses the
Behrens-Fisher problem and Excel’s implementation of a solution.

Section 14: Random number generation. Discusses the Excel random number
generators for both 2000 and 2003 and gives the results of tests.

Section 15: Add-in packages. PHSTAT1, PHSTAT2, DDXL and MEGASTAT were
evaluated. Only MEGASTAT is acceptable.

Section 16: Bibliography

There are 26 notes that expand on parts of the testing project. They are
referred to in the sections. An expanded XLS file is included that gives
worksheets on how to easily generate the charts commonly found in
introductory statistics textbooks.

The documents are available at www.daheiser.info

*Website:* http://www.daheiser.info
*Contact Information:* David Heiser, MS.
Carmichael, California
*URL:* http://www.csdassn.org/reportdetail.cfm?ID=509
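
Section 3 of that project mentions the limits of IEEE-754 and the risk of
using the display as a criterion for accuracy. As a rough illustration
(this is my own sketch in Python, not something taken from the papers), a
value can look right once it is rounded for display while the stored
number is different:

```python
from decimal import Decimal

# 0.1 and 0.2 have no exact binary (IEEE-754 double) representation,
# so their sum is not exactly the double closest to 0.3.
a = 0.1 + 0.2

shown = f"{a:.2f}"       # what a cell formatted to two decimals would show
equal = (a == 0.3)       # comparison on the stored values, not the display
exact = Decimal(a)       # the exact value actually stored in the double

print(shown)
print(equal)
print(exact)
```

The rounded display looks like 0.30, yet the equality test fails, which is
why checking results "by eye" in a spreadsheet cell is not a good accuracy
test.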

Interestingly, Gnumeric had similar problems, except that in its case the
fix was quick and (for the most part) correct.

*Title:* Fixing Statistical Errors in Spreadsheet Software: The Cases of
Gnumeric and Excel

*Content:* The open source spreadsheet package "Gnumeric" was such a good
clone of Microsoft Excel that it even had errors in its statistical
functions similar to those in Excel's statistical functions. When apprised
of the errors in v1.0.4, the developers of Gnumeric indicated that they
would try to fix the errors. Indeed, Gnumeric v1.1.2 has largely fixed its
flaws, while Microsoft has not fixed its errors through many successive
versions. Persons who desire to use a spreadsheet package to perform
statistical analyses are advised to use Gnumeric rather than Excel.

*Contact Information:* B. D. McCullough, Department of Decision Science,
LeBow College of Business, Drexel University, Philadelphia, PA 19104
*URL:* http://www.csdassn.org/reportdetail.cfm?ID=508

Of course, for simple calculations MS Excel, LibreOffice Calc and Gnumeric
are probably all fine, but if you need *correct results* in your
calculations, Gnumeric turns out to be a good option (a far better one is
R, but that is best left for another occasion).
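
To give an idea of how a "simple" statistic can go wrong, here is a
minimal sketch in Python. It is my own illustration, not code from any
spreadsheet, and the data set is hypothetical. It compares the one-pass
"calculator" formula for the sample variance with the two-pass formula.
As far as I recall, the one-pass formula is among the things criticized
in older Excel versions, but treat that attribution as an assumption and
check the papers above.

```python
import statistics

def variance_one_pass(xs):
    """Sample variance via the 'calculator' formula: (sum(x^2) - (sum x)^2 / n) / (n - 1)."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    # Subtracting two huge, nearly equal numbers loses the low-order digits.
    return (total_sq - total * total / n) / (n - 1)

def variance_two_pass(xs):
    """Sample variance computed from deviations around the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

# Hypothetical data: a large offset with a small spread.
# The true sample variance is exactly 1.
data = [1e9 + 1, 1e9 + 2, 1e9 + 3]

one_pass = variance_one_pass(data)
two_pass = variance_two_pass(data)
reference = statistics.variance(data)  # standard library, for comparison

print(one_pass, two_pass, reference)
```

With this data the two-pass version and the standard library both give
1.0, while the one-pass version cannot, because its intermediate sums are
around 3e18 and a double no longer resolves differences that small at that
magnitude.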

Bottom line: software is not good, or a "gold standard", just because it
is commercial, and it is not badly made and incorrect just because it is
FLOSS, or vice versa :-)

I hope the confusion is now clear (as a professor from my youth used to say).

Regards

--
Jesus M. Castagnetto <[email protected]>
Web: http://www.castagnetto.com/
_______________________________________________
Linux-plug mailing list
Topic: General discussion about Linux
Peruvian Linux User Group (http://www.linux.org.pe)
