On Monday, February 14, 2011 04:40:24 pm Andreas J. Guelzow wrote:
> On Mon, 2011-02-14 at 13:35 -0500, Daniel P. Dougherty wrote:
> > It used to be in prior versions of Gnumeric that you could select the
> > entire column by clicking on the column header when doing a linear
> > regression.  Now when I do this I just get gibberish.
>

The empty cells down to row 65536 are clearly not my data, and I never asked
for them to be part of it.  Frankly, this is weird and unexpected behavior
for a spreadsheet.

Granted, clicking a header to select a column's data was always put in
spreadsheets (e.g. Excel) as a hack-ish work-around, because programmers
needed to i) pre-allocate memory for the sheet rows and ii) render the sheet
rapidly during scrollbar moves (i.e. paint off screen/double buffering)
without flickering.  With Gnumeric being an OOP application, I might argue
that it's not very nice OOP (data hiding) to expose these rather arcane
memory-allocation details to the end user.  Indeed, you as the programmer may
pre-allocate memory for whatever number of rows you want--fine.  But it's
probably a sign of good OOP design when the casual user isn't aware of what
that value is...

But why not also allocate memory for a single integer (not much overhead 
there) for the true "row count" of a data set?

Why not use a convention that the number of rows in a data set is equal to the 
maximum row index containing at least one non-empty value?  Just increment up 
the row count as the user enters in data, decrement if they delete/clear,  and 
if the row count exceeds the memory allocated then allocate more memory -- but 
do it silently so the user doesn't need to know about it. 
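
To make the row-count convention concrete, here is a minimal Python sketch.
It is not Gnumeric code; the `Column` class and its method names are
hypothetical, and it assumes a cell counts as empty when it holds `None` or
an empty string.

```python
class Column:
    """Sparse column that tracks how many rows are actually in use."""

    def __init__(self):
        self._cells = {}    # row index -> value; only non-empty cells are stored
        self.row_count = 0  # rows up to and including the last non-empty one

    def set(self, row, value):
        """Store a value; writing None or "" is treated as clearing the cell."""
        if value is None or value == "":
            self.clear(row)
            return
        self._cells[row] = value
        self.row_count = max(self.row_count, row + 1)

    def clear(self, row):
        """Empty a cell and shrink row_count if the last used row went away."""
        self._cells.pop(row, None)
        if row + 1 == self.row_count:
            # Recompute from the remaining cells: rows just above the
            # cleared one may be empty too, so a plain decrement is not enough.
            self.row_count = max(self._cells, default=-1) + 1
```

Storage grows only as cells are filled, so the user never sees a
pre-allocated row limit.  Clearing the last used row rescans the stored
cells, which stays cheap as long as deletions at the bottom are rare.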

The main points are:
        i) The end user shouldn't have to bump up against arcane internal
details (like the program internally allocating memory for 65536 rows).  Such
details certainly shouldn't get them into immediate trouble when doing basic
row/column selections.
        ii) The end user is using a spreadsheet because they want
point-and-click functionality.  They don't want to have to scroll all the way
to the bottom (of a possibly large data set) to select/drag back to the top
row.
        iii) Perhaps Gnumeric developers need to come to a consensus on a
special character "." or "NAN" etc. to indicate truly missing data versus
empty data (or more generally maybe make the special value user-determined).
        iv) OK, why not have a menu item "Select data block" that gets the
data up to the last non-empty row?  Then why not make that method get called
when a user clicks on a column header?
        v) If there is no good work-around for this issue, one wonders why
clicking a column header to select the entire column all the way down to
65536 rows is even an enabled feature.  When would a typical user ever want
to do that??  Just remove that functionality altogether from Gnumeric lest
the casual mom and pop user get themselves into trouble.
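
Point iv) could look roughly like the Python sketch below.  The function
names are made up for illustration and this is not how Gnumeric implements
selection; it assumes a column is a plain list in which `None` or an empty
string marks an empty cell.

```python
def last_non_empty_row(cells):
    """Return the index of the last non-empty cell, or -1 if all are empty."""
    for row in range(len(cells) - 1, -1, -1):
        if cells[row] is not None and cells[row] != "":
            return row
    return -1


def select_data_block(cells):
    """Return the inclusive (first_row, last_row) span a header click would
    select, or None when the column holds no data at all."""
    last = last_non_empty_row(cells)
    return None if last < 0 else (0, last)


# A 65536-row column with four used rows, one of them an interior gap.
column = [0.3, 0.7, None, 0.5] + [None] * (65536 - 4)
block = select_data_block(column)
```

Scanning from the bottom means an empty row followed by more data stays
inside the selection; only the trailing empty cells are dropped.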

P.S.

As an aside, I tried to deal with the issue via Edit->Select->"Go to bottom".
Note that "Go to bottom" does stop at the _last_ non-empty row of the current
block.  However, if there is an empty row followed by more data, it doesn't
reach that data.


> Gnumeric now uses the existing sheet functions. As a consequence changes
> in the data are reflected in the output.
> In ancient times, Gnumeric would just calculate everything once, skipping
> any empty fields. Note that that was often not what was desired since a
> single missing value would offset the correspondence between the
> dependent and independent variables.
>

I refer to my point iii) above....  Empty cells within a selection should
probably result in a thrown error -- not return 0.  Returning 0 was never the
right thing to do.  Gnumeric developers should decide on a special character
for truly missing data (or more generally maybe make the special value
user-determined).
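
A rough Python sketch of that behavior follows.  Everything in it is an
assumption for illustration: the "NA" marker, the function names, and the
choice to drop a whole row when either value is marked missing.  It is not
Gnumeric's regression code.

```python
MISSING = "NA"  # hypothetical marker for truly missing data


def paired_observations(xs, ys):
    """Pair up x and y by row, rejecting empty cells outright."""
    if len(xs) != len(ys):
        raise ValueError("x and y ranges differ in length")
    pairs = []
    for row, (x, y) in enumerate(zip(xs, ys)):
        if x is None or y is None:
            # An empty cell is an error; it is never silently read as 0.
            raise ValueError(f"empty cell in row {row}")
        if x == MISSING or y == MISSING:
            continue  # drop the whole row so x and y stay aligned
        pairs.append((float(x), float(y)))
    return pairs


def fit_line(pairs):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(paired_observations([1, 2, "NA", 4], [2, 4, 6, 8]))
```

Dropping the row as a pair keeps the correspondence between the dependent
and independent variables, which is the problem with skipping empty fields
in each column separately.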



> Andreas
> 
> > SUMMARY OUTPUT              Response Variable:      Column 2
> > 
> > Regression Statistics
> > Multiple R  #VALUE!
> > R^2 #VALUE!
> > Standard Error      #VALUE!
> > Adjusted R^2        #VALUE!
> > Observations        #VALUE!
> > 
> > ANOVA
> > 
> >     df      SS      MS      F       Significance of F
> > 
> > Regression  1       #VALUE! #VALUE! #VALUE! #VALUE!
> > Residual    #VALUE! #VALUE! #VALUE!
> > Total       #VALUE! #VALUE!
> > 
> >     Coefficients    Standard Error  t-Statistics    p-Value Lower 95%       Upper 95%
> > Intercept   #VALUE! #VALUE! #VALUE! #VALUE! #VALUE! #VALUE!
> > Column 1    #VALUE! #VALUE! #VALUE! #VALUE! #VALUE! #VALUE!
> > 
> > Similar frustrations now also occur when trying to do a histogram. 
> > Selecting an entire column without having to drag the entire region
> > worked in previous versions of Gnumeric.  Why has this basic
> > functionality disappeared?  Is this a bug in the new version??  Based on
> > the output I'm seeing, it seems to be a bug and to have something to do
> > with all 65536 rows (the maximum in a Gnumeric sheet) getting
> > treated as data (possibly as zeros) even though the cells are empty!!
> > Previous versions were smart enough to detect where the block of real
> > data ended and the empty cells started.
> > 
> > 
> > Histogram
> > 
> >             Column 1
> > 
> > from −∞     to below 0.04003886889991       65511
> > from 0.04003886889991       to below 0.14538315034182       3
> > from 0.14538315034182       to below 0.25072743178374       4
> > from 0.25072743178374       to below 0.35607171322565       1
> > from 0.35607171322565       to below 0.46141599466757       5
> > from 0.46141599466757       to below 0.56676027610949       4
> > from 0.56676027610949       to below 0.6721045575514        1
> > from 0.6721045575514        to below 0.77744883899332       2
> > from 0.77744883899332       to below 0.88279312043524       1
> > from 0.88279312043524       to below 0.98813740187715       2
> > from 0.98813740187715       to ∞    2
> > _______________________________________________
> > gnumeric-list mailing list
> > gnumeric-list@gnome.org
> > http://mail.gnome.org/mailman/listinfo/gnumeric-list
