On Sunday, July 09, 2006 12:31 PM, Roger Koenker = RK <[EMAIL PROTECTED]> wrote
RK> On 7/8/06, Thaden, John J <[EMAIL PROTECTED]> wrote:
JT> As there is nothing inherent in either compressed, sparse,
JT> format that would prevent recognition and handling of
JT> duplicated index pairs, I'm curious why the dgCMatrix
JT> class doesn't also add x values in those instances?

RK> why not multiply them? or take the larger one,
RK> or ...? I would interpret this as a case of user
RK> negligence -- there is no "natural" default behavior
RK> for such cases.

This user created example data to illustrate his question, but of
course he faces real data, analytical chemical in this case, data
that happen to come with an 8.4% occurrence of non-unique index
pairs, and also, quite literally, with a "natural" way to treat such
cases (the ~nature~ of the assay makes it correct to sum them). I can
think of other natural data sets where averaging would be the
"natural" behavior. So you are right that there is no "default"
natural behavior; hence my suggestion to leave that to user choice
via a function argument or class slot, defaulted to summing.

Actually, in this case there ~is~ one behavior superior to summing --
abstracting one member of each data pair (those that share indices)
into a second (very sparse) "overlay" matrix. Perhaps it is my
negligence not to have done this instead of querying the list :-)
I am doing it now.

Regards,
-John Thaden

RK> On Jul 9, 2006, at 11:06 AM, Douglas Bates wrote:
DB> Your matrix Mc should be flagged as invalid. Martin and I should
DB> discuss whether we want to add such a test to the validity method. It
DB> is not difficult to add the test but there will be a penalty in that
DB> it will slow down all operations on such matrices and I'm not sure if
DB> we want to pay that price to catch a rather infrequently occurring
DB> problem.

RK> Elaborating the validity procedure to flag such instances seems
RK> to be well worth the speed penalty in my view. Of course,
RK> anticipating every such misstep imposes a heavy burden
RK> on developers and constitutes the real "cost" of more elaborate
RK> validity checking.
RK>
RK> [My 2cents based on experience with SparseM.]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
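
P.S. To make the two ideas in this thread concrete (collapsing duplicated index pairs by a user-chosen rule that defaults to summing, and a validity-style test that merely detects them), here is a minimal sketch. It is plain Python over (i, j, x) triplets, not R and not the Matrix or SparseM API; the names collapse_duplicates, has_duplicates and combine are hypothetical, and the four-entry data set is invented for illustration.

```python
from collections import defaultdict


def collapse_duplicates(i, j, x, combine=sum):
    """Merge triplets that share an (i, j) index pair.

    `combine` receives the list of values stored at one index pair and
    returns a single value. The default sums them; max, min, or a mean
    function could be passed instead.
    """
    if not (len(i) == len(j) == len(x)):
        raise ValueError("i, j and x must have the same length")
    groups = defaultdict(list)
    for row, col, val in zip(i, j, x):
        groups[(row, col)].append(val)
    # Order by column, then row, as a column-compressed layout would.
    keys = sorted(groups, key=lambda rc: (rc[1], rc[0]))
    rows = [r for r, _ in keys]
    cols = [c for _, c in keys]
    vals = [combine(groups[k]) for k in keys]
    return rows, cols, vals


def has_duplicates(i, j):
    """Validity-style check: True if any (i, j) pair occurs more than once."""
    pairs = list(zip(i, j))
    return len(set(pairs)) != len(pairs)


# Invented example: the pair (1, 1) appears twice.
i = [0, 1, 1, 2]
j = [0, 1, 1, 0]
x = [10.0, 1.0, 2.0, 5.0]

summed = collapse_duplicates(i, j, x)                # duplicates added
largest = collapse_duplicates(i, j, x, combine=max)  # larger value kept
```

The detection step is a single pass over the index pairs, but it has to be repeated whenever validity is checked, which is one way to picture the speed penalty discussed above.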