Following Duncan's suggestion, I am forwarding the message below to R-devel.

vQ

-------- Original Message --------
Subject: Re: [R] Randomly remove condition-selected rows from a matrix
Date: Fri, 02 Jan 2009 10:34:52 -0500
From: Duncan Murdoch <murd...@stats.uwo.ca>
To: Wacek Kusnierczyk <waclaw.marcin.kusnierc...@idi.ntnu.no>
CC: R help <r-h...@stat.math.ethz.ch>
References: <79cafbdd-4bb8-4c9d-a0e9-54e280458...@gmail.com>
 <8b356f880812300920o19d18aeo47dc31f087c3...@mail.gmail.com>
 <da6ecc19-c786-4c02-b246-4b613726b...@gmail.com>
 <8b356f880812311042la28aef3t81ad09a3b14c...@mail.gmail.com>
 <495e2d95.9040...@idi.ntnu.no>

On 02/01/2009 10:07 AM, Wacek Kusnierczyk wrote:
> Stavros Macrakis wrote:
>> On Wed, Dec 31, 2008 at 12:44 PM, Guillaume Chapron
>> <carnivorescie...@gmail.com> wrote:
>>
>>>> m[-sample(which(m[,1]<8 & m[,2]>12),2),]
>>>>
>>> Supposing I sample only one row among the ones matching my criteria. Then
>>> consider the case where there is just one row matching this criterion. Sure,
>>> there is no need to sample, but the instruction would still be executed.
>>> Then if this row index is 15, my instruction becomes sample(15,1), and this
>>> can give me any row from 1 to 15, which is not correct. I have to add a
>>> condition in case there is only one row matching the criteria.
>>>
>> Yes, this is a (documented!) design flaw in 'sample' -- see the man page.
>>
>> For some reason, the designers of R have chosen to document the flaw
>> and leave it up to individual users to work around it rather than fix
>> it definitively. A related case is sample(c(),0), which gives an
>> error rather than giving an empty vector, though in general R deals
>> with empty vectors correctly (e.g. sum(c()) => 0).
>>
>>
>
> interestingly, ?sample says:
>
> "
>      'sample' takes a sample of the specified size from the elements of
>      'x' using either with or without replacement.
>
>        x: Either a (numeric, complex, character or logical) vector of
>           more than one element from which to choose, or a positive
>           integer.
>
>      If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
>      'x >= 1', sampling takes place from '1:x'. _Note_ that this
>      convenience feature may lead to undesired behaviour when 'x' is of
>      varying length 'sample(x)'. See the 'resample()' example below.
>
> "
>
> yet the following works, even though x has length 1 and is *not* numeric:
>
>     x = "foolme"
>     is.numeric(x)
>     sample(x, 1)
>     sample(x)
>
>     x = NA
>     is.numeric(NA)
>     sample(x, 1)
>     sample(x)
>
> is this a bug in the code, or a bug in the documentation?
>
>
>
>> To my mind, it is bizarre to have an important basic function which
>> works for some argument lengths but not others. The convenience of
>> being able to write sample(5,2) for sample(1:5,2) hardly seems worth
>> inflicting inconsistency on all users -- but perhaps one of the
>> designers of R/S can enlighten us on the design rationale here.
>>
>>
>
> hopefully.

This is more of an R-devel sort of question. My guess is that this is in
the S blue book, but I don't have a copy here to check.

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
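
For anyone who wants to try the behaviour discussed in this thread, here is a minimal sketch of the length-1 pitfall and one possible way around it, in the spirit of the resample() example that ?sample points to. The name safe_sample and the small matrix m are made up for illustration and are not part of base R; the sketch also assumes an R version that provides sample.int().

```r
# Hypothetical helper (not in base R): always treat x as the set of items
# to draw from, by sampling positions rather than values.
safe_sample <- function(x, size = length(x), ...) {
  x[sample.int(length(x), size = size, ...)]
}

# Illustrative matrix (an assumption): only row 3 satisfies the condition.
m <- cbind(c(9, 9, 3), c(20, 20, 20))
hits <- which(m[, 1] < 8 & m[, 2] > 12)   # a single index: 3

# Pitfall: a single numeric value is read as "sample from 1:x",
# so this may pick row 1, 2 or 3 rather than the matching row.
risky <- sample(hits, 1)

# Sampling positions always returns the one matching index.
safe <- safe_sample(hits, 1)

# Remove the selected row; drop = FALSE keeps the result a matrix.
m_reduced <- m[-safe, , drop = FALSE]

# The helper also handles non-numeric and empty input uniformly.
one_string <- safe_sample("foolme", 1)
nothing    <- safe_sample(c(), 0)         # zero-length result
```

One case the sketch leaves open: if no row matches at all, hits is empty, and negative indexing with an empty vector selects zero rows rather than leaving m unchanged, so that situation still needs an explicit length check.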