Sorry, that's my mistake, I should not have said 'long vector'; mine is just a normal vector. I'm not actually using a development version.

Best,
Steve

On Wed, Dec 5, 2012 at 4:22 PM, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
>
>
> And BTW, 'long vector' is a technical term in R: not 12,000, but more than 2
> billion elements. You will hear it a lot more in the run-up to the next
> 'minor' release of R (currently R-devel, maybe 2.16.0-to-be, which is the
> only version from which that quote comes that I am aware of).
>
> The posting guide asked for 'at a minimum' information: if you are using an
> unreleased development version of R you really must tell us (and should not
> be reporting to the R-help list).
>
>
>>
>> Sarah
>>
>> On Wed, Dec 5, 2012 at 3:53 PM, Stephen Politzer-Ahles
>> <politzerahl...@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> duplicated() does not seem to work for a long vector. For example, if
>>> you download the data from
>>> https://docs.google.com/open?id=0B6-m45Jvl3ZmNmpaSlJWMXo5bmc (a vector
>>> with about 12,000 numbers) and then run the following code which does
>>> duplicated() over the whole vector but just shows the last 30
>>> elements:
>>>
>>> data.frame( tail(verylong, 30), tail(duplicated(verylong), 30) )
>>>
>>> you'll see that at the end of the very long vector everything is
>>> listed as a duplicate of the preceding element (even though it
>>> shouldn't be). On the other hand, if you run the following code which
>>> just takes out the last 30 elements of the vector and does duplicated
>>> on them:
>>>
>>> data.frame( tail(verylong, 30), duplicated(tail(verylong, 30)) )
>>>
>>> you get the correct results (FALSE shows up wherever the value in the
>>> first column changes). Does anyone know why this happens, and if
>>> there's a fix? I notice the documentation for duplicated() says: "Long
>>> vectors are supported for the default method of duplicated, but may
>>> only be usable if nmax is supplied." But I've tried running this with
>>> a high value of nmax given, and it still gives me the same problem.
>>>
>>> So far the only way I've figured out to get this duplicated()-like
>>> vector is to use a for loop going through one item at a time, but that
>>> takes about a minute to run.
>>>
>>> Best,
>>> Steve Politzer-Ahles
>
>
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

--
Stephen Politzer-Ahles
University of Kansas
Linguistics Department
http://people.ku.edu/~sjpa/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
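
A note on the behaviour described in the original question: duplicated() marks an element TRUE if the same value occurs anywhere earlier in the vector, not only in the position immediately before it. One plausible reading of the report (an assumption, since the linked data is not reproduced here) is that the values in the last 30 positions had already appeared earlier in the roughly 12,000 elements, so every one of them is flagged. The sketch below uses a small made-up stand-in for `verylong`; the names `dup_full`, `dup_tail` and `same_as_prev` are illustrative only. It also shows one loop-free way to flag "same as the preceding element", which may be what was wanted.

```r
# Hypothetical stand-in for 'verylong': runs of values that recur later on.
verylong <- c(1, 1, 2, 2, 3, 1, 1, 2, 2, 3, 3)

# duplicated() over the whole vector: a value counts as a duplicate if it
# has been seen ANYWHERE earlier, so the recurring tail values are all TRUE.
dup_full <- tail(duplicated(verylong), 6)

# duplicated() over the tail only: earlier occurrences are not visible,
# so the first value of each run in the tail is FALSE.
dup_tail <- duplicated(tail(verylong, 6))

# Loop-free "equal to the preceding element" flag (assumes no NA values):
# compare elements 2..n against elements 1..(n-1); the first has no predecessor.
same_as_prev <- c(FALSE, verylong[-1] == verylong[-length(verylong)])

data.frame(value = tail(verylong, 6),
           dup_full,
           dup_tail,
           same_as_prev = tail(same_as_prev, 6))
```

If the goal is run boundaries rather than a logical flag, rle(verylong) gives the run lengths and values directly.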