[R] R / G GUI freezes saving plot
Randomly, whenever I try to save a plot, R becomes unresponsive and has to be killed. This happens almost every time.

R version 3.2.4 (2016-03-10) -- "Very Secure Dishes"
Platform: x86_64-apple-darwin13.4.0 (64-bit)
R.app GUI 1.67 (7152) x86_64-apple-darwin13.4.0
OS X El Capitan 10.11.3
(Although the problem was present in previous versions)

How to prevent this?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to reach the column names in a huge .RData file without loading it
You *might* be able to get them from the raw file...

First, I don't quite know what "colnames" of an .RData file means. "colnames" are the column names of a matrix (or data frame), so I'll assume your .RData file contains exactly one data frame and you want the column names of it.

So let's create one of those:

    mydataframe = data.frame(mylongnamehere=runif(3), anotherlongname=runif(3),
                             z=runif(3), y=runif(3), aasdkjhasdkjhaskdj=runif(3))
    save(mydataframe, file="./test.RData")

Now I'm going to use some Unix utilities to see if there are any identifiable strings in the file. .RData files are by default compressed using `gzip`, so I'll `gunzip` the file and pipe the result into `strings`:

    $ gunzip -c test.RData | strings -t d
          0 RDX2
         35 mydataframe
        230 names
        251 mylongnamehere
        273 anotherlongname
        314 aasdkjhasdkjhaskdj
        347 row.names
        389 class
        410 data.frame

That has found the object name (mydataframe) and most of the column names, except the short ones, which are too short for `strings` to recognise. But if your names are long enough (4 or more chars, I think) they'll show up. Of course you'll have to filter them out from all the other string output, but they should all appear shortly after the word "names", since the colnames of a data frame are the "names" attribute of the data.

If you don't have a Unix or Mac machine handy you can get these utilities on Windows via Cygwin, but that's another story...

Barry

On Wed, Mar 16, 2016 at 3:59 PM, Lida Zeighami wrote:
> Hi,
> I have a huge .RData file and I need just to get the colnames of it. so is
> there any way to reach the column names without loading or reading the
> whole file?
> Since the file is so big and I need to repeat this process several times,
> so it takes so long to load the file first and then take the colnames!
> Thanks
[R] R count.points thinks projections are different
I am currently working with telemetry data for some cats:

    library(sp)     # CRS(), SpatialPointsDataFrame()
    library(dplyr)  # %>% and select()
    x <- read.csv("CCATS.csv")
    obs2 <- x[c("ID", "X", "Y")]
    dat_df <- obs2 %>% dplyr::select(ID) %>% as.data.frame()
    p4s <- "+proj=utm +zone=16 +ellps=clrk66 +datum=NAD27 +units=m +no_defs"
    p4s_crs <- CRS(p4s)
    xy <- obs2 %>% dplyr::select(X, Y) %>% as.matrix()
    xy_sp <- SpatialPointsDataFrame(xy, data = dat_df, proj4string = p4s_crs)

I also have some raster maps of the study area:

    > summary(habitat)
    Object of class SpatialPixelsDataFrame
    Coordinates:
            min       max
    x  328048.6  360028.8
    y 1841819.0 1874744.0
    Is projected: TRUE
    proj4string :
    [+proj=utm +zone=16 +ellps=clrk66 +datum=NAD27 +units=m +no_defs
    +nadgrids=@conus,@alaska,@ntv2_0.gsb,@ntv1_can.dat]
    Number of points: 13024
    Grid attributes:
       cellcentre.offset cellsize cells.dim
    s1          328139.5 181.7057       176
    s2         1842041.5 444.9324        74
    Data attributes:
     elevation         ecosystem        slope             aspect
     Min.   : 1.000    Min.   : 1.00    Min.   :0.0       Min.   :  0.0
     1st Qu.: 1.000    1st Qu.:15.00    1st Qu.:0.0       1st Qu.: 90.0
     Median : 2.000    Median :15.00    Median :0.04134   Median : 90.0
     Mean   : 2.263    Mean   :12.85    Mean   :0.05194   Mean   :132.5
     3rd Qu.: 3.000    3rd Qu.:15.00    3rd Qu.:0.08523   3rd Qu.:180.0
     Max.   :11.000    Max.   :18.00    Max.   :0.35079   Max.   :360.0
     NA's :18 NA's :84 NA's :84

When I try to use the count.points function I get the following message:

    Error in count.points(SpatialPoints(x), w) :
      different proj4string in w and xy

but

    > identical(proj4string(habitat), proj4string(xy_sp))
    [1] TRUE

I have also tried

    proj4string(xy_sp) <- proj4string(habitat)

and then running it again, but I still get the same error message.

Any help would be greatly appreciated.

Becky
[R] Plot help
Dear all:

I was wondering how I can modify the plot command below so that the y-axis displays the numbers in a 4 by 4 scale. It looks like the plot generated by the commands below shows the y-axis in a 5 by 5 scale:

    Values <- c(1/16, 1/8, 1/4, 1/2, 1, 2, 4, 8, 16)
    Values
    Log <- log2(Values)
    Log
    Index <- c(1:9)
    Index

    ##2) Plot
    plot (x= Index, y=Values, ylim= c(-16,16), pch= 19, col = "blue")
    points (Log, pch = 19, col="green")

Thanks.

Andre
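One possible way to do this, assuming "4 by 4 scale" means tick marks every 4 units on the y-axis, is to suppress the default axis with yaxt = "n" and draw it explicitly with axis(). This is only a sketch; the helper name `ticks` is introduced here for illustration and is not part of the original post.

```r
Values <- c(1/16, 1/8, 1/4, 1/2, 1, 2, 4, 8, 16)
Log    <- log2(Values)
Index  <- seq_along(Values)

# Tick positions every 4 units across the requested y range
ticks <- seq(-16, 16, by = 4)

# Draw the plot without the default y-axis, then add our own
plot(Index, Values, ylim = c(-16, 16), pch = 19, col = "blue", yaxt = "n")
axis(side = 2, at = ticks)
points(Index, Log, pch = 19, col = "green")
```

Changing `by = 4` to another step gives a different spacing without touching the plot call itself.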
Re: [R] plot numeric vs character
I haven't been following the thread, but if you want to use lattice xyplot:

    # create x values in their right position --
    # assuming equal spacing
    mydf$x = rep(1:3, each = 3)

    library(lattice)
    xyplot(num ~ x, mydf,
           scales = list(x = list(at = 1:3, labels = letters[1:3])),
           xlab = "")

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ivan Calandra
Sent: Thursday, 17 March 2016 23:26
To: R list
Cc: Anne CANER
Subject: Re: [R] plot numeric vs character

So it looks like there is no better "base" solution than Petr's code...
Thank you for your input anyway!

Ivan

--
Ivan Calandra, PhD
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calan...@univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 17/03/2016 13:04, John Kane wrote:
> Another approach using ggplot2 and shamelessly swiped from
> http://www.sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization.
>
> library(ggplot2)
> ggplot(mydf, aes(x=let, y=num)) +
>    geom_dotplot(binaxis='y', stackdir='center', dotsize = 0.5)
>
> John Kane
> Kingston ON Canada
>
>> -----Original Message-----
>> From: ivan.calan...@univ-reims.fr
>> Sent: Thu, 17 Mar 2016 11:29:52 +0100
>> To: r-help@r-project.org
>> Subject: [R] plot numeric vs character
>>
>> Dear useRs,
>>
>> I would like to plot data points in a simple scatterplot. I don't have a
>> lot of data per category, so a boxplot does not make sense.
>>
>> Here are some sample data:
>> mydf <- data.frame(let=rep(letters[1:3],each=3), num=rnorm(9),
>> stringsAsFactors=FALSE)
>>
>> I would like to do that, which throws an error, most likely because x is
>> character:
>> plot(mydf$let, mydf$num)
>>
>> If I convert to factor(), it plots a boxplot with no possibility (AFAIK)
>> to plot points:
>> mydf$let <- factor(mydf$let)
>> plot(mydf$let, mydf$num, type='p')
>>
>> I know I can use the function points() in a somewhat convoluted manner:
>> plot(mydf$num, xlim=c(1,3), type='n', xaxt='n')
>> axis(side=1, at=1:3, labels=levels(mydf$let))
>> points(as.numeric(mydf$let), mydf$num)
>>
>> Isn't there a simple(r) way? Maybe I just missed something obvious...
>>
>> Thank you in advance for your help,
>> Ivan
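For comparison with the lattice and ggplot2 suggestions above, base graphics also has stripchart(), which plots individual points per group. Nobody in the thread proposed it, so treat the following as an untested-on-your-data sketch; the sample data frame mirrors the one in the original question, with set.seed() added here only for reproducibility.

```r
set.seed(42)  # only so the sketch is reproducible
mydf <- data.frame(let = rep(letters[1:3], each = 3),
                   num = rnorm(9),
                   stringsAsFactors = FALSE)

# One column of points per category; a small jitter separates overlapping points
stripchart(num ~ let, data = mydf,
           vertical = TRUE, method = "jitter", jitter = 0.05,
           pch = 19, xlab = "", ylab = "num")
```

The formula interface groups by `let` directly, so no conversion to numeric positions or manual axis labelling is needed.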
Re: [R] How to reach the column names in a huge .RData file without loading it
However: if you need to repeat the process, as you wrote, you could store the column names in a separate object for future access after your first read.

B.

On Mar 16, 2016, at 12:59 PM, Lida Zeighami wrote:

> Thank you Bert and Frederic.
>
> On Wed, Mar 16, 2016 at 11:52 AM, Bert Gunter wrote:
>
>> Is it really a .Rdata file? If so, the answer is no, AFAIK, since
>> .Rdata files are serialized (binary) versions of e.g. worksheets that
>> can contain many different data objects. "colnames" has no meaning in
>> this context.
>>
>> Corrections welcome if I have it wrong!
>>
>> Cheers,
>> Bert
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Wed, Mar 16, 2016 at 8:59 AM, Lida Zeighami wrote:
>>> Hi,
>>> I have a huge .RData file and I need just to get the colnames of it. so is
>>> there any way to reach the column names without loading or reading the
>>> whole file?
>>> Since the file is so big and I need to repeat this process several times,
>>> so it takes so long to load the file first and then take the colnames!
>>>
>>> Thanks
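The suggestion above could look roughly like this. It is a minimal sketch, assuming (as earlier in the thread) that the .RData file holds exactly one data frame; the file paths and the small stand-in data frame are hypothetical.

```r
# Hypothetical paths; a small data frame stands in for the huge one
big_file   <- file.path(tempdir(), "big.RData")
names_file <- file.path(tempdir(), "big_colnames.rds")

mydataframe <- data.frame(mylongnamehere = runif(3),
                          anotherlongname = runif(3),
                          z = runif(3))
save(mydataframe, file = big_file)
rm(mydataframe)

# Slow step, done once: load into a scratch environment and keep only the names
scratch <- new.env()
loaded  <- load(big_file, envir = scratch)  # character vector of object names
saveRDS(colnames(scratch[[loaded[1]]]), names_file)
rm(scratch)

# Fast step, repeated as often as needed: read just the small names file
cols <- readRDS(names_file)
print(cols)
```

Loading into a separate environment avoids overwriting objects in the workspace, and the names file stays tiny however large the data are.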
[R] [R-pkgs] gpuR 1.1.0 Release
Dear R users,

The next release of gpuR (1.1.0) has been accepted to CRAN (http://cran.r-project.org/package=gpuR).

There have been multiple additions including:

1. Scalar operations for gpuMatrix/vclMatrix objects (e.g. 2 * X)
2. Unary '-' operator added (e.g. -X)
3. 'slice' and 'block' methods for vector & matrix objects respectively
4. 'deepcopy' methods
5. 'abs', 'max', 'min' methods added
6. 'cbind' & 'rbind' methods added for matrices
7. 't' method
8. 'distance' method for pairwise distances (euclidean and sqEuclidean)

Introductory vignette can be found at https://cran.r-project.org/web/packages/gpuR/vignettes/gpuR.pdf

Help with installation can be found at https://github.com/cdeterman/gpuR/wiki

Bug reports, suggestions, and feature requests are appreciated at https://github.com/cdeterman/gpuR/issues

Happy GPU computing,
Charles

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] R / G GUI freezes saving plot
No, nothing particular at all I would say. I generate plots either with functions from base R (such as plot()) or ggplot2. Plotting functions are fine, so long as I don’t try to save the plots.

I also noted that the issue is more frequent when I save the plots from the menu (File > Save as) than when I save them from the command line using, for example, pdf(). (But it still happens sometimes when saving files from the command line.)

From: Jordan Meyer
Date: Friday 18 March 2016 at 14:30
To: dp
Subject: Re: [R] R / G GUI freezes saving plot

> Are there any particular types of plotting you are doing when it becomes
> unresponsive? If so, it would be helpful to see an example.
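For readers unfamiliar with the command-line route mentioned above, saving through a graphics device looks like the sketch below. It only illustrates the pdf() workflow; it is not claimed to cure the freeze, and the output path is a hypothetical placeholder.

```r
out_file <- file.path(tempdir(), "myplot.pdf")  # hypothetical output location

pdf(out_file, width = 7, height = 5)  # open a PDF device instead of the screen
plot(1:10, main = "Example plot")     # draw as usual; output goes to the file
dev.off()                             # close the device so the file is written
```

The file is only complete once dev.off() has been called, which is a common source of empty or unreadable output.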
Re: [R] plot numeric vs character
Hi

I am not an expert, but it probably results from the convenience plot.factor method, which calls boxplot when used for factor data.

See
    graphics:::plot.factor
    graphics:::boxplot.default

I believe that you can do something with ggplot like (untested)

    p <- ggplot(mydf, aes(x=let, y=num))
    p + geom_point()
    p + geom_jitter()

or change the plot.factor method, which is beyond my level.

I considered dotchart, but it is used for a different data structure.

Cheers
Petr

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ivan
> Calandra
> Sent: Thursday, March 17, 2016 12:01 PM
> To: R list
> Cc: Anne CANER
> Subject: Re: [R] plot numeric vs character
>
> Thanks Petr,
>
> You're right, no need for points().
> But is there an even simpler way? I still find it strange that plot()
> does not allow points to be plotted instead of boxplots when x is a
> factor (because boxplots do not always make sense). Is there a good
> reason for that?
>
> Ivan
>
> On 17/03/2016 11:48, PIKAL Petr wrote:
> > Hi
> >
> > It would be easier if mydf$let was factor
> >
> > plot(as.numeric(factor(mydf$let)), mydf$num, xaxt="n")
> > axis(1, at=1:3, levels(factor(mydf$let)))
> >
> > Cheers
> > Petr
Re: [R] Reshaping an array - how does it work in R
Arrays are vectors stored in column-major order. So the answer is: reindexing.

Does this make it clear:

    > v <- array(1:24,dim=2:4)
    > as.vector(v)
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
    > v
    , , 1

         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6

    , , 2

         [,1] [,2] [,3]
    [1,]    7    9   11
    [2,]    8   10   12

    , , 3

         [,1] [,2] [,3]
    [1,]   13   15   17
    [2,]   14   16   18

    , , 4

         [,1] [,2] [,3]
    [1,]   19   21   23
    [2,]   20   22   24

    > w <- array(as.vector(v),dim=c(6,4)) ## you would use v instead of w for the assignment
    > w
         [,1] [,2] [,3] [,4]
    [1,]    1    7   13   19
    [2,]    2    8   14   20
    [3,]    3    9   15   21
    [4,]    4   10   16   22
    [5,]    5   11   17   23
    [6,]    6   12   18   24
    > identical(as.vector(w), as.vector(v))
    [1] TRUE

However, copying may occur anyway as part of R's semantics. Others will have to help you on that, as the details here are beyond me.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Mar 18, 2016 at 2:28 PM, Roy Mendelssohn - NOAA Federal wrote:
> Hi All:
>
> I am working with a very large array. if noLat is the number of latitudes,
> noLon the number of longitudes and noTime the number of time periods, the
> array is of the form:
>
> myData[noLat, no Lon, noTime].
>
> It is read in this way because that is how it is stored in a (series) of
> netcdf files. For the analysis I need to do, I need instead the array:
>
> myData[noLat*noLon, noTime]. Normally this would be easy:
>
> myData<- array(myData,dim=c(noLat*noLon,noTime))
>
> My question is how does this command work in R - does it make a copy of the
> existing array, with different indices for the dimensions, or does it just
> redo the indices and leave the given array as is? The reason for this
> question is my array is 30GB in memory, and I don’t have enough space to have
> a copy of the array in memory. If the latter I will have to figure out a
> work around to bring in only part of the data at a time and put it into the
> proper locations.
>
> Thanks,
>
> -Roy
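To make the reindexing concrete, here is a small sketch of the index arithmetic implied by column-major storage, using the variable names from the question (noLat, noLon, noTime) with toy sizes. Element [i, j, k] of the 3-D array ends up in row (j - 1) * noLat + i and column k of the flattened matrix.

```r
noLat <- 2; noLon <- 3; noTime <- 4          # toy sizes, not the real grid
v <- array(1:24, dim = c(noLat, noLon, noTime))
w <- array(v, dim = c(noLat * noLon, noTime))

# Check the mapping for every element, not just one
for (i in seq_len(noLat)) for (j in seq_len(noLon)) for (k in seq_len(noTime)) {
  stopifnot(v[i, j, k] == w[(j - 1) * noLat + i, k])
}

v[1, 2, 3]                    # 15
w[(2 - 1) * noLat + 1, 3]     # 15, the same element
```

So latitude varies fastest down the rows of the flattened matrix, followed by longitude, while each time step stays in its own column.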
Re: [R] Warning message: Computation failed in `stat_bin()`: attempt to apply non-function
> ggplot(x=aes(friend_count), data=pf) + geom_histogram()

The x= in the above statement is wrong.

    ggplot(aes(friend_count), data=pf) + geom_histogram()

will work, but it looks funny. The more common way to write the command would likely be:

    ggplot(pf, aes(friend_count)) + geom_histogram()

The stat_bin() message is just a warning. Your data set has ~ 99,000 rows of data and geom_histogram()'s default is 30 bins, which is probably too few for you to see what is happening. Change the default binwidth to something a bit more manageable, or perhaps use geom_density() to look at the distribution.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: shahab.mok...@gmail.com
> Sent: Thu, 17 Mar 2016 19:01:19 +0100
> To: r-help@r-project.org
> Subject: [R] Warning message: Computation failed in `stat_bin()`: attempt
> to apply non-function
>
> Hi,
>
> I am trying to plot a sample dataset using ggplot2, but I keep getting
> the following error message and an empty plot!
> Apparently something is wrong in the dataset, but what?
>
> R :
> pf<-read.csv('pseudo_facebook.tsv', sep='\t')
>
> ggplot(x=aes(friend_count), data=pf) + geom_histogram()
>
> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
> Warning message:
> Computation failed in `stat_bin()`:
> attempt to apply non-function
>
> The dataset can be found at :
> https://s3.amazonaws.com/udacity-hosted-downloads/ud651/pseudo_facebook.tsv
>
> thanks,
> /Shahab
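Putting the advice above together, a call with an explicit binwidth might look like the following sketch. The data frame here is simulated as a stand-in for the Facebook file (only the column name friend_count is taken from the thread), the binwidth of 10 is an arbitrary illustrative choice, and the plot is skipped if ggplot2 is not installed.

```r
set.seed(1)
# Simulated stand-in for pseudo_facebook.tsv; only the column name matches
pf <- data.frame(friend_count = rpois(1000, lambda = 200))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Data first, mapping second, and an explicit binwidth instead of bins = 30
  p <- ggplot(pf, aes(x = friend_count)) +
    geom_histogram(binwidth = 10)
  print(p)
}
```

Swapping geom_histogram(binwidth = 10) for geom_density() gives the smoothed view of the distribution mentioned in the reply.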
Re: [R] What "method" does sort() use?
On 18 Mar 2016, at 10:02 , Patrick Connolly wrote:

> I don't follow why this happens:
>
>> sort(c(LETTERS[1:5], letters[1:5]))
> [1] "a" "A" "b" "B" "c" "C" "d" "D" "e" "E"
>
> The help for sort() says:
>
> method: character string specifying the algorithm used. Not
> available for partial sorting. Can be abbreviated.
>
> But what are the methods available? The help mentions xtfrm but that
> doesn't illuminate, I'd have thought that at least by default it would
> have something to do with ASCII codes. But that's not the case since
> all the uppercase ones would be before the lowercase ones.
>
> I know something different is happening but I don't know what it is
> (do you, Mr Jones?). Apologies to Bob Dylan.
>

Um, read _all_ of the help file?

    sort.int(x, partial = NULL, na.last = NA, decreasing = FALSE,
             method = c("shell", "quick"), index.return = FALSE)

[snip]

Method "shell" uses Shellsort (an O(n^{4/3}) variant from Sedgewick (1986)). If x has names a stable modification is used, so ties are not reordered. (This only matters if names are present.)

Method "quick" uses Singleton (1969)'s implementation of Hoare's Quicksort method and is only available when x is numeric (double or integer) and partial is NULL. (For other types of x Shellsort is used, silently.) It is normally somewhat faster than Shellsort (perhaps 50% faster on vectors of length a million and twice as fast at a billion) but has poor performance in the rare worst case. (Peto's modification using a pseudo-random midpoint is used to make the worst case rarer.) This is not a stable sort, and ties may be reordered.

Factors with less than 100,000 levels are sorted by radix sorting when method is not supplied: see sort.list.

-pd

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com
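A side note that goes beyond what the thread states: the interleaved "a" "A" "b" "B" ordering comes from the collating sequence of the current locale rather than from the sorting algorithm, since R compares strings using locale collation by default. In R 3.3.0 and later (newer than the version discussed here), sort() also accepts method = "radix", which orders character vectors as in the C locale. A minimal sketch under that assumption:

```r
x <- c(letters[1:5], LETTERS[1:5])

# Default: order depends on the locale's collation rules (see ?Comparison)
sort(x)

# Radix method (R >= 3.3.0): C-locale order, so uppercase ASCII sorts first
sort(x, method = "radix")
# [1] "A" "B" "C" "D" "E" "a" "b" "c" "d" "e"
```

This is the ASCII-style result the original poster expected; the output of the first call is deliberately not shown because it varies between locales.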
Re: [R] Reshaping an array - how does it work in R
Hi Henrik:

I want to do what in oceanography is called an EOF, which is just a PCA analysis. Unless I am missing something, in R I need to flatten my 3-D matrix into a 2-D data matrix. I can fit the entire 30GB matrix into memory, and I believe I have enough memory to do the PCA by constraining the number of components returned. What I don’t think I have enough memory for is an operation that makes a copy of the matrix. As I said, in theory I know how to do the flattening, it is a simple command, but in practice I don’t have enough memory.

So I spent the afternoon rewriting my code to read in parts of the data at a time and then put those in the appropriate places of an already flattened matrix of the appropriate size. In case someone is wondering, on the 3D grid the matrix is [1001,1001,3650]. So I create an empty matrix of size [1001*1001,3650], read in a slice of the lat-lon grid, and map those values into the appropriate places in the flattened matrix. By reading in appropriately sized chunks my memory usage is not pushed too far.

-Roy

> On Mar 18, 2016, at 7:37 PM, Henrik Bengtsson wrote:
>
> On Fri, Mar 18, 2016 at 3:15 PM, Roy Mendelssohn - NOAA Federal wrote:
>> Thanks. That is what I needed to know. I don’t want to play around with
>> some of the other suggestions, as I don’t totally understand what they do,
>> and don’t want to risk messing up something and not be aware of it.
>>
>> There is a way to read in the data chunks at a time and reshape it and put
>> it into the (reshaped) larger array, harder to program but probably worth
>> the pain to be certain of what I am doing.
>
> I recommend this approach; whenever I work with reasonably large data
> (that may become even larger in the future), I try to implement a
> constant-memory version of the algorithm, which often comes down to
> processing data in chunks. The simplest version of this is to read
> all data into memory and then subset, but if you can read the data in
> chunks that is even better.
>
> Though, I'm curious as to what matrix operations you wish to perform.
> Because if you wish to do regular summation, then base::.rowSums() and
> base::.colSums() allow you to override the default dimensions on the
> fly without inducing extra copies, e.g.
>
>> X <- array(1:24, dim=c(2,3,4))
>> .rowSums(X, m=6, n=4)
> [1] 40 44 48 52 56 60
>> rowSums(matrix(X, nrow=6, ncol=4))
> [1] 40 44 48 52 56 60
>
> For other types of calculations, you might want to look at
> matrixStats. It has partial(*) support for overriding the default
> dimensions in a similar fashion. For instance,
>
>> library("matrixStats")
>> rowVars(X, dim.=c(6,4))
>
> The above effectively calculates rowVars(matrix(X, nrow=6, ncol=4))
> without making copies.
>
> (*) By partial I mean that this is a feature that hasn't been pushed
> through to all of matrixStats functions, cf.
> https://github.com/HenrikBengtsson/matrixStats/issues/83.
>
> Cheers,
>
> Henrik
> (author of matrixStats)
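The chunked approach Roy describes can be sketched on a toy grid as follows. This is an illustration under stated assumptions, not his actual code: `read_slab()` is a hypothetical stand-in for a netCDF read that returns all latitudes and all times for a few longitudes, and the small in-memory array `full` exists only so the result can be checked.

```r
noLat <- 4; noLon <- 5; noTime <- 6   # toy sizes; the thread uses 1001 x 1001 x 3650

# Stand-in for the data on disk, kept only to verify the result
full <- array(seq_len(noLat * noLon * noTime), dim = c(noLat, noLon, noTime))

# Hypothetical reader: returns the [all lat, selected lon, all time] slab
read_slab <- function(lonIdx) full[, lonIdx, , drop = FALSE]

# Pre-allocate the flattened matrix once, then fill it chunk by chunk
flat <- matrix(NA_integer_, nrow = noLat * noLon, ncol = noTime)
chunk <- 2                             # number of longitudes per read
for (start in seq(1, noLon, by = chunk)) {
  lonIdx <- start:min(start + chunk - 1, noLon)
  slab   <- read_slab(lonIdx)
  # Rows of the flattened matrix that these longitudes occupy (column-major)
  rows <- as.vector(outer(seq_len(noLat), (lonIdx - 1) * noLat, `+`))
  flat[rows, ] <- matrix(slab, nrow = length(rows), ncol = noTime)
}

# Same result as flattening the whole array in one go
stopifnot(identical(flat, matrix(full, nrow = noLat * noLon, ncol = noTime)))
```

Only one slab is held alongside the pre-allocated matrix at any time, so peak memory is roughly the flattened matrix plus one chunk.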
Re: [R] Reshaping an array - how does it work in R
> On Mar 18, 2016, at 2:56 PM, Bert Gunter wrote:
>
> However copying may occur anyway as part of R's semantics. Others will
> have to help you on that, as the details here are beyond me.
>
> Cheers,
> Bert

Hi Bert:

Thanks for your response. The only part I was concerned with is whether a copy was made, that is, what my memory usage would be. Sorry if that wasn’t clear in the original.

-Roy

**
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new address and phone***
110 Shaffer Road
Santa Cruz, CA 95060
Phone: (831)-420-3666 Fax: (831) 420-3980
e-mail: roy.mendelss...@noaa.gov www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
Re: [R] Reshaping an array - how does it work in R
Thanks. That is what I needed to know. I don’t want to play around with some of the other suggestions, as I don’t totally understand what they do, and don’t want to risk messing something up without being aware of it.

There is a way to read in the data a chunk at a time, reshape each chunk, and put it into the (reshaped) larger array. That is harder to program but probably worth the pain to be certain of what I am doing.

I had a feeling a copy was made, just wanted to make certain of it.

Thanks again,

-Roy

> On Mar 18, 2016, at 2:56 PM, Dénes Tóth wrote:
>
> Hi Roy,
>
> R (usually) makes a copy if the dimensionality of an array is modified, even
> if you use this syntax:
>
> x <- array(1:24, c(2, 3, 4))
> dim(x) <- c(6, 4)
>
> See also ?tracemem, ?data.table::address, ?pryr::address and other tools to
> trace if an internal copy is done.
>
> Workaround: use data.table::setattr or bit::setattr to modify the dimensions
> in place (i.e., without making a copy). Risk: if you modify an object by
> reference, all other objects which point to the same memory address will be
> modified silently, too.
>
> HTH,
> Denes
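The chunked approach Roy describes (read part of the data, reshape it, and put it into the already reshaped larger array) might look roughly like the sketch below. This is only an illustration under stated assumptions: `read_time_slice()` is a hypothetical stand-in for whatever netCDF read is actually used, and the sizes are made up.

```r
# Made-up sizes, standing in for the real dimensions.
noLat <- 2; noLon <- 3; noTime <- 4

# Hypothetical reader (an assumption, not a real netCDF call): returns the
# noLat x noLon slice for time step k. Here it just generates numbers.
read_time_slice <- function(k) {
  matrix(k * 100 + seq_len(noLat * noLon), nrow = noLat, ncol = noLon)
}

# Allocate the target once, already in its final two-dimensional shape.
flat <- matrix(NA_real_, nrow = noLat * noLon, ncol = noTime)

# Fill one column per time step, so only one slice is read at a time.
for (k in seq_len(noTime)) {
  flat[, k] <- as.vector(read_time_slice(k))
}
```

Because `as.vector()` flattens each slice in column-major order, the rows of `flat` are ordered the same way as they would be after `array(myData, dim = c(noLat*noLon, noTime))`.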
Re: [R] Reshaping an array - how does it work in R
On Fri, Mar 18, 2016 at 3:15 PM, Roy Mendelssohn - NOAA Federal wrote:
> Thanks. That is what I needed to know. I don’t want to play around with
> some of the other suggestions, as I don’t totally understand what they do,
> and don’t want to risk messing up something and not be aware of it.
>
> There is a way to read in the data chunks at a time and reshape it and put
> it into the (reshaped) larger array, harder to program but probably worth the
> pain to be certain of what I am doing.

I recommend this approach; whenever I work with reasonably large data (that may become even larger in the future), I try to implement a constant-memory version of the algorithm, which often comes down to processing the data in chunks. The simplest version of this is to read all the data into memory and then subset it, but if you can read the data in chunks, that is even better.

Though, I'm curious as to what matrix operations you wish to perform. Because if you wish to do regular summation, then base::.rowSums() and base::.colSums() allow you to override the default dimensions on the fly without inducing extra copies, e.g.

    > X <- array(1:24, dim=c(2,3,4))
    > .rowSums(X, m=6, n=4)
    [1] 40 44 48 52 56 60
    > rowSums(matrix(X, nrow=6, ncol=4))
    [1] 40 44 48 52 56 60

For other types of calculations, you might want to look at matrixStats. It has partial(*) support for overriding the default dimensions in a similar fashion. For instance,

    > library("matrixStats")
    > rowVars(X, dim.=c(6,4))

The above effectively calculates rowVars(matrix(X, nrow=6, ncol=4)) without making copies.

(*) By partial I mean that this is a feature that hasn't been pushed through to all of the matrixStats functions, cf. https://github.com/HenrikBengtsson/matrixStats/issues/83.

Cheers,

Henrik
(author of matrixStats)

> I had a feeling a copy was made, just wanted to make certain of it.
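To make the chunked, constant-memory idea from Henrik's reply concrete, here is a minimal sketch that combines it with base::.rowSums(). It reuses the small X from his example; the chunk size is arbitrary, and `total`, `chunk_size` and `idx` are names chosen only for illustration. Each chunk is a subset of X along the third dimension, so the subsetting itself copies that chunk but not the whole array.

```r
X <- array(1:24, dim = c(2, 3, 4))
m <- prod(dim(X)[1:2])     # rows of the implied 6 x 4 matrix
chunk_size <- 2            # arbitrary number of time steps per chunk

total <- numeric(m)
for (start in seq(1, dim(X)[3], by = chunk_size)) {
  idx <- start:min(start + chunk_size - 1, dim(X)[3])
  chunk <- X[, , idx, drop = FALSE]     # 2 x 3 x length(idx) sub-array
  # Treat the chunk as an m x length(idx) matrix without reshaping it.
  total <- total + .rowSums(chunk, m = m, n = length(idx))
}
print(total)
```

The accumulated `total` should agree with `.rowSums(X, m=6, n=4)` from the message above, since row sums over all columns are just the sum of the row sums over each block of columns.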
Re: [R] Reshaping an array - how does it work in R
R always makes a copy for this kind of operation. There are some operations that don't make copies, but I don't think this one qualifies.
--
Sent from my phone. Please excuse my brevity.

On March 18, 2016 2:28:35 PM PDT, Roy Mendelssohn - NOAA Federal wrote:
>Hi All:
>
>I am working with a very large array. if noLat is the number of
>latitudes, noLon the number of longitudes and noTime the number of
>time periods, the array is of the form:
>
>myData[noLat, noLon, noTime].
>
>It is read in this way because that is how it is stored in a (series)
>of netcdf files. For the analysis I need to do, I need instead the
>array:
>
>myData[noLat*noLon, noTime]. Normally this would be easy:
>
>myData<- array(myData,dim=c(noLat*noLon,noTime))
>
>My question is how does this command work in R - does it make a copy of
>the existing array, with different indices for the dimensions, or does
>it just redo the indices and leave the given array as is? The reason
>for this question is my array is 30GB in memory, and I don’t have
>enough space to have a copy of the array in memory. If the latter I
>will have to figure out a work around to bring in only part of the data
>at a time and put it into the proper locations.
>
>Thanks,
>
>-Roy
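One way to check the copying behaviour on a given R installation is tracemem(), which was mentioned earlier in the thread. The sketch below is hedged: tracemem() only works if R was built with memory profiling, so the code checks capabilities("profmem") first, and whether a message appears may depend on the R version. No expected output is shown for that reason.

```r
x <- array(1:24, dim = c(2, 3, 4))

# tracemem() needs an R build with memory profiling support.
can_trace <- capabilities("profmem")
if (can_trace) invisible(tracemem(x))

# If R duplicates the traced object here, a line starting with
# "tracemem[" is printed to the console.
dim(x) <- c(6, 4)

if (can_trace) untracemem(x)
```

Note that tracemem() reports duplications of the traced object. A call such as array(x, dim = c(6, 4)) builds a new object, so the absence of a tracemem message there should not be read as proof that no additional memory was allocated.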