Re: [R] problem with labeling plots, possibly in font defaults
See http://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#I-see-no-text-in-a-Quartz-plot_0021

And the default font is not serif ... that FAQ says it is Arial, but I do not know if that is current. Mac-specific questions to R-sig-mac, please. (This must be Mac-specific, as the default device, quartz(), is.)

On 07/08/2014 23:24, David Winsemius wrote:
> On Aug 7, 2014, at 12:59 PM, Tim Blass wrote:
>> Hello, I am using R 3.1.1 on a (four year old) MacBook, running OS X 10.9.4. I just tried making and labeling a plot as follows:
>>   x <- rnorm(10)
>>   y <- rnorm(10)
>>   plot(x, y)
>>   title(main = "random points")
>> which produces a scatter plot of the random points, but without the title and without any numbers along the axes. If I then run
>>   par(family = "sans")
>>   plot(x, y, main = "plot title")
>> the plot has the title and the numbers on the axes ('x' and 'y' also appear as default labels for the axes). I do not know what is going on, but maybe there is some problem in the default font settings (I don't know if that could be an R issue or an issue specific to my Mac)?
> It hasn't happened to me recently (since updating from Leopard and Snow Leopard to Lion), but it used to happen pretty frequently that I would get a duplicate font that printed empty square boxes. (I wasn't the only one. It got reported several times on R-SIG-Mac.) One could fix that sort of problem by deleting the offending duplicate entry using Font Book.app. Since this is happening with the default serif font, you would probably find the duplicate in Times. (It used to be happening to me with Symbol.)
>   quartzFonts()$serif
>   [1] "Times-Roman" "Times-Bold" "Times-Italic" "Times-BoldItalic"
>> This is clearly not a big problem (at least for now) since I can put labels on plots by running par(), but if it is indicative of a larger underlying problem (or if there is a simple fix) I would like to know. Thank you!
> David Winsemius
> Alameda, CA, USA

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
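[Editor's note] A session-level sketch of the workaround discussed above, assuming the quartz() device (Mac-only; quartzFonts() and quartzFont() exist only where that device does):

```r
# Inspect the fonts mapped to each family on the quartz() device,
# then either switch families or remap the broken one for the session.
quartzFonts()$serif   # the four Times faces mapped to the serif family

# Switch the default family for subsequent plots ...
par(family = "sans")

# ... or remap "serif" to a font known to render, e.g. Helvetica:
quartzFonts(serif = quartzFont(c("Helvetica", "Helvetica-Bold",
                                 "Helvetica-Oblique",
                                 "Helvetica-BoldOblique")))
```

This only papers over the problem for one session; deleting the duplicate font in Font Book.app, as suggested above, is the durable fix.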
Re: [R] Applying Different Predictive Models over Different Geographic subsets of a RasterStack
Hi Mitchell, It's a good suggestion, and I had thought of this, but one concern is that the model fit at the end of the terminal nodes would need to be a generalized linear model (the predicted outcomes are factors). Given that the models need to be fit automatically without human guidance, I'm not keen on using a GLM with the types of predictor variables I have... mobForest is a take-off on mob, but my initial testing indicated that it took a VERY long time to fit the models. Thanks! Craig

On Thu, Aug 7, 2014 at 8:10 PM, Mitchell Maltenfort wrote:
> I don't know this particular package well, but I believe "party" contains something called "mob" which creates a regression tree terminating in different models at each node. Could that be adapted to your project?
>
> On Thursday, August 7, 2014, Craig Aumann wrote:
>> I'm struggling with the best way to apply different predictive models over different geographical areas of a raster stack.
>>
>> The context of the problem is that different predictive models are developed within different polygonal regions of the overall study area. Each model needs to be used to predict an outcome for just the geographic area for which it was developed. Every pixel has one and only one predictive model, but the model changes across different regions of the landscape. The models come from a "random forest" fit.
>>
>> The problem is that the RasterStack is rather large both in terms of number of pixels and also the number of layers which the predictive model needs to use. If the problem were smaller, there are a number of things I could "get away with" in terms of how I would do this, but given the problem size, I need a more cunning solution.
>>
>> Ideally, I would like to only call predict from the raster package just once, and have the predict function call the right model based on the geographical location of the pixel. However, it is not clear that this is possible with the raster package, or if it is possible, how to implement it efficiently.
>>
>> Any ideas or suggestions greatly appreciated.
>>
>> Cheers!
>> Craig
>
> --
> Ersatzistician and Chutzpahthologist
>
> I can answer any question. "I don't know" is an answer. "I don't know yet" is a better answer.
>
> "I can write better than anybody who can write faster, and I can write faster than anybody who can write better" -- A. J. Liebling
Re: [R] Applying Different Predictive Models over Different Geographic subsets of a RasterStack
I don't know this particular package well, but I believe "party" contains something called "mob" which creates a regression tree terminating in different models at each node. Could that be adapted to your project? On Thursday, August 7, 2014, Craig Aumann wrote: > I'm struggling with the best way to apply different predictive models over > different geographical areas of a raster stack. > > The context of the problem is that different predictive models are > developed within different polygonal regions of the overall study area. > Each model needs to be used to predict an outcome for just the geographic > area for which it was developed. Every pixel has one and only one > predictive model, but the model changes across different regions of the > landscape. The models come from a "random forest" fit. > > The problem is that the rasterstack is rather large both in terms of number > of pixels and also the number of layers which the predictive model needs to > use. If the problem were smaller, there are a number of things I could > "get away with" in terms of how I would do this, but given the problem > size, I need a more cunning solution. > > Ideally, I would like to only call predict from the package Raster just > once, and have the predict function call the right model based on the > geographical location of the pixel. However, not clear that this is > possible with the Raster Package, or if it is possible how to implement it > efficiently. > > Any ideas or suggestions greatly appreciated. > > Cheers! > Craig > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ersatzistician and Chutzpahthologist I can answer any question. "I don't know" is an answer. "I don't know yet" is a better answer. 
"I can write better than anybody who can write faster, and I can write faster than anybody who can write better" -- A. J. Liebling
[R] Applying Different Predictive Models over Different Geographic subsets of a RasterStack
I'm struggling with the best way to apply different predictive models over different geographical areas of a raster stack.

The context of the problem is that different predictive models are developed within different polygonal regions of the overall study area. Each model needs to be used to predict an outcome for just the geographic area for which it was developed. Every pixel has one and only one predictive model, but the model changes across different regions of the landscape. The models come from a "random forest" fit.

The problem is that the RasterStack is rather large, both in terms of the number of pixels and the number of layers which the predictive model needs to use. If the problem were smaller, there are a number of things I could "get away with" in terms of how I would do this, but given the problem size, I need a more cunning solution.

Ideally, I would like to call predict from the raster package just once, and have the predict function call the right model based on the geographical location of the pixel. However, it is not clear that this is possible with the raster package, or if it is possible, how to implement it efficiently.

Any ideas or suggestions greatly appreciated.

Cheers!
Craig
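[Editor's note] One possible shape for the per-region prediction, sketched with the raster package. This is untested at the scale described, and all object names (stk, polys, models, the "zone" field) are hypothetical stand-ins for the poster's data:

```r
# Sketch: rasterize the model polygons into a zone layer, then predict
# zone by zone on the masked stack and mosaic the results.
library(raster)

# zones: one integer per pixel identifying which model applies there
zones <- rasterize(polys, stk[[1]], field = "zone")

# For each zone i, blank out every other pixel, predict, and collect
preds <- lapply(seq_along(models), function(i) {
  sub <- mask(stk, zones, maskvalue = i, inverse = TRUE)  # keep zone i only
  predict(sub, models[[i]], na.rm = TRUE)
})

# Mosaic the per-zone predictions back into one layer
out <- do.call(merge, preds)
```

Because the zones are disjoint, each pixel is predicted exactly once; memory pressure comes from the masked copies, which raster keeps on disk for large objects.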
Re: [R] Output from file.info()$mtime
Are you aware you have so-called smart quotes in your R code? That can cause all sorts of interesting errors, though I don't get the exact one you report. Sarah On Thu, Aug 7, 2014 at 6:37 PM, Fisher Dennis wrote: > R 3.1.1 > OS X (and Windows) > > Colleagues > > I have some code that manages files. Previously (as late as 3.1.0), the > command: > file.info(FILENAME)$mtime == “” > yielded T/F > > Now, it triggers an error: > Error in as.POSIXlt.character(x, tz, ...) : > character string is not in a standard unambiguous format > > I looked through Peter Dalgaard’s list of changes in 3.1.1 and I cannot find > anything that would explain the change between versions. I have fixed the > problem. However, I am concerned that other problems may be lurking (i.e., > the changes might affect other commands). > > Of note, I ran: > str(file.info(FILENAME)$mtime) > in both versions of R and the results did not differ > > Can anyone explain what changed so that I can search my code efficiently? > > Thanks in advance. > > Dennis > > > Dennis Fisher MD > P < (The "P Less Than" Company) > Phone: 1-866-PLessThan (1-866-753-7784) > Fax: 1-866-PLessThan (1-866-753-7784) > www.PLessThan.com > -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
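[Editor's note] A minimal, self-contained illustration of both points in this thread: why the curly quotes matter, and why comparing an mtime to a string now errors:

```r
# The curly quotes (U+201C/U+201D) are different characters from the
# ASCII quote ", so a "smart-quoted" string is not the empty string:
identical('\u201c\u201d', "")        # FALSE

# Comparing a POSIXct value with a character string makes R try to
# parse the string as a date-time, which errors for "":
mtime <- Sys.time()
res <- try(mtime == "", silent = TRUE)
inherits(res, "try-error")           # TRUE in recent R

# A robust "no mtime available" check avoids the comparison entirely:
is.na(file.info("no-such-file")$mtime)   # TRUE for a missing file
```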
[R] Output from file.info()$mtime
R 3.1.1 OS X (and Windows) Colleagues I have some code that manages files. Previously (as late as 3.1.0), the command: file.info(FILENAME)$mtime == “” yielded T/F Now, it triggers an error: Error in as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format I looked through Peter Dalgaard’s list of changes in 3.1.1 and I cannot find anything that would explain the change between versions. I have fixed the problem. However, I am concerned that other problems may be lurking (i.e., the changes might affect other commands). Of note, I ran: str(file.info(FILENAME)$mtime) in both versions of R and the results did not differ Can anyone explain what changed so that I can search my code efficiently? Thanks in advance. Dennis Dennis Fisher MD P < (The "P Less Than" Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-866-PLessThan (1-866-753-7784) www.PLessThan.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with labeling plots, possibly in font defaults
On Aug 7, 2014, at 12:59 PM, Tim Blass wrote: > Hello, > > I am using R 3.1.1 on a (four year old) MacBook, running OSX 10.9.4. > > I just tried making and labeling a plot as follows: > >> x<-rnorm(10) >> y<-rnorm(10) >> plot(x,y) >> title(main="random points") > > which produces a scatter plot of the random points, but without the title > and without any numbers along the axes. If I then run > >> par(family="sans") >> plot(x,y,main="plot title") > > the plot has the title and the numbers on the axes (also and 'x' and 'y' > appear as default labels for the axes). > > I do not know what is going on, but maybe there is some problem in the > default font settings (I don't know if that could be an R issue or an issue > specific to my Mac)? It hasn't happened to me recently (since updating from Leopard and SnowLeapard to Lion) but it used to happen pretty frequently that I would get a duplicate font that printed empty square boxes. (I wasn't the only one. It got reported several times on R-SIG-Mac.) One could fix that sort of problem by deleting the offending duplicate entry using Font Book.app Since this is happening with the default serif font, you would probably find the duplicate in Times. (It used to be happening to me with Symbol.) > quartzFonts()$serif [1] "Times-Roman" "Times-Bold" "Times-Italic" "Times-BoldItalic" > > > This is clearly not a big problem (at least for now) since I can put labels > on plots by running par(), but if it is indicative of a larger underlying > problem (or if there is a simple fix) I would like to know. > > Thank you! David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Legend in ggplot2
The problem is that you are not actually 'mapping' any variables to the fill and colour aesthetics, so ggplot won't produce legends for those. I'm not sure ggplots are appropriate for what you're trying to do here, but you can certainly hack around it a bit; for instance, try:

ggplot(tabu, aes(x=weeks, y=T))+
  scale_y_continuous(expand=c(0,0),
    minor_breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)),
                     round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)), 1),
    breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)),
               round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)), 2))+
  scale_x_discrete(expand=c(0,0),
    breaks=seq(min(tabu$weeks), max(tabu$weeks)))+
  geom_bar(data=tabu, aes(y=Tupper, fill="Tupper"), stat="identity")+
  geom_point(aes(y=cusums, colour="Cusum"), size=4, pch=15)+
  geom_bar(data=tabu, aes(y=Tlower, fill="Tlower"), stat="identity")+
  geom_hline(aes(yintercept=0), colour="gray20", size=1)+
  geom_hline(aes(yintercept=5), colour="darkorchid4", size=2, alpha=1/2)+
  geom_hline(aes(yintercept=-5), colour="darkorchid4", size=2, alpha=1/2)+
  geom_hline(aes(yintercept=0.5), colour="gold2", size=2, alpha=1/1.3)+
  geom_hline(aes(yintercept=-0.5), colour="gold2", size=2, alpha=1/1.3)+
  scale_fill_manual(name="Legend", breaks=c("Tupper","Tlower"),
                    values=c("brown3","darkolivegreen4"), labels=c("T","L"))+
  scale_colour_manual(name="Legend", breaks=c("Cusum"),
                      values=c("dodgerblue1"), labels=c("Cusum"))

and fill in the blanks. P.S. Your plot is very strange.

On Thu, Aug 7, 2014 at 9:07 AM, Pavneet Arora wrote:
> Hi All
>
> Following is my dataset.
> dput(tabu) > structure(list(weeks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, > 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, > 29, 30), values = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04, > 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62, > 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62, > 11.31, 10.52), deviation = c(-0.551, -2.01, > -0.711, > 1.66, 2.16, 0.18, -1.96, 1.46, -0.801, 0.34, > -0.971, > 1.47, 0.51, -0.6, 0.0801, -0.631, > 0.619, > 0.31, 0, 3, 0.9, -0.67, 2.29, 1.5, 0.6, 1.08, 0.381, > 1.62, 1.31, 0.52), cusums = c(-0.551, -2.56, -3.27, > -1.61, 0.549, 0.729, -1.23, 0.229, > -0.572, -0.232, -1.2, 0.268, > 0.778, 0.178, 0.258, > -0.373, > 0.246, 0.557, 0.557, 3.56, > 4.46, 3.79, 6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09 > ), Tupper = c(0, 0, 0, 1.16, 2.82, 2.5, 0.0391, 1, > 0, 0, 0, 0.971, 0.98, 0, 0, 0, 0.119, > 0, 0, 2.5, 2.9, 1.73, 3.52, 4.52, 4.62, 5.2, 5.08, 6.2, 7.01, > 7.03), Tlower = c(-0.0507, -1.56, -1.77, 0, 0, 0, > -1.46, 0, -0.301, 0, -0.471, 0, 0, > -0.0996, > 0, -0.131, 0, 0, 0, 0, 0, -0.17, 0, 0, 0, 0, 0, 0, > 0, 0)), .Names = c("weeks", "values", "deviation", "cusums", > "Tupper", "Tlower"), row.names = c(NA, -30L), class = "data.frame") > > I have created a plot using ggplot, whereby it makes both barchart and > plots points, like following: > ggplot(tabu,aes(x=week, > ymin=min(tabu$cusums,tabu$Tupper,tabu$Tlower), > ymax=max(tabu$cusums,tabu$Tupper,tabu$Tlower)))+ > labs(x=NULL,y=NULL)+ > scale_y_continuous(expand=c(0,0), > minor_breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)), > round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)), > 1), > breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)), > round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)), > 2))+ > scale_x_discrete(expand=c(0,0), >breaks=seq(min(tabu$week), > max(tabu$week)))+ > geom_bar(aes(y=tabu$Tupper),stat="identity",fill="brown3")+ > 
geom_bar(aes(y=tabu$Tlower),stat="identity",fill="darkolivegreen4")+ > geom_point(aes(y=tabu$cusums),size=4,pch=15,colour="dodgerblue1")+ > geom_hline(aes(yintercept=0),colour="gray20",size=1)+ #geom_hline - > draws a reference line at 0 > geom_hline(aes(yintercept=5),colour="darkorchid4",size=2,alpha=1/2)+ > #Out-Of-Signal Lines > geom_hline(aes(yintercept=-5),colour="darkorchid4",size=2,alpha=1/2)+ > #Out-Of-Signal Lines > geom_hline(aes(yintercept=0.5),colour="gold2",size=2,alpha=1/1.3)+ #K > geom_hline(aes(yintercept=-0.5),colour="gold2",size=2,alpha=1/1.3)+ #K > scale_color_manual(name="Legend", > breaks=c("Tupper","Tlowe
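[Editor's note] The mapping point in the reply above can be seen in a minimal, self-contained sketch with toy data (not the poster's dataset): a legend appears only for aesthetics that are mapped inside aes(), and a constant label mapped there becomes a legend key that scale_*_manual() can relabel.

```r
# Toy illustration: fill set *outside* aes() produces no legend;
# a constant label mapped *inside* aes() does.
library(ggplot2)
df <- data.frame(x = 1:5, up = 1:5, down = -(5:1))
ggplot(df, aes(x)) +
  geom_bar(aes(y = up,   fill = "Tupper"), stat = "identity") +
  geom_bar(aes(y = down, fill = "Tlower"), stat = "identity") +
  scale_fill_manual(name = "Legend",
                    values = c(Tupper = "brown3",
                               Tlower = "darkolivegreen4"))
```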
[R] problem with labeling plots, possibly in font defaults
Hello,

I am using R 3.1.1 on a (four year old) MacBook, running OS X 10.9.4. I just tried making and labeling a plot as follows:

> x <- rnorm(10)
> y <- rnorm(10)
> plot(x, y)
> title(main = "random points")

which produces a scatter plot of the random points, but without the title and without any numbers along the axes. If I then run

> par(family = "sans")
> plot(x, y, main = "plot title")

the plot has the title and the numbers on the axes ('x' and 'y' also appear as default labels for the axes).

I do not know what is going on, but maybe there is some problem in the default font settings (I don't know if that could be an R issue or an issue specific to my Mac)? This is clearly not a big problem (at least for now) since I can put labels on plots by running par(), but if it is indicative of a larger underlying problem (or if there is a simple fix) I would like to know.

Thank you!
tb
[R] Issues in using GP_fit() of GPfit package
Hi,

I am having certain issues in using GP_fit() of the GPfit package. Basically, each time I call this function, it hangs. What I have is a data frame (matrix) which has 2 columns and some 2000 rows, and first I normalize the values to the range [0,1]. I also have an output status vector which has 1 column and the same 2000 rows, and the values are either 0 or 1. Now whenever I pass these two to GP_fit(), like GP_fit(mat_norm, mat_status), the R console hangs. Is it because of the 2000 rows? I had assumed that 2000 is a very small number and it should not hang for such a small number when real data can have millions of rows. Any idea what might be wrong here?

Thanks in advance.

Regards,
Jason
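[Editor's note] No reply appears in the archive, but one plausible factor: Gaussian-process fitting repeatedly factors an n x n correlation matrix, so cost grows roughly like n^3 and n = 2000 is already demanding. A quick sanity check on a subsample (using the object names from the question; whether GP_fit() is appropriate for a 0/1 response is a separate question worth checking in the package documentation) might look like:

```r
# Sanity check: try GP_fit() on a 200-row subsample before attempting
# all 2000 rows.  mat_norm holds the inputs scaled to [0, 1] and
# mat_status the 0/1 outputs, as described in the question.
library(GPfit)
set.seed(1)
idx <- sample(nrow(mat_norm), 200)
fit <- GP_fit(as.matrix(mat_norm)[idx, ], mat_status[idx])
```

If the subsample fits quickly, the hang is likely just cubic scaling rather than a bug.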
[R] Dose response glmer
I am trying to do a dose-response analysis on my dataset, but nothing is working. I am adapting a script shared on the web, but I am unable to make it work for my dataset. I would like to get the LC50 for each Isolado and to know if there are differences between them. My data is https://dl.dropboxusercontent.com/u/34009642/R/dead_alive.csv

Here is what I copied and tried to modify:

library(plyr)
library(lattice)
library(lme4)
library(arm)
library(lmerTest)
library(faraway)
library(car)

## Conc are concentrations. I input only the exponent; all,
## except 0 (my control, without Isolado), are base 10,
## i.e. 10^4, 10^6 and 10^8.
data <- read.table("dead_alive.csv", sep = "\t", dec = ",", header = TRUE)
data$Rep <- factor(data$Rep)

mean_data <- ddply(data, c("Isolado", "Conc", "Day"), numcolwise(mean))
xyplot(Dead/(Dead + Live) ~ Conc|Isolado, groups = Day, type = "l",
       ylab='Probability', xlab='Dose', data = mean_data)
xyplot(Dead/(Dead + Live) ~ Day|Isolado, groups = Conc, type = "l",
       ylab='Probability', xlab='Dose', data = mean_data)

model.logit <- glmer(cbind(Dead, Live) ~ -1 + Isolado + Isolado:Conc + (0 + Conc|Day),
                     family=binomial, data = data)
Anova(model.logit)
summary(model.logit)

model.probit <- glmer(cbind(Dead, Live) ~ Isolado + Isolado:Conc + (0 + Conc|Day),
                      family=binomial(link=probit), data=data)
model.cloglog <- glm(cbind(Dead, Live) ~ Isolado + Isolado:Conc + (1 + Conc|Day),
                     family=binomial(link=cloglog), data=data)

x <- seq(0, 8, by=0.2)
prob.logit <- ilogit(model.logit$coef[1] + model.logit$coef[2]*x)
prob.probit <- pnorm(model.probit$coef[1] + model.probit$coef[2]*x)
prob.cloglog <- 1-exp(-exp(model.cloglog$coef[1] + model.cloglog$coef[2]*x))

with(subdata, plot(Dead/(Dead + Live) ~ Conc, group = Day))
lines(x, prob.logit)            # solid curve = logit
lines(x, prob.probit, lty=2)    # dashed = probit
lines(x, prob.cloglog, lty=5)   # longdash = c-log-log

plot(x, prob.logit, type='l', ylab='Probability', xlab='Dose')  # solid = logit
lines(x, prob.probit, lty=2)    # dashed = probit
lines(x, prob.cloglog, lty=5)   # longdash = c-log-log

matplot(x, cbind(prob.probit/prob.logit, (1-prob.probit)/(1-prob.logit)),
        type='l', xlab='Dose', ylab='Ratio')
matplot(x, cbind(prob.cloglog/prob.logit, (1-prob.cloglog)/(1-prob.logit)),
        type='l', xlab='Dose', ylab='Ratio')

model.logit.data <- glm(cbind(Dead, Live) ~ Conc, family=binomial, data=data)
pred2.5 <- predict(model.logit.data, newdata=data.frame(Conc=2.5), se=T)
ilogit(pred2.5$fit)
ilogit(c(pred2.5$fit - 1.96*pred2.5$se.fit,
         pred2.5$fit + 1.96*pred2.5$se.fit))
## What is this 1.96? Where does it come from?

### If there are several predictors, just put in the code
### above something like:
###   newdata=data.frame(conc=2.5, x2=4.6, x3=5.8)
### or whatever is the desired set of predictor values...

### Effective Dose calculation:
# What is the concentration that yields a probability of 0.5 of an
# insect dying?
library(MASS)
dose.p(model.logit.data, p=0.5)
# A 95% CI for the ED50:
c(2 - 1.96*0.1466921, 2 + 1.96*0.1466921)
# What is the concentration that yields a probability of 0.8 of an
# insect dying?
dose.p(model.logit.data, p=0.8)

--
Laia, ML
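[Editor's note] On the question embedded in the script's comments (where does 1.96 come from): it is the 97.5% quantile of the standard normal distribution, used for a Wald-type 95% confidence interval.

```r
# 1.96 is qnorm(0.975): a 95% Wald interval is
#   estimate +/- qnorm(0.975) * standard error,
# because 95% of a standard normal lies between -1.96 and 1.96.
qnorm(0.975)                        # 1.959964...

est <- 2; se <- 0.1466921           # the ED50 numbers used in the script
est + c(-1, 1) * qnorm(0.975) * se  # same CI as the hand-coded 1.96 version
```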
Re: [R] big data?
correcting a typo (400 MB, not GB. Thanks to David Winsemius for reporting it). Spencer ### Thanks to all who replied. For the record, I will summarize here what I tried and what I learned: Mike Harwood suggested the ff package. David Winsemius suggested data.table and colbycol. Peter Langfelder suggested sqldf. sqldf::read.csv.sql allowed me to create an SQL command to read a column or a subset of the rows of a 400 MB tab-delimited file in roughly a minute on a 2.3 GHz dual core machine running Windows 7 with 8 GB RAM. It also read a column of a 1.3 GB file in 4 minutes. The documentation was sufficient to allow me to easily get what I wanted with a minimum of effort. If I needed to work with these data regularly, I might experiment with colbycol and ff: The documentation suggested to me that these packages might allow me to get quicker answers from routine tasks after some preprocessing. Of course, I could also do the preprocessing manually with sqldf. Thanks, again. Spencer On 8/6/2014 9:39 AM, Mike Harwood wrote: The read.table.ffdf function in the ff package can read in delimited files and store them to disk as individual columns. The ffbase package provides additional data management and analytic functionality. I have used these packages on 15 Gb files of 18 million rows and 250 columns. On Tuesday, August 5, 2014 1:39:03 PM UTC-5, David Winsemius wrote: On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote: What tools do you like for working with tab delimited text files up to 1.5 GB (under Windows 7 with 8 GB RAM)? ?data.table::fread Standard tools for smaller data sometimes grab all the available RAM, after which CPU usage drops to 3% ;-) The "bigmemory" project won the 2010 John Chambers Award but "is not available (for R version 3.1.0)". findFn("big data", 999) downloaded 961 links in 437 packages. That contains tools for data PostgreSQL and other formats, but I couldn't find anything for large tab delimited text files. 
Absent a better idea, I plan to write a function getField to extract a specific field from the data, then use that to split the data into 4 smaller files, which I think should be small enough that I can do what I want. There is the colbycol package with which I have no experience, but I understand it is designed to partition data into column sized objects. #--- from its help file- cbc.get.col {colbycol}R Documentation Reads a single column from the original file into memory Description Function cbc.read.table reads a file, stores it column by column in disk file and creates a colbycol object. Functioncbc.get.col queries this object and returns a single column. Thanks, Spencer __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
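[Editor's note] The kind of read.csv.sql() call described in the summary above might look like this (file name and column names are hypothetical; sqldf stages the file in a temporary SQLite database, so only the selected columns and rows ever reach R's memory):

```r
# Read one column, and a filtered subset of rows, from a large
# tab-delimited file without loading the whole file into R.
# In the sql argument, the table is always referred to as "file".
library(sqldf)

one_col <- read.csv.sql("big_file.tsv",
                        sql = "select field1 from file",
                        sep = "\t")

some_rows <- read.csv.sql("big_file.tsv",
                          sql = "select * from file where field1 = 'A'",
                          sep = "\t")
```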
Re: [R] big data?
Thanks to all who replied. For the record, I will summarize here what I tried and what I learned: Mike Harwood suggested the ff package. David Winsemius suggested data.table and colbycol. Peter Langfelder suggested sqldf. sqldf::read.csv.sql allowed me to create an SQL command to read a column or a subset of the rows of a 400 GB tab-delimited file in roughly a minute on a 2.3 GHz dual core machine running Windows 7 with 8 GB RAM. It also read a column of a 1.3 GB file in 4 minutes. The documentation was sufficient to allow me to easily get what I wanted with a minimum of effort. If I needed to work with these data regularly, I might experiment with colbycol and ff: The documentation suggested to me that these packages might allow me to get quicker answers from routine tasks after some preprocessing. Of course, I could also do the preprocessing manually with sqldf. Thanks, again. Spencer On 8/6/2014 9:39 AM, Mike Harwood wrote: The read.table.ffdf function in the ff package can read in delimited files and store them to disk as individual columns. The ffbase package provides additional data management and analytic functionality. I have used these packages on 15 Gb files of 18 million rows and 250 columns. On Tuesday, August 5, 2014 1:39:03 PM UTC-5, David Winsemius wrote: On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote: What tools do you like for working with tab delimited text files up to 1.5 GB (under Windows 7 with 8 GB RAM)? ?data.table::fread Standard tools for smaller data sometimes grab all the available RAM, after which CPU usage drops to 3% ;-) The "bigmemory" project won the 2010 John Chambers Award but "is not available (for R version 3.1.0)". findFn("big data", 999) downloaded 961 links in 437 packages. That contains tools for data PostgreSQL and other formats, but I couldn't find anything for large tab delimited text files. 
Absent a better idea, I plan to write a function getField to extract a specific field from the data, then use that to split the data into 4 smaller files, which I think should be small enough that I can do what I want. There is the colbycol package with which I have no experience, but I understand it is designed to partition data into column sized objects. #--- from its help file- cbc.get.col {colbycol}R Documentation Reads a single column from the original file into memory Description Function cbc.read.table reads a file, stores it column by column in disk file and creates a colbycol object. Functioncbc.get.col queries this object and returns a single column. Thanks, Spencer __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Spencer Graves, PE, PhD President and Chief Technology Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 web: www.structuremonitoring.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] K-nearest neighbor
Dear R-users,

I am looking for a weighted knn-search function, but I cannot manage to find one. There are several options for weighted knn classifiers, but I would rather use a simple 'search function' (such as get.knnx). Does anyone know of a search function with a "weight" option?

Thanks
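[Editor's note] No search function with a weight argument comes to mind, but for a *weighted Euclidean* metric there is a standard trick: scale each column by the square root of its weight, after which any plain knn search (e.g. get.knnx from the FNN package, as mentioned above) becomes a weighted search. A sketch with made-up weights and data:

```r
# Weighted Euclidean knn via column scaling:
#   sum(w * (x - y)^2) equals plain squared Euclidean distance
#   after multiplying each column by sqrt(w).
library(FNN)                    # provides get.knnx()
set.seed(1)
w     <- c(2, 1, 0.5)           # hypothetical per-feature weights
train <- matrix(rnorm(300), ncol = 3)
query <- matrix(rnorm(30),  ncol = 3)

scale_w <- function(m) sweep(m, 2, sqrt(w), `*`)
nn <- get.knnx(scale_w(train), scale_w(query), k = 5)
str(nn)   # $nn.index and $nn.dist, with distances in the weighted metric
```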
Re: [R] ask for help
I prefer the idiom

c(TRUE, a[-1] != a[-length(a)])

because it works for character and other data types as well. I also find that thinking in terms of runs instead of subscripting tricks is easier.
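[Editor's note] Thinking in runs directly, rle() gives the same answer as the subscripting idiom for the original question:

```r
# Run-based solution: first index of every run, split by run value.
a <- c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
r <- rle(a)
starts <- cumsum(c(1, head(r$lengths, -1)))  # first index of each run
starts[r$values == 1]   # 1 6 16 23
starts[r$values == 0]   # 4 12 18
```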
Re: [R] ask for help
Better: b <- c(a[1]-1,a[-length(a)]) On 07 Aug 2014, at 17:28, Bart Kastermans wrote: > For readability I like: > >> b <- c(0,a[-length(a)]) >> which(a != b & a == 0) > [1] 4 12 18 >> which(a != b & a == 1) > [1] 1 6 16 23 > > > On 07 Aug 2014, at 17:23, William Dunlap wrote: > >> My solution may be a bit clearer if you define the function isFirstInRun >> isFirstInRun <- function(x) { >> if (length(x) == 0) { >> logical(0) >> } else { >> c(TRUE, x[-1] != x[-length(x)]) >> } >> } >> >> Then that solution is equivalent to >> which(isFirstInRun(a) & a==1) >> >> If 'a' contains NA's then you have to decide how to deal with them. >> >> (The call to 'which' is not needed if you are going to be using the >> result as a subscript.) >> >> You may also want isLastInRun >> isLastInRun <- function(x) { >> if (length(x) == 0) { >> logical(0) >> } else { >> c(x[-1] != x[-length(x)], TRUE) >> } >> } >> Bill Dunlap >> TIBCO Software >> wdunlap tibco.com >> >> >> On Thu, Aug 7, 2014 at 7:36 AM, William Dunlap wrote: a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1) which( a==1 & c(TRUE, a[-length(a)]!=1) ) >>> [1] 1 6 16 23 which( a==0 & c(TRUE, a[-length(a)]!=0) ) >>> [1] 4 12 18 >>> >>> Bill Dunlap >>> TIBCO Software >>> wdunlap tibco.com >>> >>> >>> On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz wrote: Hello,everybody, I have a sequence,like a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how to get the position of each first 1 and 0, that's to say, how to get b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0. Many thanks! Johnny [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] ask for help
For readability I like: > b <- c(0,a[-length(a)]) > which(a != b & a == 0) [1] 4 12 18 > which(a != b & a == 1) [1] 1 6 16 23 On 07 Aug 2014, at 17:23, William Dunlap wrote: > My solution may be a bit clearer if you define the function isFirstInRun > isFirstInRun <- function(x) { > if (length(x) == 0) { > logical(0) > } else { > c(TRUE, x[-1] != x[-length(x)]) > } > } > > Then that solution is equivalent to > which(isFirstInRun(a) & a==1) > > If 'a' contains NA's then you have to decide how to deal with them. > > (The call to 'which' is not needed if you are going to be using the > result as a subscript.) > > You may also want isLastInRun > isLastInRun <- function(x) { > if (length(x) == 0) { > logical(0) > } else { > c(x[-1] != x[-length(x)], TRUE) > } > } > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > > On Thu, Aug 7, 2014 at 7:36 AM, William Dunlap wrote: >>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1) >>> which( a==1 & c(TRUE, a[-length(a)]!=1) ) >> [1] 1 6 16 23 >>> which( a==0 & c(TRUE, a[-length(a)]!=0) ) >> [1] 4 12 18 >> >> Bill Dunlap >> TIBCO Software >> wdunlap tibco.com >> >> >> On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz wrote: >>> Hello,everybody, >>> I have a sequence,like >>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how >>> to get the position of each first 1 and 0, that's to say, how to get >>> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0. >>>Many thanks! >>> Johnny >>>[[alternative HTML version deleted]] >>> >>> __ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. 
Re: [R] ask for help
My solution may be a bit clearer if you define the function isFirstInRun isFirstInRun <- function(x) { if (length(x) == 0) { logical(0) } else { c(TRUE, x[-1] != x[-length(x)]) } } Then that solution is equivalent to which(isFirstInRun(a) & a==1) If 'a' contains NA's then you have to decide how to deal with them. (The call to 'which' is not needed if you are going to be using the result as a subscript.) You may also want isLastInRun isLastInRun <- function(x) { if (length(x) == 0) { logical(0) } else { c(x[-1] != x[-length(x)], TRUE) } } Bill Dunlap TIBCO Software wdunlap tibco.com On Thu, Aug 7, 2014 at 7:36 AM, William Dunlap wrote: >> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1) >> which( a==1 & c(TRUE, a[-length(a)]!=1) ) > [1] 1 6 16 23 >> which( a==0 & c(TRUE, a[-length(a)]!=0) ) > [1] 4 12 18 > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > > On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz wrote: >> Hello,everybody, >> I have a sequence,like >> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how >> to get the position of each first 1 and 0, that's to say, how to get >> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0. >> Many thanks! >> Johnny >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ask for help
On 07 Aug 2014, at 11:16 , jim holtman wrote: > rle ...with a little tinkering, like > m <- c(1,cumsum(rle(a)$lengths)+1) > m [1] 1 4 6 12 16 18 23 34 then look at every 2nd element, discarding the last. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
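A sketch of the same rle()-based idea that avoids the every-2nd-element bookkeeping, since rle() also returns the value of each run (variable names are mine):

```r
a <- c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
r <- rle(a)
# Each run starts one position past the end of the previous run
starts <- cumsum(c(1, head(r$lengths, -1)))
starts[r$values == 1]  # 1 6 16 23
starts[r$values == 0]  # 4 12 18
```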
Re: [R] ask for help
> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1) > which( a==1 & c(TRUE, a[-length(a)]!=1) ) [1] 1 6 16 23 > which( a==0 & c(TRUE, a[-length(a)]!=0) ) [1] 4 12 18 Bill Dunlap TIBCO Software wdunlap tibco.com On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz wrote: > Hello,everybody, > I have a sequence,like > a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how > to get the position of each first 1 and 0, that's to say, how to get > b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0. > Many thanks! > Johnny > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Legend in ggplot2
Hi All Following is my dataset. dput(tabu) structure(list(weeks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30), values = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04, 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62, 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62, 11.31, 10.52), deviation = c(-0.551, -2.01, -0.711, 1.66, 2.16, 0.18, -1.96, 1.46, -0.801, 0.34, -0.971, 1.47, 0.51, -0.6, 0.0801, -0.631, 0.619, 0.31, 0, 3, 0.9, -0.67, 2.29, 1.5, 0.6, 1.08, 0.381, 1.62, 1.31, 0.52), cusums = c(-0.551, -2.56, -3.27, -1.61, 0.549, 0.729, -1.23, 0.229, -0.572, -0.232, -1.2, 0.268, 0.778, 0.178, 0.258, -0.373, 0.246, 0.557, 0.557, 3.56, 4.46, 3.79, 6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09 ), Tupper = c(0, 0, 0, 1.16, 2.82, 2.5, 0.0391, 1, 0, 0, 0, 0.971, 0.98, 0, 0, 0, 0.119, 0, 0, 2.5, 2.9, 1.73, 3.52, 4.52, 4.62, 5.2, 5.08, 6.2, 7.01, 7.03), Tlower = c(-0.0507, -1.56, -1.77, 0, 0, 0, -1.46, 0, -0.301, 0, -0.471, 0, 0, -0.0996, 0, -0.131, 0, 0, 0, 0, 0, -0.17, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("weeks", "values", "deviation", "cusums", "Tupper", "Tlower"), row.names = c(NA, -30L), class = "data.frame") I have created a plot using ggplot, whereby it makes both barchart and plots points, like following: ggplot(tabu,aes(x=week, ymin=min(tabu$cusums,tabu$Tupper,tabu$Tlower), ymax=max(tabu$cusums,tabu$Tupper,tabu$Tlower)))+ labs(x=NULL,y=NULL)+ scale_y_continuous(expand=c(0,0), minor_breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)), round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)), 1), breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)), round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)), 2))+ scale_x_discrete(expand=c(0,0), breaks=seq(min(tabu$week), max(tabu$week)))+ geom_bar(aes(y=tabu$Tupper),stat="identity",fill="brown3")+ geom_bar(aes(y=tabu$Tlower),stat="identity",fill="darkolivegreen4")+ 
geom_point(aes(y=tabu$cusums),size=4,pch=15,colour="dodgerblue1")+ geom_hline(aes(yintercept=0),colour="gray20",size=1)+ #geom_hline - draws a reference line at 0 geom_hline(aes(yintercept=5),colour="darkorchid4",size=2,alpha=1/2)+ #Out-Of-Signal Lines geom_hline(aes(yintercept=-5),colour="darkorchid4",size=2,alpha=1/2)+ #Out-Of-Signal Lines geom_hline(aes(yintercept=0.5),colour="gold2",size=2,alpha=1/1.3)+ #K geom_hline(aes(yintercept=-0.5),colour="gold2",size=2,alpha=1/1.3)+ #K scale_color_manual(name="Legend", breaks=c("Tupper","Tlower","CuSum","±K","±H"), values=c("brown3","darkolivegreen4","dodgerblue1","gold2","darkorchid4"), labels=c("T","L","C","±K","±H")) However, I am having trouble getting a legend. I know it's supposed to involve the melt function in the reshape package, but I can't get it to work! Can someone please help me with this?
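A minimal sketch of the usual pattern, assuming reshape2 and ggplot2 are installed and using the `tabu` data frame from the dput above: ggplot2 only draws a legend for aesthetics mapped *inside* aes(), so the bar series have to be stacked into long format first and the colours supplied via scale_fill_manual rather than hard-coded per layer. The series labels here are illustrative, not the poster's.

```r
library(reshape2)  # for melt()
library(ggplot2)

# Long format: one row per (week, series) pair for the two bar series
long <- melt(tabu[, c("weeks", "Tupper", "Tlower")], id.vars = "weeks",
             variable.name = "series", value.name = "value")

p <- ggplot(long, aes(x = weeks, y = value, fill = series)) +
  geom_bar(stat = "identity", position = "identity") +
  # Points come from the original wide data; mapping colour inside
  # aes() is what makes them appear in the legend
  geom_point(data = tabu, aes(x = weeks, y = cusums, colour = "CuSum"),
             inherit.aes = FALSE, size = 4, pch = 15) +
  scale_fill_manual(name = "Legend",
                    values = c(Tupper = "brown3",
                               Tlower = "darkolivegreen4")) +
  scale_colour_manual(name = NULL, values = c(CuSum = "dodgerblue1"))
p
```

The geom_hline reference lines from the original code can then be added on top of `p` unchanged.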
[R] dynamic runSum
Hello, runSum calculates a running sum looking back a fixed distance n, e.g. 20. How do I calculate a dynamic runSum for an xts object? In other words, I want to calculate a running sum at each point in time looking back a variable distance, in this example governed by the vector VL. Here's a minimal reproducible example: library(quantstrat) symbols = c('^GSPC') start.date <- as.Date("2010-01-01") end.date <- as.Date("2013-12-31") getSymbols(symbols, from=as.character(start.date), to=as.character(end.date),adjust=T) "acF1" <- function(x, n1=5, n2=10, n3=20, nacF1=25, n0=20, ...) { var1 <- x - lag(x,1,na.pad=T) var2 <- runSD(x, n=n1, sample=TRUE, cumulative=FALSE) var3 <- runMean(var2, n=n2, cumulative=FALSE) VL <- ifelse( trunc(n3/(var2/var3))>nacF1, nacF1, trunc(n3/(var2/var3))) p_pos <- ifelse(var1>=0, var1, 0) out1 <- runSum(p_pos, n=n0, cumulative=FALSE) res <- cbind(var1, var2, var3, VL, p_pos, out1) colnames(res) <- c("var1","var2","var3","VL", "p_pos", "out1") reclass(res) } acf1 <- acF1( GSPC[,c("GSPC.Close")], n1=5, n2=10, n3=20, nacF1=25, n0=20) acf1 So on 2010-02-02 I want runSum to look back 23 points as governed by VL, not 20 points; on 2010-02-03 to look back 24 points; and so on, until on 2013-12-31 it should look back 25 points. Amarjit
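A base-R sketch of the idea, without the quantstrat data: a running sum whose lookback length varies per observation. `runSumDyn` is a hypothetical name, not part of TTR or xts; for an xts object you would apply the same loop to coredata() and rebuild with the original index.

```r
# Running sum with a per-point window length.
# x: numeric vector; n: integer lookback at each point (same length as x).
# Returns NA where the window does not fit (or n is NA), mimicking runSum.
runSumDyn <- function(x, n) {
  stopifnot(length(x) == length(n))
  sapply(seq_along(x), function(i) {
    if (is.na(n[i]) || n[i] < 1 || i < n[i]) NA_real_
    else sum(x[(i - n[i] + 1):i])
  })
}

x <- c(1, 2, 3, 4, 5)
n <- c(1, 2, 2, 3, 5)   # e.g. the VL column produced by acF1()
runSumDyn(x, n)          # 1 3 5 9 15
```

This is O(n * max(window)); for long series a cumulative-sum difference, cs <- cumsum(x); cs[i] - cs[i - n[i]], is faster.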
[R] Some hpc problems with doMPI and runmpi (I know there is R-SIG-HPC)
Dear R people, I've been doing some HPC using R and Open MPI. Unfortunately I've encountered a major problem whose nature is hard to pin down: essentially I call mpirun Rscript …, and as soon as the script reaches a foreach() %dopar% it halts indefinitely. I've attached the qsub script: #!/bin/bash #$ -S /bin/bash #$ -N test_14 #$ -cwd #$ -V #$ -o "/fhgfs/g61570/Spectral Databases/log/test_14_"$JOB_ID #$ -j y #$ -q regular #$ -pe openmpi 8 #$ -l h_rt=00:15:00 #$ -l h_vmem=1.9G #$ -m eas #$ -M andre.ziervo...@psychol.uni-giessen.de module add gcc module add openmpi/gcc/64/1.6.5 module add R/gcc/3.0.1 date #log start time echo "Number of slots " . $NSLOTS mpirun Rscript /fhgfs/g61570/Spectral\ Databases/test_10.r > /fhgfs/g61570/Spectral\ Databases/log/test_14.Rout date exit and the R file: suppressMessages(library('doMPI')) skylla.cluster <- startMPIcluster() registerDoMPI(skylla.cluster) cat(paste("COMM SIZE: ", mpi.comm.size(0), " cluster size: ", clusterSize(skylla.cluster), "\n",sep = "")) tmp.time <- proc.time() sample <- foreach(i=seq(from=0, to=1000, by =1),.combine='c',.inorder=TRUE) %do% { r <- sqrt(i^2 + i^2) + .Machine$double.eps * factorial(i) sin(r) / r } cat(paste("Processing serial time: ", "\n", sep = " ")) print(proc.time() - tmp.time) #print(sample) tmp.time <- proc.time() sample <- foreach(i=seq(from=0, to=1000, by =1),.combine='c',.inorder=TRUE) %dopar% { r <- sqrt(i^2 + i^2) + .Machine$double.eps * factorial(i) sin(r) / r } cat(paste("Processing parallel time: ", "\n", sep = " ")) print(proc.time() - tmp.time) #print(sample) closeCluster(skylla.cluster) # mpi.close.Rslaves() # mpi.exit() mpi.quit(save='no') Any suggestions would be highly appreciated! Thanks! Best André -- Dipl.
Psych André Ziervogel andre.ziervo...@psychol.uni-giessen.de
Re: [R] weighted network centrality measures by network size
Hi Jenny, Have you tried igraph before? See http://igraph.org/r/doc/ There are a couple of centrality measures there. Best, -m On 6 August 2014 02:50, Jenny Jiang wrote: > Dear R-help, > > My name is Jenny Jiang and I am a Finance Honours research > student from the University of New South Wales Australia. Currently my > research project involves the calculating of some network centrality > measures in R, which are degree, closeness, betweenness and eigenvector. > However I am having some issue regarding to the calculation of > the weighted centrality measures by network size. For example, currently > my code allows me to calculate centrality measures for each firm year, > and now I would like to calculate centrality measures weighted by the > firm network size for each firm year. > > My current code is like the following: > > install.packages("statnet") > > library(statnet) > > #read csv > data <- read.csv("D:\\Users\\z3377013\\Desktop\\networknew1.csv",header=TRUE) > #companies <- unique(data$CompanyID_) > #years <- unique(data$Year) > pairs <- unique(data[,c(1,3)]) > #directors <- unique(c(data$DirectorID_,data$DirectorID_Connected)) > #director_map <- 1:length(directors) > #names(director_map) <- c(as.character(directors)) > > #for (i in 1:nrow(data)) { > # data[i,2] = director_map[as.character(data[i,2])] > # data[i,4] = director_map[as.character(data[i,4])] > #} > > sink("D:\\Users\\z3377013\\Desktop\\measure1.csv") > for (i in 1:nrow(pairs)) { > d <- subset(data, CompanyID_==pairs[i,1]&Year==pairs[i,2]) > directors <- unique(c(d$DirectorID_,d$DirectorID_Connected)) > director_map <- 1:length(directors) > names(director_map) <- c(as.character(directors)) > for (j in 1:nrow(d)) { > d[j,2] = director_map[as.character(d[j,2])] > d[j,4] = director_map[as.character(d[j,4])] > } > > net<-network(d[,c(2,4)],directed=F,loops=F,matrix.type="edgelist") > > degree <- degree(net, cmode="freeman", gmode="graph") > closeness <- 
closeness(net,gmode="graph",cmode="undirected") > betweenness <- betweenness(net,gmode="graph",cmode="undirected") > evcent <- evcent(net,gmode="graph",use.eigen=TRUE) > > write.csv(cbind(pairs[i,], directors, degree, closeness, betweenness, > evcent), row.names=FALSE) > } > sink() > > And an example of my data structure is like the following: > > CompanyID_DirectorID_YearDirectorID_Connected > 900370006802120033699838021 > 900370041803220033699838021 > 900370059803220033699838021 > 900370089803220033699838021 > 900370347806320033699838021 > 900370362806320033699838021 > 900370383806320033699838021 > 900370399806320033699838021 > 900369983802120033700068021 > 900370041803220033700068021 > 900370059803220033700068021 > 900370089803220033700068021 > 900370347806320033700068021 > 900370362806320033700068021 > 900370383806320033700068021 > 900370399806320033700068021 > 900369983802120033700418032 > 900370006802120033700418032 > 900370059803220033700418032 > 900370006802120043699838021 > 900370041803220043699838021 > 900370059803220043699838021 > 900370089803220043699838021 > 900370347806320043699838021 > 129016045381142003427207466 > 129035569064722003427207466 > 129037011080322003427207466 > 129037084581042003427207466 > 129037084781042003427207466 > 129037112481352003427207466 > 1290101671106122003427207466 > 1290102718113832003427207466 > > where for each firm-year I have a list of directors and their corresponding > connected directors within that firm-year. > > If you could > provide me the R code regarding to how to calculate the weighted measures by > network size that that would be really > helpful. > > I cannot be more than appreciated. > > Best regards > > Jenny > > [[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
[R] Using axis limits with plot3D
Hi Scott, The trick is to postpone plotting (plot = FALSE) and then do plotdev() with the required limits: library(plot3D) x <- z <- seq(-4, 4, by=0.2) y <- seq(-6, 6, by=0.2) M <- mesh(x,y,z) R <- with(M, sqrt(x^2 + y^2 +z^2)) p <- sin(2*R)/(R+1e-3) x.limits <- c(-2, 2) y.limits <- c(-2, 2) slice3D(x,y,z, colvar=p, xs=0, ys=c(0, 4), zs=NULL, scale=F, ticktype="detailed", plot = FALSE) plotdev(xlim=x.limits, ylim=y.limits) This should work. Karline Date: Wed, 6 Aug 2014 23:21:04 + From: "Waichler, Scott R" To: "R. Help" Cc: "karline.soeta...@nioz.nl" Subject: [R] Using axis limits with plot3D I would like to use the functions in the plot3D package but I am having trouble getting the axis limits to work correctly. The slices plotted by the code below go beyond the bounds of the persp box and obscure the axis information. How can I show just the part of the domain within x.limits and y.limits? library(plot3D) x <- z <- seq(-4, 4, by=0.2) y <- seq(-6, 6, by=0.2) M <- mesh(x,y,z) R <- with(M, sqrt(x^2 + y^2 +z^2)) p <- sin(2*R)/(R+1e-3) x.limits <- c(-2, 2) y.limits <- c(-2, 2) slice3D(x,y,z, colvar=p, xs=0, ys=c(0, 4), zs=NULL, xlim=x.limits, ylim=y.limits, scale=F, ticktype="detailed") Thanks, Scott Waichler Pacific Northwest National Laboratory Richland, WA, USA
Re: [R] Logical operators and named arguments
Josh, Thank you for your detailed answer. Best, Ryan On 7 Aug 2014, at 16:21, Joshua Wiley wrote: > Hi Ryan, > > It does work, but the *apply family of functions always pass to the first > argument, so you can specify e2 = , but not e1 =. For example: > >> sapply(1:3, `>`, e2 = 2) > [1] FALSE FALSE TRUE > > From ?sapply > >'lapply' returns a list of the same length as 'X', each element of >which is the result of applying 'FUN' to the corresponding element >of 'X'. > > so `>` is applied to each element of 1:3 > > `>`(1, ...) > `>`(2, ...) > `>`(3, ...) > > and if e2 is specified then that is passed > > `>`(1, 2) > `>`(2, 2) > `>`(3, 2) > > Further, see ?Ops > > If the members of this group are called as functions, any > argument names are removed to ensure that positional matching > is always used. > > and you can see this at work: > >> `>`(e1 = 1, e2 = 2) > [1] FALSE >> `>`(e2 = 1, e1 = 2) > [1] FALSE > > If you want the flexibility to specify which argument the elements of X > should be *applied to, use a wrapper: > >> sapply(1:3, function(x) `>`(x, 2)) > [1] FALSE FALSE TRUE >> sapply(1:3, function(x) `>`(2, x)) > [1] TRUE FALSE FALSE > > > HTH, > > Josh > > > > On Thu, Aug 7, 2014 at 2:20 PM, Ryan wrote: > >> Hi, >> >> I'm wondering why calling ">" with named arguments doesn't work as >> expected: >> >>> args(">") >> function (e1, e2) >> NULL >> >>> sapply(c(1,2,3), `>`, e2=0) >> [1] TRUE TRUE TRUE >> >>> sapply(c(1,2,3), `>`, e1=0) >> [1] TRUE TRUE TRUE >> >> Shouldn't the latter be FALSE? >> >> Thanks for any help, >> Ryan >> >> >> The information in this e-mail is intended only for t...{{dropped:28}}
Re: [R] ask for help
rle Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Wed, Aug 6, 2014 at 10:12 PM, Johnnycz wrote: > Hello,everybody, > I have a sequence,like > a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how > to get the position of each first 1 and 0, that's to say, how to get > b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0. > Many thanks! > Johnny > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] citation() command doesn't work in R.3.1.1
> Igor Sosa Mayor > on Wed, 6 Aug 2014 18:13:56 +0200 writes: > Sverre Stausland writes: >>> citation() >> Error: $ operator is invalid for atomic vectors In >> addition: Warning message: In packageDescription(pkg = >> package, lib.loc = dirname(dir)) : no package 'base' was >> found > strange... I think something is wrong with the > compilation, since as far as I know, the package `base' > should be loaded automatically by starting R... Even more: R cannot work at all without the base package. It's not a package you can be without, ever. So indeed, the error message may be a teeny tiny tad misleading. But to reiterate: the problem is the broken state of your R installation, not R 3.1.1 itself. Please do note that we have tens of thousands of tests that must run without fault before R is released. Calling citation() and >= 99% of all other base R functions is among these quality assurance tests of an R release. Martin Maechler (R core team)