Re: [R] regexpr with accents
Hello,

Works with me:

    d1 <- data.frame(V1 = 1:3, V2 = c("some text = 9", "some tèxt = 9", "some other text = 9"))
    regexpr("some text = 9", d1$V2)
    [1]  1 -1 -1
    attr(,"match.length")
    [1] 13 -1 -1
    regexpr("some tèxt = 9", d1$V2)
    [1] -1  1 -1
    attr(,"match.length")
    [1] -1 13 -1
    d1$V1[regexpr("some text = 9",d1$V2) > 0] <- 9
    d1$V1[regexpr("some tèxt = 9",d1$V2) > 0] <- 9
    d1
      V1                  V2
    1  9       some text = 9
    2  9       some tèxt = 9
    3  3 some other text = 9

What do you mean by "it did not work"? What were the contents of 'd1'?

    sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Portuguese_Portugal.1252  LC_CTYPE=Portuguese_Portugal.1252
    [3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
    [5] LC_TIME=Portuguese_Portugal.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] fortunes_1.5-0

Hope this helps,

Rui Barradas

On 06-08-2012 06:55, Luca Meyer wrote:

Hello,

I have built a piece of syntax to find out whether a given substring is included in a larger string. It works like this:

    d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9

This works all right as long as "some text" contains only standard ASCII characters. However, it does not work when accents are included, as in the following:

    d1$V1[regexpr("some tèxt = 9",d1$V2)>0] <- 9

I have tried to substitute "è" with several wildcards, but it did not work. Can anyone suggest how to have the syntax parse the string ignoring the accent?

Thank you in advance,
Luca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Head or Tails game
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of darnold
> Sent: Friday, August 03, 2012 9:18 PM
> To: r-help@r-project.org
> Subject: Re: [R] Head or Tails game
>
> Wow! Some great responses!
>
> I am getting some great responses. I've only read David, Michael, and Dennis
> thus far, leading me to develop this result before reading further.
>
> lead <- function(x) {
>   n <- length(x)
>   count <- 0
>   if (x[1] >= 0) count <- count + 1
>   for (i in 2:n) {
>     if (x[i] > 0 || (x[i] == 0 && x[i-1] >= 0 )) {
>       count <- count + 1
>     }
>   }
>   count
> }
>
> games <- replicate(1,sample(c(-1,1),40,replace=TRUE))
>
> games_sum <- apply(games,2,sum)
> plot(table(games_sum))
>
> games_lead <- apply(games,2,cumsum)
> games_lead <- apply(games_lead,2,lead)
> plot(table(games_lead))
>
> Now I am going to read Arun, William, and Jeff's responses and see what
> other ideas are being proposed.
>
> Thanks everyone.
>
> D.
>

Here is another solution that doesn't need to define an additional function with an explicit loop. It seems to be considerably faster than the approach presented above.

    system.time({
      set.seed(123)
      games <- matrix(sample(c(-1, 1), 40*1, TRUE), ncol = 1)
      games_sum <- apply(games,2,cumsum)
      games_lead <- colSums((games_sum > 0) | (games_sum==0 & games==-1))
    })
       user  system elapsed
       0.08    0.00    0.08
    plot(table(games_sum[40,]))
    plot(table(games_lead))

Compare this with your solution:

    system.time({
      set.seed(123)
      games <- replicate(1,sample(c(-1,1),40,replace=TRUE))
      games_sum <- apply(games,2,sum)
      games_lead <- apply(games,2,cumsum)
      games_lead <- apply(games_lead,2,lead)
    })
       user  system elapsed
       0.95    0.02    0.98

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
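A quick way to gain confidence in the vectorised version is to run both counts on the same simulated games and compare them. The sketch below reuses `lead()` from the quoted message; `n_games`, `lead_loop`, `lead_vec` and `agree` are names chosen here for illustration, and the number of games is an arbitrary value, not one taken from the thread.

```r
# Loop-based count from the thread: steps at which the running total is
# positive, or is zero having just come down from a positive value.
lead <- function(x) {
  n <- length(x)
  count <- 0
  if (x[1] >= 0) count <- count + 1
  for (i in 2:n) {
    if (x[i] > 0 || (x[i] == 0 && x[i - 1] >= 0)) count <- count + 1
  }
  count
}

set.seed(123)
n_games <- 200   # arbitrary illustrative value, not from the thread
games <- matrix(sample(c(-1, 1), 40 * n_games, replace = TRUE), ncol = n_games)
games_cum <- apply(games, 2, cumsum)   # running total, one column per game

lead_loop <- apply(games_cum, 2, lead)
lead_vec  <- colSums((games_cum > 0) | (games_cum == 0 & games == -1))

# TRUE when both approaches give the same count for every game
agree <- all(lead_loop == lead_vec)
```

The two agree because a running total of zero reached by a -1 step can only have come from +1, which is exactly the loop's `x[i-1] >= 0` case.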
[R] regexpr with accents
Hello,

I have built a piece of syntax to find out whether a given substring is included in a larger string. It works like this:

    d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9

This works all right as long as "some text" contains only standard ASCII characters. However, it does not work when accents are included, as in the following:

    d1$V1[regexpr("some tèxt = 9",d1$V2)>0] <- 9

I have tried to substitute "è" with several wildcards, but it did not work. Can anyone suggest how to have the syntax parse the string ignoring the accent?

Thank you in advance,
Luca
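One possible way to match while ignoring accents is to map the accented letters to plain ones before searching. This is only a sketch: `strip_accents()` is a hypothetical helper written for this example, it covers just a handful of characters, and whether it addresses the original problem depends on how the data were read in and encoded.

```r
# Example data from the thread; "\u00e8" is an encoding-independent way to write "è".
d1 <- data.frame(
  V1 = 1:3,
  V2 = c("some text = 9", "some t\u00e8xt = 9", "some other text = 9"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a few accented letters (à è é ì ò ù) to plain ones.
# Extend both strings, keeping them the same length, to cover more characters.
strip_accents <- function(x) {
  chartr("\u00e0\u00e8\u00e9\u00ec\u00f2\u00f9", "aeeiou", x)
}

# fixed = TRUE matches the pattern literally rather than as a regular expression.
hit <- grepl("some text = 9", strip_accents(d1$V2), fixed = TRUE)
d1$V1[hit] <- 9
```

Using `grepl()` instead of `regexpr(...) > 0` gives the logical index directly; the accented and unaccented rows are both updated, while "some other text = 9" is left alone.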
Re: [R] Memory limit for Windows 64bit build of R
Hi,

Before someone gives professional advice, you may do an experiment:

Set the Windows virtual memory to be as large as ~128GB (make sure the hard drive has enough space; a restart might be required); increase the memory limit in R; load a big dataset (or iteratively assign it to an object) and do some calculation. It will definitely be very slow.

I am not sure. Just trying to help.

Best wishes,
Jie

On Sun, Aug 5, 2012 at 6:52 PM, wrote:
> Dear all
>
> I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed
> running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 MHz
> RAM. I am seeking to analyse very large data sets (perhaps as much as
> 10GB), without the additional coding overhead of a package such as
> bigmemory.
>
> My question is this - if we were to increase the RAM on the machine to
> (say) 128GB, would this become a possibility? I have read the
> documentation on memory limits and it seems so, but would like some
> additional confirmation before investing in any extra RAM.
>
> Kind regards
>
> Alan
[R] Memory limit for Windows 64bit build of R
Dear all

I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 MHz RAM. I am seeking to analyse very large data sets (perhaps as much as 10GB), without the additional coding overhead of a package such as bigmemory.

My question is this - if we were to increase the RAM on the machine to (say) 128GB, would this become a possibility? I have read the documentation on memory limits and it seems so, but would like some additional confirmation before investing in any extra RAM.

Kind regards

Alan

Alan Simpson
Technical Lead, Retail Model Development
Retail Models Project
National Australia Bank
Level 15, 500 Bourke St, Melbourne VIC
Tel: +61 (0) 3 8697 7135 | Mob: +61 (0) 412 975 955
Email: alan.x.simp...@nab.com.au
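Before buying RAM, it can help to estimate how much memory a data set would occupy once loaded. The sketch below measures the per-value cost of a numeric vector with `object.size()` and scales it up; the dimensions and the number of working copies are assumptions made up for illustration, and real usage depends on column types and on what the analysis does.

```r
# Size in memory of one million doubles (8 bytes each plus a small header).
one_million <- numeric(1e6)
bytes_per_value <- as.numeric(object.size(one_million)) / length(one_million)

# Hypothetical dimensions, chosen only for illustration.
n_rows <- 50e6
n_cols <- 25
est_gb <- n_rows * n_cols * bytes_per_value / 1024^3

# Assumed rule of thumb: leave room for a few working copies during analysis.
copies <- 3
needed_gb <- est_gb * copies
```

With these made-up dimensions the data alone come to a little over 9 GB, so an analysis that copies the object a few times would not fit comfortably in 24GB.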
Re: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length
Hi,

Try this:

    dat1<-data.frame(x=c(NA,NA,rnorm(6,15),NA),y=c(NA,rnorm(8,15)),z=c(rnorm(7,15),NA,NA))
    dat1[which(colMeans(is.na(dat1))<=.15)]
             y
    1       NA
    2 13.53085
    3 12.89453
    4 15.02625
    5 14.00387
    6 15.34618
    7 15.69293
    8 15.62377
    9 14.76479

    #You can also use apply, sapply etc.
    dat2<-data.frame(x=c(NA,NA,rnorm(6,15),NA),y=c(NA,rnorm(8,15)),z=c(rnorm(7,15),NA,NA),u=c(rnorm(9,15)))
    dat2[apply(dat2,2,function(x) mean(is.na(x))<=.15)]
    #dat2[sapply(dat2,function(x) mean(is.na(x))<=.15)]
    #dat2[which(colMeans(is.na(dat2))<=.15)]
             y        u
    1       NA 14.56278
    2 16.49940 16.25761
    3 14.11368 14.08768
    4 14.95139 14.01923
    5 14.99517 15.91936
    6 14.46359 14.07573
    7 15.09702 13.94888
    8 15.99967 14.97171
    9 15.51924 15.59981

A.K.

- Original Message -
From: Faz Jones
To: r-help@r-project.org
Cc:
Sent: Sunday, August 5, 2012 9:04 PM
Subject: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length

I have a dataframe of 10 different columns (length of each column is the same). I want to eliminate any column that has 'NA' greater than 15% of the column length. Do I first need to make a function for calculating the percentage of NA for each column and then make another dataframe where I apply the function? What's the best way to do this?
Re: [R] Package to remove collinear variables
Hi,

Thank you for your help. I know I need to learn enough statistics to understand how to process my data. The reason I write to this forum is to ask people for a way to learn. I am a postharvest researcher and statistics is not my main field, so I try to do my best. Do you know a book (or literature) that can help me?

Thank you very much for your time and suggestions.

Best regards,
Roberto

On 05/08/2012 12:55, Jeff Newmiller wrote:

There is no "magic bullet" (package) for your problem. You must either learn enough statistics to understand how to analyze your data, or consult with someone who does.

FWIW collinearity is not in general amenable to automatic removal. However, you can identify which inputs are collinear with each other, and omit the redundant ones in the next iteration of your analysis, using (for example) the approach suggested by Uwe. Deciding WHICH of the redundant inputs is most appropriate to keep is the part computers are not so good at... that is where you must be smarter or more creative than the computer.

Also, it would help you get responses if you included the context (earlier discussion) in your replies.. most people do not use Nabble here. Reading and following the requests in the footer of every message will also help.

---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Roberto wrote:

I do not know, because I tried to use the rfe function (Backwards Feature Selection, Caret Package) to select wavelengths useful for a prediction model. However, the rfe function gives me back a lot of warning messages about collinearity between variables.

So, I do not know if your script can be useful. I tried to use VIF-Regression to select variables, but the rfe function gives me the same warning messages again.

What do you think about that?

Thank you very much for your help.

Best,
Roberto
Re: [R] find date between two other dates
Hi,

Your function is.between() can also be used.

    is.between<-function(x,a,b){
      x<=a & x>=b
    }
    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    ddate1<-data.frame(date=ddate)
    date2<-c("01/12/1998 00:00:00", "31/12/1998 23:59:59", "01/01/1999 00:00:00",
             "31/01/1999 23:59:59", "01/02/1999 00:00:00", "28/02/1999 23:59:59",
             "01/03/1999 00:00:00", "31/03/1999 23:59:59")
    date3<-as.POSIXct(strptime(date2, "%d/%m/%Y %H:%M:%S"), "GMT")
    ddate1[is.between(ddate1$date,date3[2],date3[1]),"Season"]<-1
    ddate1[is.between(ddate1$date,date3[4],date3[3]),"Season"]<-2
    ddate1[is.between(ddate1$date,date3[6],date3[5]),"Season"]<-3
    ddate1[is.between(ddate1$date,date3[8],date3[7]),"Season"]<-4
    ddate1
                     date Season
    1 1998-12-29 20:00:33      1
    2 1999-01-02 05:20:44      2
    3 1999-01-02 06:18:36      2
    4 1999-02-02 07:06:59      3
    5 1999-03-02 07:10:56      4
    6 1999-03-02 07:57:18      4

A.K.

- Original Message -
From: penguins
To: r-help@r-project.org
Cc:
Sent: Sunday, August 5, 2012 4:30 PM
Subject: [R] find date between two other dates

Hi,

I am trying to assign "Season" values to dates depending on when they occur. For example, the following dates would be assigned the following "Season" numbers based on the "season" intervals detailed below in the code:

    ddate                Season
    29/12/1998 20:00:33  1
    02/01/1999 05:20:44  2
    02/01/1999 06:18:36  2
    02/02/1999 07:06:59  3
    02/03/1999 07:10:56  4
    02/03/1999 07:57:18  4

My approach so far doesn't work because of the time stamps and is probably very long-winded. However, to prevent errors I would prefer to keep the date formats as dd/mm/ as opposed to a numeric format. Any help on the following code would be gratefully received:

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    is.between<-function(x, a, b) {
      (x > a) & (b > x)
    }
    ddate$s1 <- is.between(ddate, 01/12/1998 00:00:00, 31/12/1998 23:59:59)
    ddate$s2 <- is.between(ddate, 01/01/1999 00:00:00, 31/01/1999 23:59:59)
    ddate$s3 <- is.between(ddate, 01/02/1999 00:00:00, 28/02/1999 23:59:59)
    ddate$s4 <- is.between(ddate, 01/03/1999 00:00:00, 31/03/1999 23:59:59)

Many thanks
[R] more efficient way to parallel
Dear All,

Suppose I have a program as below: outside is a loop for simulation (with randomly generated data); inside there are several sapply() calls (10~100) over the data and something else, but these sapply() calls have to be sequential. Each sapply() does not involve very intensive calculation (a few seconds only), so the outside loop takes minutes to finish one iteration.

I guess the better way is to parallelize not the sapply() calls but the outer loop, but I have no idea how to modify it. I have some simple code here. Only two sapply() calls are involved for simplicity. The logic inside the sapply() calls is not important.

Thank you for your attention and suggestions.

    library(parallel)
    library(MASS)
    result.seq=c()
    Maxi <- 100
    for (i in 1:Maxi)
    {
      ## initialization, not of interest
      Sigmahalf <- matrix(sample(1:1,size = 1,replace =T ), 100)
      Sigma <- t(Sigmahalf)%*%Sigmahalf
      x <- mvrnorm(n=1000, rep(0, 10), Sigma)
      xlist <- list()
      for (j in 1:1000)
      {
        xlist[[j]] <- list(X = matrix( x [j, ],5))
      }
      ## end of initialization
      dd1 <- sapply(xlist,function(s) {min(abs((eigen(s$X))$values))})
      ##
      sumdd1=sum(dd1)
      for (j in 1:1000)
      {
        xlist[[j]]$dd1 <- dd1[j]/sumdd1
      }
      ## Assume dd2 and dd1 can not be combined in one sapply()
      dd2 <- sapply(xlist, function(s){min(abs((eigen(s$X))$values))+s$dd1})
      result.seq[i] <- sum(dd1*dd2)
    }
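One common way to parallelize the outer loop is to wrap a whole iteration in a function and hand the iteration indices to `parallel::mclapply()`. The sketch below assumes the iterations are independent of each other; `one_run()` is a simplified stand-in written for this example (small random 5 x 5 matrices rather than the poster's data), and the core count is an arbitrary choice.

```r
library(parallel)

# One complete simulation run; everything inside stays sequential.
# The body is a simplified stand-in for the loop body in the post.
one_run <- function(i, n_mat = 50) {
  xlist <- lapply(seq_len(n_mat), function(j) matrix(rnorm(25), 5))
  dd1 <- sapply(xlist, function(m) min(abs(eigen(m)$values)))
  w   <- dd1 / sum(dd1)
  dd2 <- dd1 + w          # second pass depends on the first, as in the post
  sum(dd1 * dd2)
}

# Forking is unavailable on Windows, where mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

RNGkind("L'Ecuyer-CMRG")  # RNG with independent streams for parallel workers
set.seed(1)
result <- unlist(mclapply(1:8, one_run, mc.cores = n_cores))
```

On Windows, `makeCluster()` with `parLapply()` from the same package is the usual substitute; in that case the function and any packages it needs have to be made available on the workers first.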
[R] Case study on R code speedup
Recently I looked into some ways to speed up a calculation in R (the Rayleigh Quotient is the example). I wanted to look at the byte-code compiler too. As a way of making notes I embedded my attempts in a knitR (.Rnw) file. The resulting pdf is linked from the Rwiki at http://rwiki.sciviews.org/doku.php?id=tips:rqcasestudy. R users may find the examples helpful.

John Nash
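For readers unfamiliar with the example, the Rayleigh quotient of a matrix A and vector x is x'Ax / x'x. The sketch below is not taken from the linked case study; it only illustrates the kind of comparison described, with three hypothetical function names (`rq_loop`, `rq_vec`, `rq_cmp`) for a loop version, a matrix-product version and a byte-compiled loop.

```r
# Rayleigh quotient x'Ax / x'x written with explicit loops.
rq_loop <- function(A, x) {
  n <- length(x)
  num <- 0
  den <- 0
  for (i in seq_len(n)) {
    den <- den + x[i]^2
    for (j in seq_len(n)) num <- num + x[i] * A[i, j] * x[j]
  }
  num / den
}

# The same quantity using matrix products.
rq_vec <- function(A, x) as.numeric(crossprod(x, A %*% x) / crossprod(x))

# Byte-compiled version of the loop. Recent R versions compile functions
# automatically, so the explicit call may make little difference there.
rq_cmp <- compiler::cmpfun(rq_loop)

A <- matrix(c(2, 1, 1, 3), 2)
x <- c(1, 1)
values <- c(loop = rq_loop(A, x), vec = rq_vec(A, x), cmp = rq_cmp(A, x))
```

For this small case x'Ax = 7 and x'x = 2, so all three return 3.5; wrapping each call in `system.time()` on a large matrix shows the speed differences.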
Re: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length
Hi Faz,

Here is one way of doing it where "x" is your data frame:

    x[, colMeans(is.na(x)) <= .15]

HTH,
Jorge.-

On Sun, Aug 5, 2012 at 9:04 PM, Faz Jones <> wrote:
> I have a dataframe of 10 different columns (length of each column is
> the same). I want to eliminate any column that has 'NA' greater than
> 15% of the column length. Do i first need to make a function for
> calculating the percentage of NA for each column and then make another
> dataframe where i apply the function? Whats the best way to do this.
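A small worked case makes the threshold logic easy to check by eye. The data frame below is made up for illustration; the only change to the one-liner above is `drop = FALSE`, which keeps the result a data frame even when a single column survives.

```r
# Made-up data: ten rows, three columns with different shares of NA.
x <- data.frame(
  a = c(NA, NA, 3, 4, 5, 6, 7, 8, 9, 10),   # 20% NA -> dropped
  b = c(NA, 2, 3, 4, 5, 6, 7, 8, 9, 10),    # 10% NA -> kept
  c = c(NA, NA, NA, 4, 5, 6, 7, 8, 9, 10)   # 30% NA -> dropped
)

# is.na(x) is a logical matrix, so colMeans() gives the NA share per column.
na_share <- colMeans(is.na(x))

# drop = FALSE keeps a data frame even if only one column remains.
kept <- x[, na_share <= 0.15, drop = FALSE]
```

Without `drop = FALSE`, the same call would return a plain numeric vector here, because only column `b` passes the 15% cutoff.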
[R] deleting columns from a dataframe where NA is more than 15 percent of the column length
I have a dataframe of 10 different columns (length of each column is the same). I want to eliminate any column that has 'NA' greater than 15% of the column length. Do I first need to make a function for calculating the percentage of NA for each column and then make another dataframe where I apply the function? What's the best way to do this?
[R] R: Help xts object Subset Date by Day of the Week
I have an xts object made of daily closing prices I have acquired using quantmod. Here is my code:

    library(xts)
    library(quantmod)
    library(lubridate)
    # Gets SPY data
    getSymbols("SPY")
    # Subset Prices to just closing price
    SP500 <- Cl(SPY)
    # Show day of the week for each date using 2-6 for monday-friday
    SP500wd <- wday(SP500)
    # Add Price and days of week together
    SP500wd <- cbind(SP500, SP500wd)
    # subset Monday into one xts object
    SPmon <- subset(SP500wd, SP500wd$..2=="2")

I then used the package lubridate to show the days of the week. Due to the requirement that an xts object be numeric, you will see each day is represented as a number so that Monday=2, Tuesday=3, Wednesday=4, Thursday=5, Friday=6, Saturday=7. Since this is a financial index you will only see the numbers 2-6, or Monday-Friday. I want to subset the data by using the day column. I would like some help to figure out the best way to accomplish a few objectives.

1. Subset the data so that I only show Mondays in sequence. However, I do want to make sure that it shows the date, price and the ..2 column (which is the day of week) after subsetting the data. (I have it done but am not sure if it is the best way.)

2. Rearrange the object (hopefully without destroying the xts object) so that my data lines up like a weekly calendar. So it would look like the following:

    Monday               | Tuesday              | Wednesday            | Thursday             | Friday
    Long Date Price  Idx | Long Date Price  Idx | Long Date Price  Idx | Long Date Price  Idx | Long Date Price  Idx
    1/5/2009  92.85  2   | 1/6/2009  93.47  3   | 1/7/2009  90.67  4   | 1/8/2009  84.4   5   | 1/9/2009  89.09  6
    1/12/2009 86.95  2   | 1/13/2009 87.11  3   | 1/14/2009 84.37  4   | 1/15/2009 91.04  5   | 1/16/2009 85.06  6
    MLK Monday           | 1/20/2009 80.57  3   | 1/21/2009 84.05  4   | 1/22/2009 82.75  5   | 1/23/2009 83.11  6
    1/26/2009 83.68  2   | 1/27/2009 84.53  3   | 1/28/2009 87.39  4   | 1/29/2009 84.55  5   | 1/30/2009 82.83  6

Thank you,
Douglas
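The weekday subsetting in objective 1 can be illustrated without any add-on packages. The sketch below uses a plain data frame rather than an xts object, which is an assumption made so that the example runs offline; the rows are a few values copied from the table in the post, and `spy` and `mondays` are names chosen here.

```r
# A few rows from the table in the post, held in a plain data frame.
spy <- data.frame(
  date  = as.Date(c("2009-01-05", "2009-01-06", "2009-01-07",
                    "2009-01-12", "2009-01-13", "2009-01-26")),
  close = c(92.85, 93.47, 90.67, 86.95, 87.11, 83.68)
)

# as.POSIXlt()$wday counts from 0 = Sunday, so adding 1 gives the
# 1 = Sunday ... 7 = Saturday coding described in the post (Monday = 2).
spy$wday <- as.POSIXlt(spy$date)$wday + 1L

# Keep only the Mondays, with date, price and weekday index together.
mondays <- spy[spy$wday == 2L, ]
```

The same comparison on the weekday column is what the `subset()` call in the post does; the result keeps all three columns.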
Re: [R] find date between two other dates
Hi,

Try this:

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    ddate1<-data.frame(date=ddate)
    date2<-c("01/12/1998 00:00:00", "31/12/1998 23:59:59", "01/01/1999 00:00:00",
             "31/01/1999 23:59:59", "01/02/1999 00:00:00", "28/02/1999 23:59:59",
             "01/03/1999 00:00:00", "31/03/1999 23:59:59")
    date3<-as.POSIXct(strptime(date2, "%d/%m/%Y %H:%M:%S"), "GMT")
    ddate1[ddate1$date<=date3[2]& ddate1$date>=date3[1],"Season"]<-1
    ddate1[ddate1$date<=date3[4]& ddate1$date>=date3[3],"Season"]<-2
    ddate1[ddate1$date<=date3[6]& ddate1$date>=date3[5],"Season"]<-3
    ddate1[ddate1$date<=date3[8]& ddate1$date>=date3[7],"Season"]<-4
    ddate1
                     date Season
    1 1998-12-29 20:00:33      1
    2 1999-01-02 05:20:44      2
    3 1999-01-02 06:18:36      2
    4 1999-02-02 07:06:59      3
    5 1999-03-02 07:10:56      4
    6 1999-03-02 07:57:18      4

A.K.

- Original Message -
From: penguins
To: r-help@r-project.org
Cc:
Sent: Sunday, August 5, 2012 4:30 PM
Subject: [R] find date between two other dates

Hi,

I am trying to assign "Season" values to dates depending on when they occur. For example, the following dates would be assigned the following "Season" numbers based on the "season" intervals detailed below in the code:

    ddate                Season
    29/12/1998 20:00:33  1
    02/01/1999 05:20:44  2
    02/01/1999 06:18:36  2
    02/02/1999 07:06:59  3
    02/03/1999 07:10:56  4
    02/03/1999 07:57:18  4

My approach so far doesn't work because of the time stamps and is probably very long-winded. However, to prevent errors I would prefer to keep the date formats as dd/mm/ as opposed to a numeric format. Any help on the following code would be gratefully received:

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    is.between<-function(x, a, b) {
      (x > a) & (b > x)
    }
    ddate$s1 <- is.between(ddate, 01/12/1998 00:00:00, 31/12/1998 23:59:59)
    ddate$s2 <- is.between(ddate, 01/01/1999 00:00:00, 31/01/1999 23:59:59)
    ddate$s3 <- is.between(ddate, 01/02/1999 00:00:00, 28/02/1999 23:59:59)
    ddate$s4 <- is.between(ddate, 01/03/1999 00:00:00, 31/03/1999 23:59:59)

Many thanks
Re: [R] find date between two other dates
Hello,

You can use a function that returns the number you want, not a logical value. But first, it's a bad idea to have a data.frame and a vector with the same name, so, in what follows, I've altered the df name.

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    ddat <- data.frame(ddate=ddate) # Here, different name.

    season.month <- function(x){
      x <- as.integer(format(x, format="%m"))
      ifelse(x == 12L, 1L, x + 1L)
    }

    season.month(ddate)
    ddat$season <- season.month(ddate)
    str(ddat)
    'data.frame': 6 obs. of 2 variables:
     $ ddate : POSIXct, format: "1998-12-29 20:00:33" "1999-01-02 05:20:44" ...
     $ season: int 1 2 2 3 4 4
    ddat
                    ddate season
    1 1998-12-29 20:00:33      1
    2 1999-01-02 05:20:44      2
    3 1999-01-02 06:18:36      2
    4 1999-02-02 07:06:59      3
    5 1999-03-02 07:10:56      4
    6 1999-03-02 07:57:18      4

Hope this helps,

Rui Barradas

On 05-08-2012 21:30, penguins wrote:

Hi,

I am trying to assign "Season" values to dates depending on when they occur. For example, the following dates would be assigned the following "Season" numbers based on the "season" intervals detailed below in the code:

    ddate                Season
    29/12/1998 20:00:33  1
    02/01/1999 05:20:44  2
    02/01/1999 06:18:36  2
    02/02/1999 07:06:59  3
    02/03/1999 07:10:56  4
    02/03/1999 07:57:18  4

My approach so far doesn't work because of the time stamps and is probably very long-winded. However, to prevent errors I would prefer to keep the date formats as dd/mm/ as opposed to a numeric format. Any help on the following code would be gratefully received:

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    is.between<-function(x, a, b) {
      (x > a) & (b > x)
    }
    ddate$s1 <- is.between(ddate, 01/12/1998 00:00:00, 31/12/1998 23:59:59)
    ddate$s2 <- is.between(ddate, 01/01/1999 00:00:00, 31/01/1999 23:59:59)
    ddate$s3 <- is.between(ddate, 01/02/1999 00:00:00, 28/02/1999 23:59:59)
    ddate$s4 <- is.between(ddate, 01/03/1999 00:00:00, 31/03/1999 23:59:59)

Many thanks
[R] Problem with segmented function
Hi,

I would appreciate your help with the segmented function. I am relatively new to R. I followed the introduction to the 'segmented' package by Vito Muggeo, but it still does not work. Here are the lines I wrote:

    data_test<-data.frame(x=c(1:10),y=c(1,1,1,1,1,2,3,4,5,6))
    lr_test<-lm(y~x,data_test)
    seg_test<-segmented(lr_test,seg.Z~x,psi=1)

    error in segmented.lm(lr_test, seg.Z ~ x, psi = 1) :
      A wrong number of terms in `seg.Z' or `psi'

Thank you very much,
Stella

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-segmented-function-tp4639227.html
[R] find date between two other dates
Hi,

I am trying to assign "Season" values to dates depending on when they occur. For example, the following dates would be assigned the following "Season" numbers based on the "season" intervals detailed below in the code:

    ddate                Season
    29/12/1998 20:00:33  1
    02/01/1999 05:20:44  2
    02/01/1999 06:18:36  2
    02/02/1999 07:06:59  3
    02/03/1999 07:10:56  4
    02/03/1999 07:57:18  4

My approach so far doesn't work because of the time stamps and is probably very long-winded. However, to prevent errors I would prefer to keep the date formats as dd/mm/ as opposed to a numeric format. Any help on the following code would be gratefully received:

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    is.between<-function(x, a, b) {
      (x > a) & (b > x)
    }
    ddate$s1 <- is.between(ddate, 01/12/1998 00:00:00, 31/12/1998 23:59:59)
    ddate$s2 <- is.between(ddate, 01/01/1999 00:00:00, 31/01/1999 23:59:59)
    ddate$s3 <- is.between(ddate, 01/02/1999 00:00:00, 28/02/1999 23:59:59)
    ddate$s4 <- is.between(ddate, 01/03/1999 00:00:00, 31/03/1999 23:59:59)

Many thanks

--
View this message in context: http://r.789695.n4.nabble.com/find-date-between-two-other-dates-tp4639231.html
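Two things stop the code in the question from running: the interval bounds are typed as bare text rather than parsed date-times, and `ddate$s1 <- ...` tries to add a column to a plain vector. One possible alternative, sketched below, parses the season start dates once and lets `findInterval()` assign the season number. The variable names `starts`, `season` and `seasons` are chosen here, and the extra end point (1 April 1999) is implied by the last interval in the question.

```r
ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
           "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
ddate <- as.POSIXct(ddate, format = "%d/%m/%Y %H:%M:%S", tz = "GMT")

# Start of each season, plus the end of the last one, as real date-times.
starts <- as.POSIXct(c("01/12/1998", "01/01/1999", "01/02/1999",
                       "01/03/1999", "01/04/1999"),
                     format = "%d/%m/%Y", tz = "GMT")

# findInterval() returns i when starts[i] <= x < starts[i + 1].
season <- findInterval(as.numeric(ddate), as.numeric(starts))
season[season == 0 | season == length(starts)] <- NA   # outside all seasons

# Store the result in a data frame with its own name, not in the vector.
seasons <- data.frame(ddate = ddate, Season = season)
```

The dates stay as POSIXct values throughout; the conversion to numbers happens only inside the `findInterval()` call.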
Re: [R] Package to remove collinear variables
There is no "magic bullet" (package) for your problem. You must either learn enough statistics to understand how to analyze your data, or consult with someone who does.

FWIW, collinearity is not in general amenable to automatic removal. However, you can identify which inputs are collinear with each other, and omit the redundant ones in the next iteration of your analysis, using (for example) the approach suggested by Uwe. Deciding WHICH of the redundant inputs is most appropriate to keep is the part computers are not so good at... that is where you must be smarter or more creative than the computer.

Also, it would help you get responses if you included the context (earlier discussion) in your replies. Most people do not use Nabble here. Reading and following the requests in the footer of every message will also help.

---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Roberto wrote:
> I do not know, because I tried to use the rfe function (Backwards Feature
> Selection, caret package) to select wavelengths useful for a prediction
> model. However, the rfe function gives me back a lot of warning messages
> about collinearity between variables.
>
> So, I do not know if your script can be useful.
> I tried to use VIF-Regression to select variables, but the rfe function
> gives me the same warning messages again.
>
> What do you think about that?
>
> Thank you very much for your help.
>
> Best,
> Roberto
Re: [R] Find out what "native.enc" corresponds to
Le dimanche 05 août 2012 à 10:04 +0100, Prof Brian Ripley a écrit : > On 05/08/2012 09:54, Milan Bouchet-Valat wrote: > > Hi! > > > > I'm using R2HTML in my RcmdrPlugin.temis package to output localized > > strings to a HTML file. Thus, I insert a simple header at the top of the > > file to specify what encoding is used; if I don't do that, Web browsers > > assume it is latin1, which is not always true. > > > > My problem is, I could not find a way to detect what encoding is used by > > R2HTML in the most general case. R2HTML simply calls cat() with the file > > name, which means the text connection is opened using file(encoding = > > getOption("encoding")). This is fine, except that when > > getOption("encoding")) is set to "native.enc", I'm not able to find out > > the real encoding that was used for output. > > > > Of course, ideally I would tell R2HTML to output everything as UTF-8, > > and I would add this information to the header. But AFAICT this is not > > possible in the current state of this package. So I would be very > > grateful if somebody could provide me with a solution to resolve > > "native.enc" to the encoding name. > > ?options points you to ?connections, which does explain this. See > Sys.getlocale("LC_CTYPE") to see > > 'the internal encoding of the current locale' > > (or at least, what the OS claims it to be: e.g. some lie about 'C' locales). Thanks for the pointers, but the issue is/was that LC_CTYPE does not provide a valid encoding name. But your reply prompted me to read ?iconv again, and I discovered the existence of localeToCharset(), which seems to provide me with the encoding name I'm looking for. > As for a name, iconv() knows this as "" (and some OSes do make it rather > hard to find a name if it is not part of the locale name). I'm afraid I don't understand what you mean. Do you suggest I encode data to/from the current encoding? 
Regards
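A minimal sketch of the approach mentioned above, resolving the native encoding through `localeToCharset()` (in the utils package). The `<meta>` header line and the variable names are assumptions for illustration. Note also that R's charset names (for example "ISO8859-1" or "CP1252") are not always spelled the way browsers expect, so some mapping may still be needed.

```r
# Locale string as reported by the OS; its form is platform dependent.
ctype <- Sys.getlocale("LC_CTYPE")

# Candidate charset names for the current locale; may contain NA if unknown.
charset <- utils::localeToCharset()

# Keep the first candidate, or NA when nothing usable was found.
charset_name <- if (length(charset) > 0 && !is.na(charset[1])) charset[1] else NA_character_

# Hypothetical header line for the HTML output; left empty when unresolved.
meta <- if (!is.na(charset_name)) {
  sprintf('<meta http-equiv="Content-Type" content="text/html; charset=%s">',
          charset_name)
} else {
  ""
}
meta
```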
Re: [R] Accessing more than two coefficients in a plot
I needed to create my own forecast from the square root, linear and quadratic coefficients and then the abline() plot worked fine. # Forecast l using non-linear regression coeffs - unweighted lm2.bforecast<- numeric(n) for (i in 1:n) { lm2.bforecast[i] <- lm2.b$coeff["(Intercept)"]+lm2.b$coeff["VV1_2"]*VV1_2[i]+lm2.b$coeff["VV1_22"]*VV1_22[i]+lm2.b$coeff["VV1_212"]*VV1_212[i] } lm2.bforecastline<-lm(lm2.bforecast ~ VV1_2, method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE) # unweighted, non-linear regression forecast plot(VV1_2, Lambda1_2, ylim=yrange, tck=1, main="Verizon V(1) Parameters (V, V^2 & V^0.5) Unweighted", xlab="VV1_2", ylab="Lambda1_2 & Beta1_2",pch=19,col="red") {points(VV1_2, lm2.lforecast, pch=19, col="brown") abline(lm2.lforecastline, col="brown", lty="longdash", lwd=2) ... > Date: Sun, 25 Mar 2012 15:36:20 -0700 > Subject: Re: [R] Accessing more than two coefficients in a plot > From: gunter.ber...@gene.com > To: chicagobrownb...@hotmail.com > CC: r-help@r-project.org > > Well, as a line in the plane is determined by 2 coefficients only, I'd > guess that trying to find an R function that plots a line defined by 4 > coefficients has about the same chance of success as finding a unicorn > with 3 horns. > > You do understand that your linear model defines a hyperplane in your > three covariates, do you not? Or do I misunderstand what you have > requested? > > Cheers, > Bert > > On Sun, Mar 25, 2012 at 2:32 PM, FJ M wrote: > > > > I've successfully plotted (in the plot and abline code below) a simple > > regression of Lambda1_2 on VV1_2. I then successfully regressed Lambda1_2 > > on VV1_2, VV1_22 and VV1_212 producing lm2.l. 
When I go to plot lm2.l using > > abline I get the warning: > > > > "1: In abline(lm2.l, col = "brown", lty = "dotted", lwd = 2) : only using > > the first two of 4 regression coefficients" > > > > Is there another function like abline that will produce a line using the > > constant and three coefficients from the lm2.l regression? > > > > > > lm.l <- lm(Lambda1_2 ~ VV1_2, method = "qr", model = TRUE, x = FALSE, y = > > FALSE, qr = TRUE) # unweighted regression > > > > lm2.l <- lm(Lambda1_2 ~ VV1_2 + VV1_22 + VV1_212, method = "qr", model = > > TRUE, x = FALSE, y = FALSE, qr = TRUE) # unweighted regression > > > > plot(VV1_2, Lambda1_2, ylim=yrange, tck=1, main="V(1) Parameters (V, V^2 & > > V^0.5)", xlab="VV1_2", ylab="Lambda & Beta1_2",pch=19,col="red") > > {abline(lm2.l, col="brown", lty="dotted", lwd=2) > > abline(wlm2.l, col="gold",lty="longdash", lwd=2) > > points(VV1_2, Beta1_2, pch=19, col="blue") > > abline(lm2.b, col="black",lty="dotted", lwd=2) > > abline(wlm2.b, col="blue", lty="longdash", lwd=2) > > legend("topright", inset=.05, title="Parameters", > > labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors) > > } > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > -- > > Bert Gunter > Genentech Nonclinical Biostatistics > > Internal Contact Info: > Phone: 467-7374 > Website: > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
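Since `abline()` can only draw a straight line from an intercept and one slope, a common alternative for a model in V, V^2 and V^0.5 is to evaluate the fitted model over a grid with `predict()` and draw the result with `lines()`. The sketch below uses made-up data and hypothetical names (`V`, `Lambda`, `fit`, `grid`) and writes the transformed terms inside the formula, so it does not reproduce the thread's separate `VV1_22` and `VV1_212` columns.

```r
set.seed(42)
# Made-up data standing in for VV1_2 and Lambda1_2.
V <- seq(1, 20, length.out = 40)
Lambda <- 2 + 0.5 * V - 0.01 * V^2 + 1.5 * sqrt(V) + rnorm(40, sd = 0.3)

# Linear, quadratic and square-root terms: four coefficients in total.
fit <- lm(Lambda ~ V + I(V^2) + sqrt(V))

# Evaluate the fitted curve on a fine grid of V values.
grid <- data.frame(V = seq(min(V), max(V), length.out = 200))
grid$fitted <- predict(fit, newdata = grid)

plot(V, Lambda, pch = 19, col = "red", xlab = "V", ylab = "Lambda")
lines(grid$V, grid$fitted, col = "brown", lty = "longdash", lwd = 2)
```

Because the transformations live in the formula, `predict()` rebuilds them from `V` alone, which avoids computing the forecast by hand in a loop.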
Re: [R] Package to remove collinear variables
I do not know, because I tried to use the rfe function (Backwards Feature Selection, caret package) to select wavelengths useful for a prediction model. However, the rfe function gives me back a lot of warning messages about collinearity between variables.

So, I do not know if your script can be useful. I tried to use VIF-Regression to select variables, but the rfe function gives me the same warning messages again.

What do you think about that?

Thank you very much for your help.

Best,
Roberto
Re: [R] Extracting desired numbers from complicated lines of web pages
try this: left as an exercise to the reader if these have to be grouped by 'userid' which might be the case and therefore you might want to check for non-existent values. Also on the last line you did not say it there are only those three values, or could there be more. input <- readLines(textConnection(' + [1] "\t\t\t108 Friends" + + [2] "\t\t\t151 Reviews" + + [3] "\t\t\t\t5 Review Updates" + + [4] "\t\t\t\t1 First" + + [5] "\t\t\t\t2 Fans" + + [6] "\t\t\t\t54 Local Photos" + + [7] http://s3-media2.ak.yelpcdn.com/assets/0/www/img/cf265851428e/ico/reviewVotes.gif"; alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool + + [[alternative HTML version deleted]]')) > > # extract the data by brute force and then break apart into a dataframe > count <- lapply(input, function(.line){ + if (grepl('[0-9]+ Friends', .line)) + return(sub(".*>([0-9]+) (Friends).*", "\\1:\\2", .line)) + if (grepl("[0-9]+ Reviews", .line)) + return(sub(".*>([0-9]+) (Reviews).*", "\\1:\\2", .line)) + if (grepl("[0-9]+ Review Update", .line)) + return(sub(".*>([0-9]+) (Review Update).*", "\\1:\\2", .line)) + if (grepl("[0-9]+ First", .line)) + return(sub(".*>([0-9]+) (First).*", "\\1:\\2", .line)) + if (grepl("[0-9]+ Fans", .line)) + return(sub(".*>([0-9]+) (Fans).*", "\\1:\\2", .line)) + if (grepl("[0-9]+ Local Photos", .line)) + return(sub(".*>([0-9]+) (Local Photos).*", "\\1:\\2", .line)) + if (grepl("[0-9]+ Useful", .line)) + return(c( # vector with multiple values + sub(".* ([0-9]+) (Useful).*", "\\1:\\2", .line) + , sub(".* ([0-9]+) (Funny).*", "\\1:\\2", .line) + , sub(".* ([0-9]+) (Cool).*", "\\1:\\2", .line) + )) + return(NULL) + }) > > # create dataframe > df <- data.frame(do.call(rbind, strsplit(unlist(count), ":"))) > names(df) <- c("Value", "Variable") > df Value Variable 1 108 Friends 2 151 Reviews 3 5 Review Update 4 1 First 5 2 Fans 654 Local Photos 7 2022Useful 8 1591 Funny 9 1756 Cool > > > > On Sun, Aug 5, 2012 at 11:16 AM, Shelby McIntyre wrote: > I need to extract the 
indicted (bold & underlined) numbers from lines coming > off web pages. > > Of course I don't know ahead of time the location or length of the number. > What I do know > is the tag "Friends", and "Reviews", etc. In fact, it would be good to end up > with > > Value Variable > 108 Friends > 151 Reviews > 5 Review Updates > NA First <-- assuming here that "First" did not show > up on an line > etc. > > Of particular trouble is line [7] which requires extracting 3 numbers 2022 > (Useful), 1591 (Funny) and 1756 (Cool). > == Extraction problem lines === > > [1] "\t\t\t href=\"/user_details_friends?userid=--T8djg0nrb_yMMMA3Y0jQ\">108 > Friends" > > [2] "\t\t\t href=\"/user_details_reviews_self?userid=--T8djg0nrb_yMMMA3Y0jQ\">151 > Reviews" > > [3] "\t\t\t\t5 Review Updates" > > [4] "\t\t\t\t href=\"/user_details_reviews_self?review_filter=first&userid=--T8djg0nrb_yMMMA3Y0jQ\">1 > First" > > [5] "\t\t\t\t2 Fans" > > [6] "\t\t\t\t href=\"/user_local_photos?userid=--T8djg0nrb_yMMMA3Y0jQ\">54 Local > Photos" > > [7] src="http://s3-media2.ak.yelpcdn.com/assets/0/www/img/cf265851428e/ico/reviewVotes.gif"; > alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
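As a more compact variant of the brute-force extraction above, the number/tag pairs can be pulled out in one pass with `gregexpr()` and `regmatches()`, then matched against the list of expected tags so that missing ones come out as `NA`. This is only a sketch: the input strings are simplified stand-ins (the `<a href="#">` markup is hypothetical, and the "First" line is deliberately left out to show the `NA` case).

```r
# Simplified stand-ins for the scraped lines.
page_lines <- c(
  "\t\t\t<a href=\"#\">108 Friends</a>",
  "\t\t\t<a href=\"#\">151 Reviews</a>",
  "\t\t\t\t5 Review Updates",
  "\t\t\t\t2 Fans",
  "\t\t\t\t<a href=\"#\">54 Local Photos</a>",
  "alt=\"\"> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool"
)

# Tags to look for; the longer "Review Updates" is listed before "Reviews".
tags <- c("Friends", "Review Updates", "Reviews", "First", "Fans",
          "Local Photos", "Useful", "Funny", "Cool")
pattern <- paste0("([0-9]+) (", paste(tags, collapse = "|"), ")")

# All "<number> <tag>" matches across all lines, e.g. "2022 Useful".
hits <- unlist(regmatches(page_lines, gregexpr(pattern, page_lines)))

found_value <- as.integer(sub(" .*$", "", hits))  # leading number
found_tag   <- sub("^[0-9]+ ", "", hits)          # everything after it

# One row per expected tag; tags that never appeared get NA.
result <- data.frame(Value = found_value[match(tags, found_tag)],
                     Variable = tags)
result
```

If the lines belong to several users, the same steps could be applied per user, for example by splitting the input on the `userid` first.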
Re: [R] Parallel runs of an external executable with snow in local
On 03.08.2012 19:21, Xavier Portell/UPC wrote:
> Hi everyone,
>
> I'm aiming to run an external executable (say filetorun.EXE) in parallel. The external executable collects the data it needs from a file, say "input.txt", and in turn generates several output files, say "output.txt". I need to generate "input.txt", run the executable and keep "input.txt" and "output.txt". I'm using Windows 7, R version 2.15.1 (2012-06-22) on RStudio and platform: i386.pc.mingw32/i386 (32-bit).
>
> My first attempt was R code which, by using
>
>     system("filetorun.EXE", intern = F, ignore.stdout = F, ignore.stderr = F, wait = T, input = NULL, show.output.on.console = T, minimized = F, invisible = T)
>
> ran the executable and saved the required files to a conveniently named folder. After that I changed my previous R script so I could use the function lapply(). This script apparently worked fine. Finally, I tried to parallelize the problem by using snow and parLapply(). The resulting script looks like this:
>
>     ## Not run
>     library(snow)
>     cl <- makeCluster(3, type = "SOCK")
>     clusterExport(cl, list('param.esp', 'copy.files', 'for12.template', 'program.executor'))
>     parLapply(cl, a.list, a.function)
>     stopCluster(cl)
>     ## End not run
>
> Although it runs, the parallelized version is messing up the input parameters passed to the executable (see the table below, where parameters P1 and P2 are considered; ".s" comes from the serial code and ".p" from the parallelized one):
>
>       s r P1.s P2.s P1.p P2.p
>     1 1 1  1.0 3.00  2.0 3.00
>     2 2 1  1.5 3.00  2.0 3.75
>     3 3 1  2.0 3.00  2.0 3.00
>     4 4 1  1.0 3.75  1.5 3.00
>     5 5 1  1.5 3.75  1.5 3.00
>     6 6 1  2.0 3.75  2.0 3.75
>
> My first thought to avoid the described behaviour was creating a temporary file, say "tmp.id", with id being an identification run number, and copying "filetorun.EXE" and "Input.txt" to "tmp.id". However, while doing so, I realised that although the correct "filetorun.EXE" copy is run (i.e., the one in "tmp.id"), R looks for "input.txt" in the working directory.

Not sure about the real setup, but you can actually specify the path, not only filenames.

Uwe Ligges

> I've been looking thoroughly for a solution but I got nothing. Thanks for any help in advance,
>
> Xavier Portell Canal
> PhD candidate
> Department of Agri-food engineering, Universitat Politècnica de Catalunya
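One way to act on the advice about paths, sketched under stated assumptions: give every run its own directory, write that run's "input.txt" there, and launch the executable with that directory as the working directory so the runs cannot overwrite each other's files. The function `run_one`, the `run_<id>` folder naming, and the input file layout are all hypothetical. The executable call is skipped unless a path is supplied, so the sketch runs without filetorun.EXE.

```r
# Prepare (and optionally execute) one run inside its own directory.
run_one <- function(id, params, base_dir, exe = NULL) {
  run_dir <- file.path(base_dir, paste0("run_", id))
  dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)

  # Hypothetical input format: one "name = value" line per parameter.
  writeLines(sprintf("P1 = %s\nP2 = %s", params$P1, params$P2),
             file.path(run_dir, "input.txt"))

  if (!is.null(exe)) {
    # The executable reads input.txt from its working directory, so switch
    # there for the duration of the call and restore the old one afterwards.
    old <- setwd(run_dir)
    on.exit(setwd(old), add = TRUE)
    system2(exe, stdout = "console.log")  # exe should be an absolute path
  }
  run_dir
}

grid <- expand.grid(P1 = c(1, 1.5, 2), P2 = c(3, 3.75))
base_dir <- file.path(tempdir(), "runs")
jobs <- seq_len(nrow(grid))

# Serial version; each job only touches its own folder.
dirs <- lapply(jobs, function(i) run_one(i, grid[i, ], base_dir))

# Parallel version (not run here); the worker function is unchanged:
# library(parallel)
# cl <- makeCluster(3)
# clusterExport(cl, c("run_one", "grid", "base_dir"))
# dirs <- parLapply(cl, jobs, function(i)
#   run_one(i, grid[i, ], base_dir, exe = "C:/path/to/filetorun.EXE"))
# stopCluster(cl)
```

Keeping the per-run folders also satisfies the requirement to retain both "input.txt" and "output.txt" for every parameter combination.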
Re: [R] PROGRAMM MATRIX
Hi,

I think it would be better if you posted this on R-help.

A.K.

----- Original Message -----
From: "hafida...@hotmail.fr"
To: smartpink...@yahoo.com
Cc:
Sent: Sunday, August 5, 2012 9:01 AM
Subject: PROGRAMM MATRIX

Hi, can you please help me to program this formula:

    g[ll'] = i[ll'] - sum from j=1 to k of c[lj] c[l'j] A[j]^-1

where

    i[ll'] = 1/n sum from i=1 to n of z[il] z[il']

n, k, m are given; j = 1...k; l, l' = 1...m. It is complicated for me; I hope you can help me. Thank you a lot.
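The formula above can be written without explicit loops, sketched here under some assumptions that the original message does not state: `Z` is an n x m matrix holding z[il], `C` is an m x k matrix holding c[lj], and `A` is a numeric vector of length k, so that A[j]^-1 means 1/A[j]. The data are random placeholders.

```r
set.seed(1)
n <- 50; m <- 4; k <- 3
Z <- matrix(rnorm(n * m), n, m)   # z[i, l]
C <- matrix(rnorm(m * k), m, k)   # c[l, j]
A <- runif(k, 1, 2)               # A[j], assumed scalar and non-zero

# i[l, l'] = (1/n) * sum_i z[i, l] * z[i, l']
I_mat <- crossprod(Z) / n

# sum_j c[l, j] * c[l', j] / A[j]; t(C) / A divides row j of t(C) by A[j].
G <- I_mat - C %*% (t(C) / A)

# Spot check of one element against the formula written out literally.
l <- 2; lp <- 3
g_direct <- sum(Z[, l] * Z[, lp]) / n - sum(C[l, ] * C[lp, ] / A)
all.equal(G[l, lp], g_direct)
```

If A[j] is instead meant to be a matrix for each j, the second term would need a different construction.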
[R] Extracting desired numbers from complicated lines of web pages
I need to extract the indicated (bold & underlined) numbers from lines coming off web pages. Of course I don't know ahead of time the location or length of the number. What I do know is the tag "Friends", "Reviews", etc. In fact, it would be good to end up with

    Value  Variable
    108    Friends
    151    Reviews
    5      Review Updates
    NA     First    <-- assuming here that "First" did not show up on any line
    etc.

Of particular trouble is line [7], which requires extracting 3 numbers: 2022 (Useful), 1591 (Funny) and 1756 (Cool).

== Extraction problem lines ===

[1] "\t\t\t108 Friends"
[2] "\t\t\t151 Reviews"
[3] "\t\t\t\t5 Review Updates"
[4] "\t\t\t\t1 First"
[5] "\t\t\t\t2 Fans"
[6] "\t\t\t\t54 Local Photos"
[7] http://s3-media2.ak.yelpcdn.com/assets/0/www/img/cf265851428e/ico/reviewVotes.gif"; alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool

[[alternative HTML version deleted]]
[R] trouble with looping for effect of sampling interval increase
I've looked everywhere and tinkered for three days now, so I figure asking might be good. So here's a general rundown of what I am trying to get my code to do I am giving you the whole rundown because I need a solution that retain certain ways of doing things because they give me the information i need. I want to examine the effect of increasing my sampling interval on my data. Example: what if instead of sampling every hour I sampled every two, oh yeah, how about every three?.. etc ad nausea. How I want to do this is to take the data I have now, add an index to it, that contains counters. Those counters will look something like 1,2,1,2,.. for the first one, 1,2,3,1,2,3.. for the next one. I have a lot of them, like say a thousand... Then for each column in the index my loops should start in the first column, run only the ones, store that, then run the twos, and store that in the same column of output in a different row. Then move to the next column run the ones, store in the next column of output, run the twos, store in the next row of that column, run the threes, etc on out until there is no more. I want to use this index for a number of reasons. The first is that after this I will be going back through and using a different method for sub-sampling but keeping all else the same. So all I have to do there is change the way I generate the index. The second is that it allows me to run many subsamples and see their range. So the code I have made, generates my index, and does the heavy lifting all correctly, as well as my averages, and quartiles, but a look at the head () of my key output (IntervalBetas) shows that something has gone a miss. You have to look close to catch it. The values generated for each row of output are identical, this should not be the case, as row one of the first output column should be generated from all values indexed by a one in the first column, whereas in column two there are different values indexed by the number one. 
I've checked about everything I can think of, done print() on my loop sequence things (those little i and j) and wiggled about everything. I am flummoxed. I think the bit that is messing up is in here : #Here is the loop for betas from sampling interval increase c <- WHOLESIZE[2]-1 for (i in 1:c) { x <- length(unique(index[,i])) for (j in 1:x) { data <- WHOLE [WHOLE[,x]==j,1] But also here is the whole code in case I am wrong that that is the problem area: #loop for making index #clean dataset of empty cells dataset <- na.omit (datasetORIGINAL) #how messed up was the data? holeyDATA <- datasetORIGINAL - dataset D <- dim(dataset) #what is the smallest sample? tinysample <- 100 #how long is the dataset? datalength <- length (dataset) #MD <- how many divisions MD <- datalength/tinysample #clear things up for the index loop WHOLE <- NULL index <- NULL #do the index loop for (a in 1:MD) { index <- cbind (index, rep (1:a, length = D[1])) } index <- subset(index, select = -c(1) ) #merge dataset and index loop WHOLE <- cbind (dataset, index) WHOLESIZE <- dim (WHOLE) #Housekeeping before loops IntervalBetas <- NULL IntervalBetas <- c(NA,NA) IntervalBetas <- as.data.frame (IntervalBetas) IntervalLowerQ <- NULL IntervalUpperQ <- NULL IntervalMean <- NULL IntervalMedian <- NULL #Here is the loop for betas from sampling interval increase c <- WHOLESIZE[2]-1 for (i in 1:c) { x <- length(unique(index[,i])) for (j in 1:x) { data <- WHOLE [WHOLE[,x]==j,1] #get power spectral density PSDPLOT <- spectrum (data, detrend = TRUE, plot = FALSE) frequency <- PSDPLOT$freq PSD <- PSDPLOT$spec #log transform the power spectral density Logfrequency <- log(frequency) LogPSD<- log(PSD) #fit my line to the data Line <- lm (LogPSD ~ Logfrequency) #store the slope of the line Betas <- rbind (Betas, -coef(Line)[2]) #Get values on the curve shape BSkew <- skew (Betas) BMean <- mean (Betas) BMedian <- median (Betas) Q <- quantile (Betas) #store curve shape values IntervalLowerQ <- rbind 
(IntervalLowerQ , Q[2]) IntervalUpperQ <- rbind (IntervalUpperQ , Q[4]) IntervalSkew <- rbind (IntervalSkew , BSkew) IntervalMean <- rbind (IntervalMean , BMean) IntervalMedian <- rbind (IntervalMedian , BMedian) #Store the Betas #This is a pain BetaSave <- Betas no.r <- nrow(IntervalBetas) l.v <- length(BetaSave) difer <- no.r - l.v difers <- abs(difer) if (no.r < l.v){ IntervalBetas <- rbind(IntervalBetas,rep(NA,difers)) } else { (BetaSave <- rbind(BetaSave,rep(NA,difers))) } IntervalBetas <- cbind (IntervalBetas, BetaSave) } } #That ends the loop within a loop for how sampling interval #changes beta head (IntervalBetas) -- View this message in context: http://r.789695.n4.nabble.com/trouble-with-looping-for-effect-of-sampling-interval-increase-tp4639213.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://st
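One possible reading of the symptom, offered as a guess rather than a confirmed diagnosis: in the posted loop `Betas` is never reset, and the accumulated vector is bound onto `IntervalBetas` on every pass, so every output column starts with the same first value. The sketch below restructures the computation so that each sampling interval produces its own fresh vector of betas. The data are simulated, and the names `series`, `beta_of` and `max_interval` are hypothetical.

```r
set.seed(1)
series <- cumsum(rnorm(600))   # made-up stand-in for the real data column

# Negative slope of log(power) against log(frequency) for one sub-series.
beta_of <- function(x) {
  s <- spectrum(x, detrend = TRUE, plot = FALSE)
  fit <- lm(log(s$spec) ~ log(s$freq))
  -unname(coef(fit)[2])
}

max_interval <- 5

# For interval k, the counter runs 1..k repeatedly; one beta per counter value.
betas <- lapply(2:max_interval, function(k) {
  counter <- rep(seq_len(k), length.out = length(series))
  vapply(seq_len(k), function(j) beta_of(series[counter == j]), numeric(1))
})

# Pad the shorter vectors with NA so the results fit one table:
# one column per interval, one row per counter value.
n_rows <- max(lengths(betas))
IntervalBetas <- sapply(betas, function(b) c(b, rep(NA_real_, n_rows - length(b))))
colnames(IntervalBetas) <- paste0("every_", 2:max_interval)

head(IntervalBetas)
```

Summary statistics per interval (mean, median, quartiles) can then be taken column by column, for example with `apply(IntervalBetas, 2, quantile, na.rm = TRUE)`. Changing the subsampling scheme later only requires changing how `counter` is built.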
Re: [R] Coloring Counties in a State Map
Hi Ray, that was really helpful, thank you!!!
Re: [R] ggplot2 boxplot help
Duh, I'm more dyslexic than usual obviously. John Kane Kingston ON Canada > -Original Message- > From: ruipbarra...@sapo.pt > Sent: Sun, 05 Aug 2012 17:07:38 +0100 > To: jrkrid...@inbox.com > Subject: Re: [R] ggplot2 boxplot help > > Hello, > > Wasn't it supposed to be a boxplot? > Anyway, the main problem seems to be a df format conversion prior to > plotting. > > dat2 <- data.frame(sample=rep(NA, 2*nrow(dat))) > dat2$sample <- with(dat1, c(as.character(sample_1), > as.character(sample_2))) > dat2$value <- with(dat1, c(value_1, value_2)) > dat2 > > qplot(sample, value, data=dat2, geom="boxplot") > > Hope this helps, > > Rui Barradas > > Em 05-08-2012 16:10, John Kane escreveu: >> Please use dput() to supply sample data. >> >> I think this does something like what you want. >> ===### >> ibrary(ggplot2) >> library(reshape2) >> >> dat1<-read.table(text=" >> sample_1 sample_2 value_1 value_2 >> N C 1.9268400 36.77590 >> N C 0.1817890 5.58835 >> N C0.2309000 7.54035 >> N C 0.0294559 1.50886 >> N C 0.4678610 14.75560 >> N C 10.7258000 92.13150", >> sep="",header=TRUE) >> >> >> bb <- melt(dat1) >> >> p <- ggplot(bb , aes(variable, value, fill =as.factor(value ) )) + >>geom_bar(stat= "identity", position = "dodge") + >> scale_fill_discrete(name = "Fancy Title") + >>scale_x_discrete(breaks=c("value_1", "value_2"), >> labels=c("Sample 1", "Sample 2")) >> p >> #### >> >> John Kane >> Kingston ON Canada >> >> >>> -Original Message- >>> From: alexpadron1...@gmail.com >>> Sent: Sat, 4 Aug 2012 14:21:49 -0700 (PDT) >>> To: r-help@r-project.org >>> Subject: [R] ggplot2 boxplot help >>> >>> Hello, >>> >>> I have a data set that looks like this: >>> >>> name G-ID test_id g-id g >>> 1 00077464 C_068131 C_068131 OC_068131- >>> 2 00051728 C_044461 C_044461 OC_044461- >>> 3 00058738 C_050343 C_050343 OC_050343- >>> 4 00059239 C_050649 C_050649 OC_050649- >>> 5 1761 C_000909 C_000909 OC_000909- >>> 6 5119 C_002752 C_002752 OC_002752- >>> locssample_1 sample_2 value_1 >>> value_2 >>> 1 
37316550-37317847 N C 1.9268400 >>> 36.77590 >>> 2 27058468-27060176 N C 0.1817890 >>> 5.58835 >>> 3 4761739-4763268N C0.2309000 >>> 7.54035 >>> 4 14565311-14567393 N C 0.0294559 >>> 1.50886 >>> 5 38670994-38675694 N C 0.4678610 >>> 14.75560 >>> 6 48362804-48380794 N C 10.7258000 >>> 92.13150 >>> >>> >>> >>> In this dataset, sample_1 corresponds to value_1 and sample_2 >>> corresponds >>> to >>> value_2. How can I graph this in ggplot2's boxplot function? I am not >>> quite >>> sure how to tell R that sample_1 and sample_2 columns correspond to >>> value_1 >>> and value_2 using ggplot2. >>> >>> Can anyone shed some light on this? >>> >>> Thanks. >>> >>> >>> >>> -- >>> View this message in context: >>> http://r.789695.n4.nabble.com/ggplot2-boxplot-help-tp4639187.html >>> Sent from the R help mailing list archive at Nabble.com. >>> >>> __ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> FREE ONLINE PHOTOSHARING - Share your photos online with your friends >> and family! >> Visit http://www.inbox.com/photosharing to find out more! >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > FREE 3D MARINE AQUARIUM SCREENSAVER - Watch dolphins, sharks & orcas on your desktop! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2 boxplot help
Hello,

Wasn't it supposed to be a boxplot? Anyway, the main problem seems to be a df format conversion prior to plotting.

    dat2 <- data.frame(sample = rep(NA, 2 * nrow(dat1)))
    dat2$sample <- with(dat1, c(as.character(sample_1), as.character(sample_2)))
    dat2$value <- with(dat1, c(value_1, value_2))
    dat2

    qplot(sample, value, data = dat2, geom = "boxplot")

Hope this helps,

Rui Barradas

On 05-08-2012 16:10, John Kane wrote:
> Please use dput() to supply sample data.
>
> I think this does something like what you want.
>
>     library(ggplot2)
>     library(reshape2)
>
>     dat1 <- read.table(text = "
>     sample_1 sample_2 value_1 value_2
>     N C 1.9268400 36.77590
>     N C 0.1817890 5.58835
>     N C 0.2309000 7.54035
>     N C 0.0294559 1.50886
>     N C 0.4678610 14.75560
>     N C 10.7258000 92.13150",
>     sep = "", header = TRUE)
>
>     bb <- melt(dat1)
>
>     p <- ggplot(bb, aes(variable, value, fill = as.factor(value))) +
>       geom_bar(stat = "identity", position = "dodge") +
>       scale_fill_discrete(name = "Fancy Title") +
>       scale_x_discrete(breaks = c("value_1", "value_2"),
>                        labels = c("Sample 1", "Sample 2"))
>     p
>
> John Kane
> Kingston ON Canada
>
>> -----Original Message-----
>> From: alexpadron1...@gmail.com
>> Sent: Sat, 4 Aug 2012 14:21:49 -0700 (PDT)
>> To: r-help@r-project.org
>> Subject: [R] ggplot2 boxplot help
>>
>> Hello,
>>
>> I have a data set that looks like this:
>>
>>     name G-ID test_id g-id g
>>     1 00077464 C_068131 C_068131 OC_068131-
>>     2 00051728 C_044461 C_044461 OC_044461-
>>     3 00058738 C_050343 C_050343 OC_050343-
>>     4 00059239 C_050649 C_050649 OC_050649-
>>     5 1761 C_000909 C_000909 OC_000909-
>>     6 5119 C_002752 C_002752 OC_002752-
>>
>>     locs               sample_1 sample_2 value_1    value_2
>>     1 37316550-37317847  N        C         1.9268400 36.77590
>>     2 27058468-27060176  N        C         0.1817890  5.58835
>>     3 4761739-4763268    N        C         0.2309000  7.54035
>>     4 14565311-14567393  N        C         0.0294559  1.50886
>>     5 38670994-38675694  N        C         0.4678610 14.75560
>>     6 48362804-48380794  N        C        10.7258000 92.13150
>>
>> In this dataset, sample_1 corresponds to value_1 and sample_2 corresponds to value_2. How can I graph this in ggplot2's boxplot function? I am not quite sure how to tell R that sample_1 and sample_2 columns correspond to value_1 and value_2 using ggplot2.
>>
>> Can anyone shed some light on this?
>>
>> Thanks.
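The wide-to-long conversion discussed in this thread can be sketched in a few lines of base R: stack the two sample columns into one `sample` column and the two value columns into one `value` column, so that each row is a single (sample, value) pair. The plotting step is guarded because it assumes the ggplot2 package is installed; the data are the six rows posted in the thread.

```r
dat1 <- read.table(text = "
sample_1 sample_2 value_1 value_2
N C 1.9268400 36.77590
N C 0.1817890 5.58835
N C 0.2309000 7.54035
N C 0.0294559 1.50886
N C 0.4678610 14.75560
N C 10.7258000 92.13150", header = TRUE, stringsAsFactors = FALSE)

# Long format: sample_1 pairs with value_1, sample_2 pairs with value_2.
dat2 <- data.frame(
  sample = c(dat1$sample_1, dat1$sample_2),
  value  = c(dat1$value_1, dat1$value_2)
)

# One box per sample label (only drawn when ggplot2 is available).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dat2, ggplot2::aes(x = sample, y = value)) +
    ggplot2::geom_boxplot()
  print(p)
}
```

With the data in this shape, ggplot2 needs no further hint about which value belongs to which sample.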
Re: [R] ptproc package
On 05.08.2012 03:11, amirzadeh wrote:
> Dear all,
>
> I came across the ptproc package on the following website: http://www.biostat.jhsph.edu/~rpeng/software/index.html
>
> I downloaded it from the contributor's website and tried to install it manually, but R won't unzip it. It is not available on CRAN. I use R 2.15.1 and Windows Vista on my computer. Any help would be appreciated.

You will have to install it from sources. See the manual "R Installation and Administration" on how to do that on a Windows machine and which tools may be required. In this case, even reading ?install.packages is sufficient, since

    install.packages("ptproc", repos="http://www.biostat.jhsph.edu/~rpeng/software", type="source")

should do the trick already.

Best,
Uwe Ligges

> Thanks.
> Amir Zadeh.
Re: [R] Package to remove collinear variables
On 05.08.2012 05:27, Roberto wrote:
> Hi, I need to remove collinear variables from my table of near-infrared spectra. What package can I use? Something simple, because I am a novice in statistics.

Remove those where

    isTRUE(all.equal(cor(x, y), 1))

is TRUE?

Uwe Ligges

> Thank you. Best regards, Roberto
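The pairwise check suggested above can be extended to a whole table, sketched here on made-up data: compute the correlation matrix once and flag every column that is (numerically) perfectly correlated with an earlier column that is being kept. Two choices below are assumptions that go beyond the reply: the absolute value is used so that perfectly negatively correlated columns are also flagged, and the tolerance `1e-8` is arbitrary. The names `spectra`, `redundant` and `reduced` are hypothetical.

```r
set.seed(7)
x1 <- rnorm(30)
x2 <- rnorm(30)

# Toy "spectra": w2 and w4 are exact linear functions of w1.
spectra <- data.frame(w1 = x1, w2 = 2 * x1 + 3, w3 = x2, w4 = -x1)

r <- cor(spectra)
redundant <- character(0)

# Walk through the columns in order; flag a column if it is perfectly
# correlated with any earlier column that has not itself been flagged.
for (j in seq_len(ncol(r))[-1]) {
  earlier <- setdiff(colnames(r)[seq_len(j - 1)], redundant)
  if (any(abs(r[earlier, j]) > 1 - 1e-8)) {
    redundant <- c(redundant, colnames(r)[j])
  }
}

reduced <- spectra[, setdiff(names(spectra), redundant), drop = FALSE]
names(reduced)
```

This only catches exact pairwise collinearity. Which of two redundant columns to keep (here simply the first) is a subject-matter decision, and near-collinear wavelengths need a different treatment.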
Re: [R] ggplot2 boxplot help
Please use dput() to supply sample data.

I think this does something like what you want.

===###
library(ggplot2)
library(reshape2)

dat1 <- read.table(text="
sample_1 sample_2 value_1 value_2
N C 1.9268400 36.77590
N C 0.1817890 5.58835
N C 0.2309000 7.54035
N C 0.0294559 1.50886
N C 0.4678610 14.75560
N C 10.7258000 92.13150", sep="", header=TRUE)

bb <- melt(dat1)

p <- ggplot(bb, aes(variable, value, fill = as.factor(value))) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_fill_discrete(name = "Fancy Title") +
  scale_x_discrete(breaks = c("value_1", "value_2"), labels = c("Sample 1", "Sample 2"))
p
####

John Kane
Kingston ON Canada

> -----Original Message-----
> From: alexpadron1...@gmail.com
> Sent: Sat, 4 Aug 2012 14:21:49 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] ggplot2 boxplot help
>
> Hello,
>
> I have a data set that looks like this:
>
>   name G-ID test_id g-id g
> 1 00077464 C_068131 C_068131 OC_068131-
> 2 00051728 C_044461 C_044461 OC_044461-
> 3 00058738 C_050343 C_050343 OC_050343-
> 4 00059239 C_050649 C_050649 OC_050649-
> 5 1761 C_000909 C_000909 OC_000909-
> 6 5119 C_002752 C_002752 OC_002752-
>   locs sample_1 sample_2 value_1 value_2
> 1 37316550-37317847 N C 1.9268400 36.77590
> 2 27058468-27060176 N C 0.1817890 5.58835
> 3 4761739-4763268 N C 0.2309000 7.54035
> 4 14565311-14567393 N C 0.0294559 1.50886
> 5 38670994-38675694 N C 0.4678610 14.75560
> 6 48362804-48380794 N C 10.7258000 92.13150
>
> In this dataset, sample_1 corresponds to value_1 and sample_2 corresponds to value_2. How can I graph this in ggplot2's boxplot function? I am not quite sure how to tell R that sample_1 and sample_2 columns correspond to value_1 and value_2 using ggplot2.
>
> Can anyone shed some light on this?
>
> Thanks.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/ggplot2-boxplot-help-tp4639187.html
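Since the original question asked for a boxplot, here is a minimal sketch of one way to pair sample_1 with value_1 and sample_2 with value_2, by stacking them into a long table before plotting. The data frame name long and its column names are assumptions, and the plot is only drawn if ggplot2 happens to be installed.

```r
# The six rows from the question, retyped by hand.
dat1 <- data.frame(
  sample_1 = "N", sample_2 = "C",
  value_1 = c(1.9268400, 0.1817890, 0.2309000, 0.0294559, 0.4678610, 10.7258000),
  value_2 = c(36.77590, 5.58835, 7.54035, 1.50886, 14.75560, 92.13150),
  stringsAsFactors = FALSE
)

# Stack the two (sample, value) pairs so that each row holds one observation.
long <- data.frame(
  sample = c(dat1$sample_1, dat1$sample_2),
  value  = c(dat1$value_1, dat1$value_2),
  stringsAsFactors = FALSE
)

# One box per sample label (N and C).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = sample, y = value)) +
    ggplot2::geom_boxplot()
  print(p)
}
```

The stacking step is done in base R on purpose, so that the correspondence between the sample and value columns is explicit rather than inferred from column names.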
Re: [R] how to put barchart and line chart in the same plot in ggplot2
As far as I understand ggplot2, you cannot do it. ggplot2 is pretty much designed NOT to allow two different sets of data with different y axes in the same plot. Doing this is generally considered very bad practice.

I'd suggest looking into a 2x1 or 1x2 grid and plotting the two sets beside or above/below each other. Have a look at http://stackoverflow.com/questions/9490482/combined-plot-of-ggplot2-not-in-a-single-plot-using-par-or-layout-functio
and see http://stackoverflow.com/questions/8615530/place-title-of-multiplot-panel-with-ggplot2 for an example.

Kingston ON Canada

> -----Original Message-----
> From: xin...@stat.psu.edu
> Sent: Sat, 4 Aug 2012 17:17:58 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] how to put barchart and line chart in the same plot in ggplot2
>
> dear userR:
> I am trying to plot two dependent variables in the same plot in ggplot2. Because these two variables have very different magnitudes, I have to use a second Y axis. I would like one variable to be a line and the other a barchart. The x axis is continuous. Yet since I have to make a barchart, I guess I have to treat it as discrete or categorical.
> I have been Google searching for the whole afternoon but do not have any clue.
> Can anyone give me a direction (it does not have to be a complete answer...)?
>
> many thanks
>
> --
> View this message in context: http://r.789695.n4.nabble.com/how-to-put-barchart-and-line-chart-in-the-same-plot-in-ggplot2-tp4639194.html
Re: [R] help to programm
Hello,

Sorry, but I don't understand your formula. Maybe it would be better if you:

1. use * for multiplication.
2. break the expression into smaller components. For instance (is it an add and a multiply, or two multiplies?), EXP <- exp{ (B0 B1 row matrix) (z[l] column matrix) }. Then use EXP.
3. instead of 'row matrix' write 'row_matrix', and the same for 'column matrix'.
4. say what your final sum is the sum of.

On 05-08-2012 13:31, hafida...@hotmail.fr wrote:
> Hi, can you please help me to program this formula:
> a[j]= E[j]-sum from l=i to i-1 (exp{(B0 B1row matrix) (z[l]column matrix) } x[l]) / sum from l=i to n
> i=1...n j=1...k l=1...m ; n,m,k are given.
> It is complicated for me; I hope you can help me. Thanks a lot.
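To illustrate the advice about splitting the expression into named pieces, here is a minimal sketch in R. Because the posted formula is ambiguous, everything concrete below is an assumption: the values of B0 and B1, the choice to store z as a 2 x m matrix with one column per l, and the values of x. Only the inner term exp{(B0 B1) z[l]} * x[l] is shown, summed over all l; the summation limits and the denominator of the original formula are left out.

```r
B <- matrix(c(0.5, -0.2), nrow = 1)   # row matrix (B0 B1); values are made up
z <- rbind(1, c(0.1, 0.4, 0.9))       # 2 x m matrix; column l plays the role of z[l]
x <- c(2, 3, 5)                       # x[l]; values are made up

EXP   <- exp(as.vector(B %*% z))      # exp{(B0 B1) z[l]} for every l at once
terms <- EXP * x                      # exp{...} * x[l], element by element
total <- sum(terms)                   # sum over l = 1..m
total
```

Once each piece has a name, the remaining questions in the reply (which indices the sums run over, and what is divided by what) can be answered one piece at a time.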
Re: [R] Find out what "native.enc" corresponds to
On 05/08/2012 09:54, Milan Bouchet-Valat wrote:
> Hi!
> I'm using R2HTML in my RcmdrPlugin.temis package to output localized strings to a HTML file. Thus, I insert a simple header at the top of the file to specify what encoding is used; if I don't do that, Web browsers assume it is latin1, which is not always true.
> My problem is, I could not find a way to detect what encoding is used by R2HTML in the most general case. R2HTML simply calls cat() with the file name, which means the text connection is opened using file(encoding = getOption("encoding")). This is fine, except that when getOption("encoding") is set to "native.enc", I'm not able to find out the real encoding that was used for output.
> Of course, ideally I would tell R2HTML to output everything as UTF-8, and I would add this information to the header. But AFAICT this is not possible in the current state of this package.
> So I would be very grateful if somebody could provide me with a solution to resolve "native.enc" to the encoding name.

?options points you to ?connections, which does explain this.

See Sys.getlocale("LC_CTYPE") to see 'the internal encoding of the current locale' (or at least, what the OS claims it to be: e.g. some lie about 'C' locales). As for a name, iconv() knows this as "" (and some OSes do make it rather hard to find a name if it is not part of the locale name).

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
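A minimal sketch of how the pointers above might be combined in practice. The helper name native_encoding is hypothetical; l10n_info(), utils::localeToCharset() and iconv() are standard R functions, but the charset name returned is only a guess derived from the locale name and can be NA on some systems.

```r
# Hypothetical helper: report the charset that "native.enc" most likely stands for.
native_encoding <- function() {
  # l10n_info() says directly whether the session runs in a UTF-8 locale.
  if (isTRUE(l10n_info()[["UTF-8"]])) return("UTF-8")
  # Otherwise guess from the locale name; this can be NA.
  utils::localeToCharset()[1]
}

Sys.getlocale("LC_CTYPE")   # what the OS reports as the current locale
enc <- native_encoding()
enc

# iconv() treats from = "" as "the native encoding", so text can be
# converted to UTF-8 without knowing the name at all.
converted <- iconv("abc", from = "", to = "UTF-8")
```

If the goal is only to write a correct HTML header, converting the strings to UTF-8 with iconv(..., from = "") and declaring UTF-8 may sidestep the need to name the native encoding, provided the output path does not re-encode them.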
[R] Find out what "native.enc" corresponds to
Hi!

I'm using R2HTML in my RcmdrPlugin.temis package to output localized strings to a HTML file. Thus, I insert a simple header at the top of the file to specify what encoding is used; if I don't do that, Web browsers assume it is latin1, which is not always true.

My problem is, I could not find a way to detect what encoding is used by R2HTML in the most general case. R2HTML simply calls cat() with the file name, which means the text connection is opened using file(encoding = getOption("encoding")). This is fine, except that when getOption("encoding") is set to "native.enc", I'm not able to find out the real encoding that was used for output.

Of course, ideally I would tell R2HTML to output everything as UTF-8, and I would add this information to the header. But AFAICT this is not possible in the current state of this package.

So I would be very grateful if somebody could provide me with a solution to resolve "native.enc" to the encoding name.

Thanks for your help
Re: [R] Getting unknown error trying to plot spatial data
On 08/05/2012 05:09 AM, mjkatsaros wrote:
> Hi there!
> I'm following an awesome guide to working with spatial data (http://www.frankdavenport.com/blog/2012/6/19/notes-from-a-recent-spatial-r-class-i-gave.html) and am running into an error that I can't figure out how to fix.
> Disclaimer: I am very much an R n00b
> Here is the r script I am running: https://dl.dropbox.com/u/28231177/This%20Should%20Work.R
> data: https://dl.dropbox.com/u/28231177/my_data.csv
> shapefile: https://dl.dropbox.com/u/28231177/sfzipcodes.zip
>
> I am getting two errors:
>
> pds <- fortify(sf_map)
> Using OBJECTID to define regions.
> pds$OBJECTID <- as.integer(pds$OBJECTID)
> Error in `$<-.data.frame`(`*tmp*`, "OBJECTID", value = integer(0)) : replacement has 0 rows, data has 16249
>
> ## Make the map
> p1 <- ggplot(my_data, aes(map_id = zip))
> p1 <- p1 + geom_map(aes(fill=vol, map_id = zip), map = pds)
> p1 <- p1 + expand_limits(x = pds$lon, y = pds$lat) + coord_equal()
> p1 + xlab("Basic Map with Default Elements")
> Error in unit(x, default.units) : 'x' and 'units' must have length > 0
>
> Anybody have any idea what is happening here or how to resolve this?

Hi mjkatsaros,
The data file doesn't have a column labelled "OBJECTID". I would try renaming the "zip" column to "OBJECTID".

Jim
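The first error can be reproduced on a toy data frame, which shows why a missing column produces the "replacement has 0 rows" message. The data frame below is a made-up stand-in, not the real fortify() output, and its column names are assumptions.

```r
# Made-up stand-in for the fortified map data; it has no OBJECTID column.
pds <- data.frame(long = c(0, 1, 1), lat = c(0, 0, 1), id = c("0", "0", "0"),
                  stringsAsFactors = FALSE)

has_col <- "OBJECTID" %in% names(pds)      # FALSE: there is no such column
n_vals  <- length(as.integer(pds$OBJECTID)) # 0, because pds$OBJECTID is NULL

# Assigning a zero-length vector to a column of a 3-row data frame fails.
res <- tryCatch({
  pds$OBJECTID <- as.integer(pds$OBJECTID)
  "ok"
}, error = function(e) conditionMessage(e))
res
```

Running names(pds) on the real object before the assignment would show which column actually holds the region identifier.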
Re: [R] how to put barchart and line chart in the same plot in ggplot2
On 08/05/2012 10:17 AM, xin wei wrote:
> dear userR:
> I am trying to plot two dependent variables in the same plot in ggplot2. Because these two variables have very different magnitudes, I have to use a second Y axis. I would like one variable to be a line and the other a barchart. The x axis is continuous. Yet since I have to make a barchart, I guess I have to treat it as discrete or categorical.
> I have been Google searching for the whole afternoon but do not have any clue.
> Can anyone give me a direction (it does not have to be a complete answer...)?

Hi xin wei,
If you're desperate, have a look at twoord.plot in the plotrix package.

Jim
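A minimal sketch of what the twoord.plot suggestion might look like, with bars on the left axis and a line on the right axis. The data are made up, the argument choices are assumptions based on the plotrix documentation, and the plot is only drawn if plotrix is installed.

```r
x    <- 1:10
bars <- c(3, 5, 2, 8, 6, 7, 4, 9, 5, 6)         # small-magnitude series, drawn as bars
line <- bars * 1000 + seq(0, 9000, by = 1000)   # large-magnitude series, drawn as a line

if (requireNamespace("plotrix", quietly = TRUE)) {
  # Left ordinate: bars; right ordinate: line. Each series gets its own y axis.
  plotrix::twoord.plot(lx = x, ly = bars, rx = x, ry = line,
                       type = c("bar", "l"),
                       xlab = "x", ylab = "bars", rylab = "line")
}
```

The x values stay numeric here, so the continuous x axis mentioned in the question does not have to be turned into a categorical one.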