[R] which() with multiple conditions
I hope someone can point me in the right direction, please. I have a data frame with a column containing names. I want to identify the rows that contain names in a list.

    namestofind <- c('fred', 'bill', ...)   # a long list

If I only wanted to identify a single name I would use

    which(z$name == 'bill')

What syntax would I use to identify all the rows that contain any of the names in namestofind? Thanks in advance for the pointer.
-- View this message in context: http://r.789695.n4.nabble.com/which-with-multiple-conditions-tp4644677.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
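One common way to do this is to test membership with `%in%` and pass the result to `which()`. The sketch below uses made-up data: the contents of `z` and `namestofind` are assumptions for illustration only.

```r
# Hypothetical data frame and lookup list (not from the original post)
z <- data.frame(name = c("fred", "anna", "bill", "carl", "bill"),
                stringsAsFactors = FALSE)
namestofind <- c("fred", "bill")

# %in% returns TRUE for every element of z$name found in namestofind
hits <- which(z$name %in% namestofind)

# Row indices can then be used to pull out the matching rows
matched <- z[hits, , drop = FALSE]
```

If only the rows are needed, `z[z$name %in% namestofind, ]` does the same without the intermediate `which()`.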
Re: [R] POSIXlt and daylight savings time
I'll rephrase the question... If you try

    as.POSIXlt('2004-10-31 02:00:00')

you get

    [1] 2004-10-31

What do I need to do to make it return

    [1] 2004-10-31 02:00:00

-- View this message in context: http://r.789695.n4.nabble.com/POSIXlt-and-daylight-savings-time-tp4642253p4642272.html
[R] POSIXlt and daylight savings time
I have a data frame that contains dates, but when I use as.POSIXlt() I lose the hours on all records. I traced this down to a particular hour which causes the issue...

    as.POSIXlt('2004-10-31 02:00:00')
    [1] 2004-10-31
    as.POSIXlt('2004-10-31 03:00:00')
    [1] 2004-10-31 03:00:00

How do I tell as.POSIXlt() to ignore daylight saving and just convert to a time as is? I've read about 'isdst' but it is still unclear what to do. This is a cleaned-up date field that I received, so adjusting the date itself is not possible. Thanks in advance.
-- View this message in context: http://r.789695.n4.nabble.com/POSIXlt-and-daylight-savings-time-tp4642253.html
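A plausible explanation (an assumption, since the poster's time zone is not stated) is that 02:00 on that date does not exist, or is ambiguous, in the session's local time zone because of a daylight-saving transition. A minimal sketch that sidesteps the local zone is to parse the strings in UTC, which has no daylight saving:

```r
# Parse the timestamp in UTC so that no daylight-saving rule applies
x <- as.POSIXlt("2004-10-31 02:00:00", tz = "UTC")

# The hour is kept
format(x, "%Y-%m-%d %H:%M:%S")
x$hour
```

This treats the values as plain clock times; it is only appropriate if the data really should be read "as is" rather than as local times.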
[R] revolution foreach oddity
I know this is not a Revolution support forum, but has anyone noticed the following? I have a foreach loop to generate random samples. If I run the exact code below in normal R (2.14.1) it works as expected, but if I run it from Revolution 4.2.0 each loop returns the same numbers. The only way I can get Revolution to give different numbers is using 1 instead of 8 in registerDoSNOW(makeCluster(8, type = "SOCK")), but that seems to defeat the point.

    library(foreach)
    library(doSNOW)
    registerDoSNOW(makeCluster(8, type = "SOCK"))
    getDoParWorkers()
    getDoParName()
    getDoParVersion()
    mySamples <- foreach (jj = 1:4, .combine=cbind) %dopar% {
      return(sample(1:10,10,replace=TRUE))
    }
    mySamples

## r 2.14.1 ##

    > library(foreach)
    > library(doSNOW)
    > registerDoSNOW(makeCluster(8, type = "SOCK"))
    > getDoParWorkers()
    [1] 8
    > getDoParName()
    [1] "doSNOW"
    > getDoParVersion()
    [1] "1.0.6"
    > mySamples <- foreach (jj = 1:4, .combine=cbind) %dopar% {
    + return(sample(1:10,10,replace=TRUE))
    + }
    > mySamples
          result.1 result.2 result.3 result.4
     [1,]        5        3        1        4
     [2,]        1       10       10        3
     [3,]        7        9        4        9
     [4,]        2        5        9        3
     [5,]        2        7       10        1
     [6,]        7        8       10       10
     [7,]        6        9       10        4
     [8,]        8        6        6        2
     [9,]       10        7        9        4
    [10,]        2        4        1        9

# revolution r

    > library(foreach)
    Loading required package: iterators
    Loading required package: codetools
    foreach: simple, scalable parallel programming from REvolution Computing
    Use REvolution R for scalability, fault tolerance and more.
    http://www.revolution-computing.com
    > library(doSNOW)
    Loading required package: snow
    > registerDoSNOW(makeCluster(8, type = "SOCK"))
    > getDoParWorkers()
    [1] 8
    > getDoParName()
    [1] "doSNOW"
    > getDoParVersion()
    [1] "1.0.3"
    > mySamples <- foreach (jj = 1:4, .combine=cbind) %dopar% {
    + return(sample(1:10,10,replace=TRUE))
    + }
    > mySamples
          result.1 result.2 result.3 result.4
     [1,]        4        4        4        4
     [2,]       10       10       10       10
     [3,]        4        4        4        4
     [4,]       10       10       10       10
     [5,]        5        5        5        5
     [6,]        5        5        5        5
     [7,]        9        9        9        9
     [8,]        2        2        2        2
     [9,]        6        6        6        6
    [10,]        9        9        9        9

-- View this message in context: http://r.789695.n4.nabble.com/revolution-foreach-oddity-tp4616237.html
Re: [R] directory of current script
I found this... https://stat.ethz.ch/pipermail/r-help/2009-January/184745.html
-- View this message in context: http://r.789695.n4.nabble.com/directory-of-current-script-tp4553386p4553409.html
[R] directory of current script
I am running a series of scripts sequentially and they all need some global parameters. These will be kept in a file in a known subdirectory alongside the scripts themselves. The scripts need to be runnable by anyone without ANY editing. The question is: is there a command to return the directory of the current script, so it then knows where to find the global parameter file? Or is there a simpler way? Cheers.
-- View this message in context: http://r.789695.n4.nabble.com/directory-of-current-script-tp4553386p4553386.html
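One approach that may fit, assuming the scripts are launched with `Rscript` (or `R --file=...`), is to read the `--file=` entry from `commandArgs()`. This is a sketch only: the helper name `script_dir`, the subdirectory `config`, and the file name `global_params.R` are all hypothetical.

```r
# Return the directory of the running script when started via Rscript;
# fall back to the working directory in an interactive session.
script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(getwd())
  }
  dirname(sub("^--file=", "", file_arg[1]))
}

# Hypothetical location of the shared parameter file
params_file <- file.path(script_dir(), "config", "global_params.R")
# source(params_file)   # uncomment once the file exists
```

Scripts that are `source()`d from another script do not get their own `--file=` argument, so in that case the calling script would need to pass the directory along.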
[R] can this sequence be generated easier?
I have 'x' variables that I need to find the optimum combination of, with the constraint that the sum of all x variables needs to be exactly 100. I need to test all combinations to get the optimal mix. This is easy if I know how many variables I have, as I can hard code it as below. But what if I don't know the number of variables and want this to be a flexible parameter? Is there a neat recursive way that this can be done in R?

    # for combinations of 2 variables
    vars = 2
    for (i in 0:100) {
      for (j in 0:(100-i)) {
        # ...do some test on the i,j combination
      }
    }

    # for combinations of 3 variables
    vars = 3
    for (i in 0:100) {
      for (j in 0:(100-i)) {
        for (k in 0:(100-(i+j))) {
          # ...do some test on the i,j,k combination
        }
      }
    }

-- View this message in context: http://r.789695.n4.nabble.com/can-this-sequence-be-generated-easier-tp3607240p3607240.html
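A recursive sketch of the idea follows. It assumes the last variable simply takes whatever remains, so that every row sums to exactly the total; the function name `compositions` is made up for this example.

```r
# All ways to split `total` into `n` non-negative integers (order matters).
# Returns a matrix with one combination per row.
compositions <- function(total, n) {
  if (n == 1) {
    return(matrix(total, nrow = 1))   # last variable takes the remainder
  }
  rows <- lapply(0:total, function(first) {
    # fix the first variable, recurse on the rest with the remaining budget
    cbind(first, compositions(total - first, n - 1))
  })
  unname(do.call(rbind, rows))
}

combos <- compositions(100, 3)
# for (r in seq_len(nrow(combos))) { ...test combos[r, ]... }
```

The number of rows is `choose(total + n - 1, n - 1)`, so it grows quickly with the number of variables; for many variables an exhaustive search may not be practical.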
[R] computer name
Is there an R function that will be able to identify the computer the code is running on? I have some common code that I run on several computers and each has a database with a different server name, although the content is identical. I need to set thisServer depending on which machine the code is running on... something like...

    if (pcname == 'pc1') thisServer = 'SERVER1'
    if (pcname == 'pc2') thisServer = 'SERVER2'
    conn <- odbcDriverConnect(paste('driver=SQL Server;database=x;server=', thisServer, ';', sep = ''))
    # ...rest of code will now run OK

I know I could set the DSN names the same and use...

    conn <- odbcConnect('commonDSNname')

but I was wondering if there was another way.
-- View this message in context: http://r.789695.n4.nabble.com/computer-name-tp3593120p3593120.html
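Base R reports the machine name through `Sys.info()`. A small sketch of the lookup follows; the host names `pc1`/`pc2` come from the post, while the helper `pick_server` and the `NA` default are assumptions.

```r
# Name of the machine the code is running on
host <- Sys.info()[["nodename"]]

# Map machine names to database servers (names taken from the question)
servers <- c(pc1 = "SERVER1", pc2 = "SERVER2")

# Hypothetical helper: look up the server, or return a default if unknown
pick_server <- function(host, servers, default = NA_character_) {
  if (host %in% names(servers)) servers[[host]] else default
}

thisServer <- pick_server(host, servers)
```

Host names can differ in case between machines, so comparing `tolower(host)` against lower-case names may be more robust.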
[R] caret - prevent resampling when no parameters to find
I want to use caret to build a model with an algorithm that actually has no parameters to find. How do I stop it from repeatedly building the same model 25 times?

    library(caret)
    data(mdrr)
    LOGISTIC_model <- train(mdrrDescr, mdrrClass
                            ,method='glm'
                            ,family=binomial(link=logit)
                            )
    LOGISTIC_model

    528 samples
    342 predictors
    2 classes: 'Active', 'Inactive'

    Pre-processing: None
    Resampling: Bootstrap (25 reps)

    Summary of sample sizes: 528, 528, 528, 528, 528, 528, ...

    Resampling results

      Accuracy  Kappa   Accuracy SD  Kappa SD
      0.552     0.0999  0.0388       0.0776

-- View this message in context: http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3488761.html
Re: [R] caret - prevent resampling when no parameters to find
Hi Max, But in this example, it says the sample size is the same as the total number of samples, so unless the sampling is done by columns, wouldn't you get exactly the same model each time for logistic regression? PS: great package btw. I'm just beginning to explore its potential now.
-- View this message in context: http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p341.html
Re: [R] caret - prevent resampling when no parameters to find
Thanks for the clarification Max - I should have realised that. One final question: I like caret because it lets me pass in data to all functions in the same way. For glm I have only ever used the formula notation and did not see a way to pass in predictors and a target individually. How do I do this? How do I get the 2nd example below to work? Many thanks.

    LOGISTIC_model <- train(mdrrDescr, mdrrClass
                            ,method='glm'
                            ,family=binomial(link=logit)
                            )
    LOGISTIC_model1 <- glm(mdrrDescr, mdrrClass, family=binomial(link=logit))

-- View this message in context: http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3488911.html
Re: [R] caret - prevent resampling when no parameters to find
glm.fit - answered my own question by reading the manual!
-- View this message in context: http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3488923.html
Re: [R] caret - prevent resampling when no parameters to find
Thanks again Max - a great time saver this is. Now just for my sanity, if I use glm.fit to build a model where I have the matrices, how do I then use the predict function without getting an error message?

    LOGISTIC_model1 <- glm.fit(mdrrDescr, mdrrClass, family=binomial(link=logit))
    Warning messages:
    1: glm.fit: algorithm did not converge
    2: glm.fit: fitted probabilities numerically 0 or 1 occurred
    predict(LOGISTIC_model1)
    Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class c('double', 'numeric')

Secondly, caret acts as a nice wrapper to protect me from all this, and it does the resampling to give me an idea of the expected model fit. If I was doing a parameter search, would it do all this resampling for each combination of parameters? Now if I just want to build a model and not worry about all the resampling (in my case I just want a set of baseline predictions to compare various variable selection methods against), it would be nice if there was a simple option to turn off the resampling.
-- View this message in context: http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3489020.html
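One way to keep separate predictor and outcome objects and still get a model that `predict()` understands is to bind them into a data frame and fit with `glm()` and a `y ~ .` formula. The sketch below uses small simulated data (an assumption; it is not the mdrr data from the thread) so that it runs without caret.

```r
set.seed(1)

# Simulated predictors (matrix) and two-class outcome, standing in for
# mdrrDescr and mdrrClass
x <- matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("x1", "x2")))
y <- factor(ifelse(x[, 1] + rnorm(100) > 0, "Active", "Inactive"))

# Combine into one data frame and let the formula pick up every predictor
dat <- data.frame(x, y = y)
fit <- glm(y ~ ., data = dat, family = binomial(link = "logit"))

# predict() works because `fit` has class "glm"
p <- predict(fit, newdata = dat, type = "response")
```

With a factor outcome, `glm()` models the probability of the second level, so `p` here is the predicted probability of "Inactive".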
Re: [R] caret - prevent resampling when no parameters to find
Hi Max, I tried your suggestion but came up with errors:

    fitControl <- trainControl(number=1)
    LOGISTIC_model <- train(mdrrDescr, mdrrClass
                            ,method='glm'
                            ,trControl = fitControl
                            )
    Fitting: parameter=none
    Error in if (all.equal(sort(x$index[[1]]), seq(along = x$data$.outcome))) x$data else x$data[-x$index[[i]], : argument is not interpretable as logical

    fitControl <- trainControl(seq(along = mdrrClass))
    LOGISTIC_model <- train(mdrrDescr, mdrrClass
                            ,method='glm'
                            ,trControl = fitControl
                            )
    Error in switch(tolower(trControl$method), oob = NULL, cv = createFolds(y, : EXPR must be a length 1 vector
    In addition: Warning message: In if (trControl$method == oob !(method %in% c(rf, treebag, : the condition has length 1 and only the first element will be used

-- View this message in context: http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3489091.html
[R] changing a specific column name
Hi, Can someone please tell me how to change the column name of a specific column? How do I change the name of the column 'Species'? Thanks in advance.

    d <- iris
    colnames(d)
    [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
    ind <- which(names(d)=='Species')
    ind
    [1] 5
    colnames(d[ind])
    [1] "Species"
    colnames(d[ind]) <- 'new name'
    colnames(d[ind])
    [1] "Species"

-- View this message in context: http://r.789695.n4.nabble.com/changing-a-specific-column-name-tp3480739p3480739.html
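The usual idiom is to index the vector of names rather than the data frame, i.e. `colnames(d)[ind]` instead of `colnames(d[ind])`. A short sketch using the same `iris` example:

```r
d <- iris
ind <- which(names(d) == "Species")

# Index the names vector, then assign into that position
colnames(d)[ind] <- "new name"

colnames(d)
```

Equivalently, `names(d)[names(d) == "Species"] <- "new name"` avoids the intermediate index.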
[R] boxplot - how to suppress groups with low counts
In a boxplot, how can I prevent groups where the number of cases is less than a set threshold from being plotted?

    set.seed(42)
    DF <- data.frame(type=sample(LETTERS[1:5], 100, replace=TRUE), cost=rnorm(100))
    count <- boxplot(cost ~ type, data=DF, plot = 0)
    count$n
    ## how to only include plots where count$n > 18
    boxplot(cost ~ type, data=DF)

Thanks in advance for any solutions.
-- View this message in context: http://r.789695.n4.nabble.com/boxplot-how-to-supress-groups-with-low-counts-tp3244424p3244424.html
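One possible approach is to count the cases per group with `table()`, keep only the groups above the threshold, and drop the unused factor levels before plotting. The sketch reuses the data from the question; the threshold of 18 is taken from the post and `DF_big` is just an illustrative name.

```r
set.seed(42)
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

# Number of cases in each group
counts <- table(DF$type)

# Groups that meet the threshold
keep <- names(counts)[counts > 18]

# Subset and drop the now-empty levels so they are not drawn
DF_big <- droplevels(DF[DF$type %in% keep, ])

boxplot(cost ~ type, data = DF_big)
```

Without `droplevels()` the removed groups would still appear on the axis as empty positions.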
[R] 2 functions with same name - what to do to get the one I want
There seem to be 2 functions called ecdf...

    http://lib.stat.cmu.edu/S/Harrell/help/Hmisc/html/ecdf.html
    http://127.0.0.1:11885/library/stats/html/ecdf.html

How do I get the ecdf {Hmisc} to run instead of the ecdf {stats}? A pointer in the right direction would be greatly appreciated. I tried to install Hmisc but got this message, so I assume I have it:

    utils:::menuInstallPkgs()
    Warning: package 'Hmisc' is in use and will not be installed

I ran the demo from Hmisc with no luck...

    set.seed(1)
    ch <- rnorm(1000, 200, 40)
    ecdf(ch, xlab="Serum Cholesterol")
    Error in ecdf(ch, xlab = "Serum Cholesterol") : unused argument(s) (xlab = "Serum Cholesterol")

I ran the sample code from stats and it worked...

    x <- rnorm(12)
    Fn <- ecdf(x)
    Fn # a *function*
    Empirical CDF
    Call: ecdf(x)
     x[1:12] = -1.9123, -1.6626, -1.2468, ..., 1.1119, 1.135
    Fn(x) # returns the percentiles for x
     [1] 1.0000 0.9167 0.3333 0.6667 0.5833 0.1667 0.7500 0.0833 0.2500 0.8333 0.4167 0.5000
    tt <- seq(-2,2, by = 0.1)
    12 * Fn(tt) # Fn is a 'simple' function {with values k/12}
     [1] 0 1 1 1 2 2 2 2 3 3 3 3 4 4 4 5 5 5 6 6 6 7 7 8 8 8 8 8 8 9 10 10 12 12 12 12 12 12 12 12 12

-- View this message in context: http://r.789695.n4.nabble.com/2-functions-with-same-name-what-to-do-to-get-the-one-I-want-tp3237788p3237788.html
Re: [R] 2 functions with same name - what to do to get the one I want
Thanks for the quick response, but that doesn't seem to help. What do I need to do to get it to work?

    Hmisc:::ecdf(...)
    Error in get(name, envir = asNamespace(pkg), inherits = FALSE) : object 'ecdf' not found

-- View this message in context: http://r.789695.n4.nabble.com/2-functions-with-same-name-what-to-do-to-get-the-one-I-want-tp3237788p3237820.html
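The `pkg::name` form is the general way to pick one of two same-named functions. The "object 'ecdf' not found" error suggests the installed Hmisc no longer has a function with that exact name; as far as I know the Hmisc plotting function is spelled `Ecdf` (capital E), but treat that as an assumption to check against your installed version. A sketch, with the Hmisc call guarded so the code still runs when the package is absent:

```r
set.seed(1)
ch <- rnorm(1000, 200, 40)

# Explicitly request the stats version: returns a step function
Fn <- stats::ecdf(ch)
Fn(200)   # proportion of values <= 200

# Assumed name of the Hmisc plotting function; only run if Hmisc is installed
if (requireNamespace("Hmisc", quietly = TRUE)) {
  Hmisc::Ecdf(ch, xlab = "Serum Cholesterol")
}
```

`ls("package:Hmisc")` after `library(Hmisc)` lists what the installed version actually exports.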
[R] removed data is still there!
I'm confused, and hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing data frame - I've tried 2 methods. But this new data frame seems to remember values from its parent, even though there are no occurrences. Where does it get the values versicolor and virginica from, and why does it give them a count of 0? What am I missing? Thanks in advance.

    summary(iris$Species)
        setosa versicolor  virginica
            50         50         50
    nrow(iris)
    [1] 150

    iris1 <- iris[iris$Species == 'setosa',]
    nrow(iris1)
    [1] 50
    summary(iris1$Species)
        setosa versicolor  virginica
            50          0          0
    boxplot(Petal.Width ~ Species, data = iris1, plot=1)

    iris2 <- subset(iris, Species == 'setosa')
    nrow(iris2)
    [1] 50
    summary(iris2$Species)
        setosa versicolor  virginica
            50          0          0
    boxplot(Petal.Width ~ Species, data = iris2, plot=1)

-- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548440.html
Re: [R] removed data is still there!
Thanks, but that was what I just discovered myself the hard way. What I really wanted to know was how to solve this issue.
-- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548527.html
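The zero counts come from the factor keeping its original set of levels after subsetting. Two ways to discard the unused levels are sketched below: `droplevels()` (available in more recent versions of R) and re-creating the factor with `factor()`.

```r
# Option 1: drop unused levels from every factor in the subset
iris1 <- droplevels(iris[iris$Species == "setosa", ])
summary(iris1$Species)

# Option 2: rebuild just the one factor, which works in older R as well
iris2 <- subset(iris, Species == "setosa")
iris2$Species <- factor(iris2$Species)
summary(iris2$Species)
```

After either step, `boxplot(Petal.Width ~ Species, data = iris1)` draws a single box.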
[R] getting a function to do something
Hi, I want to repeatedly do a task, so I thought I could put it in a function and then just call the function. The task is just clearing all the graphics devices and then opening a new one of a specified size. Now, when I call the function below, nothing appears to happen. But when I run the 2 lines in the function on their own, I get what I want. Please can someone explain to me what is the obvious thing I am missing?

    clearG <- function() {
      graphics.off()
      windows(13,8)
    }

    # nothing happens (as far as I can tell)
    clearG

    # but this works, but I want to just type 1 line rather than several
    graphics.off()
    windows(13,8)

-- View this message in context: http://r.789695.n4.nabble.com/getting-a-function-to-do-something-tp2545594p2545594.html
Re: [R] getting a function to do something
Ah, silly me.

    clearG()

This now works!
-- View this message in context: http://r.789695.n4.nabble.com/getting-a-function-to-do-something-tp2545594p2545596.html
[R] transaction object - how to coerce this data
Hi, I want to look at frequent item sets using the arules package. I need to transform my data into a transactions object. The data I read in from a file has 2 columns, an ID and an item. How do I convert data like this into a transactions object? I've tried class?transactions but it only confuses me. My data is like this:

    basketID  item
    1         bread
    1         cheese
    1         milk
    2         bread
    2         cheese
    2         eggs
    3         bread
    3         cheese
    3         beer

and from what I gather it should be like this?

    data <- list(
      c("bread","cheese","milk"),
      c("bread","cheese","eggs"),
      c("bread","cheese","beer")
    )

so I can use:

    t <- as(data, "transactions")

Thanks in advance. Phil
-- View this message in context: http://r.789695.n4.nabble.com/transaction-object-how-to-coerce-this-data-tp2402613p2402613.html
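Getting from the two-column layout to the list layout can be done in base R with `split()`. The sketch below rebuilds the example data by hand (the data frame name `baskets_df` is made up) and stops at the list; the final coercion is the poster's own `as(..., "transactions")` call and needs arules attached, so it is left as a comment.

```r
# Two-column data as described in the post
baskets_df <- data.frame(
  basketID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  item = c("bread", "cheese", "milk",
           "bread", "cheese", "eggs",
           "bread", "cheese", "beer"),
  stringsAsFactors = FALSE
)

# One character vector of items per basket, named by basket ID
baskets <- split(baskets_df$item, baskets_df$basketID)

# With arules attached, the list can then be coerced as in the post:
# library(arules)
# trans <- as(baskets, "transactions")
```

Naming the result something other than `t` avoids masking base R's transpose function.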
[R] checking if a package is installed
Hi, I am writing a function that requires a specific package to be installed. Is there a way of checking if the package is installed and returning a TRUE / FALSE result so my function can return an appropriate error message and exit gracefully rather than just bombing out? I'm thinking along the following lines (but want code that works):

    f_checkpackage <- function() {
      if (library(madeupname) == TRUE) {
        cat("package loaded OK\n")
      } else {
        cat("ERROR: package not loaded")
      }
    }
    f_checkpackage()

-- View this message in context: http://r.789695.n4.nabble.com/checking-if-a-package-is-installed-tp2340534p2340534.html
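`requireNamespace()` returns TRUE or FALSE instead of stopping with an error, which fits this use. A minimal sketch along the lines of the function in the question follows; taking the package name as an argument is my addition.

```r
# TRUE if the package can be loaded, FALSE otherwise (no error is raised)
f_checkpackage <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("ERROR: package '", pkg, "' is not installed")
    return(invisible(FALSE))
  }
  message("package '", pkg, "' is available")
  invisible(TRUE)
}

f_checkpackage("stats")        # ships with R
f_checkpackage("madeupname")   # reports the problem and returns FALSE
```

`require(pkg, character.only = TRUE)` behaves similarly but also attaches the package to the search path.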
Re: [R] Tinn R - the preferred R term was not defined
Ok - I found the correct forum and that this seems to be a common problem. http://sourceforge.net/projects/tinn-r/forums/forum/481900/topic/3741784
-- View this message in context: http://r.789695.n4.nabble.com/Tinn-R-the-preferred-R-term-was-not-defined-tp2334642p2334649.html
[R] Tinn R - the preferred R term was not defined
I have Windows 7 64 bit and the 64 bit version of R. I have installed Tinn R. Every time I start R from within Tinn R it gives me the message "The preferred R term was not defined. Do you desire to do this now". I then tell Tinn R where Rterm.exe and Rgui.exe are. Rterm works OK - I can open R code files and submit them. Rgui does not work: R opens, but the Tinn R toolbar for submitting code is disabled. I then go to R > configure > permanent and Tinn R writes to my R etc/Rprofile.site file. When I restart Tinn R and try to start an Rterm or Rgui, I again get prompted... "The preferred R term was not defined. Do you desire to do this now". This seems to be a repetitive loop. Can anybody please point me in the right direction? Cheers.
-- View this message in context: http://r.789695.n4.nabble.com/Tinn-R-the-preferred-R-term-was-not-defined-tp2334642p2334642.html
[R] finding max value in a row and reporting column name
Hi, Hopefully someone can point me in the right direction on how I would go about solving the following. I have some data and need to find the column name of the maximum value in each row. This could be the data...

    a <- data.frame(x = rnorm(4), y = rnorm(4), z = rnorm(4))
    a
               x           y          z
    1  1.6534561  0.11523404  0.2261730
    2 -1.2274320 -0.24096054  1.5096028
    3 -1.4503096  0.07227427  1.6740867
    4  0.1867416  1.25318913 -0.7350560

Here is what I need to generate...

    1 x
    2 z
    3 z
    4 y

Any pointers would be appreciated. Regards,
-- View this message in context: http://r.789695.n4.nabble.com/finding-max-value-in-a-row-and-reporting-colum-name-tp2309358p2309358.html
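One way to get this is to find the position of the maximum in each row with `which.max()` and use it to index the column names. The sketch below types in the values printed in the post so the result can be checked against the wanted output.

```r
# Values copied from the printed data frame in the question
a <- data.frame(
  x = c(1.6534561, -1.2274320, -1.4503096, 0.1867416),
  y = c(0.11523404, -0.24096054, 0.07227427, 1.25318913),
  z = c(0.2261730, 1.5096028, 1.6740867, -0.7350560)
)

# Column position of the largest value in each row
pos <- apply(a, 1, which.max)

# Translate positions into column names
best <- names(a)[pos]
data.frame(row = seq_len(nrow(a)), best = best)
```

`which.max()` returns the first maximum when a row has ties; `max.col()` is a vectorised alternative with a `ties.method` argument.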
[R] Lags and Differences of zoo Objects
Hi, I'm struggling to understand the documentation.

    ?lag.zoo
    x - a zoo object.
    k, lag - the number of lags (in units of observations). Note the sign of k behaves as in lag.
    differences - an integer indicating the order of the difference.

What does the above line actually mean? I've tried a few settings on sample data but can't figure out what it is doing.

    x <- iris
    x$Species = NULL
    x$Petal.Width = NULL
    x$Sepal.Width = NULL
    x$Sepal.Length = NULL
    x <- zoo(x)
    x <- merge(orig = x
      ,lag1diff2 = diff(x, lag = 1, differences = 2, arithmetic = TRUE, na.pad = TRUE)
      ,lag2diff1 = diff(x, lag = 2, differences = 1, arithmetic = TRUE, na.pad = TRUE)
      ,lag2diff2 = diff(x, lag = 2, differences = 2, arithmetic = TRUE, na.pad = TRUE)
      )
    head(x)

-- View this message in context: http://r.789695.n4.nabble.com/Lags-and-Differences-of-zoo-Objects-tp2308666p2308666.html
Re: [R] Lags and Differences of zoo Objects
Thanks for the response. I can figure out the 'lag' parameter to the function, but I don't understand the 'differences' parameter.

    differences - an integer indicating the order of the difference

What does the 'order of the difference' mean in English? How are these numbers calculated?

    > x <- iris
    > x$Species = NULL
    > x$Petal.Width = NULL
    > x$Sepal.Width = NULL
    > x$Sepal.Length = NULL
    > x <- zoo(x)
    > x <-
    + merge(orig = x
    + ,l1d1 = diff(x, lag = 1, differences = 1, arithmetic = TRUE, na.pad = TRUE)
    + ,l1d2 = diff(x, lag = 1, differences = 2, arithmetic = TRUE, na.pad = TRUE)
    + ,l2d1 = diff(x, lag = 2, differences = 1, arithmetic = TRUE, na.pad = TRUE)
    + ,l2d2 = diff(x, lag = 2, differences = 2, arithmetic = TRUE, na.pad = TRUE)
    + )
    > x
       Petal.Length.orig Petal.Length.l1d1 Petal.Length.l1d2 Petal.Length.l2d1 Petal.Length.l2d2
    1                1.4                NA                NA                NA                NA
    2                1.4               0.0                NA                NA                NA
    3                1.3              -0.1         -1.00e-01              -0.1                NA
    4                1.5               0.2          3.00e-01               0.1                NA
    5                1.4              -0.1         -3.00e-01               0.1          2.00e-01
    6                1.7               0.3          4.00e-01               0.2          1.00e-01
    7                1.4              -0.3         -6.00e-01               0.0         -1.00e-01
    8                1.5               0.1          4.00e-01              -0.2         -4.00e-01
    9                1.4              -0.1         -2.00e-01               0.0          0.00e+00
    10               1.5               0.1          2.00e-01               0.0          2.00e-01
    11               1.5               0.0         -1.00e-01               0.1          1.00e-01
    12               1.6               0.1          1.00e-01               0.1          1.00e-01

-- View this message in context: http://r.789695.n4.nabble.com/Lags-and-Differences-of-zoo-Objects-tp2308666p2308681.html
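The numbers in the table are consistent with "order" meaning how many times the differencing is applied: `differences = 2` is the difference of the differences, each taken at the given lag. The sketch below checks this with base R's `diff()` on a plain vector (an assumption made for simplicity: it uses the first six Petal.Length values from the output above rather than a zoo object, and base `diff()` does not pad with NA).

```r
x <- c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7)

# First-order difference at lag 1: x[t] - x[t-1]
d1 <- diff(x, lag = 1, differences = 1)

# Second-order difference at lag 1: the difference of d1
d2 <- diff(x, lag = 1, differences = 2)
all.equal(d2, diff(diff(x)))

# Second-order difference at lag 2: lag-2 difference of the lag-2 difference
l2d2 <- diff(x, lag = 2, differences = 2)
all.equal(l2d2, diff(diff(x, lag = 2), lag = 2))
```

For example, row 4 of the l1d2 column is 0.2 - (-0.1) = 0.3, the change in the lag-1 difference between rows 3 and 4.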
[R] where did the column names go to?
I've just tried to merge 2 data sets thinking they would only keep the common columns, but noticed the column count was not adding up. I've then replicated a simple example and got the same thing happening.
q1. Why doesn't 'b' have a column name?
q2. When I merge, why does the new column 'y' have all values as 5.1?
Thanks in advance, Mr. confused

    a <- iris[,]
    b <- iris[,1]
    head(a)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa
    head(b)
    [1] 5.1 4.9 4.7 4.6 5.0 5.4
    c <- merge(a,b)
    head(c)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species   y
    1          5.1         3.5          1.4         0.2  setosa 5.1
    2          4.9         3.0          1.4         0.2  setosa 5.1
    3          4.7         3.2          1.3         0.2  setosa 5.1
    4          4.6         3.1          1.5         0.2  setosa 5.1
    5          5.0         3.6          1.4         0.2  setosa 5.1
    6          5.4         3.9          1.7         0.4  setosa 5.1
    NCOL(a)
    [1] 5
    NCOL(b)
    [1] 1
    NCOL(c)
    [1] 6

-- View this message in context: http://r.789695.n4.nabble.com/where-did-the-column-names-go-to-tp2306267p2306267.html
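Selecting a single column with `[ , 1]` simplifies the result to a plain vector, which has no column name; `drop = FALSE` keeps it as a one-column data frame. And when `merge()` finds no common column names it returns every pairing of rows, which would explain a `y` column that starts with the first value repeated. A small sketch (the object names are illustrative):

```r
# A vector: the column name is lost
b_vec <- iris[, 1]
is.data.frame(b_vec)

# A one-column data frame: the name is kept
b_df <- iris[, 1, drop = FALSE]
names(b_df)

# No shared column names, so merge() pairs every row with every row
small_a <- iris[1:3, ]
small_y <- data.frame(y = iris[1:2, 1])
pairs <- merge(small_a, small_y)
nrow(pairs)   # 3 rows x 2 rows
```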
[R] how to 'stack' data frames?
I have 2 data frames (A & B) with some common column names. A has 10 rows. B has 20 rows. How do I combine them so I end up with a data frame with 30 rows that only contains the common columns? I was trying 'merge' ("Merge two data frames by common columns", etc.) but that is not giving me what I expect...

    a <- iris
    b <- iris
    c <- merge(a,b)
    NROW(a)
    [1] 150
    NROW(c)
    [1] 152

Why are there only 152 rows and not 300?
-- View this message in context: http://r.789695.n4.nabble.com/how-to-stack-data-frames-tp2306284p2306284.html
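`merge()` performs a database-style join on the common columns rather than stacking, so the 152 is most likely 150 matched rows plus extra pairings from a duplicated row in iris. For stacking, one option is `rbind()` restricted to the shared columns. In this sketch A and B are made-up slices of iris with partly different columns.

```r
# Two hypothetical data frames with some columns in common
A <- iris[1:10,  c("Sepal.Length", "Sepal.Width", "Species")]
B <- iris[51:70, c("Species", "Sepal.Length", "Petal.Length")]

# Columns present in both
common <- intersect(names(A), names(B))

# Stack the rows; rbind() matches data frame columns by name
stacked <- rbind(A[common], B[common])
nrow(stacked)
```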
Re: [R] how to 'stack' data frames?
Thanks Dennis - easy when you know how!
-- View this message in context: http://r.789695.n4.nabble.com/how-to-stack-data-frames-tp2306284p2306309.html
[R] eliminating constant variables
Hi all, I have a large data set and want to immediately build a 'blind' model without first examining the data. Now it appears that in the data there are a lot of fields that are constant or all missing values, which prevents the model from being built. Can someone point me in the right direction as to how I can automatically purge my data file of these useless fields? Thanks in advance, pdb

    train <- read.csv("TrainingData.csv")
    library(gbm)
    i.gbm <- gbm(TargetVariable ~ . ,data=train,distribution="bernoulli" ...

    1: In gbm.fit(x, y, offset = offset, distribution = distribution, ... : variable 5: var1 has no variation.

-- View this message in context: http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284831.html
Re: [R] eliminating constant variables
Hi Jim, Thanks for your response. I was probably not clear about exactly what I want to achieve, so let me see if I can explain a little better... There are certain (unknown) columns in my data that contain either NULL in every row or the same value in every row (e.g. '1'). These columns are useless for modelling as there is no variation in the data. I need a way to automatically find and delete all these columns (it is not rows I want to delete but the whole column, as in train$Variablexxx = NULL, where Variablexxx needs to be automatically found). Thanks in advance, pdb
-- View this message in context: http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284853.html
Re: [R] eliminating constant variables
Yep - that is what I want. Cheers Jim you Legend.
-- View this message in context: http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284861.html
Re: [R] eliminating constant variables
Awesome! It made sense once I realised SD = standard deviation! pdb
-- View this message in context: http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284915.html
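For reference, here is a minimal sketch of dropping columns that are constant or entirely missing. The thread mentions a standard-deviation approach; this sketch counts distinct non-missing values instead, so that it also handles character and factor columns. The data frame and its column names are made up for illustration.

```r
# Illustrative data: two constant columns, one all-missing column, two useful ones
train <- data.frame(
  id          = 1:5,
  const_num   = rep(1, 5),
  all_missing = rep(NA_real_, 5),
  const_chr   = rep("a", 5),
  x           = c(2.5, 3.1, NA, 4.0, 1.2),
  stringsAsFactors = FALSE
)

# A column carries no information if it has at most one distinct non-missing value
is_useless <- vapply(
  train,
  function(col) length(unique(col[!is.na(col)])) <= 1,
  logical(1)
)

# Drop those columns; drop = FALSE keeps the result a data frame
train_clean <- train[, !is_useless, drop = FALSE]
names(train_clean)
```

The logical vector is_useless can also be inspected first, for example with names(train)[is_useless], to see which columns would be removed before anything is dropped.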
[R] r code exchange site?
Does there exist a site where snippets of R code examples can be deposited, such as the one that exists for MATLAB? http://www.mathworks.com/matlabcentral/fileexchange/ P.S. I also noted that on the main R site http://www.r-project.org/ when you click on the nabble link under the search link, you end up here http://e-nvf.vvvay.net/-td13672.html#a13819 which I don't think has anything to do with R as far as I can tell (but my Russian is not that hot). Yours Hopefully, pb
-- View this message in context: http://r.789695.n4.nabble.com/r-code-exchange-site-tp2278205p2278205.html
[R] plot focus
I am doing calculations in a loop and then plotting the results by adding a point to each of 2 charts at the end of the loop. It's very informative, as you can see the progression through time. My problem is that, with 2 plots, I don't know how to get the focus back to the first plot.

    layout(matrix(c(1,2)))
    plot(iris[,1],col="red") #plot1
    plot(iris[,3],col="blue") #plot2
    #goes on plot2
    lines(iris[,2],col="pink")
    #how do I put this line on plot 1
    lines(iris[,4],col="black")

I tried the method below, but when you switch the focus back to screen 1 the line does not get drawn where I expect.

    split.screen(c(2,1))
    screen(1) # prepare screen 1 for output
    plot(iris[,1],col="red") #plot1
    screen(2) # prepare screen 2 for output
    plot(iris[,3],col="blue") #plot2
    screen(1)
    lines(iris[,2],col="pink",lwd=8)
    screen(2)
    lines(iris[,4],col="green",lwd=8)

Any pointers please as to what I need to do?
-- View this message in context: http://r.789695.n4.nabble.com/plot-focus-tp2272699p2272699.html
Re: [R] plot focus - another issue (ylim)
Thanks Henrique, that appeared to work, but now I have another issue. If I add a ylim to the plot, then when I plot another line it gets plotted on the wrong scale.

    #this works as expected
    plot(iris[,1],col="red",ylim=c(-10,10)) #plot1
    lines(iris[,4],col="black")

    #this does not
    par(mfrow=c(2,1))
    plot(iris[,1],col="red",ylim=c(-10,10)) #plot1
    plot(iris[,3],col="blue") #plot2
    #goes on plot2
    par(mfg = c(2, 1))
    lines(iris[,2],col="pink")
    #goes on plot 1
    par(mfg = c(1, 1))
    lines(iris[,4],col="black")

-- View this message in context: http://r.789695.n4.nabble.com/plot-focus-tp2272699p2274541.html
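For reference, here is a sketch of one possible workaround for the scale problem, assuming the cause is that par(mfg = ...) moves the drawing position but leaves the user coordinates ("usr") of the most recently drawn plot in effect. The idea is to save par("usr") after each plot and restore it when switching back. The variable names usr1 and usr2 are made up for illustration.

```r
par(mfrow = c(2, 1))

plot(iris[, 1], col = "red", ylim = c(-10, 10))  # plot 1
usr1 <- par("usr")                               # remember plot 1's coordinate system

plot(iris[, 3], col = "blue")                    # plot 2
usr2 <- par("usr")                               # remember plot 2's coordinate system

# Back to plot 1: select the panel, then restore its coordinates
par(mfg = c(1, 1))
par(usr = usr1)
lines(iris[, 4], col = "black")

# Back to plot 2, again restoring the matching coordinates
par(mfg = c(2, 1))
par(usr = usr2)
lines(iris[, 2], col = "pink")
```

If the panels use log axes or different margins, further parameters may need saving and restoring in the same way; this sketch covers only the linear-axis case from the post.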
[R] randomforests - how to classify
Hi, I'm experimenting with random forests and want to perform a binary classification task. I've tried some of the sample code in the help files and things run, but I get a message to the effect of 'you don't have very many unique values in the target - are you sure you want to do regression?' (sorry, I don't know the exact message, but R is busy now so I can't check). In reading the help files I see 2 examples, one for classification and one for regression. To the uninformed, these don't seem much different from each other. How does rf know whether to do regression or classification?

    ## Classification:
    ##data(iris)
    set.seed(71)
    iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE, proximity=TRUE)

    ## Regression:
    ## data(airquality)
    set.seed(131)
    ozone.rf <- randomForest(Ozone ~ ., data=airquality, mtry=3, importance=TRUE, na.action=na.omit)

My target variable only has 2 values - why does it want to do regression? I've entered code just like that in the classification example above. Also, when it asks me 'are you sure you want to do regression', how do I say 'NO, do classification please'?
-- View this message in context: http://r.789695.n4.nabble.com/randomforests-how-to-classify-tp2126166p2126166.html
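For reference, randomForest() decides between the two modes from the type of the response: a factor response gives classification, a numeric response gives regression. In the examples above, Species is a factor and Ozone is numeric. Below is a minimal sketch of converting a 0/1 numeric target to a factor before fitting; the data frame, its column names and the simulated values are all made up for illustration.

```r
set.seed(71)

# Illustrative data: two predictors and a binary target stored as numeric 0/1
train <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
train$target <- as.integer(train$x1 + rnorm(100) > 0)

# Converting the response to a factor requests classification
train$target <- factor(train$target)

# Fit only if the randomForest package is installed
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(target ~ ., data = train)
  print(fit$type)   # reports which kind of forest was grown
}
```

Checking str(train) or is.factor(train$target) before fitting shows how the target is currently stored.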
[R] timing a function
Hi, I want to time how long a function takes to execute. Any clues on what to search for to achieve this? Thanks in advance.
-- View this message in context: http://r.789695.n4.nabble.com/timing-a-function-tp2126319p2126319.html
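For reference, base R provides system.time() for this. Below is a minimal sketch; slow_sum is a made-up function standing in for whatever needs timing.

```r
# Hypothetical function to be timed
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total
}

# system.time() evaluates the expression and returns user, system and elapsed times
timing <- system.time(result <- slow_sum(1e6))
print(timing)

# Elapsed wall-clock time in seconds
timing[["elapsed"]]

# Alternative: take the difference of two Sys.time() readings
start <- Sys.time()
result2 <- slow_sum(1e6)
Sys.time() - start
```

Useful search terms are system.time, proc.time and Sys.time; for very short calls, repeating the call many times inside the timed expression gives a more stable figure.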